Applied Language Technology

Goals and Objectives  

These learning materials consist of a two-course module that seeks to provide humanities majors with a basic understanding of language technology and the practical skills needed to apply language technology using Python. The module is intended to empower the students by showing that language technology is both accessible and applicable to research in the humanities.
 

Author(s)

Tuomo Hiippala

Role: Assistant Professor in English Language and Digital Humanities

Department of Languages

Faculty of Arts, University of Helsinki

Finland
 

Description of the Training Materials

(Sub)discipline, topic, language(s)

Topics: Language technology, digital humanities

Language: English

Keywords

language technology, digital humanities, tutorial, beginner, spaCy, Stanza, Universal Dependencies, introduction

Project URL

Learning materials: https://applied-language-technology.readthedocs.io/

YouTube channel: https://www.youtube.com/c/AppliedLanguageTechnology/

CLARIN resources

The course materials build on various resources distributed through CLARIN, such as Universal Dependencies corpora. The materials refer to the CLARIN website for further study, highlighting the digital humanities course registry.

Structure and duration

The learning materials constitute a 10 ECTS module consisting of two 5 ECTS courses. The materials are divided into two parts, and each section within a part corresponds to one week of study.

Target audience

The learning materials are designed for audiences with no previous experience of language technology or Python, targeting especially the current generation of students who have grown up with digital media. The materials adopt a hands-on approach to maintain interest, teaching Python basics in connection with applying basic techniques in natural language processing to diverse texts. In addition, the materials are accompanied by short YouTube videos, which explain the techniques in a step-by-step manner.

Facilities required

All learning materials and their source code are available on GitHub.

The learning materials are rendered from Jupyter Notebooks, and feature a Binder integration so that anyone can launch an interactive Jupyter Notebook running in their own browser.

Format

Online learning materials and an accompanying YouTube channel.

Course(s) in which the training material was used

Working with Text in Python, 5 ECTS

Natural Language Processing for Linguists, 5 ECTS

Licence and (re)use

All learning materials, including the YouTube videos, are provided under a CC BY-NC 4.0 licence. Course exercises are available on request.

Creation date

October 2020

Last modification date

May 17, 2021
 

Experience with Using CLARIN Resources in Teaching 

CLARIN resources are crucial, as they are used to train the language models that the materials rely on. During the two-course module, however, the intended audience does not initially engage with these resources directly, but interacts with them through models provided by Python libraries such as spaCy and Stanza. By the end of the module, the students should be able to load corpora hosted by CLARIN into Python and manipulate them.
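
As a rough illustration of what loading such a corpus into Python can involve, the following minimal sketch reads text in the CoNLL-U format used by Universal Dependencies corpora, using only the Python standard library. The sample sentence, the function name read_conllu and the FIELDS list are assumptions made for this example and are not taken from the course materials or from any actual corpus.

```python
# Hypothetical sample in CoNLL-U format, invented for illustration only.
SAMPLE = """\
# sent_id = example-1
# text = Dogs chase cats.
1\tDogs\tdog\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_
2\tchase\tchase\tVERB\t_\t_\t0\troot\t_\t_
3\tcats\tcat\tNOUN\t_\tNumber=Plur\t2\tobj\t_\tSpaceAfter=No
4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_
"""

# The ten tab-separated columns of a CoNLL-U token line.
FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]


def read_conllu(text):
    """Parse CoNLL-U text into a list of sentences (lists of token dicts)."""
    sentences, tokens = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if tokens:
                sentences.append(tokens)
                tokens = []
        elif line.startswith("#"):
            # Comment lines carry sentence-level metadata; skipped here.
            continue
        else:
            values = line.split("\t")
            # Skip multiword token ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
            if not values[0].isdigit():
                continue
            tokens.append(dict(zip(FIELDS, values)))
    if tokens:
        sentences.append(tokens)
    return sentences


sentences = read_conllu(SAMPLE)

# Collect the lemmas of all nouns in the sample.
nouns = [tok["lemma"] for sent in sentences for tok in sent
         if tok["upos"] == "NOUN"]
print(nouns)
```

For a real corpus file, the same function could be applied to the contents of the file read as UTF-8 text. Dedicated libraries for the format also exist and may be preferable for anything beyond a quick look at the data.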
 

Download Information

See the course website.
 

Additional Information and Resources

The plan is to develop the learning materials into a MOOC hosted by the University of Helsinki.
 

Cite this Work

Hiippala, Tuomo (2021) Applied Language Technology: NLP for the Humanities. In Proceedings of the Fifth Workshop on Teaching NLP. Association for Computational Linguistics.
 

Contact Information

Teachers who reuse and adapt this training material are invited to share their feedback via training@clarin.eu