Goals and Objectives
Description of the Training Materials
|(Sub)discipline, topic, language(s)||Digital Humanities|
|Keywords||“digital humanities”, “data-driven research”, “digital content processing”, “text processing”, “image processing”, “metadata”, “word embeddings”, “evaluation”, “research infrastructures”|
Teich, E. (2019). Corpus-Driven Investigation of Language Use, Variation and Change: Resources, Models, Tools. CLARIN Annual Conference 2019, Leipzig, Germany. Invited talk. See video lecture.
CLARIN, DARIAH, PARTHENOS, LINDAT/CLARIAH
|Structure and duration||The course consists of ten lessons, each with video material and a PowerPoint presentation covering the same content. Every lesson contains a practical session: either a Jupyter Notebook for working in Python or a text file with a short description of the task. Most of the practical tasks consist of running the programme and analysing the results. Although the course does not focus on programming, the code can easily be reused in individual projects.|
|Target audience||Students of humanities, beginner level. Some experience in running Python code is desirable but not required.|
|Facilities required||Python must be installed together with Jupyter Notebook. Alternatively, the notebooks can be uploaded to the Google Colab platform or another online Python interpreter.|
|Course(s) in which the training material was used||This material was used in FF: PLIN064 Introduction to Digital Humanities, originally with videos in Czech.|
|License and (re)use||This training material can be reused in accordance with the Creative Commons Attribution (CC BY) licence.|
|Creation date||September 2020|
|Last modification date||15 June 2021|
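For a local setup as described under "Facilities required", a quick check like the following could confirm that the environment is ready before opening the notebooks. This is only a minimal sketch: the package name `notebook` (the usual import name of the classic Jupyter Notebook distribution) and the helper `missing_packages` are assumptions for illustration, not part of the course material, and the individual lessons may need further libraries that are not listed here.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the top-level package names from `names` that cannot be found
    in the current Python environment (hypothetical helper for illustration)."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: "notebook" is the package that provides Jupyter Notebook locally.
    required = ["notebook"]
    print("Python version:", sys.version.split()[0])
    missing = missing_packages(required)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All listed packages are available.")
```

If anything is reported as not installed, using Google Colab or another online interpreter, as mentioned above, avoids the local installation altogether.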
Experience with Using CLARIN Resources in Teaching
The CLARIN, DARIAH and PARTHENOS infrastructures were introduced during the course. Students were encouraged to search for language resources and tools in the CLARIN repositories, which increased their awareness of the multimedia content available there (e.g. the CHILDES corpus). Additionally, students browsed the catalogue of the LINDAT/CLARIN repository and learnt about the importance of persistent URLs and the licence types assigned to the datasets. During the text processing sessions, students experimented with WebLicht. The students reported that they were surprised by how much social sciences and humanities data is already digitized, and by how easily they can obtain particular results (e.g. OCRed or tagged text) using cloud-based services.
''The learning format based on shared Jupyter Notebooks suited me and I really enjoyed the whole subject. It would have been better for me to enroll in the course later, with more experience, however, it gave me an insight into what digital humanities actually are and what tools they use.'' (Student at Masaryk University)