Tour de CLARIN: Workshops at the Estonian Digital Humanities Conference 2017

The Centre of Estonian Language Resources (CELR) actively reaches out to local researchers in order to address recent challenges in digital humanities, map out possible solutions and offer personalised help. CELR specialists regularly give workshops at events and conferences that bring together digital humanists and computational experts. The annual Estonian Digital Humanities conference, which is organised by several CELR partner organisations (including the Estonian Society for Digital Humanities, the Estonian Literary Museum, the Centre of Excellence in Estonian Studies and Wikimedia Estonia), is a key event attended by both Estonian and European DH scholars and an ideal place to address the challenges of interaction among scholars from different fields.

At the 2017 conference “Open licences, open content, open data: tools for developing digital humanities”, which took place from 1 to 3 November at the Estonian National Museum, Aleksei Kelli held the workshop “Copyright and cultural heritage”. The workshop focused on the interaction of intellectual property (IP) and cultural heritage, with special attention given to copyright and related rights. Professor Kelli presented issues related to the free use of heritage works by public archives, museums and libraries, to quotation rights, and to the right to use copyrighted materials for educational and research purposes.

Workshop participants were invited to send in advance a description of a (potentially) problematic copyright-related case from their own research. For instance, the attending etymologists and folklorists raised an issue concerning the ambiguous legal nature of folklore. Although folklore is not inherently copyrighted, recordings of folk songs and retellings of stories are protected under copyright law, in that the contributors who share folk stories or sing old folk songs retain the rights to their performance. However, many such performances are very old, and the folklorist who recorded them often did not explicitly ask the contributors for their consent, as there was no such legal requirement at the time. Consequently, it is often unclear whether digitised collections of folklore can be made publicly available, although research exceptions in copyright regulations make them available to academics.

Another workshop CELR presented at this conference was the hands-on demonstration “Language annotation workflows in your browser” by Krista Liin. Estonian and foreign participants could try annotating their own texts in the workflow managers Keeleliin (for Estonian) and WebLicht (built by CLARIN-D and available for several European languages), explore the possibilities of automatic annotation accessible through a web browser, and learn how to use the annotated texts in their research. Such browser-driven annotation was welcomed by the participants, all of whom worked with morphologically rich, free-word-order languages such as Estonian. After lemmatising different Estonian texts (i.e., standard and spoken language data) with Keeleliin, participants created simple workflows and experimented with WebLicht’s easy-mode chains for tokenising and parsing texts in English or German. Some of the participants also familiarised themselves with Galaxy, an open framework for interoperable NLP web services; WebSty, a multilingual text similarity analysis system; some of the NLP and visualisation tools in TextImager; and UDPipe, a parsing pipeline for CoNLL-U texts.

Blog post written by Kadri Vider and Olga Gerassimenko, edited by Darja Fišer and Jakob Lenardič