Initiatives Supporting Ukrainian Researchers and Language Resource and Technology Development at CLARIN 

Submitted by Karina Berger on 19 June 2023


Written by Jurgita Vaičenonienė and Anna Kryvenko 

Online Seminar ‘Scientific Data about Society in Ukraine’ 

On 22 May, CLARIN representatives were invited to give a presentation at the online seminar ‘Scientific Data about Society in Ukraine: Ukraine in European Research Infrastructures in the Social Sciences and Humanities’, organised by the European Alliance for Social Sciences and Humanities (EASSH) in cooperation with the Strategic Working Group Social and Cultural Innovation. The aim of the online seminar was to provide research infrastructures with a platform to share their experiences of collaboration with Ukraine, and to discuss the challenges and needs that should be addressed to further integrate Ukrainian academic and research communities into European research infrastructures, in the context of Ukraine’s accession to the European Union during Russia’s ongoing aggression against Ukraine.

The discussion revolved around the focal areas of data collection and data archiving in social sciences, as well as data and digitisation in the humanities, with an emphasis on two major problems: how to ensure the availability and accessibility of comprehensive scientific data on Ukraine for research in the field of SSH, and how Ukraine can cooperate with European research infrastructures.  

The speakers at the online seminar represented a range of European and Ukrainian science policy, infrastructural, and research entities, including the previously mentioned EASSH, ESFRI, the European Commission, the National Research Foundation of Ukraine, the European Social Survey, the Survey of Health, Ageing and Retirement in Europe, and the Consortium of European Social Data Archives. CLARIN was the only participating infrastructure specialising in language resource and technology development. It was represented by the national coordinator of CLARIN-LT, Jurgita Vaičenonienė, and a Ukrainian researcher, Anna Kryvenko.



CLARIN’s Support for Ukraine 

The presentation on CLARIN’s support for Ukraine covered initiatives launched by the CLARIN community during 2022. In March 2022, CLARIN immediately responded, both centrally and locally, by expressing support for Ukraine. Since then, it has continued to search for ways to help the Ukrainian research and academic communities in the social sciences and humanities. During the first weeks of the war, CLARIN’s national coordinator forum formed a working group to discuss how the CLARIN infrastructure could reach out and help Ukrainian researchers. 

As a result, a structured web page, aimed at Ukrainian and other European research communities, was developed. The page offers information about the diverse types of support offered by European academic communities to Ukrainian researchers. In addition, the web page informs European researchers about Ukrainian language technology, and presents the Ukrainian language technology research community and its needs. To increase the flow of information, it is still possible to contribute to the web page by filling in this form. Moreover, CLARIN has disseminated information about various support initiatives and has made these visible on the central CLARIN portal, in its monthly newsletter, in its impact stories, and at the annual conference. Thus, one of the ways in which CLARIN contributes is by maintaining a continuous flow of information, ensuring visibility, and raising awareness.


Contributions by National Nodes

In parallel, various activities and initiatives sprang up in different national consortia. Some of them were aimed specifically at Ukrainian refugees and their needs, such as easier communication in a foreign country. For example, the Institute of Formal and Applied Linguistics of Charles University developed an automatic translator for Czech and Ukrainian with the support of LINDAT/CLARIN-CZ.

Another refugee-oriented machine translation system, for Ukrainian to Finnish, was developed by Jörg Tiedemann, Professor of Language Technology in the Department of Digital Humanities at the University of Helsinki, together with the Helsinki group. One of the Lithuanian contributions was aimed at helping researchers staying in Ukraine so that they could continue their work, establish cooperation with European Union educational institutions, and have a platform for their voices to be heard internationally.

In cooperation with the Ukrainian researcher Ruslana Martseniuk from Kyiv, an online course about the history of Ukraine was developed. The course offered all those interested the opportunity to learn about the history of the country as presented by a professional Ukrainian historian. The Italian CLARIN community aimed to involve Ukrainian data in their ongoing ‘Voices from Ravensbrück’ project. In particular, the part dedicated to ‘Slavic Voices from Ravensbrück’, conducted at the University of Siena by Yuliia Chernyshova and Alessandra Carbone, aimed at locating new archives in Ukraine and abroad.



Another way in which CLARIN showed its support was to include the Ukrainian language in the CLARIN flagship project ParlaMint, which offers a collection of comparable and uniformly annotated multilingual corpora of parliamentary sessions. The project has a strong community of researchers developing and studying the data. The ParlaMint-UA corpus, co-funded by CLARIN ERIC and the Slovenian Research Agency (projects P6-0436 and N6-0288), became the first full-text corpus of Ukraine’s parliamentary proceedings.

This has implications for the development of the Ukrainian language as a digital resource, and also for making Ukraine’s parliamentary data interoperable, comparable, and accessible to the international research community. It has already resulted in a manual enhancement of an NER model for the Ukrainian language and a machine translation of the Ukrainian transcripts into English. Furthermore, CLARIN.SI has financially supported the extension of the ParlaMint-UA corpus, which will double in size and will have additional features for language identification between Ukrainian and Russian. From a social perspective, the project’s extension facilitated building a team of Ukrainian parliamentary corpus researchers and students, who are currently scattered across Europe, helping them to stay professionally engaged and to play a role in bringing Ukraine closer to the EU.

The inclusion of the ParlaMint-UA corpus in the ParlaMint family of corpora will contribute to the development of parliamentary discourse studies in Ukraine and the recognition of Ukrainian parliamentarism as part of Europe’s diverse political and socio-cultural heritage. Potential future steps may include the establishment of Ukraine’s observer status with CLARIN ERIC and further provision of technical assistance (e.g. via Knowledge Centres) in the strategic planning of research infrastructure development, leading Ukrainian research projects in SSH, managing intellectual property, etc.

The webinar ended with a joint discussion among all participants on future plans and possibilities for cooperation. It was agreed that more events of this kind will be organised in the future to share the best practices and challenges of Ukrainian and European cooperation. A full report on the content of the webinar can be found here.