CLARIN Newsflash November 2023



CLARIN2023 Keynote: Lost in Meaning - Found in Translation by Jörg Tiedemann

Translation involves both understanding and generating language, and so combines the two essential challenges in language technology. Focusing specifically on the effects of linguistic diversity, Jörg Tiedemann (University of Helsinki) discussed the ability of neural translation models to pick up linguistic properties and to generalise to meaningful representations when trained on large amounts of multilingual data.


New: CLARIN Forum

We are proud to introduce our new CLARIN Forum! Whether you have a question on any CLARIN-related topic, or would like to share something with the wider community – the forum is for all user groups, regardless of experience or background. All welcome!

Ask your questions here! 


ParlaMint: New Release and Recognition by ELG

A new release of the ParlaMint parliamentary corpora, ParlaMint 4.0, is now available, featuring three more parliaments (Spanish, Finnish and Basque) as well as metadata on political orientation.

In addition, the multilingual sentiment dataset of parliamentary debates, ParlaSent 1.0, has been featured by the European Language Grid (ELG) as its resource of the month for October. ParlaSent 1.0 consists of min-length sentences from the parliamentary proceedings of Bosnia and Herzegovina, Croatia, Czechia, Serbia, Slovakia, Slovenia and the UK.


ParlaMint Visits CLARIN-LV

On 12 September, the CLARIN-LV annual conference ‘Latest language resources and tools for digital humanities’ took place in Riga, at the Institute of Mathematics and Computer Science of the University of Latvia (IMCS UL). At the conference, ParlaMint corpora were introduced to the Latvian digital humanities community by Maciej Ogrodniczuk (CLARIN-PL). Recently created language resources were presented by CLARIN-LV consortium members – the Institute of Latvian Language, Latvian National Library, Rēzekne Technology Academy, Institute of Literature, Folklore and Arts, Livonian Institute and Institute of Mathematics and Computer Science. The conference gathered about 40 participants, including representatives from the Ministry of Education and Science, and several key Latvian higher education institutions. 

DeriMo 2023 Workshop in Dubrovnik Co-Organised by HR-CLARIN 

The fourth International Workshop on Resources and Tools for Derivational Morphology (DeriMo 2023) was held in Dubrovnik, Croatia, on 5 and 6 October. The workshop was organised by the University of Zagreb, the Croatian Language Technologies Society and the HR-CLARIN consortium. This edition continued the discussions on language resources and tools for derivational morphology, covering a wide range of topics in the field.


Integrating Research Infrastructures Into Teaching

Lecturers seeking to use CLARIN services to teach students how to search, find, use, share and archive language research data are invited to take a look at the infrastructure guide developed in the UPSKILLS project. The guide gives an overview of the CLARIN central services, such as the Virtual Language Observatory, and how they can be used in the classroom, with further learning resources on Moodle. It also includes a research tracking tool developed at the University of Zurich and Geneva, and a list of open corpora accessible via the CLARIN.SI infrastructure. Teachers and lecturers using CLARIN in the classroom are invited to share their experiences, ideas and good practices.


FAIR SSH Working Group

The European Research Infrastructures in the Social Sciences and Humanities (CESSDA, CLARIN, DARIAH and OPERAS) have decided to meet regularly to coordinate their activities on FAIR principles training, implementation, and adaptation to the needs of their specific communities. The working group brings an SSH perspective to the implementation of the FAIR principles and complements the work of other working groups, such as those in RDA and EOSC.



Call for Papers: CALD-Pseudo Workshop on Computational Approaches to Language Data Pseudonymisation

21 or 22 March 2024, St. Julians, Malta

A cross-disciplinary forum for advancing privacy protection of unstructured text data and data openness through pseudonymisation, held as part of EACL 2024.

Deadline for submissions: 18 December 2023


Call for Papers: SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

21 or 22 March 2024, St. Julians, Malta

Also co-located with EACL 2024, this workshop supports work in the Humanities, Social Sciences, and Cultural Heritage communities, but also promotes work in the Computational Linguistics community to build richer and more effective tools and models.

Deadline for submissions: 18 December 2023


Launch of the SIGHUM Endorsement Initiative

If you are organising an event concerning language technologies for the socio-economic sciences and humanities, and would like to apply for an endorsement by SIGHUM, please email with your event details.

Call for Proposals: Open Repositories Conference 2024

3-6 June 2024, Göteborg, Sweden

On the theme of ‘Empowering Global Progress’, OR2024 provides an opportunity to explore and reflect on the ways repositories enable transparent and sustainable information and data. 

Deadline for submissions: 15 December 2023



Computational Literary Studies Infrastructure (CLS INFRA): Transnational Access Fellowship Programme

The CLS INFRA Fellowship Programme provides access to a wide range of data, tools and knowledge. Scholars with an interest in computational literary studies methods are invited to apply for a fellowship grant at one of the infrastructure providers. 

Deadline for submissions: 4 December 2023