
CLARIN Newsflash: January 2023

Call for Abstracts (CLARIN2023)

16-18 October 2023, Leuven, Belgium (hybrid event)

The call for the submission of extended abstracts for the CLARIN Annual Conference 2023 (CLARIN2023) is now open.

The CLARIN Annual Conference is organised for the wider humanities and social sciences community in order to exchange experiences in working with the CLARIN infrastructure, share best practices, and discuss plans for future developments. CLARIN2023 will be a face-to-face event, which will also be fully accessible virtually. The conference will take place in the historic city of Leuven, Belgium.

You are invited to submit your abstract until 14 April 2023.

Read the full call

New Impact Story: ParlaMint – A Resource for Democracy

Public discourse, the language we use in our exchanges with each other, in the media and in political debates, shapes and guides how we view the world. Dario Del Fante and Virginia Zorzi explore the public discourse on migration and migrants, using, among others, CLARIN’s ParlaMint dataset as a resource. Looking at parliamentary debates as well as news articles, the researchers analyse how migration and migrants were referred to during the so-called migration crisis (2015/16) and the advent of COVID-19 (2020) in two countries – Italy and the UK – and show how this may impact public opinion on the topic.

Read the full Impact Story



New CLARIN-IT Member: University of Padua

The University of Padua (UniPD) joined the Italian National Consortium in December 2022. Thanks to the expertise of Giorgio Maria Di Nunzio, Open Data Advisor of the UniPD Open Science Commission, UniPD will enrich CLARIN-IT with an offer of metadata and language resources hosted at its institutional repository. Furthermore, it will make CLARIN-IT initiatives increasingly visible, in particular to humanities departments, and will promote the creation and management of data and language resources within UniPD.

Read more


Outcomes of DELAD Workshop on GDPR-Compliant Data-Sharing

Corpora of speech of individuals with communication disorders (CSD) are hard to obtain, come with privacy issues and are often small in size. This makes reuse a challenge. The DELAD initiative (Database Enterprise for Language And speech Disorders) has been organising workshops on the accessibility and inventorisation of CSD corpora. This two-day workshop, which was held on 22 and 23 September 2022, focused on sharing data in a GDPR-compliant way. Among the topics discussed were sharing CSD, sharing clinical data, and the impact of the Data Governance Act and Data Altruism. The outcomes have been collated and are available now.

Read more  


Training Resource: Lithuanian Collocations: Usage, Teaching, Learning, and Translation

This resource is a book devoted to collocation teaching, learning and translation in Lithuanian, which was submitted via the Teaching with CLARIN Call. The training resource consists of two parts. The first part, ‘Collocation teaching and learning’ (by Jolanta Kovalevskaitė and Erika Rimkutė), is primarily intended for teachers and lecturers who teach Lithuanian as a foreign language, students of linguistics, and Lithuanian language learners. The second part, ‘Translation of collocations’ (by Jurgita Vaičenonienė), is aimed at students and teachers in the field of translation studies. Each chapter includes the basic terms, brief theoretical overviews, exercises, and suggested readings. Trainees will learn to use open-access resources available in the CLARIN-LT infrastructure that are important for collocation teaching, learning, and translation, increase their collocation learning/teaching competence individually, and apply various methods for cross-cultural comparison and translation of collocations. The training material is in Lithuanian, though some exercises require English language knowledge. The book has an accompanying website with interactive tasks and answers to the exercises. Access is also provided to additional resources, such as the Database of Lithuanian Collocations, a tool for collocation recognition, and the data necessary for task completion in the resource book.

Read more

Meet the CLARIN Interns

CLARIN has launched a remote internship programme for MA students interested in learning how to use the research infrastructure to access and engage with digital language data with the help of advanced technologies. The internship aims to help the students enhance their language data handling and processing skills, and shows them how to manage language resources according to the FAIR Data Principles. 

Read more




CLARIN2022 Keynote: Peter Leinen – Enabling Digital Research

On Day 1 of CLARIN2022, Peter Leinen (German National Library) discussed the growing digitisation in memory institutions, such as libraries and archives. This creates new tasks in these institutions, such as the collection and long-term archiving of a continuously increasing amount of digital objects and data. At the same time, it facilitates new forms of utilisation, such as text and data mining, as well as new forms of cooperation with the scientific community. 


Panel Discussion: What Will the Language Professional of Tomorrow Do?

9 February 2023, 15:00-17:00 (CET, virtual event)

Language technology plays an increasingly important role in the multilingual European societies we live in. Will language specialists be replaced by data scientists, or will expert knowledge of languages stay a key competence on the job market? In cooperation with Drongo, the translation service of the European Commission is organising a panel discussion between trainers, students, companies, researchers and language professionals about the job market for language experts. The discussion will take place on the occasion of the publication of OTTO: an inventory of training programmes in the field of language and technology in the Netherlands and Belgium. This event is of interest to trainers, students, employers in the language industry, buyers of language services, and future and experienced language professionals. Participants can speak in Dutch, English and French.

Read more 



Final Event of the ‘Meta-Pandemic’ GLAMers Project

13 February 2023, Limassol, Cyprus, and Online

The Cyprus University of Technology invites you to the final event of the ‘Meta-Pandemic’ GLAMers Project (Erasmus+, 2021-2023). The project offered advice and training opportunities to support galleries, libraries, archives and museums (GLAMs) in their digital transformation through the participation of youth. It allowed GLAMs to implement and then assess the digitally enhanced activities and map social changes in the relationship between GLAMs and youth with regard to cultural and civic values for better societies.

The final event, which is organised by CLARIAH-CY, will explore the main results of the project, and will be an opportunity to take stock of the post-pandemic situation of European GLAMs.

Read more



‘Humanities and AI – One Answer to a Million Questions?’ Hybrid Lecture

16 February 2023, 15:00-16:15, Utrecht, Netherlands, and Online

The general current perspective from the field of Artificial Intelligence (AI) on language could be summarised as ‘an endless stream of very interesting data’. The latest AI developments, Large Language Models (LLMs) such as ChatGPT, seem to confirm this view. Trained on internet-scale amounts of language, their eye-catching capacity to generate text even seems to have brought Artificial General Intelligence a step closer. But what is the role of theory in this data-driven age? And what is the impact of LLMs on fields such as digital humanities?

During this hybrid lecture, Professor Antal van den Bosch (member of the CLARIN Board of Directors) and Assistant Professor Dong Nguyen (Utrecht University) will discuss the impact of the latest AI developments on linguistics and the field of digital humanities in general.

Read more


Call for Papers: DARIAH Annual Event 2023

6 - 9 June 2023, Budapest, Hungary

Collections in libraries, archives and museums have been at the heart of humanities research for centuries. However, with the current focus on data-driven research, data management plans and the research data lifecycle, in what ways do we need to think differently about cultural heritage collections as data? This year’s DARIAH Annual Event is centred on the question of whether cultural heritage data can be seen as humanities research data. The event aims to explore questions related to sustainable access, workflows, data management, and training around digitised and born-digital data. The organisers invite contributions from (digital) humanities researchers, cultural heritage professionals, humanities data stewards, as well as computer, information and data scientists.

Deadline for submissions: 1 February 2023

Read more

Call for Papers: 44th Annual Conference of the International Computer Archive for Modern and Medieval English (ICAME)

17-21 May 2023, Vanderbijlpark, South Africa

The conference theme is ‘English Going Places, Corpora Crossing Spaces’. Participants are invited to reflect on how the diffusion of English across spaces – physical, virtual or metaphorical – contributes to shaping the language.

Deadline for submissions: 10 February 2023

Read more



Call for Papers: SIGHUM Workshop 

2 or 6 May 2023, Dubrovnik, Croatia (date tbc)

In conjunction with EACL 2023, SIGHUM will organise the 7th Joint Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature. The workshop will focus on new challenges for research, such as noisy, non-standard textual or multi-modal input, historical languages, literary texts, vague research concepts, and multilingual parts within one document. Topics of interest include the adaptation of NLP tools to cultural heritage, social sciences, humanities and literature, automatic error detection and discourse analysis.

Deadline for submissions: 13 February 2023

Read more 

All EACL workshops 

19th Workshop on Multiword Expressions 

2 or 6 May 2023, Dubrovnik, Croatia (date tbc)

In conjunction with EACL 2023, the Special Interest Group on the Lexicon (SIGLEX) of the Association for Computational Linguistics (ACL) is organising a full-day workshop on multiword expressions (MWEs). MWEs are word combinations that exhibit lexical, syntactic, semantic, pragmatic, and/or statistical idiosyncrasies (e.g. by and large, hot dog, pay a visit). Their meaning often does not result from the direct combination of the meanings of their parts. Given their irregular nature, MWEs often pose complex problems in linguistic modelling, NLP tasks, and end-user applications (e.g. natural language understanding). The workshop will bring together researchers in various NLP subfields, encouraging them to submit MWE-related research. There will also be a joint session with the Clinical NLP workshop. Since clinical texts contain a significant number of multiword expressions (e.g. medical terms or domain-specific collocations), the goal of the joint session is to foster future synergies that could address scientific challenges in the creation of resources, models and applications to deal with MWEs.

Read more

All EACL workshops 

1st European Summer School in Artificial Intelligence (ESSAI 2023)

24 - 28 July 2023, Ljubljana, Slovenia

The European Summer School in Artificial Intelligence (ESSAI) is a new annual summer school with the ambition to become the central meeting place for students and young researchers in Artificial Intelligence (AI) to discuss current research and share knowledge. ESSAI provides an interdisciplinary setting in which courses are offered in all areas of AI, and also from wider scientific, historical, and philosophical perspectives. 

Read more



34th European Summer School in Logic, Language and Information (ESSLLI 2023)

31 July-11 August 2023, Ljubljana, Slovenia

The ESSLLI Summer School provides an interdisciplinary setting in which courses and workshops in logic, linguistics and computer science are offered. Courses are for all levels and cover a wide variety of topics within three interdisciplinary areas of interest: language and computation, logic and language, and logic and computation. In addition to the workshops and courses, there will be four evening lectures, given by prominent researchers, on topics that are at the forefront of research in logic, language and computer science, as well as from wider scientific, historical, and philosophical perspectives.

Read more

5th Learner Corpus Research Summer School

3 - 7 July 2023, Louvain-la-Neuve, Belgium

This Summer School is organised by the Centre for English Corpus Linguistics (CECL), and will introduce learner corpus research through a series of lectures, workshops and hands-on sessions. It is intended both for (junior and seasoned) researchers who have recently embarked on a learner corpus project and for those who simply want to know more about this exciting field of research.

Read more


Call for Participation: Shared Task on Multilingual Grammatical Error Detection

22 May 2023, Tórshavn, Faroe Islands

The Computational SLA working group invites participation in the first shared task on Multilingual Grammatical Error Detection, MultiGED-2023, which includes five languages: Czech, English, German, Italian and Swedish. The aim of the shared task is to detect tokens in need of correction across these five languages, labelling each token as either correct or incorrect, i.e. performing binary classification at the token level. Participants can work on one or any combination of the provided languages.

The shared task is part of the NLP4CALL workshop, which will take place on 22 May 2023, co-located with the NoDaLiDa conference.

Read more




Postdoctoral Researcher in Digital Humanities – Global History of the Portuguese Empire

Tadeusz Manteuffel Institute of History of the Polish Academy of Sciences, Warsaw, Poland

The project focuses on communication patterns of imperial commoners who produced similar characteristics, narratives, and thinking strategies in the Atlantic Portuguese Empire, displaying attitudes that transcended class and gender, and how the Empire responded to that. The postdoctoral researcher is expected to work on research questions related to the experiences of non-elite residents of the Portuguese Empire and how these changed across space and time. Using cutting-edge technology, the project offers innovative perspectives on the history of the Portuguese Empire and provides new insights into the workings of colonial social networks.

Read more