CLARIN Annual Conference (CLARIN2023)
16 - 18 October 2023, Leuven, Belgium (hybrid event)
Registration for the Bazaar Is Now Open
The CLARIN Bazaar is an informal space at the CLARIN Annual Conference where you can meet people from other centres and share experiences and ideas. If you would like to have a stall in the CLARIN Bazaar, please register here.
Deadline for registration: 29 September 2023
Call for Submissions: Using CLARIN in Training and Education
The CLARIN Annual Conference 2023 will dedicate a session to Training and Education to showcase initiatives and best practices in the SSH community for integrating CLARIN’s services, resources and tools into teaching and training. Teachers, trainers and curriculum designers are invited to submit a 500-word abstract describing their initiative, project, course or training material, specifying how they used the CLARIN infrastructure to provide their target group with the skills required to interact with the infrastructure’s services, resources and/or tools. We also welcome submissions illustrating the adoption or adaptation of existing training materials published on our Learning Hub. All selected relevant submissions will be presented at the CLARIN Annual Conference and showcased on the CLARIN website. The session will be in hybrid format and remote presentation is possible.
Deadline for submissions: 15 July 2023
Registration for the CLARIN Annual Conference 2023 will open in July!
Vacancy at CLARIN ERIC: Member of the Board of Directors
As of September 2023, CLARIN will have an opening for the position of member of the Board of Directors (20% FTE), who will work closely together with the other directors. The appointment will be for a term of two years, with the possibility of prolongation for another two years.
Deadline for applications: 30 June 2023
New Publication: ‘The CLARIN Infrastructure as an Interoperable Language Technology Platform for SSH and Beyond’
This open-access paper, published in the journal Language Resources and Evaluation, provides an overview of the CLARIN infrastructure, with a particular focus on tool interoperability, ease of access to research data, tools and services, the importance of sharing knowledge, and community building. By taking the FAIR principles into account from the very beginning, CLARIN has become a successful example of a research infrastructure that is actively used by its members and the wider community.
CLASSLA K-Centre Releases Web Corpora for Croatian, Serbian and Slovenian
The CLASSLA Knowledge Centre for South Slavic Languages is delighted to announce the release of the pilot versions of the CLASSLA web corpora for Croatian (2.3bn words), Serbian (2.4bn words) and Slovenian (1.9bn words). These pilot versions are intended to gather valuable user feedback (please get in touch via email@example.com). The official versions, for all South Slavic languages, will be released later this year. The main features of the corpora are considerable size, recency, automatically identified genre, and linguistic processing with the improved CLASSLA-Stanza annotation pipeline. For ideas on how to use the corpora for research, users are invited to read the step-by-step tutorial on the use of CLASSLA web corpora via the open CLARIN.SI concordancers.
CLARIN Café: A New CLARIN Resource Family for Lexical Semantic Change Research
5 July 2023, 14:00-16:00 (CEST), virtual event
The field of lexical semantic change (LSC) has attracted growing interest recently. However, the available resources and tools to conduct this type of research are currently scattered across different CLARIN Resource Families (CRF) or are not included in any CRF at all. In this CLARIN Café, we will present our groundwork for the creation of a new CRF for LSC research. As our preliminary work mainly focuses on historical languages, we would like to solicit contributions from the CLARIN community to cover further languages. The event will be accompanied by a tutorial explaining how the components of the CRF can be used for LSC research. This event is open to the entire CLARIN community, and is aimed especially at researchers working on diachronic semantics, annotation of word senses, and the construction of resources and tools for research on lexical semantics and LSC.
CLARIN2022: Selected Papers Now Published
This volume presents the highlights of the eleventh CLARIN Annual Conference in 2022 (CLARIN2022), edited by Tomaž Erjavec and Maria Eskevich. CLARIN2022 was held from 10 to 12 October 2022 as a hybrid event in Prague, Czechia.
UPDATES FROM THE NATIONAL CONSORTIA
Czech ASR for Oral History Archives
Two LINDAT/CLARIAH-CZ team members, Jan Švec and Martin Bulín from the Department of Cybernetics, University of West Bohemia in Pilsen, attended the EHRI-CLARIN Oral Testimonies Workshop 2023, held at King’s College London from 15 to 17 May. They presented their state-of-the-art speech recognition system, developed at the University of West Bohemia and designed specifically for oral history archives. They also demonstrated the semantic search technology built on top of recognised oral history interviews. Their research and development results received very positive feedback and will be shared in a CLARIN Impact Story soon.
New CLARIN-IT Member
CLARIN-IT is pleased to announce that the Università degli Studi di Napoli ‘L'Orientale’ (UNIOR) became a member of the Italian National Consortium in May 2023. The membership establishes that the metadata relating to language resources and tools available at the headquarters of UNIOR are deposited in the repository of the ILC4CLARIN centre. The persons appointed to be actively involved in CLARIN-IT are Prof. Johanna Monti and Dr Maria Pia di Buono.
Jakob Lenardič Receives Dr Anna Mayer Kansky Prize
Jakob Lenardič has been awarded the Dr Anna Mayer Kansky prize for his outstanding doctoral thesis at the University of Ljubljana. Using CLARIN resources, his thesis studied the Middle High Germanic constructions in Slovene, East Slavic and some Germanic languages within the framework of post-minimalist grammar.
TRAINING AND EDUCATION
Training Resource of the Month: ‘Privacy by Design in Research’
The aim of this training material, developed by Esther Hoorn and Henk van den Heuvel, is to teach researchers how to perform a data protection impact assessment for their research project in order to enable responsible re-use of archived speech corpora.
UPSKILLS Learning Content and Educational Games Available on Moodle
The courses produced in the UPSKILLS Erasmus+ project are now available for public view on the Moodle platform (you need to create an account here first). Please note that the learning content blocks are still under review and they will be finalised by the end of August. Teachers and programme coordinators who intend to use parts of the UPSKILLS learning content blocks in the upcoming two academic years are kindly invited to let us know via this registration form. Should you have any feedback after browsing or using the UPSKILLS materials, please contact the block's content creators directly.
Report from the 4th UPSKILLS Multiplier Event - Learn by Playing? Upskilling Linguistics and Languages HE Students with the Aid of Educational Games
The 4th UPSKILLS Multiplier Event took place on 30 May in Malta and focused on the creation and implementation of educational games in higher-education linguistics and language-related courses. A full account of the event, including slides and recordings, is available.
News from the DH Course Registry
The DH Course Registry is a discovery platform of courses, programmes and continuing education training in the Digital Humanities, maintained by DARIAH and CLARIN. We are excited to announce that a new search bar has been implemented in the DH Course Registry to improve user experience. You can now search for DH courses in the database by their name or the name of their institution. This functionality is an addition to the existing search filters.
New Courses Available Now
Two new courses have been added to the Digital Humanities Course Registry by Koenraad de Smedt, CLARIN National Coordinator for Norway:
- Language and computers, a 15 ECTS BA course starting on 15 January 2024, which gives an introduction to formal methods in linguistics. The course is open to exchange students.
- Introduction to programming for NLP in Python, an online self-study course, which is part of the BA physical programme.
CLARIN and DARIAH Joint Workshop on Research Infrastructures, DH Masters and Industry
As a joint initiative of DARIAH and CLARIN, the ‘Digital Humanities and Industries: Identifying Employment Niches’ workshop was organised as part of the DARIAH Annual Conference in Budapest on 6 June. The aim of the event was to bridge the gap between the DH Master’s programmes and the needs of future employers in the ever-evolving digital landscape. More than 30 DH programme directors and representatives shared their insights on the topic through remote interviews and onsite discussions. The outcomes of the workshop will be captured in a white paper, which will be opened up for feedback later this year. You can access the presentation and a digital booklet showcasing DH programme profiles on Zenodo.
Meet CLARIN Ambassador Eva Soroli
The CLARIN Ambassadors Programme encourages participation in CLARIN ERIC in disciplines and communities that are not yet fully integrated in CLARIN. Ambassador Eva Soroli is Associate Professor of Psycholinguistics at the University of Lille, and lead of the Language Re-appropriation team of the CNRS (National Centre of Scientific Research) STL lab. Trained in linguistics, cognitive science and psycholinguistics, she was awarded her PhD at the University of Paris 8. She is interested in the relationship between language and thought in typical and atypical populations. Her current projects involve experimental and corpus investigations from a cross-linguistic perspective combining offline and online data in several research domains: bilingualism, L1/L2 acquisition, language assessment, aphasia speech, language and cognitive processing.
EVENTS & CALLS
Corpus Linguistics 2023 (CL2023)
3 - 6 July 2023, Lancaster, UK
The international Corpus Linguistics conference (CL2023) will focus on the topics of corpus methodology, theoretical corpus linguistics, lexis and lexicon, and applications of corpus linguistics, as well as register variation, sociolinguistics, pragmatics, language variation and applied linguistics. Although CL2023 is an in-person event, it will be possible to follow the conference online for free.
Joint Conference on Research on Text Analytics
4 July 2023, Mannheim, Germany
Within the German National Research Data Infrastructure (NFDI), text data is an important source for analysis, for example in the consortia that deal with humanities research (Text+) and Business, Economic and Related Data (BERD@NFDI). Although the two consortia address different communities and research questions, their methods overlap, including machine learning, aspects of language models and other natural language processing techniques. The conference seeks to present the different application areas and identify possible synergies between similar methods.
CLARIN-CH Event: ‘Management of Sensitive and Personal Data: Ethical & Legal Aspects for Linguistic Data in Switzerland’
29 September 2023, virtual event
When it comes to Open Research Data, the management of sensitive and personal data as well as dealing with ethical and legal aspects for linguistic data can be challenging. In an effort to address this, the CLARIN-CH consortium, in partnership with national research infrastructures such as SWISSUbase/LaRS and LiRI, will launch the CLARIN-CH working group (WG) ‘Management of Sensitive and Personal Data: Ethical & Legal Aspects for Linguistic Data in Switzerland’. This kick-off event for the WG will focus on data collection, protection and preservation and their associated procedures, with respect to different types of linguistic data, such as multimodal, historical and experimental data, data from social media, and data from different age groups.
Symposium on Corpus Approaches to Lexicogrammar (LxGr2023)
6 - 8 July 2023, virtual event
The focus of LxGr is the interaction of lexis and grammar, envisioning the two as notional ends of a continuum (lexicogrammar). LxGr primarily focuses on papers reporting on corpus-based research on any aspect of the interaction of lexis and grammar, as well as position papers discussing theoretical or methodological issues relevant to both lexicogrammar and corpus linguistics.
Deadline for registration: 4 July 2023
Call for Papers: DHASA2023 on ‘Digital Humanities for Inclusion’
27 November - 1 December 2023, Nelson Mandela University, Eastern Cape, South Africa
The Digital Humanities Association of Southern Africa (DHASA) is pleased to announce its fourth conference on the theme ‘Digital Humanities for Inclusion’. The conference aims to foster growth and collaboration in the field of Digital Humanities (DH) and provides a comprehensive overview of the current state of the art in DH, particularly within the Southern Africa region. As such, the organisers particularly welcome submissions by individuals from Southern Africa or on research focused on the geographical area of Southern Africa.
DHASA2023 welcomes interdisciplinary contributions from researchers in various domains of DH, including, but not limited to, language, literature, visual art, performance, media studies, music, history, sociology, psychology, language technologies, library studies, philosophy, methodologies, software and computation.
Deadline for submissions: 15 August 2023
Call for Papers: JOHD Collection ‘Representing the Ancient World through Data’
The Journal of Open Humanities Data (JOHD) features peer-reviewed publications describing humanities research with a high potential for reuse and is planning a special collection entitled ‘Representing the Ancient World through Data’. From textual corpora and field reports to collections of inscriptions and museum catalogues, the study of the ancient world strongly benefits from the availability and distribution of well-curated datasets. It has become crucial to ensure the data is shared to enable reuse and reproducibility. Nevertheless, publication of ancient world data is still rare, leading to a widespread lack of recognition for work such as corpus design or data curation. This creates a problem for the field, especially because the specific characteristics of ancient world data (their fragmentary and often incomplete nature, the lack of interpretative frameworks accompanying the data themselves, and so on) create a unique set of challenges.
Deadline for submissions: 15 September 2023