New Impact Story: Using a Monitor Newspaper Corpus to Trace Changing Language as a Result of COVID-19
This month’s Impact Story ‘Using a Monitor Newspaper Corpus to Trace Changing Language as a Result of COVID-19’ illustrates how changes in language in response to a crisis can be traced almost in real time. The study examines the linguistic changes that occurred in the Norwegian language during the first wave of the COVID-19 pandemic.
Read more here
Tour de CLARIN: Poland
The latest centre to be featured in the Tour de CLARIN series is the CLARIN-PL B-Centre. Its mission is to maintain and promote tools and services that are already available within the CLARIN-PL infrastructure, such as the Polish wordnet plWordNet, the valency lexicon Walenty, and the topic modelling tool Topic, as well as to create completely new tools and resources. The interview on the CLARIN-PL B-Centre features Olga Czeranowska, a sociologist working in the field of migrant studies who used the CLARIN-PL sentiment analyser MultiEmo to tag the sentiment of tweets by Polish migrants during the COVID-19 pandemic.
CLARIN Resource Families: Tools
Text normalisation is the process of transforming parts of a text into a single canonical form. It represents one of the key stages of linguistic processing for texts in which spelling variation abounds or deviates from the contemporary norm, such as in texts published in historical documents or on social media.
The CLARIN infrastructure offers fourteen tools for text normalisation. Most of the tools are aimed at normalising texts within a single language, while some have a very broad multilingual scope. Half of the tools are dedicated normalisers, while the others provide additional functionalities, such as PoS-tagging, lemmatisation and named entity recognition.
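As a rough illustration of what normalisation means in practice, the following minimal Python sketch maps spelling variants to a canonical form with a lookup table. The table entries and the function name are hypothetical and are not taken from any CLARIN tool; the tools listed in the resource family typically rely on much larger lexicons or trained models.

```python
import re

# Hypothetical variant-to-canonical lookup (illustrative only);
# real normalisers use large lexicons or statistical models.
VARIANTS = {
    "u": "you",
    "gr8": "great",
    "vniuersitie": "university",
}

# Split text into word tokens, punctuation runs and whitespace runs,
# so the original spacing can be restored when joining.
TOKEN = re.compile(r"\w+|[^\w\s]+|\s+")


def normalise(text: str) -> str:
    """Replace each known spelling variant with its canonical form."""
    pieces = []
    for token in TOKEN.findall(text):
        # Lookup is case-insensitive; unknown tokens pass through unchanged.
        pieces.append(VARIANTS.get(token.lower(), token))
    return "".join(pieces)


print(normalise("u did gr8 at the Vniuersitie!"))
```

A lookup table like this only covers variants that have been listed in advance, which is one reason dedicated normalisers for historical or social media text go beyond simple substitution.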
UPDATES FROM THE NATIONAL CONSORTIA
CLARIN-LV Organises Practical Seminar for Digital Humanities Researchers on Corpus Use
On 12 January, CLARIN-LV organised a tutorial to show digital humanities (DH) researchers how to find and use language data efficiently. The practical seminar was designed to give DH researchers hands-on knowledge of how data can be retrieved from language corpora and to introduce a new feature of korpuss.lv, the federated search in the diverse collection of Latvian language corpora.
Participants included DH researchers from the Institute of Latvian Language, Faculty of Humanities of UL, and Liepāja University. The materials can also be used by other DH researchers who need support in their corpus studies, including researchers from the SAFMORIL K-centre.
The new version of the Latvian corpora website korpuss.lv was developed by researchers from the Artificial Intelligence Laboratory of the Institute of Mathematics and Computer Science, University of Latvia (IMCS UL). It lists all the corpora developed by IMCS UL in close collaboration with other academic institutions and private companies (corpora are part of the CLARIN-LV repository content).
BLOGS & REPORTS
Final Release of the SSH Open Marketplace
The SSH Open Marketplace is a resource platform for academics and scholars of the social sciences and humanities, and was developed within the Social Sciences and Humanities Open Cloud project (SSHOC).
Currently, the SSH Open Marketplace hosts more than 5,000 records and serves as a discovery portal: with just a single search term, users can get access to the larger research context, consisting of tools, related tutorials, use cases, datasets, and more. Additionally, the Marketplace database is now also being curated by the community, which allows users to enrich and extend the existing content. Not only will community curation benefit the SSH domain, but it will also serve the larger European Open Science Cloud (EOSC) so that the content and its creators can gain visibility in many other disciplines.
Together with European Research Infrastructures CESSDA and DARIAH and the other SSHOC project partners, CLARIN will continuously extend the user base and content of the SSH Open Marketplace to ensure sustainability.
Read more and start exploring
TRAINING AND EDUCATION
A Recap on CLARIN Café: Towards Guidelines for Integrating CLARIN into Teaching
The CLARIN Café ‘Towards Guidelines for Integrating CLARIN into Teaching - Lessons Learnt from UPSKILLS’ took place on 1 December 2021 and was organised and hosted by Iulianna van der Lek and Darja Fišer. The Café continued the discussions from the 2019 CLARIN@Universities workshop on the role of the humanities infrastructures in training and education. The main aim was to share the first results from the UPSKILLS project and discuss the type of support and guidelines teachers and trainers need when using the CLARIN infrastructure in order to teach data processing and management skills in language-related disciplines.
CLARIN Training Calls
There are three funding calls available for teachers and trainers seeking either opportunities to collaborate with CLARIN or to integrate CLARIN resources and services into their training programmes.
Teaching with CLARIN Call: If you have experience using CLARIN resources and services in training and education, use this call to showcase your teaching or training materials within the CLARIN network.
Application deadline: 30 June 2022
Trainer Network Programme: Apply for this call if you would like to participate in the CLARIN Trainer Network, a group of experts who carry out training events at prominent summer schools, conferences or give guest lectures on CLARIN topics at universities.
CLARIN Training Suite: If you are interested in developing new teaching or training materials on either general or domain-specific topics using the CLARIN services, resources and tools, you are welcome to submit your proposal for the CLARIN Training Suite.
Open Access Guide Book on Humanities Data Analysis
Humanities Data Analysis: Case Studies with Python by Karsdorp, Kestemont and Riddell is a practical guide book on data science in the humanities using the Python programming language. It is now available as an Open Access interactive Jupyter Book, and covers four themes: parsing and manipulating data, modelling and data representation, creating data visualisations, and working on real-world case studies with publicly available data sets. A printed version of the book is available through Princeton University Press.
CLARIN2021: State of the Technical Infrastructure
Watch the presentation by Dieter Van Uytvanck on the ‘State of the Technical Infrastructure’. This session took place on 27 September during the first day of CLARIN2021. For more detailed information and slides, please visit the CLARIN2021 programme page.
EVENTS & CALLS
Call for Papers: ParlaCLARIN III Workshop at LREC2022
20 June 2022, Marseille, France
The ParlaCLARIN III workshop at LREC2022 will focus on the topic of ‘Creating, Enriching and Using Parliamentary Corpora’. Parliamentary (language) data serves as a communication channel between elected political representatives and members of society, thus reflecting socio-politically relevant information. The development of accessible, comprehensive and well-annotated parliamentary corpora is crucial for a number of disciplines, such as political science, sociology, history, and (socio)linguistics. The workshop will bring together developers, curators and researchers of regional, national and international parliamentary debates from across diverse disciplines in the humanities and social sciences.
Deadline for submissions: 15 March 2022
COESO Open Call for Pilot Project Funding
The Collaborative Engagement on Societal Issues project (COESO) is funding up to five pilot projects (up to twelve months, 50,000 euros). Projects should be in the disciplines of the social sciences and humanities, and address a societal challenge. COESO is looking for collaborative project proposals that connect researchers and stakeholders from outside academia. Applicants must be legal entities, such as civil society organisations, non-governmental organisations, higher education institutions, research centres and small and medium enterprises.
Deadline first round of applications: 30 January 2022
EOSC ‘Ask Me Anything’ Series
Starting 1 February 2022, 14:00 CET (monthly)
The EOSC Future project is launching a series of webinars under the heading ‘Ask Me Anything’. These monthly Q&A sessions, planned to take place on every first Tuesday of the month, will boost uptake among users as well as show potential service providers how their resources can be featured on EOSC. The first session will be on the topic of data storage.
Online Forum on ‘Innovation, Technologies and Plurilingualism’
7 - 9 February 2022, online event
Within the framework of the French Presidency of the Council of the European Union, this two-day event offers a broad spectrum of topics, including European policy, diversity, the impact and future of digital resources, and artificial intelligence. Ministers, writers and researchers, among others, will share their thoughts and lead workshops and discussions on ‘Innovation, Technologies and Plurilingualism’.
Call for Papers: Workshop on Audio Collection Human Interaction (AudioCHI 2022)
14 March 2022, online event
We engage with spoken material for a variety of reasons, including entertainment, current affairs, education, and research. The AudioCHI workshop explores human factors in interaction with spoken audio content in search settings, including live stream audio and collections. The workshop seeks to bring together researchers in spoken content retrieval, researchers in interactive information retrieval, and researchers interested in engagement with speech data. The organisers encourage the submission of short position papers in which authors introduce their interests in the topics of the workshop.
Deadline for submissions: 11 February 2022
Call for Papers: Spatial Humanities 2022
7 - 9 September 2022, Ghent, Belgium
On the topic of geospatial technologies and their contribution to humanities research, this conference will bring together researchers from the disciplines of history, archaeology, literary studies, linguistics, and anthropology, as well as from interdisciplinary and/or technical fields including digital humanities and computer science. The organisers welcome proposals in various formats, including short papers and workshops.
Deadline for submissions: 14 February 2022