CLARIN Annual Conference (CLARIN2023)
16 - 18 October 2023, Leuven, Belgium (hybrid event)
Extended Deadline: Call for Abstracts
The call for extended abstracts for the CLARIN Annual Conference 2023 (CLARIN2023) has been extended. We welcome submissions until 28 April 2023.
Registration for the CLARIN Annual Conference will open in June 2023. Stay tuned!
Call for Nominations for PhD Session at CLARIN2023
National consortia are invited to put forward a candidate for the PhD Session at this year’s conference. During the session, PhD students can share their CLARIN-related research and receive feedback on their work from CLARIN experts. Selected candidates will be invited to submit a 500-word abstract of their PhD project, clearly outlining how their work makes use of the CLARIN infrastructure or how the results will contribute to CLARIN. All submitted abstracts will be published on the CLARIN website, as part of the conference programme, and the PhD students will be invited to submit their full papers for the Selected Papers publication after the conference.
Confirmed Keynote Speakers at CLARIN2023
We are delighted to announce that Laurence Devillers, Professor of Computer Science Applied to Social Sciences at Paris-Sorbonne University, and Jörg Tiedemann, Professor of Language Technology at the Department of Digital Humanities at the University of Helsinki, will deliver keynote lectures at the 12th CLARIN Annual Conference.
Obituary Chris Cieri (1963-2023)
We are very sad to announce the passing of Christopher Cieri in March 2023 at the age of 59. Chris, based in Philadelphia, PA, USA, was a long-term Executive Director of the Linguistic Data Consortium (LDC), leading it for 25 years. Chris established strong relationships between the LDC and like-minded organisations, collaborators, and sponsors, resulting in a vital global network that supported the growing language technology community. CLARIN is honoured to have been part of this network. Chris contributed to CLARIN as a member of its Scientific Advisory Board from 2015 until 2021, as well as to several CLARIN centres, such as LINDAT/CLARIAH-CZ, where he also served as a member of the International Advisory Board. The whole CLARIN network will miss Chris deeply.
New Impact Story: Chatbots and Copyright: CLARIN Café Addresses Key Issue for DH
The release of ChatGPT, Bing Chat and, more recently, GPT-4, has turned the spotlight on language models and generative chatbots. The interest in AI-generated language data and human-machine interaction is significant, underpinned by extensive coverage in the media. However, many of the discussions tend to focus on the chatbots’ ability to produce content, and their potential uses in the future. Few address the specific issues relevant to those working with digital language resources.
CLARIN’s legal experts - the CLARIN Legal and Ethical Issues Committee, or CLIC - made this topic the focus of the latest CLARIN Café. Entitled 'Do Chatbots Dream of Copyright? Copyright in AI-generated Language Data', the Café brought together a group of experts from different fields to explore the legal implications of working with or using AI-generated texts, and specifically whether these texts can, and should, be protected by copyright.
CLARIN Signs CoARA Agreement
CoARA, the Coalition for Advancing Research Assessment, has published an agreement which establishes a common direction for research assessment reform, while respecting organisations’ autonomy. CoARA’s Agreement on Reforming Research Assessment sets a shared direction for changes in assessment practices for researchers and research-performing organisations, with the overarching goal to recognise diverse outputs, practices and activities that maximise the quality and impact of research. The agreement includes the principles, commitments and a timeframe for reforms, and lays out the principles for a coalition of organisations willing to work together to implement the changes. To date, more than 500 organisations have signed the agreement and CLARIN is proud to be one of them.
New Resource Family: Corpus Query Tools
Text and corpus analysis lie at the heart of digital scholarship in the humanities and social sciences, and a wide range of software tools are available in this domain. The software applications included in this new Resource Family allow searching, exploring, analysing and visualising linguistic corpora and texts. The tools represent prime examples of the ways in which language technologies can support research across a range of disciplines, and are therefore central to CLARIN’s mission. The Resource Family includes applications for installation on users’ own computers (desktop), as well as those accessible via a web browser (online), with some key information in order to help users choose the best option for their particular research goal.
UPDATES FROM THE NATIONAL CONSORTIA
#LancsBox X: A Blogpost by CLARIN-UK
In a guest blog post on the CLARIN-UK website, Professor Vaclav Brezina from the ESRC Centre for Corpus Approaches to Social Science (CASS) at Lancaster University presents the latest version of the #LancsBox desktop corpus analysis application. #LancsBox X is a powerful tool for the analysis of very large corpora, and the latest version has a new Lucene-based architecture, with new features including the ability to ingest XML texts and to run customisable R scripts. The blog post follows a CLARIN-UK workshop introducing the new version of the tool, which had more than 1,300 online participants.
BLOGS & REPORTS
Knowledge Exchange Report: ‘Building the Plane as We Fly It: The Promise of Persistent Identifiers’
This report is the main outcome of a study commissioned by Knowledge Exchange (KE), for which FIN-CLARIN’s Martin Matthiessen (CSC) has co-authored the scoping paper. The study set out to investigate ‘Risks and trust in pursuit of a well functioning Persistent Identifier infrastructure for research’. The investigation analysed the current state of the Persistent Identifier (PID) landscape in the six KE partner countries and beyond. In addition, it examined the roles of relevant stakeholders such as PID service providers, higher education institutions, researchers, publishers and national libraries. The report describes the benefits of an efficient PID infrastructure and explains why it is a necessity for the research community. It also provides recommendations for each stakeholder group to support the successful implementation and management of PID processes. Seven complementary case studies accompany the report.
TRAINING AND EDUCATION
Training Resource of the Month: ‘The GATE Toolkit for Language Processing’
This resource, developed by Diana Maynard and Kalina Bontcheva, focuses on teaching the use of GATE, a freely available open-source toolkit for Natural Language Processing (NLP) that is widely used in academia and industry for many different NLP tasks.
The modules provide instructions on how to get to grips with the GATE toolkit for basic language processing, as well as more advanced techniques, and include a number of different scenarios, such as processing social media, hate speech and misinformation detection. The resource includes modules both for programmers who want to further develop their own tools within the toolkit, and for non-programmers who only want to make use of existing tools. In addition, the modules provide information on how to adapt GATE to one’s own needs (for example, to adapt English tools to a different language).
Digital Humanities Course Registry: What’s New
This month, we showcase three courses from the University of Helsinki. The Digital Humanities Department, in close collaboration with the FIN-CLARIN research infrastructure, offers the following courses to MA and PhD students:
- Data Clinic introduces students to various aspects of Linguistic Research Data Management, with a focus on the handling of text and speech language data
- Introduction to Speech Analysis teaches students how to handle speech recordings using Praat
- Corpus Linguistics and Statistical Methods provides students with basic skills in corpus linguistics, showing them how to use technology to compile a corpus for research purposes.
All courses are taught both in Finnish and English by Mietta Lennes, who is a teacher, researcher, and co-chair of the CLARIN User Involvement Committee. For more information about the courses and application procedure, please check the university website.
The DH Course Registry is a discovery platform of courses, programmes and continuing education training in the Digital Humanities, maintained by DARIAH and CLARIN. If you would like to promote your DH-related activities via the registry, you can create an account here.
CLARIN and Other SSH Platforms: ‘You’ll Never Innovate Alone’
Several recent initiatives in the SSH domain have led to the launch of discovery portals that help their audiences find and access digital data. The diversity in discipline-specific portals comes with interesting functionalities and offers welcome support for a range of use cases. However, it may not always be easy for researchers to know which is the right portal for their task, or the most suitable platform for depositing their resources. For a fruitful co-existence in the ecosystem and a coherent offer of infrastructural support, it is crucial that providers of SSH services collaborate and innovate together. In this panel at CLARIN2022, representatives of several portals explain how they see their role in the SSH landscape and discuss the interoperability and complementarity of the different platforms.
EVENTS & CALLS
CLARIN TwinTalks 4 Workshop at DH2023
10 July 2023, 9:00 - 12:30 (CEST), Graz, Austria
This joint initiative by CLARIN and DARIAH will form part of the DH 2023 Conference: Collaboration and Opportunity and follows previous successful TwinTalks workshops. The main objective of the workshop is to develop a better understanding of the dynamics when researchers, teachers and/or professionals with different – but often overlapping – areas of competence engage in remote collaboration to address humanities research questions.
The workshop will also explore how the education and training of humanities scholars, cultural heritage professionals and technical experts can help to make remote collaboration across disciplines more efficient, effective, innovative, and inclusive for all participants.
The organisers invite submissions reporting on all aspects and stages of engaging in remote collaborative research and teaching in DH, including the obstacles encountered and solutions found. Position papers on the role of research infrastructures in facilitating remote collaboration in DH are also welcome.
Deadline for submissions: 15 May 2023
CLARIN At The European Parliament Event ‘Humanities in the Digital Age: Securing Innovation and Empowering Democracy’
26 April 2023, 14:30 (CEST)
Digital Humanities (DH) lies at the crossroads of social sciences and computer sciences, focused on the use of computational methods in the humanities, social and behavioural sciences. This includes the use of digital analysis and visualisation techniques in, among others, (art) history, languages and instructional sciences, the use of digital games in social sciences, and the use of digital technologies in the sector of museums, archives and libraries. This event will explore, highlight and discuss the opportunities and challenges related to the research and development of DH and their implications for policy decisions in areas from education, culture and research to communications and security. Maciej Ogrodniczuk will present CLARIN’s flagship project ParlaMint to illustrate how DH offers insight and foresight in relation to social and political impact.
CLARIAH-NL Launches Series of Lunch Lectures
Starting Thursday, 20 April 2023, 12:00 - 13:00 (CEST)
CLARIAH-NL is starting a new series of Lunch Lectures, which offer a great opportunity to get a glimpse of all kinds of research surrounding the CLARIAH research infrastructure, and a chance to exchange ideas with researchers from different domains in an open, informal setting.
In the first Lunch Lecture, Peter Verhaar, Digital Scholarship Librarian and Lecturer at Leiden University, discusses the outcomes of the project ‘Connecting to the Network of Cultural Heritage’. The project has resulted in a range of teaching materials and interactive Jupyter notebooks that encourage students and researchers to make use of the large quantities of open data collections made available by cultural heritage institutions.
34th European Summer School in Logic, Language and Information (ESSLLI 2023)
31 July-11 August 2023, Ljubljana, Slovenia
The ESSLLI Summer School provides an interdisciplinary setting for courses and workshops in logic, linguistics and computer science. The courses are suitable for all levels and cover a wide variety of topics within three interdisciplinary areas of interest: language and computation, logic and language, and logic and computation. In addition to the workshops and courses, there will be four evening lectures, given by prominent researchers on topics that are at the forefront of research in logic, language and computer science, as well as from wider scientific, historical, and philosophical perspectives.
Early-bird registration closes 31 May 2023.
First European Summer School in Artificial Intelligence (ESSAI 2023)
24 - 28 July 2023, Ljubljana, Slovenia
The European Summer School in Artificial Intelligence (ESSAI) is a new annual summer school with the ambition to become the central meeting place for students and young researchers in Artificial Intelligence (AI) to discuss current research and share knowledge. ESSAI provides an interdisciplinary setting in which courses are offered in all areas of AI, and also from wider scientific, historical, and philosophical perspectives.
Registration closes 15 May 2023.
Call for Papers: International Conference on CMC and Social Media Corpora for the Humanities
14 - 15 September 2023, Mannheim, Germany
This conference is the 10th anniversary edition of an annual conference series dedicated to the collection, annotation, and exploration of corpora of computer-mediated communication (CMC) and social media for research in the humanities. The conference brings together language-centred research on CMC and social media in linguistics, philology, media and communication sciences, and social sciences with research questions from the fields of corpus and computational linguistics, language technology, text technology, and machine learning.
Deadline for submissions: 30 April 2023
EURIZON Fellowship Program: Remote Research Grants for Ukrainian Researchers
This new EURIZON Fellowship Program is dedicated to supporting Ukrainian researchers whose work is hampered or prevented by Russia's war against Ukraine. The programme is aimed at teams of vulnerable researchers and other related staff from Ukrainian research institutes (PhD students, engineers, technicians, etc.) from all scientific domains. The temporary grants, funded by the European Union through the EURIZON H2020 project, are intended to enable Ukrainian research teams to continue their scientific work, remotely, by participating in research projects with partners from European research institutes. The proposed research projects can have a duration of 6 or 12 months. Research teams should consist of at least two members.
Deadline for applications: 8 May 2023
Competition: Shared Task on Speaker Attribution in Newswire and Parliamentary Debates
A new shared task on speaker attribution in German political discourse is taking place this spring and summer under the auspices of the GermEval Campaign. The goal of the shared task is the identification of speakers in political debates as well as in news articles, and the attribution of speech events to their respective speakers. Being able to identify this information automatically - identifying who says what to whom - is a necessary prerequisite for a deep semantic analysis of unstructured text. Training and development data for the tasks are now available on GitHub, with submissions opening on 1 July 2023.
The workshop for the shared task will be held as part of the Conference for Natural Language Processing (KONVENS 2023) in Ingolstadt, Germany, in September 2023.
Save the Date: LREC-COLING 2024
20-25 May 2024, Turin, Italy (hybrid event)
The ELRA Language Resources Association (ELRA) and the International Committee on Computational Linguistics (ICCL) will jointly organise the 2024 International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024).
Organised as a hybrid event, the conference brings together researchers and practitioners in computational linguistics, speech, multimodality, and natural language processing, with special attention on the evaluation and development of resources that support work in these areas. Following in the tradition of the well-established parent conferences COLING and LREC, the joint conference will address challenges and provide an opportunity to exchange information and ideas through both oral presentations and extensive poster sessions, complemented by a social programme. The three-day main conference will be accompanied by a total of three days of workshops and tutorials held in the days immediately before and after.
Call for Papers: Journal of Open Humanities Data
The Journal of Open Humanities Data (JOHD) publishes open-access, peer-reviewed publications describing humanities research objects with high potential for reuse, including (annotated) linguistic corpora, ontologies, and lexicons. The editors invite submissions of both short papers and full-length research papers. JOHD provides immediate open access to its content on the principle that making research freely available to the public supports a greater global exchange of knowledge.
Open Science Officer
DARIAH ERIC, Berlin, Germany
DARIAH is looking for an independently minded and highly motivated Open Science Officer to contribute to the design and implementation of policy statements, guidelines and services related to the open dissemination of research outputs in the arts and humanities. Tasks include helping design, foster and implement Open Science policies within DARIAH and across the DARIAH networks, representing arts and humanities scholars in European Open Science policy debates, advocating Open Access and Open Science best practices through lectures, workshops, consultations and training measures, and managing the DARIAH Open Science services portfolio (OpenMethods, DARIAH Open, the OS helpdesk and the DARIAH OpenAIRE Gateway). The position is full-time and preferably located at the DARIAH Coordination Office in Berlin, although remote applications from highly qualified candidates will be considered.
Deadline for applications: 30 April 2023
Research Job Opportunities at HiTZ
HiTZ Basque Centre For Language Technology, San Sebastian and Bilbao, Spain
The HiTZ research centre at the University of the Basque Country (UPV/EHU) invites applications for several funded positions in Natural Language and Speech Processing. The centre brings together more than 60 multidisciplinary researchers belonging to two different research groups (IXA and Aholab) who have 30 years’ experience in the processing of spoken and written data. HiTZ is part of the Spanish CLARIN K-centre, and its main activities focus on training, research and innovation in the area of language technologies, with special attention on the computational processing of Basque. Currently, the centre is offering five PhD positions and six positions for researchers, both pre- and post-doctoral. Characteristics such as duration, starting date or stipend depend on each specific position.