This edition of the CLARIN café is organised by: Eva Soroli, CLARIN Ambassador, Associate Professor, University of Lille, France; Thomas Gaillat, Associate Professor, University of Rennes, France; Franck Cinato, CNRS researcher, University of Paris, France. The CLARIN host will be Eva Soroli.
Date: 29 April 2022
Time: 14:00-16:15 (CEST)
Venue: CLARIN virtual Zoom meeting
Twitter hashtag: #CLARINcafe
A linguistic corpus is a collection of language productions (text/oral/multimodal data) selected and brought together in order to reveal something about human behaviour. Bilingual and multilingual corpora are very common in language studies and are relevant to researchers working in domains such as historical linguistics, language acquisition, language variation, dialectology and typology.
The objective of this Café is to focus on the features of bi-/multilingual parallel, comparable and dialectal corpora (new or already published), and provide demonstrations on how to collect/build, annotate, explore, analyse and archive them in an interoperable way.
Each demo will include hands-on presentations and best practice recommendations for:
- Bi-/multilingual corpus building/collection and metadata
- Corpus exploitation (principles and tools of transcription, annotation)
- Data exploration, cleaning, output reorganisation and analysis
- FAIR issues and perspectives for knowledge sharing.
The European Infrastructure CLARIN and its Knowledge Centres
Eva SOROLI, University of Lille, France
CORLI (Corpus, Language and Interactions): a CLARIN Knowledge-Centre
Christophe PARISSE, University of Nanterre & Céline POUDAT, Université Côte d'Azur, France
The multidialectal corpus of the Crescent dialects: collection, exploitation and analysis
Maximilien GUERIN, University of Paris & CNRS - HTL (UMR 7597)
14.50-15.00 Questions & Discussion
Building CIEP+, the parallel Corpus of Indo-European Prose Plus
Annemarie VERKERK & Luigi TALAMO, Universität des Saarlandes, Germany
15.20-15.30 Questions & Discussion
A dynamic architecture to structure and analyse comparable learner corpora: the case of the French and English Corpus InterLangue (CIL)
Thomas GAILLAT, University of Rennes, LIDILE, France
15.50-16.00 Questions & Discussion
16.00-16.15 Wrap-up Session: Franck CINATO