CLARIN at LREC 2022 | CLARIN ERIC

Monday, 20 June - Saturday, 25 June 2022, All day

General Information

Date

21-22-23 June 2022: Main Conference
20-24-25 June 2022: Workshops & Tutorials

Location

Palais du Pharo, Marseille, France (floor map)

About

LREC is the major event on Language Resources (LRs) and Evaluation for Human Language Technologies (HLT). The conference provides an overview of the state-of-the-art regarding LRs and their applications. Participants can exchange information, discuss methodologies, industrial use cases and requirements coming from e-science and e-society, with respect to scientific and technological issues as well as policy and organisational ones.

CLARIN-related activities at LREC 2022

Contributions to the Main Conference

Workshops

ParlaCLARIN III Workshop – organised by CLARIN ERIC

Monday 20 June, from 9:00 to 13:00 and from 14:00 to 18:00

Palais du Pharo, Old Palace Level 1, Room: Grand Large (floor map)

The ParlaCLARIN III workshop at LREC2022 will focus on the topic of ‘Creating, Enriching and Using Parliamentary Corpora’. Parliamentary (language) data serves as a communication channel between elected political representatives and members of society, thus reflecting socio-politically relevant information. The development of accessible, comprehensive and well-annotated parliamentary corpora is crucial for a number of disciplines, such as political science, sociology, history, and (socio)linguistics. The workshop will bring together developers, curators and researchers of regional, national and international parliamentary debates from across diverse disciplines in the humanities and social sciences.

LEGAL 2022: Legal and Ethical Workshop – co-organised by Ingo Siegert, Khalid Choukri, Mickaël Rigault, Paweł Kamocki, Andreas Witt, Krister Lindén

Friday 24 June, from 9:00 to 13:00 and from 14:00 to 18:00

Palais du Pharo, Old Palace Level 2, Room: Mucem (floor map)

Deep learning technologies for language resources and the demand for high-quality data interactions have increased the need for data collections, which are largely subject to legal constraints. Legal frameworks continuously need to adapt to the advancements in technology, while also taking into consideration the interests of stakeholders. This workshop invites technology and legal experts to discuss current legal and ethical issues concerning human language technology.

SIGUL 2022 Workshop – organised by CLARIN-IT

Friday 24 - Saturday 25 June, from 14:00 to 18:00

Palais du Pharo, Old Palace Level 1, Room: Grand Large (floor map)

The first annual meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages (SIGUL 2022) will take place as part of the LREC2022 conference. The workshop will provide academic and industry researchers with a forum for networking, as well as discussing and presenting cutting-edge research in the sector of natural language processing for under-resourced languages. In the tradition of the CCURL-SLTU Workshop Series, SIGUL 2022 spans the research interest areas of less-resourced, under-resourced, endangered, minority and minoritised languages.

The 4th Financial Narrative Processing Workshop (FNP 2022) – co-organised, among others, by CLARIN ambassador Paul Rayson

Friday 24 June, from 9:00 to 13:00 and from 14:00 to 18:00

Palais du Pharo, Old Palace Level 1, Room: Estaque (floor map)

Oral and Poster Presentations

Day 1, Tuesday 21 June

11:40 - 13:00 (Poster Area 1) Session P1: Language Resource Infrastructures and Policy Issues. Chair: Labropoulou, Penny	Language Technologies for the Creation of Multilingual Terminologies. Lessons Learned from the SSHOC Project (Federica Gamba, Francesca Frontini, Daan Broeder and Monica Monachini)
15:15 - 16:35 (Auditorium) Session O5: Language Resource Policies and Management. Chair: Di Persio, Denise, Co-Chair: Frontini, Francesca	Ethical Issues in Language Resources and Language Technology – A Tentative Categorisation (Paweł Kamocki and Andreas Witt)
16:55 - 18:15 (Poster Area 1) Session P12: Evaluation and Validation Methodologies (1) Chair: Refaee, Eshrag Ali A.	The Subject Annotations of the Danish Parliament Corpus (2009-2017) - Evaluation with Automatic Multi-label Classification. (Costanza Navarretta and Dorte Haltrup Hansen)
16:55 - 18:15 (Poster Area 1) Session P10: Lexicons (1) Chair: Olsen, Sussi	Making a Semantic Event-type Ontology Multilingual (Zdenka Uresova, Karolina Zaczynska, Peter Bourgonje, Eva Fučíková, Georg Rehm and Jan Hajic; Charles University, German Research Center for Artificial Intelligence, Morningsun Technology, DFKI); NomVallex: A Valency Lexicon of Czech Nouns and Adjectives (Veronika Kolářová and Anna Vernerová; Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics)

Day 2, Wednesday 22 June

9:30 - 10:50 (Poster Area 2) Session P14: Corpora and Annotation (2) Chair: Ogrodniczuk, Maciej	ALIGNMEET: A Comprehensive Tool for Meeting Annotation, Alignment, and Evaluation (Peter Polák, Muskaan Singh, Anna Nedoluzhko and Ondřej Bojar); Spoken Language Treebanks in Universal Dependencies: an Overview (Kaja Dobrovoljc)
11:10 - 12:30 (Poster Area 1) Session P18: Corpora and Annotation (3) Chair: Montemagni, Simonetta	Evolving Large Text Corpora: Four Versions of the Icelandic Gigaword Corpus (Starkaður Barkarson, Steinþór Steingrímsson, Hildur Hafsteinsdóttir)
15:15 - 16:35 (Poster Area 2) Session P22: Lexicons (2) Chair: Yildiz, Olcay Taner	Constructing a Lexical Resource of Russian Derivational Morphology (Lukáš Kyjánek, Olga Lyashevskaya, Anna Nedoluzhko, Daniil Vodolazsky and Zdeněk Žabokrtský)
15:15 - 16:35 (Poster Area 2) Session P26: Dialogue and Conversational Systems (2) Chair: Hartholt, Arno	ELITR Minuting Corpus: A Novel Dataset for Automatic Minuting from Multi-Party Meetings in English and Czech (Anna Nedoluzhko, Muskaan Singh, Marie Hledíková, Tirthankar Ghosal and Ondřej Bojar)
15:15 - 16:35 (Poster Area 2) Session P24: Evaluation and Validation Methodologies (2) Chair: Zeldes, Amir	Quality and Efficiency of Manual Annotation: Pre-annotation Bias (Marie Mikulová, Milan Straka, Jan Štěpánek, Barbora Štěpánková and Jan Hajic; Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics)
16:55 - 18:35 (Poster Area 1) Session P27: Corpora and Annotation (4) Chair: Pęzik, Piotr	The Bulgarian Event Corpus: Overview and Initial NER Experiments (Petya Osenova, Kiril Simov, Iva Marinova and Melania Berbatova)

Day 3, Thursday 23 June

9:30 - 10:50 (Salle 120) Session O31: Document Classification, Text Categorisation Chair: Volk, Martin Co-Chair: Zhang, Mike	HeLI-OTS, Off-the-shelf Language Identifier for Text (Tommi Jauhiainen, Heidi Jauhiainen, Krister Lindén)
9:50 - 10:10 (Salle 92) Session O32: Lexicon and WordNet Chair: Vossen, Piek Co-Chair: Frontini, Francesca	Towards the Construction of a WordNet for Old English (Fahad Khan, Francisco J. Minaya Gómez, Rafael Cruz González, Harry Diakoff, Javier E. Diaz Vera, John P. McCrae, Ciara O'Loughlin, William Michael Short and Sander Stolk)
15:15 - 16:35 (Poster Area 2) Session P38: Less-Resourced Languages (2) Chair: Soroa, Aitor	Latvian National Corpora Collection – Korpuss.lv (Baiba Saulite, Roberts Darģis, Normunds Gruzitis, Ilze Auzina, Kristīne Levāne-Petrova, Lauma Pretkalniņa, Laura Rituma, Peteris Paikens, Arturs Znotins, Laine Strankale, Kristīne Pokratniece, Ilmārs Poikāns, Guntis Barzdins, Inguna Skadiņa, Anda Baklāne and Valdis Saulespurēns)
15:35 - 15:55 (Salle 120) Session O37: Anaphora and Coreference Chair: Magnini, Bernardo Co-Chair: De Bruyne, Luna	CorefUD 1.0: Coreference Meets Universal Dependencies (Anna Nedoluzhko, Michal Novák, Martin Popel, Zdeněk Žabokrtský, Amir Zeldes and Daniel Zeman)

Friday 24 June (remote)

Session: R2 - Corpora and Annotation	Universal Grammatical Dependencies for Portuguese with CINTIL Data, LX Processing and CLARIN support. (António Branco, João Silva, Luís Gomes, João Rodrigues)

Contributions to Co-located Events

Oral and Poster Presentations at Co-located Workshops

Monday 20 June

Immigration in the Manifestos and Parliament Speeches of Danish Left and Right Wing Parties between 2009 and 2020 (Costanza Navarretta, Dorte Haltrup Hansen and Bart Jongejan; Accepted at ParlaCLARIN III)
What if Ground Truth is Subjective? Personalized Deep Neural Hate Speech Detection (Kamil Kanclerz, Marcin Gruza, Konrad Karanowski, Julita Bielaniewicz, Piotr Milkowski, Jan Kocon and Przemyslaw Kazienko; Accepted at NLP Perspective workshop)
StudEmo: A Non-aggregated Review Dataset for Personalized Emotion Recognition (Anh Ngo, Agri Candri, Teddy Ferdinan, Jan Kocon and Wojciech Korczynski; Accepted at NLP Perspective workshop)

Friday 24 June

9:30 - 9:50 Extending the SSJ Universal Dependencies Treebank for Slovenian: Was it Worth it? (Kaja Dobrovoljc and Nikola Ljubešić; Accepted at LAW XVI, the 16th Linguistic Annotation Workshop)
11:40 - 12:40 Advantages of a complex multilayer annotation scheme: The case of the Prague Dependency Treebank (Eva Hajičová, Marie Mikulová, Barbora Štěpánková and Jiří Mírovský; Accepted at LAW XVI, the 16th Linguistic Annotation Workshop)

CLARIN Booth at LREC2022

CLARIN will be present with a booth throughout the conference. Visit us to get to know CLARIN better, to talk to people from the CLARIN network, or to browse our latest publications. At the booth you can also watch tutorial videos, such as ‘CLARIN and EOSC’ and ‘CLARIN and notebooks’.

Booth Attendance Schedule

	Tuesday 21	Wednesday 22	Thursday 23
Morning coffee break	11:20 - 11:40 Members of the CLARIN ERIC Board of Directors: Franciska de Jong, Dieter Van Uytvanck	10:50 - 11:10 Kaja Dobrovoljc, dedicated to the paper ‘Spoken Language Treebanks in Universal Dependencies: an Overview’ (Kaja Dobrovoljc); Petya Osenova and Kiril Simov, dedicated to the paper ‘The Bulgarian Event Corpus: Overview and Initial NER Experiments’ (Petya Osenova, Kiril Simov, Iva Marinova and Melania Berbatova)	Fahad Khan, dedicated to the paper ‘Towards the Construction of a WordNet for Old English’ (Fahad Khan, Francisco J. Minaya Gómez, Rafael Cruz González, Harry Diakoff, Javier E. Diaz Vera, John P. McCrae, Ciara O'Loughlin, William Michael Short and Sander Stolk)
13:00 - 14:30 Lunch	Francesca Frontini and Monica Monachini, dedicated to the paper ‘Language Technologies for the Creation of Multilingual Terminologies. Lessons Learned from the SSHOC Project’ (Federica Gamba, Francesca Frontini, Daan Broeder and Monica Monachini)	Starkaður Barkarson, dedicated to the paper ‘Evolving Large Text Corpora: Four Versions of the Icelandic Gigaword Corpus’ (Starkaður Barkarson, Steinþór Steingrímsson, Hildur Hafsteinsdóttir)	Tommi Jauhiainen, dedicated to the paper ‘HeLI-OTS, Off-the-shelf Language Identifier for Text’ (Tommi Jauhiainen, Heidi Jauhiainen and Krister Lindén; University of Helsinki)
Afternoon coffee break	Paweł Kamocki, dedicated to the paper ‘Ethical Issues in Language Resources and Language Technology – A Tentative Categorisation’ (Paweł Kamocki and Andreas Witt)

Address

Marseille
France