Programme CLARIN Annual Conference 2018

CLARIN Annual Conference 2018
Location: Hotel Galilei Congress in Pisa
Date: Monday 8 October 2018 - Wednesday 10 October 2018
For general information on the Conference, see the event page.

Preliminary programme

Day 1: Committee meetings, opening ceremony, invited talk, and parallel paper sessions


Monday 8 October 2018 Room
09:00-10:30 Committee meetings I (members and invitees only)  
  CLARIN National Coordinators' Forum (NCF) meeting Vip Hall
  Centre Assessment Committee (CAC) meeting Mig Hall

Coffee break

Galilei Foyer
11:00-13:00 Committee meetings II (members and invitees only)  
  CLARIN National Coordinators' Forum (NCF) meeting Vip Hall
  User Involvement Group meeting Tornado Hall
  Standing Committee on CLARIN Technical Centres (SCCTC) meeting Mig Hall
  Taskforce CMDI + Metadata Curation meeting Lear Hall


14:00-16:00 Committee meetings III (members and invitees only)  
  User Involvement Group meeting Tornado Hall
  Knowledge Sharing Infrastructure Committee (KSI) meeting Lear Hall
  CLARIN Standards Committee (CSC) meeting Mig Hall
  CLARIN Legal Issues Committee (CLIC) meeting Vip Hall
  Taskforce PID / Taskforce FCS + AAI meeting Concorde Hall

Coffee break

Galilei Foyer
  Start of the conference  

CLARIN 2018 opening ceremony
Franciska de Jong (Executive Director), Inguna Skadiņa (Programme Chair) and Monica Monachini (Local Organizer)

Concorde Hall


Invited talk 1
CLAMS: Computational Linguistic Applications for Multimedia Services  
James Pustejovsky (Brandeis University, USA)

Concorde Hall


Paper session 1 (parallel): CLARIN in Relation with Other Infrastructures and Projects

Chair: Lars Borin

  • Using Linked Data Techniques for Creating an IsiXhosa Lexical Resource - a Collaborative Approach. Thomas Eckart, Bettina Klimek, Sonja Bosch and Dirk Goldhahn. 
  • A Platform for Language Teaching and Research (PLT&R). Maria Stambolieva, Valentina Ivanova and Mariyana Raykova. 
  • Curating and Analyzing Oral History Collections. Cord Pagenstecher.

Concorde Hall / Tornado Hall


Paper session 2 (parallel): CLARIN Knowledge Infrastructure, Legal Issues and Dissemination

Chair: Kadri Vider

  • New exceptions for Text and Data Mining and their possible impact on the CLARIN infrastructure. Pawel Kamocki, Erik Ketzan, Julia Wildgans and Andreas Witt. 
  • Processing personal data without the consent of the data subject for the development and use of language resources. Aleksei Kelli, Krister Lindén, Kadri Vider, Pawel Kamocki, Ramūnas Birštonas, Silvia Calamai, Chiara Kolletzek, Penny Labropoulou and Maria Gavrilidou. 
  • Toward a CLARIN Data Protection Code of Conduct. Pawel Kamocki, Erik Ketzan, Julia Wildgans and Andreas Witt. 
Concorde Hall / Tornado Hall



Shuttle bus departure to the Conference welcome reception location

Welcome reception (drinks + food)

 Villa Comel

Day 2: General sessions, invited talk, thematic session, student session, and Bazaar


Tuesday 9 October 2018


State of the Infrastructure
CLARIN ERIC Board of Directors 

Concorde Hall
09:40-10:00 Presentation on CLARIN in Italy Concorde Hall
10:00-10:30 Coffee break Galilei Foyer


Thematic session: Multimedia, Multimodality, Speech


Invited talk 2

Chair: Maciej Piasecki
Speech and gestures: computational linguistic studies
Costanza Navarretta (University of Copenhagen, Denmark)

Concorde Hall

Thematic papers 

Chair: Jan Odijk

  • EXMARaLDA meets WebAnno. Steffen Remus, Hanna Hedeland, Anne Ferger, Kristin Bührig and Chris Biemann. 
  • Human-human, human-machine communication: on the HuComTech multimodal corpus. Laszlo Hunyadi, Tamás Váradi, István Szekrényes, György Kovács, Hermina Kiss and Karolina Takács. 
  • Oral History and Linguistic Analysis. A Study in Digital and Contemporary European History. Florentina Armaselu, Elena Danescu and François Klein. 
  • The Acorformed Corpus: Investigating Multimodality in Human-Human and Human-Virtual Patient Interactions. Magalie Ochs, Philippe Blache, Grégoire Montcheuil, Jean-Marie Pergandi, Roxane Bertrand, Jorane Saubesty, Daniel Francon and Daniel Mestre. 
  • Media Suite: Unlocking Archives for Mixed Media Scholarly Research. Roeland Ordelman, Liliana Melgar, Carlos Martinez-Ortiz and Julia Noordegraaf. 
Concorde Hall
13:00-14:30 Lunch Restaurant

Student paper slam

Chair: Darja Fišer

  • Automatic genre identification with machine learning methods. Andrea Dömötör. 
  • Improving OCR of historical newspapers and journals published in Finland by adding Swedish training data. Senka Drobac. 
  • Utilising Large Quantities of Found Audio Data. Per Fallgren. 
  • Dramatic Languages: Foreign Languages in the Writings of Peter Handke. Vanessa Hannesschläger. 
  • Instrument of parliamentary discourse analysis – Saeima debate corpus. Aija Kažoka. 
  • Automatization of Detection of Information-Dense Texts. Danguolė Kalinauskaitė. 
  • Revealing relationships between Old Norse texts using computer-assisted methods. Katarzyna Anna Kapitan. 
  • Speech Analysis in the Clarin-PL Project. Mariusz Kleć. 
  • Corpus-driven conversational agents: tools and resources for multimodal dialogue systems development. Maria Di Maro. 
  • Archival Dynamics : the Langues de France project and the building of the Judeo-Spanish Oral Archive (JSOA). Panteleimon Mavrogiannis. 
  • Modeling Lexical Knowledge for Natural Language Processing. Alexander Popov. 
  • Completing the BLARK for Portuguese with finely-tuned Distributional Semantic Models. João Rodrigues. 
  • Studying the ins and outs of external possession with the CLARIN infrastructure. Jolien Scholten. 
  • Word Embeddings for Cross-Language Learning in Low-Resource Languages. Tadej Škvorc. 
  • Automatic Collocation Identification Using Word Embeddings. Yana Strakatova. 
  • Using the Dutch Parallel Corpus to Calculate English Dutch Word Translation Entropy. Bram Vanroy. 
  • Regional variation in spoken Russian. Benedikte Vardoy. 
Concorde Hall
15:00-16:30 Student posters Galilei Foyer
16:30-17:00 Coffee break Galilei Foyer

Short reports on Committee meetings, by the respective chairs

Concorde Hall


CLARIN Bazaar Galilei Foyer



Shuttle bus departure to the Conference dinner location

Conference dinner

Stazione Leopolda

Day 3: Parallel paper sessions, posters, general sessions and closing session


Wednesday 10 October 2018 Room

Paper session 3 (parallel): Use of the CLARIN infrastructure

Chair: António Branco

  • From Language Learning Platform to Infrastructure for Research on Language Learning. David Alfter, Lars Borin, Ildikó Pilán, Therese Lindström Tiedemann and Elena Volodina. 
  • Bulgarian Language Technology for Digital Humanities: a focus on the Culture of Giving for Education. Kiril Simov and Petya Osenova. 
  • Multilayer Corpus and Toolchain for Full-Stack NLU in Latvian. Normunds Grūzītis and Artūrs Znotiņš. 
  • (Re-)Constructing “public debates” with CLARIAH MediaSuite tools in print and audiovisual media. Berrie van der Molen, Jasmijn van Gorp and Toine Pieters. 
  • Improving Access to Time-Based Media through Crowdsourcing and CL Tools: WGBH Educational Foundation and the American Archive of Public Broadcasting. Karen Cariani and Casey Davis-Kaufman. 
Concorde Hall / Tornado Hall


Paper session 4 (parallel): Design and construction of the CLARIN infrastructure

Chair: Tomaž Erjavec

  • Discovering software resources in CLARIN. Jan Odijk. 
  • Towards a protocol for the curation and dissemination of vulnerable people archives. Silvia Calamai, Chiara Kolletzek and Aleksei Kelli.
  • Versioning with Persistent Identifiers. Martin Matthiesen and Ute Dieckmann. 
  • Interoperability of Second Language Resources and Tools. Elena Volodina, Maarten Janssen, Therese Lindström Tiedemann, Nives Mikelic Preradovic, Silje Karin Ragnhildstveit, Kari Tenfjord and Koenraad de Smedt. 
  • Tweak Your CMDI Forms to the Max. Rob Zeeman and Menzo Windhouwer. 
Concorde Hall / Tornado Hall

Coffee break

Galilei Foyer

Poster slam

Chair: Kiril Simov

  • CLARIN Data Management Activities in the PARTHENOS Context. Marnix van Berchum and Thorsten Trippel. 
  • Integrating language resources in two OCR engines to improve processing of historical Swedish text. Dana Dannélls and Leif-Jöran Olsson.
  • Looking for hidden speech archives in Italian institutions. Vincenzo Galatà and Silvia Calamai. 
  • Setting up the PORTULAN / CLARIN centre. Luís Gomes, Frederico Apolónia, Ruben Branco, João Silva and António Branco. 
  • LaMachine: A meta-distribution for NLP software. Maarten van Gompel and Iris Hendrickx. 
  • XML-TEI-URS: using a TEI format for annotated linguistic ressources. Loïc Grobol, Frédéric Landragin and Serge Heiden. 
  • Visible Vowels: a Tool for the Visualization of Vowel Variation. Wilbert Heeringa and Hans Van de Velde.
  • ELEXIS - European lexicographic infrastructure. Milos Jakubicek, Iztok Kosem, Simon Krek, Sussi Olsen and Bolette Sandford Pedersen. 
  • Sustaining the Southern Dutch Dialects: the Dictionary of the Southern Dutch Dialects (DSDD) as a case study for CLARIN and DARIAH. Jacques Van Keymeulen, Sally Chambers, Veronique De Tier, Jesse de Does, Katrien Depuydt, Tanneke Schoonheim, Roxane Vandenberghe and Lien Hellebaut. 
  • SweCLARIN – Infrastructure for Processing Transcribed Speech. Dimitrios Kokkinakis, Kristina Lundholm Fors and Charalambos Themistokleous. 
  • TalkBankDB: A Comprehensive Data Analysis Interface to TalkBank. John Kowalski and Brian MacWhinney. 
  • L2 learner corpus survey – Towards improved verifiability, reproducibility and inspiration in learner corpus research. Therese Lindström Tiedemann, Jakob Lenardič and Darja Fišer. 
  • DGT-UD: a Parallel 23-language Parsebank. Nikola Ljubešić and Tomaž Erjavec. 
  • DI-ÖSS - Building a digital infrastructure in South Tyrol. Verena Lyding, Alexander König and Elisa Gorgaini. 
  • Linked Open Data and the Enrichment of Digital Editions: the Contribution of CLARIN to the Digital Classics. Monica Monachini, Francesca Frontini, Anika Nicolosi and Fahad Khan. 
  • How to use DameSRL: A framework for deep multilingual semantic role labeling. Quynh Ngoc Thi Do, Artuur Leeuwenberg, Geert Heyman and Marie-Francine Moens. 
  • Speech Recognition and Scholarly Research: Usability and Sustainability. Roeland Ordelman and Arjan van Hessen. 
  • Towards TICCLAT, the next level in Text-Induced Corpus Correction. Martin Reynaert, Maarten van Gompel, Ko van der Sloot and Antal van den Bosch. 
  • SenSALDO: a Swedish Sentiment Lexicon for the SWE-CLARIN Toolbox. Jacobo Rouces, Lars Borin, Nina Tahmasebi and Stian Rødven Eide. 
  • Error Coding of Second-Language Learner Texts Based on Mostly Automatic Alignment of Parallel Corpora. Dan Rosén, Mats Wirén and Elena Volodina. 
  • Using Apache Spark on Hadoop Clusters as Backend for WebLicht Processing Pipelines. Soheila Sahami, Thomas Eckart and Gerhard Heyer. 
  • UWebASR – Web-based ASR engine for Czech and Slovak. Jan Švec, Martin Bulín, Aleš Pražák and Pavel Ircing. 
  • Pictograph Translation Technologies for People with Limited Literacy. Vincent Vandeghinste, Leen Sevens and Ineke Schuurman. 
Concorde Hall

Poster and demo session

Galilei Foyer
13:00-14:00 Lunch Restaurant

State of the Technical Infrastructure
Dieter Van Uytvanck

Concorde Hall

14:30-14:45 Award ceremony Concorde Hall


Looking forward
Board of Directors

Acknowledgements and goodbye
Inguna Skadiņa

Concorde Hall
15:30 End of conference  

Day 4: Post-conference SAB meeting


Thursday 11 October 2018

09:00-10:30 Scientific Advisory Board (closed meeting) Lear Hall
10:30-11:00 Coffee break Lear Hall
11:00-13:00 Scientific Advisory Board (closed meeting) Lear Hall
13:00 Lunch Restaurant