CLARIN is a pan-European initiative that aims to build a research infrastructure for language resources, integrating numerous tools and resources in a distributed architecture and responding to the needs of researchers across the humanities and social sciences. CLARIN is being built on open standards, but also with the recognition that standards and guidelines are only one piece of a complex jigsaw that must be assembled to create reliable, durable and high-quality services. The Text Encoding Initiative (TEI) is a long-standing community that develops guidelines for the encoding of scholarly texts in XML and works with associated technologies. This workshop brings together those involved in these two sets of activities to share experiences and knowledge, and to find ways to work together productively on the next generation of infrastructure services.
10:00 Keynote address: TEI for written historical corpora: why and how? - Alexander Geyken (Berlin-Brandenburg Academy of Sciences) abstract presentation (pdf)
11:30 Presentation 1: The new corpus query engine KorAP: connections with CLARIN and the TEI - Andreas Witt & Piotr Bański (Institut für Deutsche Sprache) presentation (pdf)
12:00 Presentation 2: Poio API: a CLARIN-D curation project for language documentation and language typology - Peter Bouda (Centro Interdisciplinar de Documentação Linguística e Social, Minde) abstract (pdf) presentation (pdf)
12:30 Presentation 3: TEI, ALTO and METS: why we need all of them - Günter Mühlberger (University of Innsbruck) abstract (pdf) presentation (pdf)
13:00 lunch (not provided by conference organisers - see conference website for local restaurants)
14:30 Presentation 4: TEI and the Component Metadata Framework - Matej Durco and Karlheinz Mörth (Austrian Academy of Sciences) abstract (pdf) presentation (pdf)
15:00 Presentation 5: WebLicht's Text Corpus Format: susTEInability of CLARIN-D web services? - Jens Stegmann (University of Stuttgart)
15:30 Panel discussion: Responses: problems and opportunities - Arianna Ciula, Karlheinz Mörth and Laurent Romary
16:30 Panel discussion part 2: Next steps
Background and further information
The organizing committee of this workshop invited proposals for presentations on topics which link together CLARIN and the TEI, including:
- the role of the TEI in developing standards for CLARIN services,
- technical issues in the integration of TEI-conformant resources or TEI-aware tools in CLARIN services,
- barriers and problems with the deployment and linking of CLARIN and TEI technologies,
- training, awareness and advocacy activities.
Presenters are asked not to simply present an overview of their work, but to focus on precisely how, why (or why not) TEI formats, guidelines and technologies are being deployed, and to go into some technical detail to do this if necessary.
It is hoped that this will be only the start of promoting dialogue and collaboration between CLARIN and the TEI at many levels. One result would be an improved dialogue about the use of the TEI in higher-level initiatives to develop standards for the CLARIN architecture, but another would be enhanced engagement directly with the TEI community of developers and researchers in the many centres and institutions related to CLARIN.
TEI for written historical corpora: why and how?
Dr Alexander Geyken, Berlin-Brandenburg Academy of Sciences
Martin Wynne (Chair)
Oxford e-Research Centre
University of Oxford
Institute for Corpus Linguistics and Text Technology
Austrian Academy of Sciences
KU Leuven / U.Utrecht
Belgium / the Netherlands
Institut für Deutsche Sprache
Xavier Gomez Guinovart
Seminario de Linguistica Informatica
Universidade de Vigo