CLARIN is an acronym for Common Language Resources and Technology Infrastructure.
CLARIN aims to unite existing digital archives in Europe that contain language-based material into a federation that gives the social sciences and humanities research communities unified access to the content. It also wants to make the wealth of language and speech processing tools developed in recent years available to interested researchers, with a view to opening up new research avenues. Another goal of CLARIN is to provide web-based services that allow non-expert users (especially humanities and social sciences researchers without a technological background) to perform complex tasks on the materials contained in the archives, such as ‘Summarize Le Monde of March 17 2008 – in Polish’.
How do we unite existing archives from all over Europe into a single federation? How can we manage the wide variability of conventions for describing resources, in many different languages? How might we chain together existing tools and applications, with their specific expectations in terms of input and output formats? Can we help to provide a minimum level of technological coverage for all languages, irrespective of the number of speakers? How do we protect the intellectual property rights of those who have provided data or tools? How do we ensure that whatever infrastructure facilities we manage to build will be sustainable?
These questions should be answered by the end of the CLARIN preparatory phase, at the end of 2010.
Not yet. CLARIN is presently approaching the end of the preparatory phase (2008-2010), in which the technological and organizational specifications are being finalized. The infrastructure will be set up during the construction phase (2011-2015) and will be in operation from 2016 onwards. In the meantime, however, we do aim to create some prototype services.
At the end of the preparatory phase, CLARIN intends to make the transition towards a more permanent structure. CLARIN is now investigating the possibility of becoming an ERIC (European Research Infrastructure Consortium), a legal entity based on EU law (Article 171 of the EC Treaty). The ERIC framework was designed to facilitate the joint establishment and operation of research facilities of European interest (see the European Commission site for more details).
CLARIN aims to bring together producers and consumers of language resources and technologies. A producer is a contributor of linguistic data, tools or services. Producers are mainly organisations, such as universities or industrial companies, that collect and annotate textual or speech corpora, or that invent and implement natural language processing technologies or applications. A consumer is a researcher or a group in need of linguistic data or processing technologies. In CLARIN, the consumer is seen mainly as a scholar in the humanities or social sciences. Although the consumer may have no computational linguistics background, CLARIN aims to help address research problems that involve the processing of linguistic data.
In the preparatory phase (2008-2010) CLARIN is funded by the EU through the 7th Framework ESFRI programme. One of the objectives of the preparatory phase is to come up with cost estimates for the construction and exploitation phases. The main funders will then be the national governments, with a possible minor contribution from the EU for some generic costs of the infrastructure. The cost estimate will include Europe-wide aspects such as comprehensive training and education programmes.
CLARIN publishes a newsletter every three months and a newsflash every month. Have a look at the newsletter and the newsflash; you can also subscribe to them if you would like to receive these updates by email.

