CLARIN Technology

CLARIN is based on a distributed network of organisations (or centres) that host language resources and related services. These centres each have their own expertise. Within a single country, they are often grouped into a national consortium.

Each consortium appoints one centre as a representative in CLARIN's technical body, the Centre Committee. This is where most of the technical work happens: writing specifications, planning software development and organising the quality control for each of the centre candidates. The independent Centre Assessment Committee analyses each of the candidate centres and provides feedback with regard to compliance with the technical and organisational requirements.


CLARIN's Technical Pillars

  • Federated Identity: Letting users log in to protected data and services with their own institutional login and password
  • Persistent Identifiers: Enabling sustainable citations of electronic resources
  • Repositories: Digital archives where language resources can be stored, accessed and shared
  • Component Metadata and concept definitions: To ensure semantic interoperability when describing language resources
  • Content Search: Offering a search engine for a wide range of language resources
  • Web Service Chaining: Giving users the possibility to freely combine language processing services

Software Development

All software developed by CLARIN is Open Source and released under the GPL v3 license. Most software projects used to be hosted in our own SVN repository; new projects are managed via GitHub.


For CLARIN core software, see the CLARIN GitHub Repository. A list of other CLARIN-related software that is hosted at GitHub is available here.

Development Guidelines

If you intend to contribute code or to start a CLARIN software project, please make sure to read the Guidelines for Software Development (requires Trac access). More related information can be found at the EURISE network website and on the Development Information page.


Digital Object Gateway (DOG)

DOG is an abstraction layer for accessing Digital Objects that are referred to with a Persistent Identifier (PID) or URL. The end user of DOG does not need to know the specifics of how to access one particular data repository. Instead, the user sends the identifier of an object (typically the landing page) to DOG, which can then perform a range of standard operations to explore and process the object. See the Digital Object Gateway page for more information.
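
To illustrate the idea of accepting either a PID or a URL through one entry point, here is a minimal Python sketch. It is not DOG's actual API: the function name `to_resolvable_url` and the `hdl:` and `doi:` prefix handling are assumptions made for illustration, and the identifiers in the usage note are placeholders. The sketch relies only on the public Handle and DOI resolver proxies.

```python
from urllib.parse import urlparse

# Public resolver proxies for two common PID schemes.
HANDLE_PROXY = "https://hdl.handle.net/"
DOI_PROXY = "https://doi.org/"


def to_resolvable_url(identifier: str) -> str:
    """Turn a PID or URL into a URL that an HTTP client can fetch.

    Hypothetical helper: a real gateway would do far more, such as
    following redirects and reading repository-specific metadata.
    """
    identifier = identifier.strip()
    lowered = identifier.lower()

    # Handle PIDs written as "hdl:<prefix>/<suffix>".
    if lowered.startswith("hdl:"):
        return HANDLE_PROXY + identifier[4:]

    # DOIs written as "doi:<prefix>/<suffix>".
    if lowered.startswith("doi:"):
        return DOI_PROXY + identifier[4:]

    # Plain web addresses are passed through unchanged.
    parsed = urlparse(identifier)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return identifier

    raise ValueError(f"Unsupported identifier: {identifier!r}")
```

Calling `to_resolvable_url("hdl:1234/example")` returns `https://hdl.handle.net/1234/example`, while an `https://` URL is returned as given. The caller therefore never needs to know which resolver sits behind a given identifier.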

CLARIN for Programmers

Programmers who are interested in using and contributing to CLARIN's infrastructure can find more information about a range of topics, such as Open Source Code, APIs, Widgets and Notebooks, on our CLARIN for Programmers page. 


Learn More

If you want to learn more about the technology behind CLARIN, there are several sources of information: