Frequently Asked Questions - CLARIN resources

CLARIN is not the direct owner of any resource. It only provides access to a series of repositories where information about existing resources can be found. There are three main categories of resources: data, services and tools.

Several ways to access these resources are available:

  • browse them using the Virtual Language World or the CLARIN Language Resource/Language Tool inventory (you can find them both inside the Virtual Language Observatory);

Another source of information regarding existing resources is LT-World. LT-World is also a valuable source of information about linguistic terms and types of resources from a theoretical point of view.
Note that LT-World is not the entry point into the CLARIN world of information. Once the CLARIN centres are in place, an extensive search service will provide access to all types of resources.

All kinds of relevant linguistic data: textual and speech corpora, either raw or accompanied by metadata (annotated), lexica, grammars, video (sign-language recordings) and multimedia (text and speech, video, speech and subtitles), etc.

Yes, though not in CLARIN itself but through CLARIN: you can use the faceted search tool for data to find a specific type of linguistic data.

If you are a member, you can add data by filling in the form here.

The procedure to update existing (meta)data is described here.

Tools are programs that perform specific transformations on linguistic data.

Yes, you can use the faceted search interface for tools to search for various NLP tools. You can filter these tools by type, language, platform, license and author organization.
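To make the idea of filtering by facets concrete, here is a minimal Python sketch. It is not the CLARIN search interface or its API: the tool records, their values and the function names `filter_tools` and `facet_counts` are all invented for illustration. Only the facet names (type, language, platform, license, organization) come from the answer above.

```python
from collections import Counter

# Hypothetical tool records. The keys mirror the facets named in the FAQ;
# every value here is made up for the example.
TOOLS = [
    {"name": "tok-a", "type": "tokenizer", "language": "German",
     "platform": "Linux", "license": "GPL", "organization": "Org A"},
    {"name": "tag-b", "type": "tagger", "language": "German",
     "platform": "Windows", "license": "Academic", "organization": "Org B"},
    {"name": "tok-c", "type": "tokenizer", "language": "Dutch",
     "platform": "Linux", "license": "GPL", "organization": "Org A"},
]


def filter_tools(tools, **facets):
    """Keep only the tools whose fields equal every requested facet value."""
    return [t for t in tools
            if all(t.get(key) == value for key, value in facets.items())]


def facet_counts(tools, facet):
    """Count how many tools fall under each value of a single facet."""
    return Counter(t[facet] for t in tools)


if __name__ == "__main__":
    german_tokenizers = filter_tools(TOOLS, type="tokenizer", language="German")
    print([t["name"] for t in german_tokenizers])
    print(facet_counts(TOOLS, "platform"))
```

Each additional facet narrows the result set, and the per-facet counts are what a faceted interface typically shows next to each filter value.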

If you are a member, you can add tools by filling in the form here.

Services are the online equivalent of tools. The difference between a tool and a service is that a tool needs to be run locally (where the data is), while a service runs remotely. When using a web service, the input data and the program that does the processing can reside on different machines. The data is transferred via a network protocol to the remote server, and the output results are transferred back when the processing is complete.
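The difference between a tool and a service can be sketched in a few lines of Python. The example below is an illustration only, not a real CLARIN service: `tokenize` is a toy whitespace tokenizer standing in for a tool, `TokenizeHandler` and `call_service` are hypothetical names, HTTP is assumed as the transfer protocol, and the "remote" server runs on the local machine purely so that the example is self-contained.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def tokenize(text):
    """The 'tool': runs locally, in the same process as the data."""
    return text.split()


class TokenizeHandler(BaseHTTPRequestHandler):
    """The 'service': the same processing, reachable over HTTP."""

    def do_POST(self):
        # Read the input data sent by the client.
        length = int(self.headers.get("Content-Length", 0))
        text = self.rfile.read(length).decode("utf-8")
        # Process it on the server side and send the result back as JSON.
        body = json.dumps(tokenize(text)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the example quiet


def call_service(url, text):
    """Send the input to the service and return the decoded result."""
    request = urllib.request.Request(url, data=text.encode("utf-8"), method="POST")
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return json.loads(response.read().decode("utf-8"))


def demo(text):
    """Run the same input through the local tool and through the service."""
    server = HTTPServer(("127.0.0.1", 0), TokenizeHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_address[1]}/"
        return tokenize(text), call_service(url, text)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    local_result, remote_result = demo("textual and speech corpora")
    print(local_result)
    print(remote_result)
```

Both calls produce the same tokens; what changes is where the processing happens and that, for the service, the input and output travel over the network.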

If you are a member, you can add a service by filling in the form for tools here and specifying the type "Web service".

The Virtual Language World (VLW) is a way to browse the resources and tools in the CLARIN repository using the Google Earth interface. You can find it here (note that you need to have Google Earth installed in order to use it). The VLW contains data from multiple sources: the CLARIN LRT inventory, OLAC providers, IMDI and the DFKI software registry.