Resources overview

Number of resources that match the current filter criteria: 848

Name Languages Type Description Country Institute (not a CLARIN member) Finalization year Distribution Type
6305-QC Spanish other 6305 questions in Spanish tagged manually with the Hoby categories, for Question Answering Classification Spain University of Jaén (Spain) 2008
ADAM Italian Spoken Corpus spoken, domain-specific, monolingual; 450 dialogs, 78787 words, 7h 34m 48s (length of audio signal); prosodic, morphosyntactic, semantic and pragmatic annotation in XML format. Italy
AkadTerm English, French, German, Latvian, Russian Terminological Resource The academic multilingual terminological database contains terminology units created, collected and approved by the Terminology Committee of the Academy of Sciences of Latvia. It contains more than 790 908 terms in 6 languages. Latvia Academy of Sciences of Latvia
Álgu – Origins of Saami Words Northern Sami Lexicon / Knowledge Source The database will contain an etymological lexicon of Saami languages complete with detailed source citations. The database will be open to the public in November 2006 and will be updated regularly. Finland public
Álgu – Origins of Saami Words (Álgu – Saamen sanojen etymologinen tietokanta) -- language not in list -- Lexicon / Knowledge Source 70,000 words, over 100,000 etymological relations, Relational database
Aligner 2.0.6.7 -- language not in list -- Application / Tool A language-independent, tag-oriented, semi-automatic paragraph and sentence aligner. Works on MS Windows. Produces valid XML documents. Allows recording detailed bibliographical information. It has been used for creating an English-Lithuanian parallel corpus. Lithuania 2007
alinea Catalan, English, Spanish Application / Tool A tool for parallelizing translated texts, which has been specially designed for specialized corpora and also as a translation validator. Spain
Alpino Treebank Dutch Treebank A database of 7.000 syntactically analyzed Dutch sentences. Netherlands (the) 2002
ALTWEB Italian Written Corpus Dialect (Tuscan); 380.000 entries; written; DBT tagset Italy
Anaphora resolution system Norwegian Application / Tool University of Oslo
Anglo-Saxon charters English Written Corpus Charters written in Anglo-Saxon England before A.D. 900, marked up in TEI XML. Browsable online. United Kingdom King's College London 2005
Annotation of EuroWordNet's Interlingual Index with the Top Concept Ontology 2.3 -- language not in list -- Lexicon / Knowledge Source About 65.000 language-independent lexical concepts (the so-called ILI, based on WordNet 1.6) connected to an ontology consisting of about 30 semantic features, the Top Concept Ontology developed during the EuroWordNet Project. Spain Euskal Herriko Unibertsitatea, Universitat Politècnica de Catalunya 2008
Answer Type Taxonomy Italian Written Corpus Language independent with links to Italian semantic lexicons (ItalwordNet and PAROLE-SIMPLE-CLIPS), XML file with classification of type of expected answers for Question Answering application Italy
Aquén - Toponimia galega Galician Lexicon / Knowledge Source Galician Toponymy Database, 40,000 entries Spain
Aquis Communautaire MT Maltese Written Corpus 10M Words; XML Malta
AraucariaDB English Written Corpus A corpus of argumentation collected by scholars from around the world, with analyses created by the Araucaria software. United Kingdom Applied Computing Department of Dundee University
Arborest Estonian Treebank 149 sentences, VISL tagset Estonia
Archiv für Gesprochenes Deutsch (AGD) German Spoken Corpus Recordings of dialects, conversations, institutional interaction. AGD (“Archiv für gesprochenes Deutsch”, formerly called “Deutsches Spracharchiv”) archives German spoken corpora from research projects and makes them available for further scientific research. Available digitized sound files and transcripts are searchable in DGD (“Datenbank gesprochenes Deutsch”). Germany restricted
Archive of Serbian Literature Serbian Written Corpus Novels of Rastko Petrovic, Branimir Cosic, Radoje Domanovic, etc. Serbia
Argumentation and argument visualisation in promoting strategic reading and decision-making Finnish Spoken Corpus Corpus of upper secondary school students' think-aloud performances when searching the Internet for information Finland Department of Educational Sciences, University of Jyväskylä
Argumentation in studying problem-solving skills in social work education in Finnish Polytechnics Finnish Spoken Corpus, Written Corpus Essays and discussions by students in Polytechnics and comprehensive schools Finland Department of Educational Sciences, University of Jyväskylä
Arts and Humanities Data Service Literature, Languages and Linguistics -- language not in list --, English Written Corpus, Lexicon / Knowledge Source Electronic texts, corpora, lexicons. other United Kingdom
ASK - The Norwegian Second Language Corpus Norwegian Written Corpus Learner corpus; POS, Error types Norway
Assigning lemmas and part-of-speech to wordform lists Slovenian Web Service online service Slovenia
Atlas of Place Names Finnish other The digital atlas illustrates the distribution of 234 common Finnish place-name elements based on data in the Names Archive. Finland 2007 public