Resources overview
Number of resources that match the current filter criteria: 848

| Name | Languages | Type | Description | Country | Institute (not a CLARIN member) | Finalization year | Distribution Type |
|---|---|---|---|---|---|---|---|
| 6305-QC | Spanish | other | 6305 questions in Spanish, manually tagged with the Hoby categories, for question answering classification | Spain | University of Jaén (Spain) | 2008 | |
| ADAM | Italian | Spoken Corpus | spoken, domain-specific, monolingual; 450 dialogs, 78787 words, 7h 34m 48s (length of audio signal); prosodic, morphosyntactic, semantic and pragmatic annotation in XML format. | Italy | | | |
| AkadTerm | English, French, German, Latvian, Russian | Terminological Resource | The academic multilingual terminological database contains terminology units created, collected and approved by the Terminology Committee of the Academy of Sciences of Latvia. It contains more than 790 908 terms in 6 languages. | Latvia | Academy of Sciences of Latvia | | |
| Álgu – Origins of Saami Words | Northern Sami | Lexicon / Knowledge Source | The database will contain an etymological lexicon of Saami languages complete with detailed source citations. The database will be open to the public in November 2006 and will be updated regularly. | Finland | | | public |
| Álgu – Origins of Saami Words (Álgu – Saamen sanojen etymologinen tietokanta) | -- language not in list -- | Lexicon / Knowledge Source | 70,000 words, over 100,000 etymological relations, Relational database | | | | |
| Aligner 2.0.6.7 | -- language not in list -- | Application / Tool | A language-independent tag-oriented semi-automatic paragraph and sentence aligner. Works on MS Windows. Produces XML valid documents. Allows recording detailed bibliographical information. It has been used for creating English-Lithuanian Parallel corpus. | Lithuania | | 2007 | |
| alinea | Catalan, English, Spanish | Application / Tool | A tool for parallelizing translated texts, which has been specially designed for specialized corpora and also as a translation validator. | Spain | | | |
| Alpino Treebank | Dutch | Treebank | A database of 7.000 syntactically analyzed Dutch sentences. | Netherlands (the) | | 2002 | |
| ALTWEB | Italian | Written Corpus | Dialect (Tuscan); 380.000 entries; written; DBT tagset | Italy | | | |
| Anaphora resolution system | Norwegian | Application / Tool | | | University of Oslo | | |
| Anglo-Saxon charters | English | Written Corpus | Charters written in Anglo-Saxon England before A.D. 900, marked up in TEI XML. Browsable online. | United Kingdom | King's College London | 2005 | |
| Annotation of EuroWordNet's Interlingual Index with the Top Concept Ontology 2.3 | -- language not in list -- | Lexicon / Knowledge Source | About 65.000 language-independent lexical concepts (the so-called ILI, based on WordNet 1.6) connected to an ontology consisting of about 30 semantic features, the Top Concept Ontology developed during the EuroWordNet Project. | Spain | Euskal Herriko Unibertsitatea, Universitat Politècnica de Catalunya | 2008 | |
| Answer Type Taxonomy | Italian | Written Corpus | Language independent with links to Italian semantic lexicons (ItalwordNet and PAROLE-SIMPLE-CLIPS), XML file with classification of type of expected answers for Question Answering application | Italy | | | |
| Aquén - Toponimia galega | Galician | Lexicon / Knowledge Source | Galician Toponymy Database, 40,000 entries | Spain | | | |
| Aquis Communautaire MT | Maltese | Written Corpus | 10M Words; XML | Malta | | | |
| AraucariaDB | English | Written Corpus | A corpus of argumentation collected by scholars from around the world, with analyses created by the Araucaria software. | United Kingdom | Applied Computing Department of Dundee University | | |
| Arborest | Estonian | Treebank | 149 sentences, VISL tagset | Estonia | | | |
| Archiv für Gesprochenes Deutsch (AGD) | German | Spoken Corpus | Recordings of dialects, conversations, institutional interaction. AGD (“Archiv für gesprochenes Deutsch”, formerly called “Deutsches Spracharchiv”) archives German spoken corpora from research projects and makes them available for further scientific research. Digitized sound files and transcripts can be searched in DGD (“Datenbank gesprochenes Deutsch”). | Germany | | | restricted |
| Archive of Serbian Literature | Serbian | Written Corpus | Novels of Rastko Petrovic, Branimir Cosis, Radoje Domanovic, etc. | Serbia | | | |
| Argumentation and argument visualisation in promoting strategic reading and decision-making | Finnish | Spoken Corpus | Corpus of upper secondary school students' think-aloud performances when searching the Internet for information | Finland | Department of Educational Sciences, University of Jyväskylä | | |
| Argumentation in studying problem-solving skills in social work education in Finnish Polytechnics | Finnish | Spoken Corpus, Written Corpus | Essays and discussions by students in Polytechnics and comprehensive schools | Finland | Department of Educational Sciences, University of Jyväskylä | | |
| Arts and Humanities Data Service Literature, Languages and Linguistics | -- language not in list --, English | Written Corpus, Lexicon / Knowledge Source | Electronic texts, corpora, lexicons. other | United Kingdom | | | |
| ASK - The Norwegian Second Language Corpus | Norwegian | Written Corpus | Learner corpus; POS, Error types | Norway | | | |
| Assigning lemmas and part-of-speech to wordform lists | Slovenian | Web Service | online service | Slovenia | | | |
| Atlas of Place Names | Finnish | other | The digital atlas illustrates the distribution of 234 common Finnish place-name elements based on data in the Names Archive. | Finland | | 2007 | public |

