
Introducing CLARIN Resource Families

Submitted by Dieter Van Uytvanck on

We are proud to announce the launch of CLARIN Resource Families, a new initiative which aims to provide a user-friendly overview of the available corpora for many languages in the CLARIN infrastructure. These resources are valuable for researchers from digital humanities, social sciences and human language technologies.

We currently offer overviews of five resource families: newspaper corpora, parliamentary corpora, parallel corpora, computer-mediated corpora and L2 learner corpora.

The listings include the most important metadata, with descriptions of corpus size, text sources, time periods, annotations and licences, as well as links to download pages and concordancers whenever available. We also provide hyperlinks to other relevant materials, such as the thematic CLARIN workshops and tutorials with their accompanying video lectures, and a list of key publications on the corpora surveyed.

In the future, we plan to include other resource families, such as historical and spoken corpora, and to add tutorials on how to query, annotate and analyse the data.

 

Launch of the CLARIN Resource Families at LREC 2018 (Miyazaki, Japan, 9 May 2018)