Blogs

 

In this issue: CLARIN SPF extended with Cyprus, New CLARIN Identity Provider version, DataCite membership, DH Course Registry API

 
 

The Czech National Corpus (CNC) is a long-term academic project whose main aim is to continuously map the Czech language by building, annotating and providing access to a variety of large general-purpose corpora. CLARIN recognized it as a Knowledge Centre in 2018.

 
 

In this Tour de CLARIN blog post, we present an in-depth interview with Kaja Dobrovoljc, a Slovenian corpus linguist who works at the Centre for Language Resources and Technologies, regularly collaborates with CLARIN.SI, and uses its infrastructure.

 
 

In this Tour de CLARIN blog post, we present an in-depth interview with Nan Bernstein Ratner, who, along with Brian MacWhinney, is one of the PIs of FluencyBank, a shared database for the study of the development of fluency in typical and disordered populations.

 
 

CLARIN Slovenia (CLARIN.SI) has contributed to several user involvement events that presented the results of the project to different user groups.

 
 

Read about the CSMTiser, a supervised machine learning tool that performs word normalization by using Character-level Statistical Machine Translation.

 
 
Digital Humanities – The Perspective of Africa workshop

Guest blog post by Demba Kandeh that chronicles and reflects on a five-day workshop, “Digital Humanities – the perspective of Africa”, recently organized at the Lorentz Centre in Leiden.

 
 

TalkBank, which was recognized as a CLARIN Knowledge Centre in 2016, is the world’s largest open access integrated repository for spoken language data. It provides language corpora and other audio resources to support researchers in Psychology, Linguistics, Education, Computer Science, and Speech Pathology.

 
 

In 2015, researchers from the Jožef Stefan Institute in Ljubljana, Slovenia released the first emoji sentiment lexicon, called Emoji Sentiment Ranking 1.0, and published it as a resource in the public language resource repository CLARIN.SI. With 78,500 downloads to date, the lexicon is the most downloaded resource in the CLARIN.SI repository.

 
 

CLARIN.SI joined CLARIN ERIC in 2015 and is a B-certified centre which offers a LINDAT/D-Space repository that currently contains around 110 language resources for Slovenian as well as for other languages, especially Croatian and Serbian.