This Tour de CLARIN interview is with Zrinka Kolaković, a corpus linguist researching clitic phenomena using corpora provided by CLARIN.SI and the CLASSLA Knowledge Centre.
The CLARIN Knowledge Centre for South Slavic languages (CLASSLA) was recognized by CLARIN on 19 March 2019. The centre offers support for automated processing of South Slavic languages and is operated by the Slovenian CLARIN.SI and the Bulgarian CLaDA-BG.
In this Tour de CLARIN blog post, we present an in-depth interview with Kaja Dobrovoljc, a Slovenian corpus linguist who works at the Centre for Language Resources and Technologies, regularly collaborates with CLARIN.SI, and uses its infrastructure.
CLARIN Slovenia (CLARIN.SI) has contributed to several user involvement events which presented the results of the project to different user groups.
Read about the CSMTiser, a supervised machine learning tool that performs word normalization by using Character-level Statistical Machine Translation.
In 2015, researchers from the Jožef Stefan Institute in Ljubljana, Slovenia released the first emoji sentiment lexicon, called Emoji Sentiment Ranking 1.0, and published it as a resource in the public language resource repository CLARIN.SI. With 78,500 downloads to date, the lexicon is the most downloaded resource in the CLARIN.SI repository.
CLARIN.SI joined CLARIN in 2015 and is a B-certified centre that offers a LINDAT/D-Space repository, which currently contains around 110 language resources for Slovenian as well as for other languages, especially Croatian and Serbian.