Tour de CLARIN: The Danish CLARIN Knowledge Centre DANSK

Submitted by Karina Berger on 12 April 2022

Written by Costanza Navarretta

DANSK (DANish helpdeSK) became a certified CLARIN Knowledge Centre in 2016. It involves two Danish institutions:

The Department of Nordic Studies and Linguistics (NorS) offers support on written and spoken Danish, as well as on the digitisation and annotation of older Danish texts. The Centre for Language Technology (CST), which is part of the department, also hosts the CLARIN-DK infrastructure.

The Danish Language Council (DSN) answers questions about the Danish language and its usage via telephone, email, and the online portal ‘Hjælp til Dansk’ (Help to Danish). The portal provides online exercises and teaching material, for example on comma placement, new words, and spelling. DSN also offers courses to schools, organisations, and companies.

The Department of Danish Sign Language, which is a subsidiary of the Danish Language Council, gives advice and information about Danish Sign Language and provides online resources (Danish Sign Language’s tools and resources), such as the multimodal Dictionary of Sign Language. The Department of Danish Sign Language can be contacted online, via email, Skype, or text message.

In the Knowledge Centre DANSK, CST answers questions about language technology resources and applications, especially for Danish, and guides users in working with the resources provided by CLARIN-DK. These resources comprise:

  • Word lists, computational lexica, and the Danish wordnet (DanNet)
  • Danish corpora from different periods and of different types. Some of the corpora are enriched with linguistic annotations, or other kinds of information. The contemporary corpora in CLARIN-DK cover general language or specific domains  
  • Parliamentary debates
  • Multimodal data and annotation schemes
  • CST lemmatiser.

Anonymous text from the Everyman Corpus of Danish texts (1500-1750).

The CLARIN-DK infrastructure also offers online tools, such as:

  • Annotation tools, such as tokeniser, PoS tagger, lemmatiser, named entity tagger, parsers, annotation
  • Corpus search and visualisation
  • Workflow manager, Text Tonsorium: for annotating texts with many types of linguistic information. The workflow offers tools for texts in several languages and in different formats.

Screenshot of the CLARIN-DK Text Tonsorium workflow.

CST also supports researchers who want to produce, share or use FAIR resources distributed in the CLARIN-DK repository.
 

Text Tonsorium is also included in the Text Normalisers CLARIN Resource Family.