Corpora

Tour de CLARIN: The CLARIN PL-B-Centre

18 December 2021
 



Written by Krzysztof Hwaszcz and Jan Wieczorek



The Polish consortium CLARIN-PL, a founding member of CLARIN that has been operating since 2012, was already presented...

Using a Monitor Newspaper Corpus to Trace Changing Language as a Result of COVID-19

9 December 2021
The Project
This project illustrates how a monitor newspaper corpus can be used to trace, almost in real time, changes in language in response to a crisis.
As a response to the dramatic developments that took place in early 2020, a sudden and dramatic increase in vocabulary took place in a very short peri
‘The pandemic provided an exceptional opportunity to demonstrate the use of this CLARIN monitor newspaper corpus.’ Koenraad De Smedt
 
Methodology
This study used the Norwegian Newspaper Corpus as its data source. All occurrences of words starting with corona/korona in the period from 9 January 2
Figure: Cumulative increase of the corona compound vocabulary.
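The cumulative vocabulary growth described in the methodology can be illustrated with a minimal sketch. It assumes the corpus has already been reduced to (date, token) pairs. The function name, the prefix pattern and the toy data are hypothetical and are not taken from the study or from Corpuscle.

```python
import re
from datetime import date

# Assumption: a "corona compound" is any token starting with corona/korona,
# matched case-insensitively.
PREFIX = re.compile(r"^[ck]orona", re.IGNORECASE)


def cumulative_vocabulary(dated_tokens):
    """Return (day, number of distinct corona/korona word types seen so far)."""
    seen = set()
    growth = {}
    # Process tokens in chronological order so the count never decreases.
    for day, token in sorted(dated_tokens, key=lambda pair: pair[0]):
        word = token.lower()
        if PREFIX.match(word):
            seen.add(word)
        # Record the running total for this day, even if nothing new appeared.
        growth[day] = len(seen)
    return sorted(growth.items())


# Toy data for illustration only; the dates and tokens are invented.
sample = [
    (date(2020, 3, 1), "koronatelt"),
    (date(2020, 3, 1), "avis"),
    (date(2020, 3, 2), "Koronatelt"),
    (date(2020, 3, 2), "koronautsettelsene"),
    (date(2020, 3, 3), "coronavirus"),
]

for day, total in cumulative_vocabulary(sample):
    print(day.isoformat(), total)
```

Plotting the second column against the first would give a curve of the same kind as the figure, although the study's actual tokenisation and filtering may differ.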
 
Outcome
Not only was the occurrence of new compounds with the stem corona/korona in the studied timeframe very high, but the speed of vocabulary growth and th
Many of the new compounds are heavily context-dependent: for instance, koronatelt (corona tent), koronautsettelsene (corona postponements), coronalov
‘This is the first study to demonstrate the effect of such a spelling change in various Norwegian media sources.’ Koenraad De Smedt
 
CLARIN Tools and Resources
This study used the Norwegian Newspaper Corpus as its data source. The corpus is part of the CLARIN Resource Family ‘Newspaper Corpora’. It is updated
Access Corpuscle via the CLARINO Centre Bergen.
Browse newspaper corpora in CLARIN's Resource Families.
 
Views on CLARIN
'Newspaper monitor corpora, which incorporate new materials on a regular basis, are particularly useful for tracking linguistic changes spurred by cur
 
Koenraad De Smedt, Professor of Computational Linguistics, Department of Linguistic, Literary and Aesthetic Studies, University of Bergen, Norway
See here for more information on how CLARIN has responded to COVID-19.

ParlaCLARIN II Goes Virtual

18 May 2020

The organisers of the second ParlaCLARIN workshop, on creating, using and linking parliamentary corpora with other types of political discourse, share their experience of the successful virtual edition of this LREC2020 workshop, which was originally planned to take place in Marseille.





 

Deutsches Textarchiv

Country: Germany
CLARIN Centre: BBAW
Description

The Deutsches Textarchiv (German Text Archive) provides access to a comprehensive range of German texts from around 1600 to 1900. The selection of texts is based on scholarly bibliographies of the period, resulting in a balanced corpus of more than 1,300 works and almost 100 million words (as of the start of 2014).