Tour de CLARIN: Poland

Submitted on 22 December 2017

Blog post written by Jan Wieczorek and Ewa Rudnicka

The Polish consortium CLARIN-PL is a founding member of CLARIN and has been actively involved in its operations since the very beginning in 2005. It comprises six member institutions.

The consortium is led by the Language Technology Centre at the Wroclaw University of Technology, which is a CLARIN B-Centre. The Polish National Coordinator is Maciej Piasecki. The consortium's team is highly diverse: it includes IT specialists, linguists, literary scholars, and specialists in library and information science.

The main goal of CLARIN-PL is to build technical infrastructure, tools and resources for natural language processing, especially for the processing of Polish. The technical infrastructure (that is, the servers) is located at the CLARIN-PL Language Technology Centre at Wroclaw University of Technology. The flagship tools and resources are:

  • plWordNet, which is the biggest wordnet in the world and is available under an open license together with its mapping to Princeton WordNet. It includes emotive annotation and was built in close collaboration with the valency dictionary Walenty;
  • DSpace repository, which is a large library of linguistic data and tools;
  • SPOKES, which is a corpus of conversational data;
  • Chronopress, which is a chronological corpus of Polish newspaper texts;
  • Websty, which is a tool for the extraction of stylometric data; and
  • various speech recognition tools, such as Align.

Most of these tools and resources are offered as user-friendly web services, so they do not require any software installation on the user's computer.

The second goal of CLARIN-PL is to raise awareness of and popularize knowledge about the infrastructure among Polish Digital Humanities scholars. To this end, the Language Technology Centre has been organizing a series of workshops called “CLARIN in research practice”. The consortium is also a strategic partner in many large research projects: its employees advise on the optimal use of existing NLP tools and resources and help plan research, which gives them the opportunity to collect opinions and information about researchers' needs. In November 2017, PolLinguaTec, a CLARIN Knowledge Centre for Polish Language Technology (CLARIN K-Centre), was created at Wroclaw University of Technology. Its task is to continue these user involvement activities.



CTJ: part of the CLARIN-PL team from the Language Technology Centre (Wroclaw University of Technology)

CLARIN-PL Partners | first row (L-R): Maciej Piasecki (Wrocław University of Technology), Adam Pawłowski (Wrocław University); second row: Roman Roszko (Institute of Slavic Studies, Polish Academy of Sciences), Krzysztof Marasek (Polish-Japanese Academy of Information Technology); third row: Piotr Pęzik (Lodz University), Maciej Ogrodniczuk (Institute of Computer Science, Polish Academy of Sciences)

