Blog post written by Catia Cucchiarini, Ineke Schuurman and Griet Depoorter, with contributions by Darja Fišer and Jakob Lenardič
CLARIN DLU/Flanders is a founding member of CLARIN and represents Flanders, the Dutch-speaking part of Belgium. The consortium consists of
- the Dutch Language Union (Nederlandse Taalunie - DLU),
- the Dutch Language Institute (Instituut voor de Nederlandse Taal – INT),
- the Centre for Computational Linguistics (Centrum voor Computerlinguïstiek – CCL – University of Leuven),
- the Language and Translation Technology Team (LT³ - University of Ghent),
- the Computational Linguistics & Psycholinguistics (CLiPS) research group (University of Antwerp),
- the PSI Speech Group (ESAT-PSI – University of Leuven), and
- the Language Intelligence & Information Retrieval research lab (LIIR – University of Leuven).
Since Flanders is a region rather than a country, it did not qualify as a CLARIN member in its own right and is therefore represented by the Dutch Language Union (DLU), an international language policy organization. The consortium is coordinated by the Dutch Language Institute (INT), with Griet Depoorter as national coordinator. The INT is a starting point for anyone who wants to know anything about the Dutch and Flemish languages through the centuries. The institute, a certified CLARIN B Centre, takes a central position in the Dutch-speaking world as a developer, manager and distributor of sustainable language resources, using reliable scholarly methods. For instance, the INT produced the Woordenboek der Nederlandsche Taal (Dictionary of the Dutch Language), an enormous historical dictionary that describes Dutch words from 1500 to 1976. Other resources and tools available at the INT include:
- the Dictionary of Contemporary Dutch (Algemeen Nederlands Woordenboek), which is online and corpus-based,
- the Reference Lexicon Dutch (Referentiebestand Nederlands), which contains 50,000 frequent Dutch words, enriched with linguistic information,
- the Dutch Parallel Corpus, a high-quality sentence-aligned parallel corpus of 10 million words for the language pairs Dutch-English and Dutch-French,
- the Word list of the Dutch Language (Woordenlijst Nederlands), a list of words in the correct official spelling, and
- BlackLab, an open-source corpus search engine built on top of Apache Lucene.
Apart from providing state-of-the-art language resources and tools, the consortium actively involves both students and researchers in its activities. For instance, a workshop held at the Dutch Language Institute in October 2017 aimed to familiarize Digital Humanities researchers with the resources and tools the consortium offers.
Talks are currently under way with the research group Interdisciplinary Research on Technology, Education & Communication (ITEC - KULAK), the Department of Computer Science (University of Leuven) and the VUB Artificial Intelligence Lab (Vrije Universiteit Brussel) about joining the consortium.