CLARIN Software on GitHub

Most CLARIN software is kept in repositories in the CLARIN GitHub organisation. A large share of the software currently being developed, as well as other software related to the CLARIN consortia, is now hosted on general-purpose platforms such as GitHub:

Name/link Succinct description Maintainer


Format for Linguistic Annotation Maarten van Gompel


Build RESTful webservices and user interfaces for your command-line tools Maarten van Gompel

Glossa FCS connection

Interaction with Federated Content Search

Anders Nøklestad, Tekstlab

LAF Fabric

SHEBANQ demonstrator

Converts the Hebrew Text Database of the VU University Amsterdam from a specific database format into LAF (Linguistic Annotation Framework). Used in the CLARIN-NL project SHEBANQ - see also: demonstrator presentation.

Dirk Roorda, /TLA


A corpus retrieval engine built on top of Apache Lucene. It allows fast, complex searches with accurate hit highlighting on large bodies of tagged and annotated text. It was developed at the Institute of Dutch Lexicology (INL) to provide a fast and feature-rich search interface to its historical and contemporary text corpora.

Jan Niestadt, INL

UFAL/LINDAT repository


Vagrant installer

This project, suitable for repository developers, uses Vagrant to create a temporary virtual machine (VM) with an instance of the repository system developed at LINDAT/CLARIN. The VM is provisioned by Puppet, which installs the additional software. The repository system is based on DSpace and includes a number of convenient tools.

Jozef Mišutka, UFAL
Shibbie Tool to test or simulate federated login from multiple Identity Providers to one Service Provider

Jozef Mišutka, UFAL

License selector

jQuery plugin for easy selection of various licenses

Jozef Mišutka, UFAL


OpenSKOS is a publication and exchange platform for vocabularies that can be mapped to the SKOS W3C recommendation.

CatchPlus project
OpenSKOS browser Front end for OpenSKOS Olha Shkaravska, Meertens Instituut


Application to retrieve and update eSciDoc Metadata records (MD-Records) in

Mitchell John Seaton


The library wlfxb and some utilities for converting CoNLL-X to/from TCF

Daniël de Kok, UTU
FLAT Fedora/Islandora-based project aiming to develop an easy-to-use and maintainable archive setup for language resources with Component Metadata. Menzo Windhouwer, Meertens Instituut/Daniel von Rhein, MPI-PL
Attribute aggregator Shib auth logging: script that accesses raw assertions as received, i.e., before any mapping or filtering takes place. Ondřej Košarko, UFAL
CLARIN.SI Repositories of CLARIN Slovenia  

If you know of open source software that matches the description above and would like it to be listed, you are invited to report it to trac[at]clarin[dot]eu.