Experts on NLP tools for working with historical texts will meet to exchange ideas and experiences about tools and methods, and to develop a resource guide and a plan of action for integrating more tools into the CLARIN infrastructure. Participants will be invited from across the CLARIN community.
The workshop will bring together people who create or work with NLP tools (especially tokenizers, normalizers, morphological analyzers, part-of-speech taggers and lemmatizers) for historical language varieties, particularly European languages in the period 1500-1800. This period (roughly covered by the term ‘Early Modern’ in English) has been selected because it is the time covered by many digitization programmes of early printed works, and a time when many languages were still recognizably similar in form to contemporary varieties, yet different enough that standard software tools often cannot be applied to them with acceptable levels of accuracy. The workshop will focus on the adaptation of NLP tools trained on or designed for modern language varieties, as well as on custom tools designed specifically for particular historical varieties. It will be an opportunity to share expertise, know-how, tools and resources.