Blog post written by Marie Hinrichs and Christoph Draxler, edited by Nathalie Walker, Darja Fišer, and Jakob Lenardič
WebLicht (“Web-based Linguistic Chaining Tool”) is an environment for building and executing chains of natural language processing tools, with integrated capabilities for visualizing and searching the resulting annotations. It is hosted by the CLARIN centre at the University of Tübingen.
One of the main goals of WebLicht is to make a wide range of text processing tools, such as tokenizers, part-of-speech taggers and syntactic parsers, easily accessible to researchers in the humanities and social sciences. WebLicht’s annotation tools can be invoked from any web browser, without local software installation or prior familiarity with the tools. Researchers can select predefined processing chains, called “Easy Chains”, that have been created for the most common annotations and languages. Custom processing chains can also be built easily: the user is guided through each tool choice, and only tools that are valid at the current step of the processing chain are offered for selection. This is made possible by detailed metadata about the input requirements and output annotations of each tool, and it ensures that custom processing chains are always valid. CLARIN-D has also prepared a set of illustrative use cases and annotation examples that show new users how to get started with the tool.
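The idea of metadata-driven chaining can be illustrated with a minimal sketch. It is not WebLicht's actual implementation or metadata format: the `Tool` class, the layer names and the `TOOLS` catalogue are hypothetical and assume only that each tool declares which annotation layers it requires and which it produces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """Hypothetical, simplified tool description (not the real WebLicht metadata)."""
    name: str
    requires: frozenset  # annotation layers the tool needs as input
    produces: frozenset  # annotation layers the tool adds to the document


# Hypothetical catalogue, for illustration only.
TOOLS = [
    Tool("tokenizer", frozenset({"text"}), frozenset({"tokens"})),
    Tool("pos_tagger", frozenset({"tokens"}), frozenset({"pos"})),
    Tool("parser", frozenset({"tokens", "pos"}), frozenset({"parse"})),
]


def valid_next_tools(layers, tools):
    """Return the tools whose inputs are available and that would add a new layer."""
    return [
        t for t in tools
        if t.requires <= layers and not t.produces <= layers
    ]


def run_chain(tool_names, tools, start=frozenset({"text"})):
    """Walk a chain step by step, rejecting any tool whose inputs are missing."""
    by_name = {t.name: t for t in tools}
    layers = set(start)
    for name in tool_names:
        tool = by_name[name]
        if not tool.requires <= layers:
            missing = sorted(tool.requires - layers)
            raise ValueError(f"{name} is missing input layers: {missing}")
        layers |= tool.produces
    return layers
```

Under these assumptions, a document that only has a `text` layer can be passed to the tokenizer and nothing else, and the parser only becomes selectable once both `tokens` and `pos` exist. Offering the user only the result of `valid_next_tools` at each step is what keeps every chain built this way valid.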
WebLicht is tightly integrated into the CLARIN infrastructure. It uses information from the Centre Registry to harvest tool metadata from all CLARIN centre repositories. The metadata are harvested automatically several times each day, ensuring that all tool information is up to date. WebLicht also supports login with CLARIN Federated Identity, which allows researchers to log in through their academic institutions and makes the service available to researchers from thousands of institutions.
At BAS, work on WebMAUS continues, and CLARIN-D is collaborating closely with speech researchers and potential users all over the world. Recently, the first tone language, Thai, has been added, as well as six different Swiss German dialects. The CLARIN BAS team also encourages anyone who works with a language not yet covered by WebMAUS to get in touch so that the language can be added to the service.
Figure 3: Schematic description of WebMAUS input and resulting multi-level time-aligned transcript. Note that the sequence “and tells” is produced as [a n t e l z].
- Dima, E., E. Hinrichs, M. Hinrichs, A. Kislev, T. Trippel, and T. Zastrow (2012). “Integration of WebLicht into the CLARIN Infrastructure.” In: Proceedings of the Joint CLARIN-D/DARIAH Workshop at Digital Humanities Conference 2012: Service-oriented Architectures (SOAs) for the Humanities: Solutions and Impacts. Hamburg 17–23.
- Hinrichs, E., M. Hinrichs, and T. Zastrow (2010). “WebLicht: Web-Based LRT Services for German.” In: Proceedings of the Systems Demonstrations at the 48th Annual Meeting of the Association for Computational Linguistics (ACL-2010). Uppsala, 25–29.
- Kisler, T., U. Reichel, and F. Schiel (2017). “Multilingual processing of speech via web services.” In: Computer Speech and Language, vol. 45, 326–347.
- Kisler, T., F. Schiel, and H. Sloetjes (2012). “Signal processing via web services: the use case WebMAUS.” In: Proceedings of Digital Humanities Conference 2012. Hamburg, 30–34.