
Tour de CLARIN: A CLARIN B-centre in Vienna: The Austrian Centre for Digital Humanities and Cultural Heritage with ARCHE

Submitted by Jakob Lenardič on
 
 Blog post written by Martina Trognitz

Founded in 2015, the Austrian Centre for Digital Humanities and Cultural Heritage (ACDH-CH) is an institute of the Austrian Academy of Sciences in Vienna. The ACDH-CH pursues a dual agenda: conducting digitally enabled research, and providing technical expertise and support to the research communities at the Academy as well as at the national and international levels. The institute has developed rapidly in the last two years, evolving from its predecessor, the Institute for Corpus Linguistics and Text Technology (ICLTT), with its focus on language data, into an institutional and national centre of expertise in the broader field of digital humanities and cultural heritage.

ARCHE (A Resource Centre for the HumanitiEs) is central to ACDH-CH’s mission of fostering the shift towards the digital paradigm in the humanities. ARCHE is the successor of the CLARIN Centre Vienna / Language Resources Portal (CCV/LRP), which was launched in 2014. CCV/LRP’s goal was to provide depositing services and easy yet sustainable access to digital language resources created in or related to Austria. In 2017, ARCHE replaced CCV/LRP and extended its mission by offering advanced and reliable data management and depositing services open to a broader range of humanities disciplines in Austria. ARCHE was already featured in Tour de CLARIN, volume I.

ACDH-CH and ARCHE together have contributed to the CLARIN infrastructure as a Service Providing Centre (CLARIN B-centre) since 2017. They are embedded within the European infrastructure consortia CLARIN ERIC and DARIAH-EU, which in Austria are jointly represented by CLARIAH-AT.

Services by ARCHE

The main aim of ARCHE (A Resource Centre for the HumanitiEs) is the long-term preservation of research data and related resources. ARCHE also provides a set of further services and activities revolving around the deposition process and data management in general, and its curators offer advice and assistance to researchers. A preformatted citation suggestion includes a persistent identifier (Handle) and allows researchers to download the reference in BibLaTeX format for persistent referencing.

ARCHE’s extensive metadata, which helps researchers find and understand the data, follows a dedicated metadata schema described in OWL. All metadata is freely available under CC0, for example via ARCHE’s OAI-PMH endpoint, which provides a variety of metadata formats, including CMDI. Via this endpoint, all language resources are harvested by the Virtual Language Observatory (VLO).
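To make the harvesting route a little more concrete, here is a minimal sketch of how OAI-PMH request URLs are put together. The verbs (ListMetadataFormats, ListRecords) and the metadataPrefix argument come from the OAI-PMH protocol itself. The base URL is a placeholder rather than ARCHE's actual endpoint address, and the prefix value "cmdi" is an assumption: the prefixes a repository really offers are listed in its ListMetadataFormats response. The helper name oai_request_url is made up for this illustration.

```python
from urllib.parse import urlencode

# Placeholder address (assumption): replace with the real OAI-PMH endpoint.
BASE_URL = "https://example.org/oaipmh"


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL from a verb and its arguments."""
    query = {"verb": verb, **arguments}
    return f"{base_url}?{urlencode(query)}"


# Ask the repository which metadata formats it can deliver.
formats_url = oai_request_url(BASE_URL, "ListMetadataFormats")

# Request records in one format; "cmdi" is an assumed prefix value.
records_url = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="cmdi")

print(formats_url)
print(records_url)
```

The resulting URLs can be opened in a browser or fetched with any HTTP client; the response is an XML document as defined by the protocol.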

For data, individual licences and one of three access modalities (public, academic and restricted) can be selected. Academic access is granted via an institutional login, which is provided through the CLARIN ERIC and eduGAIN identity federations.

Data in ARCHE covers a wide range of humanities disciplines. In addition to linguistic resources such as dictionaries, Arabic corpora and audio recordings, the ever-increasing collections of ARCHE also include documentation archives from archaeological surveys, 3D scans of ancient objects, annotated historical data or protocols, and born-digital data from the Digital Humanities. When it comes to languages currently represented in ARCHE, the archive stands out not only because of materials in English, French, German and Spanish, but also because of a large collection of resources in Arabic and some of its varieties (Algerian Saharan Arabic, Egyptian Arabic, Mesopotamian Arabic, North Levantine Arabic and Tunisian Arabic) and a few other languages such as Dagbani, Persian and Yue Chinese.

Depending on the file type, a growing set of bespoke dissemination services allows researchers to preview, download, serialise or disseminate the file contents or its metadata. The dissemination services for each resource in ARCHE are displayed with clickable buttons.

Examples of dissemination services that visualise data include an online viewer for 3D files, which is based on the 3D Heritage Online Presenter (3DHOP) framework, and one for images, which is based on the International Image Interoperability Framework (IIIF). These two services are shown in Figure 1 with two example resources.

 

Figure 1: Dissemination services for viewing the file content online. Above: 3D viewer for an anthropomorphic Attic vase with the PID http://hdl.handle.net/21.11115/0000-000C-22F6-8. Below: IIIF-based image viewer for a scan from Arthur Schnitzler’s Diaries with the PID http://hdl.handle.net/21.11115/0000-000C-5037-C.

Figure 2: A TEI resource with information about a crashed WW2 airplane (https://id.acdh.oeaw.ac.at/daacda/bomber__917.xml) in ARCHE and its view with the dissemination service “Custom TEI to HTML transformation”.

 

The dissemination services allow the re-use of data stored in ARCHE in dedicated web applications. In this way, data that is stored for the long term can not only be accessed via the ARCHE GUI, but can also be displayed in a project with a custom web application.

Examples of such web applications re-using data from ARCHE include:

 

Figure 3: An image of a diary facsimile in a dedicated web application. The image is directly fetched from ARCHE via the IIIF endpoint.
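As a rough illustration of what fetching an image via an IIIF endpoint involves, the sketch below builds an image request URL following the general pattern of the IIIF Image API (identifier, region, size, rotation, quality and format). The base URL and the identifier are placeholders, not ARCHE's real addresses, and the helper name iiif_image_url is made up for this example. The size keyword "max" follows version 3 of the Image API; whether a given server expects "max" or the older "full" depends on the API version it implements, which is not stated here.

```python
from urllib.parse import quote


def iiif_image_url(base_url, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL.

    Pattern: {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    """
    # Identifiers must be percent-encoded, including any slashes they contain.
    encoded_id = quote(identifier, safe="")
    return (f"{base_url.rstrip('/')}/{encoded_id}/"
            f"{region}/{size}/{rotation}/{quality}.{fmt}")


# Placeholder base URL and identifier (assumptions for illustration only).
full_image = iiif_image_url("https://example.org/iiif", "diary/page 1.tif")

# Same image, scaled to a width of 400 pixels ("400," keeps the aspect ratio).
thumbnail = iiif_image_url("https://example.org/iiif", "diary/page 1.tif",
                           size="400,")

print(full_image)
print(thumbnail)
```

A web application can place such a URL directly into an image element or hand the image's info.json address to an IIIF-aware viewer, so the image itself never has to be copied out of the archive.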

 

Some dissemination services forward data from ARCHE to external services, such as the CLARIN Language Resource Switchboard. The example in Figure 4 shows how the TEI-annotated protocol of the 34th session of the Congress of Aachen (1818), from the data collection Mächtekongresse 1818–1822. Digital Edition, can be sent to the Switchboard and then analysed with Voyant Tools.

Figure 4: TEI Resources in ARCHE can be viewed in the CLARIN Language Resource Switchboard, which offers further tools for processing or visualisation, such as Voyant Tools.
 

Services by ACDH-CH

While ARCHE provides support and services revolving around the long-term preservation of digital resources, its host institution, the Austrian Centre for Digital Humanities and Cultural Heritage (ACDH-CH), offers a wider range of services covering all phases of the data life cycle. The ACDH-CH also pursues a digital humanities research agenda of its own. Current projects cover a wide range of humanities domains, investigating technical standards, infrastructure components, semantic technologies and text technological methods.

In addition to ARCHE, services offered by the ACDH-CH include the development of bespoke web applications and research software as well as re-usable tools for recurrent tasks. The portfolio of services is completed with personal consulting that aims at offering advice and guidance to the research community.

Prominent examples of tools developed at ACDH-CH include services for the wider CLARIN and DARIAH community, such as the CMDI Curation module to validate metadata records in the VLO, the DARIAH ELDAH - CONSENT FORM WIZARD to support humanities researchers in obtaining valid consent for data processing, the ACDH Vocabularies for persistent hosting of SKOS vocabularies, the open source database OpenAtlas with automatic mapping to CIDOC CRM, and SACHA - Simple Access to Cultural Heritage Assets for viewing historical books, newspapers and postcards.