Centre news vol. 40 - January 2019

Submitted by Dieter Van Uytvanck on

New Centres

We warmly welcome the Czech National Corpus as a K-centre. It also provides an endpoint for Federated Content Search.

VLO 4.6 beta: collapsing of very similar records

As announced in the December issue of CLARIN Newsflash, a new minor version of the Virtual Language Observatory has gone into beta testing. Of course we invite you to have a look, explore new and existing features and send us your feedback (via vlo[at]clarin[dot]eu or the embedded form, which can be accessed from any page by clicking the ‘Your testing feedback’ button).

Furthermore, we suggest that centres that are providing metadata to the VLO have a look at how their collections appear in the beta instance. One of the major changes in this version is that each set of (near) duplicate documents within the results for any query is ‘collapsed’ into a single result item. In the current configuration, records within a collection that specify the same content language and have the same (lack of) description get collapsed. This also affects the result counts in the facets (including ‘collection’). For certain collections this leads to a substantial decrease in the number of apparent results and therefore in the visibility of these collections in the VLO. In many of these cases this could be mitigated by simply offering more distinct descriptions for records within a collection. If you have questions about how your records appear in the beta instance of the VLO and/or how you could improve your metadata, you can always contact us via vlo[at]clarin[dot]eu.
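To get a rough idea of how a collection might be affected, the collapsing rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the VLO's actual implementation: the record structure and the field names (`collection`, `language`, `description`) are hypothetical, and "near duplicate" is simplified here to an exact match on these three values.

```python
from collections import defaultdict


def collapse_records(records):
    """Group records that share collection, content language and description.

    Assumption: each record is a dict with the hypothetical keys
    'collection', 'language' and 'description'.
    """
    groups = defaultdict(list)
    for record in records:
        # A missing description is normalised to "" so that records that
        # all lack a description end up in the same group.
        description = (record.get("description") or "").strip()
        key = (record["collection"], record["language"], description)
        groups[key].append(record)
    # Each group corresponds to one collapsed result item.
    return list(groups.values())


# Hypothetical example data
records = [
    {"id": "a1", "collection": "Corpus A", "language": "nl", "description": None},
    {"id": "a2", "collection": "Corpus A", "language": "nl", "description": None},
    {"id": "a3", "collection": "Corpus A", "language": "nl", "description": "Interview, 1998"},
    {"id": "b1", "collection": "Corpus B", "language": "nl", "description": None},
]

collapsed = collapse_records(records)
print(len(records), "records shown as", len(collapsed), "result items")
```

In this made-up example the two records in Corpus A without a description fall into one group, while the record with a distinct description remains a separate result item, which is why more distinct descriptions reduce the amount of collapsing.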

As you may know, we are always looking for new members for our ‘testers pool’. You can find out more, and sign up if you like, via this page.

Curation module 2.0

The Metadata Curation Module, an online tool developed by ÖAW-ACDH that provides insight into the quality of metadata definitions and records, has been updated to its second major release. It offers statistics in terms of completeness and coverage of VLO facets at the level of metadata profiles, collections and individual records. Its logic is now fully aligned with that of the VLO. Furthermore, the curation module gathers and presents information on the validity of resource links provided in published metadata. Users familiar with previous versions of the curation module should note that as of the new release, all analyses presented in the module are based on CMDI 1.2 versions of the metadata, even if the original was provided as CMDI 1.1.

Details on the workings of the curation module can be found via the application’s GitHub repository. A more detailed overview of changes compared to previous versions will be added to this repository in the near future.

Interesting reads

Digital Objects as Drivers towards Convergence in Data Infrastructures

In a recent paper Wittenburg & Strawn argued that it is time to come to convergence in the domain of data, based on an analysis of the developments of earlier large infrastructures such as electricity, the Internet and the Web. In this paper we look, from a technical point of view, at how the concept of Digital Objects (DO) with its inherent abstraction, binding and encapsulation potential could help us to build FAIR-compliant data infrastructures as envisaged, for example, by the European Open Science Cloud. Basic components of such a DO-based approach have been specified and are being used by some scientific communities.

Link: http://doi.org/10.23728/b2share.b605d85809ca45679b110719b6c6cb11