
Centre news vol. 42 - April 2019

Submitted by Dieter Van Uytvanck on

Upcoming assessment round

As announced earlier, the next assessment round is planned for spring, with 29 May as the submission deadline. If you intend to participate in this round, please send an email to assessment[at]clarin[dot]eu before 1 May, since so far we have not received any indication that any centres will apply. If we have not received any expression of interest by 1 May, this assessment round will be cancelled.

In any case, there will be an assessment round in the Autumn of 2019.

 

Centre meeting 2019: updated programme

The agenda of the Centre Meeting has been updated, based on the topics you suggested. If you plan to attend, please register and book your hotel as soon as possible, as many places are already fully booked.

New C-centre: SADiLaR

The South African SADiLaR repository is now registered as an official CLARIN C-centre. The corresponding metadata can be found in the Virtual Language Observatory.

Review CoreTrustSeal

As recently announced, a new version of the CoreTrustSeal will be in use as of January 2020. The new version (2020-2023) of the requirements will not contain major changes, but to ensure an orderly transition, it will not be possible to submit a CoreTrustSeal request in November and December 2019.

It is also possible to submit feedback on the current guidelines until the end of April.

 

Virtual Language Observatory 4.6

In late February we released version 4.6.0 of the Virtual Language Observatory (VLO). As always with a new minor release, we combined various new features with both functional and technical improvements.

The most visible change compared to the previously deployed version is the newly introduced “result deduplication” feature. As of this version, users of the VLO may encounter boxes below search results stating that “The search results includes N record with the same title”. These boxes can be expanded to reveal additional matching records with a title identical to the ‘primary’ search result they are grouped with. Among these ‘duplicates’, the highest-ranking result, i.e. the one that best matches the user’s query, will always be presented at the top level. In some cases, this can save the user from having to go through dozens of pages of identical-looking search results.

 

Metadata providers who find that many of their records get grouped together in the search results can counter this by providing unique (and, as a best practice, descriptive) titles for their records. Furthermore, providing rich metadata, and in particular good resource descriptions, also helps the best-matching result appear at the top. Note that records with different object language information are always shown separately, even if they have the same title.
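The grouping behaviour described above can be illustrated with a minimal sketch. This is not the VLO's actual implementation; the function name and the field names `title` and `language` are hypothetical, and the input is assumed to be already sorted by relevance, best match first.

```python
def group_duplicates(ranked_results):
    """Group ranked results that share both title and object language.

    `ranked_results` is assumed to be a list of dicts sorted best match
    first. The keys "title" and "language" are hypothetical field names.
    """
    groups = {}  # dicts keep insertion order, so ranking order is preserved
    for record in ranked_results:
        # Records with different object languages are never merged,
        # even when their titles are identical.
        key = (record["title"], record["language"])
        groups.setdefault(key, []).append(record)
    # The first member of each group is its highest-ranking record; the
    # remaining members would sit behind the expandable box.
    return [
        {"primary": members[0], "duplicates": members[1:]}
        for members in groups.values()
    ]


results = [
    {"id": "a", "title": "Corpus X", "language": "nld"},
    {"id": "b", "title": "Corpus X", "language": "nld"},
    {"id": "c", "title": "Corpus X", "language": "deu"},
    {"id": "d", "title": "Corpus Y", "language": "nld"},
]
grouped = group_duplicates(results)
```

In this example, records "a" and "b" end up in one group with "a" as the primary result, while "c" stays separate because its object language differs. Giving "b" a unique title would make it a top-level result of its own.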

Other improvements in this version include increased visibility of landing pages, easier-to-recognise PIDs, information on PIDs, better performance when filtering facet values and, at the code level, refactorings that ensure fully synchronised value mapping and post-processing logic between the VLO and the Curation Module.

More information can be found in the VLO 4.6.0 release details on GitHub.

 

Virtual Collection Registry 1.3

At the end of February 2019, a new release of the CLARIN Virtual Collection Registry was made available at https://collections.clarin.eu. With this release, the user interface has been completely updated to reflect the CLARIN style and to add responsive behaviour. Furthermore, efforts have been made to simplify the collection creation process. This release is also the first release as an -hub thematic service, as can be seen in the marketplace.

Under the hood, various technical improvements have been made, including the following highlights: (1) a new approach for the integration of external applications such as the Virtual Language Observatory, (2) a new administration interface to perform maintenance on the collections if needed and (3) an extension to the generated metadata for collections to include a landing page resource proxy.

The codebase, issues and roadmap can be found on GitHub.

 

Data & Service announcements from our centres

Both the IvdNT and ILC4CLARIN have provided new Federated Content Search endpoints, adding many corpora and collections (e.g. Nederlab). These are now available in the Content Search aggregator.

CMDI Best Practices guide: updated version available

 

A new version of the CMDI best practices guide has been prepared by members of the CMDI and metadata curation task forces. The document contains a collection of general best practices regarding modelling and authoring CMDI metadata within CLARIN. It also describes a number of common approaches and problems related to metadata within the CLARIN infrastructure.

Some additional information about the guide is available here.

 

Past issues

On 24 March, the CLARIN Nexus container was compromised (as a result of CVE-2019-7238) and had to be restored. Other services on the same CESNET host (catalog.clarin.eu, office.clarin.eu, trac.clarin.eu, trac.clarin.nl, cmdi.clarin.eu, svn.clarin.eu) were not affected but were disabled during the investigation and recovery process.

The issue was resolved on 25 and 26 March. Nexus has been redeployed and repopulated. We have no indication whatsoever that any service or data was actually affected, or that any malware propagated; the only known effect was a cryptocurrency miner being installed within the affected container. The pre-compromise state has been restored.

 

Maintenance announcements

Activity/Issue | Start date and time (CET) | End date and time (CET) | Impact
ACDH - Physical Migration of Servers | 2019-04-18 08:00 | 2019-04-18 11:00 | Curation Module will be inaccessible.
ACDH - Physical Migration of Servers | 2019-04-19 07:00 | 2019-04-20 12:00 | Curation Module (along with all other ACDH services) will be inaccessible.

More information: see clarin.eu/status