Centre news vol. 53 - December 2021

Submitted by Dieter Van Uytvanck on 17 December 2021

Log4j vulnerability 

Last week, a serious log4j issue (CVE-2021-44228) was discovered that potentially enables remote code execution.

More background information

Impact on central CLARIN services

Many of the CLARIN services are implemented in Java and therefore use log4j. Our team has been analyzing and patching the affected central services, and this work is ongoing. So far, the biggest theoretical risk is associated with UnityIDM, which is used for the CLARIN Identity Provider. This service has been patched to mitigate the log4j vulnerability. We have found no indication that the vulnerability was actually exploited.

Certain versions of the Virtual Language Observatory (VLO) are vulnerable because Apache Solr, a necessary component for running the VLO that is therefore included in its distribution, uses an affected version of Log4j. A new version of the VLO that fixes this has been released: VLO 4.10.2. If you are running an instance of the VLO locally or on a server, you should consider upgrading to this version, or at least upgrading your Solr component to a version that is not vulnerable. Details are available on the Apache Solr security page.

The link checker (which provides the input for the link statistics on the CLARIN curation dashboard) will be disabled until a patched version has been prepared and deployed. While it is paused, no new links will be checked, but the existing link checking statistics will remain accessible via the dashboard. We expect to be able to resume link checking in the course of next week.

Impact on individual CLARIN centres

Not affected: DSpace 6.x (basis of CLARIN-DSpace); jOAI (the path of the included log4j classes, package org.apache.log4j, indicates log4j version 1.x and not the affected version 2).

Not affected directly, but a check is needed to see whether log4j is used as the SLF4J binding: FCSSimpleEndpoint and FCSSimpleClient (no access? See our trac instructions). FCSSimpleEndpoint has been used by several CLARIN centres as the starting point for a local endpoint.

In general, we recommend taking a close look at all dependencies used by Java-based applications and looking for the affected log4j versions. In most cases this will require some manual inspection.
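Part of this inspection can be scripted. The following is a minimal sketch in Python, not an official CLARIN tool: it walks a directory tree, opens every .jar, .war and .ear file (including archives nested inside them) and reports where log4j classes are found. It relies on the same package-path reasoning used for jOAI above: classes under org/apache/log4j/ indicate log4j 1.x, while the JndiLookup class under org/apache/logging/log4j/core/ indicates log4j-core 2.x. The function names scan_archive and scan_tree are made up for this illustration. The sketch does not determine the exact log4j version, and it will miss classes that were unpacked into a directory or relocated into another package, so a hit still needs a manual version check and a clean result is not proof of safety.

```python
import io
import zipfile
from pathlib import Path

# Class that ships with log4j-core 2.x, the component affected by CVE-2021-44228.
JNDI_LOOKUP = "org/apache/logging/log4j/core/lookup/JndiLookup.class"
# Package prefix used by log4j 1.x, which is not affected by this CVE.
LOG4J1_PREFIX = "org/apache/log4j/"
ARCHIVE_SUFFIXES = (".jar", ".war", ".ear")


def scan_archive(data, label, findings):
    """Record log4j traces found in one archive, descending into nested archives."""
    try:
        archive = zipfile.ZipFile(data)
    except zipfile.BadZipFile:
        return  # not a readable zip archive, skip it
    with archive:
        names = archive.namelist()
        if JNDI_LOOKUP in names:
            findings.append((label, "log4j-core 2.x found, check its version"))
        if any(n.startswith(LOG4J1_PREFIX) and n.endswith(".class") for n in names):
            findings.append((label, "log4j 1.x found"))
        for name in names:
            if name.lower().endswith(ARCHIVE_SUFFIXES):
                # Nested archive, e.g. WEB-INF/lib/*.jar inside a .war file.
                nested = io.BytesIO(archive.read(name))
                scan_archive(nested, f"{label}!{name}", findings)


def scan_tree(root):
    """Return a list of (location, description) pairs for all archives under root."""
    findings = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in ARCHIVE_SUFFIXES:
            scan_archive(path, str(path), findings)
    return findings
```

Calling scan_tree on the deployment directory of an application returns one entry per archive that contains log4j classes. For nested archives, the location is shown as outer!inner.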

Lxml vulnerability

Lxml is a popular Python library for parsing and processing XML files. On 13 December 2021 a severe vulnerability was reported in lxml 4.6.4 and below, which allows remote code execution through crafted scripts that pass through the HTML Cleaner and/or via SVG files. It is strongly recommended to upgrade any lower version of lxml to 4.6.5 or later. This is currently the only known solution. The new version can be obtained via pip from PyPI or cloned from a repository according to the authors' instructions.
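To see whether a given Python environment still has a vulnerable lxml installed, a small check along the following lines may help. This is a sketch only: the helper names parse_version, lxml_needs_upgrade and installed_lxml_version are invented for this example, the version comparison looks only at the first three numeric components, and importlib.metadata requires Python 3.8 or later.

```python
from importlib import metadata
from itertools import takewhile

# First lxml release that contains the fix, as stated above.
FIXED = (4, 6, 5)


def parse_version(text):
    """Turn a version string such as '4.6.4' into a tuple of three integers."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)  # treat '4.6' as '4.6.0'
    return tuple(parts)


def lxml_needs_upgrade(installed):
    """Return True if the given lxml version string is older than the fixed release."""
    return parse_version(installed) < FIXED


def installed_lxml_version():
    """Return the lxml version installed in this environment, or None if it is absent."""
    try:
        return metadata.version("lxml")
    except metadata.PackageNotFoundError:
        return None
```

If installed_lxml_version() returns a version for which lxml_needs_upgrade is true, running pip install --upgrade "lxml>=4.6.5" in that environment brings it up to a fixed release. The check has to be repeated for every virtual environment or container image in use.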

Impact on central CLARIN services

The CLARIN Centre Registry was vulnerable and was patched in version 2.3.4 deployed on 16 December.

The CLARIN Digital Object Gateway is not vulnerable at runtime. The lxml dependency is only used during release testing and will be upgraded in the next version.

Other central services that do not offer user interaction, such as the CLARIN central metadata pipelines, are not affected.

CMDI session #2 - presentations

The second session took place on 14 December, on the topic of vocabularies. All presented slides are now available on the event page.