New version of the VLO extends support for licence and availability information

Submitted by Twan Goosen on 30 March 2016

By Twan Goosen and Krister Lindén

A new version of the VLO faceted browser was released last week. Version 3.4 includes a number of small fixes, as well as a couple of new features and functional enhancements. The main focus during the development of this version has been the licence and availability information provided within individual metadata records. The VLO development team has worked together with the metadata curation experts from the Austrian Centre for Digital Humanities and the CLARIN Legal and Ethical Issues Committee to improve the extraction of such information from the provided metadata, as well as the presentation of this information and the means by which users of the VLO can use it for searching and filtering.

The licence metadata highlights the availability of a resource, i.e. the access and usage conditions. It is assigned based on the licences under which the rights-holders make the resource available. Most of the availability conditions are standard conditions for storage and safe-keeping, but some of them vary according to the nature of the resource and the wishes of the rights-holders.

In order to save time for a resource user, the key conditions of the licence influencing resource availability are recorded in the metadata and visualised with icons comparable to laundry tags on clothes. CLARIN has adopted the well-established icons of the Creative Commons initiative while extending the set with some of the more frequent conditions traditionally used in language resource licences.

The availability metadata is an approximation of the licence with a narrow interpretation. This means that when reading a resource licence, the user should not find additional restrictions unless properly warned in the metadata, but may discover additional permissions.
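One way to picture the narrow interpretation is as a simple set relation: every restriction in the actual licence should also be announced in the metadata, while the metadata may announce more. The following is only an illustrative sketch; the function name and the restriction labels are hypothetical and do not reflect how the VLO represents licences internally.

```python
def is_narrow_interpretation(metadata_restrictions, licence_restrictions):
    """Return True if the metadata announces every restriction in the licence.

    The metadata may list extra restrictions (a conservative reading),
    but it must never omit one that the licence actually imposes.
    """
    return set(licence_restrictions) <= set(metadata_restrictions)


# Hypothetical restriction labels, chosen for illustration only.
licence = {"attribution", "non-commercial"}

# Conservative metadata: the user may discover an additional permission
# (redistribution is in fact allowed by the licence).
conservative_metadata = {"attribution", "non-commercial", "no-redistribution"}

# Incomplete metadata: the user would be surprised by an unannounced restriction.
incomplete_metadata = {"attribution"}

conservative_ok = is_narrow_interpretation(conservative_metadata, licence)
incomplete_ok = is_narrow_interpretation(incomplete_metadata, licence)
```

In this sketch the conservative metadata passes the check and the incomplete metadata fails it, which mirrors the rule that the licence may turn out to be more permissive than the metadata suggests, but not more restrictive.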

As a user, the first place in which you will encounter the availability information is most likely in the search results. The VLO shows one or more icons for all records to indicate the availability level, licence and/or usage constraints, depending on what information is available. By hovering over the icons or clicking on them, you can find out the exact meaning of any specific icon. Although there is no absolute ranking based on the availability level, records that are free to access and use for the general public or at least the academic community get a small 'boost' in the search results, which means that among records that match the other search criteria equally well, the more 'accessible' records will appear before the ones with more restrictive access conditions.
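The effect of such a boost can be sketched in a few lines. The boost factors, the record fields and the scoring function below are assumptions made for illustration; the actual weights and ranking mechanism used by the VLO are not described here.

```python
from dataclasses import dataclass

# Hypothetical boost factors: a small bonus for the more accessible levels.
AVAILABILITY_BOOST = {"Public": 1.10, "Academic": 1.05}


@dataclass
class Record:
    name: str
    relevance: float   # how well the record matches the other search criteria
    availability: str  # availability level of the record


def boosted_score(record):
    # Levels without an entry in the table get no boost (factor 1.0).
    return record.relevance * AVAILABILITY_BOOST.get(record.availability, 1.0)


def rank(records):
    # Highest boosted score first.
    return sorted(records, key=boosted_score, reverse=True)


records = [
    Record("restricted corpus", 2.0, "Restricted for individual"),
    Record("public corpus", 2.0, "Public"),
    Record("highly relevant restricted corpus", 3.0, "Restricted for individual"),
    Record("academic corpus", 2.0, "Academic"),
]

ranked_names = [r.name for r in rank(records)]
```

Because the boost is small, the clearly more relevant restricted record still comes first, while the three equally relevant records are ordered from 'Public' to 'Academic' to restricted. This is what "no absolute ranking" amounts to in this sketch.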

The same icons that appear in the search results are also shown in the page with record details. The icons are presented slightly larger and are accompanied by a textual description of the availability aspects they represent. If the concrete licence that applies to the resources in a record is known to the VLO, it is also displayed and in most cases a link is provided to a page that describes the licence in detail.

Some of the icons that show up in the search results and on the record pages also appear in a new panel labelled "Availability" that appears below the panel for facet value selection, namely the icons for 'Public', 'Academic', 'Restricted for individual' and 'Unspecified'. By making a selection from these availability levels, the search results will be narrowed down to only include the selected levels. This makes it easy, for example, to search for resources that can be used freely in an academic context, in which case one would keep only 'Public' and 'Academic' checked.
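Conceptually, the "Availability" panel acts as a filter on the availability level of each record. The sketch below assumes a hypothetical record structure (plain dictionaries with an "availability" key) and treats records without that key as 'Unspecified'; it is not the VLO's actual implementation.

```python
UNSPECIFIED = "Unspecified"


def filter_by_availability(records, selected_levels):
    """Keep only the records whose availability level is among the selected ones."""
    selected = set(selected_levels)
    return [
        record
        for record in records
        # Records without availability information count as 'Unspecified'.
        if record.get("availability", UNSPECIFIED) in selected
    ]


# Hypothetical example records.
records = [
    {"name": "open corpus", "availability": "Public"},
    {"name": "research lexicon", "availability": "Academic"},
    {"name": "licensed recordings", "availability": "Restricted for individual"},
    {"name": "legacy word list"},  # no availability information provided
]

# Resources that can be used freely in an academic context.
academic_use = filter_by_availability(records, ["Public", "Academic"])
academic_names = [record["name"] for record in academic_use]
```

Here `academic_names` contains only the open corpus and the research lexicon, corresponding to keeping just 'Public' and 'Academic' checked in the panel.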

Many records still do not provide any availability information in a way that can be extracted by the VLO. Such resources are currently available in the 'unspecified' licence category.

Sometimes the metadata record has been created with only a very general reference to a licence type. For such metadata records, a narrow interpretation of the licence into CMDI components has been made when harvesting the metadata for the VLO. However, metadata providers retain full control over the VLO licence metadata if they provide the CMDI components themselves.

Previously the licence metadata did not visibly affect the search results, but now that we have an abundance of resources to choose from, the less available resources will get less usage and fewer citations. This should serve as an incentive for rights-holders to license their resources as freely as possible and to make sure that the resources are labelled accordingly. For standard licences such as CC-BY, the VLO offers well-established label sets.

When depositing a resource at a CLARIN Center, the depositor guarantees that others (including the CLARIN Center) have lawful access to the resource. However, the metadata record is only an approximation of the resource licence, so it remains the users' responsibility to acquaint themselves with the details of the licence governing the use of the resource. This is especially important when the VLO describes a resource available at a non-CLARIN repository.

You can visit the VLO at vlo.clarin.eu, where you will find over 150 collections, altogether containing hundreds of thousands of records, about 60 thousand of which already provide availability information.