Updated VLO brings better ranking

Submitted by Twan Goosen on

This week, the VLO faceted browser has been upgraded to the new minor version 3.2. In this release, a number of user interface refinements are accompanied by further mapping improvements, including a new facet that helps with finding resources with specific availability levels, and by a revised search result ranking strategy.

To start with the user interface improvements: most notably, the list of resources on the page of an individual record is now paginated. This reduces page load times and makes navigation easier for records with a large number of resources. In the area of search, it is now possible to 'unset' the search query while keeping the facet selections as they are, by clicking the 'X' next to the search term in the breadcrumbs at the top of the page. Furthermore, a short definition of each facet can be viewed by hovering over the facet names in the facet boxes in the search interface.

Among the facet boxes, a new one has appeared that is labelled 'Availability'. The options in this facet are based on the licensing information provided in the metadata records themselves. Among the options are the values 'Free', 'Free for academic use' and 'Upon request', as well as a number of more (centre-)specific options that could not be mapped to one of the basic categories (yet). Note that the majority of records do not have licensing information specified, so selecting any of these values will exclude those records from the search results. In other words, the strength of this facet is in its precision, not its recall. Hopefully this will improve in the future as more centres start providing licensing information via the metadata.

Other changes in the mapping logic relate to the 'Language' facet. They prevent duplicate occurrences of the same language under names in different languages (e.g. 'German' and 'Deutsch') or under spelling variations, and they improve the processing of languages specified in records.
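To illustrate the kind of normalisation described above, here is a minimal sketch in Python. It is not the VLO's actual mapping code: the variant table and the function names `normalise_language` and `language_facet_values` are hypothetical and only show the general idea of collapsing name variants onto one facet value.

```python
# Hypothetical variant table, for illustration only; the real VLO mapping
# is not shown in this post and is certainly more extensive.
LANGUAGE_VARIANTS = {
    "german": "German",
    "deutsch": "German",
    "english": "English",
    "englisch": "English",
}


def normalise_language(raw: str) -> str:
    """Map a language name as found in a record to one canonical name.

    Unknown names are returned unchanged (apart from surrounding whitespace).
    """
    cleaned = raw.strip()
    return LANGUAGE_VARIANTS.get(cleaned.lower(), cleaned)


def language_facet_values(raw_values):
    """Return the canonical language names for a record, without duplicates,
    in order of first appearance."""
    seen = []
    for raw in raw_values:
        name = normalise_language(raw)
        if name not in seen:
            seen.append(name)
    return seen
```

With a table like this, a record that lists 'German' and 'Deutsch' contributes a single 'German' value to the facet instead of two separate ones.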

Finally, a first step has been made towards a set of changes to the ordering of search results. Ordering is now by relevance, where the overall document content is taken into account (using tf-idf as provided by the Solr search platform by default). In the upcoming version of the VLO, currently still under development, records will get a ranking depending on the field in which the search term(s) occur (for example, a match in the title generally makes for a more relevant document than a match in most other fields) but also on the nature of the record, for example whether it is a collection or not, or whether it has any resources. The next version of the VLO will also allow for advanced querying by means of a powerful query syntax that supports logical operations and searching in specific fields only.
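For readers unfamiliar with tf-idf, the following Python sketch shows the general idea behind relevance ordering. It uses a textbook tf-idf variant and is an assumption-laden illustration: it is not Solr's actual scoring formula, and the function names `tokenize`, `tf_idf_scores` and `rank` are made up for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Very naive tokeniser: lowercase and split on whitespace."""
    return text.lower().split()


def tf_idf_scores(query, documents):
    """Score each document against the query with a textbook tf-idf variant.

    tf  = occurrences of the term in the document / document length
    idf = ln(number of documents / documents containing the term) + 1
    This is a simplification, not the formula Solr uses internally.
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            doc_freq = sum(1 for other in tokenized if term in other)
            if doc_freq == 0 or not tokens:
                continue  # term occurs nowhere, or the document is empty
            tf = counts[term] / len(tokens)
            idf = math.log(n_docs / doc_freq) + 1.0
            score += tf * idf
        scores.append(score)
    return scores


def rank(query, documents):
    """Return document indices ordered from most to least relevant."""
    scores = tf_idf_scores(query, documents)
    return sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)
```

A term that occurs often in a document, but in few documents overall, pushes that document up the result list. The field-dependent ranking planned for the upcoming version could be thought of as an extra weight on top of such a score, but how the VLO will implement it is not described here.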