Content Search Tutorial

Introduction

The Aggregator application is part of the CLARIN-FCS federated content search infrastructure. It serves as a user interface for running queries against CLARIN-FCS resources and displaying the search results. The Aggregator communicates with components called endpoints, which are provided as a service by all centres that participate in the federated content search. Each endpoint provides access to one or more searchable resources. You can select one or more specific resources, by resource name or by language, or search through all of them. The query is sent to each relevant endpoint, which searches the content of its resources and returns the results; the Aggregator then collects the responses from all the endpoints and displays them.

Using FCS Aggregator

The FCS Aggregator is available at https://contentsearch.clarin.eu/. No login is required.

Basic search

To run a query, enter a search term and click the magnifying glass button or press the Enter key. The default search options apply: all the available FCS collections, in any language, are searched, and up to 10 hits per collection are displayed.

If you arrived at the Aggregator from the VLO (Virtual Language Observatory) and the resources you selected there are among the resources available for searching (e.g. TuebaDDC), the search is performed only in those selected resources.

Search options

The default search options can easily be changed by using the controls below the Search form.

Aggregator overview

The following search options are available:

  • Specify language of the resource:
    Click the left-most button in the options bar, initially displaying Any Language, to open a dialog box listing the language options. To select a language, click on it and then click the Close button. There are currently three language filtering strategies, which can be toggled with the selector at the bottom of the dialog window:
    1. only search in the collections known to have resources in the specified language;
    2. search in all the available collections, then use a language guesser and display only the results having the selected language;
    3. search only the collections having resources in the desired language, and then filter the results by using the language guesser.
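
The three strategies above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Aggregator's actual implementation: the Collection class, the search function, and the toy guess_language helper are all hypothetical names, and a real language guesser is far more elaborate.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Collection:
    name: str
    languages: Set[str]   # languages the collection is known to contain
    records: List[str]    # stand-in for the hits an endpoint would return


def guess_language(text: str) -> str:
    """Toy language guesser (hypothetical): flags a few German function words."""
    german_markers = {"der", "die", "das", "und", "ist"}
    return "deu" if german_markers & set(text.lower().split()) else "eng"


def search(collections: List[Collection], language: str, strategy: int) -> List[str]:
    """Apply language filtering strategy 1, 2 or 3 as numbered in the list above."""
    if strategy not in (1, 2, 3):
        raise ValueError("strategy must be 1, 2 or 3")
    filter_collections = strategy in (1, 3)  # rely on collection metadata
    filter_records = strategy in (2, 3)      # rely on the language guesser
    hits = []
    for collection in collections:
        if filter_collections and language not in collection.languages:
            continue
        for record in collection.records:
            if filter_records and guess_language(record) != language:
                continue
            hits.append(record)
    return hits


collections = [
    Collection("A", {"deu"}, ["das Haus ist alt", "the house is old"]),
    Collection("B", {"eng"}, ["die Katze", "a cat sleeps"]),
]
```

In this toy data, strategy 1 keeps every record of collection A (including the English one), strategy 2 also finds the German record hidden in the "English" collection B, and strategy 3 returns only the German record of collection A.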

Language Dialog

  • Specify collections to search in:
    By default, all resources are selected. To restrict your search to a specific set of resources, click the middle button in the control bar, initially displaying All available collections, and a dialog window will open. In this dialog you can select or deselect all collections at once, or pick individual collections.
    Some collections also have sub-collections; these have a link that expands the collection so that you can explore, select, and deselect its sub-collections.

Collections Dialog

    The collections view also lets you search the collections' attributes, filtering the displayed list down to the collections of particular interest.

Collections Search

  • Specify number of records per collection per request:
    By default, 10 records per collection are returned. You can change this number with the right-most control of the control bar. When more records are available, you can retrieve them at any time in the focused view.

Search results

Search results start appearing as soon as the search is invoked. The query is sent to all the selected corpora, and the results are displayed dynamically as they come in, so you can start inspecting the search results (the records) before all the corpora have responded. A progress bar at the top of the page shows the status of the search. The search results are initially displayed in a textual view.
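
The idea of showing results as they arrive, together with a progress indicator, can be sketched as follows. This is a minimal illustration, not the Aggregator's code: query_endpoint simulates an endpoint with an artificial delay, and the endpoint names and search_all are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def query_endpoint(name, delay, hits):
    """Simulated endpoint: sleeps to mimic latency, then returns its hits."""
    time.sleep(delay)
    return name, hits


def search_all(endpoints):
    """Yield (done, total, name, hits) each time an endpoint responds."""
    total = len(endpoints)
    with ThreadPoolExecutor(max_workers=max(1, total)) as pool:
        futures = [pool.submit(query_endpoint, *endpoint) for endpoint in endpoints]
        # as_completed hands back futures in the order they finish,
        # not in the order they were submitted.
        for done, future in enumerate(as_completed(futures), start=1):
            name, hits = future.result()
            yield done, total, name, hits


endpoints = [
    ("slow-corpus", 0.3, ["hit 1"]),
    ("fast-corpus", 0.0, ["hit 2", "hit 3"]),
]
for done, total, name, hits in search_all(endpoints):
    print(f"[{done}/{total}] {name}: {len(hits)} record(s)")
```

The faster endpoint is reported first even though it was submitted second, which mirrors how records from quick corpora can be inspected while slower ones are still being searched.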

Aggregator Results

You can switch from the textual view to a concordance view, which shows the search term in the middle with its left and right context on either side.
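
A concordance (keyword-in-context) line can be produced from a plain text hit roughly as shown here. This is a sketch under simple assumptions: the concordance function and its width parameter are hypothetical, matching is a literal substring search, and the context is cut by character count rather than at word boundaries.

```python
import re


def concordance(text, term, width=30):
    """Return one aligned keyword-in-context line per occurrence of term."""
    lines = []
    for match in re.finditer(re.escape(term), text):
        left = text[max(0, match.start() - width):match.start()].strip()
        right = text[match.end():match.end() + width].strip()
        # Right-align the left context so the search term lines up in a column.
        lines.append(f"{left:>{width}}  {match.group()}  {right}")
    return lines


text = "The quick brown fox jumps over the lazy dog"
for line in concordance(text, "brown", width=10):
    print(line)
```

Because the left context is padded to a fixed width, the search term starts in the same column on every line, which is what makes a concordance view easy to scan.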

Aggregator Results with Context

The entire set of search results can be downloaded to your computer by clicking Download. Several export formats are offered.

Aggregator Results Download

Focused View

For more information on a result set, click the View button at the top right of the result pane. This opens the View dialog, which shows more information about the collection the results come from and offers the option to download only the visible results.

Aggregator View

The More Results button at the bottom of this dialog sends a new query to the same collection with the same search term, then retrieves and displays the additional results. You can repeat this until no more results are available.
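
Repeatedly asking for more results amounts to paging through a collection. The sketch below illustrates that loop under stated assumptions: fetch_records is a hypothetical stand-in for one request to an endpoint, and a page shorter than the requested size is taken to mean that nothing is left.

```python
def fetch_records(collection, start, count):
    """Hypothetical stand-in for one request: up to count records from start."""
    return collection[start:start + count]


def fetch_all(collection, page_size=10):
    """Keep requesting further pages until the collection has no more records."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    records, requests = [], 0
    while True:
        page = fetch_records(collection, len(records), page_size)
        requests += 1
        records.extend(page)
        if len(page) < page_size:  # short or empty page: nothing left
            break
    return records, requests


collection = [f"record {i}" for i in range(1, 26)]  # 25 matching records
records, requests = fetch_all(collection)
print(f"{len(records)} records retrieved in {requests} requests")
```

With 25 matching records and the default of 10 records per request, the first request plus two further "More Results" requests retrieve everything.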

To process the search results with the WebLicht application, click the Use WebLicht button and confirm by clicking the Send to WebLicht button. WebLicht can process result sets in a couple of major languages.

WebLicht Connection