In the first phase of the ParlaMint project (July 1 2020 – Sept. 30 2020), parliamentary corpora were compiled for four countries: Bulgaria, Croatia, Poland and Slovenia. The corpora were encoded according to the ParlaMint XML schema, a specialisation of the Parla-CLARIN TEI format, and linguistically annotated with Universal Dependencies and named entities. They are available from the CLARIN.SI repository and through the associated concordancers.
The sessions in the corpora were also marked up as belonging to the COVID-19 part of the corpus (Oct 2019 – July 2020) or to its reference subset (2015 – Oct 2019).
With this call we invited proposals to add parliamentary corpora for additional countries to the ParlaMint collection.
Please note that the indicated timespans for the COVID-19 part (Oct 2019 – July 2020) and for the reference subset (2015 – Oct 2019) have to be covered. Both can be extended if resources and time permit: the COVID-19 part to the end of 2020 and beyond, and the reference subset to before 2015.
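As an illustration of the split described above, the following minimal sketch assigns a session to the COVID-19 part or the reference subset by its date. The call gives the timespans only at month level, so the exact boundary days, the constant names and the function names are all assumptions made for this example and are not part of the ParlaMint schema or guidelines.

```python
from datetime import date

# Assumed boundary days: the call states only "2015", "Oct 2019" and
# "July 2020", so the concrete days below are illustrative choices.
REFERENCE_START = date(2015, 1, 1)
COVID_START = date(2019, 10, 1)
COVID_END = date(2020, 7, 31)


def subcorpus(session_date: date) -> str:
    """Label a session as 'covid' or 'reference' using a single cutoff.

    Sessions outside the mandatory timespans are still labelled, because
    the call allows both parts to be extended.
    """
    return "covid" if session_date >= COVID_START else "reference"


def in_required_span(session_date: date) -> bool:
    """Return True if the date lies inside the timespan that must be covered."""
    return REFERENCE_START <= session_date <= COVID_END


if __name__ == "__main__":
    for d in (date(2016, 5, 2), date(2020, 3, 15), date(2014, 1, 1)):
        print(d.isoformat(), subcorpus(d), in_required_span(d))
```

A single cutoff keeps the labelling valid for extended corpora, while `in_required_span` separates the mandatory coverage from optional extensions.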
ParlaMint Call Results
The proposals of the following applicants were assessed and approved by the ParlaMint team together with representatives of the CLARIN Board of Directors:
| Applicant | Institution | Language |
|---|---|---|
| Paul Rayson | Lancaster University | English |
| Ruben van Heusden | University of Amsterdam – ILPS research group | Dutch |
| Steinþór Steingrímsson | The Árni Magnússon Institute for Icelandic Studies | Icelandic |
| Tomas Krilavičius | Applied Informatics dept., Vytautas Magnus University (Vytauto Didžiojo university) | Lithuanian |
| Barbora Hladká | Charles University | Czech |
| Giulia Venturi | Institute for Computational Linguistics "A. Zampolli" (ILC-CNR) | Italian |
| Çağrı Çöltekin | University of Tübingen | Turkish |
| Costanza Navarretta | University of Copenhagen | Danish |
| Miklós Sebők | Centre for Social Sciences, Budapest, Hungary | Hungarian |
| Giancarlo Luxardo | Praxiling UMR 5267 | French |
| Roberts Dargis | Institute of Mathematics and Computer Science, University of Latvia | Latvian |
| Petru Rebeja | Alexandru Ioan Cuza University of Iași | Romanian |
| Jesse de Does | Instituut voor de Nederlandse Taal | Belgian Dutch/French |
The activities envisaged for this call include:
Extension of the ParlaMint model to 6 new countries.
For each country, the following specific activities are expected from applicants:
- Obtaining data for the COVID-19 and reference parts of the corpus
- Conversion of the data into the ParlaMint format
- Linguistic processing of the corpus with Universal Dependencies, preferably including a suitable NER module
- Producing documentation on the provided corpus [similar to: link to description of corpora]
The ParlaMint team provides the following:
- Dedicated guidelines on how to prepare the data [link to PDF]
- The already existing corpora as models and the ParlaMint schema: http://hdl.handle.net/11356/1345
- Upload to the concordancers:
  - NoSketch Engine: https://www.clarin.si/noske/parlamint.cgi
  - NoSketch Engine (public): https://www.clarin.si/noske/index-en.html
  - Kontext: https://www.clarin.si/kontext/
- Upload to Parlameter: https://parlameter.org/
Size of funding / duration per proposal
- Funding: 5,000 Euro per project
- Timing and Duration: December 1, 2020 – March 31, 2021 (4 months)
- Qualifications of the team involved.
- Status of the available and/or accessible parliamentary corpora.
- Diversity of parliaments and languages.
- Potential with respect to the project goals.
- The applications will be assessed by the ParlaMint team together with representatives of the CLARIN Board of Directors.
- If more proposals are received than can be funded, the status of the available corpora and the potential with respect to the project goals will be taken into account.
- The proposer should be affiliated with an institution that is part of a CLARIN consortium in a CLARIN member or observer country. In case the structure of a national consortium is not in place yet, or not specified unambiguously, applicants should check with the National Coordinator whether he/she can support the application.
- Personnel costs, including the relevant indirect and administrative costs, are eligible for funding.
Each participant is responsible for assembling all cost claims relevant for the project. The sum will be paid in one installment after the corpora have been delivered. Thus, the payment is envisaged for March or April 2021.
An expression of interest is expected that outlines the motivation behind the application, the expertise of the team, and the status of the parliamentary data. The applicant can do so by filling out this application form.
26 October 2020: Call Issued
16 November 2020: Submission deadline
20 November 2020: Results announced
1 December 2020: Projects start
30 March 2021: Delivery of the Converted Corpora