Blog post written by Jan Odijk (Utrecht University, CLARIAH NL)
On 23 and 24 May the CLARIN ParlaFormat workshop was held in Amersfoort, the Netherlands. This workshop was organized by the CLARIN Interoperability Committee, a subcommittee of CLARIN’s National Coordinators’ Forum.
There were 25 participants from 13 different countries, as well as participants from the CLARIN Interoperability Committee and from the CLARIN ERIC Board.
The goal of the workshop was to present an outline of a standard format (proposed name: parla-CLARIN, a subset of TEI) for parliamentary data to the research community, to assess the support for it, and to identify potential or real problems for its development and wide adoption. This proposal was prepared and presented by Tomaž Erjavec and Andrej Pančur (from CLARIN Slovenia).
The participants presented the formats they currently work with, indicated which aspects of these formats are important to them, and asked whether these are covered by the new proposal. The atmosphere was constructive throughout the workshop, with active contributions from all participants and lively discussions. In particular, there were extensive discussions of the existing standard Akoma Ntoso, which is in use in various parliaments, and of how the newly proposed format should relate to it.
After the presentations by the various participants, Tomaž Erjavec and Andrej Pančur responded to some of the questions about and criticisms of parla-CLARIN, and explained how they will address these in the coming period.
The parla-CLARIN format will now be further developed. Information about it is available on GitHub, and contributions can be made there. All participants will upload samples of their data there so that these can be taken into consideration during the further development of parla-CLARIN.
In about three months a revised version of parla-CLARIN will be made public, and we plan to organize a follow-up workshop with a shared task: all participants will convert their data into the parla-CLARIN format, report on problems encountered and share conversion scripts. If that workshop is successful, we plan to organize another follow-up workshop in which participants will address research questions that cover parliaments from multiple countries, which the uniform format of the parliamentary data will then make possible.
The agenda and the presentations are available on the ParlaFormat event page.
Joint Dinner of the ParlaFormat participants