User involvement workshop on Latvian treebank

Submitted by Linda Stokman on 28 May 2019

Blog post written by Inguna Skadiņa (CLARIN-LV)

Work-in-progress versions of the Latvian Treebank have been available for more than five years, and research results based on this language resource, including master's theses and publications, have been regularly disseminated in the humanities research community. Until now, however, there had been no hands-on workshop or other event to help the SSH (social sciences and humanities) community learn how to use the treebank and how to write queries for specific research questions.

On 9 May 2019, the national coordinator of CLARIN Latvia organized a hands-on workshop on how to search in the Latvian Treebank. The workshop attracted participants from different research organizations: Faculty of Humanities of the University of Latvia (UL), Latvian Language Institute (UL), Ventspils University of Applied Sciences, Agency of Latvian Language, Institute of Livonian Language (UL) and Institute of Mathematics and Computer Science (UL).

The Latvian Treebank is available in two formats: as a Universal Dependencies (UD) treebank and as a hybrid dependency-constituency treebank that allows Latvian-specific syntactic phenomena to be annotated. The hybrid dependency-constituency treebank, along with its UD version, is included in the LINDAT repository and is available via the PML-TQ browser to any researcher who would like to study Latvian syntax. The treebank currently contains more than 13,000 sentences.

In the workshop, participants were introduced to the hybrid dependency-constituency grammar model of the Latvian Treebank, including its dependency roles and phrase structures. In the hands-on session, typical queries, such as searching for a complex phrase or a subordinate clause, were introduced and examined.
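For readers who prefer to work with the UD version offline, a search for subordinate clauses can be roughly approximated in a few lines of Python. This is only a minimal sketch and not the PML-TQ queries shown at the workshop. It assumes the data is in the standard tab-separated CoNLL-U format, and it treats the UD relations csubj, ccomp, xcomp, advcl and acl as a stand-in for "subordinate clause", which is a simplification. The sample sentence is an invented English example, not treebank data, and the function names are hypothetical.

```python
# Assumption: these UD clausal relations approximate "subordinate clause".
CLAUSAL_RELATIONS = {"csubj", "ccomp", "xcomp", "advcl", "acl"}

# Invented illustration in CoNLL-U format (not taken from the Latvian Treebank).
SAMPLE_CONLLU = (
    "# text = She said that he left.\n"
    "1\tShe\tshe\tPRON\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tsaid\tsay\tVERB\t_\t_\t0\troot\t_\t_\n"
    "3\tthat\tthat\tSCONJ\t_\t_\t5\tmark\t_\t_\n"
    "4\the\the\tPRON\t_\t_\t5\tnsubj\t_\t_\n"
    "5\tleft\tleave\tVERB\t_\t_\t2\tccomp\t_\t_\n"
    "6\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
)


def parse_conllu(text):
    """Split CoNLL-U text into sentences, each a list of token dicts."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment such as "# text = ..."
        cols = line.split("\t")
        # Keep plain word lines only; skip multiword ranges and empty nodes.
        if len(cols) != 10 or not cols[0].isdigit() or not cols[6].isdigit():
            continue
        current.append({
            "id": int(cols[0]),
            "form": cols[1],
            "head": int(cols[6]),
            "deprel": cols[7],
        })
    if current:
        sentences.append(current)
    return sentences


def find_subordinate_clauses(sentence):
    """Return (clause head, relation, governing word) for clausal dependents."""
    by_id = {token["id"]: token for token in sentence}
    hits = []
    for token in sentence:
        base_relation = token["deprel"].split(":")[0]  # drop subtypes like acl:relcl
        if base_relation in CLAUSAL_RELATIONS:
            governor = by_id.get(token["head"])
            hits.append((token["form"], token["deprel"],
                         governor["form"] if governor else "ROOT"))
    return hits


for sentence in parse_conllu(SAMPLE_CONLLU):
    print(find_subordinate_clauses(sentence))
```

The hybrid dependency-constituency annotation carries phrase structure that the UD relations do not expose, so queries about complex phrases are better expressed in the PML-TQ browser itself.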