The Project
In recognition of its outstanding quality, Jakob Lenardič’s PhD thesis was recently awarded best of the year 2021/2022 at the Faculty of Arts, University of Ljubljana. What made Lenardič’s thesis stand out was the fact that he combined two different approaches: he used a robust theoretical foundation rooted in formalism and paired it with corpus-based methodology more often associated with functionalism.

Apart from its method, Lenardič’s thesis stands out for another reason: it provides an explicit compositional semantics for Slovenian grammatical structures related to the extended verbal domain. For English, many such structures have already been formalised. Consider the distinction between a passive sentence such as 'The door was opened', which necessarily entails some kind of event initiator, and 'The door opened', which has a wider meaning and can also be read as 'The door opened by itself'. Lenardič explored the role that grammatical features play in the meaning of such sentences. He did so by trying to formally capture a piece of syntactic structure, which likely necessitates the use of specific grammatical features that govern how event initiation is realised both syntactically and semantically (i.e., participle morphemes). In addition, Lenardič focused on similar sentence constructions in Slovenian, studying grammatical voice, aspectual interpretation, and the interpretation of person and number features. This approach, he explains, has not been taken before in the case of Slovenian, so Lenardič’s thesis “is a bit foundational in this sense”.

Background
Lenardič holds a BA and MA in English literature and linguistics from the University of Ljubljana. When he started his PhD in 2016, he had little experience with computational linguistics or digital humanities (DH). Alongside his research, he was offered a job with Darja Fišer as an administrative assistant in the Department of Translation in 2016. That same year, Fišer was appointed Director of User Involvement at CLARIN. Over time, working together had an impact both on Lenardič’s understanding of DH and CLARIN and on his linguistic research. Lenardič explains: “Even though formally my main job concerned mostly CLARIN-related things such as Tour de CLARIN and CLARIN Resource Families until about 2020, in practice Darja also helped me pursue corpus linguistics research by getting me involved in relevant research projects at the national level, so my role slowly but surely shifted into that of a researcher that does both corpus and theoretical linguistics, often combining the two.”
In a nutshell, Lenardič’s thesis focused on two topics. First, he explored the pronominal system and case assignment in Slovenian. Second, he focused on the syntax-semantics interface of both English and Slovenian in relation to the so-called middle construction, which in English concerns structures like 'The book reads well'.
Lenardič’s work is based on a syntax-only, formalist approach to grammar, which he claims is underrepresented in linguistics at the University of Ljubljana. More functionalist approaches, he feels, can be vague and speculative in describing interactive factors of context, in other words “fuzzy when they don’t need to be fuzzy”. In his view, it is a misconception that formalist approaches do not consider context, and he believes that corpus-based approaches to grammar could benefit from taking formal aspects and the associated methodologies into account. His thesis is evidence that combining the two approaches leads to outstanding work.

To explore his research questions, Lenardič used the tools developed at CLARIN.SI, such as the noSketch Engine concordancer, on corpora relevant to his research interests. Specifically, Lenardič investigated two sets of corpora: Gigafida, which is the reference corpus for written standard Slovenian, and the corpora of the JANES family, which contain Slovenian computer-mediated communication on platforms such as Twitter and Facebook. Over time, and not least thanks to the expertise at CLARIN.SI, he developed more sophisticated skills in using the noSketch Engine concordancer, which were essential for exploring the linguistic structures he was interested in.
Working with corpora was essential, as it helped Lenardič to infer subtle characteristics of Slovenian language structure that he says he would never have figured out “by resorting to [...] intuition alone.” In his view, corpus work requires robust assumptions and should take an advanced approach to querying that goes beyond simple keyword searches, as this is crucial for a highly inflected language with pragmatic word order such as Slovenian.
Future directions - CLARIN and DH

Lenardič says: “In Slovenia, there is a sizable research community which does not seem to be aware of our national consortium and the services and wealth of data that it offers. Funding opportunities such as the Mobility Grants could be especially useful for young researchers.” To spread the word, he recently led a face-to-face workshop as part of the JTDH 2022 pre-conference programme, which introduced both the CLARIN.SI and CLARIN infrastructures to PhD students in linguistics and the wider humanities.
Lenardič also plans to continue collaborating with CLARIN in other ways. He says: “I hope to continue with the CLARIN Resource Families, which became much broader in scope this year due to the project funding, where a couple of projects are already underway.”