Intelligent Computer-Assisted Language Learning

Lärka (English LARK) - Language Acquisition Reusing Korp


Lärka (“LÄR språket via KorpusAnalys”, roughly “learn the language via corpus analysis”), with its English equivalent “Lark” (Language Acquisition Reusing Korp), is the ICALL platform of Språkbanken (the Swedish Language Bank). The main aim of ICALL (Intelligent Computer-Assisted Language Learning) is to draw on the opportunities offered by language resources, such as corpora, lexicons and natural language processing (NLP) components (lemmatizers, parsers, etc.), to build more sophisticated and flexible applications for language learners and students of grammatical theory.

The work on Lärka started in the project ‘Systems Architecture for ICALL’, financed by NordPlus Sprog from 2011 to 2013. Specified as a modular web-based exercise generator that reuses available annotated corpora and lexical resources, Lärka is freely available and primarily targets learners of Swedish as a second/foreign language and students of Swedish linguistics. Being web-based, Lärka has the advantages of accessibility and ease of use.

Lärka is designed as a Service Oriented Architecture based on web services. The platform comprises two main components – user interface and web services – where the web services can be reused by other applications. Web services take care of exercise generation whereas the user interface collects user input, formats the web service output, and assigns behavior to buttons and menus.
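The split between exercise generation and presentation can be illustrated with a minimal Python sketch. It is not Lärka's actual code or API: the toy corpus, the `Exercise` fields and the function names `generate_pos_exercise` and `render_exercise` are all hypothetical, and the real services draw on annotated corpora rather than a hard-coded list. The sketch only shows the idea that the service returns structured data (here JSON) that any client could reuse, while the user interface is responsible for formatting it.

```python
import json
import random
from dataclasses import dataclass, asdict

# Hypothetical toy corpus of (token, part-of-speech tag) pairs.
# Invented for illustration; not taken from Lärka or Korp.
TOY_CORPUS = [
    [("Katten", "NN"), ("sover", "VB"), ("nu", "AB")],
    [("Hon", "PN"), ("läser", "VB"), ("boken", "NN")],
]


@dataclass
class Exercise:
    sentence: list      # tokens of the single-sentence context
    target_index: int   # position of the word to be classified
    options: list       # answer options offered to the user
    answer: str         # correct part-of-speech tag


def generate_pos_exercise(rng):
    """Service side (assumed shape): build one exercise, return it as JSON."""
    sentence = rng.choice(TOY_CORPUS)
    target_index = rng.randrange(len(sentence))
    # Offer every tag that occurs in the toy corpus as an option.
    tags = sorted({tag for sent in TOY_CORPUS for _, tag in sent})
    exercise = Exercise(
        sentence=[token for token, _ in sentence],
        target_index=target_index,
        options=tags,
        answer=sentence[target_index][1],
    )
    return json.dumps(asdict(exercise), ensure_ascii=False)


def render_exercise(payload):
    """User-interface side: format the service output for display."""
    data = json.loads(payload)
    words = list(data["sentence"])
    i = data["target_index"]
    words[i] = "[" + words[i] + "]"  # mark the target word
    return " ".join(words) + "\nOptions: " + ", ".join(data["options"])


if __name__ == "__main__":
    print(render_exercise(generate_pos_exercise(random.Random())))
```

Because the service output is plain JSON, a different client (for example a mobile app) could consume the same payload without any change to the generation code, which is the reuse the architecture is meant to allow.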

At the moment Lärka offers exercises for two target groups: students of linguistics and learners of Swedish*. Students of linguistics can train parts of speech, syntactic relations and semantic roles, whereas second language learners of Swedish can train spelling, vocabulary and inflection patterns. The available exercises share some common features, namely:

  • Training context: sentence. The objective of the Lärka-based exercise generator has, from the outset, been to use real-life language examples from corpora. Possible copyright issues are avoided by using only a single-sentence context. We are actively searching for alternatives for working with full texts.
  • Reference materials. Relevant articles are looked up in Wikipedia, Wiktionary and Karp, while a text-to-speech module provided by SitePal offers pronunciation of relevant words and sentences. Reference materials are shown in a separate field that can be hidden when not wanted.
  • Training modes: self-study, test and timed test. The self-study mode reveals all clues (e.g. reference articles, syntactic tree structure, pronunciation) and also lets users try several answer options. In the test modes, the clues are not revealed until an answer has been provided, and users cannot change their answer.
  • Feedback is offered in the form of immediate correct/incorrect symbols and a result tracker where information on correct/total number of answers is shown.
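
The behaviour of the training modes and the result tracker can be sketched as below. This is a hypothetical Python illustration, not Lärka's implementation: the names `Mode`, `ResultTracker` and `ExerciseItem` are invented, the timer of the timed test is not modelled, and the rule that only the first attempt per item counts towards the tracker is an assumption (the description above does not say how repeated self-study attempts are scored).

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    SELF_STUDY = "self-study"
    TEST = "test"
    TIMED_TEST = "timed test"  # the timer itself is not modelled here


@dataclass
class ResultTracker:
    correct: int = 0
    total: int = 0

    def record(self, is_correct):
        self.total += 1
        if is_correct:
            self.correct += 1

    def summary(self):
        # Correct/total number of answers, as shown to the user.
        return f"{self.correct}/{self.total}"


class ExerciseItem:
    def __init__(self, answer, mode):
        self.answer = answer
        self.mode = mode
        self.answered = False

    def clues_visible(self):
        # Self-study shows clues at once; test modes only after answering.
        return self.mode is Mode.SELF_STUDY or self.answered

    def submit(self, choice, tracker):
        # In the test modes the first answer is final.
        if self.answered and self.mode is not Mode.SELF_STUDY:
            raise ValueError("answer cannot be changed in test modes")
        is_correct = choice == self.answer
        # Assumption: only the first attempt per item is counted.
        if not self.answered:
            tracker.record(is_correct)
        self.answered = True
        return is_correct
```

For example, an item created with `Mode.TEST` reports `clues_visible()` as false until `submit` has been called once, and a second call to `submit` raises an error, whereas an item created with `Mode.SELF_STUDY` accepts repeated attempts.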

Recently, a text assessment function has been added to Lärka, with which reading comprehension texts or learner essays can be assessed for their CEFR level, i.e. a level of language proficiency according to the Common European Framework of Reference (A1, A2, B1, B2, C1, C2).

There is ongoing work on diagnostic testing and learner modeling.

* The previous version of Lärka is being migrated to new technology, and the newer version does not yet offer all the functionality of its predecessor.


SWE-CLARIN (Språkbanken)
Project leader
Elena Volodina, Lars Borin, Markus Forsberg

Språkbanken (UGOT), CLT (UGOT), Department of Swedish (UGOT), Lars Borin, Markus Forsberg, Jonatan Uppström, Ildikó Pilán, David Alfter