Lärka (English LARK) - Language Acquisition Reusing Korp

Lärka
Lärka ("LÄR språket via KorpusAnalys", Swedish for "learn the language via corpus analysis"), with its English equivalent "Lark" (Language Acquisition Reusing Korp), is the ICALL platform of Språkbanken (the Swedish Language Bank). ICALL (Intelligent Computer-Assisted Language Learning) aims to draw on the opportunities offered by language resources, such as corpora and lexicons, and by natural language processing (NLP) components, such as lemmatizers and parsers, to build more sophisticated and flexible applications for language learners and students of grammatical theory.
The work on Lärka started in the project "Systems Architecture for ICALL", financed by NordPlus Sprog from 2011 to 2013. Specified as a modular web-based exercise generator that reuses available annotated corpora and lexical resources, Lärka is freely available and primarily targets learners of Swedish as a second or foreign language and students of Swedish linguistics. Being web-based, Lärka offers the advantages of accessibility and ease of use.
Lärka is designed as a Service Oriented Architecture based on web services. The platform comprises two main components, a user interface and web services, where the web services can be reused by other applications. The web services take care of exercise generation, whereas the user interface collects user input, formats the web service output, and assigns behavior to buttons and menus.
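The division of labor described above can be illustrated with a minimal Python sketch. It is not Lärka's actual code or API: the function names, the `Exercise` fields, the JSON payload shape and the hard-coded example sentence are all assumptions made for illustration. The point is only that the service side produces a self-contained exercise description, while the interface side does nothing but format it.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Exercise:
    """Hypothetical exercise payload; the field names are illustrative only."""
    sentence: str
    target: str
    options: list
    answer: str


def generate_exercise_service(exercise_type: str) -> str:
    """Stand-in for a web service: returns an exercise as a JSON string.

    A real service would select a sentence from an annotated corpus;
    this sketch uses a single hard-coded example instead.
    """
    if exercise_type != "pos":
        raise ValueError(f"unsupported exercise type: {exercise_type}")
    exercise = Exercise(
        sentence="Hunden springer fort.",
        target="springer",
        options=["noun", "verb", "adverb"],
        answer="verb",
    )
    return json.dumps(asdict(exercise))


def render_exercise(payload: str) -> str:
    """Stand-in for the user interface: formats service output for display."""
    data = json.loads(payload)
    lines = [
        f"Sentence: {data['sentence']}",
        f"Which part of speech is '{data['target']}'?",
    ]
    lines += [f"  {i}. {opt}" for i, opt in enumerate(data["options"], start=1)]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_exercise(generate_exercise_service("pos")))
```

Because the service returns plain structured data, any other application could consume the same payload and present it differently, which is the kind of reuse the architecture aims for.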
At the moment Lärka offers exercises for two target groups: students of linguistics and learners of Swedish*. Students of linguistics can train parts of speech, syntactic relations and semantic roles, whereas second language learners of Swedish can train spelling, vocabulary and inflection patterns. Available exercises share some common features, namely:
- Training context: sentence. The objective with the Lärka-based exercise generator has, from the outset, been to use real-life language examples from corpora. Possible copyright issues are avoided by using only a single-sentence context. We are actively searching for alternatives for working with full texts.
- Reference materials. Relevant articles are looked up in Wikipedia, Wiktionary and Karp, while a text-to-speech module provided by SitePal offers pronunciation of relevant words and sentences. Reference materials are shown in a separate field that can be hidden when not wanted.
- Training modes: self-study, test and timed test. The self-study mode reveals all clues (e.g. reference articles, syntactic tree structure, pronunciation) and also lets users try several answer options. In the test modes, the clues are not revealed until the answer is provided, and users cannot change their answer.
- Feedback is offered in the form of immediate correct/incorrect symbols and a result tracker where information on correct/total number of answers is shown.
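The training modes and the result tracker described in the list above can be sketched as follows. This is a hypothetical model, not Lärka's implementation: the class and method names are invented, and the choice to count every self-study attempt toward the total is an assumption that the description does not settle.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    SELF_STUDY = "self-study"
    TEST = "test"
    TIMED_TEST = "timed test"


@dataclass
class ExerciseSession:
    """Hypothetical per-exercise state; names are illustrative only."""
    mode: Mode
    correct_answer: str
    answered: bool = False
    correct: int = 0
    total: int = 0

    def clues_visible(self) -> bool:
        # Self-study shows clues from the start; test modes only after answering.
        return self.mode is Mode.SELF_STUDY or self.answered

    def submit(self, answer: str) -> bool:
        # In the test modes an answer is final once given.
        if self.answered and self.mode is not Mode.SELF_STUDY:
            raise RuntimeError("answer cannot be changed in test modes")
        self.answered = True
        is_correct = answer == self.correct_answer
        # Assumption: every attempt counts toward the total.
        self.total += 1
        self.correct += int(is_correct)
        return is_correct

    def tracker(self) -> str:
        # Result tracker in the form correct/total.
        return f"{self.correct}/{self.total}"
```

For example, a self-study session with correct answer "verb" that receives "noun" and then "verb" would report `1/2` under the counting assumption above, while a test-mode session rejects the second attempt.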
Recently, a text assessment function has been added to Lärka, where reading comprehension texts or learner essays can be tested for their CEFR level, i.e. a level of language proficiency according to the Common European Framework of Reference (A1, A2, B1, B2, C1, C2).
There is ongoing work on diagnostic testing and learner modeling.
* The previous version of Lärka is being migrated to new technology, and the newer version does not yet offer all the functionality of its predecessor.
Språkbanken (UGOT), CLT (UGOT), Department of Swedish (UGOT), Lars Borin, Markus Forsberg, Jonatan Uppström, Ildikó Pilån, David Alfter