Panel CLARIN2021: The Role of Corpora for the Study of Language Use and Mental Health Conditions


Event name: Panel: The Role of Corpora for the Study of Language Use and Mental Health Conditions
Date: Wednesday, 29 September 2021
Time: 10:15 - 11:00 (CEST)
Venue: CLARIN virtual Zoom meeting
Panellists: Gloria Gagliardi, Stefan Goetze, Saturnino Luz, Khiet Truong
Moderator: Henk van den Heuvel

This panel is part of the CLARIN Annual Conference 2021.

About the Panel 

Automatic detection of mental health conditions from text and speech has become a very appealing research field in recent years. Research on the topic is now accumulating into an impressive body of literature and special sessions at conferences. The CLARIN conference is an excellent platform to discuss infrastructural and strategic issues related to the resources needed for this type of research, as well as their shareability. For instance, what biases may intrude on the data annotation and the tools developed, and how can these be avoided? And how can the challenges of collecting and sharing language resources that typically involve vulnerable people be handled? In this panel we will discuss these issues with experts in the field. Each of them will present a short pitch highlighting their research and the role they see for the CLARIN infrastructure in facilitating it. Each pitch will conclude with one or two brief statements, which will serve as the basis for the discussion with the session participants. Links to relevant publications and a short summary of the panellists' work in this domain can be found below.



Henk van den Heuvel

Henk van den Heuvel is director of the Centre for Language and Speech Technology (CLST) and head of the Humanities Lab at Radboud University (Nijmegen, NL). He has been involved in the collection, compilation and validation of many spoken and written language resources at the national and international level. He has been project leader and project participant in CLARIN-NL and CLARIAH projects. He chairs the DELAD initiative for sharing corpora of speech of individuals with communication disorders and he coordinates the activities of the CLARIN K-Centre for Atypical Speech Expertise. Between 2016 and 2018 he was president of ELRA.



Gloria Gagliardi

Gloria Gagliardi is currently senior assistant professor at the University of Bologna (IT). She has concentrated her professional experience on the study of the Italian language, specifically in its spoken form. Her main research interests currently encompass Clinical Linguistics, Psychometrics, Corpus Linguistics and the computational analysis of prosodic, semantic and pragmatic features of speech. She is the author of the textbook Linguistica per le professioni sanitarie (2019) and the popular science book Che cos’è la linguistica clinica (2021).
Outline of vision and contribution to the topic of the panel
Linguistic corpora are essential for the profiling of “atypical” verbal productions. Even though the automatic detection of physical and cognitive changes (e.g., due to cognitive frailty, neurodevelopmental disorders, or mental illness) has made rapid strides in the past decade, several theoretical and methodological issues remain unsolved. In particular, the filing and sharing of these data raise several ethical and legal issues that affect their full accessibility to the scientific community.
Over the last few years, my research group has collected several corpora of oral and written disordered productions (e.g., OPLON, semi-spontaneous speech by 48 patients with cognitive decline and 48 healthy controls, tested twice over an interval of one year; PMLAN, written texts by 17 girls with a clinical diagnosis of Anorexia Nervosa and 34 normal-weight peers, matched by gender, age and educational level). We have also developed lexical databases (e.g., DILLO - Database Italiano del Lessico per Logopedisti) and language tests (e.g., T-PEC - Test di Produzione Elicitata di Clitici; SMAAV - Semantic Memory Assessment on Action Verbs) to support Italian speech and language therapists in their clinical activities.
In my opinion, CLARIN can support this emerging field of research by i) providing advice on legal issues related to the collection, processing, and storage of this special category of personal data according to EU directives, and ii) promoting and enhancing the visibility of language resources developed in/for the clinical domain.
Publications related to the topic of the panel
Gagliardi G., Kokkinakis D., Duñabeitia J.A. (2021). Editorial: Digital Linguistic Biomarkers: Beyond Paper and Pencil Tests. Frontiers in Psychology, 12: 752238.
Cuteri V., Minori G., Gagliardi G., Tamburini F., Malaspina E., Gualandi P., Rossi F., Moscano M., Francia B. & Parmeggiani A. (2021). Linguistic feature of anorexia nervosa: a prospective case–control pilot study. Eating and Weight Disorders – Studies on Anorexia, Bulimia and Obesity.
Calzà L., Gagliardi G., Rossini Favretti R. & Tamburini F. (2021). Linguistic features and automatic classifiers for identifying Mild Cognitive Impairment and Dementia. Computer Speech & Language, 65: 101113.

Stefan Goetze

Stefan Goetze is a Senior Lecturer at the University of Sheffield, UK. Before that, he was deputy head of the department “Hearing, Speech, and Audio Technology” at the Oldenburg institute branch of the Fraunhofer Institute for Digital Media Technology IDMT, head of the group “Automatic Speech Recognition” and head of “Audio System Technology for Audiology and Assistive Systems”. He received his Dr.-Ing. in 2013 from the University of Bremen, Germany. His research interests include sound processing, enhancement and assessment for hands-free systems and assistive listening devices, as well as automatic speech and sound recognition systems for assistive technologies, human-machine interaction and computer dialogue systems for persons with normal or pathological hearing and speech production. Stefan Goetze was a lecturer at the University of Bremen from 2007 to 2020 and project leader and coordinator of national and international research projects in the fields of speech and text analysis and audio signal processing for human-machine interaction and the analysis of normal and dysarthric speech. He and his team developed screening and training systems for persons suffering from speech and speaking problems due to Parkinson's disease or late provision with cochlear implants.
Central research question on this topic 
The availability of data is particularly important for speech and language technology. Although the problem of limited data for specific (research) questions, such as the analysis of mental health conditions from the speech signal, has been identified, and large efforts are already being undertaken to collect and annotate such data, the amount and variability are still far from what we know from, e.g., the field of automatic speech recognition, where researchers and companies have made enormous efforts to collect audio and text data in the past.
In addition to collecting ever more data for specific research questions, the focus should also be on how these data could be used to answer other research questions. Some researchers, for example, already use speech data collected from persons suffering from one particular illness or condition and aim at answering research questions about different conditions. This possibility should be kept in mind already while collecting, annotating, and archiving the data.
Language resources (data/tools) used and/or developed
Efficient annotation and storage of audio and text data are essential for developing speech recognition and analysis systems, both for ‘normal’ and for pathological speech and language data. Stefan Goetze and his team regularly recorded audio data and used annotation tools such as Audacity and self-developed tools to enrich publicly available datasets such as Open SLR.
Stefan Goetze and his team contributed research and development on speech and speaking analysis and training systems for persons with Parkinson's disease and for at least severely hearing-impaired patients who suffer from speech and voice disorders due to insufficient auditory input (audiogenic speech disorders). To develop such systems in collaborative research projects with partners from industry, a large data-collection effort for speech material was essential. The material had to be phonetically balanced and reflect typical distortions at the phoneme, word and sentence level.
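Checking the phonetic balance of a prompt list can be automated. The sketch below is a minimal illustration only (the pronunciation lexicon, phoneme inventory and coverage threshold are made-up toy data, not the projects' actual material): it counts phoneme occurrences over a list of prompts and flags phonemes that fall below a minimum count.

```python
from collections import Counter

# Toy pronunciation lexicon mapping words to phoneme sequences.
# A real project would use a full lexicon and a language-specific inventory.
LEXICON = {
    "see": ["s", "iy"],
    "me": ["m", "iy"],
    "sun": ["s", "ah", "n"],
    "moon": ["m", "uw", "n"],
}

def phoneme_coverage(prompts, lexicon):
    """Count phoneme occurrences over a list of prompt sentences."""
    counts = Counter()
    for sentence in prompts:
        for word in sentence.lower().split():
            counts.update(lexicon.get(word, []))
    return counts

def underrepresented(counts, inventory, min_count=2):
    """Phonemes from the inventory occurring fewer than min_count times."""
    return sorted(p for p in inventory if counts[p] < min_count)

prompts = ["see me", "sun moon", "see sun"]
counts = phoneme_coverage(prompts, LEXICON)
missing = underrepresented(counts, {"s", "iy", "m", "ah", "n", "uw"})
# "uw" occurs only once in this toy prompt list, so it is flagged.
```

A balanced prompt list would then be revised until no phoneme is flagged, iterating between prompt design and coverage counting.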
Vision on how an infrastructure such as CLARIN could enhance the field
Creating awareness of the larger context in which research data could be used is important in the field, as is connecting researchers from different domains. Furthermore, information on available data and resources is still spread over the internet and sometimes difficult to find. Here, CLARIN makes a clear contribution.
Publications related to the topic of the panel
M. Meis, J.-H. Bach, A. Becker, K. Bilda, A. Erem, T. Feith, S. Goetze, J. Jürs, A. Radeloff, L. Tuschen, B. Tschuschke. Context and user requirement analyses of a new digital speech therapy system (THERESIAH). In Proc. Conf. on Implantable Auditory Prosthesis (CIAP), Lake Tahoe, CA, USA, p. 157, July 2019. 
S. Goetze, N. Moritz, J.-E. Appell, M. Meis, C. Bartsch and J. Bitzer: Acoustic User Interfaces for Ambient Assisted Living Technologies. In: Informatics for Health and Social Care, SI Ageing & Technology, vol. 35, no. 4, Dec. 2010. DOI: 10.3109/17538157.2010.528655.
N. Moritz, S. Goetze and J.-E. Appell: Ambient Voice Control for a Personal Activity and Household Assistant. In R. Wichert and B. Eberhardt (Eds.), Ambient Assisted Living - Advanced Technologies and Societal Change, Springer Lecture Notes in Computer Science (LNCS), Springer Berlin, ISBN: 978-3-642-18166-5, pp. 63-74, Jan. 2011.

Saturnino Luz


Saturnino Luz is a Reader at the Usher Institute, University of Edinburgh's Medical School (UK). He works in medical informatics, devising and applying machine learning, signal processing and natural language processing methods in the study of behaviour and communication in healthcare contexts. His main research interest is the computational modelling of behavioural and biological changes caused by neurodegenerative diseases, with a focus on the analysis of vocal and linguistic signals in Alzheimer's disease.

Outline of vision and contribution to the topic of the panel
My recent research has focused on speech and language "digital biomarkers" for late onset Alzheimer's disease (LOAD). LOAD is a neurodegenerative disease characterised by a prolonged and usually gradual decrease of cognitive functioning. As there is a need for cost-effective and non-invasive biomarkers for LOAD, researchers have become interested in identifying speech and language changes that might help diagnose and track the progression of LOAD (and other neurodegenerative conditions). While a number of studies have investigated speech and language features for the detection of AD and mild cognitive impairment, the field still lacks unbiased data resources against which different approaches can be systematically compared. My collaborators and I have been actively involved in creating and modelling such resources. The ADReSS Challenges on Alzheimer's Dementia Recognition through Spontaneous Speech are an example of this work.
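Timing and pause patterns are among the simplest speech features studied in this line of work. As an illustration only (not the ADReSS feature set), the sketch below computes a few pause statistics from hypothetical word-level timestamps such as a forced aligner might produce; the feature names and the 0.25 s pause threshold are illustrative choices:

```python
# Illustrative pause/timing features from word-level timestamps
# (start_sec, end_sec), e.g. as produced by a forced aligner.

def pause_features(words, pause_threshold=0.25):
    """words: list of (start_sec, end_sec) tuples in temporal order."""
    if len(words) < 2:
        return {"speech_rate": 0.0, "pause_rate": 0.0, "mean_pause": 0.0}
    total_time = words[-1][1] - words[0][0]
    # Gaps between consecutive words; only long gaps count as pauses.
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(words, words[1:])]
    pauses = [g for g in gaps if g >= pause_threshold]
    return {
        "speech_rate": len(words) / total_time,  # words per second
        "pause_rate": len(pauses) / total_time,  # pauses per second
        "mean_pause": sum(pauses) / len(pauses) if pauses else 0.0,
    }

# Example: four words over 3 s with one long hesitation (0.9 to 1.8 s).
feats = pause_features([(0.0, 0.4), (0.5, 0.9), (1.8, 2.3), (2.4, 3.0)])
```

Features of this kind would then feed a classifier trained to separate patient and control recordings; in practice they are combined with many acoustic and linguistic features.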
Publications related to the topic of the panel
Luz, S, Haider, F, Fuente, SDL, Fromm, D, MacWhinney, B (2020). Alzheimer's Dementia Recognition Through Spontaneous Speech: The ADReSS Challenge. Proc. Interspeech 2020, 2172-2176.

Martinc M, Haider F, Pollak S and Luz S (2021). Temporal Integration of Text Transcripts and Acoustic Features for Alzheimer's Diagnosis Based on Spontaneous Speech. Front. Aging Neurosci. 13:642647.

de la Fuente Garcia, S, Ritchie, CW, and Luz, S (2020). Artificial Intelligence, Speech, and Language Processing Approaches to Monitoring Alzheimer's Disease: A Systematic Review. 1547-1574.


Khiet Truong


Dr. Khiet Truong is an assistant professor in the Human Media Interaction group (University of Twente, NL) and has a background in speech communication and speech technology. Her research focuses on automatically analysing and interpreting nonverbal aspects of speech communication that are indicative of speaker characteristics and of conversation dynamics among speakers. Besides what is said, there is also a lot of information in how something is said: aspects of physical, socio-emotional, and mental states resonate, both consciously and unconsciously, in the voice and in interaction, and these are key foci in her research. Among other things, Khiet has worked on the automatic detection of laughter, automatic emotion recognition in speech, and the automatic generation of backchannels for artificial agents. The results can help conversational agents or social robots respond in a more appropriate way, enable more user-oriented multimedia search and retrieval, and support patients' health and clinicians' work more efficiently. She is an active member of the research community: she served on the technical programme committees of this year's Interspeech, ACM ICMI, and ACII, and is currently serving as a member of the Executive Committee of the AAAC (Association for the Advancement of Affective Computing), as an Associate Editor of IEEE Transactions on Affective Computing, and as an editorial board member of Computer Speech and Language.
Outline of vision and contribution to the topic of the panel
How can we develop and adopt safe and effective speech-based digital biomarkers for clinical use? Relevant subquestions include how to identify what safe and effective speech-based biomarkers are, and how to ensure uptake and trust with clinicians and clients.
To achieve the above, we need to collaborate with clinicians to develop databases that can ideally be shared among researchers. Sharing, however, is problematic due to privacy issues. Using existing, shared databases is great if they suit your research, but often they are not an ideal match, due to, for example, low signal quality or a lack of transcriptions or annotations. Although not directly related to mental health, we have recently recorded some databases ourselves:
  • Nazareth, D. S., Jansen, M. P., Truong, K. P., Westerhof, G. J., & Heylen, D. (2019, September). Memoa: Introducing the multi-modal emotional memories of older adults database. In 2019 8th International Conference on Affective Computing and Intelligent Interaction (ACII) (pp. 697-703). IEEE.
  • Jansen, M. P., Truong, K. P., Heylen, D. K. J., & Nazareth, D. S. (2020, May). Introducing MULAI: A Multimodal Database of Laughter during Dyadic Interactions. In Proceedings of the 12th Language Resources and Evaluation Conference (pp. 4333-4342).
In my view, CLARIN could help enhance and support my field in the following ways (mind you, I am not completely up to date with CLARIN's activities, so I might mention things that CLARIN is already offering):
  • Supporting data storage and sharing in a secure environment.
  • Offering tools and (standardised) pipelines in a user-friendly way to process large multimodal (i.e., audiovisual) datasets, with the emphasis on user-friendly: we often work in multidisciplinary teams, and the tools used in spoken and natural language processing are not always self-explanatory.
  • Raising awareness of the existence of spoken and natural language tools and infrastructure in different research communities. Clinicians, for example, hold a lot of unused data and could benefit from these tools, but they are often unaware that such tools exist.
Publications related to the topic of the panel
Nazareth, D.S., Tournier, E., Leimkötter, S., Janse, E., Heylen, D., Westerhof, G.J., Truong, K.P. (2019). An Acoustic and Lexical Analysis of Emotional Valence in Spontaneous Speech: Autobiographical Memory Recall in Older Adults. Proc. Interspeech 2019, 3287-3291.
Lamers, S. M., Truong, K. P., Steunenberg, B., de Jong, F., & Westerhof, G. J. (2014, June). Applying prosodic speech features in mental health care: An exploratory study in a life-review intervention for depression. In Proceedings of the Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality (pp. 61-68).
Jansen, M. P., Truong, K. P., Heylen, D. K., & Nazareth, D. S. (2020, May). Introducing MULAI: A Multimodal Database of Laughter during Dyadic Interactions. In Proceedings of the 12th Language Resources and Evaluation Conference (pp. 4333-4342).