SSHOC Speech-to-Text workshop: Linking Social Surveys to Linguistic Infrastructures through Speech Interviews

Submitted by Elisa Gorgaini on 9 April 2021

Tens of thousands of individuals are interviewed through social surveys across Europe each year. Respondents are selected at random from all walks of life, and the hour-long interviews provide a range of data that is valuable to researchers and, subsequently, to policy makers.

While complex life histories or events may be coded into the structured taxonomies required for cutting-edge sociological research, a large proportion of the information conveyed in an interview is lost. For example, a respondent's tone of voice, linguistic fluidity, and depth of vocabulary can provide insights into cognitive function, socio-economic status, or verbal reasoning skills. To make use of these lost data, social survey and linguistic infrastructures have to be integrated. Such integration underpins the vision behind, and forms the basis of, CLARIN's collaboration with the H2020 project SSHOC.

This workshop focuses on analysing voice-recorded interviews to provide both a proof of concept and a framework for future research that explores how the knowledge of language-based research infrastructures such as CLARIN can be combined with the data continuously collected by social surveys. The workshop is a crucial step in CLARIN’s involvement in the SSHOC project, as it brings together expertise to tackle unexplored potential.