In this Tour de CLARIN blog post, we present an in-depth interview with Susanne Nylund Skog, an ethnologist and folklore researcher who collaborates with the SWELANG K-Centre in the TillTal project. The interview was conducted via Skype by Jakob Lenardič.
1. Please describe your academic background
I am a researcher at the Institute for Language and Folklore in Uppsala, Sweden, where I work at the Department of Dialectology and Folklore Research, and an Associate Professor of ethnology at Uppsala University and of Nordic folklore studies at Åbo Akademi University, Finland.
I defended my doctoral dissertation in ethnology on childbirth stories at the University of Stockholm in 2002, and have since then done extensive ethnographic research on Jewish life in Sweden and on stories by birdwatchers. With performance and narrativity in focus, I have explored issues such as anti-Semitism, whiteness, intertextuality, emotions and materiality. I am currently doing research on archive collections within the project TillTal - Making spoken cultural heritage accessible for research, which is funded by Riksbankens Jubileumsfond, the Swedish Foundation for Humanities and Social Sciences.
2. How did you get involved with the K-Centre for the Languages of Sweden? What is the main goal of the TillTal project?
I first came in contact with Rickard Domeij and Gunnar Eriksson from the K-Centre at the SWE-CLARIN exploratory workshop for researching audio materials from a cross-disciplinary perspective. The workshop ended in a joint research grant proposal for the multidisciplinary project TillTal by the Institute for Language and Folklore, KTH Royal Institute of Technology and the Swedish National Archives.
The overall goal of the project is to make Sweden’s archive of recorded speech more accessible for humanities and social science research, which is also one of the main goals of the K-Centre. I am involved in the project as a qualitative researcher who studies the recordings and I collaborate with language technologists like Gunnar Eriksson who help me with technological solutions for my research questions.
3. How do speech recordings differ from other materials used in DH research? What does the TillTal project do to promote the use of speech recordings in multidisciplinary approaches?
Speech recordings represent a seriously underutilised resource of the Swedish memory institutions, at least for Humanities and Social Sciences purposes, where researchers often only work with secondary materials, such as transcriptions of the spoken materials, instead of investigating the recordings themselves. One problem is that the number of speech recordings is very large. The archives of the Institute for Language and Folklore alone contain around 25,000 hours of recorded speech. Paradoxically, this contributes to the fact that such materials are not often used by Humanities researchers, as speech is extremely challenging and time-consuming to work with and can be quite unmanageable without appropriate tools.
To help overcome this problem, the TillTal project has established three different case studies and one user study. In the case studies, research agendas from three different Humanities and Social Sciences fields are being pursued with the help of speech technologies. These are Case 1: From personal experience narratives to cultural heritage, which focuses on speech recordings in ethnology, Case 2: Linguistic variation in time and space, which involves collaboration between speech and language technologists and sociolinguists, and Case 3: Interaction patterns over time and type of conversation, which extends previous work within interaction analysis. In the user study, we are applying an activity-theoretical approach with the aim of involving researchers, such as me, and investigating how we use – and would like to be able to use – these archival speech resources.
4. Could you describe your research in collaboration with the K-Centre? Have there been any prominent results from this inter-disciplinary collaboration?
I am directly involved in Case 1: From personal experience narratives to cultural heritage, where I mostly work with a collection of Swedish folklore that was created by Karl Gösta Gilstring, a clergyman and high school teacher who lived in Sweden between 1915 and 1986. Gilstring worked on his collection for more than fifty years, and the result is regarded as the largest folklore collection assembled by a single Nordic researcher in modern times. It consists of more than 8,000 original letters, as well as 250 hours of recordings (mainly interviews conducted by Gilstring himself), from which Gilstring made 70,000 folklore records, divided into approximately one hundred parish collections and organised by subject matter, which aside from folk tales also includes descriptions of rural daily life and traditions.
In our case study, I am interested in establishing the motivations and scientific premises that Gilstring used to create his collection of folktales, and in investigating why it has become an integral part of the cultural heritage of Sweden. In the TillTal project, I explore the differences between the unedited audio interviews and his edited written versions that later appeared in the collection. A prominent finding in this respect is that when Gilstring wrote down the folktales he had collected from letters and by conducting oral interviews, he sometimes omitted parts of the story that he felt were his informant’s modern interpretations and not part of a “traditional” incarnation of the folktale. This goes to show that cultural heritage is socially constructed, in that Gilstring’s rather conservative attitude, which involved a rejection of modern ideals, directly influenced the content of what we nowadays perceive as our “traditional” folklore in Sweden.
The collection is also valuable because of the geographic distribution of the materials. Gilstring’s approximately 700 informants not only came from all over Sweden, but also from the Åland Islands and Finland, while around 60 of them were Swedes who had emigrated to America. This is important with respect to the map-based interface Digitalt kulturarv, which the SWELANG K-Centre is developing, since the interface allows me to trace the geographic origins of the letters that were sent to Gilstring by his informants. For instance, I have been able to observe – on the basis of the geocoded information specifying the location of an informant at the time he or she sent the letter – that after emigrating Swedish Americans typically did not stay at a particular place in North America for a long time, but rather moved all over the country, and sometimes even came back to Sweden for a time. Additionally, it was possible for me to observe that the emigrants often presented Sweden in a romanticized manner in their letters to Gilstring, painting the country in broader strokes in comparison to the descriptions in the letters by their compatriots who never left Sweden. This highlights the fact that the ways in which people perceive and remember a particular place (Sweden in the case of the Swedish Americans) are always socially and culturally constructed, and shaped by the individual who reports them.
5. What are the main obstacles of working with audio data? How does the K-Centre help you overcome them?
Just recently, I was conducting research on an audio recording that was made with one of Gilstring’s informants – a Swedish American called Carl Nelson, who came to America in 1896 when he was 18 years old. What’s interesting about the interview is that in certain parts Nelson repeats the same folk stories that he had already described to Gilstring in their previous written correspondence. Additionally, Nelson often jumps from one story to another and then later on returns to comment on a story he’s already told. Aside from Nelson’s rather messy narration with frequent digressions, the recording is 10 hours long in total, so it took me weeks to go through it.
This shows how time-consuming it is to analyse audio recordings. It is therefore incredibly important for me that TillTal gives me the opportunity to collaborate with language technologists like SWELANG’s Gunnar Eriksson, who provides me with guidance on the use of automatic speech-analysis methods. With these methods, I am able to go back and forth between the different segments of a long audio interview in a time-efficient manner and to interlink them with other related materials in different formats and with secondary sources. Indeed, one of the plans of SWELANG is to make available to the research community an environment in which various kinds of materials (e.g., audio recordings, written letters) can be combined so that, for example, dynamic links can be made from a recorded interview to a letter where the same subject or narrative is mentioned twice.
6. As a qualitative researcher, do you think there’s any room for improvement in the way data is presented and made available by large-scale research infrastructures?
I often feel as though the various domain-specific resources (e.g., historical corpora) available through the repositories are mostly intended for large-scale projects that deal with quantitative “big-data” questions, but it isn’t obvious to me how they are suitable for qualitative research. The problem is that many resources contain metadata describing only surface-level features, such as size and linguistic annotation, but lack metadata that are specific to the needs of my field, such as detailed descriptions of the collection process itself, information on who the contributors were in the case of folklore resources, where they came from, when they lived, and so on.
Nowadays, it is easier to get grant money if you propose a humanities project that will – aside from solving research questions that are intrinsic to the field – also involve digitisation and collaboration with researchers working in computational fields. While I of course agree that it’s extremely valuable to make the data that you’re working on accessible in online environments through such collaborations, it often feels as though only the quantity of the data is seen as a measure of success, rather than the presentation of the content of the materials themselves. I therefore think that it’s important for such collaborative projects to re-focus, at least in part, on improving access to and the presentation of the resources that are already available, which is precisely what we are doing in the TillTal project by creating a user-friendly environment for the speech analysis of audio data where the presentation and accessibility of the recordings are tailored to the needs of researchers outside computational fields, like myself.
7. What are the future goals of the TillTal project and the SWELANG K-Centre?
One of the future aims of the TillTal project – and by extension the K-Centre – is to increase the amount of available content and bring together related materials (recordings, reports of recordings etc.) through digital methods, which will be done in collaboration with the National Language Bank and SWE-CLARIN. We also plan to release a search system tailored specifically to working with recorded interviews. The system will be accompanied by a tool that will enable us to explore other related non-audio materials while listening to the recording. With this tool, we’ll also be able to annotate a recording on the fly, for example by noting laughter or marking sections with fast or otherwise intensive dialogue.
We also plan to develop crowdsourcing tools for transcription and improvement of archive materials, and further work on the mapping interface Digitalt kulturarv, with which researchers will be able to follow audio recordings through time and place, and thereby efficiently study all the documents that were created along the way. In the long run, the plan is to integrate these different technologies in a rich digital tool box, which will offer new possibilities to work with the archival materials of the Institute for Language and Folklore.
The National Language Bank and SWE-CLARIN are funded by the Swedish Research Council (2017-00626).