
Holocaust Testimonies as Language Resources



General Details  

Date: 21 May 2024
Time: All day
Venue: Lingotto Conference Centre, Turin, Italy


Details on registration can be found on this LREC-COLING page.
The current registration schedule is as follows:

Early-Bird: Up to April 15, 2024 (23:59 AoE)
Late & Onsite: From April 16, 2024 (23:59 AoE)


Holocaust testimonies serve as a bridge between survivors and history’s darkest chapters, providing a connection to the profound experiences of the past. Testimonies stand as the primary source of information describing the Holocaust, offering first-hand accounts and personal narratives of those who experienced it. The majority of testimonies are captured in an oral format, as survivors recount their personal experiences and observations from that period. Transforming Holocaust testimonies into a machine-processable digital format can be difficult owing to the unstructured nature of the text. The creation of accessible, comprehensive, and well-annotated Holocaust testimony collections is of paramount importance to our society. These collections empower researchers and historians to validate the accuracy of socially and historically significant information, and to share critical insights and trends derived from these data. This workshop will investigate a number of ways in which techniques and tools from natural language processing and corpus linguistics can contribute to the exploration, analysis, dissemination and preservation of Holocaust testimonies.


The proceedings can be found here.

9:30–9:45

Welcome and Introduction

9:45–10:30

First Keynote Speech: "What Is Not in the Archives": Early Holocaust Testimony as Research Data
Michal Frankl, Masaryk Institute and Archives of the Czech Academy of Sciences

10:30–11:00

Coffee break

11:00–12:15

First session: Editions and their exploitation

  • Paper 6: The Impact of Digital Editing on the Study of Holocaust Survivors' Testimonies in the context of Voci dall'Inferno Project by Angelo Mario Del Grosso, Marina Riccucci and Elvira Mercatanti
  • Paper 7: TEI Specifications for a Sustainable Management of Digitized Holocaust Testimonies by Sarah Bénière, Floriane Chiffoleau and Laurent Romary
  • Paper 9: Repurposing Holocaust-Related Digital Scholarly Editions to Develop Multilingual Domain-Specific Named Entity Recognition Tools by Maria Dermentzi and Hugo Scheithauer

12:15–13:00

Second session: Dates and Places

  • Paper 3: Dates and places as points of attachment for memorial contents in the ISW corpus: 1938 as a turning point by Carolina Flinz and Simona Leonardi
  • Paper 4: Creating a Typology of Places to Annotate Holocaust Testimonies Through Machine Learning by Christine Liu and William J.B. Mattingly

13:00–14:30

Lunch Break

14:30–15:15

Second Keynote Speech: The Voices from Ravensbrück project: Bringing together what is dispersed

Silvia Calamai, Università degli Studi di Siena, member of the CLARIN-IT consortium

15:15–16:00

Third session: Testimonies and Narratives

  • Paper 2: Speech Technology Services for Oral History Research by Christoph Draxler, Henk van den Heuvel, Arjan van Hessen, Pavel Ircing and Jan Lehecka
  • Paper 10: Identifying Narrative Patterns and Outliers in Holocaust Testimonies Using Topic Modeling by Maxim Ifergan, Omri Abend, Renana Keydar and Amit Pinchevski

16:00–16:30

Coffee Break

16:30–17:10

Fourth session: Traces and networks

  • Paper 5: Tracing the deportation to define Holocaust geometries. The exploratory case of Milan by Giovanni Pietro Vitali and Laura Brazzo
  • Paper 8: Zero-shot Trajectory Mapping in Holocaust Testimonies by Eitan Wagner, Renana Keydar and Omri Abend

17:10–17:50

Panel discussion - Chair: Martin Wynne


  • Silvia Calamai (Siena University)
  • Michal Frankl (Masaryk Institute and Archives of the Czech Academy of Sciences)
  • other panelists to be announced

17:50–18:00

Closing Remarks

Call for papers

Topics of interest

We expect contributions related to the following topics:

  • Creation of datasets and development of tools for the study of Holocaust testimonies:
    • Creation of language corpora of Holocaust testimonies
    • Digitization and enhancement of oral and written testimonies (including automatic speech recognition, alignment of text and speech, format conversion, OCR, handwriting recognition, machine translation)
    • Named entity recognition for identifying people, places, and events in testimonies
    • Standards, representation formats, and guidelines for annotations and vocabularies relevant to the Holocaust testimonies
    • Creation, adaptation and tuning of software applications for the creation, annotation, enhancement and use of Holocaust testimonies as language resources
  • Research using Holocaust testimonies:
    • Applications of NLP in analysing Holocaust survivor testimonies
    • Sentiment analysis and emotional content extraction from survivor narratives
  • Data Visualisation, Knowledge Representation and Information Extraction:
    • Visualising complex data structures from Holocaust testimonies
    • Building knowledge graphs and networks to represent historical relationships
    • Interactive data visualisations for education and research
    • Extracting biographical and temporal information relevant to the Holocaust
    • Deep learning and large language models
  • Digital Archiving and Long-Term Preservation:
    • Methods and tools for digitising and preserving Holocaust testimonies
    • Best practices for metadata standards and cataloguing
    • Ensuring long-term accessibility and data integrity
  • Ethical Considerations and Privacy:
    • Ethical challenges in digitising and sharing sensitive testimonies
    • Anonymisation and privacy protection in Holocaust data
    • Community engagement and consent in digital projects
  • User and application aspects:
    • Development of tools and interfaces for the search, analysis and exploration of Holocaust testimonies
    • Other relevant use cases and application scenarios

All papers must clearly state and explain their relevance to the topic of 'Holocaust Testimonies as Language Resources'.

Submission & Publication

All papers must represent original and unpublished work that is not currently under review. Papers will be evaluated according to their significance, originality, technical content, style, clarity, and relevance to the workshop. We welcome the following types of contributions:

  • Standard research papers (up to 8 pages, plus more pages for references if needed);
  • Short research papers (from 4 to 6 pages, plus more pages for references if needed).

Submissions must be anonymous and strictly follow the LREC2024 stylesheet formatting guidelines. All papers should be submitted electronically in PDF format via START, the main conference's submission platform.

Identify, Describe and Share your LRs!

When submitting a paper from the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e. also technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or are a new result of their research. Moreover, ELRA encourages all LREC authors to share the described LRs (data, tools, services, etc.) to enable their reuse and the replicability of experiments (including evaluation ones).

Important Dates

  • Paper submission deadline: 28 February 2024 (extended)
  • Notification of acceptance: 20 March 2024
  • Camera-ready paper: 15 April 2024
  • Workshop date: 21 May 2024


Please refer to the website for the details of the programme.


Invited speakers

  • Silvia Calamai (Siena University)
  • Michal Frankl (Masaryk Institute and Archives of the Czech Academy of Sciences)

Organising Committee

Programme Committee

  • Le An Ha, Ho Chi Minh City University of Foreign Languages and Information Technology, Vietnam
  • Federico Boschetti, CNR-Istituto di Linguistica Computazionale “A. Zampolli”, Italy
  • Estelle Bunout, University of Luxembourg, Luxembourg
  • Martin Bulin, University of West Bohemia, Czech Republic
  • Tim Cole, University of Bristol, UK
  • Angelo Mario Del Grosso, CNR-Istituto di Linguistica Computazionale “A. Zampolli”, Italy
  • Maria Dermentzi, King’s College London, UK
  • Robert Ehrenreich, USHMM, USA
  • Ignatius Ezeani, Lancaster University, UK
  • Ian Gregory, Lancaster University, UK
  • Wolf Gruner, Shoah Foundation, USA
  • Arjan van Hessen, Radboud University
  • Henk van den Heuvel, Radboud University & CLARIN ERIC
  • Renana Keydar, The Hebrew University of Jerusalem, Israel
  • William J.B. Mattingly, USHMM, USA
  • Patricia Murrieta-Flores, Lancaster University, UK
  • Maciej Ogrodniczuk, Institute of Computer Science, Polish Academy of Sciences, Poland
  • Maciej Piasecki, Wroclaw University of Science and Technology, Poland
  • Rachel Pistol, King’s College London, UK
  • Johannes-Dieter Steinert, University of Wolverhampton, UK
  • Jan Svec, University of West Bohemia, Czech Republic
  • Gabor Toth, University of Luxembourg, Luxembourg
  • Eveline Wandl-Vogt, Austrian Academy of Sciences, Vienna