
Holocaust Testimonies as Language Resources



General Details  

Date: 21 May 2024
Time: All day
Venue: Lingotto Conference Centre, Turin, Italy


Holocaust testimonies serve as a bridge between survivors and history’s darkest chapters, providing a connection to the profound experiences of the past. Testimonies stand as the primary source of information that describes the Holocaust, offering first-hand accounts and personal narratives of those who experienced it. The majority of testimonies are captured in an oral format, as survivors vividly explain and share their personal experiences and observations from that time period. Transforming Holocaust testimonies into a machine-processable digital format can be a difficult task owing to the unstructured nature of the text. The creation of accessible, comprehensive, and well-annotated Holocaust testimony collections is of paramount importance to our society. These collections empower researchers and historians to validate the accuracy of socially and historically significant information, enabling them to share critical insights and trends derived from these data. This workshop will investigate a number of ways in which techniques and tools from natural language processing and corpus linguistics can contribute to the exploration, analysis, dissemination and preservation of Holocaust testimonies.

Topics of interest

We expect contributions related to the following topics:

  • Creation of datasets and development of tools for the study of Holocaust testimonies:
    • Creation of language corpora of Holocaust testimonies
    • Digitization and enhancement of oral and written testimonies (including automatic speech recognition, alignment of text and speech, format conversion, OCR, handwriting recognition, machine translation)
    • Named entity recognition for identifying people, places, and events in testimonies
    • Standards, representation formats, and guidelines for annotations and vocabularies relevant to the Holocaust testimonies
    • Creation, adaptation and tuning of software applications for the creation, annotation, enhancement and use of Holocaust testimonies as language resources
  • Research using Holocaust testimonies:
    • Applications of NLP in analysing Holocaust survivor testimonies
    • Sentiment analysis and emotional content extraction from survivor narratives
  • Data Visualisation, Knowledge Representation and Information Extraction:
    • Visualising complex data structures from Holocaust testimonies
    • Building knowledge graphs and networks to represent historical relationships
    • Interactive data visualisations for education and research
    • Extracting biographical and temporal information relevant to the Holocaust
    • Deep learning and large language models
  • Digital Archiving and Long-Term Preservation:
    • Methods and tools for digitising and preserving Holocaust testimonies
    • Best practices for metadata standards and cataloguing
    • Ensuring long-term accessibility and data integrity
  • Ethical Considerations and Privacy
    • Ethical challenges in digitising and sharing sensitive testimonies
    • Anonymisation and privacy protection in Holocaust data
    • Community engagement and consent in digital projects
  • User and application aspects
    • Development of tools and interfaces for the search, analysis and exploration of Holocaust testimonies
    • Other relevant use cases and application scenarios

All papers must clearly state and explain their relevance to the topic of 'Holocaust Testimonies as Language Resources'.

Submission & Publication

All papers must represent original and unpublished work that is not currently under review. Papers will be evaluated according to their significance, originality, technical content, style, clarity, and relevance to the workshop. We welcome the following types of contributions:

  • Standard research papers (up to 8 pages, plus more pages for references if needed);
  • Short research papers (from 4 to 6 pages, plus more pages for references if needed).

Submissions must be anonymous and strictly follow the LREC2024 stylesheet formatting guidelines. All papers should be submitted electronically in PDF format via START, the main conference submission platform.

Identify, Describe and Share your LRs!

When submitting a paper from the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e. also technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or are a new result of their research. Moreover, ELRA encourages all LREC authors to share the described LRs (data, tools, services, etc.) to enable their reuse and the replicability of experiments (including evaluation ones).

Important Dates

  • Paper submission deadline: 28 February 2024 (extended)
  • Notification of acceptance: 20 March 2024
  • Camera-ready paper: 15 April 2024
  • Workshop date: 21 May 2024


Please refer to the website for the details of the programme:

Contact Email:

Invited speakers

  • Silvia Calamai (Siena University)
  • Michal Frankl (Masaryk Institute and Archives of the Czech Academy of Sciences)

Organising Committee

Programme Committee

  • Le An Ha, Ho Chi Minh City University of Foreign Languages and Information Technology, Vietnam
  • Federico Boschetti, CNR-Istituto di Linguistica Computazionale “A. Zampolli”, Italy
  • Estelle Bunout, University of Luxembourg, Luxembourg
  • Martin Bulin, University of West Bohemia, Czech Republic
  • Tim Cole, University of Bristol, UK
  • Angelo Mario Del Grosso, CNR-Istituto di Linguistica Computazionale “A. Zampolli”, Italy
  • Maria Dermentzi, King’s College London, UK
  • Robert Ehrenreich, USHMM, USA
  • Ignatius Ezeani, Lancaster University, UK
  • Ian Gregory, Lancaster University, UK
  • Wolf Gruner, Shoah Foundation, USA
  • Arjan van Hessen, Radboud University
  • Henk van den Heuvel, Radboud University & CLARIN ERIC
  • Renana Keydar, The Hebrew University of Jerusalem, Israel
  • William J.B. Mattingly, USHMM, USA
  • Patricia Murrieta-Flores, Lancaster University, UK
  • Maciej Ogrodniczuk, Institute of Computer Science, Polish Academy of Sciences, Poland
  • Maciej Piasecki, Wroclaw University of Science and Technology, Poland
  • Rachel Pistol, King’s College London, UK
  • Johannes-Dieter Steinert, University of Wolverhampton, UK
  • Jan Svec, University of West Bohemia
  • Gabor Toth, University of Luxembourg, Luxembourg
  • Eveline Wandl-Vogt, Austrian Academy of Sciences, Vienna