
Hands-on session: From basic corpus analysis to more complex workflows

Hands-on session by: Andreas Blätte
Title: From basic corpus analysis to more complex workflows: Using the ‘polmineR’-package as a toolkit for analysing parliamentary speeches
Abstract: In the context of the PolMine project, the R package ‘polmineR’ was developed to serve as an environment for analysing corpora of parliamentary debates. It interfaces with the Corpus Workbench (CWB), which serves as a backend for managing corpora efficiently. The workshop will introduce the core functionality of polmineR for basic tasks in corpus analysis, such as inspecting concordances, getting counts (and their dispersion), preparing co-occurrence statistics, and generating term-document and term co-occurrence matrices. A particular focus of the package is creating and working with subcorpora and retrieving the full text of speeches. The polmineR package is intended to serve as a basis for implementing more complex workflows. In the second part of the session, I will explore how we might use the package to analyse diachronic meaning change, to perform a simple, dictionary-based sentiment analysis, to generate training data for machine learning tasks, or to work with annotation data we may have from an annotation project.
