United Kingdom

GraphColl is a tool for building and exploring networks of linguistic collocations. It was developed at the ESRC Centre for Corpus Approaches to Social Sciences (CASS) at Lancaster University in 2014-15. CASS is a member of the CLARIN-UK consortium.

Text in a particular field of discourse is organized into lexical patterns, which can be visualized as networks of words that collocate with each other. GraphColl builds these collocation networks from corpora, giving the user insight into the semantic relationships between words.

GraphColl 1.0 is a free tool, developed with both novice and advanced users in mind, providing full control over the statistics and methods used to build collocation networks, whilst also offering sensible defaults for casual users. The system runs locally on a desktop computer, with a graphical user interface. The interface is structured around a series of tabs, which may be followed in a wizard-like manner to construct, explore and export a collocation graph. Graphs are presented as detachable tabs, allowing multiple graphs to be generated and examined at once.

The user can define properties of the collocation graph to be produced, such as the span of the left and right collocation windows, the association measure, the minimum collocate frequency, the minimum collocation frequency, and “advanced thresholds”, which are boolean expressions written in the Groovy scripting language.
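
To make these parameters concrete, the following Python sketch shows how a window span, an association measure and frequency thresholds might combine to produce a small collocation graph. It is an illustration only and not GraphColl's own implementation: the function names `collocates` and `build_graph`, the choice of mutual information (MI) with a window-size-corrected expected frequency, and the `extra_threshold` predicate (a loose Python stand-in for a Groovy "advanced threshold") are all assumptions made for this example.

```python
import math
from collections import Counter


def collocates(tokens, node, left=5, right=5,
               min_collocate_freq=1, min_collocation_freq=1,
               extra_threshold=None):
    """Return {collocate: MI score} for `node` (illustrative sketch only)."""
    n = len(tokens)
    word_freq = Counter(tokens)
    pair_freq = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        # Count every token inside the left and right collocation windows.
        window = tokens[max(0, i - left):i] + tokens[i + 1:i + 1 + right]
        pair_freq.update(window)

    scores = {}
    for word, observed in pair_freq.items():
        if word == node:
            continue
        # Frequency thresholds: the collocate overall, and the co-occurrence.
        if word_freq[word] < min_collocate_freq:
            continue
        if observed < min_collocation_freq:
            continue
        # Expected co-occurrence frequency, corrected for window size.
        expected = word_freq[node] * word_freq[word] * (left + right) / n
        score = math.log2(observed / expected)
        # Hypothetical analogue of an "advanced threshold": any boolean test.
        if extra_threshold is not None and not extra_threshold(observed, expected, score):
            continue
        scores[word] = score
    return scores


def build_graph(tokens, seed, depth=1, threshold=3.0, **params):
    """Expand outwards from `seed`, keeping edges that score >= threshold."""
    edges = {}
    seen = {seed}
    frontier = [seed]
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            for word, score in collocates(tokens, node, **params).items():
                if score >= threshold:
                    edges[(node, word)] = score
                    if word not in seen:
                        seen.add(word)
                        next_frontier.append(word)
        frontier = next_frontier
    return edges
```

Calling `build_graph(tokens, "love", depth=2, threshold=3.0, left=5, right=5)` on a tokenised corpus would return a dictionary of directed edges keyed by `(node, collocate)` pairs. Raising `depth` expands the network from first-order to higher-order collocates.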

GraphColl has been used in a number of research projects, including investigations into historical moral panics about swearing and current discourse about migration in Europe. Its use is also taught in a corpus linguistics MOOC (see