The training materials are all based around teaching the use of GATE, a freely available open-source toolkit for Natural Language Processing that has been widely used in academia and industry for many different tasks.
- Diana Maynard, Senior Research Fellow/course organiser
- Prof. Kalina Bontcheva, course co-organiser
- Ian Roberts, Xingyi Song, Mark A. Greenwood, Mehmet Bakir, Johann Petrak, Ye Jiang (additional course material providers)
Description of the Training Materials
|(Sub)discipline, topic, language(s)||Natural Language Processing; social sciences; digital humanities; computer science; corpus linguistics. Training materials are in English.|
|Keywords||Natural Language Processing; Machine Learning; GATE; social media analysis; disinformation; online abuse detection; Python; Deep Learning; information extraction; digital humanities; corpus linguistics; annotation|
|CLARIN resources||GATE toolkit for language processing|
|Structure and duration||The course comprises eleven modules. Each module consists of slides and practical exercises. Hands-on materials (corpora, tools, ready-made applications, etc.) and slides can be downloaded from the website. The course is designed for both real-time teaching and self-driven e-learning. Most modules are three hours long when taught by an instructor, though some are shorter. The modules can be mixed and matched depending on relevance, though some have prerequisites, which are clearly explained. The real-time course is taught by members of the GATE team. Some time is allocated for participants to try out the exercises during the session, while further exercises are designed to be carried out in the participants' own time. These further exercises are optional and are graded by difficulty so that participants can choose according to their skills and interests. The split between teaching and exercises varies by module but is typically around 60% teaching and 40% practical exercises. Videos of the real-time course are made available to participants for later self-study. The course can also be followed as open access without formal teaching, as all materials are designed to support self-study.|
|Target audience||The course includes modules for programmers who want to develop their own tools further within the toolkit and for non-programmers who want to use existing tools, e.g. social scientists and humanities researchers. It is designed equally for students, academic researchers and researchers from industry who may want to use the tools in their work or business. The introductory modules require no specific skills beyond basic IT competence. Knowledge of Java and/or Python is required for some programming modules.|
All materials are provided, though students can also use their own corpora or tools. The software is freely available for download and needs to be installed as part of the course. GATE works on Windows, Linux and Mac.
1. Java 8 or later. We recommend AdoptOpenJDK (the best choice for Windows users) or Azul Zulu (in particular for new Apple Silicon Macs), but any compatible OpenJDK or Oracle JDK should work.
3. A text editor. Please note that Word and Windows Notepad are NOT suitable, as they cannot always handle files created on Linux/Mac. The jEdit Programmer's Text Editor is a good Java-based cross-platform alternative.
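Before installing GATE, it can help to confirm that the Java runtime on the machine meets the "Java 8 or later" requirement above. The following is a minimal sketch rather than part of the course materials: the function names (`parse_java_major`, `installed_java_major`, `java_is_suitable`) are hypothetical, and it assumes that a `java` executable is on the `PATH` and that its `java -version` banner contains a quoted version string such as `"1.8.0_292"` or `"17.0.1"`.

```python
import re
import subprocess

# Minimum version taken from the requirement "Java 8 or later".
MIN_JAVA_MAJOR = 8


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    # Java 8 and earlier report themselves as "1.x"; Java 9 and later report "x".
    if first == 1 and second is not None:
        return int(second)
    return first


def installed_java_major():
    """Run `java -version` and return the major version, or None if unavailable."""
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True, check=False
        )
    except OSError:
        # No `java` executable could be started (e.g. it is not on the PATH).
        return None
    # The version banner is normally written to stderr, so check both streams.
    return parse_java_major(result.stderr + result.stdout)


def java_is_suitable(major):
    """True if a detected major version satisfies the minimum requirement."""
    return major is not None and major >= MIN_JAVA_MAJOR
```

Calling `java_is_suitable(installed_java_major())` returns `False` both when Java is missing and when it is older than Java 8, so a negative result only indicates that the installation needs a closer look.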
Online training course comprising eleven modules. Each module consists of slides and practical exercises. Hands-on materials (corpora, tools, ready-made applications, etc.) and slides can be downloaded from the website. The course is designed for both real-time teaching and self-driven e-learning.
|Course(s) in which the training material was used||
In addition to the annual GATE training course, parts of the materials have been used in:
|Licence and (re)use||
The course materials are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike Licence. Users are free to:
Acknowledgement of the GATE team is required.
|Creation date||20 July 2009|
|Last modification date||15 February 2021|
Experience with Using CLARIN Resources in Teaching
Additional Information and Resources
We recommend our accompanying book, which gives additional context for the materials and tools, provides further background NLP knowledge, and explains how GATE compares with alternative tools:
- D. Maynard, K. Bontcheva, I. Augenstein. Natural Language Processing for the Semantic Web. Morgan and Claypool, December 2016. ISBN: 9781627059091