Frequently Asked Questions - ISOcat
An ISO Data Category Registry is a step towards interoperability at the level of linguistic encoding (tag sets, metadata elements, etc.). The basic idea is to register all widely used concepts and terminology so that everyone can refer to them. It is all based on the ISO 12620 standard, which is a generic model not restricted to linguistics.
ISOcat is the software and database that implements the ISO 12620 standard and data model. In theory it is one of many possible implementations of this standard; in practice it is the only one that currently exists. It can be accessed via http://www.isocat.org/
Currently it is being filled with many categories from, for example, the EAGLES project and various metadata initiatives, and hopefully from other sub-disciplines and initiatives as well. Bodies made up of linguists take care that the content of the ISOcat registry is not too fragmented and meets a number of criteria.
No - it is just a start to offer a reference, so that users creating new resources can use the registered categories, and schemas describing legacy data can refer to them. But we will find that not all tag sets in use for various purposes can easily be mapped onto one another. It will also largely depend on the intended usage. For searching, an imperfect mapping may result in lower precision, but for a researcher this may not be a problem.
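The effect of an imperfect mapping on search precision can be illustrated with a minimal sketch. The tag names and the mapping below are hypothetical and are not taken from ISOcat; the point is only that searching through a coarser category returns extra hits.

```python
# Hypothetical fine-grained tag set mapped onto a coarser one (not real ISOcat data).
FINE_TO_COARSE = {
    "commonNoun": "noun",
    "properNoun": "noun",
    "mainVerb": "verb",
}


def search_via_mapping(tokens, wanted_fine_tag):
    """Return every token whose tag maps to the same coarse tag as the wanted one."""
    coarse = FINE_TO_COARSE[wanted_fine_tag]
    return [tok for tok in tokens if FINE_TO_COARSE[tok[1]] == coarse]


def precision(hits, wanted_fine_tag):
    """Fraction of the hits that carry exactly the wanted fine-grained tag."""
    if not hits:
        return 0.0
    relevant = sum(1 for _, tag in hits if tag == wanted_fine_tag)
    return relevant / len(hits)


tokens = [
    ("house", "commonNoun"),
    ("Berlin", "properNoun"),
    ("runs", "mainVerb"),
    ("tree", "commonNoun"),
]

hits = search_via_mapping(tokens, "properNoun")
p = precision(hits, "properNoun")
print(hits, p)
```

Here a search for proper nouns also returns the common nouns, so only one of the three hits is relevant. Whether that is acceptable depends on the intended usage, as noted above.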
There is much debate about this and other questions, and there is no good universal answer yet. However, we need to start using the ISOcat registry to find out how the definitions can be improved, which categories are missing, and which granularity should be chosen for metadata, morphology and semantic annotation, to mention just a few examples.
No - the data model was set up with the explicit intention not to include relations, since in most cases these depend on theories and practical intentions.
To deal with relations between data categories, a framework called RELcat will be offered that allows users to easily manipulate and share relations according to their needs. From CLARIN we intend to offer at least one set of relations with a large coverage, which users may want to use or manipulate.
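The idea of keeping relations outside the registry can be sketched as follows. This is not the RELcat design or API; the class, the relation names and the category identifiers are all hypothetical and only illustrate that different users can hold different relation sets over the same data categories.

```python
class RelationSet:
    """A user-owned set of relations between data category identifiers."""

    def __init__(self, owner):
        self.owner = owner
        self._relations = {}  # subject -> set of (relation, object) pairs

    def add(self, subject, relation, obj):
        """Record that `subject` stands in `relation` to `obj`."""
        self._relations.setdefault(subject, set()).add((relation, obj))

    def related(self, subject, relation):
        """Return the sorted objects linked to `subject` by `relation`."""
        pairs = self._relations.get(subject, set())
        return sorted(o for r, o in pairs if r == relation)


# Two hypothetical relation sets over the same (hypothetical) categories.
shared = RelationSet("shared")
shared.add("properNoun", "subClassOf", "noun")

personal = RelationSet("some user")
personal.add("properNoun", "sameAs", "name")

print(shared.related("properNoun", "subClassOf"))
print(personal.related("properNoun", "subClassOf"))
```

The data categories themselves stay untouched; only the relation sets differ, so each theory or application can bring its own.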
Use the ISOcat forum - see http://www.isocat.org/forum/viewtopic.php?f=3&t=4&p=6#p6 for details.
In origin you can mention the source that inspired the creation of the data category. If you do not know what to enter, please enter CLARIN.
For the source of the language section, please enter CLARIN
No, unless it is required by certain language rules (e.g. nouns in German), the name of a data category should not contain capitals.
- Register at isocat.org
- Send a mail with your name to dieter.vanuytvanck@mpi.nl
- You will be added to the CLARIN group
- My Workspace > button “create new data category”
- My Workspace > Private > CLARIN > MD > button “edit this data category selection”
- My Workspace > Private > + (add this data category to selection)
- Click on the icon for “save the selected data categories”
- After inspection, the new data categories can be moved to the Metadata thematic view
- (Finally, and optionally, the data categories in the Metadata thematic view can be submitted to the Thematic Domain Group for official approval)
It is a data category without a conceptual domain, intended to group complex data categories (or other containers).
As such it can be used to combine semantics. For instance, if a CMDI component "actor" has a reference to the container data category "actor", and it contains an element with a reference to the complex data category "name", then a search engine could infer that the name is that of the actor ( = "actor" + "name").
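The inference described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Node` class, the `semantic_paths` function and the identifiers `container:actor` and `complex:name` are hypothetical stand-ins, not real CMDI structures or ISOcat identifiers.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A component or element with an optional data category reference."""
    name: str
    datcat: Optional[str] = None  # hypothetical data category identifier
    children: List["Node"] = field(default_factory=list)


def semantic_paths(node, prefix=()):
    """Yield (combined data category path, element name) for every leaf element."""
    path = prefix + ((node.datcat,) if node.datcat else ())
    if not node.children:
        yield path, node.name
    for child in node.children:
        yield from semantic_paths(child, path)


# Component "actor" refers to a container; its element "name" to a complex category.
actor = Node("actor", "container:actor", [Node("name", "complex:name")])

paths = list(semantic_paths(actor))
print(paths)
```

A search engine indexing the combined path rather than the leaf category alone could then distinguish the name of an actor from a name that appears under a different container.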
Yes, there is. Have a look at http://trac.clarin.nl/trac/wiki/WikiStart/FAQ#ISOcat

