Data Categories
PLEASE NOTE: this list is outdated, for a complete overview see www.isocat.org
How to read the following list of proposed data categories?
The proposed data categories were selected at the Athens expert meeting and should also allow upward compatibility with the existing metadata sets defined by IMDI, DC and OLAC. The fields "structure" and "substructure" are merely ways to group the elements for readability; they do not define components. The descriptions are short annotations made at the Athens meeting and are not the final definitions. Clicking on an element name gives detailed information about "definitions", "vocabularies", "constraints", etc. The purpose is to refine all aspects of this information so that a stable set can be reached within a short time and entered into the ISOcat registry.
No search means that a human user will probably not search on this data category (datcat), so no high granularity is needed. Delete indicates that a datcat qualifies for removal (for example, because it is superfluous).
Number of records that match the current filter criteria: 154
| Title | Main structure | Definition | Media | Annotations | Texts | Lexica | Lists | Tools | Not for search | Delete? | Comment count |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Availability | access | A description of the terms of availability of the resource in simple words. | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | 2 |
| ContactFullName | access | The name of the person that can be contacted to get access to the resource or to the tool/service. | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | 2 |
| FaxNumber | access | The fax number of a person or an organization that is involved in creating, managing or providing access to resources or tools/services. | Yes | Yes | Yes | Yes | Yes | Yes | No | No | |
| LegalStatus | access | Describes the legal status of the organization that is involved in creating, managing or providing access to a resource or a tool/service. | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes | 3 |
| License | access | A description of the licensing conditions under which the resource can be used. | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | 3 |
| Medium | access | The medium that is used to access the resource or the tool. | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | 3 |
| Organization | access | The organization that led the creation project or is responsible for access to the resource, and with which the contact person is affiliated. | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | |
| Price | access | The costs that are required to access the resource, a fragment of the resource or to use the tool/service. | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | |
| TelNumber | access | The telephone number of a person or an organization that is involved in creating, managing or accessing the resource. | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | |
| AnnotationLevelType | content | Specifies the types of annotation levels (tiers) provided by the resource. | No | Yes | No | No | No | No | Yes | No | 4 |
| Channel | content | Indicates the channel of the communication | Yes | No | No | No | No | Yes | 3 | ||
| DeliveryFormat | content | Specifies the format used for the delivery of the resource. | No | No | Yes | Yes | Yes | No | No | No | 2 |
| Dialect | content | Names of the dialects that occur in the resource. | Yes | Yes | Yes | Yes | Yes | No | Yes | No | 2 |
| Domain | content | Indicates the application domain of the resource or the tool/service. | No | No | Yes | Yes | Yes | Yes | Yes | No | |
| DominantLanguage | content | Specifies the most frequently used language in a resource. | Yes | Yes | Yes | Yes | Yes | No | No | No | |
| EventStructure | content | Indicates the structure of the communication event. | Yes | Yes | Yes | No | No | No | No | No | 4 |
| Genre | content | The conventionalized discourse or text types of the content of the resource, based on extra-linguistic and internal linguistic criteria. | Yes | Yes | Yes | Yes | Yes | No | No | Yes | 1 |
| GeographicCoverage | content | The geographic scope that the content of a resource is from or about. | No | No | Yes | Yes | Yes | No | Yes | No | |
| Interactivity | content | Characterizes the degree of interactivity between all the participants in the resource. | Yes | Yes | No | No | No | No | Yes | No | 1 |
| Involvement | content | Indicates in how far the researcher was involved in the linguistic event. | Yes | Yes | No | No | No | No | Yes | No | 1 |
| LanguageID | content | Identifier of the language as defined by ISO 639 that is included in the resource or supported by the tool/service. | Yes | Yes | Yes | Yes | Yes | Yes | No | No | |
| LanguageIn | content | An identifier of the language in which the description of the resource is written. | Yes | Yes | Yes | Yes | Yes | Yes | No | No | |
| LanguageName | content | A human understandable name of the language that is used in the resource or supported by the tool/service. | Yes | Yes | Yes | Yes | Yes | Yes | No | No | 4 |
| LanguageScript | content | Indication of the writing system used to represent the language, in the form of a four-letter code as defined in ISO 15924. | No | Yes | Yes | Yes | Yes | 4 | |||
| LexicalUnit | content | Indicate the type(s) of headword(s) used in the Lexicon. | No | No | No | Yes | No | Yes | 4 | ||
| LexiconType | content | A description of the type of the lexicon. | No | No | No | Yes | No | Yes | 2 | ||
| MainLevelInformation | content | Indicate the type of lexical attributes that can be found in the lexicon (orthography, pronunciation, morphology etc). | No | No | No | Yes | No | Yes | 1 | ||
| MetaLanguage | content | Name of the language that is used as a metalanguage in the resource. | No | No | No | Yes | No | No | No | No | 1 |
| Modalities | content | A listing of all modalities that are contained in the recording such that they can be subject of analysis or that are supported by the tool/service. | Yes | No | No | Yes | No | Yes | Yes | 10 | |
| NoLanguages | content | An indication of the number of languages that are included in the resource. | Yes | Yes | Yes | Yes | Yes | No | No | No | |
| PlanningType | content | Indicates in how far the consultant planned the linguistic event. | Yes | Yes | No | No | No | No | Yes | No | 1 |
| Register | content | Indicates the variety of language used for a particular purpose or in a particular social setting. | Yes | No | No | No | No | No | Yes | Yes | 3 |
| SocialContext | content | Indicates the social context the event took place in. | Yes | Yes | Yes | No | No | No | Yes | No | 1 |
| SourceLanguage | content | Indicates if a language is a source language. (boolean) | Yes | No | Yes | Yes | No | No | No | No | |
| StructuralUnits | content | Indication of the structural units that are marked up in the resource. | Yes | No | Yes | No | Yes | No | No | No | |
| Subgenre | content | The conventionalized discourse or text sub-types of the content of the resource, based on extra-linguistic and internal linguistic criteria. | Yes | No | Yes | No | No | No | Yes | Yes | 1 |
| SubLevelInformation | content | Indicate the sub-level information that is contained in the lexicon that qualifies the main level information. | No | No | No | Yes | No | 1 | |||
| Tagset | content | Specifies the tag set used in the annotation of the resource or used by the tool/service, or contains a URL that points to information about the tag set. | No | Yes | No | No | No | Yes | |||
| TagsetLanguage | content | Indicates the language of the tag set itself, expressed in the two-letter language codes of ISO 639. | No | Yes | No | No | No | ||||
| TargetLanguage | content | Indicates if the language is the target language. (boolean) | Yes | No | Yes | Yes | No | No | No | No | 2 |
| Task | content | The major task carried out in the resource or a typical task description of the tool/service. | Yes | Yes | Yes | No | No | Yes | Yes | No | 1 |
| TheoreticModel | content | Name of the theoretic model underlying the annotation task and/or reference (URL or bibliographic reference) to informative material about the theoretic model used. | No | Yes | No | No | No | ||||
| TimeCoverage | content | The time period that the content of a resource is about. DC-Definition: The temporal topic of the resource. | No | No | Yes | Yes | Yes | Yes | |||
| Topic | content | Describes the knowledge domain of the resource. | Yes | Yes | Yes | Yes | Yes | No | Yes | No | 1 |
| VocabularySize | content | Indication of the size of the vocabulary used in the resource. | Yes | Yes | Yes | No | No | No | No | No | 2 |
| Address | creation | The address of an organization that was/is involved in creating, managing and accessing a resource or tool/service. | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | |
| AnnotationMode | creation | Indicates whether the resource was created manually or by automatic processes. | No | Yes | No | No | No | No | No | No | |
| AnnotationStandoff | creation | Indicates whether the annotation was created inline or in a stand-off fashion. | No | Yes | No | No | No | No | No | No | |
| AnnotationTool | creation | Indicates the tool with help of which the annotations have been produced. <<creationtool is semantically overlapping, thus this category is suggested to be deleted>> | No | Yes | No | No | No | No | No | Yes | 1 |
| AnnotationWorkflow | creation | Indicates the workflow process in which the creation process was embedded. | No | Yes | No | No | No | No | No | No | |
| BibliographicReference | creation | <<this is a component consisting of several suggested categories from this list, TEI or DC - thus not a category; we leave it in here for reasons of completeness>> | No | No | Yes | No | Yes | No | No | Yes | 1 |
| CompletionYear | creation | The year in which the creation process was completed. | No | Yes | Yes | Yes | Yes | Yes | |||
| CreationDate | creation | The date when the resource or tool/service was created. | Yes | Yes | Yes | Yes | Yes | Yes | No | No | |
| CreationTool | creation | Indicates the tool with help of which the resource or the annotations in the resource were created. | No | No | No | Yes | Yes | No | No | No | |
| CreatorFullName | creation | The name of the person who was participating in the creation project. | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | 3 |
| CreatorRole | creation | The role that a person had in the creation process. | Yes | Yes | Yes | Yes | Yes | Yes | Yes | 2 | |
| DeploymentTool | creation | Indicates a specific tool that may be used for the deployment of the resource. | No | Yes | Yes | Yes | No | No | No | No | |
| DerivationDate | creation | Indicates the date of derivation in case that a resource was derived from another one. | No | No | No | Yes | Yes | No | No | No | |
| DerivationMode | creation | Indicates whether the derivation was created manually or automatically. | No | No | No | Yes | Yes | No | No | No | |
| DerivationTool | creation | Indicates the tool that was used during the derivation process. | No | No | No | Yes | Yes | No | No | No | |
| DerivationType | creation | Indication of the type of derivation. | No | No | No | Yes | Yes | 1 | |||
| DerivationWorkflow | creation | Indicates the workflow process in which the derivation process was embedded. | No | No | No | Yes | Yes | No | No | No | |
| Description | creation | A description in general prose text of the issues that are indicated by the context. The description field can occur at many different places in a component and profile. | Yes | Yes | Yes | Yes | Yes | Yes | No | No | 1 |
| | creation | The email address of a person or an organization that is involved in creating, managing or accessing resources or tools/services. | Yes | Yes | Yes | Yes | Yes | Yes | No | No | |
| Funder | creation | Name of the funder of the project. | Yes | Yes | Yes | Yes | Yes | Yes | |||
| GeoCoordinates | creation | The geographical coordinates identifying the location of the creation of the resource specified as a pair according to the norms. | Yes | No | No | No | No | Yes | |||
| HarvestingDate | creation | Date when a copy was taken from the original resource. | No | No | Yes | No | Yes | ||||
| InterannotatorAgreement | creation | An indication of the inter-annotator agreement if appropriate methods were applied. | No | Yes | No | No | No | No | No | No | 2 |
| LastUpdate | creation | The date of the last update. | No | No | Yes | Yes | Yes | Yes | No | No | |
| LinguisticSubject | creation | <<the definition from OLAC needs to be taken>> | No | No | Yes | Yes | No | No | Yes | No | 2 |
| LocationAddress | creation | The address where the resource was created or originated. | Yes | No | No | No | No | No | Yes | No | |
| LocationContinent | creation | The continent where the resource was created or originated. | Yes | No | No | No | No | Yes | 1 | ||
| LocationCountry | creation | The country where the resource was created or originated. | Yes | No | No | No | No | Yes | 1 | ||
| LocationRegion | creation | The region or sub-region where the resource was created or originated. | Yes | No | No | No | No | Yes | |||
| OriginalSource | creation | Indicates the original resources that were at the base of the creation/derivation process. | Yes | Yes | Yes | Yes | Yes | No | No | No | |
| ProjectId | creation | A unique identifier identifying the project. | Yes | Yes | Yes | Yes | Yes | Yes | |||
| ProjectName | creation | A short name or abbreviation of the project that led to the creation of the resource or tool/service. | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | |
| ProjectTitle | creation | The full title of the project that led to the creation of the resource or tool/service. | Yes | Yes | Yes | Yes | Yes | Yes | |||
| PublicationDate | creation | The date at which the resource or tool/service was published, i.e. announced to the public. | Yes | Yes | Yes | Yes | Yes | Yes | Yes | ||
| StartYear | creation | The year in which the creation process was started. | No | Yes | Yes | Yes | Yes | Yes | |||
| UpdateFrequency | creation | The frequency with which the resource or the tool/service is updated. | No | No | Yes | Yes | Yes | Yes | No | No | |
| MetadataCreationDate | ID | The date of creation of this metadata description. | Yes | Yes | Yes | Yes | Yes | Yes | No | No | |
| MetadataCreator | ID | The name(s) of the person(s) that created this metadata description. | Yes | Yes | Yes | Yes | Yes | Yes | |||
| MetadataLanguage | ID | An identifier of the language in which the metadata description was written. | Yes | Yes | Yes | Yes | Yes | Yes | No | No | 1 |
| ResourceName | ID | A short name to identify the language resource. | Yes | Yes | Yes | Yes | Yes | Yes | Yes | ||
| ResourceTitle | ID | The complete title of the resource without any abbreviations. | Yes | Yes | Yes | Yes | Yes | Yes | |||
| Url | ID | A URL referring to another resource that can be used in various contexts. | Yes | Yes | Yes | Yes | Yes | Yes | |||
| Version | ID | A number that identifies the version of a metadata description, a resource or a tool/web service. | Yes | Yes | Yes | Yes | Yes | Yes | No | No | 3 |
| AnonymizationFlag | others | Indicates whether or not the name and full name of the person participating in the session are replaced by pseudo names to make him/her anonymous. (boolean) | No | No | No | No | No | ||||
| RelationType | others | Formal specification of a relation of this resource or tool/service with another resource or tool/service. | Yes | Yes | Yes | Yes | Yes | Yes | No | No | 2 |
| ParticipantAge | participants | The age of the person participating in the content of the resource. | Yes | Yes | No | No | No | No | No | No | |
| ParticipantBirthdate | participants | The date of birth of the person participating in the content of the resource. | Yes | Yes | No | No | No | No | No | No | |
| ParticipantCode | participants | Short unique code to identify the person participating in the content of the resource. | Yes | Yes | No | No | No | No | No | No | |
| ParticipantDominantLanguage | participants | Specifies the preferred language of a participant. | Yes | No | No | No | No | No | No | No | |
| ParticipantEducation | participants | The education of the person participating in the content of the resource. | Yes | Yes | No | No | No | No | No | No | 2 |
| ParticipantEthnicGroup | participants | The ethnic group of the person participating in the content of the resource. | Yes | Yes | No | No | No | No | Yes | No | |
| ParticipantFullname | participants | The full name of the person participating in the content of the resource. | Yes | Yes | No | No | No | Yes | |||
| ParticipantName | participants | The name of the person participating in the content of the recording as it is used by others in the transcription. | Yes | Yes | No | No | No | Yes | |||
| ParticipantProfession | participants | The profession of the person participating in the content of the resource. | Yes | Yes | No | No | No | No | No | No | |
| ParticipantRole | participants | The functional role of the person participating in the content of the resource. | Yes | Yes | No | No | No | ||||
| ParticipantSex | participants | The sex of the person participating in the content of the resource. | Yes | Yes | No | No | No | No | Yes | No | |
| SocialFamilyRole | participants | The social or family role of the person participating in the content of the resource. | Yes | Yes | No | No | No | ||||
| AnnotationFormat | resources | Specifies the annotation format that is used since often the mime type will not be sufficient for machine processing. | No | Yes | No | No | No | Yes | 3 | ||
| CaptureMethod | resources | Indication of the capturing/digitization method that was used when creating the digital version. | Yes | Yes | Yes | Yes | Yes | No | Yes | No | |
| CharacterEncoding | resources | Name of the character encoding used in the resource or accepted by the tool/service. | No | Yes | Yes | Yes | Yes | Yes | |||
| CharacterSet | resources | The repertoire of characters used in the resource. A range of characters (non-coded character set) or a coded character set as defined in RFC 2050. | No | Yes | Yes | Yes | Yes | No | No | No | 4 |
| Condition | resources | Description of the technical conditions under which the resource was recorded. | Yes | No | No | No | No | No | Yes | No | |
| Duration | resources | Specification of the duration of a recording. | Yes | No | No | No | No | ||||
| Enviroment | resources | Description of the environmental conditions under which the recording was created. | Yes | No | No | No | No | ||||
| ExecutionLocation | resources | Identification of the location where the tool/service is being executed. | No | No | No | No | No | Yes | |||
| MediaType | resources | Specification of the media type of the resource or the media types the tool/service is suitable for. | Yes | No | No | Yes | No | Yes | Yes | 1 | |
| MimeType | resources | Specification of the mime-type of the resource which is a formalized specifier for the format included or a mime-type that the tool/service accepts. | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | 1 |
| OperatingSystem | resources | Identification of the operating system and its exact specification that is required to execute the tool/service. | No | No | No | No | No | Yes | |||
| PID | resources | Specification of a persistent identifier that refers to the resource or tool/service this metadata information describes. | Yes | Yes | Yes | Yes | Yes | Yes | No | No | 1 |
| Quality | resources | A numeric indication of the quality of the media file or service (1=low..5=high) | Yes | No | No | Yes | No | Yes | |||
| Resolution | resources | Specification of the spatial resolution of images or movies. | Yes | No | No | No | No | ||||
| RunningEnvironment | resources | Specification of the running environment that is required to execute the tool/service. | No | No | No | No | No | Yes | |||
| Samplerate | resources | Specification of the sample rate that is used for the recording. | Yes | No | No | No | No | ||||
| Segmentation | resources | Specification of the pointers in time or sequence that indicate the segmentation process. | Yes | No | No | No | No | No | No | No | |
| SegmentationMethod | resources | Indication of the method with which the segmentation was achieved. | Yes | No | No | No | No | ||||
| Size | resources | The size of the resource with regard to the SizeUnit measurement in form of a number. | Yes | Yes | Yes | Yes | Yes | No | No | No | |
| SizePerLanguage | resources | Indication of the size per language that is included in the resource. | No | No | Yes | Yes | No | No | No | No | 2 |
| SizePerRepLevel | resources | Indication of the size per representational unit in the resource (i.e. Number of syntactic units, number of semantic units etc.) | Yes | Yes | Yes | Yes | Yes | No | No | No | |
| SizeUnit | resources | Specification of the unit of size that is used when specifying the size. | Yes | Yes | Yes | Yes | Yes | ||||
| Validation | resources | Indication of the validation status of the resource (boolean). | Yes | Yes | Yes | Yes | Yes | 1 | |||
| ValidationLevel | resources | Indication of the level of validation (full, partial). | Yes | Yes | Yes | Yes | Yes | ||||
| ValidationMode | resources | Indication of the validation methodology that was applied (automatic/manual). | Yes | Yes | Yes | Yes | Yes | No | No | No | 1 |
| ValidationType | resources | Indication of the validation type that was applied (formal, content, etc). | Yes | Yes | Yes | Yes | Yes | ||||
| Content-Channel | vocabulary | Unknown; Unspecified; Face to Face; Experimental setting; Broadcasting; Telephone; wizard-of-oz; Human-machine dialogue; Other; | No | No | No | No | No | ||||
| Content-EventStructure | vocabulary | Unknown; Unspecified; Monologue; Dialogue; Conversation; Not a natural format; | No | No | No | No | No | ||||
| Content-Genre | vocabulary | Unknown; Unspecified; Discourse; Ritual/religious texts; Fiction; Newspaper article; Radio/TV feature; Drama; Singing; Instrumental music; Poetry; Literature; Secondary document; Personal notes;... | No | No | No | No | No | ||||
| Content-Interactivity | vocabulary | Unknown; Unspecified; interactive; non-interactive; semi-interactive; | No | No | No | No | No | ||||
| Content-Involvement | vocabulary | Unknown; Unspecified; elicited; non-elicited; no-observer; | No | No | No | No | No | ||||
| Content-Modalities | vocabulary | Unknown; Unspecified; speech; writing; gestures; pointing-gestures; signs; eye-gaze; facial-expressions; emotional-state; haptic; song; instrumental music; | No | No | No | No | No | ||||
| Content-PlanningType | vocabulary | Unknown; Unspecified; spontaneous; semi-spontaneous; planned; | No | No | No | No | No | ||||
| Content-SocialContext | vocabulary | Unknown; Unspecified; Family; Private; Public; Controlled environment; | No | No | No | No | No | ||||
| Content-SubGenre | vocabulary | Unknown; Unspecified; | No | No | No | No | No | ||||
| Content-SubGenre-Discourse | vocabulary | Unknown; Unspecified; Narrative; Oratory; Procedural; Formulaic; Language play; Interview; Conversation; Description; Unintelligeble speech; | No | No | No | No | No | ||||
| Content-Subject | vocabulary | Unknown; Unspecified; | No | No | No | No | No | ||||
| Content-Task | vocabulary | Unknown; Unspecified; info-kiosk; travel-planning; room reservation; frog story; | No | No | No | No | No | ||||
| Continents | vocabulary | Unknown; Unspecified; Africa; Asia; Europe; Australia; Oceania; North-America; Middle-America; South-America; | No | No | No | No | No | ||||
| Countries | vocabulary | Unknown; Undefined; Unspecified; Afghanistan; Albania; Algeria; American Samoa; Andorra; Angola; Anguilla; Antarctica; Antigua and Barbuda; Arctic Ocean; Argentina; Armenia; Aruba; Ashmore and... | No | No | No | No | No | ||||
| Country | vocabulary | | No | No | No | No | No | 1 | |||
| DescriptionLanguages | vocabulary | ISO639-2:dut; ISO639-2:eng; ISO639-2:ger; ISO639-2:frn; ISO639-2:spn; ISO639-2:por; ISO639-2:rus; ISO639-2:ara; ISO639-2:ind; | No | No | No | No | No | ||||
| MediaFile-Format | vocabulary | application/pdf; application/smil+xml; audio/x-aiff; audio/x-mp2; audio/x-mp3; audio/x-mpeg4; audio/x-wav; image/jpeg; image/png; image/tiff; image/gif; text/html; video/x-mpeg1; video/x-mpeg2; video... | No | No | No | No | No | ||||
| MediaFile-Type | vocabulary | Unknown; Unspecified; audio; video; image; document; drawing; text; | No | No | No | No | No | ||||
| MPI-Languages | vocabulary | Abaza; Abkhaz; Adyghe; Aghul; Afrikaans; Akhvakh; Akurio; Amarg; American Sign Language; Andi; Arabic, Algerian; Arabic, Egyptian; Arabic, Moroccan Spoken; Arabic, Standard; Argi; Arrernte, Eastern;... | No | No | No | No | No | ||||
| Validation-Methodology | vocabulary | Unknown; Unspecified; Hand; Automatic; Semi-Automatic; | No | No | No | No | No | ||||
| Validation-Type | vocabulary | Unknown; Unspecified; None; Formal; Content; Formal/Content; | No | No | No | No | No | ||||
| WrittenResource-Derivation | vocabulary | Unknown; Unspecified; Original; Analysis; Translation; Commentary; Criticism; Annotation; | No | No | No | No | No | ||||
| WrittenResource-Format | vocabulary | application/pdf; text/html; text/plain; text/xml; text/x-chat; text/x-cut; text/x-eaf+xml; text/x-esf; text/x-lexus-config+xml; text/x-lexus-resource+xml; text/x-lmf+xml; text/x-shoebox-language;... | No | No | No | No | No | ||||
| WrittenResource-SubType | vocabulary | Unspecified; Unknown; | No | No | No | No | No | ||||
| WrittenResource-SubType-Annotation | vocabulary | Unknown; Unspecified; Gesture; Orthography; Phonetic; Phonology; Morphology; Morphosyntax; Syntax; Semantics; Pragmatics; Typology; | No | No | No | No | No | ||||
| WrittenResource-Type | vocabulary | Unknown; Unspecified; Primary Text; Annotation; Lexical Analysis; Ethnography; Study; | No | No | No | No | No |
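
The flag columns in the table can be read as a simple filter: a datcat applies to a resource type when that column says "Yes", and it is a candidate for a search interface only if it is neither marked "Not for search" nor proposed for deletion. The following Python sketch illustrates that reading. The names `DataCategory`, `from_flags` and `searchable_for` are hypothetical and not part of any CLARIN or ISOcat tool, and treating "Not for search" as a hard exclusion is an assumption. The sample rows are copied from complete rows of the table above.

```python
from dataclasses import dataclass

# Resource-type columns of the table, in table order.
RESOURCE_TYPES = ("Media", "Annotations", "Texts", "Lexica", "Lists", "Tools")


@dataclass(frozen=True)
class DataCategory:
    """Hypothetical record for one table row (illustration only)."""
    title: str
    structure: str
    applies_to: frozenset  # resource types whose column says "Yes"
    not_for_search: bool
    delete: bool


def from_flags(title, structure, flags, not_for_search, delete):
    """Build a record from six Yes/No flags given in RESOURCE_TYPES order."""
    if len(flags) != len(RESOURCE_TYPES):
        raise ValueError("expected one flag per resource type")
    applies = frozenset(t for t, f in zip(RESOURCE_TYPES, flags) if f == "Yes")
    return DataCategory(title, structure, applies,
                        not_for_search == "Yes", delete == "Yes")


def searchable_for(categories, resource_type):
    """Titles that apply to resource_type and are neither hidden from search
    nor proposed for deletion (assumed interpretation of the two flags)."""
    if resource_type not in RESOURCE_TYPES:
        raise ValueError(f"unknown resource type: {resource_type}")
    return [c.title for c in categories
            if resource_type in c.applies_to
            and not c.not_for_search
            and not c.delete]


# Four complete rows taken from the table above.
SAMPLE = [
    from_flags("FaxNumber", "access", ("Yes",) * 6, "No", "No"),
    from_flags("LegalStatus", "access", ("Yes",) * 6, "No", "Yes"),
    from_flags("AnnotationLevelType", "content",
               ("No", "Yes", "No", "No", "No", "No"), "Yes", "No"),
    from_flags("DominantLanguage", "content",
               ("Yes", "Yes", "Yes", "Yes", "Yes", "No"), "No", "No"),
]

print(searchable_for(SAMPLE, "Annotations"))  # ['FaxNumber', 'DominantLanguage']
print(searchable_for(SAMPLE, "Tools"))        # ['FaxNumber']
```

Rows whose flag cells are incomplete in the listing would need to be checked against the registry before being loaded this way, since the sketch expects all six resource-type flags.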

