Frequently Asked Questions - Metadata in CLARIN: conversion to CMDI
From that profile you can generate the XSD:
You can then transform your DC XML files into CMDI files that comply with the profile using the following XSLT:
An example (DC) input file:
The corresponding (CMDI) output file:
There is no general procedure to do this, as TEI has many variants and extensions. However, you could use the following general workflow:
- Inspect your TEI headers and decide what the relevant parts are. Some information (e.g. layout tags etc.) might be lost during the conversion.
- Compare your needs with the existing TEI profiles (teiHeader type 1, teiHeader type 2, teiHeader type 3) in the CMDI component registry. If one of them fulfills your needs, proceed to the next step. If none does, use one of the TEI profiles as a basis to create your own CMDI profile.
- Create an XSLT that generates CMDI instances (according to the profile that you chose in the previous step) from the TEI files. Have a look at olac2cmdi.xsl and imdi2clarin.xsl for some inspiration.
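The workflow above can be sketched in a few lines of Python, as a rough illustration of the mapping step rather than a replacement for a real XSLT. Everything specific here is an assumption: the selected TEI header fields, the component name `ExampleTeiHeader` and its child elements are hypothetical and do not correspond to any profile in the component registry, and the CMDI 1.1 namespace is used only for illustration. A valid CMDI record would also need the Header and Resources sections, which are left out.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Assumption: CMDI 1.1 envelope namespace; adjust for the CMDI version you target.
CMD_NS = "http://www.clarin.eu/cmd/"
NS = {"tei": TEI_NS}

# Hypothetical mapping: target element name -> path inside the teiHeader.
# Replace this with the fields you decided are relevant for your profile.
FIELD_MAP = {
    "title": "tei:fileDesc/tei:titleStmt/tei:title",
    "author": "tei:fileDesc/tei:titleStmt/tei:author",
    "publisher": "tei:fileDesc/tei:publicationStmt/tei:publisher",
}


def tei_header_to_cmdi(tei_xml: str) -> ET.Element:
    """Copy selected teiHeader fields into a CMDI-like element tree."""
    root = ET.fromstring(tei_xml)
    # Accept either a full TEI document or a bare teiHeader.
    if root.tag == f"{{{TEI_NS}}}teiHeader":
        header = root
    else:
        header = root.find("tei:teiHeader", NS)
    if header is None:
        raise ValueError("no teiHeader element found")

    cmd = ET.Element(f"{{{CMD_NS}}}CMD")
    components = ET.SubElement(cmd, f"{{{CMD_NS}}}Components")
    # 'ExampleTeiHeader' is a made-up component name for this sketch.
    record = ET.SubElement(components, f"{{{CMD_NS}}}ExampleTeiHeader")

    for name, path in FIELD_MAP.items():
        for node in header.findall(path, NS):
            # itertext() flattens inline markup, so layout tags are dropped.
            text = "".join(node.itertext()).strip()
            if text:
                ET.SubElement(record, f"{{{CMD_NS}}}{name}").text = text
    return cmd


SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Example <hi>corpus</hi></title>
        <author>A. Author</author>
      </titleStmt>
      <publicationStmt><publisher>Example Institute</publisher></publicationStmt>
    </fileDesc>
  </teiHeader>
</TEI>"""

if __name__ == "__main__":
    ET.register_namespace("cmd", CMD_NS)
    print(ET.tostring(tei_header_to_cmdi(SAMPLE_TEI), encoding="unicode"))
```

Keeping the field mapping in a single table makes it easy to compare against the elements of the chosen profile before writing the actual stylesheet. Note how the inline `hi` tag in the sample title is flattened, which is the kind of layout information that gets lost in the conversion.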
If you have records in the Meta-Share maximal format, you can use the profiles and conversion stylesheets as described at the Meta site.
You can then use an XSLT transformation to convert your Meta-Share records into their CMDI equivalent. For the maximal Meta-Share schema, guidelines and XSLT files are provided here. For converting the minimal Meta-Share schema to OLAC, the XSLT is provided here.
In addition, Jozef Misutka from UFAL has kindly implemented an OAI-PMH module for the Meta-Share repository.
See also: CMDI interoperability workshop
If you have old records in the IMDI format, you can use the following profiles:
- for sessions: https://catalog.clarin.eu/ds/ComponentRegistry#/?itemId=clarin.eu%3Acr1%...
- for sign language sessions: https://catalog.clarin.eu/ds/ComponentRegistry#/?itemId=clarin.eu%3Acr1%...
- for corpus nodes: https://catalog.clarin.eu/ds/ComponentRegistry#/?itemId=clarin.eu%3Acr1%...
From the profile you can generate the XSD:
- for sessions: https://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/1.x/profile...
- for sign language sessions: https://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/1.x/profile...
- for corpus nodes: https://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/1.x/profile...
You can then transform your IMDI files into CMDI files that comply with the profile using the following set of XSLTs: