OAI-PMH for CMDI
(draft version)
The main trick in providing CMDI over the OAI-PMH protocol is to use the minimal CMDI XSD (an XML schema that defines only the header structure and the CMD tags) as the schema in the metadataFormat definition. The instances themselves then link to the profile-specific XSD file, which is generated from the profile with comp2schema.xsl.
In the example below a record for a Lexicon profile is transported over OAI-PMH.
http://someurl?verb=ListMetadataFormats
<ListMetadataFormats>
  <metadataFormat>
    <metadataPrefix>oai_dc</metadataPrefix>
    <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
    <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
  </metadataFormat>
  <metadataFormat>
    <metadataPrefix>cmdi</metadataPrefix>
    <schema>http://www.clarin.eu/cmd/xsd/minimal-cmdi.xsd</schema>
    <metadataNamespace>http://www.clarin.eu/cmd/</metadataNamespace>
  </metadataFormat>
  ...
</ListMetadataFormats>
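A harvester can check that a repository advertises the cmdi format before requesting records. The following is a minimal sketch using only the Python standard library. The `SAMPLE` string and the `list_formats` helper are illustrative assumptions, not part of any CLARIN tool, and the sample adds the OAI-PMH namespace declaration that the abbreviated listing above leaves out.

```python
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"

# Hypothetical sample response, modelled on the listing above.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat>
      <metadataPrefix>oai_dc</metadataPrefix>
      <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </metadataFormat>
    <metadataFormat>
      <metadataPrefix>cmdi</metadataPrefix>
      <schema>http://www.clarin.eu/cmd/xsd/minimal-cmdi.xsd</schema>
      <metadataNamespace>http://www.clarin.eu/cmd/</metadataNamespace>
    </metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>"""


def list_formats(xml_text):
    """Return {metadataPrefix: (schema, metadataNamespace)} for a response."""
    ns = {"oai": OAI_NS}
    root = ET.fromstring(xml_text)
    formats = {}
    for fmt in root.iterfind(".//oai:metadataFormat", ns):
        prefix = fmt.findtext("oai:metadataPrefix", namespaces=ns)
        schema = fmt.findtext("oai:schema", namespaces=ns)
        namespace = fmt.findtext("oai:metadataNamespace", namespaces=ns)
        formats[prefix] = (schema, namespace)
    return formats


if __name__ == "__main__":
    schema, namespace = list_formats(SAMPLE)["cmdi"]
    print("cmdi schema:", schema)
    print("cmdi namespace:", namespace)
```

Note that the schema reported here is the minimal CMDI XSD for every CMDI record, whatever its profile. The profile-specific schema is only discoverable from the records themselves.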
http://someurl?verb=ListRecords&metadataPrefix=cmdi
<OAI-PMH xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2009-05-25T11:46:14Z</responseDate>
  <request verb="ListRecords" metadataPrefix="cmdi">http://corpus1/ds/oai2/oai2</request>
  <ListRecords>
    <record>
      <header>...</header>
      <metadata>
        <!-- the CMDI lexicon profile -->
        <cmdi:CMD xsi:schemaLocation="http://www.clarin.eu/cmd/ http://www.clarin.eu/cmd/components/clarin-nl/lexicon/lexicon.xsd">
          <cmdi:Header></cmdi:Header>
          <cmdi:Resources>
            <cmdi:ResourceProxyList>
              <cmdi:ResourceProxy id="dicovalence">
                <cmdi:ResourceType>Resource</cmdi:ResourceType>
                <cmdi:ResourceRef>http://bach.arts.kuleuven.be/dicovalence/latest_utf8.txt</cmdi:ResourceRef>
              </cmdi:ResourceProxy>
            </cmdi:ResourceProxyList>
            <cmdi:JournalFileProxyList></cmdi:JournalFileProxyList>
            <cmdi:ResourceRelationList></cmdi:ResourceRelationList>
          </cmdi:Resources>
          <cmdi:Components>
            <cmdi:Lexicon ref="dicovalence">
              <cmdi:Name>Dicovalence</cmdi:Name>
              <cmdi:Title></cmdi:Title>
              <cmdi:ID></cmdi:ID>
              <cmdi:Version></cmdi:Version>
              <cmdi:TimeCoverage></cmdi:TimeCoverage>
              <cmdi:Description></cmdi:Description>
              <cmdi:LexiconType>
                <cmdi:LexiconType>dictionary</cmdi:LexiconType>
                <cmdi:LexiconSubtype>
                  <cmdi:Linguality>bilingual</cmdi:Linguality>
                  <cmdi:HeadWordType>lemma</cmdi:HeadWordType>
                </cmdi:LexiconSubtype>
              </cmdi:LexiconType>
              <cmdi:LexiconContent>
                <cmdi:AnnotationLevelType>frequency</cmdi:AnnotationLevelType>
                <cmdi:AnnotationFormat>ASCII</cmdi:AnnotationFormat>
              </cmdi:LexiconContent>
              <cmdi:Size>
                <cmdi:TotalSize>
                  <cmdi:Number>1</cmdi:Number>
                  <cmdi:SizeUnit></cmdi:SizeUnit>
                </cmdi:TotalSize>
                <cmdi:SizePerLanguage>
                  <cmdi:Number>1</cmdi:Number>
                  <cmdi:SizeUnit></cmdi:SizeUnit>
                </cmdi:SizePerLanguage>
              </cmdi:Size>
              <cmdi:SubjectLanguages>
                <cmdi:SubjectLanguage>
                  <cmdi:language></cmdi:language>
                </cmdi:SubjectLanguage>
              </cmdi:SubjectLanguages>
              <cmdi:DocumentationLanguages></cmdi:DocumentationLanguages>
              <cmdi:Access>
                <cmdi:Availability></cmdi:Availability>
                <cmdi:CatalogueLink></cmdi:CatalogueLink>
                <cmdi:Contact>
                  <cmdi:Email></cmdi:Email>
                </cmdi:Contact>
              </cmdi:Access>
              <cmdi:location></cmdi:location>
              <cmdi:Validation>
                <cmdi:Validation>Validated</cmdi:Validation>
                <cmdi:ValidationType></cmdi:ValidationType>
                <cmdi:ValidationMode></cmdi:ValidationMode>
                <cmdi:ValidationLevel></cmdi:ValidationLevel>
              </cmdi:Validation>
            </cmdi:Lexicon>
          </cmdi:Components>
        </cmdi:CMD>
      </metadata>
    </record>
    ...
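Because the metadataFormat only points at the minimal CMDI XSD, a consumer that wants to validate a record against its profile has to read the xsi:schemaLocation attribute of the CMD element. The sketch below shows one way to do that with the Python standard library. The `RECORD` sample and the `profile_schema` helper are hypothetical, and the sketch assumes the CMD namespace is http://www.clarin.eu/cmd/ (with a trailing slash), as advertised in the ListMetadataFormats response.

```python
import xml.etree.ElementTree as ET

CMD_NS = "http://www.clarin.eu/cmd/"  # assumption: namespace from ListMetadataFormats
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical, heavily abbreviated <metadata> payload with namespaces declared.
RECORD = """<metadata xmlns:cmdi="http://www.clarin.eu/cmd/"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <cmdi:CMD xsi:schemaLocation="http://www.clarin.eu/cmd/ http://www.clarin.eu/cmd/components/clarin-nl/lexicon/lexicon.xsd">
    <cmdi:Header/>
  </cmdi:CMD>
</metadata>"""


def profile_schema(metadata_xml, namespace=CMD_NS):
    """Return the schema URL paired with `namespace`, or None if absent."""
    root = ET.fromstring(metadata_xml)
    cmd = root.find(f"{{{namespace}}}CMD")
    if cmd is None:
        return None
    # schemaLocation is a whitespace-separated list of (namespace, location) pairs.
    tokens = cmd.get(f"{{{XSI_NS}}}schemaLocation", "").split()
    pairs = dict(zip(tokens[0::2], tokens[1::2]))
    return pairs.get(namespace)


if __name__ == "__main__":
    print(profile_schema(RECORD))
```

The returned URL can then be handed to whatever XSD validator the harvester uses. Fetching and validating are left out here because they depend on the toolchain.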
http://someurl?verb=ListRecords&metadataPrefix=oai_dc
Or, in plain words: how should I map my CMDI descriptions to the Dublin Core format that is compulsory when using OAI-PMH? There is no single answer, and you probably know your own data best. It is a good idea to use common sense and, as a minimum, to refer to the described resources with a dc:identifier element.
<OAI-PMH xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2009-05-25T11:46:14Z</responseDate>
  <request verb="ListRecords" metadataPrefix="oai_dc">http://corpus1/ds/oai2/oai2</request>
  <ListRecords>
    <record>
      <header>...</header>
      <metadata>
        <oai_dc:dc xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
          <dc:identifier>oai:www.mpi.nl:1839/00-0000-0000-0009-4882-E</dc:identifier>
          <dc:identifier>http://bach.arts.kuleuven.be/dicovalence/latest_utf8.txt</dc:identifier>
          <dc:title>Dicovalence</dc:title>
          <dc:publisher>Rik van Gijn</dc:publisher>
          <dc:publisher>Max Planck Institute for Psycholinguistics</dc:publisher>
          <dc:subject xsi:type="olac:language" olac:code="yuz">Yurakaré</dc:subject>
          <dc:subject xsi:type="olac:language" olac:code="spa">Spanish</dc:subject>
          ...
        </oai_dc:dc>
      </metadata>
    </record>
    ...
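As one possible starting point for such a mapping, the sketch below derives a very small oai_dc record from a CMDI description: every ResourceRef becomes a dc:identifier and the Lexicon Name becomes the dc:title. This is only an illustration of the minimal advice given above, not a recommended or official mapping. The `cmdi_to_dc` helper and the `CMDI` sample are hypothetical, and the CMD namespace with trailing slash is an assumption taken from the ListMetadataFormats response.

```python
import xml.etree.ElementTree as ET

CMD_NS = "http://www.clarin.eu/cmd/"  # assumption: namespace from ListMetadataFormats
DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# Hypothetical, abbreviated CMDI lexicon description.
CMDI = """<cmdi:CMD xmlns:cmdi="http://www.clarin.eu/cmd/">
  <cmdi:Resources>
    <cmdi:ResourceProxyList>
      <cmdi:ResourceProxy id="dicovalence">
        <cmdi:ResourceType>Resource</cmdi:ResourceType>
        <cmdi:ResourceRef>http://bach.arts.kuleuven.be/dicovalence/latest_utf8.txt</cmdi:ResourceRef>
      </cmdi:ResourceProxy>
    </cmdi:ResourceProxyList>
  </cmdi:Resources>
  <cmdi:Components>
    <cmdi:Lexicon ref="dicovalence">
      <cmdi:Name>Dicovalence</cmdi:Name>
    </cmdi:Lexicon>
  </cmdi:Components>
</cmdi:CMD>"""


def cmdi_to_dc(cmdi_xml):
    """Build an oai_dc:dc element with identifiers and a title from CMDI."""
    ns = {"cmdi": CMD_NS}
    cmd = ET.fromstring(cmdi_xml)
    # Register prefixes so the serialised output uses dc: and oai_dc:.
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("oai_dc", OAI_DC_NS)
    dc = ET.Element(f"{{{OAI_DC_NS}}}dc")
    # Minimal mapping: one dc:identifier per non-empty ResourceRef.
    for ref in cmd.iterfind(".//cmdi:ResourceRef", ns):
        if ref.text and ref.text.strip():
            ET.SubElement(dc, f"{{{DC_NS}}}identifier").text = ref.text.strip()
    # Illustrative choice: use the Lexicon Name as the title when present.
    name = cmd.findtext(".//cmdi:Lexicon/cmdi:Name", namespaces=ns)
    if name and name.strip():
        ET.SubElement(dc, f"{{{DC_NS}}}title").text = name.strip()
    return dc


if __name__ == "__main__":
    print(ET.tostring(cmdi_to_dc(CMDI), encoding="unicode"))
```

Fields such as dc:publisher or dc:subject in the example record have no obvious source in this lexicon description, so a real mapping would need profile-specific rules for them.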

