OAI-PMH for CMDI

(draft version)

The main trick for providing CMDI over the OAI-PMH protocol is to declare the minimal CMDI XSD (an XML schema that defines only the header structure and the CMD tags) as the schema of the metadataFormat. Each instance then links to its own profile-specific XSD file, which is generated from the profile with comp2schema.xsl.

In the example below a record for a Lexicon profile is transported over OAI-PMH.

http://someurl?verb=ListMetadataFormats

<ListMetadataFormats>
<metadataFormat>
<metadataPrefix>oai_dc</metadataPrefix>
<schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
<metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
</metadataFormat>

<metadataFormat>
<metadataPrefix>cmdi</metadataPrefix>
<schema>http://www.clarin.eu/cmd/xsd/minimal-cmdi.xsd</schema>
<metadataNamespace>http://www.clarin.eu/cmd/</metadataNamespace>
</metadataFormat>

...
</ListMetadataFormats>
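
A harvester would typically check this response before asking for CMDI records. The following is a minimal sketch of that check in Python, using only the standard library. The sample response is abbreviated and hypothetical (a real one carries the full OAI-PMH envelope), and the function name list_formats is an illustrative choice, not part of any existing tool.

```python
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"

# Hypothetical, abbreviated response; a real one also contains
# responseDate and request elements.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat>
      <metadataPrefix>oai_dc</metadataPrefix>
      <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </metadataFormat>
    <metadataFormat>
      <metadataPrefix>cmdi</metadataPrefix>
      <schema>http://www.clarin.eu/cmd/xsd/minimal-cmdi.xsd</schema>
      <metadataNamespace>http://www.clarin.eu/cmd/</metadataNamespace>
    </metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>"""


def list_formats(response_xml):
    """Return {metadataPrefix: (schema, metadataNamespace)} for a response."""
    root = ET.fromstring(response_xml)
    formats = {}
    for fmt in root.iter("{%s}metadataFormat" % OAI_NS):
        prefix = fmt.findtext("{%s}metadataPrefix" % OAI_NS)
        schema = fmt.findtext("{%s}schema" % OAI_NS)
        namespace = fmt.findtext("{%s}metadataNamespace" % OAI_NS)
        formats[prefix] = (schema, namespace)
    return formats


formats = list_formats(SAMPLE)
if "cmdi" in formats:
    # The schema listed here is the minimal CMDI XSD, not a profile XSD.
    print(formats["cmdi"][0])
```

Note that the schema found this way only describes the generic CMD envelope; the profile-specific schema is announced by each record itself.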

http://someurl?verb=ListRecords&metadataPrefix=cmdi

<OAI-PMH xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">

<responseDate>2009-05-25T11:46:14Z</responseDate>
<request verb="ListRecords" metadataPrefix="cmdi">http://corpus1/ds/oai2/oai2</request>

<ListRecords>
<record>

<header>...</header>

<metadata>

<!--the CMDI lexicon profile -->
<cmdi:CMD xsi:schemaLocation="http://www.clarin.eu/cmd/ http://www.clarin.eu/cmd/components/clarin-nl/lexicon/lexicon.xsd">
<cmdi:Header></cmdi:Header>
    <cmdi:Resources>
        <cmdi:ResourceProxyList>
            <cmdi:ResourceProxy id="dicovalence">
                <cmdi:ResourceType>Resource</cmdi:ResourceType>
                <cmdi:ResourceRef>http://bach.arts.kuleuven.be/dicovalence/latest_utf8.txt</cmdi:ResourceRef>
            </cmdi:ResourceProxy>        
        </cmdi:ResourceProxyList>
        <cmdi:JournalFileProxyList></cmdi:JournalFileProxyList>
        <cmdi:ResourceRelationList></cmdi:ResourceRelationList>
    </cmdi:Resources>
    <cmdi:Components>
        <cmdi:Lexicon ref="dicovalence">
            <cmdi:Name>Dicovalence</cmdi:Name>
            <cmdi:Title></cmdi:Title>
            <cmdi:ID></cmdi:ID>
            <cmdi:Version></cmdi:Version>
            <cmdi:TimeCoverage></cmdi:TimeCoverage>
            <cmdi:Description></cmdi:Description>
            <cmdi:LexiconType>
                <cmdi:LexiconType>dictionary</cmdi:LexiconType>
                <cmdi:LexiconSubtype>
                    <cmdi:Linguality>bilingual</cmdi:Linguality>
                    <cmdi:HeadWordType>lemma</cmdi:HeadWordType>
                </cmdi:LexiconSubtype>
            </cmdi:LexiconType>
            <cmdi:LexiconContent>
                <cmdi:AnnotationLevelType>frequency</cmdi:AnnotationLevelType>
                <cmdi:AnnotationFormat>ASCII</cmdi:AnnotationFormat>
            </cmdi:LexiconContent>
            <cmdi:Size>
                <cmdi:TotalSize>
                    <cmdi:Number>1</cmdi:Number>
                    <cmdi:SizeUnit></cmdi:SizeUnit>
                </cmdi:TotalSize>
                <cmdi:SizePerLanguage>
                    <cmdi:Number>1</cmdi:Number>
                    <cmdi:SizeUnit></cmdi:SizeUnit>
                </cmdi:SizePerLanguage>
            </cmdi:Size>
            <cmdi:SubjectLanguages>
                <cmdi:SubjectLanguage>
                    <cmdi:language></cmdi:language>
                </cmdi:SubjectLanguage>
            </cmdi:SubjectLanguages>
            <cmdi:DocumentationLanguages></cmdi:DocumentationLanguages>
            <cmdi:Access>
                <cmdi:Availability></cmdi:Availability>
                <cmdi:CatalogueLink></cmdi:CatalogueLink>
                <cmdi:Contact>
                    <cmdi:Email></cmdi:Email>
                </cmdi:Contact>
            </cmdi:Access>
            <cmdi:location></cmdi:location>
            <cmdi:Validation>
                <cmdi:Validation>Validated</cmdi:Validation>
                <cmdi:ValidationType></cmdi:ValidationType>
                <cmdi:ValidationMode></cmdi:ValidationMode>
                <cmdi:ValidationLevel></cmdi:ValidationLevel>
            </cmdi:Validation>
        </cmdi:Lexicon>
    </cmdi:Components>

</cmdi:CMD>

</metadata>
</record>...
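
Because the metadataFormat only names the minimal XSD, a consumer that wants to validate a record has to read the profile-specific schema from the record itself. The sketch below shows one way to do that in Python with the standard library. The sample record is a shortened, hypothetical version of the one above, it assumes the CMD namespace URI given in the ListMetadataFormats response, and the helper name schema_locations is made up for illustration.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Shortened, hypothetical record with the namespace declarations written out.
SAMPLE_RECORD = """<cmdi:CMD xmlns:cmdi="http://www.clarin.eu/cmd/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.clarin.eu/cmd/ http://www.clarin.eu/cmd/components/clarin-nl/lexicon/lexicon.xsd">
  <cmdi:Header/>
</cmdi:CMD>"""


def schema_locations(element):
    """Map each namespace named in xsi:schemaLocation to its schema URL."""
    raw = element.get("{%s}schemaLocation" % XSI_NS, "")
    tokens = raw.split()
    # The attribute holds whitespace-separated (namespace, location) pairs.
    return dict(zip(tokens[0::2], tokens[1::2]))


cmd = ET.fromstring(SAMPLE_RECORD)
locations = schema_locations(cmd)
# The profile-specific XSD that this particular record claims to follow.
print(locations.get("http://www.clarin.eu/cmd/"))
```

The URL obtained this way could then be handed to a schema-aware validator; that step is left out here because the Python standard library does not include XSD validation.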

http://someurl?verb=ListRecords&metadataPrefix=oai_dc

Or, in plain words: how should I map my CMDI descriptions to the Dublin Core format that is compulsory when using OAI-PMH? There is no single answer; you probably know your own data best. It is a good idea to use common sense and, at a minimum, to refer to the described resources with a dc:identifier element.


<OAI-PMH xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
<responseDate>2009-05-25T11:46:14Z</responseDate>
<request verb="ListRecords" metadataPrefix="oai_dc">http://corpus1/ds/oai2/oai2</request>
<ListRecords>
<record><header>...</header>
<metadata>

<oai_dc:dc xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/
http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:identifier>oai:www.mpi.nl:1839/00-0000-0000-0009-4882-E</dc:identifier>
<dc:identifier>http://bach.arts.kuleuven.be/dicovalence/latest_utf8.txt</dc:identifier>
<dc:title>Dicovalence</dc:title>
<dc:publisher>Rik van Gijn</dc:publisher>
<dc:publisher>Max Planck Institute for Psycholinguistics</dc:publisher>
<dc:subject xsi:type="olac:language" olac:code="yuz">Yurakaré</dc:subject>
<dc:subject xsi:type="olac:language" olac:code="spa">Spanish</dc:subject>...
</oai_dc:dc>
</metadata></record>...
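
Since the mapping is left to the data provider, the following Python sketch shows only the minimal variant suggested above: every ResourceRef becomes a dc:identifier. Mapping the Lexicon Name to dc:title is an additional assumption modelled on the example record, not a prescribed rule. The sample input is a shortened, hypothetical CMD record, and cmd_to_dc is an illustrative name.

```python
import xml.etree.ElementTree as ET

CMD_NS = "http://www.clarin.eu/cmd/"
DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# Use the conventional prefixes when the result is serialized.
ET.register_namespace("dc", DC_NS)
ET.register_namespace("oai_dc", OAI_DC_NS)

# Shortened, hypothetical CMD record based on the Lexicon example.
SAMPLE_CMD = """<cmdi:CMD xmlns:cmdi="http://www.clarin.eu/cmd/">
  <cmdi:Resources>
    <cmdi:ResourceProxyList>
      <cmdi:ResourceProxy id="dicovalence">
        <cmdi:ResourceType>Resource</cmdi:ResourceType>
        <cmdi:ResourceRef>http://bach.arts.kuleuven.be/dicovalence/latest_utf8.txt</cmdi:ResourceRef>
      </cmdi:ResourceProxy>
    </cmdi:ResourceProxyList>
  </cmdi:Resources>
  <cmdi:Components>
    <cmdi:Lexicon ref="dicovalence">
      <cmdi:Name>Dicovalence</cmdi:Name>
    </cmdi:Lexicon>
  </cmdi:Components>
</cmdi:CMD>"""


def cmd_to_dc(cmd):
    """Build an oai_dc:dc element from a CMD element (minimal mapping)."""
    dc_root = ET.Element("{%s}dc" % OAI_DC_NS)
    # Minimal requirement: one dc:identifier per described resource.
    for ref in cmd.iter("{%s}ResourceRef" % CMD_NS):
        if ref.text and ref.text.strip():
            ident = ET.SubElement(dc_root, "{%s}identifier" % DC_NS)
            ident.text = ref.text.strip()
    # Assumed extra rule: the Lexicon Name doubles as the title.
    name = cmd.findtext(".//{%s}Lexicon/{%s}Name" % (CMD_NS, CMD_NS))
    if name and name.strip():
        title = ET.SubElement(dc_root, "{%s}title" % DC_NS)
        title.text = name.strip()
    return dc_root


dc = cmd_to_dc(ET.fromstring(SAMPLE_CMD))
print(ET.tostring(dc, encoding="unicode"))
```

Fields such as dc:publisher or dc:subject depend on profile-specific components, so they would need their own rules for each profile.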