OAI-PMH for CMDI

If you use only one profile, the main trick when serving CMDI over the protocol is to refer to the XSD that the component registry generates for that profile (its URL is available from the right-click Show Info menu option), e.g.

http://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/profiles/clarin.eu:cr1:p_1271859438204/xsd
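
As a small illustration, the XSD URL and the matching xsi:schemaLocation value could be derived from a profile ID as sketched below. The function names are made up for this example, and the URL pattern is only inferred from the single example URL above, so treat it as an assumption rather than a documented registry API.

```python
# Base URL and namespace are copied from the examples on this page.
REGISTRY_PROFILES = "http://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/profiles"
CMDI_NAMESPACE = "http://www.clarin.eu/cmd/"


def profile_xsd_url(profile_id: str) -> str:
    """Return the XSD URL for a profile ID (pattern assumed from the example URL)."""
    return f"{REGISTRY_PROFILES}/{profile_id}/xsd"


def schema_location(profile_id: str) -> str:
    """Return an xsi:schemaLocation value: the namespace, a space, then the XSD URL."""
    return f"{CMDI_NAMESPACE} {profile_xsd_url(profile_id)}"


if __name__ == "__main__":
    print(profile_xsd_url("clarin.eu:cr1:p_1271859438204"))
    print(schema_location("clarin.eu:cr1:p_1271859438204"))
```

The first value would go into the schema element of the metadata format, the second into the xsi:schemaLocation attribute of the CMD root element.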

In the example below, a record for this profile is transported over OAI-PMH.

http://someurl?verb=ListMetadataFormats

<ListMetadataFormats>
<metadataFormat>
<metadataPrefix>oai_dc</metadataPrefix>
<schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
<metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
</metadataFormat>

<metadataFormat>
<metadataPrefix>cmdi</metadataPrefix>
<schema>http://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/profiles/clarin.eu:cr1:p_1271859438204/xsd</schema>
<metadataNamespace>http://www.clarin.eu/cmd/</metadataNamespace>
</metadataFormat>

...
</ListMetadataFormats>

http://someurl?verb=GetRecord&metadataPrefix=cmdi&identifier=oai:www.mpi.nl:MPI338011

<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
    <responseDate>2011-09-24T04:02:43Z</responseDate>
    <request identifier="oai:www.mpi.nl:MPI338011" metadataPrefix="cmdi" verb="GetRecord">http://lux16.mpi.nl/ds/oai2/oai2</request>
    <GetRecord>
        <record>
            <header>
                <identifier>oai:www.mpi.nl:MPI338011</identifier>
                <datestamp>2011-09-17</datestamp>
                <setSpec>MPI0:MPI1:MPI2:MPI3:MPI337272:MPI337300:MPI337301:MPI337302:MPI337303:MPI337934</setSpec>
            </header>
            <metadata>
                <CMD CMDVersion="1.1" xmlns="http://www.clarin.eu/cmd/"                     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.clarin.eu/cmd/ http://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/profiles/clarin.eu:cr1:p_1271859438204/xsd">
                    <Header>
                        <MdCreator>imdi2clarin.xsl</MdCreator>
                        <MdCreationDate>2011-09-24</MdCreationDate>
                        <MdSelfLink>test-hdl:1839/00-0000-0000-0005-285B-7</MdSelfLink>
                        <MdProfile>clarin.eu:cr1:p_1271859438204</MdProfile>
                        <MdCollectionDisplayName>Acquisition</MdCollectionDisplayName>
                    </Header>
                    <Resources>
                        <ResourceProxyList>
                            <ResourceProxy id="d170856e290">
                                <ResourceType mimetype="video/x-mpeg2">Resource</ResourceType>
                                <ResourceRef>hdl:1839/00-0000-0000-0008-A711-2</ResourceRef>
                            </ResourceProxy>
                            <ResourceProxy id="d170856e352">
                                <ResourceType mimetype="video/x-mpeg1">Resource</ResourceType>
                                <ResourceRef>hdl:1839/00-0000-0000-0005-285C-8</ResourceRef>
                            </ResourceProxy>
                            <ResourceProxy id="d170856e411">
                                <ResourceType mimetype="audio/x-wav">Resource</ResourceType>
                                <ResourceRef>hdl:1839/00-0000-0000-0005-285D-0</ResourceRef>
                            </ResourceProxy>
                        </ResourceProxyList>
                        <JournalFileProxyList/>
                        <ResourceRelationList/>
                    </Resources>
                    <Components>
                        <Session>
                            <Name>C04_7c</Name>
                            <Title>C04_7c</Title>
                            <Date>2005-03-03</Date>
                            <descriptions>
                                <Description LanguageId="">focus child plays with family and friends, part of longitudinal corpus</Description>
                            </descriptions>
                            <MDGroup>
                                <Location>
                                    <Continent>Australia</Continent>
                                    <Country>Australia</Country>
                                    <Region/>
                                    <Address/>
                                </Location>
                                <Project>
                                    <Name>Language contact and child bilingual acquisition</Name>
                                    <Title/>
                                    <Id/>
                                    <Contact>
                                    <Name/>
                                    <Address/>
                                    <Email/>
                                    <Organisation/>
                                    </Contact>
                                    <descriptions>
                                    <Description LanguageId="">Longitudinal and crossectional study of language contact and change. The study documents the emergence of a new mixed language, LightWarlpiri, and the forces that lead to its development.  Light Warlpiri is a systematic combination of elements from Warlpiri and Aboriginal English and Kriol.  The study examines variation in adult Warlpiri across age coherts; and similarities and differences between Warlpiri and Light Warlpiri in both adults and children. It also examines children's acquisition of comprehension and production in both Warlpiri and Light Warlpiri.</Description>
                                    </descriptions>
                                </Project>
                                <Keys>
                                    <Key Name="Community">Lajamanu Community</Key>
                                </Keys>
                                <Content>
                                    <Genre>Unspecified</Genre>
                                    <SubGenre>Unspecified</SubGenre>
                                    <Task>spontaneous</Task>
                                    <Modalities>speech</Modalities>
                                    <CommunicationContext>
                                    <Interactivity>interactive</Interactivity>
                                    <PlanningType>spontaneous</PlanningType>
                                    <Involvement>no-observer</Involvement>
                                    <SocialContext>Unspecified</SocialContext>
                                    <EventStructure>Unspecified</EventStructure>
                                    <Channel>Unspecified</Channel>
                                    </CommunicationContext>
                                    <Content_Languages>
                                    <descriptions>
                                    <Description LanguageId=""/>
                                    </descriptions>
                                    <Content_Language>
                                    <Id>RFC1766:x-sil-WBP</Id>
                                    <Name>Warlpiri</Name>
                                    <descriptions>
                                    <Description LanguageId=""/>
                                    </descriptions>
                                    </Content_Language>
                                    </Content_Languages>
                                    <Keys>
                                    <Key Name="IMDI__1_9.Interactional"/>
                                    <Key Name="IMDI__1_9.Discursive"/>
                                    <Key Name="IMDI__1_9.Interactional"/>
                                    <Key Name="Key-Dialect"/>
                                    </Keys>
                                    <descriptions>
                                    <Description LanguageId="ISO639:eng">This file was generated from an IMDI 1.9 file and transformed to IMDI 3.0. The substructure of Genre is replaced by two elements named "Genre" and "SubGenre". The original content of Genre substructure was: Interactional = '', Discursive = '', Performance = ''. These values have been added as Keys to the Content information. </Description>
                                    <Description LanguageId="">spontaneous</Description>
                                    </descriptions>
                                </Content>
                                <Actors>
                                    <Actor>
                                    <Role>Unspecified</Role>
                                    <Name>C53</Name>
                                    <FullName/>
                                    <Code/>
                                    <FamilySocialRole>Unspecified</FamilySocialRole>
                                    <EthnicGroup/>
                                    <Age>Unspecified</Age>
                                    <BirthDate>Unspecified</BirthDate>
                                    <Sex>Unspecified</Sex>
                                    <Education/>
                                    <Anonymized>Unspecified</Anonymized>
                                    <Contact>
                                    <Name/>
                                    <Address/>
                                    <Email/>
                                    <Organisation/>
                                    </Contact>
                                    <Keys/>
                                    <descriptions>
                                    <Description LanguageId=""/>
                                    </descriptions>
                                    <Actor_Languages>
                                    <descriptions>
                                    <Description LanguageId=""/>
                                    </descriptions>
                                    </Actor_Languages>
                                    </Actor>
                                    <Actor>
                                    <Role>Collector</Role>
                                    <Name>Carmel O'Shannessy</Name>
                                    <FullName>Carmel O'Shannessy</FullName>
                                    <Code>Unspecified</Code>
                                    <FamilySocialRole>Unspecified</FamilySocialRole>
                                    <EthnicGroup/>
                                    <Age>Unspecified</Age>
                                    <BirthDate>Unspecified</BirthDate>
                                    <Sex>Unspecified</Sex>
                                    <Education/>
                                    <Anonymized>false</Anonymized>
                                    <Contact>
                                    <Name>corpus.manager_at_mpi.nl</Name>
                                    <Address/>
                                    <Email>corpus.manager_at_mpi.nl</Email>
                                    <Organisation/>
                                    </Contact>
                                    <Keys/>
                                    <descriptions>
                                    <Description LanguageId="">Description</Description>
                                    </descriptions>
                                    <Actor_Languages/>
                                    </Actor>
                                    <Actor>
                                    <Role>Annotator</Role>
                                    <Name>Carmel O'Shannessy</Name>
                                    <FullName>Carmel O'Shannessy</FullName>
                                    <Code>Unspecified</Code>
                                    <FamilySocialRole>Unspecified</FamilySocialRole>
                                    <EthnicGroup>Unspecified</EthnicGroup>
                                    <Age>Unspecified</Age>
                                    <BirthDate>Unspecified</BirthDate>
                                    <Sex>Unspecified</Sex>
                                    <Education>Unspecified</Education>
                                    <Anonymized>false</Anonymized>
                                    <Contact>
                                    <Name/>
                                    <Address/>
                                    <Email/>
                                    <Organisation/>
                                    </Contact>
                                    <Keys/>
                                    <descriptions>
                                    <Description LanguageId=""/>
                                    </descriptions>
                                    <Actor_Languages/>
                                    </Actor>
                                </Actors>
                            </MDGroup>
                            <Resources>
                                <MediaFile ref="d170856e290">
                                    <ResourceLink>../Media/C04_7c.mpeg</ResourceLink>
                                    <Type>video</Type>
                                    <Format>video/x-mpeg2</Format>
                                    <Size/>
                                    <Quality>Unspecified</Quality>
                                    <RecordingConditions/>
                                    <TimePosition>
                                    <Start>Unspecified</Start>
                                    <End>Unspecified</End>
                                    </TimePosition>
                                    <Access>
                                    <Availability/>
                                    <Date>Unspecified</Date>
                                    <Owner/>
                                    <Publisher/>
                                    <Contact>
                                    <Name/>
                                    <Address/>
                                    <Email/>
                                    <Organisation/>
                                    </Contact>
                                    <descriptions>
                                    <Description LanguageId=""/>
                                    </descriptions>
                                    </Access>
                                    <descriptions>
                                    <Description LanguageId=""/>
                                    </descriptions>
                                    <Keys>
                                    <Key Name=""/>
                                    </Keys>
                                </MediaFile>
                                <MediaFile ref="d170856e352">
                                    <ResourceLink>../Media/C04_7c.mpg</ResourceLink>
                                    <Type>video</Type>
                                    <Format>video/x-mpeg1</Format>
                                    <Size/>
                                    <Quality>1</Quality>
                                    <RecordingConditions/>
                                    <TimePosition>
                                    <Start>Unspecified</Start>
                                    <End>Unspecified</End>
                                    </TimePosition>
                                    <Access>
                                    <Availability/>
                                    <Date>Unspecified</Date>
                                    <Owner/>
                                    <Publisher/>
                                    <Contact>
                                    <Name/>
                                    <Address/>
                                    <Email/>
                                    <Organisation/>
                                    </Contact>
                                    <descriptions>
                                    <Description LanguageId=""/>
                                    </descriptions>
                                    </Access>
                                    <descriptions>
                                    <Description LanguageId=""/>
                                    </descriptions>
                                    <Keys/>
                                </MediaFile>
                                <MediaFile ref="d170856e411">
                                    <ResourceLink>../Media/C04_7c.wav</ResourceLink>
                                    <Type>audio</Type>
                                    <Format>audio/x-wav</Format>
                                    <Size/>
                                    <Quality>1</Quality>
                                    <RecordingConditions/>
                                    <TimePosition>
                                    <Start>Unspecified</Start>
                                    <End>Unspecified</End>
                                    </TimePosition>
                                    <Access>
                                    <Availability/>
                                    <Date>Unspecified</Date>
                                    <Owner/>
                                    <Publisher/>
                                    <Contact>
                                    <Name/>
                                    <Address/>
                                    <Email/>
                                    <Organisation/>
                                    </Contact>
                                    <descriptions>
                                    <Description LanguageId=""/>
                                    </descriptions>
                                    </Access>
                                    <descriptions>
                                    <Description LanguageId=""/>
                                    </descriptions>
                                    <Keys/>
                                </MediaFile>
                                <Source>
                                    <Id>ACOCVDP03Mar0501</Id>
                                    <Format>DV</Format>
                                    <Quality>1</Quality>
                                    <TimePosition>
                                    <Start>00:19:05:00</Start>
                                    <End>00:28:58:00</End>
                                    </TimePosition>
                                    <Access>
                                    <Availability/>
                                    <Date>Unspecified</Date>
                                    <Owner/>
                                    <Publisher/>
                                    <Contact>
                                    <Name/>
                                    <Address/>
                                    <Email/>
                                    <Organisation/>
                                    </Contact>
                                    <descriptions>
                                    <Description LanguageId=""/>
                                    </descriptions>
                                    </Access>
                                    <descriptions>
                                    <Description LanguageId=""/>
                                    </descriptions>
                                    <Keys/>
                                </Source>
                            </Resources>
                            <References/>
                        </Session>
                    </Components>
                </CMD>
            </metadata>
        </record>
    </GetRecord>
</OAI-PMH>
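
To show how a client might read such a response, here is a minimal sketch that pulls the profile ID and the resource references out of a GetRecord response using Python's standard xml.etree.ElementTree. The function name and the trimmed SAMPLE document are illustrative only; the element names and namespaces are the ones used in the response above.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as they appear in the response above.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "cmd": "http://www.clarin.eu/cmd/",
}

# Heavily trimmed version of the GetRecord response, for illustration only.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord><record>
    <header><identifier>oai:www.mpi.nl:MPI338011</identifier></header>
    <metadata>
      <CMD xmlns="http://www.clarin.eu/cmd/" CMDVersion="1.1">
        <Header><MdProfile>clarin.eu:cr1:p_1271859438204</MdProfile></Header>
        <Resources><ResourceProxyList>
          <ResourceProxy id="d170856e290">
            <ResourceType mimetype="video/x-mpeg2">Resource</ResourceType>
            <ResourceRef>hdl:1839/00-0000-0000-0008-A711-2</ResourceRef>
          </ResourceProxy>
        </ResourceProxyList></Resources>
      </CMD>
    </metadata>
  </record></GetRecord>
</OAI-PMH>"""


def summarize_cmdi_record(response_xml: str) -> dict:
    """Extract the profile ID and all ResourceRef values from a GetRecord response."""
    root = ET.fromstring(response_xml)
    cmd = root.find("oai:GetRecord/oai:record/oai:metadata/cmd:CMD", NS)
    if cmd is None:
        raise ValueError("no CMD payload found in GetRecord response")
    profile = cmd.findtext("cmd:Header/cmd:MdProfile", default="", namespaces=NS).strip()
    ref_path = "cmd:Resources/cmd:ResourceProxyList/cmd:ResourceProxy/cmd:ResourceRef"
    refs = [el.text.strip() for el in cmd.findall(ref_path, NS) if el.text]
    return {"profile": profile, "resource_refs": refs}


if __name__ == "__main__":
    print(summarize_cmdi_record(SAMPLE))
```

Note that the lookup is namespace-qualified throughout: the OAI-PMH envelope and the CMD payload live in different default namespaces, so unqualified element names would not match.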

When you use multiple profiles, it gets a bit more complex. The shortcomings of CMDI in this domain have been identified and a neat solution is underway for CMDI 1.2. In the meantime, the easiest solution is to provide multiple CMDI-related metadata prefixes, one for each profile. However, they should all have http://www.clarin.eu/cmd/ as the metadataNamespace. This way the OAI harvester can identify which metadata prefixes deliver CMDI. See http://metalb.csc.fi/cgi-bin/que?verb=ListMetadataFormats for an example of this approach.
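
From the harvester's side, picking out the CMDI prefixes could look roughly like the sketch below. It is not the code of any actual harvester; the prefix names cmdi_session and cmdi_other, the example.org schema URL, and the function name are hypothetical and only serve to show two prefixes sharing the CMDI namespace.

```python
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
CMDI_NS = "http://www.clarin.eu/cmd/"

# Hypothetical ListMetadataFormats response with one prefix per profile.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat>
      <metadataPrefix>oai_dc</metadataPrefix>
      <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </metadataFormat>
    <metadataFormat>
      <metadataPrefix>cmdi_session</metadataPrefix>
      <schema>http://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/profiles/clarin.eu:cr1:p_1271859438204/xsd</schema>
      <metadataNamespace>http://www.clarin.eu/cmd/</metadataNamespace>
    </metadataFormat>
    <metadataFormat>
      <metadataPrefix>cmdi_other</metadataPrefix>
      <schema>http://example.org/hypothetical-profile.xsd</schema>
      <metadataNamespace>http://www.clarin.eu/cmd/</metadataNamespace>
    </metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>"""


def cmdi_prefixes(list_formats_xml: str) -> list[str]:
    """Return every metadataPrefix whose metadataNamespace is the CMDI namespace."""
    root = ET.fromstring(list_formats_xml)
    prefixes = []
    for fmt in root.iter(f"{{{OAI_NS}}}metadataFormat"):
        namespace = fmt.findtext(f"{{{OAI_NS}}}metadataNamespace", default="").strip()
        prefix = fmt.findtext(f"{{{OAI_NS}}}metadataPrefix", default="").strip()
        if namespace == CMDI_NS and prefix:
            prefixes.append(prefix)
    return prefixes


if __name__ == "__main__":
    print(cmdi_prefixes(SAMPLE))
```

The selection relies only on the metadataNamespace value, which is why all CMDI prefixes need to declare exactly http://www.clarin.eu/cmd/.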

http://someurl?verb=ListRecords&metadataPrefix=oai_dc

Or, in human words: how should I map my CMDI descriptions to the Dublin Core format that is compulsory when using OAI-PMH? There is no single answer to this; you probably know best. It's a good idea to use common sense and, at a minimum, to refer to the described resources with a dc:identifier element.

<OAI-PMH xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
<responseDate>2009-05-25T11:46:14Z</responseDate>
<request verb="ListRecords" metadataPrefix="oai_dc">http://corpus1/ds/oai2/oai2</request>
<ListRecords>
<record><header>...</header>
<metadata>

<oai_dc:dc xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/
http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:identifier>oai:www.mpi.nl:1839/00-0000-0000-0009-4882-E</dc:identifier>
<dc:identifier>http://bach.arts.kuleuven.be/dicovalence/latest_utf8.txt</dc:identifier>
<dc:title>Dicovalence</dc:title>
<dc:publisher>Rik van Gijn</dc:publisher>
<dc:publisher>Max Planck Institute for Psycholinguistics</dc:publisher>
<dc:subject xsi:type="olac:language" olac:code="yuz">Yurakaré</dc:subject>
<dc:subject xsi:type="olac:language" olac:code="spa">Spanish</dc:subject>...
</oai_dc:dc>
</metadata></record>...
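
One possible minimal mapping is sketched below: it emits the OAI identifier and every ResourceRef as dc:identifier, plus a dc:title. This is an assumption-laden illustration, not a recommended crosswalk. In particular, taking the title from Session/Title only makes sense for a session-style profile like the one shown earlier, and the function name and SAMPLE document are invented for the example.

```python
import xml.etree.ElementTree as ET

CMD_NS = "http://www.clarin.eu/cmd/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Use the conventional prefixes when the result is serialized.
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)

# Trimmed CMD payload based on the record shown earlier, for illustration only.
SAMPLE = """<CMD xmlns="http://www.clarin.eu/cmd/" CMDVersion="1.1">
  <Resources><ResourceProxyList>
    <ResourceProxy id="d170856e411">
      <ResourceType mimetype="audio/x-wav">Resource</ResourceType>
      <ResourceRef>hdl:1839/00-0000-0000-0005-285D-0</ResourceRef>
    </ResourceProxy>
  </ResourceProxyList></Resources>
  <Components><Session><Title>C04_7c</Title></Session></Components>
</CMD>"""


def cmdi_to_oai_dc(cmd_xml: str, oai_identifier: str) -> ET.Element:
    """Build a minimal oai_dc:dc element from a CMD document (assumed mapping)."""
    cmd = ET.fromstring(cmd_xml)
    dc_root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    # The record's own identifier comes first.
    ET.SubElement(dc_root, f"{{{DC_NS}}}identifier").text = oai_identifier
    # Then one dc:identifier per described resource.
    for ref in cmd.iter(f"{{{CMD_NS}}}ResourceRef"):
        if ref.text and ref.text.strip():
            ET.SubElement(dc_root, f"{{{DC_NS}}}identifier").text = ref.text.strip()
    # Assumption: session-style profiles carry a usable title in Session/Title.
    title = cmd.findtext(f".//{{{CMD_NS}}}Session/{{{CMD_NS}}}Title")
    if title and title.strip():
        ET.SubElement(dc_root, f"{{{DC_NS}}}title").text = title.strip()
    return dc_root


if __name__ == "__main__":
    dc = cmdi_to_oai_dc(SAMPLE, "oai:www.mpi.nl:MPI338011")
    print(ET.tostring(dc, encoding="unicode"))
```

Other Dublin Core elements (publisher, subject, and so on) depend on the profile and are left out here on purpose.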