Frequently Asked Questions - Metadata in CLARIN: authoring and editing CMDI
There are several options for creating CMDI records "by hand" (as opposed to having CMDI generated from a primary source by a script, repository system or other software):
- An XML editor such as oXygen or a text editor or code editor with XML support, preferably with XML Schema awareness.
- In addition to oXygen, which has a commercial licence, there are also numerous free XML editors. See for example this comparison of XML editors.
- COMEDI is an online CMDI editor developed within CLARINO that works with any profile.
- Arbil, the metadata editor developed at the Max Planck Institute for Psycholinguistics, supports any CMDI profile and provides powerful table-based editing. Note that Arbil is no longer maintained.
Dedicated editors with a limited number of supported profiles:
In the oXygen XML editor, an easy way to get started is using the "Generate sample XML files" feature:
- Choose "Generate sample XML files" in the "Tools" menu.
- Insert the URL of the schema of the desired profile (see "How can I create an XSD (XML schema) from my CMDI profile?") into the URL field.
- Make sure to select CMD as the root element (possibly overriding oXygen's default suggestion). This should set the value of the "namespace" field to either
http://www.clarin.eu/cmd/ (CMDI 1.1) or
http://www.clarin.eu/cmd/1 (CMDI 1.2 or higher).
- Optionally set the desired default namespace (if you don't know what this does, you can skip this step).
- Review the options in the "Options" pane, in particular the checkboxes that determine whether optional fields should be instantiated or not.
- Click "OK" to generate a sample CMDI document.
- Use this document as a template to create your CMDI records.
- Use the autocomplete and suggestion functionality of oXygen's Text or Grid mode to further edit your document(s).
- Make sure to validate your document regularly, in particular before publishing your metadata.
- The official documentation provides more information about editing XML documents using oXygen.
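As a quick sanity check on a generated sample, the root element and namespace mentioned in the steps above can also be inspected outside the editor. The following is a minimal sketch using only the Python standard library; the function name `detect_cmdi_version` and the one-line sample record are illustrative assumptions, and the check is no substitute for validating against the profile schema.

```python
import xml.etree.ElementTree as ET

# Namespaces as listed in the steps above.
CMDI_NAMESPACES = {
    "http://www.clarin.eu/cmd/": "1.1",
    "http://www.clarin.eu/cmd/1": "1.2",
}


def detect_cmdi_version(xml_text):
    """Return '1.1' or '1.2' based on the root element, or None if it is not a CMD root.

    Hypothetical helper for illustration; it only looks at the root tag.
    """
    root = ET.fromstring(xml_text)
    # ElementTree represents a namespaced tag as '{namespace}localname'.
    if not root.tag.startswith("{"):
        return None
    namespace, _, local_name = root.tag[1:].partition("}")
    if local_name != "CMD":
        return None
    return CMDI_NAMESPACES.get(namespace)


# Made-up minimal record, just enough to carry a root element and namespace.
sample = '<CMD xmlns="http://www.clarin.eu/cmd/1" CMDVersion="1.2"/>'
print(detect_cmdi_version(sample))  # prints 1.2
```

A result of `None` suggests that the wrong root element was selected or that the namespace field was not set as described.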
- Download Arbil (2.6 or higher) and start it
- Go to Options > Templates & Profiles
- In "Clarin Profiles", select the profile(s) you want to use as the basis for a CMDI file and click on Close
- Right-click on Local corpus, choose Add and select the relevant profile (the CMDI profiles are marked with a CLARIN icon)
Arbil used to be an editor for IMDI only; the CMDI functionality was added later (from the beginning of 2010). As a result, the support for CMDI files was initially not as extensive as that for IMDI. However, since release 2.3 of Arbil, the support for CMDI has been significantly improved.
Some profiles (obvious test profiles and those not intended for manual metadata creation) have been excluded from the default profile list in Arbil (testing). You can see them by disabling the "only load profiles selected for manual editing" option in the Available Templates & Profiles dialog.
By default, if you create a new profile in the component registry, it will show up in Arbil.
Correct observation. The elements that are optional (= have a CardinalityMin of 0) are not shown by default. You need to add them explicitly. To do this, right-click on the CMDI file in the "local corpus" panel and select Add.
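To find out in advance which elements are optional, you can inspect the profile's XSD. The sketch below assumes that a CardinalityMin of 0 appears as `minOccurs="0"` in the generated schema; the function name and the small schema fragment are made up for illustration and are not taken from a real profile.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"


def optional_element_names(xsd_text):
    """List the names of all xs:element declarations with minOccurs="0".

    Hypothetical helper; assumes CardinalityMin is expressed as minOccurs.
    """
    root = ET.fromstring(xsd_text)
    return [
        el.get("name")
        for el in root.iter(XS + "element")
        if el.get("minOccurs") == "0" and el.get("name")
    ]


# Made-up schema fragment for illustration only.
example_xsd = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Example">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Title" minOccurs="1" maxOccurs="1" type="xs:string"/>
        <xs:element name="Description" minOccurs="0" maxOccurs="unbounded" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

print(optional_element_names(example_xsd))  # prints ['Description']
```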
Right-click on the file in the "local corpus" panel and select Edit all Metadata.
For CMDI, some additional icons are used and some icons have a slightly different meaning:
|link to an external file (ResourceProxy)|
|grouping icon for repeatable elements (more information...)|
|root node of a CMDI file|
When a component can occur multiple times (= CardinalityMax higher than 1), Arbil automatically groups all occurrences of these components in the CMDI file. You can recognize these by the following properties:
- they have the grey club icon
- the text is shown in grey
- after the node a number indicates how many times the component occurs
E.g. in this example CMDI file there is a fragment that looks like:
    <ISO639>
        <iso-639-3-code>cat</iso-639-3-code>
    </ISO639>
    <ISO639>
        <iso-639-3-code>spa</iso-639-3-code>
    </ISO639>
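The occurrence count that Arbil shows next to a grouped node can be approximated by counting sibling elements with the same name. This is only a rough sketch: it counts occurrences in the record itself, whereas Arbil decides what to group from the CardinalityMax in the profile. The `wrapper` element and the function name are assumptions added so that the fragment can be parsed as a single document.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def count_repeated_children(parent):
    """Return {tag: count} for child elements that occur more than once."""
    counts = Counter(child.tag for child in parent)
    return {tag: n for tag, n in counts.items() if n > 1}


# The fragment from the example above, wrapped in a hypothetical parent
# element because an XML document needs a single root.
fragment = """
<wrapper>
  <ISO639><iso-639-3-code>cat</iso-639-3-code></ISO639>
  <ISO639><iso-639-3-code>spa</iso-639-3-code></ISO639>
</wrapper>
"""

print(count_repeated_children(ET.fromstring(fragment)))  # prints {'ISO639': 2}
```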