CLARIN adheres to the following principles:
- Open standards are preferred over proprietary standards
- Formats and protocols should be:
  - Proven (being used in practice)
- Text-based formats are (where possible) preferred over binary formats
- When digitising an analogue signal, either no compression or lossless compression is recommended.
- Ongoing work by the CLARIN Standards Committee
- FAQ about recommended formats and standards
- Document: Standards for LRT
Several CLARIN centres have published information on which formats they recommend for language research data deposits:
- TALAR (EKUT)
- TLA (MPI-PL)
The CLARIN Standards Information System provides information on standards used in CLARIN and on formats accepted for data deposition at particular centres.