The preservation metadata expressed in METS and PREMIS in the SPAR system applies to any kind of digital document. Other information vital to preservation is format-specific. For example:
- For text formats, the character encoding, the XML structure (if relevant)…
- For image formats, the resolution, the color profile, the color depth…
- For audio and video formats, the bit rate, the codec, the duration…
This information must be expressed in a dedicated metadata format. In general, it is extracted by characterization tools and delivered as XML. For each type of data format, a thorough preliminary study determined which metadata format should be used. Three major criteria were taken into account:
- Interoperability and durability: the format must be a standard.
- Fine-grained structure: the format must be able to express the required information precisely.
- Community use: the format must be widely adopted and actively maintained.
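As an illustration of how format-specific properties delivered as XML by a characterization tool might be read, here is a minimal Python sketch. The XML layout, the element and attribute names, the file name and the function `extract_properties` are all hypothetical: they do not follow MIX, textMD, MPEG-7 or the actual output of any tool used in SPAR.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified characterization output. Element and attribute
# names are illustrative only; they do not follow the MIX schema or the
# real output of JHOVE, MediaInfo or any other tool.
SAMPLE_XML = """\
<characterization>
  <file name="page_0001.tif" type="image">
    <property name="resolution" unit="dpi">400</property>
    <property name="colorDepth" unit="bits">24</property>
    <property name="colorProfile">sRGB</property>
  </file>
</characterization>
"""


def extract_properties(xml_text):
    """Return {file name: {property name: value}} from the sample layout."""
    root = ET.fromstring(xml_text)
    result = {}
    for file_el in root.findall("file"):
        props = {
            prop.get("name"): (prop.text or "").strip()
            for prop in file_el.findall("property")
        }
        result[file_el.get("name")] = props
    return result


properties = extract_properties(SAMPLE_XML)
```

A real metadata format would use namespaced elements defined by its schema; the point here is only that format-specific values become machine-readable once they are expressed in XML.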
Technical metadata formats used in SPAR
These metadata formats and their associated tools are continuously monitored and may change as technology evolves.
Technical metadata formats used in SPAR (end of 2015)
|File type ||Data format ||Metadata format ||Validation and characterization tool |
|Image ||TIFF ||MIX version 1.0 ||JHOVE version 1.11, Jpylyzer version 1.10 |
|Text ||XML ||textMD version 3.0 ||JHOVE version 1.11 |
|Sound ||WAV ||MPEG-7 version 2.0 ||MediaInfo version 0.7.35 |
|Video ||MPEG-2 ||MPEG-7 version 2.0 ||MediaInfo version 0.7.35 |
|Web archives ||ARC ||containerMD version 1.0 ||JWAT Tools |
|Ebooks ||EPUB ||XMP ||Epubcheck version 4.0 |
|Multiple ||PDF ||XMP ||Tika version 1.6 |
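
The table above can be read as a lookup from data format to metadata format and tools. The following Python sketch encodes it that way. The names `TechMDRule`, `RULES` and `rule_for` are hypothetical, and nothing here reflects how SPAR is actually implemented; the values are copied from the table as listed at the end of 2015.

```python
from typing import NamedTuple, Optional, Tuple


class TechMDRule(NamedTuple):
    """One row of the table: hypothetical structure, for illustration only."""
    file_type: str
    metadata_format: str
    tools: Tuple[str, ...]


# Keys are data formats; values are copied from the table (end of 2015).
# Both tools of the Image row are kept together, as the table lists them.
RULES = {
    "TIFF": TechMDRule("Image", "MIX version 1.0",
                       ("JHOVE version 1.11", "Jpylyzer version 1.10")),
    "XML": TechMDRule("Text", "textMD version 3.0", ("JHOVE version 1.11",)),
    "WAV": TechMDRule("Sound", "MPEG-7 version 2.0",
                      ("MediaInfo version 0.7.35",)),
    "MPEG-2": TechMDRule("Video", "MPEG-7 version 2.0",
                         ("MediaInfo version 0.7.35",)),
    "ARC": TechMDRule("Web archives", "containerMD version 1.0",
                      ("JWAT Tools",)),
    "EPUB": TechMDRule("Ebooks", "XMP", ("Epubcheck version 4.0",)),
    "PDF": TechMDRule("Multiple", "XMP", ("Tika version 1.6",)),
}


def rule_for(data_format: str) -> Optional[TechMDRule]:
    """Return the rule for a data format, or None if the table has no row."""
    return RULES.get(data_format.upper())
```

For example, `rule_for("wav")` returns the Sound row, while a format absent from the table, such as `rule_for("DOCX")`, returns `None`. Keeping the mapping as data rather than logic would make it easy to revise when formats or tool versions change.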