Technical metadata for digital preservation


Preservation metadata expressed in METS and PREMIS in the SPAR system applies to any kind of digital document. Other information vital to preservation is format-specific. For example:
  • For text formats, the character encoding, the XML structure (if relevant)…
  • For image formats, the resolution, the color profile, the color depth…
  • For audio and video formats, the bit rate, the codec, the duration…
This information must be expressed in a dedicated metadata format. In general, it is extracted by characterization tools and delivered as XML. For each type of data format, a thorough preliminary study determined which metadata format should be used. Three major criteria were taken into account:
  • Interoperability and durability: the format must be a standard.
  • Fine-grained structure: the format must be able to express with precision the information needed.
  • Community use: the format must be widely adopted and actively maintained.
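As noted above, technical metadata is usually extracted by a characterization tool and delivered as XML. The following is a minimal sketch of how such output might be read. The XML fragment, its element names, and the function extract_image_properties are hypothetical and simplified for illustration; real output (for example MIX produced by JHOVE) is namespaced and much richer.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified characterization output for an image file.
# These element names are assumptions and do not follow the MIX schema.
SAMPLE_XML = """
<characterization>
  <image>
    <width>2480</width>
    <height>3508</height>
    <bitsPerSample>8</bitsPerSample>
    <colorProfile>sRGB</colorProfile>
  </image>
</characterization>
"""


def extract_image_properties(xml_text):
    """Return the child elements of <image> as a tag -> text dictionary."""
    root = ET.fromstring(xml_text.strip())
    image = root.find("image")
    if image is None:
        raise ValueError("no <image> element found")
    # Empty elements have text None, so fall back to an empty string.
    return {child.tag: (child.text or "").strip() for child in image}


if __name__ == "__main__":
    for name, value in extract_image_properties(SAMPLE_XML).items():
        print(f"{name}: {value}")
```

With a real metadata format, the same approach would apply, but the lookups would need the namespace and element paths defined by that format's schema.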

Technical metadata formats used in SPAR

These metadata formats and their associated tools are monitored continuously and may change as technology evolves.

Technical metadata formats used in SPAR (end of 2015)
File type | Data format | Metadata format | Validation and characterization tool
Image | TIFF | MIX version 1.0 | JHOVE version 1.11; Jpylyzer version 1.10
Text | XML | textMD version 3.0 | JHOVE version 1.11
Sound | WAV | MPEG-7 version 2.0 | MediaInfo version 0.7.35
Video | MPEG-2 | MPEG-7 version 2.0 | MediaInfo version 0.7.35
Web archives | ARC | containerMD version 1.0 | JWAT Tools
Ebooks | EPUB | XMP | Epubcheck version 4.0
Multiple | PDF | XMP | Tika version 1.6
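
The table above can be read as a lookup from data format to metadata format and tools. The following sketch encodes it that way; the names FormatPolicy, SPAR_POLICIES_2015, and policy_for are hypothetical and chosen only for illustration, and the entries simply restate the end-of-2015 table rather than describe how SPAR itself stores this information.

```python
from typing import NamedTuple, Tuple


class FormatPolicy(NamedTuple):
    """One row of the table: data formats, metadata format, tools."""
    data_formats: Tuple[str, ...]
    metadata_format: str
    tools: Tuple[str, ...]


# Keys are the file types from the table (state at the end of 2015).
SPAR_POLICIES_2015 = {
    "Image": FormatPolicy(("TIFF",), "MIX 1.0", ("JHOVE 1.11", "Jpylyzer 1.10")),
    "Text": FormatPolicy(("XML",), "textMD 3.0", ("JHOVE 1.11",)),
    "Sound": FormatPolicy(("WAV",), "MPEG-7 2.0", ("MediaInfo 0.7.35",)),
    "Video": FormatPolicy(("MPEG-2",), "MPEG-7 2.0", ("MediaInfo 0.7.35",)),
    "Web archives": FormatPolicy(("ARC",), "containerMD 1.0", ("JWAT Tools",)),
    "Ebooks": FormatPolicy(("EPUB",), "XMP", ("Epubcheck 4.0",)),
    "Multiple": FormatPolicy(("PDF",), "XMP", ("Tika 1.6",)),
}


def policy_for(data_format):
    """Return (file_type, policy) for a data format name, case-insensitively."""
    wanted = data_format.upper()
    for file_type, policy in SPAR_POLICIES_2015.items():
        if wanted in policy.data_formats:
            return file_type, policy
    raise KeyError(f"no policy recorded for data format {data_format!r}")


if __name__ == "__main__":
    file_type, policy = policy_for("wav")
    print(file_type, policy.metadata_format, ", ".join(policy.tools))
```

Keeping the mapping in one structure makes it straightforward to revise when a metadata format or tool version changes.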

Wednesday, June 15, 2016