National Library of France

Main dates of the project

  • 2005: acquisition of the infrastructure
  • June 2007: the BnF launched a tender for the realization of the system and changed direction in favour of open source software to ensure maximum independence
  • 2008-2009: specification, development and testing of the core features of the SPAR software and of the first collection to be ingested: preservation digitization of printed and manuscript documents and still images
  • Spring 2009: the first stage of the SPAR project goes into operation for the digitization of printed and manuscript documents and still images

See also

Digital preservation at the National Library of France [PDF file, 120 KB, 10 pages]
by Emmanuelle Bermes, Isabelle Dussert Carbone, Thomas Ledoux, Christian Lupovici. IFLA Congress 2008: "Libraries without borders: Navigating towards global understanding"

Preservation of digital material: the SPAR project

Just like manuscripts, printed material and photographs, digital objects must be stored securely and permanently. This storage relies on a robust, high-performance framework: SPAR, the Scalable Preservation and Archiving Repository.

Realization

The tender for the realization

Following the tender for the acquisition of the infrastructure, the National Library of France launched in June 2007 a tender for the realization of the SPAR system, which was won by Atos Origin. The preservation system to be built must comply with the OAIS model, offer a high level of modularity to guarantee the longevity of the system, deliver short response times, and cover all the channels through which digital material is produced:

  • preservation digitization,
  • reproduction digitization,
  • automated legal deposit,
  • negotiated legal deposit,
  • records management of the BnF,
  • deposit or third party archiving,
  • acquisitions of digital material.

Beyond the common core of the preservation system, the project is being built iteratively within the National Library of France, channel by channel. The first stage covers the whole Digital Library; it is a subset of the overall project and is deployed first.

The modules of SPAR

The SPAR system is structured into independent modules. This makes the components long-lived (each one can easily be replaced) and allows them to be distributed according to the performance required.

diagram of the modules of SPAR

The modularity of SPAR is directly based on the OAIS model.

The Ingest module

This module receives the data to ingest (SIP) from the producers, according to the ingest policy negotiated beforehand with the administration of the Archive. Once the ingest package has been validated by the various checks, the data is packaged for archival (AIP) and handed over to storage.
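The flow from SIP to AIP can be illustrated with a minimal sketch. It assumes a single hypothetical check (each file must match a SHA-256 checksum declared by the producer); the class names `SIP` and `AIP`, their fields, and the `ingest` function are illustrative and do not reflect the actual SPAR implementation.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class SIP:
    """Submission package: file name -> (content, checksum declared by the producer)."""
    producer: str
    files: dict[str, tuple[bytes, str]]


@dataclass
class AIP:
    """Archival package handed to storage once the SIP has passed the checks."""
    producer: str
    files: dict[str, bytes]
    checksums: dict[str, str] = field(default_factory=dict)


def ingest(sip: SIP) -> AIP:
    """Validate a SIP and repackage it as an AIP (hypothetical checks only)."""
    if not sip.files:
        raise ValueError("empty SIP")
    checksums = {}
    for name, (content, declared) in sip.files.items():
        # Assumed control: the content must match the declared SHA-256 checksum.
        actual = hashlib.sha256(content).hexdigest()
        if actual != declared:
            raise ValueError(f"checksum mismatch for {name}")
        checksums[name] = actual
    contents = {name: content for name, (content, _) in sip.files.items()}
    return AIP(sip.producer, contents, checksums)
```

A package that fails any check is rejected as a whole, so storage only ever receives validated AIPs.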

The Storage module

This module handles the operations related to storing the digital files ("data-objects"), which it receives as packages called AIPs. For the other modules, it acts as an abstraction layer over the storage mechanisms and systems. Above all, it guarantees the integrity of the data-objects and matches the storage hierarchy to the requirements of the producers and the user communities in terms of performance and availability (levels of service).
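As a rough illustration of the two roles described above (abstraction layer and integrity guarantee), the sketch below hides a storage backend behind `put` and `get` and verifies a checksum on every read. The `Storage` class, the in-memory dictionary standing in for real media, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


class Storage:
    """Toy abstraction layer: callers never see where the bytes are kept."""

    def __init__(self):
        self._backend = {}  # stand-in for the real storage systems
        self._fixity = {}   # identifier -> SHA-256 recorded at deposit time

    def put(self, identifier: str, content: bytes) -> None:
        self._backend[identifier] = content
        self._fixity[identifier] = hashlib.sha256(content).hexdigest()

    def get(self, identifier: str) -> bytes:
        content = self._backend[identifier]
        # Recompute the checksum on every read to detect silent corruption.
        if hashlib.sha256(content).hexdigest() != self._fixity[identifier]:
            raise IOError(f"integrity check failed for {identifier}")
        return content
```

Because the other modules only call `put` and `get`, the backend could be swapped or tiered by level of service without changing them.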

The Data Management module

This module provides the functions and services for enriching, preserving and accessing the Descriptive Information (which identifies and describes the collections of the Archive) and the administrative data needed to manage the Archive.

The Rights Management module

This module manages all the information about the rights attached to a given piece of data to be disseminated. It is fed by rights metadata and by decision trees supplied by the SOLON system. It "plays" those decision trees according to the targeted users in order to attach an appropriate license to the outgoing data.
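To give an idea of what "playing" a decision tree might look like, here is a small sketch. The tree structure, the questions (`public_domain`, `user_location`) and the license names are invented for the example; the actual format of the trees supplied by SOLON is not described here.

```python
def play(tree, context):
    """Walk a decision tree until a leaf (a license name) is reached."""
    node = tree
    while isinstance(node, dict):
        answer = context[node["question"]]  # fact about the data or the user
        node = node["branches"][answer]     # follow the matching branch
    return node


# Hypothetical tree: inner nodes ask a question, leaves name a license.
TREE = {
    "question": "public_domain",
    "branches": {
        True: "free-access",
        False: {
            "question": "user_location",
            "branches": {"on-site": "reading-room-only", "remote": "no-access"},
        },
    },
}
```

For instance, `play(TREE, {"public_domain": False, "user_location": "on-site"})` follows the second branch and yields the `reading-room-only` license.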

The Access module

This module is in charge of supplying data to the user community, disseminated as packages called DIPs. To achieve this, it offers search functions, report queries, and mechanisms to generate the archived data and transform it into a displayable form.

The Administration module

This module organizes all the archival procedures and monitors their smooth running. To achieve this, it interacts with the producers, the users and all the other modules, whose work it orchestrates. It ensures that all the functions of the Archive are correctly sequenced and can report information back.

The Preservation module

This module allows the definition and monitoring of the formats and standards used by the SPAR system. It is fed by information from the business intelligence tool and from the format registry, so that it can monitor changes in formats and plan changes to storage or policies.

The information model for the archival packages

The archival packages (AIP) are made of different kinds of information, classified in the system according to their content and role in the functioning of the Archive.

In SPAR, the following main concepts are distinguished:

  • data-objects, which correspond to the digital files to be preserved
  • metadata, which correspond to the information needed to understand the data-objects (in particular, the Representation Information and the Preservation Description Information)
  • packaging, which describes the links, real or logical, between the different components stored in a package on a medium. In SPAR, it is expressed as a METS manifest
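The packaging idea can be sketched with a deliberately simplified manifest that lists the files of a package. It uses the METS namespace and the `fileSec`, `fileGrp`, `file` and `FLocat` elements, but it is not a complete or valid METS document (it has no structural map, for example), and the `manifest` function and file paths are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"


def manifest(files: dict[str, str]) -> ET.Element:
    """Build a simplified manifest: file identifier -> relative path in the package."""
    root = ET.Element(f"{{{METS}}}mets")
    file_sec = ET.SubElement(root, f"{{{METS}}}fileSec")
    group = ET.SubElement(file_sec, f"{{{METS}}}fileGrp")
    for file_id, path in files.items():
        entry = ET.SubElement(group, f"{{{METS}}}file", {"ID": file_id})
        # The locator links the logical file entry to its physical location.
        ET.SubElement(entry, f"{{{METS}}}FLocat",
                      {"LOCTYPE": "URL", f"{{{XLINK}}}href": path})
    return root
```

Calling `ET.tostring(manifest({"f1": "master/0001.tif"}))` serializes the manifest so that it can be stored alongside the data-objects it describes.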

For the metadata, the SPAR system relies on established standards:

  • Dublin Core for the descriptive information, i.e. the description of the object that is archived,
  • MIX to code the technical metadata for image files,
  • textMD to code the technical metadata for text files,
  • ODRL to code the usage license of the digital objects,
  • PREMIS for the provenance information, i.e. the documentation of the history of the data-objects

The lifecycle of an archival package

Over time, it may be necessary to act on archival packages, either to make a correction or, more likely, to migrate to new formats when obsolescence occurs.

In SPAR, the archival package follows a lifecycle that depends on the transformations that are applied:

diagram of the lifecycle of an archival package in SPAR

In the end, every archival package has at most three versions:

  • version 0: the original version, as ingested by the Archive,
  • version n-1: the penultimate version, which can be useful if the last transformation applied was erroneous,
  • version n: the current version, which is used to disseminate the content in the latest format.
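The retention rule above can be written as a short function. This is only a sketch of the rule as stated; the function name `retained_versions` and the convention that versions are numbered with consecutive integers starting at 0 are assumptions.

```python
def retained_versions(latest: int) -> list[int]:
    """Versions kept for a package whose current version number is `latest`."""
    if latest < 0:
        raise ValueError("version numbers start at 0")
    # A set removes duplicates when version n-1 coincides with version 0
    # or when the package has never been transformed.
    return sorted({0, max(latest - 1, 0), latest})
```

A package that has never been transformed keeps a single version, and a package at version 5 keeps versions 0, 4 and 5: intermediate versions are discarded as new ones are created.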

Tuesday, March 22, 2011