Access to digital resources: principles

Every week, several thousand documents are digitized by BnF’s in-house staff and external service providers, so a process is needed to make these new documents available to Internet users via Gallica. When digital resources arrive at BnF, they go through a series of steps that add them to the BnF master catalog, convert them into a form that Internet users can access under the best possible conditions, reference them in the Gallica search engine (a step known as document indexing), and update the digital library.

Access to digital resources: indexes and search engines

For digital resources, two main categories of data are indexed:
  • metadata
  • the contents (“full text”), where available

A document’s table of contents and indexes (geographical indexes, indexes of references to people, etc.) are also converted into text format so that the document can be searched and browsed.

The Gallica index thus consists of metadata, full text where available, existing tables of contents, image captions, and information from external partners’ OAI repositories.
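
To make the composition of such an index concrete, here is a minimal sketch in Python of an inverted index that keeps metadata, full text, and table-of-contents entries in separate fields. It is a toy illustration, not Gallica's actual implementation or Lucene's API; the field names (`title`, `full_text`, `toc`), the `build_index` function, and the sample records are all assumptions made for the example.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())

def build_index(documents):
    """Map each (field, term) pair to the set of document ids containing it.

    `documents` is a dict: doc_id -> {field_name: text}. Fields that are
    absent (e.g. no full text available) are simply skipped.
    """
    index = defaultdict(set)
    for doc_id, fields in documents.items():
        for field, text in fields.items():
            for term in tokenize(text):
                index[(field, term)].add(doc_id)
    return index

# Hypothetical records: the second one has no full text available.
documents = {
    "doc1": {
        "title": "Atlas of France",
        "full_text": "Maps of Paris and Lyon",
        "toc": "Paris; Lyon",
    },
    "doc2": {
        "title": "Views of Paris",
        "toc": "Plates",
    },
}

index = build_index(documents)
print(sorted(index[("title", "paris")]))      # documents with "paris" in the title
print(sorted(index[("full_text", "paris")]))  # documents with "paris" in the full text
```

Keeping the field name as part of the index key is what allows a search to tell a match in the metadata apart from a match in the content.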

The search engine used by BnF is Apache Lucene, which has also been used to power search on Wikipedia.

Lucene is a free, open-source search library written in Java and used to index and search text.
In particular, it allows the various indexed elements of a document to be weighted relative to each other. For example, in a search for the word “wretched”, documents where the word appears in the metadata (e.g. the title) are treated as more relevant, and shown higher in the results list, than documents where it appears only in the content.
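
The weighting idea can be illustrated with a small sketch. The Python below is a simplified stand-in, not Lucene's actual scoring formula or API: it counts occurrences of the query term in each field and multiplies the count by a per-field weight. The weights (5.0 for `title`, 1.0 for `full_text`), the function names, and the sample documents are assumptions chosen for the example.

```python
import re

# Hypothetical per-field weights: a match in the title counts five times
# as much as a match in the document content.
FIELD_WEIGHTS = {"title": 5.0, "full_text": 1.0}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())

def score(document, term):
    """Sum over fields of (occurrences of term in field) * (field weight)."""
    term = term.lower()
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        occurrences = tokenize(document.get(field, "")).count(term)
        total += occurrences * weight
    return total

def search(documents, term):
    """Return (doc_id, score) pairs for matching documents, best first."""
    scored = [(doc_id, score(doc, term)) for doc_id, doc in documents.items()]
    hits = [(doc_id, s) for doc_id, s in scored if s > 0]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

# Hypothetical documents used only to exercise the ranking.
documents = {
    "a": {"title": "The Wretched", "full_text": "A novel in five parts."},
    "b": {"title": "Collected Letters",
          "full_text": "The wretched weather lasted; a wretched week."},
    "c": {"title": "Gardening", "full_text": "Roses and tulips."},
}

results = search(documents, "wretched")
print(results)
```

In this sketch, document `a` ranks above document `b` even though `b` contains the word twice, because a single title match outweighs two matches in the content.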

Free software for Gallica

BnF favors the use of free software for reasons of sustainability, production cost, and software maintenance.
Gallica is built entirely with free software:
  • the Lucene search engine
  • the Apache web server
  • the Tomcat application server
  • the Eclipse development environment

Wednesday, November 6, 2013