COVID-19 and the lockdown of March 2020 in the web archives

To carry out its heritage role of web deposit, the BnF regularly collects a sample of the French Web, through broad and focussed crawls. At the end of January 2020, the teams responsible for digital legal deposit began making selections of sites linked to the COVID-19 pandemic. While this selection activity continues to this day, an initial collection has already been compiled, comprising captures carried out between 1 February and 31 July 2020.

 

Archives de l’internet - BnF

A collaborative crawl

This collection of archives brings together six months of web content, from the virus’s arrival on French soil to the end of the state of health emergency in July 2020, and attempts to capture the global nature of this health, social and economic crisis. It also contains media coverage of the period, the stances taken by institutional stakeholders and individual citizens, and more broadly the actions taken to contain, document and understand the pandemic, which were echoed on the Web.

Websites, blogs, social media accounts and videos linked to the COVID-19 pandemic were selected by curators and librarians from the BnF and by the network of local correspondents working in fifteen partner establishments in the regions, so that the selections cover the whole of France. Each site was captured at regular intervals, making it possible to track changes in web content and, by extension, the course of the health crisis over time. Between June and July 2020, Instagram and video crawls were also carried out. From scientific videos to lockdown comedy videos, the video channels collected during this period also cover the different aspects of the pandemic and the March 2020 lockdown.

The BnF’s librarians and curators have been working to promote this archive collection to researchers and a wider public since the summer of 2020, and on 17 March 2021 they published a guided tour in the “Archives de l’internet” application, entitled “The COVID-19 pandemic and the first lockdown”.
To mark the occasion, the home page of the “Archives de l’internet” application is being revamped to showcase the content collected as part of the COVID-19 web archive.

A guided tour to explore the web of the pandemic

COVID-19 guided tour - BnF


The content selected as part of this guided tour retraces the chronology of the spread of the virus – particularly through its media coverage – and also illustrates the emergence of new words to describe this unprecedented situation.

One of these new words was undoubtedly “lockdown”, a political measure decreed on 16 March 2020 in France, which led to a restriction of public freedoms and an abrupt halt to freedom of movement and trade. The country was at the beginning of a major economic crisis, while the health system, which was also under significant strain, struggled to contain the outbreak of new cases. Expectations turned to medical research, which was studying this new form of coronavirus and developing a vaccine. At the same time, scientists were taking part in the public debate, sometimes expressing divergent opinions, as demonstrated by the controversy surrounding chloroquine.

Unable to meet in person, the French population took to the web and social media to educate and express themselves. There are many examples of this creativity developed during lockdown, and the web archives provide a record of it. The same is true of the outpouring of solidarity, both with the professionals on the front line against the virus and with the most fragile and vulnerable members of society. The lifting of lockdown saw the emergence of myriad interpretations of the period, and, having had time to reflect, people were looking to give new direction to their lives. A slide show illustrating some of the selected content accompanies this guided tour.

Discover the guided tour slideshow

Discover the guided tour: The COVID-19 pandemic and the first lockdown

COVID-19 collection wordcloud - BnF - 2021

An archive for research

The BnF is putting in place the resources and technical tools to enable researchers, historians, documentary filmmakers and journalists to use this unique corpus. The WARC files containing the data produced by the BnF’s bots during the period from January to July 2020 have been fully indexed in full text. This includes ongoing crawls, crawls of the online press, and dynamic crawls carried out in connection with the pandemic and its consequences. The scope of the indexing was defined in partnership with researchers involved in discussions on how to make the most of this crawl. The BnF also wanted to test new services based on these archives in order to develop its services for hosting research teams. The aim is to promote and facilitate the use of this material in partnership with the academic world. These archives are also intended to be used in international projects, given the global and transnational nature of the pandemic. Initial use as part of the WARCnet project has enabled comparisons to be made between national webs and between the archives produced by the various European institutions.

An international cooperation 

IIPC COVID-19 Collection - IIPC-Collection-Covid-19


The BnF is contributing to the Novel Coronavirus (2019-nCoV) outbreak international archiving project, launched in February 2020 by the International Internet Preservation Consortium (IIPC) in association with the Internet Archive. This collaborative crawl complements the crawls of national web content related to the COVID-19 outbreak being carried out by numerous heritage institutions that are members of the Consortium. It aims to build up a transnational collection, commensurate with the global nature of the pandemic and representative of its various dimensions. The pages and websites selected for archiving by the institutions involved in the project are intended to document the scientific, medical, social, economic and political aspects of the pandemic, as well as local lockdown measures and vaccination policies. The considerable value of such an archive for future research, particularly for a comparative or transnational approach to the pandemic, was identified at a very early stage and led to the launch of the initiative. This web archive can be accessed online via the Archive-It tool on the IIPC’s website, and is an excellent way of promoting the selection work carried out by the BnF over the past year.

For more information on what goes on behind the scenes of the crawl:

How to consult the web archives?

In accordance with intellectual property law, the web legal deposit collections can be consulted in the research rooms of the BnF’s various sites, as well as in the printer legal deposit libraries in the French regions. 

For more information on conditions of access

There are three ways of accessing these very rich collections:

  • Searching by URL in the “Archives de l’internet” application allows you to consult captures of a web page at different dates. From the page chosen as the starting point, users can navigate from link to link in the archived web as on the live web;
  • Some of the collections, including the COVID-19 pandemic web archives, have been full-text indexed, enabling free text word searches in the “Archives de l’internet Labs” application;
  • Lastly, guided tours offer a thematic exploration of the Web at different times, through a selection of emblematic sites with commentary: “The COVID-19 pandemic”, “The electoral web from 2010 to 2015”, “Memories of North African immigration”, etc. Consult all the guided tours in “Archives de l’internet”.
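
For readers curious about what the first two access modes involve, here is a minimal illustrative sketch in Python. It is not the code of the “Archives de l’internet” applications: the sample captures, URLs and function names are all hypothetical, and a real archive would read its data from WARC files rather than from a short in-memory list. The sketch only shows the two underlying ideas: finding the capture of a URL closest to a given date, and looking up a word in a simple full-text index.

```python
from bisect import bisect_left
from collections import defaultdict
from datetime import datetime

# Hypothetical sample data: (url, capture time, extracted text).
CAPTURES = [
    ("http://example.org/", datetime(2020, 2, 10), "first cases reported in France"),
    ("http://example.org/", datetime(2020, 3, 17), "lockdown begins across France"),
    ("http://example.org/", datetime(2020, 7, 11), "state of health emergency ends"),
    ("http://example.net/blog", datetime(2020, 4, 2), "lockdown diary and recipes"),
]


def build_url_index(captures):
    """Map each URL to its capture times, sorted chronologically."""
    index = defaultdict(list)
    for url, when, _text in captures:
        index[url].append(when)
    for times in index.values():
        times.sort()
    return index


def closest_capture(url_index, url, target):
    """Return the capture time of `url` nearest to `target`, or None."""
    times = url_index.get(url)
    if not times:
        return None
    pos = bisect_left(times, target)
    # Only the captures just before and just after `target` can be nearest.
    candidates = times[max(pos - 1, 0):pos + 1]
    return min(candidates, key=lambda t: abs(t - target))


def build_text_index(captures):
    """Map each lowercase word to the set of (url, capture time) containing it."""
    index = defaultdict(set)
    for url, when, text in captures:
        for word in text.lower().split():
            index[word].add((url, when))
    return index


def search(text_index, word):
    """Return the captures containing `word`, sorted by URL then date."""
    return sorted(text_index.get(word.lower(), set()))


if __name__ == "__main__":
    url_index = build_url_index(CAPTURES)
    print(closest_capture(url_index, "http://example.org/", datetime(2020, 3, 20)))
    text_index = build_text_index(CAPTURES)
    for url, when in search(text_index, "lockdown"):
        print(url, when.date())
```

Keeping the capture dates sorted per URL lets the lookup use a binary search, which is one plausible way to move quickly between captures of the same page; the actual applications may work quite differently.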
Find out more about the COVID-19 collection in our research blog:
See also the page devoted to zoonoses (specially updated by the Sciences and techniques department)