Digital history & historiography

Review: Newspaper Navigator

written by

Lorella Viola

published on

1 January 2022

Created for the Library of Congress, Newspaper Navigator re-imagines how we search the rich visual content in historic newspapers. The first phase of the project used machine learning to extract visual content from 16.3 million digitized newspaper pages in Chronicling America. This resulted in the Newspaper Navigator dataset, released in May 2020. The dataset and the fine-tuned machine learning model are in the public domain. A paper on the dataset was presented at the 2020 ACM International Conference on Information and Knowledge Management (CIKM).

The second phase consisted of building a search application for 1.5 million photos from the dataset, launched in September 2020. In addition to supporting faceted and keyword search, the application lets users search by visual similarity: a user trains an interactive machine learning model, called an “AI navigator,” to retrieve photos of topics such as “baseball players” or “sailboats” even if their captions do not contain these keywords. An AI navigator can train and predict over all 1.5 million photos in a couple of seconds. This new search affordance forms the basis for Benjamin Lee’s Ph.D. dissertation research, which re-imagines standard faceted search as “open faceted search.” A demo of the search application was presented at the 2020 ACM Symposium on User Interface Software and Technology (UIST).

Author(s)

Lorella Viola

Lorella is a Postdoctoral Research Associate working on the DHARPA project.
