The New York Times has roughly five to seven million physical photos in its enormous archive, many of which date back more than a century. The images document vital moments and contain valuable records of our recent history, but the hard copies are vulnerable to deterioration (fortunately, they survived flooding in 2015). To protect the photos, the Times is digitizing the archive with Google Cloud.
Not only will scanning all of the images help preserve them, but reporters should find it far easier to search a digital archive than to leaf through hard copies in file cabinets. Many of the photos have contextual information on the back, such as when and where they were taken, captions, and when they were published in the newspaper. So the Times, with the aid of Google's tech, created a system that recognizes and processes handwriting and text on both sides of each photo.
Many of the photos from the Times' more recent past will already be digital, so this project is more about preserving the historical images and using Google's AI to find the tales hidden within them. It should be far simpler, for instance, to tell the story of how a specific location evolved over time through the paper's photography. The Times might also take advantage of Google's vision AI tools to detect objects and places in images, which could make the photos easier to categorize and help reporters and editors unearth them when browsing or searching the archive.