How Harvard's human computers helped invent modern astronomy

And how the PHaEDRA project is bringing their research into the 21st century.

The Harvard College Observatory (now part of the Center for Astrophysics) in Cambridge, Massachusetts, has long been a bastion of astronomical research, its history stretching back to its founding in 1839. But for the first forty years of its existence, the HCO was quite literally an old boys' club. While amateur female astronomers helped fund and even construct the observatory's telescopes, "it wasn't really seen as proper to allow them out on the roof, in the night, on their own, to actually use instruments," Daina Bouquin, Head Librarian of the Wolbach Library at the Center for Astrophysics and lead of the PHaEDRA project, told Engadget.

"The beginning of the whole capacity to do that starts like photography, with people putting together these all-sky surveys," she continued. "And the first group of people to do that, to put together a full survey of the entire visible universe at the time was the Harvard Computers."

Pickering's "Harem"

In the mid-1870s, the fourth director of the HCO, Edward Charles Pickering, started to hire women computers specifically to perform detailed analysis upon the observatory's growing collection of glass plate photographs. "Basically, the advent of photography and glass plate photography, in particular, allowed women to get involved with the science for the first time," Bouquin said.

But woe be to those who underestimate the Computers' contributions to modern astronomy. Take Henrietta Swan Leavitt, one of the early members at the HCO, for example. She studied Cepheid variable stars, which dim and brighten at regular intervals within a set range of luminosity. In Leavitt's era, the map of the universe was effectively flat, the concept of gravity wells was still years away from formulation, and astronomers had few ways to measure distance across space. But through her rigorous observations and analysis, Leavitt developed the period-luminosity relationship, which is now called Leavitt's Law: the longer a Cepheid takes to pulse, the brighter it intrinsically is.

You may not have heard of Leavitt, but you're probably familiar with a man named Edwin Hubble. Leavitt was nominated for the Nobel Prize after her death "because this relationship that she noticed can only really be seen across many, many plates and the very strange reductions that she did, it wound up being the basis of Hubble's work," Bouquin said. "She made it so that you could tell distance, and so then when Hubble took that calculation and incorporated it into his work, he was able to prove that we weren't the only galaxy."
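To make the idea concrete, here is a minimal Python sketch of how a period-luminosity relation can turn a Cepheid's pulsation period into a distance. The coefficients are an illustrative modern V-band calibration, not Leavitt's original fit, and the function names are hypothetical. The sketch also ignores interstellar dust, which dims starlight and would inflate the estimate.

```python
import math

# Illustrative V-band calibration (an assumption, not Leavitt's original fit):
#   M = SLOPE * (log10(period_days) - 1) + ZERO_POINT
SLOPE = -2.43
ZERO_POINT = -4.05


def absolute_magnitude(period_days):
    """Estimate a Cepheid's absolute magnitude from its pulsation period."""
    if period_days <= 0:
        raise ValueError("period must be positive")
    return SLOPE * (math.log10(period_days) - 1.0) + ZERO_POINT


def distance_parsecs(period_days, apparent_magnitude):
    """Invert the distance modulus m - M = 5 * log10(d / 10 pc)."""
    modulus = apparent_magnitude - absolute_magnitude(period_days)
    return 10.0 ** (modulus / 5.0 + 1.0)
```

Under these assumptions, a Cepheid with a 10-day period that appears at magnitude 5.95 would sit roughly 1,000 parsecs away. Comparing how bright the star really is with how bright it looks is what lets the period stand in for a ruler.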

Leavitt's work is also fundamental to Einstein's theories of relativity and the curvature of space. "Our understanding of whether or not the universe is the galaxy or something much greater than that," Bouquin exclaimed, "comes from the work of this one woman studying these plates."


Pickering's plan was to take full-sky surveys, photographing the night sky onto glass plates, then compare the plates to see how celestial objects move and interact over time. The catalogue itself was, and still is, massive. Between 1860 and 1990, the HCO compiled a collection of more than 500,000 glass plate photographs from all over the world. "This is the most comprehensive picture we have going back," Bouquin expounded. "And it's longitudinal time series data, so that you can actually see how individual objects change over time."

Through their work, the Harvard Computers compiled more than 2,500 log books filled with precise measurements and graphs of their analyses. "What they were doing, what they're writing, their notes and their techniques -- all of the metadata, essentially -- about their observations" went into the log books, Bouquin said.

But after completion, these log books were largely forgotten. They spent more than four decades being transferred between various archives and libraries within the school. "They just sort of went with the plates," Bouquin said. "And a lot of the focus for the longest time has been on getting the data off of the plates, because that's really the magnitudes and the photometry in the light curves that the scientists need."

[Image: DASCH student workers scanning plates. Credit: Harvard College Observatory - DASCH Program]

Indeed, researchers have spent the last 15 years digitizing the school's glass plate collection as part of the DASCH (Digital Access to a Sky Century @ Harvard) program. Digitizing these plates helps astronomers better understand the universe's evolution (even on so short a timescale). "We know the universe is very, very, very old and the ability to go back over 100 years, it's a very unique thing we can do with these plates," Bouquin said. "But all of the metadata that could be used to link them to things in the modern literature is actually in the notebooks."

When Bouquin was hired as Head Librarian a few years ago, she and her team collaborated with Lindsay Smith Zrull, the curator of the HCO's Astronomical Photographic Plate Collection, and began digging through the boxes of plates. Once they realized that the logs could be similarly digitized and published to NASA's Astrophysics Data System (ADS) -- think a PubMed for astronomers -- they made the case for funding the PHaEDRA program. The project now leverages both Harvard's resources and the Smithsonian Transcription Center.

As for why the school only now decided to better archive these works, Bouquin replied, "I think it was just good timing. Honestly, people care right now, about women in science, when maybe 20 years ago, they should have but didn't."

[Image: Harvard Computer paper dolls]

The digitization and transcription process itself is pretty straightforward. From the thousands of log books currently residing in the Harvard depository, Bouquin and her team of student workers recall the books in small batches to physically inspect them and verify metadata. "They document them as thoroughly as they can," Bouquin said. "They page them and figure out where stuff is and sketches that might be scanned separately -- physically check the condition of the books."

The inspected books are then sent to the Harvard digitization lab where they're converted into digital images of each page. Those image files are then transferred to NASA for publication to the ADS. "They create a record basically, for every book," Bouquin continued. "So every page has its own unique resolvable link. And every book has its own record in ADS. And then we take those links that they created for every page, and we give those to the Smithsonian."

The Smithsonian Transcription Center then renders those images for a cadre of human volunteers to manually transcribe. Once those transcriptions are complete, they too are fed back into the ADS, allowing anybody to search for data in these Victorian-era log books as easily as they would a modern article. "If you wanted to find one of these notebooks," Bouquin said, "it has its own coding that you can search just those notebooks or it'll just come up in full text search results, like anything else."

Bouquin figures that her team is nearly through the part of the process that demands physically handling the log books. "We're almost done with all of the scanning and kind of a technical and physical passing around of materials, conserving them all of that," she noted. "We just have the images up and then it's just transcribe and go."

But even once all these books are transcribed, Bouquin has further plans for the plate collection: her team hopes to go back through and tag each scanned logbook page with its corresponding glass plate.

While the plate numbers were often written in the logbooks, they were not recorded consistently. "People might have just put the number and not the prefix... so it's hard to actually match up the notebooks with the plates using just the transcriptions," Bouquin said. "So we're going to have people tag the plate numbers, so that when you pull up the notebook on ADS, ideally, you also have a list of all the plates that go with that notebook and you can link directly to the data coming off the plate."
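A rough Python sketch shows why a bare number is hard to match automatically. It assumes a small, hypothetical set of series prefixes and a made-up function name; the real collection defines its own plate series and numbering rules, and a naive pattern like this one will also pick up years and other stray numbers.

```python
import re

# Hypothetical series prefixes, for illustration only; the real collection
# defines its own list of plate series.
KNOWN_PREFIXES = ("A", "B", "MC")


def candidate_plate_ids(text, prefixes=KNOWN_PREFIXES):
    """Return, for each plate number found, the plate IDs it could refer to."""
    # Try longer prefixes first so "MC" is not cut short by a shorter match.
    ordered = sorted(prefixes, key=len, reverse=True)
    alternatives = "|".join(re.escape(p) for p in ordered)
    # An optional known prefix, an optional space, then 3 to 6 digits.
    pattern = re.compile(rf"\b(?:({alternatives})\s?)?(\d{{3,6}})\b")

    results = []
    for match in pattern.finditer(text):
        prefix, number = match.groups()
        if prefix:
            # Prefix was written down: the reference is unambiguous.
            results.append([f"{prefix}{number}"])
        else:
            # Prefix omitted: every known series is a possible match.
            results.append([f"{p}{number}" for p in prefixes])
    return results
```

Run on a line such as "Compared MC 12345 with 6789", the sketch resolves the first reference to a single plate but returns one candidate per series for the second, which is the ambiguity human taggers are being asked to settle.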

Eventually, Bouquin hopes to turn the results of this process into training data for an AI. "We want to use [the logbook tags] to train an algorithm to look for sketches in other archival log books, because we're not the only place that has old observing logs," Bouquin said. "You could use machine learning, then, based on the tag datasets, to tweeze out and find old observations of different objects."

"I think it's really rewarding that so many people are seeing value in this right now," Bouquin reasoned. "These people were scientists and they were doing real science and we don't really necessarily know the names of the people who gave us our fundamental understanding of the nature of reality, which is kind of problematic to me."

"Women have been there the whole time, they're still doing really important work today," she concluded. "This is one big continuum."

If you're interested in helping transcribe these logbooks, head over to the Wolbach Library's Project PHaEDRA page for more details.

Images: Harvard College Observatory (#1,3,4); Harvard College Observatory - DASCH project (#2 square plates)