Extra! Extra! app may be scraping news museum's feed of front pages

TUAW came across the US$3.99 app Extra! Extra! when developer Finbarr Brady solicited a review. Extra! Extra! bills itself as an app that will supply you with the daily front pages of more than 800 newspapers from around the globe.

This sounded suspiciously similar to one of the features on the Newseum's website. The Newseum, located in Washington, D.C., is a museum that documents news media history. Each morning, more than 800 newspapers from around the world send their front pages as high-quality PDFs to the museum's gallery, which features these pages for educational purposes.

In Extra! Extra!, you can select a newspaper from either a list or a map view, and the app downloads the PDF to your iPhone, iPod touch or iPad. You can then mark it as a favorite, e-mail it to someone, or visit the paper's site -- though several of the links I checked were incorrect.

I scanned the list of papers provided by Extra! Extra! and found that it closely mirrored the Newseum's list. The FAQ section on the app site claims that it is up to the individual newspaper to decide whether or not it is included with the Extra! Extra! app.

Full disclosure: I'm a designer with The Patriot-News just outside of Harrisburg, PA. Part of my duties when I, or my co-workers, design the front page is to send a high-quality PDF to the Newseum. I checked with our executive editor this morning, who confirmed that we only send those PDFs to Newseum and not any other organization. The Patriot-News is one of the newspapers made available through Extra! Extra!

When I first downloaded the app and checked The Patriot-News feed, it was still showing the previous day's front page. That matched the Newseum entry, which usually isn't updated until around 8:30 a.m. When I redownloaded it through Extra! Extra! a couple of hours later, it showed the updated front page, just as the Newseum did.

Brady says the newspaper fronts he downloads are publicly accessible. "Basically I am using freely available PDF's from the newspaper websites themselves," he said in an e-mail. "These are public URL's such as http://www.independent.ie/independent.ie/editorial/todaysPaper/todayspaper.pdf."

A visit to the Independent's website found no public link to their front page, but links to their paid e-edition. However, a Google search does confirm the link Brady provided.

That isn't the case for a good number of the papers, though, and it's most likely that Brady is scraping the Newseum feed. The app claimed that today's issue of the Corpus Christi Caller-Times wasn't available. That's because it hadn't been uploaded to the Newseum for the day.

A check of the Newseum revealed no paper from Corpus Christi for the day. The Caller-Times is one of the newspapers that make their front-page PDFs accessible from their own sites, and today's issue is available there. If Brady were using the newspaper's public feed, the PDF would be available in the app; if he's scraping the Newseum site, as we suspect, the missing page would mirror the status there.
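
The reasoning behind that test can be written out as a small check. This is only a sketch of the logic in Python; the function name and its three yes/no inputs are illustrative labels, not anything taken from the app or the Newseum.

```python
def likely_source(in_app: bool, on_newseum: bool, on_paper_site: bool) -> str:
    """Guess which upstream the app mirrors for one paper on one day.

    All three arguments are hypothetical observations: whether today's
    front page shows up in the app, on the Newseum, and on the paper's
    own site.
    """
    # If both candidate sources agree, the app's status cannot tell them apart.
    if on_newseum == on_paper_site:
        return "inconclusive"
    # The sources disagree, so the app can match only one of them.
    if in_app == on_newseum:
        return "newseum"
    return "paper"


# The Caller-Times case: missing in the app, missing on the Newseum,
# but available on the paper's own site.
print(likely_source(in_app=False, on_newseum=False, on_paper_site=True))
```

A single day's result for a single paper is only suggestive; the same check repeated across several papers and days would make a stronger case.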

Another newspaper, Referans from Turkey, no longer exists after merging with another newspaper. But it's still part of the Newseum feed and is therefore listed in Extra! Extra!

Newseum makes it clear that it has a special arrangement with newspaper companies to display these front pages. Anyone seeking permission to use a front page must contact the newspaper directly, and U.S. copyright laws apply to both the Newseum and the US-based papers it includes. Extra! Extra!'s developer is based out of Ireland, but the app is being sold in the U.S. app store.

"I store the URLs on the server side in an XML file, so if any paper contacts me and wants to be removed, I can take them out right away," Brady said. "So far, no papers have wanted to do this, as I guess my app is driving more traffic to their website which in turn is good for them. I know from talking to the Irish papers, they love the app and are happy to get more exposure this way."

That's one possibility. Another, more likely scenario is that Extra! Extra! hasn't been noticed by people in a position to know whether the front pages are being used with permission or not.

I reached out to a few designers who work at papers included on Extra! Extra!, and they were pretty shocked by the app. Some papers, like the Express in Washington, D.C., do make their entire paper downloadable as a single PDF document, but not the front page by itself. The PDF encryption makes it difficult to separate those pages.

"We have asked [Extra! Extra!] not to use the content from our site without each newspaper's specific permission. We have told numerous other sites the same thing. We cannot stop him from doing it without impacting the performance of our site," said Paul Sparrow, senior vice-president of broadcasting with Newseum, in an e-mail to Charles Apple, who runs a visual journalism blog for the American Copy Editors Society. Charles was kind enough to contact Newseum on my behalf to get their take on this.

A growing number of newspapers do have their front pages available for download. However, those pages are still under the copyright of their respective owners, and when an app like Extra! Extra! appears in the store, it calls into question whether Apple is effectively policing copyright violators. Just because the front pages are out there for download -- whether through the Newseum or the paper's own site -- does not mean that someone has the right to cull these front pages and make a profit off of them.

Edit (6:30 p.m. ET): Commenter Kevin was gracious enough to source out the app's XML file, [Ed. note: the link is now broken, so we've pasted an image below with proof] which revealed that all of the files are coming from the Newseum. Thanks, Kevin!
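
For readers who want to run the same kind of check on a URL list, here is a rough sketch in Python that tallies which hosts the entries point to. The app's actual XML schema isn't reproduced in this post, so the `paper` element, its `url` attribute and the example.org addresses below are all made-up placeholders.

```python
from collections import Counter
from urllib.parse import urlparse
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real file's layout and URLs are not shown here.
SAMPLE = """<papers>
  <paper name="Paper A" url="http://museum.example.org/fronts/a.pdf"/>
  <paper name="Paper B" url="http://museum.example.org/fronts/b.pdf"/>
  <paper name="Paper C" url="http://paper-c.example.com/today.pdf"/>
</papers>"""


def count_hosts(xml_text: str) -> Counter:
    """Count how many entries point at each hostname."""
    root = ET.fromstring(xml_text)
    return Counter(
        urlparse(paper.get("url", "")).hostname  # None if the URL is missing
        for paper in root.iter("paper")
    )


for host, total in count_hosts(SAMPLE).most_common():
    print(host, total)
```

If nearly every entry resolves to a single host, that host is where the files are coming from.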