How the meandering legal definition of 'fair use' cost us Napster but gave us Spotify

From DMCA takedowns to Content ID filters, labels continue to crack down on online music sharing.

By Andrew Tarantola Nov. 5, 2023 10:30 am EST

The Napster music streaming app is seen on a screen with a pair of headphones lying on it. The number of people using music streaming apps is growing. The biggest is Sweden's Spotify, with 83 million paying users and about 100 million others who use the free version. (Photo by Alexander Pohl/NurPhoto via Getty Images)



The internet's "enshittification," as veteran journalist and privacy advocate Cory Doctorow describes it, began decades before TikTok made the scene. Elder millennials remember the good old days of Napster — followed by the much worse old days of Napster being sued into oblivion along with Grokster and the rest of the P2P sharing ecosystem, until we were left with a handful of label-approved, catalog-sterilized streaming platforms like Pandora and Spotify. Three cheers for corporate copyright litigation.

In his new book The Internet Con: How to Seize the Means of Computation, Doctorow examines the modern social media landscape, cataloging and illustrating the myriad failings and short-sighted business decisions of the Big Tech companies operating the services that promised us the future but just gave us more Nazis. We have both an obligation and responsibility to dismantle these systems, Doctorow argues, and a means to do so with greater interoperability. In this week's Hitting the Books excerpt, Doctorow examines the aftermath of the lawsuits against P2P sharing services, as well as the role that the Digital Millennium Copyright Act's "notice-and-takedown" reporting system and YouTube's "Content ID" scheme play on modern streaming sites.

Verso Publishing


The harms from notice-and-takedown itself don't directly affect the big entertainment companies. But in 2007, the entertainment industry itself engineered a new, more potent form of notice-and-takedown that manages to inflict direct harm on Big Content, while amplifying the harms to the rest of us.

That new system is "notice-and-stay-down," a successor to notice-and-takedown that monitors everything every user uploads or types and checks to see whether it is similar to something that has been flagged as a copyrighted work. This has long been a legal goal of the entertainment industry, and in 2019 it became a feature of EU law, but back in 2007, notice-and-stay-down made its debut as a voluntary modification to YouTube, called "Content ID."

Some background: in 2007, Viacom (part of CBS) filed a billion-dollar copyright suit against YouTube, alleging that the company had encouraged its users to infringe on its programs by uploading them to YouTube. Google — which acquired YouTube in 2006 — defended itself by invoking the principles behind Betamax and notice-and-takedown, arguing that it had lived up to its legal obligations and that Betamax established that "inducement" to copyright infringement didn't create liability for tech companies (recall that Sony had advertised the VCR as a means of violating copyright law by recording Hollywood movies and watching them at your friends' houses, and the Supreme Court decided it didn't matter).

But with Grokster hanging over Google's head, there was reason to believe that this defense might not fly. There was a real possibility that Viacom could sue YouTube out of existence. Indeed, profanity-laced internal communications from Viacom, which Google extracted through the legal discovery process, showed that Viacom execs had been hotly debating which one of them would add YouTube to their private empire when Google was forced to sell YouTube to the company.

Google squeaked out a victory, but was determined not to end up in a mess like the Viacom suit again. It created Content ID, an "audio fingerprinting" tool that was pitched as a way for rights holders to block, or monetize, the use of their copyrighted works by third parties. YouTube allowed large (at first) rightsholders to upload their catalogs to a blocklist, and then scanned all user uploads to check whether any of their audio matched a "claimed" clip.

Once Content ID determined that a user was attempting to post a copyrighted work without permission from its rightsholder, it consulted a database to determine the rights holder's preference. Some rights holders blocked any uploads containing audio that matched theirs; others opted to take the ad revenue generated by that video.
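
The lookup described above can be pictured roughly as follows. This is only an illustrative sketch, not YouTube's actual implementation: the `Policy` enum, the `PREFERENCES` table, the claimant names and the `handle_upload` function are all hypothetical.

```python
from enum import Enum
from typing import Optional


class Policy(Enum):
    BLOCK = "block"        # refuse the upload outright
    MONETIZE = "monetize"  # allow it, but redirect the ad revenue


# Hypothetical preference table; these claimant names are made up.
PREFERENCES = {
    "label_a": Policy.BLOCK,
    "label_b": Policy.MONETIZE,
}


def handle_upload(claimant: Optional[str]) -> str:
    """Decide what happens to an upload, given the claimant it matched (if any)."""
    if claimant is None:
        return "published"
    if PREFERENCES[claimant] is Policy.BLOCK:
        return "blocked"
    return f"published, ad revenue to {claimant}"
```

In this toy version, an upload that matches nothing is published normally, while a match hands the decision to whichever rightsholder claimed the audio.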

There are lots of problems with this. Notably, there's the inability of Content ID to determine whether a third party's use of someone else's copyrighted work constitutes "fair use." As discussed, fair use is the suite of uses that are permitted even if the rightsholder objects, such as taking excerpts for critical or transformational purposes. Fair use is a "fact intensive" doctrine: that is, the answer to "Is this fair use?" is almost always "It depends, let's ask a judge."

Computers can't sort fair use from infringement. There is no way they ever can. That means that filters block all kinds of legitimate creative work and other expressive speech — especially work that makes use of samples or quotations.

But it's not just creative borrowing, remixing and transformation that filters struggle with. A lot of creative work is similar to other creative work. For example, a six-note phrase from Katy Perry's 2013 song "Dark Horse" is effectively identical to a six-note phrase in "Joyful Noise," a 2008 song by a much less well-known Christian rapper called Flame. Flame and Perry went several rounds in the courts, with Flame accusing Perry of violating his copyright. Perry eventually prevailed, which is good news for her.

But YouTube's filters struggle to distinguish Perry's six-note phrase from Flame's (as do the executives at Warner Chappell, Perry's publisher, who have periodically accused people who post snippets of Flame's "Joyful Noise" of infringing on Perry's "Dark Horse"). Even when the similarity isn't as pronounced as in Dark, Joyful, Noisy Horse, filters routinely hallucinate copyright infringements where none exist — and this is by design.

To understand why, first we have to think about filters as a security measure — that is, as a measure taken by one group of people (platforms and rightsholder groups) who want to stop another group of people (uploaders) from doing something they want to do (upload infringing material).

It's pretty trivial to write a filter that blocks exact matches: the labels could upload losslessly encoded pristine digital masters of everything in their catalog, and any user who uploaded a track that was digitally or acoustically identical to that master would be blocked.

But it would be easy for an uploader to get around a filter like this: they could just compress the audio ever-so-slightly, below the threshold of human perception, and this new file would no longer match. Or they could cut a hundredth of a second off the beginning or end of the track, or omit a single bar from the bridge, or any of a million other modifications that listeners are unlikely to notice or complain about.
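
To make the weakness of exact matching concrete, here is a minimal sketch that assumes the filter simply hashes the raw bytes of each file. The stand-in "master" bytes and the function names are invented for illustration; no real platform is claimed to work this way.

```python
import hashlib


def fingerprint_exact(audio: bytes) -> str:
    """Hash the raw bytes; any change to the file changes the hash."""
    return hashlib.sha256(audio).hexdigest()


# Stand-in for a label's pristine master (hypothetical data, not real audio).
master = bytes(range(256)) * 4
blocklist = {fingerprint_exact(master)}


def is_blocked_exact(upload: bytes) -> bool:
    """Block only uploads that are bit-for-bit identical to a claimed master."""
    return fingerprint_exact(upload) in blocklist


identical_copy = bytes(master)  # a bit-for-bit duplicate
trimmed_copy = master[1:]       # the same track with one byte shaved off the start

print(is_blocked_exact(identical_copy))
print(is_blocked_exact(trimmed_copy))
```

The identical copy is caught, but the trimmed copy hashes to a different value and sails through, even though a listener would never hear the difference.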

Filters don't operate on exact matches: instead, they employ "fuzzy" matching. They don't just block the things that rights holders have told them to block — they block stuff that's similar to those things that rights holders have claimed. This fuzziness can be adjusted: the system can be made more or less strict about what it considers to be a match.
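
A toy version of adjustable fuzzy matching might look like the sketch below. It is an assumption-laden illustration, not Content ID's algorithm: tracks are reduced to lists of note numbers, similarity is the Jaccard overlap of their three-note runs, and every name and melody here is hypothetical.

```python
def shingles(notes, n=3):
    """Return the set of every run of n consecutive notes."""
    return {tuple(notes[i:i + n]) for i in range(len(notes) - n + 1)}


def similarity(a, b, n=3):
    """Jaccard overlap of the two tracks' n-note runs, from 0.0 to 1.0."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def is_match(upload, claimed, threshold):
    """Flag the upload when it is at least `threshold` similar to the claim."""
    return similarity(upload, claimed) >= threshold


# Hypothetical melodies written as MIDI-style note numbers.
claimed = [60, 62, 64, 65, 67, 69, 71, 72, 71, 69, 67, 65]
trimmed = claimed[1:]                                        # same tune, first note cut
other = [60, 62, 64, 65, 67, 69, 50, 52, 53, 55, 57, 59]     # different tune, shared opening

strict, loose = 0.8, 0.2
print(is_match(trimmed, claimed, strict), is_match(other, claimed, strict))
print(is_match(trimmed, claimed, loose), is_match(other, claimed, loose))
```

With the strict threshold, only the trimmed copy is flagged. Loosen the threshold and the unrelated tune that merely shares an opening phrase gets flagged too, which is the false-positive problem in miniature.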

Rightsholder groups want the matches to be as loose as possible, because somewhere out there, there might be someone who'd be happy with a very fuzzy, truncated version of a song, and they want to stop that person from getting the song for free. The looser the matching, the more false positives. This is an especial problem for classical musicians: their performances of Bach, Beethoven and Mozart inevitably sound an awful lot like the recordings that Sony Music (the world's largest classical music label) has claimed in Content ID. As a result, it has become nearly impossible to earn a living off of online classical performance: your videos are either blocked, or the ad revenue they generate is shunted to Sony. Even teaching classical music performance has become a minefield, as painstakingly produced, free online lessons are blocked by Content ID or, if the label is feeling generous, the lessons are left online but the ad revenue they earn is shunted to a giant corporation, stealing the creative wages of a music teacher.

Notice-and-takedown law didn't give rights holders the internet they wanted. What kind of internet was that? Well, though entertainment giants said all they wanted was an internet free from copyright infringement, their actions — and the candid memos released in the Viacom case — make it clear that blocking infringement is a pretext for an internet where the entertainment companies get to decide who can make a new technology and how it will function.
