How artificial intelligence can be corrupted to repress free speech

It's easier than you think, even here in America.

The internet was supposed to become an overwhelming democratizing force against illiberal administrations. It didn't. It was supposed to open repressed citizens' eyes, expose them to new democratic ideals and help them rise up against their authoritarian governments in declaring their basic human rights. It hasn't. It was supposed to be inherently resistant to centralized control. It isn't.

In fact, in many countries, the internet, the very thing that was supposed to smash down the walls of authoritarianism like a sledgehammer of liberty, has instead been co-opted by those very regimes to push their own agendas while crushing dissent and opposition. And with the emergence of conversational AI -- the technology at the heart of services like Google's Allo and Jigsaw or Intel's Hack Harassment initiative -- these governments could have a new tool to further censor their citizens.

Turkey, Brazil, Egypt, India and Uganda have all shut off internet access when politically beneficial to their ruling parties. Nations like Singapore, Russia and China all exert outsize control over the structure and function of their national networks, often relying on a mix of political, technical and social schemes to control the flow of information within their digital borders.

The effects of these policies are self-evident. According to a 2016 report from internet-liberty watchdog Freedom House, two-thirds of all internet users reside in countries where criticism of the ruling administration is censored -- 27 percent of them live in nations where posting, sharing or supporting unpopular opinions on social media can get you arrested.

Take China. An anonymous source within Facebook last November claimed to the NYT that the company had developed an automated censorship tool for the Communist Party of China -- a token of loyalty CEO Mark Zuckerberg hopes will open the Chinese market to the Western social network. While Facebook likely won't censor user-generated content directly, the effect will be the same if the tool is used by a third-party company in China.

If Facebook is willing to do that in China, what's to stop the company from doing the same here in America at the behest of a Trump administration? What's to keep Twitter, Instagram (which is owned by Facebook) or Snapchat from following suit? Twitter, Facebook and Intel all declined to comment for this story. However, Dr. Toby Walsh, a leading researcher of AI and current guest professor at the Technical University of Berlin, believes such an outcome is plausible. "When we think of 1984-like scenarios, AI is definitely an enabling technology," he said.

Facebook's Mark Zuckerberg and Alibaba's Jack Ma speak at the China Development Forum 2016 -- VCG via Getty Images

While China has slowly warmed to capitalist markets and a more open economy, the CPC has long maintained a tight grip on digital culture. Nearly 700 million people -- close to a tenth of the world's population -- are online in China. Ninety percent of web users in that nation access the web from a mobile device and, in 2015 alone, more than 40 million new users signed on for the first time.

And yet, some of the biggest cultural stories in China's modern history simply don't exist within its borders. All references to the 1989 Tiananmen Square crackdown have been so thoroughly scrubbed from the Chinese national internet that, in 2015, financial institutions were reportedly unable to accept monetary transfers that included a 4 or 6 because those digits refer to the protests' June 4th anniversary. Of course, there is no such thing as perfect security. "People are creative in how they work around such systems," Jason I. Hong, associate professor at the Human Computer Interaction Institute at Carnegie Mellon University, wrote to Engadget. "In China, people sometimes refer to Tiananmen Square protests as May 35 (June 4), which evaded censors for a while."
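Hong's "May 35" example shows why exact-match keyword filtering is brittle. The snippet below is a minimal sketch of that idea only; the blocklist and the function name are hypothetical and do not reflect how any real censorship system is built.

```python
# Hypothetical blocklist for illustration; real filter lists are not public.
BLOCKED_TERMS = {"june 4", "tiananmen"}

def is_blocked(message: str) -> bool:
    """Return True if the message contains any blocked term (case-insensitive)."""
    text = message.lower()
    return any(term in text for term in BLOCKED_TERMS)

print(is_blocked("Remember June 4"))  # True: matches a listed term
print(is_blocked("Remember May 35"))  # False: same date, unlisted phrasing
```

A filter like this only catches the phrasings its operators have already thought of, which is why censors must keep updating their lists as new workarounds appear.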

What's more, around 3,000 websites had reportedly been blocked by the country's government as of 2015. Those include Google, Facebook, Twitter and The New York Times. This ubiquitous censorship is a testament to China's top-down design for its national network.

Essentially, Chinese censorship halts the flow of dissenting ideas before they can even start by continually keeping an eye on you. Unlike in the US, Chinese ISPs and websites are legally liable for what their users post, which has forced them into becoming unofficial editors for the state. So much as linking to political opinions critical of the CPC's conduct is a prosecutable offense. By keeping ISPs and websites under threat of closure, the government is able to leverage that additional labor force to help monitor a larger population than it would otherwise be able to. A conversational AI system would be able to accomplish the same effect more efficiently and at an even larger scale.

State censorship even extends to social media. This past July, the Cyberspace Administration of China, the agency in charge of online censorship, issued new rules to websites and service providers that enabled the government to punish any outlet that publishes "directly as news reports unverified content found on online platforms such as social media." That is, if a news organization publishes a tip it received from a reader via Weibo without independently verifying it, that organization could be fined or shuttered.

"It means political control of the media to ensure regime stability," David Bandurski of the University of Tokyo told The New York Times. "There is nothing at all ambiguous about the language, and it means we have to understand that 'fake news' will be stopped on political grounds, even if it is patently true and professionally verifiable."

It's not that bad here in America, yet. Over the past 20 years, "self-expression has proliferated exponentially. And the Supreme Court, especially the Roberts Court, has been, on the main, a strong defender of free expression," Danielle Keats Citron, professor of law at the University of Maryland Carey School of Law, wrote to Engadget.

Historically, the court has upheld specific forms of speech like snuff films, video-game violence and falsified military-service claims because they don't meet the intentionally narrow threshold for unprotected speech -- like yelling "fire" in a crowded theater. "At the same time," Keats Citron continued, "much expression occurs on third-party platforms whose speech decisions are not regulated by the First Amendment."

A sizable portion of this expression takes the form of online harassment -- just look at the Gamergate, Pizzagate, Lizard Squad and Sad/Rabid Puppies debacles, or the cowardly attacks on Leslie Jones for her role in the Ghostbusters reboot. Heck, even Donald Trump, the newly installed president of the United States, has leveraged his Twitter feed and followers to attack those critical of his policies.

"The thing to remember about these platforms is that the thing that makes them so powerful -- that so many people are on them -- is also what makes them so uniquely threatening to freedom of speech," Frank Pasquale, professor of law at the University of Maryland Carey School of Law, said.

All of this hate and vitriol has a stifling effect on speech. When constantly inundated with this abuse, many rational people prefer to remain silent or log off entirely, as Ms. Jones did. Either way, the effect is the same: The harassment acts as a form of peer censorship. However, a number of the biggest names in technology are working to leverage machine-learning algorithms and artificial intelligence to combat this online scourge. And why not? It certainly worked in League of Legends. The popular game managed to reduce toxic language and the abuse of other players by 11 percent and 6.2 percent, respectively, after LoL's developer, Riot Games, instituted an automated notification system that reminded players not to be jerks at various points throughout each match.

Intel CEO Brian M. Krzanich speaking at the 2016 Intel AI Day in San Francisco -- YouTube

Intel's Hack Harassment initiative, for example, is "a cooperative effort with the mission of reducing the prevalence and severity of online harassment," according to Intel PR. Intel is developing an AI tool in conjunction with Vox Media and the Born This Way Foundation that actively "detects and deters" online harassment, with the goal of eventually creating and releasing an open API.

Ina Fried, senior editor at ReCode, spoke with Intel's Lori Smith-DeYoung about the program at the 2016 Intel AI Day in San Francisco last November. "Online harassment is a problem that technology created, so it's actually kind of important that we as an industry help solve it," Fried explained. ReCode's role is "really just talking about the issue, amplifying it and bringing voices to the issue showing the problem." The group has already built a demo app that looks at tweets and identifies content that constitutes harassment. It can warn users about their actions before they hit send or the system could, in theory, be built "into online communities and help monitor [harassment] and prevent some of it from being seen, or at least be seen as prevalently."

Google has undertaken a similar effort through its subsidiary Jigsaw. The team's Conversation AI system operates on the same fundamentals as Hack Harassment. It leverages machine learning to autonomously spot abusive language. "I want to use the best technology we have at our disposal to begin to take on trolling and other nefarious tactics that give hostile voices disproportionate weight," Jigsaw president Jared Cohen told Wired. "To do everything we can to level the playing field."

One major hurdle for these systems is sarcasm -- something even people have trouble discerning in online writing without the help of additional contextual clues like emoji. "Context is crucial to many free-speech questions like whether a threat amounts to a true threat and whether a person is a limited-purpose public figure," Professor Keats Citron told Engadget. "Yet often the full context of a threat or a person's public-figure status depends upon a wide array of circumstances -- not just what happens on Twitter or Facebook but the whole picture of the interaction."

In Conversation AI's case, Jigsaw's engineers trained the machine-learning system on roughly 17 million flagged comments from The New York Times website. It was also exposed to 130,000 snippets of text from Wikipedia discussions. Each of the Wikipedia snippets was also reviewed by a crowdsourced 10-person panel whose members independently determined whether it constituted a "personal attack" or harassment.

After training on all of these examples, Conversation AI can match the judgment of a 10-member human panel on whether a comment constitutes harassment a startling 92 percent of the time, with only a 10 percent false-positive rate. The results are so impressive that the NYT now employs the system to auto-block abusive comments before they can be vetted by a human moderator. The team hopes to further improve the system's accuracy by expanding its scope to look at long-term trends like the number of posts a certain account has made over a set period of time.
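For readers unfamiliar with the two figures quoted above, here is a rough sketch of how an agreement rate and a false-positive rate are typically computed against human labels. The labels are toy values invented purely for illustration, not Jigsaw's data, and the function name is hypothetical.

```python
def agreement_and_fpr(human, model):
    """Compare model flags to human labels (1 = harassment, 0 = not).

    Assumes the human labels contain at least one non-harassing comment.
    """
    pairs = list(zip(human, model))
    agree = sum(h == m for h, m in pairs)                 # model matches the panel
    false_pos = sum(h == 0 and m == 1 for h, m in pairs)  # benign comment flagged
    negatives = sum(h == 0 for h, _ in pairs)             # all benign comments
    return agree / len(pairs), false_pos / negatives

# Toy labels for ten comments (illustrative only).
human = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
model = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

agreement, fpr = agreement_and_fpr(human, model)
print(f"agreement={agreement:.2f}, false-positive rate={fpr:.2f}")
```

The false-positive rate is the figure that matters most for free speech: it counts the harmless comments that an automated filter would wrongly suppress.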

Both of these programs are pursuing a noble goal. However, it's one that could set a dangerous precedent. As Fried said during a subsequent AI Day panel discussion, "An unpopular opinion isn't necessarily harassment." But that decision is often left to those in power. And under authoritarian regimes, you can safely bet that it won't be the will of the people.

"I'm really surprised there hasn't been more of a discussion of this post-Snowden," TU's Walsh told Engadget. "I'm surprised that people were surprised that our emails are being read. Email is the easiest thing to read; it's already machine-readable text. You've got to assume that any email being read is not private."

Keats Citron made a similar point. "As private actors, intermediaries like Facebook, Twitter, and Google have free rein to decide what content appears online," she said. "Whereas government cannot censor offensive, hateful, annoying, or disturbing expression, intermediaries can do as they please. For that reason, I've urged platforms to adopt clear rules about what speech is prohibited on their sites and some form of due process when speech is removed or accounts suspended on the grounds of a ToS violation."

These are not small issues, especially given the authoritarian tenor struck by the new presidential administration. "What I find most troubling from over the past few weeks is that you have Trump surrogate Newt Gingrich go on the news and say 'Look, the rules are the president can order someone to do something terrible and then pardon,'" Pasquale noted. He further explained that Trump's current actions are not wholly unprecedented, but rather a "logical extension of the Unitary Executive Theory...which would effectively put the executive branch above the law."

As mentioned above, even the threat of oversight from a government is enough to curtail free speech online and off. "Even though there are many rights, either under the First Amendment or subsequent statutes passed after J. Edgar Hoover's COINTELPRO program," Pasquale said, "you barely ever see someone taking advantage of that statute to, say, win monetary damages or otherwise deter the [government's] activity."

However, the industry itself is beginning to wake up to the dangers of misusing AI systems. "There's increasing awareness within the AI community of the risks -- both intentional and unintentional -- so there are a number of initiatives now to promote best practices to think about some of these ethical questions," Walsh said. "I've been involved with initiatives from IEEE, the largest professional organization within the space, to draw up ethical guidelines for people building AI systems."

Should our government implement an automated censorship system akin to the one Facebook developed, even if it had only a fraction of the capability of Jigsaw's Conversation AI, the threat to civil liberties and the First Amendment would be immediate and overwhelming.

"[Edward] Snowden did America and the world a service by revealing the extent of the wiretapping that was going on and the fact that it was not just external parties but citizens of the United States," Walsh concluded. "I don't think we've seen enough of [the discussion Snowden was attempting to instigate], people are not fully aware of quite how much the intelligence services must already be reading and the technologies that they're able to bring to bear."

Lead image: Getty Creative