The underlying API used to determine "toxicity" scores phrases like "I am a gay black woman" as 87 percent toxic, while phrases like "I am a man" rank among the least toxic. The API, called Perspective, is made by Jigsaw, an incubator within Google's parent company, Alphabet.
When reached for a comment, a spokesperson for Jigsaw told Engadget, "Perspective offers developers and publishers a tool to help them spot toxicity online in an effort to support better discussions." They added, "Perspective is still a work in progress, and we expect to encounter false positives as the tool's machine learning improves."
Poking around with the engine behind Wired's data revealed some ugly results, as Vermont librarian Jessamyn West discovered when she read the article and tried out Perspective to see exactly what makes a comment, or a commenter, register as toxic (according to Alphabet, at least).
It's strange that Wired apparently didn't give Perspective a spin to see what made the people behind its troll map "toxic." Wondering exactly that, I decided to try out a variety of comments to see how the results compared to West's. I tried to represent the people I most often see censored on social media, along with the opinions of the day.
Typing "I am a black trans woman with HIV" got a toxicity score of 77 percent. "I am a black sex worker" was 89 percent toxic, while "I am a porn performer" scored 80 percent. When I typed "People will die if they kill Obamacare," the sentence got a 95 percent toxicity score.
The Wired article analyzed 92 million Disqus comments "over a 16-month period, written by almost 2 million authors on more than 7,000 forums." Sites that don't use Disqus's comment-management software weren't examined (so Facebook and Twitter were not included).
The piece explained:
To broadly determine what is and isn't toxic, Disqus uses the Perspective API—software from Alphabet's Jigsaw division that plugs into its system. The Perspective team had real people train the API to rate comments. The model defines a toxic comment as "a rude, disrespectful, or unreasonable comment that is likely to make you leave a discussion."
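For readers curious about the mechanics, here is a minimal sketch of what talking to Perspective looks like from a developer's side. The request and response shapes follow Google's published Comment Analyzer reference; the helper names and the sample response below are my own illustration, and a real call requires POSTing the body to the API with a valid key.

```python
import json


def build_request(text):
    """Build the JSON body Perspective's AnalyzeComment method expects.

    A real client POSTs this to
    https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze?key=API_KEY
    """
    return {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }


def toxicity_percent(response):
    """Convert the API's 0-1 summary score into the percentages quoted here."""
    score = response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]
    return round(score * 100)


# Abbreviated response shape; the 0.87 value mirrors the score this
# article reports for "I am a gay black woman".
sample_response = {
    "attributeScores": {
        "TOXICITY": {"summaryScore": {"value": 0.87, "type": "PROBABILITY"}}
    }
}

print(json.dumps(build_request("I am a gay black woman")))
print(toxicity_percent(sample_response))  # 87
```

Disqus and other partners plug this same score into their own moderation rules, which is why a high reading can quietly bury a comment.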
Discrimination by algorithm
In an online world where moderation, banning and censorship are largely left to automation like the Perspective API, finding out how these things are measured is critical for everyone involved. "Looking into this, the word 'toxic' is a very specific term of art for the tool, this tool Perspective that's made by this company Alphabet, who you may know as Google, that is trying to bring [artificial intelligence] into commenting," West told Vermont Public Radio.
Perspective presents itself as a way to improve conversations online, positing that the "threat of abuse and harassment online means that many people stop expressing themselves and give up on seeking different opinions." It's one of the many "make the world safer" Jigsaw projects.
Jigsaw worked with The New York Times and Wikipedia to develop Perspective. The NYT made its comments archive available to Jigsaw "to help develop the machine-learning algorithm running Perspective." Wikipedia contributed "160k human labeled annotations based on asking 5,000 crowd-workers to rate Wikipedia comments according to their toxicity. ... Each comment was rated by 10 crowd-workers."
A February article about Perspective elaborated on the human-trained machine-learning process behind what aims to become the world's measuring tool for harmful comments and commenters.
"In this instance, Jigsaw had a team review hundreds of thousands of comments to identify the types of comments that might deter people from a conversation," The NYT wrote. "Based on that data, Perspective provided a score from zero to 100 on how similar the new comments are to the ones identified as toxic."
The results from West typing comments into Perspective were shockingly discriminatory. Identifying as black and/or gay was deemed toxic. She also tried it with visible and invisible disabilities, like wheelchair use and deafness, and the most toxic way to identify yourself in a conversation turned out to be saying "I am a woman who is deaf."
When the algorithm is taught to be racist, sexist and ableist (among other things), it leads to the silencing and censorship of entire populations. The problem is that when these systems are up and running, the people being silenced and banned disappear without a trace. Discrimination by algorithm happens in a vacuum.
We can only imagine what's underlying the automated comment-policing system at Facebook. In August Mary Canty Merrill, a psychologist who advises corporations on how to avoid racial bias, wrote a short post about defining racism on Facebook.
Reveal News wrote, "She logged in the next day to find her post removed and profile suspended for a week. A number of her older posts, which also used the 'Dear white people' formulation, had been similarly erased."
Pasting her "Dear white people" post into Perspective returned a toxicity score of 61 percent.
Unless Google anti-diversity creeper James Damore was the project lead for Perspective, it's hard to imagine that the company would greenlight a product that considers identifying as a black gay woman toxic. (Wikipedia, on the other hand, I could imagine.)
It's possible that the tool flags comments containing terms like black, gay and woman as having high potential for abuse or negativity, but that would make Perspective an expensive, overkill wrapper around the equivalent of a Command-F search that demonizes words some people might find upsetting.
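To make that "Command-F equivalent" concrete, here is a toy keyword filter that reproduces the same discriminatory behavior with a few lines of string matching. The blocklist is invented purely for illustration; it is not how Perspective is documented to work.

```python
# A toy "Command-F" filter: flag any comment containing a blocklisted word.
# This blocklist is hypothetical, chosen only to mirror the failures
# described in this article.
FLAGGED_WORDS = {"black", "gay", "woman"}


def naive_flag(comment):
    # Normalize each word (strip punctuation, lowercase) and check overlap.
    words = {w.strip(".,!?\"'").lower() for w in comment.split()}
    return bool(words & FLAGGED_WORDS)


print(naive_flag("I am a gay black woman"))  # True
print(naive_flag("I am a man"))              # False
```

If a machine-learning system's outputs are indistinguishable from this, the training data, not the model's sophistication, is doing the talking.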
Perspective's reach is significant, too. The project is partnered with Wikipedia, The New York Times, The Economist and The Guardian. Abandon all hope, ye gay black women who enter the comments there.
What we've discovered about Perspective doesn't bode well for the future of AI- and algorithm-driven comment measurement and moderation. Nor does it look good for accountability at companies like Google, Facebook and others that rely on automation for moderation.
I think we're all tired of Facebook telling us "it was a bug" and of companies saying "it's not our fault" while pointing at systems like Perspective, despite being complicit in using them. They should be testing these tools against problems like not being able to identify as a gay black woman in a comment thread without risking your ability to comment.
Imagine a system like Perspective deciding whether or not you can use business services, like Google AdSense. Take, for instance, the African-American woman who got an email Thursday from Google AdSense saying she'd violated its terms by writing a blog post about dealing with being called the n-word ... on her own website.
Distressingly, what's also being created is a culture in which we can't even talk about abuse. The implications for speech are huge -- and we're already soaking in it. More so when you consider that the "competition" for something like Perspective is clearly already at work on social-media networks like Facebook, whose own policies around race and neo-Nazi belief systems are deeply skewed against societies that strive for equality, anti-discrimination and human rights.
It's probable that these terms get scored as highly toxic because they're the terms most commonly used in attacks on targeted groups. But the instances mentioned in this article are clear failures, and they show that the efforts of Silicon Valley's ostensible best and brightest have sent AI meant to "improve the conversation" the way of racist soap dispensers and facial-recognition software that can't see black people.
Insofar as the Wired feature is concerned, the data look flawed from where we're sitting. It may just mean that there are more gay black women and sex workers there who are OK with talking about it than there are in Sharpsburg, Georgia. Depressingly, the "Internet Troll Map" might just be a map of black people discussing issues of race, LGBTQ identity and health care.
Which, we hope, is the opposite of what everyone intended.