Twitter says it limited the reach of over 700,000 tweets that violated its policy

It will also apply its moderation labels to more tweets.


Twitter has published an update on how its "Freedom of Speech, Not Reach" moderation approach is working, and according to the company, it has seen some encouraging results. In April, the website started limiting the reach of tweets violating its hateful conduct policy and applying a label to them that reads: "Visibility limited: this tweet may violate Twitter's rules against hateful conduct." Twitter says it has applied the label to more than 700,000 posts since then and has proactively prevented ads from appearing adjacent to that content.

The company also said that the label reduces the reach of a post by 81 percent, thereby effectively limiting the visibility of posts that potentially exhibit hateful conduct. In addition, Twitter revealed in its update that more than one-third of users choose to delete labeled tweets themselves once they've been notified that the posts violated the website's policy, and only four percent of authors have appealed labels.

Because the company now charges for API access, most researchers studying hate speech can't independently verify these claims. Still, Twitter maintains that its approach has been effective so far. In fact, the website is pushing ahead with its plan to expand its labels to cover more types of policy violations. According to its announcement, it will now also label and downrank posts that violate its Abusive Behavior and Violent Speech policies. Tweets that will be labeled in the coming weeks include posts with malicious content targeting individuals, those that encourage others to harass an individual or group of people, those that threaten to inflict physical harm on others, and those that encourage others to commit acts of violence or harm.