Fake news has dominated post-election headlines, and important questions have been asked: Would Hillary have won had almost a million people not read that Pope Francis had endorsed Trump? (Probably not.) Did Facebook take enough action to prevent fake news proliferating on its network? (Definitely not.) But few have asked why these articles were so popular in the first place. Why were so many people duped into clicking these stories?
Earlier this month, BuzzFeed News' Craig Silverman analyzed engagement (likes, comments, shares, etc.) across Facebook and identified the most popular real and fake articles across three distinct periods: February to April, May to July and August to Election Day.
With this analysis, Silverman was able to show that the 20 most popular fake posts were "engaged with" (Facebook's term for likes, shares and so on) 8.71 million times in the lead-up to the election, compared to just 2.97 million times in February to April. Mainstream news showed the opposite pattern: starting at 12.4 million and falling to 7.37 million in the final period -- 1.34 million fewer than the fake news. The overall number of engagements is fairly steady, too, suggesting that, at least to some extent, Facebook users were sharing fake news instead of real stories.
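Those figures can be checked with some quick arithmetic. What follows is a minimal sketch in Python that uses only the numbers quoted above; the variable names are purely illustrative and are not taken from Silverman's dataset.

```python
# Engagements (in millions) for the 20 most popular stories of each type,
# as quoted in the article for the first and last periods.
fake = {"feb_apr": 2.97, "aug_election": 8.71}
mainstream = {"feb_apr": 12.4, "aug_election": 7.37}

# Gap between fake and mainstream engagement in the final period.
gap = round(fake["aug_election"] - mainstream["aug_election"], 2)

# Combined engagement in the first and last periods.
total_early = round(fake["feb_apr"] + mainstream["feb_apr"], 2)
total_late = round(fake["aug_election"] + mainstream["aug_election"], 2)

print(gap)          # 1.34
print(total_early)  # 15.37
print(total_late)   # 16.08
```

The combined total moves from 15.37 million to 16.08 million, a change of under 5 percent, which is the sense in which overall engagement is "fairly steady."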
Last year, a group of researchers from Brazil's Federal University of Minas Gerais and the Qatar Computing Research Institute analyzed 70,000 articles from four major news organizations (BBC News, Daily Mail, Reuters and The New York Times) to measure the correlation between headline sentiment and popularity. Although results varied from publication to publication, the general finding was that the more extreme the emotion in a headline, the more likely it is to be clicked on.
This runs both ways, the group said: "A headline has more chance to be successful if the sentiment expressed in its text is extreme, towards the positive or the negative side. Results suggest that neutral headlines are usually less attractive." You're more likely to click on a story that says "This is the best" or "This is the worst" than "This is quite okay."
Could sentiment analysis explain fake news' popularity? Because Silverman made the statistics he gathered public, I asked the researchers to run the same analysis on all of the stories in the dataset. Their script uses multiple methods (including valence scoring) to determine the positivity (or negativity) of each word in a headline, before giving it a final "sentiment score."
The scale we're using runs from -4 (negative) to +4 (positive). A few examples: Vox's "The smug style in American liberalism" got a -1, while The Guardian's "Millions of ordinary Americans support Donald Trump. Here's why" is one of the more positive, scoring a +0.75. Few stories get higher or lower than a full point away from neutral. The Washington Post's "Max Lucado: Trump doesn't pass the decency test" was one of the most extreme mainstream stories, scoring a -1.37.
The key finding from the team's analysis was that, on average, the fake news headlines were more negative than the mainstream ones in each period. Of course, that just shows "that fake news headlines contain more negative words in the title than real news," research scientist Haewoon Kwak told Engadget.
Inferring meaning from this requires some guesswork, and there are two possible explanations, according to Kwak. Either fake news writers are intentionally doing this for "clickbait" purposes, or they "naturally use more negative words" because of the topics they're writing about. An in-depth qualitative study might show which of the explanations is valid, but given the fake news writers were generally trying to make money from clicks, the former seems likely to be a driving factor.
Looking across the three periods, something else becomes clear: The average sentiment of real news became slightly more negative (from -0.14 to -0.2) while fake news became more positive (from -0.4 to -0.23). "This contradicts with our expectation that fake news became more aggressive over time," said Kwak. Look at the stories themselves, though, and you can see the problem with the sentiment analysis. The fake news headlines in the pre-election period are atypical. Unlike those of traditional news, these headlines are complex, long and often switch viewpoint.
Take the fake story "BREAKING: Hillary Clinton To Be Indicted... Your Prayers Have Been Answered." The first half of this headline is clearly negative; the second half is clearly positive. As the sentiment scoring ascribes an average for the entire headline, the story was given a +0.29 score. There are other examples of this: "Thousands Of Fake Ballot Slips Found Marked For Hillary Clinton! TRUMP WAS RIGHT!" Kwak explained that things like Trump endorsements "could boost the sentiment score slightly to be positive."
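To see why averaging washes out a mixed headline, here is a minimal sketch in Python. It is not the researchers' script: the word list and its valence values are invented for illustration, and the function name `headline_score` is hypothetical. It simply averages the valence of every scored word, so it will not reproduce the +0.29 figure.

```python
import re

# Hypothetical valence lexicon on the -4 (negative) to +4 (positive) scale.
# These values are invented for illustration only.
TOY_LEXICON = {
    "indicted": -3.0,
    "prayers": 2.0,
    "answered": 1.5,
}

def headline_score(headline, lexicon=TOY_LEXICON):
    """Average the valence of every lexicon word found in the headline."""
    words = re.findall(r"[a-z']+", headline.lower())
    valences = [lexicon[w] for w in words if w in lexicon]
    if not valences:
        return 0.0  # no scored words: treat the headline as neutral
    return sum(valences) / len(valences)

first_half = "BREAKING: Hillary Clinton To Be Indicted"
second_half = "Your Prayers Have Been Answered"
full_headline = first_half + "... " + second_half

print(headline_score(first_half))     # strongly negative on its own
print(headline_score(second_half))    # clearly positive on its own
print(headline_score(full_headline))  # the two halves largely cancel out
```

With these made-up values, the first half scores -3.0 and the second +1.75, yet the full headline lands just above zero. A score near neutral can therefore hide two strong emotions pulling in opposite directions.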
This mixture -- very positive with very negative -- is something of an anomaly and would require a new method of sentiment analysis to really dig into. But, it would appear, it does make for a very clickable headline. Kwak also noted that the dataset mixes headlines about Trump and Clinton. "If there were separate datasets for each candidate, it may bring a more interesting result."
Sentiment analysis can't be used to show whether news is true or false, but it does show the way headline writers, like marketers, manipulate our emotions to inspire us to click, like or share a story.