It is a truth universally acknowledged that people are often wrong on the internet. This can manifest itself in the form of conspiracy theories, inaccurate information related to breaking news, or misleading (or just plain wrong) information related to science and research. Sometimes inaccurate information is annoying, or even comical. Sometimes, however, inaccurate information can have serious consequences – such as online memes that mislead people about public health issues or when news reports say that an innocent person is the perpetrator of a mass shooting.
The internet is an incredibly valuable tool. It gives us efficient access to a staggering amount of information; it allows us to communicate with each other quickly and easily. But the same features that provide those perks also allow us to access a staggering amount of misinformation, and share it with each other just as easily.
Communication researchers want to better understand the dynamics of how this works: how do we find and share inaccurate information?
A team of researchers in the U.S. and China is developing a tool that may help us better understand the dynamics of both how people share fake news and how people share information that corrects that fake news. Is it still in development? Yes. Will this only scratch the surface of what is clearly a huge and complex problem? Also yes. But I think the project is interesting, so I wanted to write a little about it here.
At issue is a tool called Hoaxy, which aims to be “a platform for the collection, detection and analysis of online misinformation and its related fact-checking efforts,” according to a proof-of-concept pre-print paper about the tool. The pre-print, “Hoaxy: A Platform for Tracking Online Misinformation,” was authored by Chengcheng Shao of China’s National University of Defense Technology and Giovanni Ciampaglia, Alessandro Flammini and Filippo Menczer of Indiana University; it was posted on arXiv March 4. (“Pre-print,” by the way, means “not peer reviewed,” so caveat emptor.) Ciampaglia and Flammini have collaborated on previous work related to fact checking, which is interesting in its own right. (Open-access paper on that here; write-up of that work here.) [UPDATE, March 21: Menczer contacted me to let me know the paper has been peer-reviewed and is in press in the Proceedings of the ACM WWW Conference. The paper will be presented at a workshop there (see update at bottom of page). Also, Menczer notes that Shao is currently also a visiting scholar at Indiana University.]
“A scientific understanding of the dynamics of the Web is increasingly critical, and the dynamics of online news consumption exemplify this need, as the risk of massive uncontrolled misinformation grows,” the paper states. “Our upcoming Hoaxy platform for the automatic tracking of online misinformation may provide an important tool for the study of these phenomena.”
“The goal,” the authors write, “is to let researchers, journalists, and the general public monitor the production of online misinformation and its related fact checking.”
The paper is only six pages long, and is freely available, so I won’t go into the details of how it works (you can read it yourself). Suffice it to say that it collects data from news websites and social media. From the news sites, it can track the evolution of fake news stories and corrective, fact-checking news stories. From social media, it tracks how those stories are disseminated online.
As part of their proof-of-concept, the researchers did a preliminary evaluation on Twitter traffic from Oct. 14, 2015 to Jan. 24, 2016. They collected tweets containing URLs from fake news sites, as designated by fakenewswatch.com (removing tweets with URLs from The Onion and other satirical sites). They also collected tweets containing URLs from fact-checking sites, like Snopes and FactCheck.org.
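To make that sorting step concrete, here is a minimal sketch of how tweeted URLs could be bucketed by the site they point to. This is not the researchers' code, and the paper may do it quite differently. The function names are my own, the fake news domain is a made-up placeholder (the real list came from fakenewswatch.com), and the matching is a simple exact match on the host name.

```python
from urllib.parse import urlparse

# Illustrative lists only. The fake news entry is a hypothetical placeholder,
# not a site named in the paper.
FAKE_NEWS_DOMAINS = {"fake-news-example.com"}
SATIRE_DOMAINS = {"theonion.com"}  # satire is excluded from the sample
FACT_CHECK_DOMAINS = {"snopes.com", "factcheck.org"}


def domain_of(url):
    """Return the lower-cased host of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def classify_url(url):
    """Label a URL 'fake_news' or 'fact_check', or None if it is neither."""
    domain = domain_of(url)
    if domain in SATIRE_DOMAINS:
        return None  # drop satirical sites, as the researchers did
    if domain in FAKE_NEWS_DOMAINS:
        return "fake_news"
    if domain in FACT_CHECK_DOMAINS:
        return "fact_check"
    return None


if __name__ == "__main__":
    for url in ("http://www.snopes.com/some-claim",
                "http://fake-news-example.com/shocking-story",
                "http://www.theonion.com/article"):
        print(url, "->", classify_url(url))
```

A real pipeline would also have to expand shortened links before matching, since many tweeted URLs go through a link shortener.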
What they found highlights the extent to which misinformation outweighs fact-checking online (or at least on one social media platform).
There were 1,287,769 fake news tweets from 171,035 users linking to a total of 96,400 different URLs in the three-month data sample. By comparison, there were only 154,526 fact-checking tweets from 78,624 users linking to 11,183 different URLs during the same time period.
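For a sense of scale, the arithmetic on those figures can be sketched in a few lines. The numbers are the ones reported above; the variable names and the choice of ratios are mine, not the paper's.

```python
# Counts reported for the sample (Oct. 14, 2015 to Jan. 24, 2016).
fake = {"tweets": 1_287_769, "users": 171_035, "urls": 96_400}
fact = {"tweets": 154_526, "users": 78_624, "urls": 11_183}

# How many times larger each fake news count is than its fact-checking counterpart.
ratios = {key: fake[key] / fact[key] for key in fake}

# Average number of tweets per user in each group.
fake_tweets_per_user = fake["tweets"] / fake["users"]
fact_tweets_per_user = fact["tweets"] / fact["users"]

if __name__ == "__main__":
    for key, value in ratios.items():
        print(f"{key}: {value:.1f}x")
    print(f"fake news tweets per user: {fake_tweets_per_user:.1f}")
    print(f"fact-checking tweets per user: {fact_tweets_per_user:.1f}")
```

By this count, fake news tweets outnumbered fact-checking tweets by roughly 8.3 to 1, and the average fake news sharer posted about 7.5 such tweets, compared with about 2 for the average fact-check sharer.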
Granted, many of the fake news sites may have a smaller readership than, say, Snopes, and one article on FactCheck.org may address misinformation found on multiple sites or URLs. But there is still an alarming disparity between fake news and stories designed to correct the record (and give people accurate information).
The researchers also examined how fake news and fact-checking were disseminated via Twitter. They report that their analysis strongly suggests “rumor-mongering is dominated by few very active accounts that bear the brunt of the promotion and spreading of misinformation, whereas the propagation of fact checking is a more distributed, grass-roots activity.” (Again, read the paper for details.)
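One simple way to picture "dominated by few very active accounts" is to ask what share of all tweets comes from the most active slice of users. The sketch below is my own illustration, not the measure used in the paper, and the account lists are toy data invented for the example.

```python
from collections import Counter


def top_share(tweet_authors, top_fraction=0.1):
    """Fraction of all tweets posted by the most active `top_fraction` of accounts."""
    counts = Counter(tweet_authors)
    if not counts:
        return 0.0
    ranked = sorted(counts.values(), reverse=True)
    k = max(1, round(len(ranked) * top_fraction))  # always include at least one account
    return sum(ranked[:k]) / sum(ranked)


# Toy data (hypothetical): one author name per tweet.
concentrated = ["a"] * 16 + ["b", "c", "d", "e"]  # one account posts most tweets
distributed = ["a", "b", "c", "d", "e"] * 4       # every account posts equally

if __name__ == "__main__":
    print(top_share(concentrated, top_fraction=0.2))
    print(top_share(distributed, top_fraction=0.2))
```

In the toy data, the single busiest account out of five produces 80 percent of the tweets in the concentrated case and 20 percent in the evenly spread case.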
At this point there is no clear answer to the “So what?” question. But I find the idea of tools like Hoaxy incredibly interesting. I’ll be following Hoaxy closely, and would like to know about other, similar projects (so, if you are aware of any, please let me know).
The paper does say the researchers plan to expand their analysis to “a larger set of news stories and investigate how the lag between misinformation and fact checks varies for different types of news.” I hope science or research news is on that list.
Sometimes I write about questions that have no clear answers. In this case, I’m writing about one project that hopes to help us find some answers. It’s early days, but I’m intrigued.
I’ve contacted two of the study authors with questions about when and where the work may be published or presented, as well as when they anticipate rolling out a functional iteration of Hoaxy for public use. I’ll let you know what I hear (see update, below).
UPDATE, March 21, 2016: I heard back from Giovanni Ciampaglia today. The paper on Hoaxy will be presented at the Social News On the Web workshop April 12 in Montreal. As for the future of Hoaxy, Ciampaglia says, “We don’t have a release date for Hoaxy yet; we are currently very busy with the upcoming release of the social media observatory, a related project about meme diffusion in general. I’ll be happy to keep you posted as soon as we have more updates about either of those.”