Science Blogging and Citations

Photo credit: North Carolina State University

Paige Brown Jarreau, author of the SciLogs blog From The Lab Bench, recently wrote a lengthy post on the science of science blogging. The post included a long list of related journal articles, and one of them caught my eye: “Do blog citations correlate with a higher number of future citations?” With Paige’s blessing, I decided to unpack that particular paper a bit.

The full title of the paper is “Do blog citations correlate with a higher number of future citations? Research blogs as a potential source for alternative metrics,” and it was published earlier this year in the Journal of the Association for Information Science and Technology (full citation below). [Note: I’ll be giving an overview of the findings here, but I encourage you to read the paper itself. For example, the paper’s authors talk quite a bit about altmetrics, which I won’t be going into.]

So, is there a correlation between science blogging and journal citations? Yes.

The Background

The authors of the paper, Shema et al., wanted to determine whether journal articles that were blogged about shortly after publication generated more citations than other articles from the same journal that were not written about in blogs.

They looked at research blogs and focused on posts that included academic citations from the same year as the post. Specifically, Shema et al. looked at blog posts published in 2009 that were about journal articles published in 2009, and posts published in 2010 about articles from 2010. This came to 4,013 posts for 2009 and 6,116 posts for 2010.

The researchers then further limited their sample by looking solely at journals that had 20 or more articles blogged about in a given year. This limited the sample to blog posts about 887 articles in 12 journals for 2009 and posts about 1,394 articles in 19 journals for 2010.

The researchers then looked at the median number of citations for all of the articles in a journal over a three-year period, as well as the median number of citations for blogged-about articles in each journal over a three-year period (2009-11 for 2009 papers, 2010-12 for 2010 papers).
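That per-journal comparison boils down to computing two medians. Here’s a minimal sketch in Python using the standard library; the citation counts are made up for illustration and are not data from the paper:

```python
from statistics import median

# Hypothetical citation counts for one journal -- NOT data from the paper.
blogged = [172, 95, 210, 61, 140]       # articles covered by blog posts
not_blogged = [56, 40, 75, 22, 64, 58]  # the journal's other articles

print(median(blogged))      # -> 140
print(median(not_blogged))  # -> 57.0
```

The study did this for every qualifying journal, then compared the blogged-about median against the journal-wide median.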


The median number of citations was higher for 10 of the 12 journals in 2009, and much higher for the New England Journal of Medicine (NEJM), in which blogged-about articles had a median of 172 citations, as opposed to a median of 56 citations for papers that hadn’t been written about in blogs.

The numbers for 2010 were even more pronounced. The median number of citations was higher for 16 of the 19 journals. Again, NEJM led the way, with blogged-about articles garnering a median of 138 citations, versus an overall journal median of 51.

Null Hypothesis and P-Values

Some of you may know this already, but bear with me. In statistical analysis, the “null hypothesis” holds that there is no statistical relationship between two things. It’s basically the default. For example, the null hypothesis for this paper would be that there is no relationship between science blogging and journal citations.

To test the null hypothesis, researchers use statistical tests (in this case the Mann-Whitney test) to calculate a p-value: the probability of seeing results at least as extreme as the observed ones if the null hypothesis were true. P-values range from zero to 1. If the p-value is really low (conventionally, less than 0.05), researchers reject the null hypothesis (i.e., there probably IS a relationship between blogging and citations!). But if the p-value is higher than that, the data don’t give us grounds to reject the null hypothesis (i.e., we can’t conclude that blogging and citations are related). Note that a p-value doesn’t prove the null hypothesis true or false; it just tells us how surprising the data would be if it were true.
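To make the mechanics concrete, here’s a rough pure-Python sketch of the Mann-Whitney U test using the normal approximation. It’s illustrative only: it omits the tie correction to the variance that real statistical packages apply, and it is not the authors’ actual code. The citation counts at the bottom are invented.

```python
import math

def mann_whitney_u(a, b):
    """Mann-Whitney U statistic and a two-sided p-value
    (normal approximation, no tie correction)."""
    n1, n2 = len(a), len(b)
    combined = [(v, 0) for v in a] + [(v, 1) for v in b]
    combined.sort(key=lambda t: t[0])

    # Assign 1-based ranks, averaging ranks across tied values.
    ranks = [0.0] * (n1 + n2)
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2          # average of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        i = j

    # U statistic from the rank sum of the first sample.
    r1 = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)

    # Normal approximation for the p-value.
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))    # two-sided tail probability
    return u, p

# Hypothetical citation counts -- NOT data from the paper.
u, p = mann_whitney_u([172, 95, 210, 61, 140], [56, 40, 75, 22, 64, 58])
print(f"U = {u}, p = {p:.3f}")  # a small p suggests rejecting the null
```

In practice you’d use something like SciPy’s `scipy.stats.mannwhitneyu`, which handles tie corrections and exact small-sample p-values properly.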

Why am I telling you all this? Because the p-values in this paper are pretty interesting.

For the 2009 papers, seven of the 12 journals (58 percent) had low p-values, and six of those had really low p-values, less than 0.01. That means the citation advantage for blogged-about papers in those seven journals was statistically significant: those papers were likely to receive more citations than papers in the same journals that were not blogged about.

And the difference was even more pronounced for the 2010 papers, in which 13 of the 19 journals (68 percent) had low p-values (10 of them below 0.01).

So What?

Some bloggers (including me) write about papers without talking to the relevant researchers. But others seek out the researchers who did the work. They have questions. But researchers often don’t want to talk to mainstream reporters, much less bloggers they’ve never heard of.

One take-home message from this paper is that researchers might be well served to answer a blogger’s questions about the work – the resulting blog post may help boost citations for the relevant journal article.

The second take-home message is for bloggers: you’re not working in a void. This paper indicates that blogging may be more than idle entertainment. When you write about a paper that captures your interest, you may be helping to capture the interest of other researchers in related fields – drawing attention to the paper, contributing to discussion in the field, and ultimately helping to increase citations for the paper in question. The research shows correlation, not causation, but I find it heartening all the same.

It’s also worth noting that there was a big jump in the number of journals with 20+ articles being blogged about between 2009 and 2010. There was also an increase in the percentage of journals in which blogged-about papers outperformed the journal average, and a higher percentage of journals with low p-values.

I suspect that this is because, while it is not a new phenomenon, science blogging has grown significantly in recent years in terms of the number of blogs, visibility, and readership. I would be curious to see what it would look like if we applied the methodology in the Shema et al. paper to the years 2011 and 2012. Would we see a trend that shows a strengthening correlation between blogged-about papers and citations? Or would the growing number of science blogs “dilute” their potential impact?

Maybe someone will do that study (maybe they’re doing it now!). If so, I can’t wait to read it.


Shema, H., Bar-Ilan, J., & Thelwall, M. (2014). “Do blog citations correlate with a higher number of future citations? Research blogs as a potential source for alternative metrics.” Journal of the Association for Information Science and Technology, 65(5), 1018-1027. DOI: 10.1002/asi.23037


4 thoughts on “Science Blogging and Citations”

  1. Pingback: Morsels For The Mind – 31/10/2014 › Six Incredible Things Before Breakfast

  2. JL

    Correlation isn’t causation. Couldn’t it be that bloggers are simply more likely to write about papers that are more likely to also be liked (and cited) by other researchers? Need an RCT.


  3. You’re absolutely right. And the idea you’re referring to is called “the earmark hypothesis.” But I’m not sure if it would even be possible to test for causation. (If there’s a study design that would work, I’d love for someone to do that study!)

    In the one similar(ish) study that I’ve seen that gets close to addressing causation (Phillips, et al., 1991), the earmark hypothesis didn’t hold water. But that’s only one study, it dealt with newspapers (not blogs), it’s dated as heck, and it was only possible due to a pretty random circumstance that is unlikely to be repeated (a printer’s strike).


  4. Pingback: Don’t Academics Already Know Why Social Media is Important? › The Leap
