Reproducibility of Research — Do We Have a Problem, Houston?


The Column

The Column-01-19-2016
Volume 12
Issue 1
Pages: 12–14

Incognito talks about reproducibility in research.

Photo Credit: zoran micetic/Getty Images

I’ve been trying this week to reproduce some experiments from a paper by a well-known research group, and whilst I have results (finally), they appear to point to a very different conclusion from that drawn by the paper’s authors. My aim was to start by reproducing the results from the paper and then to adapt the methodology to use a different sampling technique to improve the sensitivity of the method - a situation that, I’m sure, arises globally on a daily basis. However, as I have found several times in my career, I was unable to reproduce the original experiments and therefore unable to validate my starting position for the new experiments.

So, one of two things is true: either the original research and data are flawed, or I am not capable of replicating that data because of flaws in my own experimentation.

I’m unable to tell which is true here - but I do know that I have wasted a couple of days’ work. The original paper was difficult to follow, with what I thought to be several key variables and pieces of methodological information missing. I’m not blaming the paper for this - the issues could well be with my own work - but I’m still cross, whoever is to blame.

So cross, in fact, that I went back to re-read an excellent recent edition of Nature on the issue of reproducibility in scientific publications.1 By reproducibility I mean not the statistical measure of repeatability, but rather the ability of another group to repeat and substantiate the work of the originators.

In modern research and development, it’s all too easy to jump to conclusions and find patterns in what may otherwise be considered to be random data, as so often we have a vested interest in the data - a PhD thesis, tenure, further funding, advancing a commercial project, maintaining the reputation of the department (academic or industrial), kudos, etc. This makes us sound like a thoroughly unscrupulous lot, but that’s not what I’m alleging.

Psychologist Brian Nosek of the non-profit Center for Open Science in Charlottesville, Virginia, USA, which works to increase the reproducibility of scientific research, puts it much better than I can: “As a researcher, I’m not trying to produce misleading results, but I do have a stake in the outcome.”2 I’m suggesting, then, that we might be predisposed to certain outcomes in our work, and that this leads to actions and decisions that do not give a true reflection of what the data is telling us, or to choosing those experiments that substantiate our theories to the exclusion of other, more rigorous, experiments.

So, what can be done, and what does Nature tell us about what analytical science can do to avoid fooling itself and wasting time?

Well, first of all, science has always operated on the postulation of a theory or conclusion from experimentation, which has then been repeated and validated or refuted by other groups who will go on to expand upon the original work or postulate an alternative theory. Fine - that’s how things work, but there have been some shocking failures to reproduce academic research on a large scale, which brings into question how much time and resource is wasted producing meaningless data that does not advance science and may in fact even be holding up the good research.

In 2012, researchers at biotechnology firm Amgen in Thousand Oaks, California, USA, reported that they could replicate only 6 out of 53 landmark studies in oncology and haematology.3 In 2009, workers at the Meta-Research Innovation Center at Stanford University in Palo Alto, California, described how they had been able to fully reproduce only 2 out of 18 microarray-based gene–expression studies.4
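To put those two figures on a common footing, here is a minimal sketch that converts the counts quoted above into replication rates. The Wilson score interval is my own addition, used only to show how much uncertainty sits around rates taken from such small samples; it does not come from either of the cited studies, and the function name and study labels are purely illustrative.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (z = 1.96)."""
    p = successes / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return centre - half_width, centre + half_width

# Counts as quoted in the text: (studies reproduced, studies attempted)
studies = {
    "Oncology and haematology landmark studies": (6, 53),
    "Microarray-based gene-expression studies": (2, 18),
}

for name, (reproduced, attempted) in studies.items():
    low, high = wilson_interval(reproduced, attempted)
    rate = reproduced / attempted
    print(f"{name}: {reproduced}/{attempted} = {rate:.1%} "
          f"(approx. 95% interval {low:.1%} to {high:.1%})")
```

Both rates come out at roughly one in nine, although with so few studies in each set the intervals around them are wide.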

So, what can be done?

Be More Rigorous with Academic Publications

Here’s the advice to authors on Materials and Methods in a leading academic journal:5 “Provide sufficient detail to allow the work to be reproduced.” That’s it. Enough? I really don’t think so. We need checklists to ensure even the most esoteric details are addressed so that reproduction is possible. Checklists would include minor experimental details, experimental design, method performance via statistical analysis, etc. Is there a restriction on space or content for method details? If so - abolish it. What about electronic submission of raw data? In review - blind the authors’ names and institutions to avoid bias or deference.

How about replicating papers prior to publication? I can hear the gasps from readers already, but why not? Set aside funding, allow publication of replication studies (more publications for sound science), give replicated publications more research credits, establish replication groups. Where advanced equipment is used, allow replicators to take charge of the original equipment to repeat the experimentation. Non-replication needn’t necessarily disbar work from publication, but some notification of the failure to reproduce should be made. This approach would foster and encourage collaboration, introduce rigour, and help to underpin good science. The most I ever learn about our science is when troubleshooting issues that arise from method transfers to client laboratories - would this be any different?


Recognize Cognitive Bias and Build in Safety Measures

There are many ways in which we can fool ourselves or have underlying bias in our work. Even the most ethical of researchers are susceptible to self-deception; outlined below are a few of the reasons why:

Hypothesis Myopia - A natural inclination to favour only one hypothesis and look for evidence to support it, whilst playing down evidence against it and being reluctant to adjust or propose more than one hypothesis.

Sharpshooter - Fire off a random series of shots, then draw a target around the bullet holes to ensure the highest number of bullseyes. In practice, this means getting some encouragement from your on-going experimental data and deciding that this must be the correct path to go down, without realizing that the data could actually support many conclusions other than the one you are drawing.

Asymmetric Attention (Disconfirmation Bias) - Giving the expected results smiling approval, whilst unexpected results are blamed on experimental procedure or error rather than being accepted as a true challenge to your hypothesis.

Just-So Storytelling - Finding rational explanations to fit the data after the fact. The problem is, we can find a story to fit just about every type of data - it doesn’t mean to say the story is true! Also known as JARKing - “justifying after the results are known” - because it’s really difficult to go back and start again once we are at the end of the process.

The Ikea Effect - Everyone has a vested interest in loving the furniture they built themselves. Is it the same with our analytical data?
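The Sharpshooter trap in particular is easy to demonstrate with numbers. The sketch below is an illustration of my own, not something taken from the Nature articles: it generates a random "outcome" and a set of equally random "variables", then reports the strongest correlation found among them. The sample size, number of variables, seed, and function names are all arbitrary assumptions.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def best_noise_correlation(n_samples=20, n_variables=50, seed=1):
    """Correlate many pure-noise variables with a pure-noise outcome.

    Returns the largest and the average absolute correlation found.
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_samples)]
    correlations = []
    for _ in range(n_variables):
        noise = [rng.gauss(0, 1) for _ in range(n_samples)]
        correlations.append(abs(pearson(noise, outcome)))
    return max(correlations), sum(correlations) / len(correlations)

best, average = best_noise_correlation()
print(f"Strongest |r| among the noise variables: {best:.2f}")
print(f"Average |r| across all noise variables:  {average:.2f}")
```

Nothing in these data is real, yet the best of many comparisons will typically look far more convincing than the average one. Drawing the target around that single result is exactly the mistake described above.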

So what strategies might we employ to overcome these innate biases?

Strong Inference Techniques - Develop opposing or competing hypotheses and develop experiments to distinguish which is correct. Not having a favourite child avoids Hypothesis Myopia and cuts down on the need for Just-So Storytelling.

Open Science - Publication of methods and raw data for various groups to scrutinize. Even more radical - publish and seek approval or revision of research methods and design prior to any practical work, perhaps with an “in-principle” promise of publication. It’s not really any different to having a methods scrutiny panel or a good laboratory manager in industry!

Adversarial Collaboration - Invite your rivals to work with you, or on a competing hypothesis, which will ultimately result in a joint publication. For all of you thinking, “They will never agree on enough results or conclusions to write a joint paper”, ask yourself if it’s healthy to have beliefs so entrenched that pure experimentation and data analysis cannot be used to prove or disprove either side’s argument. These entrenched beliefs are prevalent and deeply unhealthy.

Blinding - There are many ways to “blind” our data analysis. Mixing up columns in databases, introducing dummy or biased data, and mixing up data into different sets are all possible. Once the data have been analyzed, the blinds are lifted to see if the conclusions remain valid. The obvious example is the use of “placebo” medications in clinical trials or bioanalytical work, where the placebo subjects are not revealed until after the data analysis.
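As a rough sketch of how the blinding idea above might look in practice, the example below swaps real group labels for neutral codes, summarizes the data by code only, and lifts the blind at the end. The function names, the group labels, and the numbers are all hypothetical, and a real study would need a more carefully controlled procedure than this (for example, having the key held by someone other than the analyst).

```python
import random
import statistics

def blind_labels(labels, seed=None):
    """Replace real group labels with neutral codes.

    Returns the coded labels and the key, which should stay sealed
    until the analysis is finished.
    """
    rng = random.Random(seed)
    groups = sorted(set(labels))
    codes = [f"group_{i}" for i in range(len(groups))]
    rng.shuffle(codes)
    key = dict(zip(groups, codes))
    return [key[g] for g in labels], key

def summarize(values, coded_labels):
    """Mean per coded group, computed without knowing which group is which."""
    by_code = {}
    for value, code in zip(values, coded_labels):
        by_code.setdefault(code, []).append(value)
    return {code: statistics.mean(vals) for code, vals in by_code.items()}

def unblind(summary, key):
    """Map coded results back to the real group names."""
    reverse = {code: group for group, code in key.items()}
    return {reverse[code]: mean for code, mean in summary.items()}

# Hypothetical illustrative numbers, not real data
labels = ["placebo", "treated", "placebo", "treated", "placebo", "treated"]
values = [1.0, 2.0, 1.2, 2.2, 0.8, 1.8]

coded, key = blind_labels(labels, seed=0)
blinded_summary = summarize(values, coded)
print("Blinded:", blinded_summary)                   # analyst sees only the codes
print("Unblinded:", unblind(blinded_summary, key))   # revealed after the analysis
```

The point of the design is that every analytical decision is made while only the codes are visible, so the conclusions cannot be nudged towards the expected group.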

I feel that writing a clever conclusion to this piece will do nothing other than detract from the thoughts and theories contained within. I’d just ask you to consider if you recognize, in your own work, any of the problems or biases presented and if you agree or disagree with the proposed solutions. Just having you think about the issue of reproducibility is a good start; if this piece then allows you to adopt measures to avoid any of the problems, then even better.

Oh, and by the way, these issues don’t just apply to research in an academic environment!


  1. Nature 526(7572), 164–286 (2015).
  2. Regina Nuzzo, Nature:
  3. C.G. Begley and L.M. Ellis, Nature 483, 531–533 (2012).
  4. J.P.A. Ioannidis et al., Nature Genet. 41, 149–155 (2009).

Contact author: Incognito

