It is curious: in the era of big data, an alarming amount of research is flawed. The reason? It is simply too easy to generate statistical evidence for pretty much anything(1). Can we do something about it? This article in New Scientist is a good review.
Don’t tell me you have not noticed it before: if you think that wine is good for you, you are right; there is a research paper showing that. Oh, you don’t agree? You think wine is bad for your health? No problem, there is a paper for that too. There is always a paper for that!
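How easy is it, exactly? Simmons et al.(1) call the mechanism “researcher degrees of freedom”: test many outcomes, subgroups or stopping rules, and report only what crosses p < 0.05. Here is a minimal simulation of that idea; the 20-outcomes-per-study setup is an illustrative assumption, not the paper’s actual protocol.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

n_studies = 1000   # simulated "research projects"
n_outcomes = 20    # outcomes tried per project (illustrative assumption)
n_subjects = 30    # subjects per group

false_positives = 0
for _ in range(n_studies):
    # Null world: treatment and control come from the same distribution,
    # so any "effect" detected below is pure noise.
    p_values = []
    for _ in range(n_outcomes):
        treatment = rng.normal(0, 1, n_subjects)
        control = rng.normal(0, 1, n_subjects)
        _, p = stats.ttest_ind(treatment, control)
        p_values.append(p)
    # The p-hacking step: report only the best-looking outcome.
    if min(p_values) < 0.05:
        false_positives += 1

print(f"Projects with a 'significant' finding: {false_positives / n_studies:.0%}")
# Roughly 1 - 0.95**20 ≈ 64%, versus the nominal 5% error rate.
```

Even though every “effect” in this toy world is pure noise, about two thirds of the simulated projects can report a significant finding.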
Last year a paper(2) in Science described a major effort to replicate 100 psychology experiments published in top journals, and found that the success rate was little more than a third. There is growing alarm about results that cannot be reproduced. Psychology has led the controversy so far, but other fields of science, like biomedicine, seem to be falling into the same reproducibility crisis, one that could undermine medical advances.
In FiveThirtyEight, Christie Aschwanden argues that the replication crisis is a sign that science is working. She thinks that science isn’t broken; it’s just hard.
Years ago, someone asked John Maddox how much of what his prestigious science journal Nature printed was wrong. “All of it,” the renowned editor quickly replied. “That’s what science is about — new knowledge constantly arriving to correct the old.”
Yes, failure is what moves science forward (funerals aside), but beware: no research paper can ever be considered the final word, and there are too many that do not stand up to further study.
John Ioannidis was one of the first to sound the alarm, back in 2005, and he is now leading a project focused on transforming research practices to improve the quality of scientific studies in biomedicine and beyond. His paper(3) “Why Most Published Research Findings Are False” is one of the most downloaded technical papers from PLoS Medicine. In it, he explains that research findings are less likely to be true (a short numerical sketch follows the list):
- When the studies conducted in a field are smaller;
- When effect sizes are smaller;
- When there is a greater number and lesser preselection of tested relationships;
- When there is greater flexibility in designs, definitions, outcomes, and analytical modes;
- When there is greater financial and other interest and prejudice; and
- When more teams are involved in a scientific field in the chase for statistical significance.
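The logic behind these corollaries can be made concrete with the positive predictive value (PPV) formula from the paper(3): if R is the pre-study odds that a tested relationship is true, α the significance level and 1 − β the statistical power, then PPV = (1 − β)R / (R − βR + α). A minimal sketch in Python, where the scenario numbers are illustrative assumptions rather than figures from the paper:

```python
def ppv(R, alpha=0.05, power=0.8):
    """Positive predictive value of a 'significant' finding (Ioannidis, 2005).

    R     -- pre-study odds that a tested relationship is true
    alpha -- type I error rate (significance threshold)
    power -- 1 - beta, probability of detecting a true effect
    """
    beta = 1 - power
    return (1 - beta) * R / (R - beta * R + alpha)

# Well-powered confirmatory test of a plausible hypothesis:
print(ppv(R=1.0, power=0.8))    # ≈ 0.94: most findings are true
# Exploratory search, 1 true relationship per 100 tested, low power:
print(ppv(R=0.01, power=0.2))   # ≈ 0.04: most "findings" are false
```

Small pre-study odds and low power drag the PPV far below one half, which is exactly the regime the corollaries describe.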
In 2008, Chris Anderson announced the end of theory. He quoted Peter Norvig, Director of Research at Google, paraphrasing George Box’s maxim (“All models are wrong, and increasingly you can succeed without them”) to conclude that:
Petabytes allow us to say: “Correlation is enough.” We can stop looking for models.
The problem with correlation is something economists know very well, especially those under pressure to bring evidence to politically charged debates or, even worse, to predetermined designs. As Ronald Coase said: “If you torture the data long enough, it will confess.” Imagine what Anderson’s petabytes can confess nowadays, and the many perverse incentives that may guide unscrupulous data torturers.
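A quick illustration of why “correlation is enough” becomes dangerous at scale: scan enough unrelated variables and some pair will correlate impressively by chance alone. The dataset below is pure noise, and its dimensions are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# 200 completely unrelated random "variables", 50 observations each.
n_vars, n_obs = 200, 50
data = rng.normal(size=(n_vars, n_obs))

# Correlate every pair and keep the strongest relationship found.
corr = np.corrcoef(data)
np.fill_diagonal(corr, 0.0)   # ignore trivial self-correlations
i, j = np.unravel_index(np.abs(corr).argmax(), corr.shape)

print(f"Best 'discovery': variables {i} and {j}, r = {corr[i, j]:.2f}")
# With ~20,000 pairs tested, |r| around 0.5 routinely emerges from noise.
```

Nothing stops an unscrupulous analyst from publishing that “discovery” with a plausible story attached.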
How can we fight against them? The recipe may sound naive: transparency and openness. It is not simple, of course, but there is a wide consensus on a minimum set of best practices(4). Some scientific fields have a strong tradition of sharing data and using common databases (e.g. those built around big telescopes or high-energy physics experiments), but the academic reward system does not sufficiently incentivize open practices.
The situation is a classic collective action problem: individual researchers lack strong incentives to be more transparent, even though the credibility of science as a whole would benefit if everyone were. Maybe it is time to explore (not so) new collective action management tools!
____________________
(1) Simmons, J., Nelson, L., & Simonsohn, U. (2011). False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant. Psychological Science, 22(11), 1359-1366. DOI: 10.1177/0956797611417632
(2) Open Science Collaboration (2015). Estimating the reproducibility of psychological science. Science, 349(6251). DOI: 10.1126/science.aac4716
(3) Ioannidis, J. (2005). Why Most Published Research Findings Are False. PLoS Medicine, 2(8). DOI: 10.1371/journal.pmed.0020124
(4) Nosek, B., Alter, G., Banks, G., Borsboom, D., Bowman, S., Breckler, S., Buck, S., Chambers, C., Chin, G., Christensen, G., Contestabile, M., Dafoe, A., Eich, E., Freese, J., Glennerster, R., Goroff, D., Green, D., Hesse, B., Humphreys, M., Ishiyama, J., Karlan, D., Kraut, A., Lupia, A., Mabry, P., Madon, T., Malhotra, N., Mayo-Wilson, E., McNutt, M., Miguel, E., Paluck, E., Simonsohn, U., Soderberg, C., Spellman, B., Turitto, J., VandenBos, G., Vazire, S., Wagenmakers, E., Wilson, R., & Yarkoni, T. (2015). Promoting an open research culture. Science, 348(6242), 1422-1425. DOI: 10.1126/science.aab2374
Featured Image: SMU’s Assistant Professor of Finance Gennaro Bernile (right) receiving the Ig Nobel Prize with his co-researcher, Professor Raghavendra Rau, during the 2015 Prize Ceremony at Harvard University