It is curious: in the era of big data, an alarming amount of research is flawed. The reason? It is simply too easy to generate statistical evidence for pretty much anything(1). Can we do something about it? This article in New Scientist is a good review.
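How easy is "too easy"? A toy simulation (not from the article) makes the point: run enough studies of a nonexistent effect and some will look significant purely by chance. The sketch below runs 200 "studies" comparing two groups of pure noise and counts how many clear the conventional significance bar.

```python
import random

random.seed(0)

def t_stat(a, b):
    """Two-sample t statistic (equal variances), computed by hand."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    sp = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)  # pooled variance
    return (ma - mb) / (sp * (1 / na + 1 / nb)) ** 0.5

# 200 "studies" where the true effect is exactly zero.
false_positives = 0
for _ in range(200):
    group_a = [random.gauss(0, 1) for _ in range(30)]
    group_b = [random.gauss(0, 1) for _ in range(30)]
    if abs(t_stat(group_a, group_b)) > 2.0:  # roughly p < 0.05
        false_positives += 1

print(f"{false_positives} of 200 null studies look 'significant'")
```

With a 5% false-positive rate, around ten of these null studies will come out "significant" — and those are exactly the ones most likely to get written up.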
Don’t tell me you have not noticed this before: if you think that wine is good for you, you are right. There is a research paper showing that. Oh, you don’t agree? You think wine is bad for your health? No problem, there is a research paper for that too. There is always a paper for that!
Last year a paper(2) in Science described a major effort to replicate 100 psychology experiments published in top journals, and found that the success rate was little more than a third. There is growing alarm about results that cannot be reproduced. Psychology has been leading the controversy so far, but other fields of science, like biomedicine, seem to be falling into the same reproducibility crisis, one that could undermine medical advances.
In FiveThirtyEight, Christie Aschwanden argues that the replication crisis is a sign that science is working. She thinks that science isn’t broken; it’s just hard.
Years ago, someone asked John Maddox how much of what his prestigious science journal Nature printed was wrong. “All of it,” the renowned editor quickly replied. “That’s what science is about — new knowledge constantly arriving to correct the old.”
Yes, failure is what moves science forward (setting aside funerals), but beware: no research paper can ever be considered the final word, and there are too many that do not stand up to further study.
John Ioannidis was one of the first to sound the alarm, in 2005, and now he is leading a project focused on transforming research practices to improve the quality of scientific studies in biomedicine and beyond. His paper(3) “Why Most Published Research Findings Are False” is one of the most downloaded technical papers from PLoS Medicine. In it, he explains that research findings are less likely to be true:
- When the studies conducted in a field are smaller;
- When effect sizes are smaller;
- When there is a greater number and lesser preselection of tested relationships;
- Where there is greater flexibility in designs, definitions, outcomes, and analytical modes;
- When there is greater financial and other interest and prejudice; and
- When more teams are involved in a scientific field in chase of statistical significance.
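Several of these factors can be made concrete with the positive predictive value formula from the Ioannidis paper: if R is the pre-study odds that a tested relationship is real, then (ignoring the paper's additional bias term) the probability that a claimed finding is actually true is PPV = (1−β)R / ((1−β)R + α). A short sketch:

```python
def ppv(R, alpha=0.05, power=0.80):
    """Positive predictive value: the probability that a 'significant'
    finding is true, given pre-study odds R (Ioannidis 2005, bias term
    omitted). power = 1 - beta, alpha = significance threshold."""
    return (power * R) / (power * R + alpha)

# A well-powered study in a field where 1 in 11 tested hypotheses is true:
print(round(ppv(R=0.1, power=0.80), 2))  # 0.62
# The same field, but a small underpowered study:
print(round(ppv(R=0.1, power=0.20), 2))  # 0.29
```

The drop from 0.62 to 0.29 is the first bullet in action: smaller studies have lower power, and lower power means a larger share of the "positive" findings are false.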
In 2008, Chris Anderson announced the end of theory. He quoted Peter Norvig, Director of Research at Google, paraphrasing George Box’s maxim: “All models are wrong, and increasingly you can succeed without them,” to conclude that:
Petabytes allow us to say: “Correlation is enough.” We can stop looking for models.
The problem with correlation is something economists know very well, especially those under pressure to supply evidence for politically charged debates, or for even worse designs. As Ronald Coase said: “If you torture the data long enough, it will confess.” Imagine what Anderson’s petabytes can confess nowadays, and the many spurious incentives that may guide unscrupulous data torturers.
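Coase's quip is easy to demonstrate with a toy example (mine, not the article's): comb through enough meaningless variables and one of them will correlate impressively with any outcome you like. The sketch below searches 500 columns of pure noise for the best "predictor" of a noise outcome.

```python
import random

random.seed(1)

def corr(x, y):
    """Pearson correlation coefficient, computed by hand."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

outcome = [random.gauss(0, 1) for _ in range(20)]  # pure noise

# "Torture" the data: try 500 equally meaningless candidate predictors
# and keep whichever correlates best with the outcome.
best = max(
    abs(corr([random.gauss(0, 1) for _ in range(20)], outcome))
    for _ in range(500)
)
print(f"best |correlation| found in noise: {best:.2f}")
```

With only 20 observations and 500 tries, the winning correlation is typically well above 0.6 — strong enough to headline a press release, and entirely spurious. More data does not fix this; more candidate variables make it worse.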
How can we fight back? The recipe may sound naive: transparency and openness. It is not simple, of course, but there is wide consensus on a minimum set of best practices(4). Some scientific fields have a strong tradition of sharing data and using common infrastructure (e.g. shared big telescopes or high-energy physics experiments), but the academic reward system does not sufficiently incentivize open practices.
The situation is a classic collective action problem. Individual researchers lack strong incentives to be more transparent, though the credibility of science would benefit if everyone were. Maybe it is time to explore (not so) new collective action management tools!
(1) Simmons, J., Nelson, L., & Simonsohn, U. (2011). False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant. Psychological Science, 22(11), 1359–1366. DOI: 10.1177/0956797611417632
(2) Open Science Collaboration (2015). Estimating the reproducibility of psychological science. Science, 349(6251). DOI: 10.1126/science.aac4716
(3) Ioannidis, J. (2005). Why Most Published Research Findings Are False. PLoS Medicine, 2(8). DOI: 10.1371/journal.pmed.0020124
(4) Nosek, B., Alter, G., Banks, G., Borsboom, D., Bowman, S., Breckler, S., Buck, S., Chambers, C., Chin, G., Christensen, G., Contestabile, M., Dafoe, A., Eich, E., Freese, J., Glennerster, R., Goroff, D., Green, D., Hesse, B., Humphreys, M., Ishiyama, J., Karlan, D., Kraut, A., Lupia, A., Mabry, P., Madon, T., Malhotra, N., Mayo-Wilson, E., McNutt, M., Miguel, E., Paluck, E., Simonsohn, U., Soderberg, C., Spellman, B., Turitto, J., VandenBos, G., Vazire, S., Wagenmakers, E., Wilson, R., & Yarkoni, T. (2015). Promoting an open research culture. Science, 348(6242), 1422–1425. DOI: 10.1126/science.aab2374
Featured Image: SMU’s Assistant Professor of Finance Gennaro Bernile (right) receiving the Ig Nobel Prize with his co-researcher, Professor Raghavendra Rau, during the 2015 Prize Ceremony at Harvard University