Redefine statistical significance

DJ Benjamin, JO Berger, M Johannesson… - Nature human …, 2018 - nature.com
Nature Human Behaviour, 2018
The lack of reproducibility of scientific studies has caused growing concern over the credibility of claims of new discoveries based on ‘statistically significant’ findings. There has been much progress toward documenting and addressing several causes of this lack of reproducibility (for example, multiple testing, P-hacking, publication bias and under-powered studies). However, we believe that a leading cause of non-reproducibility has not yet been adequately addressed: statistical standards of evidence for claiming new discoveries in many fields of science are simply too low. Associating statistically significant findings with P < 0.05 results in a high rate of false positives even in the absence of other experimental, procedural and reporting problems.
For fields where the threshold for defining statistical significance for new discoveries is P < 0.05, we propose a change to P < 0.005. This simple step would immediately improve the reproducibility of scientific research in many fields. Results that would currently be called significant but do not meet the new threshold should instead be called suggestive. While statisticians have known the relative weakness of using P ≈ 0.05 as a threshold for discovery and the proposal to lower it to 0.005 is not new (refs 1, 2), a critical mass of researchers now endorse this change.
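
The claim that a P < 0.05 threshold yields a high rate of false positives can be illustrated with a small calculation. The sketch below is not taken from the text above. It assumes, purely for illustration, that 1 in 11 tested hypotheses is true (prior odds of 1:10) and that tests have 80% power at both thresholds. The function name `false_positive_risk` and all input values are hypothetical. In practice, lowering the threshold at a fixed sample size would also lower power, so the second figure is optimistic.

```python
def false_positive_risk(alpha, power, prior_h1):
    """Fraction of 'significant' results that are false positives.

    alpha    -- significance threshold (chance a true null is rejected)
    power    -- chance a true effect is detected
    prior_h1 -- assumed share of tested hypotheses that are true effects
    """
    false_pos = alpha * (1.0 - prior_h1)  # true nulls declared significant
    true_pos = power * prior_h1           # real effects declared significant
    return false_pos / (false_pos + true_pos)


# Illustrative assumptions, not values from the abstract:
# prior odds of 1:10 for a true effect, and 80% power at either threshold.
PRIOR_H1 = 1 / 11
POWER = 0.8

for alpha in (0.05, 0.005):
    risk = false_positive_risk(alpha, POWER, PRIOR_H1)
    print(f"alpha = {alpha}: false positive risk = {risk:.3f}")
# alpha = 0.05: false positive risk = 0.385
# alpha = 0.005: false positive risk = 0.059
```

Under these assumed inputs, more than a third of results significant at 0.05 would be false positives, against roughly 6% at 0.005. The exact figures depend heavily on the assumed prior odds and power.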