Often used to describe a new research
finding, statistical significance is one of the
most misunderstood scientific terms – even
by scientists themselves.
To gauge whether the result of some
experiment is ‘significant’ or not, formulae
are used to work out the chances of getting
at least as impressive a result if fluke alone were the true cause. If those chances are less than 1 in 20,
then the result is deemed statistically
significant. But – contrary to what many
scientists think – this doesn’t mean the
chances of the result being a one-off are
also 1 in 20. Because the calculation was made
assuming fluke to be responsible, it can't
give the chances of that assumption being
right. To work that out, the inherent
plausibility of the finding must also be taken into
account. Phew! When all these calculations
have been made, many ‘statistically
significant’ but implausible findings end up
showing a high risk of being flukes.
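One way to see how this plays out is with a short Bayesian calculation. The sketch below is only illustrative and is not taken from the article: the function name fluke_risk, the assumed 80 per cent statistical power and the example prior plausibilities are all assumptions, with only the 1-in-20 threshold coming from the text above.

```python
# Illustrative sketch of the calculation described above, using Bayes' theorem.
# All numbers are assumptions chosen for illustration: 80% statistical power,
# the usual 1-in-20 (5%) significance threshold, and a range of prior
# plausibilities for the hypothesis being tested.

def fluke_risk(prior, alpha=0.05, power=0.80):
    """Chance that a 'statistically significant' result is really a fluke,
    given the prior plausibility of the finding."""
    true_positives = power * prior           # genuine effects that reach significance
    false_positives = alpha * (1 - prior)    # flukes that reach significance anyway
    return false_positives / (true_positives + false_positives)

for prior in (0.5, 0.1, 0.01):
    print(f"prior plausibility {prior:.0%} -> risk of fluke {fluke_risk(prior):.0%}")

# prior plausibility 50% -> risk of fluke 6%
# prior plausibility 10% -> risk of fluke 36%
# prior plausibility 1% -> risk of fluke 86%
```

On these assumptions, a result that clears the 1-in-20 bar for a hypothesis given only a 1-in-100 prior plausibility still has roughly an 86 per cent chance of being a fluke.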
Statisticians have issued warnings about
the dangers of misunderstanding statistical
significance for decades, to little or no avail.
Some now suspect this misunderstanding
lies behind the current ‘replication crisis’ in
science, where many research findings fail
to be confirmed by follow-up studies.