"P-value hacking," also known as "p-hacking," is used to describe the manipulation of data or statistical analysis until insignificant findings appear significant, represented by a p-value of less than 0.05. In the context of statistical hypothesis testing, the p-value is the probability of obtaining a result as extreme as, or more potent than, the observed data, given that the null hypothesis is true. The null hypothesis typically states that there is no effect or relationship between two measured phenomena. When researchers engage in p-hacking, they typically:
- Conduct multiple analyses using different combinations of variables and only report those that yield significant results.
- Collect data until a test reaches statistical significance.
- Exclude or include outliers based on whether their inclusion makes the results significant.
- Transform the data to achieve a more "favorable" result.
- Switch from a two-tailed to a one-tailed test after seeing the data, roughly halving the reported p-value.
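The first practice above can be quantified with a small Monte Carlo sketch (the trial counts are illustrative assumptions). Under a true null hypothesis, a well-calibrated p-value is uniformly distributed on [0, 1], so each test has a 5% chance of a false positive, and running many tests while reporting only the "hits" inflates that rate dramatically:

```python
import random

random.seed(42)

TRIALS = 10_000          # simulated studies
TESTS_PER_STUDY = 20     # variable combinations tried per study

# A study "finds an effect" if ANY of its tests comes out below 0.05,
# even though the null is true in every single test.
false_positives = sum(
    any(random.random() < 0.05 for _ in range(TESTS_PER_STUDY))
    for _ in range(TRIALS)
)

rate = false_positives / TRIALS
print(rate)  # ≈ 1 - 0.95**20 ≈ 0.64
```

With 20 analyses per study, roughly two thirds of null studies produce at least one "significant" result, which is the mechanism behind the false-positive inflation discussed below.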
The problem with p-hacking is that it leads to an increase in false-positive findings — results that appear significant but are actually due to chance. It undermines the reliability and reproducibility of scientific research. In recent years, there has been a push within the scientific community to address this issue, for example, by pre-registering study designs, encouraging the publication of null results, and focusing more on effect sizes and confidence intervals rather than solely on p-values.
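One of the remedies mentioned above, reporting confidence intervals alongside point estimates, can be sketched as follows. This is a minimal normal-approximation interval (the sample data is invented for illustration; small samples would properly use a t critical value rather than 1.96):

```python
from math import sqrt
from statistics import mean, stdev

def ci95(sample: list[float]) -> tuple[float, float]:
    """Approximate 95% confidence interval for the mean,
    using the normal approximation (z = 1.96)."""
    m, s, n = mean(sample), stdev(sample), len(sample)
    half = 1.96 * s / sqrt(n)
    return (m - half, m + half)

data = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0]  # hypothetical measurements
lo, hi = ci95(data)
print(f"mean = {mean(data):.2f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```

Unlike a bare p-value, the interval conveys both the size of the estimated effect and its precision, which makes a marginal result harder to dress up as a decisive one.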
In a way, pre-registration resembles Rawls's Veil of Ignorance: the analysis plan is fixed before the researcher knows which choices would favor a significant result, so the rules cannot be tailored to the outcome.