I analyzed the content of 98,709 randomly chosen research papers from PubMed to learn more about bias.
Specifically, I wanted to do two things:
- Rank 64 types of bias by popularity, to determine which ones professional researchers focus on most in practice.
- Test the hypothesis that addressing bias issues is a sign of high-quality research.
Here’s a summary of the key findings
1. Only 11% of medical research papers addressed the issue of bias (i.e. mentioned at least 1 type of bias).
2. The 10 most popular types of biases in research are:
- Confounding
- Selection Bias
- Recall Bias
- Reporting Bias
- Sampling Bias
- Information Bias
- Detection Bias
- Attrition Bias
- Ascertainment Bias
- Performance Bias
3. It is not enough to just mention the types of bias that may be present in your study; instead, you should look for ways to eliminate, limit, or at least quantify their effect on your results. In fact, the data suggest that merely mentioning a type of bias is a statistically significant predictor of a lower-quality study!
64 different types of biases sorted by popularity
Out of the 98,709 research papers analyzed, only 10,811 (or 11%) mentioned at least 1 type of bias.
Here’s a table that summarizes these data:
Rank | Bias Type | Articles Mentioning This Type of Bias (out of 10,811) | Percent |
---|---|---|---|
1 | Confounding * | 6449 | 59.65% |
2 | Selection Bias | 3180 | 29.41% |
3 | Recall Bias | 1107 | 10.24% |
4 | Reporting Bias | 544 | 5.03% |
5 | Sampling Bias | 401 | 3.71% |
6 | Information Bias | 207 | 1.91% |
7 | Detection Bias | 193 | 1.79% |
8 | Attrition Bias | 176 | 1.63% |
9 | Ascertainment Bias | 128 | 1.18% |
10 | Performance Bias | 127 | 1.17% |
11 | Hawthorne Effect | 112 | 1.04% |
12 | Misclassification Bias | 112 | 1.04% |
13 | Observer Bias | 105 | 0.97% |
14 | Confounding by Indication | 90 | 0.83% |
15 | Referral Bias (a.k.a. Admission Rate Bias or Berkson’s Bias) | 86 | 0.80% |
16 | Immortal Time Bias | 55 | 0.51% |
17 | Language Bias | 55 | 0.51% |
18 | Confirmation Bias | 50 | 0.46% |
19 | Non-Response Bias | 37 | 0.34% |
20 | Lead Time Bias | 31 | 0.29% |
21 | Outcome Reporting Bias | 26 | 0.24% |
22 | Verification Bias | 21 | 0.19% |
23 | Volunteer Bias | 20 | 0.18% |
24 | Allocation Bias | 18 | 0.17% |
25 | Temporal Bias | 18 | 0.17% |
26 | Collider Bias | 16 | 0.15% |
27 | Availability Bias | 15 | 0.14% |
28 | Perception Bias | 13 | 0.12% |
29 | Incorporation Bias | 10 | 0.09% |
30 | Spectrum Bias | 10 | 0.09% |
31 | Funding Bias (a.k.a. Sponsorship Bias) | 9 | 0.08% |
32 | Protopathic Bias | 6 | 0.06% |
33 | Overconfidence Bias | 5 | 0.05% |
34 | Length Time Bias | 3 | 0.03% |
35 | Exclusion Bias | 2 | 0.02% |
36 | Popularity Bias | 2 | 0.02% |
37 | Lack of Blinding Bias | 1 | 0.01% |
38 | Proxy Bias | 1 | 0.01% |
39 | Inferential Bias | 1 | 0.01% |
40 | Novelty Bias | 1 | 0.01% |
41 | Unmasking Bias | 1 | 0.01% |
42 | Chronological Bias | 1 | 0.01% |
43 | Prevalence-Incidence Bias (a.k.a. Neyman’s Bias) | 0 | 0.00% |
44 | Spin Bias | 0 | 0.00% |
45 | Unacceptability Bias | 0 | 0.00% |
46 | Previous Opinion Bias | 0 | 0.00% |
47 | All’s Well Literature Bias | 0 | 0.00% |
48 | Positive Results Bias | 0 | 0.00% |
49 | Differential Reference Bias | 0 | 0.00% |
50 | Apprehension Bias | 0 | 0.00% |
51 | Centripetal Bias | 0 | 0.00% |
52 | Compliance Bias | 0 | 0.00% |
53 | Diagnostic Access Bias | 0 | 0.00% |
54 | Diagnostic Momentum Bias | 0 | 0.00% |
55 | Diagnostic Suspicion Bias | 0 | 0.00% |
56 | Exposure Suspicion Bias | 0 | 0.00% |
57 | Partial Reference Bias | 0 | 0.00% |
58 | Hot Stuff Bias | 0 | 0.00% |
59 | Informed Presence Bias | 0 | 0.00% |
60 | Insensitive Measure Bias | 0 | 0.00% |
61 | Mimicry Bias | 0 | 0.00% |
62 | Non-Contemporaneous Control Bias | 0 | 0.00% |
63 | One-Sided Reference Bias | 0 | 0.00% |
64 | Wrong Sample Size Bias | 0 | 0.00% |
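Counts like those in the table above can be produced by scanning each paper's text for a list of bias terms. Here is a minimal sketch of that approach; the term list is a small subset of the 64 used in the article, and the three sample "papers" are made-up strings for illustration.

```python
from collections import Counter

# A small subset of the bias terms used in the article (the full list
# is based on catalogofbias.org).
BIAS_TERMS = [
    "confounding",
    "selection bias",
    "recall bias",
    "reporting bias",
    "sampling bias",
]

def biases_mentioned(text: str) -> set:
    """Return the set of bias terms that appear in a paper's text."""
    lower = text.lower()
    return {term for term in BIAS_TERMS if term in lower}

def tally(papers: list) -> Counter:
    """Count, for each bias term, how many papers mention it."""
    counts = Counter()
    for paper in papers:
        # A paper counts once per bias type, regardless of repeat mentions.
        counts.update(biases_mentioned(paper))
    return counts

# Made-up example texts, standing in for full-text articles.
papers = [
    "We adjusted for confounding and discuss selection bias.",
    "Recall bias may affect self-reported exposure data.",
    "No limitations were reported.",
]
counts = tally(papers)
```

A real pipeline would also need to handle plurals, hyphenation, and synonyms (e.g. "Berkson's bias" vs. "admission rate bias"), but the per-paper set logic above is the core idea: each article contributes at most 1 to each bias type's count.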
Mentioning the type of bias in your study is not enough
In this section, we examine whether or not addressing bias issues is a sign of high-quality research.
To answer this question, I used a linear regression model to study the influence of “mentioning at least 1 type of bias in a research paper” on the Journal Impact Factor (JIF), a reasonable proxy for research quality, since higher-quality articles tend to be published in higher-JIF journals.
Here’s a summary of the linear regression results:
Term | Coefficient | Standard Error | p-value |
---|---|---|---|
Intercept | 3.80 | 0.01 | < 0.001 |
Bias Mention (Yes versus No) | -0.30 | 0.03 | < 0.001 |
The model shows that mentioning at least 1 type of bias is associated with a mean drop of 0.3 in JIF (p < 0.001). Specifically, the average article that does not mention any type of bias is published in a journal with an impact factor of 3.8, compared to a JIF of 3.5 for the average article that does mention at least 1 type of bias.
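A regression of this shape is easy to reproduce in spirit. The sketch below uses synthetic data (not the article's dataset), generated to mirror the reported group means of 3.8 and 3.5; with a single binary predictor, OLS reduces to comparing those two means.

```python
import numpy as np

# Synthetic data mirroring the reported means: JIF ~ 3.8 without a
# bias mention, ~ 3.5 with one. Not the article's actual dataset.
rng = np.random.default_rng(0)
n = 1000
mention = rng.integers(0, 2, size=n).astype(float)     # 0 = no, 1 = yes
jif = 3.8 - 0.3 * mention + rng.normal(0.0, 1.0, size=n)

# OLS of JIF on an intercept plus the mention indicator. With one
# binary predictor, the intercept estimates the "no mention" mean and
# the slope estimates the difference between group means.
X = np.column_stack([np.ones(n), mention])
coef, *_ = np.linalg.lstsq(X, jif, rcond=None)
intercept, slope = coef   # intercept near 3.8, slope near -0.3
```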
Now, it may be argued that the -0.3 difference, while statistically significant, is not practically significant. At the very least, then, we can conclude that:
Just mentioning a type of bias that may affect your results does not make your study better. Instead, try tweaking your design or adjusting your statistical analysis to control or limit the effect of that bias.
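To make "adjusting your statistical analysis" concrete, here is a toy illustration of controlling for confounding, the most commonly mentioned bias in the table above. All variable names and numbers are hypothetical: age drives both the exposure and the outcome, so a crude model finds a spurious exposure effect that disappears once age is included as a covariate.

```python
import numpy as np

# Hypothetical scenario: age confounds the exposure-outcome relationship.
rng = np.random.default_rng(1)
n = 5000
age = rng.normal(50.0, 10.0, size=n)                   # the confounder
exposure = 0.05 * age + rng.normal(0.0, 1.0, size=n)   # driven partly by age
outcome = 0.1 * age + rng.normal(0.0, 1.0, size=n)     # no true exposure effect

# Crude model (outcome ~ exposure): picks up a spurious association,
# because age pushes exposure and outcome up together.
Xc = np.column_stack([np.ones(n), exposure])
crude = np.linalg.lstsq(Xc, outcome, rcond=None)[0][1]

# Adjusted model (outcome ~ exposure + age): the spurious association
# shrinks toward zero once the confounder enters the model.
Xa = np.column_stack([np.ones(n), exposure, age])
adjusted = np.linalg.lstsq(Xa, outcome, rcond=None)[0][1]
```

This is the difference between mentioning confounding in a limitations paragraph and actually doing something about it in the analysis.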
References
The list of biases used in this article is mostly based on catalogofbias.org along with other resources such as textbooks and research papers.