Publication bias detection is the process of identifying whether the studies included in a meta-analysis represent a systematically skewed sample of all research conducted on a given question. When statistically significant results are more likely to be published than null or negative findings, the pooled estimate from a meta-analysis can overestimate the true effect. Detecting, testing, and adjusting for this bias is a core requirement for any credible evidence synthesis.
In our meta-analyses at Research Gold, we routinely run Egger's test alongside funnel plots before finalizing any pooled estimate. This article covers the full workflow: what publication bias is, how it distorts meta-analysis results, visual and statistical methods for detection, adjustment techniques, and how to report your findings in line with PRISMA 2020 and GRADE requirements. For a broader overview of the meta-analysis process, see our guide on how to do a meta-analysis step by step.
What Is Publication Bias
Publication bias occurs when the likelihood of a study being published depends on the direction or statistical significance of its results. Studies that report positive, statistically significant findings are more likely to be submitted by authors, accepted by journals, and available for inclusion in a systematic review. Studies with null results, small effect sizes, or inconclusive findings are more likely to remain unpublished, sitting in file drawers, conference abstracts, or institutional repositories where systematic reviewers cannot find them.
The consequence for meta-analysis is straightforward: if the available evidence is enriched with positive results and depleted of null results, the pooled effect size will be larger than the true population effect. The meta-analysis does not merely summarize the evidence; it summarizes the evidence that made it through a publication filter. The Cochrane Handbook for Systematic Reviews of Interventions (Higgins et al., 2023) identifies publication bias as one of the most serious threats to the validity of systematic review conclusions.
Publication bias is not the only form of reporting bias. Selective outcome reporting, where authors report only the outcomes that achieved significance, and selective analysis reporting, where authors choose analytical approaches that produce favorable results, operate through similar mechanisms. The broader category is sometimes called reporting bias or dissemination bias. However, publication bias, the selective publication of entire studies based on results, is the form most amenable to detection through the methods described in this article.
The magnitude of the problem is well documented. Empirical studies have shown that trials with statistically significant results are roughly twice as likely to be published as those with null results (Dwan et al., 2008). In pharmacological research, the imbalance is even larger. The practical effect on meta-analyses is that pooled estimates may be inflated by 10-30% when substantial publication bias is present.
How It Affects Meta-Analysis Results
A meta-analysis pools effect sizes from individual studies using a weighted average, where larger and more precise studies receive greater weight. Publication bias distorts this process because the missing studies are not randomly distributed; they are systematically those with smaller, null, or negative effects.
Consider a hypothetical meta-analysis of 20 published studies examining the effect of an intervention. If 8 additional studies were conducted but remain unpublished because they found no significant effect, the pooled estimate from the 20 published studies will overstate the intervention's effectiveness. The magnitude of the overestimation depends on the number of missing studies, the size of their effects, and their precision.
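To see the mechanism concretely, here is a minimal simulation sketch in R (all numbers are illustrative and not drawn from any real review): studies with significant results are always "published," non-significant ones only 30% of the time, and the pooled estimate from the published subset drifts above the true effect.

```r
# Illustrative only: how censoring non-significant studies inflates
# a pooled estimate. True standardized mean difference is 0.2.
library(metafor)

set.seed(42)
k   <- 60                                  # studies actually conducted
n   <- sample(20:200, k, replace = TRUE)   # per-group sample sizes
smd <- 0.2
sei <- sqrt(2 / n + smd^2 / (4 * n))       # approximate SE of an SMD
yi  <- rnorm(k, mean = smd, sd = sei)      # observed study effects

# Significant studies always published; others with 30% probability
published <- abs(yi / sei) > 1.96 | runif(k) < 0.3

rma(yi = yi, sei = sei)                        # all k studies: near 0.2
rma(yi = yi[published], sei = sei[published])  # published only: inflated
```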
The distortion compounds with other biases. Small studies are more susceptible to publication bias because their results are more variable: one small study might find a large effect by chance and get published, while another finding no effect goes unpublished. This is why the concept of small-study effects is closely linked to publication bias. Larger studies are more likely to be published regardless of results because they represent substantial investments and contribute important evidence even when results are null.
The impact on clinical decision-making is real. If a Cochrane review concludes that a treatment has a moderate effect based on biased evidence, clinicians may adopt that treatment for patients who would receive no benefit. In public health, biased meta-analyses can influence policy recommendations, resource allocation, and guideline development. The GRADE Working Group explicitly includes publication bias as one of five domains that can reduce the certainty of evidence, reflecting its importance in evidence-based practice.
Funnel Plots: Visual Detection
A funnel plot is a scatter plot that displays the relationship between each study's effect size (x-axis) and a measure of its precision (y-axis, typically standard error with the scale inverted so that more precise studies appear at the top). In the absence of bias, the plot should resemble a symmetric inverted funnel: large, precise studies cluster near the pooled estimate at the top, while smaller, less precise studies scatter more widely but symmetrically around the same central value.
Funnel plot asymmetry occurs when the scatter is not symmetric. The most common pattern associated with publication bias is a gap in the bottom-right or bottom-left corner of the funnel, indicating that small studies with null or negative results are missing from the evidence base. When you observe this pattern, small studies on the side favoring the treatment are present while small studies on the opposite side are absent.
Interpreting funnel plots requires nuance. The Cochrane Handbook recommends visual inspection as a starting point but warns against over-reliance on subjective assessment. Different observers may reach different conclusions about asymmetry, especially when the number of studies is small. Funnel plots with fewer than 10 studies are difficult to interpret because the expected symmetry may not emerge even in the absence of bias simply due to sampling variability.
When constructing a funnel plot, use the standard error on the y-axis rather than sample size or inverse variance. The standard error produces the expected funnel shape more reliably and is the convention used by major software packages including RevMan, Stata, and R's metafor package. You can generate publication-ready funnel plots using our free funnel plot generator, which accepts effect sizes and standard errors and produces formatted output suitable for journal submission.
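If you prefer to work directly in R, a funnel plot takes a few lines with metafor; the sketch below uses the package's built-in BCG vaccine trials dataset (dat.bcg) purely for illustration.

```r
library(metafor)

# Compute log risk ratios from the built-in BCG trials dataset
dat <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg,
              data = dat.bcg)
res <- rma(yi, vi, data = dat)   # random-effects model (REML by default)
funnel(res, yaxis = "sei")       # standard error on the y-axis, scale inverted
```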
A well-constructed funnel plot communicates three things simultaneously: the distribution of precision across studies, the degree of scatter relative to the pooled estimate, and any asymmetry that might suggest systematic bias. These visual properties make funnel plots one of the most informative single graphics in evidence synthesis.
| Funnel Plot Feature | What It Suggests | Action Required |
|---|---|---|
| Symmetric scatter around pooled estimate | No evidence of publication bias | Report as reassuring; proceed with pooled estimate |
| Gap in bottom-right corner | Small negative/null studies may be missing | Run formal statistical tests; consider trim-and-fill |
| Gap in bottom-left corner | Small studies with large positive effects missing (less common) | Investigate data; may indicate other biases |
| Asymmetry with outliers | Possible heterogeneity rather than bias | Investigate study-level characteristics; run subgroup analysis |
| Hollow funnel (few small studies) | Small studies not conducted or not found | Assess search comprehensiveness; note in limitations |
Statistical Tests for Publication Bias
Visual inspection of funnel plots is inherently subjective. Statistical tests provide a formal, reproducible method for quantifying funnel plot asymmetry and assessing whether the observed pattern is consistent with chance alone. Three tests are used most frequently: Egger's regression, Begg's rank correlation, and Peters' test.
Egger's Weighted Regression Test
Egger's test (Egger et al., 1997) is the most widely used statistical test for funnel plot asymmetry. It regresses the standardized effect estimate (effect size divided by its standard error) against precision (the inverse of the standard error). If no asymmetry is present, the regression intercept should not differ significantly from zero. A statistically significant intercept (typically p less than 0.10, using a more liberal threshold to maintain power) suggests that smaller, less precise studies tend to report systematically different effect sizes than larger studies.
The test is computed using weighted least squares regression, where each study's contribution is weighted by the inverse of its variance. The test statistic follows a t-distribution, and the p-value is obtained from the intercept's deviation from zero. In practice, meta-analysis software computes Egger's test automatically, including R's metafor package (function regtest) and Stata's metabias command; RevMan produces funnel plots but does not run asymmetry tests.
Egger's test has well-established limitations. Its power is low when the meta-analysis includes fewer than 10 studies, which is why the Cochrane Handbook recommends a minimum of 10 studies before applying asymmetry tests. The test can also produce false positives when effect sizes are measured as odds ratios, particularly when the event rate is very low or very high. In these circumstances, Peters' test is the preferred alternative.
Begg's Rank Correlation Test
Begg and Mazumdar's rank correlation test (Begg & Mazumdar, 1994) uses a non-parametric approach to detect funnel plot asymmetry. It calculates the rank correlation (Kendall's tau) between the standardized effect sizes and their variances. A significant correlation suggests that study size is associated with effect magnitude: the pattern expected under publication bias.
Begg's test is less powerful than Egger's test in most scenarios, meaning it is more likely to miss true asymmetry. However, it makes fewer distributional assumptions and may be appropriate as a complementary analysis. When both Egger's test and Begg's test agree, the evidence for or against asymmetry is strengthened.
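In metafor, the rank correlation test is a one-liner on the same fitted model:

```r
# Begg and Mazumdar's rank correlation test (Kendall's tau) on `res`
ranktest(res)
```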
Peters' Test for Binary Outcomes
Peters' test (Peters et al., 2006) was developed specifically for meta-analyses of binary outcomes where the effect measure is an odds ratio or risk ratio. Egger's test applied to log odds ratios can produce spurious significant results because of the mathematical relationship between the effect size and its standard error when events are rare. Peters' test addresses this by regressing the effect size against the inverse of the total sample size rather than the inverse of the standard error.
For meta-analyses using odds ratios with sparse data (few events per study), Peters' test is the recommended choice. For meta-analyses using standardized mean differences or risk differences, Egger's test remains the standard.
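With the BCG example from earlier, where escalc() has recorded each study's total sample size, a Peters-style test can be requested in metafor by switching the predictor to the inverse sample size (a sketch, assuming your effect size object carries sample sizes):

```r
# Peters' test: weighted regression with 1/N as the predictor instead
# of the standard error; ni is taken from the escalc() object
regtest(res, model = "lm", predictor = "ninv")
```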
| Test | Best Used For | Minimum Studies | Key Limitation |
|---|---|---|---|
| Egger's regression | Continuous outcomes, SMD, MD | 10+ | False positives with binary data |
| Begg's rank correlation | Complementary to Egger's | 10+ | Lower power than Egger's |
| Peters' test | Binary outcomes (OR, RR) | 10+ | Less widely implemented in software |
| Harbord's test | Binary outcomes (alternative) | 10+ | Similar to Peters' test; choose one |
In our practice, we run Egger's test as the default for continuous outcome meta-analyses and Peters' test for binary outcome meta-analyses, reporting both the test statistic and p-value alongside the funnel plot. This combination of visual and statistical evidence provides the most defensible assessment.
Adjusting for Publication Bias
When evidence suggests that publication bias may be inflating the pooled estimate, two primary adjustment methods are available: the trim-and-fill method and selection models. Both attempt to estimate what the pooled effect would be if the missing studies were included.
Trim-and-Fill Method
The trim-and-fill method (Duval & Tweedie, 2000) is the most commonly used adjustment for publication bias in meta-analysis. It works in two stages. First, it identifies and temporarily removes (trims) the small studies causing funnel plot asymmetry. Second, it uses the trimmed funnel plot to estimate the number of missing studies and imputes (fills) their effect sizes to restore symmetry.
The adjusted pooled estimate incorporates both the original studies and the imputed studies, providing an estimate of what the meta-analysis result might look like in the absence of publication bias. Most software packages (R metafor, Stata metatrim, Comprehensive Meta-Analysis) implement the method and produce an adjusted funnel plot showing the imputed studies as distinct markers.
Trim-and-fill has important limitations that researchers must understand. The method assumes that funnel plot asymmetry is caused entirely by missing studies, which may not be true: asymmetry can also result from heterogeneity, chance, or other biases. The imputed studies are hypothetical, not real, and the adjusted estimate should be interpreted as a sensitivity analysis rather than a definitive correction. The confidence interval around the adjusted estimate does not account for the uncertainty in estimating the number of missing studies, which means it may be artificially narrow.
Despite these limitations, trim-and-fill remains valuable because it provides a concrete, quantitative estimate of the potential impact of publication bias. Reporting both the unadjusted pooled estimate (e.g., SMD = 0.45, 95% CI 0.32-0.58) and the trim-and-fill adjusted estimate (e.g., SMD = 0.31, 95% CI 0.18-0.44, 4 studies imputed) allows readers to evaluate the robustness of your conclusions. Use our sensitivity analysis tool to explore how removing individual studies shifts your pooled estimate before applying trim-and-fill.
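In metafor, trim-and-fill is again a short addition to the model already fitted; the imputed studies appear as open points on the adjusted funnel plot.

```r
# Trim-and-fill sensitivity analysis on `res`
tf <- trimfill(res)
tf           # adjusted pooled estimate and number of imputed studies
funnel(tf)   # funnel plot with imputed studies shown as open points
```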
Selection Models
Selection models (Vevea & Hedges, 1995; Copas & Shi, 2001) take a more sophisticated approach to publication bias adjustment. Rather than imputing missing studies, they model the probability that a study is published as a function of its p-value or effect size. The model estimates the selection function (the relationship between a study's results and its probability of appearing in the literature) and uses this to adjust the pooled estimate.
Selection models are theoretically more principled than trim-and-fill because they directly model the publication process rather than relying on funnel plot symmetry. However, they require strong assumptions about the form of the selection function, and different assumptions can produce substantially different adjusted estimates. They also require more studies to produce stable estimates, typically 20 or more.
In practice, selection models are used less frequently than trim-and-fill, partly due to their complexity and partly because they are available in fewer software packages. The R package weightr and the Stata command selmodel implement common selection model approaches. When researchers have access to the necessary software and expertise, selection models can complement trim-and-fill as an additional sensitivity analysis.
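For completeness, recent versions of metafor also implement selection models via selmodel(). The sketch below fits a step-function model in the spirit of Vevea and Hedges, assuming one-sided selection at the conventional significance threshold; the single cut-point at p = .025 is an assumption you should vary in sensitivity analyses.

```r
# Step-function selection model with one cut-point at p = .025, fit to `res`
sel <- selmodel(res, type = "stepfun", steps = c(0.025))
sel   # adjusted estimate under the assumed selection function
```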
| Adjustment Method | How It Works | Strengths | Limitations |
|---|---|---|---|
| Trim-and-fill | Imputes missing studies to restore funnel symmetry | Simple, widely available, visual output | Assumes asymmetry equals bias; CI too narrow |
| Vevea-Hedges selection model | Models publication probability as function of p-value | Directly models selection process | Strong assumptions; needs 20+ studies |
| Copas selection model | Models publication probability with sensitivity parameters | Explores range of selection scenarios | Complex; results depend on parameter choices |
| PET-PEESE | Regression-based correction for small-study effects | Performs well in simulations | Less established in Cochrane methodology |
Bias vs. Small-Study Effects
Funnel plot asymmetry and significant Egger's test results do not automatically confirm publication bias. The broader term for the pattern is small-study effects, the observation that smaller studies in a meta-analysis tend to report larger effect sizes than larger studies. Publication bias is one explanation for small-study effects, but it is not the only one.
Clinical heterogeneity can produce small-study effects when smaller studies are conducted in populations or settings where the intervention is genuinely more effective. For example, early trials of a new treatment may be conducted in the most severely affected patients (where the treatment effect is largest), while later, larger trials enroll a broader population (where the average effect is smaller). The resulting funnel plot will show asymmetry that reflects genuine variation in the treatment effect rather than selective publication.
Methodological heterogeneity is another explanation. Smaller studies may use less rigorous designs: fewer controls, shorter follow-up, or less standardized outcome measurement, all of which tend to inflate effect sizes. The asymmetry in this case reflects quality differences rather than publication bias.
Chance alone can produce apparent asymmetry, especially in meta-analyses with fewer than 10-15 studies. Statistical tests for asymmetry have limited power in small meta-analyses, and both false positives and false negatives are common.
The practical implication is that researchers should investigate the cause of asymmetry rather than automatically attributing it to publication bias. Subgroup analyses by study size, risk of bias, or clinical setting can help distinguish between explanations. If smaller studies are systematically at higher risk of bias (as assessed by tools such as the Cochrane Risk of Bias tool), methodological quality rather than publication bias may be the primary explanation. The Cochrane Handbook recommends against labeling all funnel plot asymmetry as publication bias and encourages researchers to consider alternative explanations before reaching conclusions.
When reporting your findings, describe the asymmetry objectively and present the evidence for and against different explanations. A statement such as "Funnel plot asymmetry was observed and Egger's test was significant (p = 0.03). While this may indicate publication bias, subgroup analysis by risk of bias showed that studies at high risk of bias reported larger effects, suggesting methodological quality may contribute to the observed asymmetry" is more informative and defensible than simply stating "Publication bias was detected."
Common Mistakes in Publication Bias Assessment
Several recurring errors undermine the credibility of publication bias assessments in published meta-analyses. Avoiding these mistakes strengthens both the analysis and the manuscript.
Testing with too few studies. Applying Egger's test or interpreting funnel plots with fewer than 10 studies is the most common error. The Cochrane Handbook explicitly recommends against asymmetry testing below this threshold because both visual and statistical methods lack power. If your meta-analysis includes 6 studies, state that publication bias assessment was not feasible due to the small number of studies, and note this as a limitation.
Relying on visual inspection alone. Reporting that "the funnel plot appeared symmetric" without a formal statistical test is insufficient. Subjective interpretation varies between observers, and reviewers increasingly expect quantitative evidence to support visual assessments. Always pair your funnel plot with at least one statistical test.
Using the wrong test for binary outcomes. Applying Egger's test to a meta-analysis of odds ratios with rare events can produce false-positive results, leading to incorrect conclusions about publication bias. Use Peters' test or Harbord's test for binary outcome data with sparse events.
Treating trim-and-fill as definitive. Reporting the trim-and-fill adjusted estimate as if it were the corrected "true" effect misrepresents the method. Trim-and-fill is a sensitivity analysis: it estimates what the effect might be under one specific assumption about the cause of asymmetry. Present it alongside the original estimate and interpret it cautiously.
Ignoring alternative explanations. Concluding that publication bias is present based solely on funnel plot asymmetry without exploring heterogeneity, study quality differences, or clinical variation is methodologically lazy. Consider and report alternative explanations, as described in the section above.
Failing to report in GRADE. Publication bias is Domain 5 in the GRADE framework for rating certainty of evidence. When evidence of publication bias exists, the certainty of evidence should be downgraded by one level. Many authors assess publication bias but fail to integrate the finding into their GRADE assessment, creating an internal inconsistency in the review. The GRADE Working Group recommends that serious concern about publication bias warrants downgrading from, for example, moderate certainty to low certainty.
Omitting from the methods section. PRISMA 2020 Item 15 requires authors to describe any methods used to assess risk of bias due to missing results (including publication bias). PRISMA Item 21 requires results of this assessment. Omitting either violates the reporting guideline and will likely be flagged by peer reviewers.
A well-conducted publication bias assessment follows a clear sequence: construct the funnel plot, apply the appropriate statistical test, investigate causes of any observed asymmetry, run trim-and-fill or selection models if warranted, report both adjusted and unadjusted estimates, note the finding in your GRADE Domain 5 assessment, and describe all methods and results in your manuscript per PRISMA 2020.
For a detailed walkthrough of funnel plot construction and interpretation, see our companion guide on funnel plot interpretation and publication bias. You can also generate your own funnel plots instantly with our funnel plot generator and run leave-one-out sensitivity analyses with our sensitivity analysis tool.
Publication bias detection is not an optional step. It is a methodological requirement endorsed by the Cochrane Collaboration, mandated by PRISMA 2020 reporting guidelines, and evaluated by the GRADE framework. Researchers who skip this assessment, or perform it superficially, risk overstating the certainty and magnitude of their findings. Those who perform it rigorously, interpret it carefully, and report it transparently produce evidence that clinicians, policymakers, and patients can trust.