Sensitivity analysis in systematic reviews tests whether the conclusions of your review hold up when key methodological decisions are varied. Every systematic review involves judgment calls, from which studies to include to which statistical model to use, and sensitivity analysis reveals which of those decisions actually matter for the final result. A finding that survives multiple sensitivity analyses is robust. One that flips under reasonable alternative choices is fragile, and readers deserve to know.
The Cochrane Handbook describes sensitivity analysis as a "crucial component" of systematic reviews, and PRISMA 2020 requires that all pre-specified sensitivity analyses and their results be reported regardless of outcome. Yet many published reviews either skip sensitivity analysis entirely or bury a single leave-one-out analysis in supplementary materials. This guide covers the full toolkit: when sensitivity analysis is needed, which methods to use, how to interpret and report results, and how to pre-specify analyses in your systematic review protocol.
## What Sensitivity Analysis Tests
The core question of sensitivity analysis is simple: "Would my conclusion change if I had made a different reasonable decision?" This applies to every stage of a systematic review:
- Study inclusion. Would results differ if borderline studies (unclear eligibility, conference abstracts, unpublished data) were included or excluded?
- Data extraction. When a study reports multiple time points, outcome measures, or subgroups, does the choice of which data to extract affect the pooled result?
- Risk of bias. Does restricting the analysis to studies with low risk of bias change the conclusion?
- Statistical model. Does switching between fixed-effect and random-effects models alter the pooled estimate or its significance?
- Missing data. When studies have incomplete outcome data, do best-case and worst-case imputation scenarios produce different conclusions?
- Effect size measure. For binary outcomes, do odds ratios, risk ratios, and risk differences tell the same story?
Each of these represents a decision node where an alternative choice was equally defensible.
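The effect-measure node is easy to illustrate numerically. The sketch below uses a hypothetical 2x2 table (made-up event counts, not from any real study) to compute the three binary effect measures from the same data; the variable names are illustrative only:

```python
# Hypothetical 2x2 table: events and totals in each arm (illustrative numbers).
events_tx, n_tx = 30, 100   # treatment arm
events_ct, n_ct = 45, 100   # control arm

p_tx = events_tx / n_tx     # risk in treatment arm: 0.30
p_ct = events_ct / n_ct     # risk in control arm:   0.45

odds_ratio = (p_tx / (1 - p_tx)) / (p_ct / (1 - p_ct))  # ~0.52
risk_ratio = p_tx / p_ct                                # ~0.67
risk_diff = p_tx - p_ct                                 # -0.15
```

Here the odds ratio (~0.52) looks more impressive than the risk ratio (~0.67), which is exactly why pre-specifying the measure, and checking the alternatives, matters when events are common.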
## Leave-One-Out Analysis
Leave-one-out sensitivity analysis is the most common and most straightforward method. It sequentially removes each study from the meta-analysis, recalculates the pooled estimate, and examines whether any single study disproportionately influences the result.
How to interpret: If the pooled effect size and its statistical significance remain stable regardless of which study is removed, your findings are robust to individual study influence. If removing a single study changes the direction of the effect (e.g., from favoring treatment to favoring control) or changes statistical significance (from significant to non-significant or vice versa), that study is influential and warrants close examination.
What to do with influential studies: An influential study is not necessarily problematic. It may be the largest, highest-quality study that legitimately carries more weight. Investigate whether it differs clinically (different population, dose, or comparator), methodologically (different design, lower risk of bias), or statistically (different follow-up duration, different outcome definition). Report your findings transparently rather than excluding the study without justification.
Limitations: Leave-one-out analysis only tests single-study influence. It does not detect situations where two or three studies collectively drive the result, nor does it address methodological decisions beyond study inclusion.
Software implementation: In R, metafor::leave1out() performs this automatically. In Stata, metainf provides similar functionality. RevMan does not include built-in leave-one-out analysis. Our online sensitivity analysis tool provides an interactive interface for exploring study influence.
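For readers who want to see the mechanics rather than a package call, here is a minimal sketch of the leave-one-out loop using inverse-variance fixed-effect pooling. The effect sizes and standard errors are made-up illustrative values, not real data:

```python
import math

# Hypothetical per-study log odds ratios and their standard errors.
yi = [0.41, 0.35, -0.05, 0.62, 0.30]
sei = [0.20, 0.15, 0.25, 0.30, 0.18]

def pool_fixed(y, se):
    """Inverse-variance fixed-effect pooled estimate and its standard error."""
    w = [1 / s**2 for s in se]
    est = sum(wi * yv for wi, yv in zip(w, y)) / sum(w)
    return est, math.sqrt(1 / sum(w))

full_est, _ = pool_fixed(yi, sei)

# Remove each study in turn and re-pool the remainder.
for i in range(len(yi)):
    est, se = pool_fixed(yi[:i] + yi[i + 1:], sei[:i] + sei[i + 1:])
    print(f"omit study {i + 1}: pooled = {est:.3f} (all studies: {full_est:.3f})")
```

If any omitted-study estimate lands far from the all-studies value, or crosses zero, that study is the one to investigate. In practice, metafor::leave1out() also recomputes heterogeneity statistics and confidence intervals at each step.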
## Decision-Node Sensitivity Analysis
Decision-node analysis systematically varies choices made at each stage of the review. Unlike leave-one-out (which only tests study inclusion), this approach examines the full range of methodological decisions.
Pre-specify decision nodes in your protocol. For each node, identify the primary analysis choice and at least one reasonable alternative:
| Decision Node | Primary Analysis | Sensitivity Analysis |
|---|---|---|
| Study eligibility | Include randomized controlled trials and quasi-experimental studies | Restrict to randomized controlled trials only |
| Risk of bias | Include all studies | Restrict to low risk of bias only |
| Missing data | Complete case analysis | Best-case/worst-case imputation |
| Statistical model | Random-effects (REML) | Fixed-effect model |
| Effect measure | Standardized mean difference | Mean difference (if scales comparable) |
| Outlier handling | Include all studies | Exclude statistical outliers (> 3 SD from pooled mean) |
Run the meta-analysis under each alternative specification and present results side by side. This gives readers and guideline panels a comprehensive picture of evidence robustness.
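The statistical-model node from the table can be sketched directly: pool the same hypothetical studies under a fixed-effect model and under a DerSimonian-Laird random-effects model, then compare. (The table specifies REML for the primary analysis; DerSimonian-Laird is used here only because it has a simple closed form. All numbers are illustrative, not real data.)

```python
# Hypothetical per-study log risk ratios and standard errors.
yi = [0.50, 0.10, 0.45, -0.20, 0.35]
sei = [0.15, 0.20, 0.12, 0.25, 0.18]

def fixed_effect(y, se):
    """Inverse-variance fixed-effect pooled estimate."""
    w = [1 / s**2 for s in se]
    return sum(wi * yv for wi, yv in zip(w, y)) / sum(w)

def random_effects_dl(y, se):
    """Random-effects pooled estimate with DerSimonian-Laird tau^2."""
    w = [1 / s**2 for s in se]
    fe = sum(wi * yv for wi, yv in zip(w, y)) / sum(w)
    # Cochran's Q and the method-of-moments between-study variance.
    q = sum(wi * (yv - fe)**2 for wi, yv in zip(w, y))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)
    # Re-weight with tau^2 added to each study's variance.
    w_re = [1 / (s**2 + tau2) for s in se]
    return sum(wi * yv for wi, yv in zip(w_re, y)) / sum(w_re)

print(f"fixed-effect:   {fixed_effect(yi, sei):.3f}")
print(f"random-effects: {random_effects_dl(yi, sei):.3f}")
```

With these numbers the two models agree in direction but not magnitude, because random-effects weighting pulls influence away from the largest studies. A side-by-side table of such results under every alternative specification is the deliverable of a decision-node analysis.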