Sensitivity analysis in systematic reviews tests whether the conclusions of your review hold up when key methodological decisions are varied. Every systematic review involves judgment calls, from which studies to include to which statistical model to use, and sensitivity analysis reveals which of those decisions actually matter for the final result. A finding that survives multiple sensitivity analyses is robust. One that flips under reasonable alternative choices is fragile, and readers deserve to know.

The Cochrane Handbook describes sensitivity analysis as a "crucial component" of systematic reviews, and PRISMA 2020 requires that all pre-specified sensitivity analyses and their results be reported regardless of outcome. Yet many published reviews either skip sensitivity analysis entirely or bury a single leave-one-out analysis in supplementary materials. This guide covers the full toolkit: when sensitivity analysis is needed, which methods to use, how to interpret and report results, and how to pre-specify analyses in your protocol.

What Sensitivity Analysis Tests

The core question of sensitivity analysis is simple: "Would my conclusion change if I had made a different reasonable decision?" This applies to every stage of a systematic review: which study designs are eligible, how studies at high risk of bias are handled, how missing data are imputed, which effect measure and statistical model are used, how outliers are treated, and whether grey literature is searched.

Each of these represents a decision node where an alternative choice was equally defensible.

Leave-One-Out Analysis

Leave-one-out sensitivity analysis is the most common and most straightforward method. It sequentially removes each study from the meta-analysis, recalculates the pooled estimate, and examines whether any single study disproportionately influences the result.

How to interpret: If the pooled effect size and its statistical significance remain stable regardless of which study is removed, your findings are robust to individual study influence. If removing a single study changes the direction of the effect (e.g., from favoring treatment to favoring control) or changes statistical significance (from significant to non-significant or vice versa), that study is influential and warrants close examination.

What to do with influential studies: An influential study is not necessarily problematic. It may be the largest, highest-quality study that legitimately carries more weight. Investigate whether it differs clinically (different population, dose, or comparator), methodologically (different design, lower risk of bias), or statistically (different follow-up duration, different outcome definition). Report your findings transparently rather than excluding the study without justification.

Limitations: Leave-one-out analysis only tests single-study influence. It does not detect situations where two or three studies collectively drive the result, nor does it address methodological decisions beyond study inclusion.

Software implementation: In R, metafor::leave1out() performs this automatically. In Stata, metainf provides similar functionality. RevMan does not include built-in leave-one-out analysis. Our sensitivity analysis tool provides an interactive interface for exploring study influence.
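
If you want to see what a leave-one-out routine does under the hood (or you work outside R and Stata), the procedure is easy to script. Below is a minimal Python sketch using fixed-effect inverse-variance pooling for brevity; the effect sizes and variances are hypothetical:

```python
import math

def pool(effects, variances):
    """Fixed-effect inverse-variance pooled estimate and its standard error."""
    weights = [1 / v for v in variances]
    est = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return est, math.sqrt(1 / sum(weights))

def leave_one_out(effects, variances):
    """Re-pool with each study removed; return (index, estimate, CI low, CI high)."""
    results = []
    for i in range(len(effects)):
        ys = effects[:i] + effects[i + 1:]
        vs = variances[:i] + variances[i + 1:]
        est, se = pool(ys, vs)
        results.append((i, est, est - 1.96 * se, est + 1.96 * se))
    return results

# Hypothetical standardized mean differences and variances for 5 trials
effects = [-0.45, -0.30, -0.60, -0.10, -0.50]
variances = [0.04, 0.02, 0.05, 0.03, 0.06]

for i, est, lo, hi in leave_one_out(effects, variances):
    print(f"without study {i + 1}: SMD {est:+.2f} (95% CI {lo:+.2f} to {hi:+.2f})")
```

If every recomputed interval stays on the same side of the null, as in this toy example, the result is robust to single-study influence.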

Decision-Node Sensitivity Analysis

Decision-node analysis systematically varies choices made at each stage of the review. Unlike leave-one-out (which only tests study inclusion), this approach examines the full range of methodological decisions.

Pre-specify decision nodes in your protocol. For each node, identify the primary analysis choice and at least one reasonable alternative:

| Decision Node | Primary Analysis | Sensitivity Analysis |
| --- | --- | --- |
| Study eligibility | Include RCTs and quasi-experimental | Restrict to RCTs only |
| Risk of bias | Include all studies | Restrict to low risk of bias only |
| Missing data | Complete case analysis | Best-case/worst-case imputation |
| Statistical model | Random-effects (REML) | Fixed-effect model |
| Effect measure | Standardized mean difference | Mean difference (if scales comparable) |
| Outlier handling | Include all studies | Exclude statistical outliers (> 3 SD from pooled mean) |
| Publication type | Include only peer-reviewed | Add grey literature |

Run the meta-analysis under each alternative specification and present results side by side. This gives readers and guideline panels a comprehensive picture of evidence robustness.
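
As a sketch of how side-by-side specifications can be scripted, the loop below pools a set of studies under three of the decision nodes above. The study records, designs, ratings, and effect sizes are all invented for illustration:

```python
import math

def pool(effects, variances):
    """Fixed-effect inverse-variance pooled estimate and standard error."""
    weights = [1 / v for v in variances]
    est = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return est, math.sqrt(1 / sum(weights))

# Hypothetical study records: effect (SMD), variance, design, risk-of-bias rating
studies = [
    {"y": -0.45, "v": 0.04, "design": "RCT",   "rob": "low"},
    {"y": -0.30, "v": 0.02, "design": "RCT",   "rob": "some"},
    {"y": -0.60, "v": 0.05, "design": "quasi", "rob": "high"},
    {"y": -0.10, "v": 0.03, "design": "RCT",   "rob": "low"},
    {"y": -0.50, "v": 0.06, "design": "quasi", "rob": "some"},
]

# Each alternative specification is a named filter over the study set
specs = {
    "primary (all studies)": lambda s: True,
    "RCTs only":             lambda s: s["design"] == "RCT",
    "low risk of bias only": lambda s: s["rob"] == "low",
}

for label, keep in specs.items():
    subset = [s for s in studies if keep(s)]
    est, se = pool([s["y"] for s in subset], [s["v"] for s in subset])
    print(f"{label}: k={len(subset)}, SMD {est:+.2f} "
          f"(95% CI {est - 1.96 * se:+.2f} to {est + 1.96 * se:+.2f})")
```

Printing one line per specification yields exactly the kind of side-by-side summary table readers need.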

Threshold Sensitivity Analysis

Threshold analysis asks: "How much would the data need to change to overturn the conclusion?" Rather than testing specific alternative decisions, it quantifies the fragility of the result.

Fragility index for meta-analysis: For binary outcomes, the fragility index counts the minimum number of events that, if reassigned from treatment to control (or vice versa) across studies, would change the statistical significance of the pooled result. A fragility index of 2 means that reassigning just 2 events would flip the conclusion, indicating a fragile finding.
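
A simplified sketch of the idea in Python, using hypothetical trial data and a greedy rule that reassigns events in the first trial only (published meta-analytic fragility indices search across all studies, so treat this as illustrative):

```python
import math

def log_rr(a, n1, c, n2):
    """Log risk ratio and its variance for one 2x2 trial."""
    return (math.log((a / n1) / (c / n2)),
            1 / a - 1 / n1 + 1 / c - 1 / n2)

def pooled_p(trials):
    """Two-sided p-value for the fixed-effect pooled log risk ratio."""
    ests = [log_rr(*t) for t in trials]
    w = [1 / v for _, v in ests]
    est = sum(wi * y for wi, (y, _) in zip(w, ests)) / sum(w)
    z = est / math.sqrt(1 / sum(w))
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def fragility_index(trials, alpha=0.05, max_moves=50):
    """Greedy sketch: reassign one event at a time from the control arm to the
    treatment arm of the first trial until the pooled result loses significance."""
    trials = [list(t) for t in trials]
    for moves in range(max_moves + 1):
        if pooled_p(trials) >= alpha:
            return moves
        trials[0][0] += 1  # one more event in the treatment arm
        trials[0][2] -= 1  # one fewer event in the control arm
    return None

# Hypothetical trials: (treatment events, treatment N, control events, control N)
trials = [(10, 100, 20, 100), (8, 80, 15, 80), (12, 120, 22, 120)]
print("fragility index:", fragility_index(trials))
```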


Threshold for clinical relevance: Beyond statistical significance, you can calculate how much the pooled effect would need to shift to cross a clinically meaningful threshold. If the pooled risk ratio is 0.72 and the minimally important difference is 0.85, the question becomes: "What would need to change for the effect to become clinically unimportant?"
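
That question can be framed quantitatively: how many standard errors separate the pooled estimate from the MID, and does the confidence interval already cross it? A small sketch, assuming a hypothetical 95% CI of 0.60 to 0.86 around the RR of 0.72:

```python
import math

# Hypothetical pooled result: RR 0.72 with an assumed 95% CI of 0.60 to 0.86
rr, ci_lo, ci_hi = 0.72, 0.60, 0.86
mid = 0.85  # minimally important difference on the risk-ratio scale

se = (math.log(ci_hi) - math.log(ci_lo)) / (2 * 1.96)  # SE on the log scale
z_to_mid = (math.log(mid) - math.log(rr)) / se

print(f"pooled log RR sits {z_to_mid:.1f} standard errors below the MID")
print("upper CI limit exceeds the MID:", ci_hi > mid)
```

Here the point estimate is comfortably below the MID, but the upper confidence limit crosses it, so clinical importance remains uncertain even though statistical significance is secure.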

Unmeasured confounding sensitivity analysis: For systematic reviews of observational studies, the E-value quantifies how strong an unmeasured confounder would need to be to explain away the observed association. A large E-value means the result is robust to potential confounding; a small E-value means even weak confounding could account for the finding.
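
The E-value has a closed form for risk ratios (VanderWeele and Ding's formula), so it is straightforward to compute; the RR values below are illustrative only:

```python
import math

def e_value(rr):
    """E-value for a risk ratio (VanderWeele & Ding formula).
    Protective ratios (RR < 1) are inverted before applying the formula."""
    if rr < 1:
        rr = 1 / rr
    return rr + math.sqrt(rr * (rr - 1))

# Hypothetical pooled RR of 0.72: how strong would unmeasured confounding
# need to be (on the risk-ratio scale) to explain it away?
print(f"E-value for the point estimate: {e_value(0.72):.2f}")
# Also report the E-value for the CI limit closest to the null (assumed 0.86)
print(f"E-value for the CI limit:      {e_value(0.86):.2f}")
```

Reporting the E-value for both the point estimate and the confidence limit closest to the null is conventional, since the latter shows how much confounding would be needed to move the interval across the null.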

Risk of Bias Sensitivity Analysis

Restricting the meta-analysis to studies assessed as having low risk of bias is one of the most important and commonly performed sensitivity analyses. The Cochrane Handbook and GRADE framework both recommend this approach.

Implementation: After completing your risk of bias assessment (using RoB 2 for RCTs or ROBINS-I for non-randomized studies), run two meta-analyses: one including all studies and one restricted to those rated as low risk of bias overall.
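
A sketch of that two-analysis comparison, using hypothetical effect sizes and RoB 2 ratings with a DerSimonian-Laird random-effects model:

```python
import math

def dl_random_effects(effects, variances):
    """DerSimonian-Laird random-effects pooled estimate and standard error."""
    w = [1 / v for v in variances]
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)  # between-study variance
    w_star = [1 / (v + tau2) for v in variances]
    est = sum(wi * y for wi, y in zip(w_star, effects)) / sum(w_star)
    return est, math.sqrt(1 / sum(w_star))

# Hypothetical SMDs, variances, and overall RoB 2 ratings per trial
effects = [-0.45, -0.30, -0.60, -0.10, -0.50]
variances = [0.04, 0.02, 0.05, 0.03, 0.06]
rob = ["low", "some concerns", "high", "low", "some concerns"]

for label, keep in [("all studies", [True] * 5),
                    ("low risk of bias only", [r == "low" for r in rob])]:
    ys = [y for y, k in zip(effects, keep) if k]
    vs = [v for v, k in zip(variances, keep) if k]
    est, se = dl_random_effects(ys, vs)
    print(f"{label}: k={len(ys)}, SMD {est:+.2f} "
          f"(95% CI {est - 1.96 * se:+.2f} to {est + 1.96 * se:+.2f})")
```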

Interpreting discordance: If the pooled effect is significant when all studies are included but non-significant when restricted to low-bias studies, this has direct implications for GRADE certainty ratings. The evidence may be rated down for risk of bias if the result depends on studies with serious methodological limitations.

Stratified analysis: Rather than a binary include/exclude approach, stratify studies by risk of bias level (low, some concerns, high) and test for subgroup differences. This reveals whether effect sizes differ systematically by study quality; when lower-quality studies also tend to be smaller, this pattern can overlap with small-study effects and publication bias.

Reporting Sensitivity Analysis Results

PRISMA 2020 (item 20d) requires reporting the results of all sensitivity analyses conducted, including those where conclusions did not change. The SWiM guideline provides additional reporting recommendations for reviews that synthesize results without meta-analysis.

Best practices for reporting:

  1. Report every pre-specified sensitivity analysis, including those whose conclusions matched the primary analysis.
  2. Present results side by side, ideally in a summary table, so readers can compare specifications at a glance.
  3. Report point estimates and confidence intervals for each analysis, not just whether statistical significance was retained.
  4. Clearly label any post hoc sensitivity analyses as exploratory.

Example reporting language: "The primary analysis included 14 trials and found a pooled standardized mean difference of -0.45 (95% CI: -0.62 to -0.28) favoring the intervention. Restricting to the 8 trials with low risk of bias yielded a smaller but still significant effect (SMD -0.31, 95% CI: -0.52 to -0.10). Leave-one-out analysis showed that no single trial changed the direction or significance of the pooled estimate. Results were consistent when using a fixed-effect model (SMD -0.42, 95% CI: -0.55 to -0.29)."

Pre-Specifying Sensitivity Analyses in the Protocol

Sensitivity analyses gain credibility when pre-specified. Include a dedicated section in your protocol or PROSPERO registration:

  1. List each planned sensitivity analysis with justification. Example: "We will restrict the meta-analysis to studies rated as low risk of bias to assess whether pooled effects are driven by methodologically weaker studies."
  2. Distinguish from subgroup analyses. Subgroup analyses explore effect modification by clinical characteristics. Sensitivity analyses test robustness to methodological choices. Some analyses could be either (e.g., restricting by study design), so label them clearly.
  3. Limit the number. Running 20 sensitivity analyses inflates the chance of finding one that "works." Pre-specify 3-6 that address the most important decision nodes for your review question.
  4. Allow for post hoc additions. State that additional sensitivity analyses may be conducted if unexpected issues arise during data extraction or analysis, and label these as exploratory.

Common Mistakes to Avoid

Running sensitivity analysis only when results are unexpected. If you only test robustness when the primary result surprises you, this introduces bias. Pre-specify analyses regardless of what you expect to find.

Interpreting a non-significant sensitivity analysis as "no effect." If restricting to low risk of bias studies makes the pooled effect non-significant, this does not prove the intervention is ineffective. It may simply reflect reduced statistical power from fewer studies. Report the point estimate and confidence interval, not just the p-value.

Dropping studies based on sensitivity results. Sensitivity analysis is diagnostic, not prescriptive. If leave-one-out reveals an influential study, investigate and discuss it. Do not remove it from the primary analysis without a pre-specified, methodologically justified reason.

Ignoring sensitivity analyses that change the conclusion. If one of your pre-specified analyses overturns the result, this is arguably the most important finding of your review. Reporting only the analyses that support your conclusion is a form of selective reporting.

Not running enough sensitivity analyses. A single leave-one-out analysis is better than nothing, but it only addresses one type of uncertainty. Aim for sensitivity analyses that cover study inclusion, risk of bias, statistical model, and at least one domain specific to your review question (e.g., missing data handling, dose categorization, follow-up duration).