Individual participant data meta-analysis pools raw participant-level data across studies instead of aggregate effect estimates. This guide covers when an IPD meta-analysis is justified, one-stage versus two-stage models, data acquisition and harmonization, software (Stata ipdmetan, R metafor and ipdma), reporting under PRISMA-IPD, and common pitfalls in subgroup and interaction testing.
Dr. Sarah Mitchell
April 27, 2026
Key Takeaways
IPD meta-analysis pools raw participant-level data across trials and applies one consistent analytic framework, which usually reduces heterogeneity from analytic divergence rather than true clinical variation.
The strongest justification for IPD pooling is within-trial interaction testing for effect modification by age, sex, severity, biomarker status, or comorbidity. Aggregate-data meta-regression cannot do this without ecological bias.
Two-stage models pool study-level estimates and reproduce the familiar forest-plot output. One-stage hierarchical models are preferred for sparse binary outcomes and efficient interaction estimation.
Most IPD projects take 18 to 36 months. Data acquisition (author contact, data-sharing agreements, validation, and harmonisation) typically consumes 9 to 18 months and is the rate-limiting step.
Common tools include Stata ipdmetan, R metafor for two-stage pooling, and R ipdma, lme4, glmmTMB, or survival for one-stage hierarchical models. Bayesian models are fitted in Stan or JAGS.
Reviews must follow the PRISMA-IPD extension. The most consequential additions are a flow diagram of trials and participants contributed, a data-integrity check against each trial's published primary analysis, and explicit within-trial interaction reporting.
An IPD review is the wrong design when the question is purely about an average treatment effect, when trials used a single common outcome, or when the budget and timeline cannot absorb the additional 12 to 24 months.
What an IPD meta-analysis offers that aggregate-data pooling cannot
An individual participant data meta-analysis collects the raw, line-level records of every participant in each included study and re-analyses them together rather than pooling published summary effects. The Cochrane Handbook describes this approach in Chapter 26 and treats it as the methodological reference standard when participant-level data are obtainable, particularly for randomised trials of an intervention with a continuous, binary, time-to-event, or repeated-measures outcome.
Aggregate-data meta-analysis is fast, transparent, and reproducible from public reports, but it inherits whatever choices the original authors made about exposure definitions, follow-up windows, missing-data handling, and covariate adjustment. A pooled analysis of individual participant data lets the review team apply one consistent analytic framework across every study, which usually reduces heterogeneity that originates from analytic divergence rather than true clinical variation.
Why review teams pursue IPD even when aggregate data are available
The headline argument for IPD pooling is statistical power for subgroup and interaction analysis. Aggregate-data meta-regression can detect ecological associations across studies, but it cannot test whether a treatment effect varies across individuals within studies. Within-trial interaction testing requires participant-level data. When the research question is about effect modification by age, sex, baseline severity, biomarker status, or comorbidity, IPD pooling is the only design that protects against ecological bias.
Other practical reasons that drive a team toward IPD include:
Time-to-event outcomes with study-specific follow-up windows, where aggregate hazard ratios cannot be reconstructed without participant-level censoring information.
Frequently Asked Questions
An IPD meta-analysis collects the raw participant-level data from each included study and re-analyses them together using one consistent analytic framework, rather than pooling the published summary effect estimates. It is the methodological reference standard for synthesising randomised trials when participant-level data are obtainable.
A regular (aggregate-data) meta-analysis pools effect estimates already published in each trial, so it inherits the original authors' choices about outcome definitions, follow-up windows, and covariate adjustment. An IPD meta-analysis pools the raw records of every participant, which lets the review team apply one consistent analysis across every trial and lets them test whether the treatment effect varies across individuals within studies.
IPD is justified when the research question is about effect modification, when the trials used heterogeneous outcome scales that need harmonisation, when the outcome is time-to-event with non-proportional hazards, or when existing aggregate-data syntheses disagree because of inconsistent definitions. If the question is just about an average treatment effect across one common outcome, an aggregate-data review is usually sufficient.
Two-stage models analyse each trial separately to obtain a treatment-effect estimate and standard error, then pool those estimates with a standard random-effects meta-analysis. One-stage models fit a single hierarchical regression across all participant records, with study identifiers entered as random effects. Two-stage is the default for most settings; one-stage is preferred for sparse binary outcomes and for efficient interaction estimation.
Most published IPD projects take between 18 and 36 months from protocol registration to manuscript submission. The bottleneck is data acquisition: contacting corresponding authors, negotiating data-sharing agreements, validating the supplied files, and harmonising outcome and covariate definitions across trials typically consumes 9 to 18 months on its own.
Common tools include Stata's ipdmetan and mvmeta packages for two-stage pooling, R's metafor for two-stage random-effects models, and R's ipdma, lme4, glmmTMB, or survival packages for one-stage hierarchical models. Bayesian one-stage models are typically fitted in Stan, JAGS, or WinBUGS when prior information needs to be incorporated.
Every contributing trial requires a data-sharing agreement and usually institutional review board approval on both the contributing and the receiving side. Response rates from corresponding authors typically range from 50 to 80 percent, with industry-sponsored trials and trials older than 15 years being the hardest to obtain.
The PRISMA-IPD extension, published in JAMA in 2015 and therefore predating PRISMA 2020, adds IPD-specific items to the standard PRISMA reporting items in a 33-item checklist. The most consequential additions are a flow diagram showing trials and participants contributed, a data-integrity check against each trial's published primary analysis, and explicit reporting of within-trial interaction tests for any subgroup analysis.
Reviews should declare a sensitivity analysis comparing the IPD-only pooled estimate with a hybrid estimate that incorporates aggregate data from the non-responding trials. The two can diverge meaningfully when non-response is non-random, particularly when industry-sponsored or older trials are systematically less likely to share.
IPD is not always the better design. It adds value when the question concerns participant-level variation in treatment effect within trials, outcome harmonisation, or effect modification. For a straightforward pooled estimate of an average treatment effect with a single common outcome, an aggregate-data review with sensitivity analyses produces nearly identical inference at a fraction of the cost and timeline.
Cochran's Q is the traditional test of heterogeneity in meta-analysis. It sums the squared deviations of each study's effect estimate from the pooled estimate, weighted by inverse variance, and follows a chi-square distribution with k minus 1 degrees of freedom under the null hypothesis of homogeneity. In IPD meta-analysis, Q is computed at the second stage of two-stage pooling or implied by the random-effects variance component in one-stage hierarchical models. Q is underpowered with few trials and over-sensitive with many, so it is reported alongside I-squared, which expresses the proportion of total variation attributable to genuine between-study differences rather than sampling error.
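The Q and I-squared calculation described above can be sketched in a few lines. This is a minimal illustration, assuming the stage-one effect estimates and standard errors are already in hand; the function name and the example numbers are hypothetical, and I-squared is computed with the usual (Q - df) / Q definition, truncated at zero.

```python
def cochran_q_i2(effects, std_errors):
    """Return (Q, degrees of freedom, I-squared in percent) for k study estimates."""
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need at least two studies with matching effects and standard errors")
    # Inverse-variance weights
    weights = [1.0 / se ** 2 for se in std_errors]
    # Common-effect pooled estimate that Q measures deviations from
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Weighted sum of squared deviations from the pooled estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of total variation beyond what sampling error explains
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2


# Hypothetical log odds ratios and standard errors from four trials
log_or = [-0.35, -0.10, -0.52, 0.05]
se = [0.15, 0.20, 0.25, 0.18]
q, df, i2 = cochran_q_i2(log_or, se)
print(f"Q = {q:.2f} on {df} df, I^2 = {i2:.1f}%")
```

The p-value for Q would come from a chi-square distribution with df degrees of freedom, which needs a statistics library and is left out of this sketch.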
Dr. Sarah Mitchell holds a PhD in Biostatistics from Johns Hopkins Bloomberg School of Public Health and has over 15 years of experience in systematic review methodology and meta-analysis. She has authored or co-authored 40+ peer-reviewed publications in journals including the Journal of Clinical Epidemiology, BMC Medical Research Methodology, and Research Synthesis Methods. A former Cochrane Review Group statistician and current editorial board member of Systematic Reviews, Dr. Mitchell has supervised 200+ evidence synthesis projects across clinical medicine, public health, and social sciences.
Outcome harmonisation when included trials measured the same construct with different scales (Hamilton Depression versus Beck Depression Inventory) and a common conversion is needed.
Risk of bias from selective reporting at the outcome level, which IPD checks can detect by recomputing effects from raw data.
Updating a network meta-analysis with patient-level covariates to adjust for transitivity violations.
The Cochrane IPD Methods Group has documented these gains and the trade-offs in their guidance on individual participant data meta-analyses of randomised trials, and Riley and colleagues' 2010 BMJ guidance remains the most-cited synthesis of when IPD is justified.
The decision rule: when is IPD worth the cost
An individual participant data project consumes between 18 and 36 months of calendar time and substantially more researcher and statistical effort than a conventional aggregate-data meta-analysis. Funders increasingly require justification for the additional cost. A defensible IPD proposal should rest on at least one of the following criteria:
The primary question is about effect modification or subgroup variation, and aggregate-data subgroup tests have produced inconclusive or ecologically biased estimates.
The outcome is time-to-event with non-proportional hazards or differential follow-up across trials.
Existing aggregate-data syntheses disagree because of inconsistent outcome definitions or covariate adjustment, and harmonised re-analysis would resolve the discrepancy.
A clinically important subgroup is rare within any single trial but common across the pooled cohort, and within-trial estimates lack precision.
The therapeutic question has stalled at clinical equipoise and a definitive synthesis is needed before the next trial is designed.
If none of these conditions apply, an aggregate-data meta-analysis with sensitivity and meta-regression analyses is usually sufficient and substantially cheaper.
Figure 1. When IPD is worth the cost.
The two analytical frameworks for an IPD meta-analysis differ in whether participant-level data from all trials are analysed together in a single model, or analysed separately by trial and then pooled.
Figure 2. Two-stage versus one-stage IPD architectures.
Two-stage models
Stage one fits a study-specific model to each trial separately, producing an effect estimate (log odds ratio, log hazard ratio, mean difference) and its standard error. Stage two pools those estimates using a standard random-effects meta-analysis model with DerSimonian-Laird or REML estimation of between-study variance. Two-stage models are the default in most published IPD reviews because they reproduce the familiar forest-plot output of an aggregate-data review and they handle clustering by study automatically.
The Burke et al. 2017 Statistics in Medicine paper showed that two-stage models give nearly identical inference to one-stage models for continuous outcomes when the within-trial sample size is adequate. Two-stage models are also more practical when data-sharing agreements require trial-level analyses to remain on the host institution's server, because only the stage-one estimates need to leave each site.
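The two stages can be sketched as follows. This is a minimal illustration for a binary outcome, assuming each record is a (study, treated, event) tuple; the function names, the `expand` helper, and the trial counts are all hypothetical. The 0.5 continuity correction for trials with an empty cell is one common convention rather than something this guide prescribes, and stage two uses the DerSimonian-Laird estimator, not REML.

```python
import math
from collections import defaultdict


def stage_one_log_or(records):
    """Stage 1: per-study log odds ratio and standard error from participant rows."""
    # Cells per study: [treated events, treated non-events, control events, control non-events]
    counts = defaultdict(lambda: [0, 0, 0, 0])
    for study, treated, event in records:
        counts[study][(0 if treated else 2) + (0 if event else 1)] += 1
    estimates = {}
    for study, cells in counts.items():
        if 0 in cells:
            # Assumed convention: add 0.5 to every cell when any cell is empty
            cells = [n + 0.5 for n in cells]
        a, b, c, d = cells
        estimates[study] = (math.log(a * d / (b * c)),
                            math.sqrt(1 / a + 1 / b + 1 / c + 1 / d))
    return estimates


def stage_two_dersimonian_laird(estimates):
    """Stage 2: random-effects pooling with the DerSimonian-Laird tau-squared."""
    y = [est for est, _ in estimates.values()]
    v = [se ** 2 for _, se in estimates.values()]
    w = [1 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c) if c > 0 else 0.0
    w_star = [1 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    return pooled, math.sqrt(1 / sum(w_star)), tau2


def expand(study, treated_events, treated_n, control_events, control_n):
    """Hypothetical helper: build participant rows from arm-level counts for the demo."""
    return ([(study, 1, 1)] * treated_events + [(study, 1, 0)] * (treated_n - treated_events)
            + [(study, 0, 1)] * control_events + [(study, 0, 0)] * (control_n - control_events))


records = (expand("A", 12, 100, 20, 100) + expand("B", 8, 80, 15, 82)
           + expand("C", 30, 250, 41, 245))
pooled, se, tau2 = stage_two_dersimonian_laird(stage_one_log_or(records))
print(f"pooled log OR = {pooled:.3f} (SE {se:.3f}), tau^2 = {tau2:.3f}")
```

In a real project, stage one would usually be a regression with the pre-specified covariate adjustment rather than a crude 2x2 table, but the hand-off to stage two (one estimate and one standard error per trial) is the same.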
One-stage models
A one-stage model fits a single hierarchical regression across all participant records, with study identifiers entered as random effects (random intercepts and, optionally, random treatment-effect slopes). For binary outcomes the typical specification is a generalised linear mixed model with a logit link and study-level random effects. For time-to-event outcomes a stratified Cox model or a frailty model with study as the frailty term is standard.
One-stage models are preferred when:
The number of events per trial is small (sparse-data bias makes two-stage estimates unstable).
Within-trial interaction terms need to be estimated efficiently (the one-stage framework permits a treatment-by-covariate term while keeping study-level confounding under random-intercept control).
The team needs flexible non-linear modelling of continuous covariates using restricted cubic splines.
The Riley 2023 paper "Two-stage or not two-stage?" in Research Synthesis Methods is the most current guidance and recommends one-stage for sparse-data binary outcomes and two-stage as the default elsewhere.
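To make the one-stage idea concrete, here is a deliberately simplified sketch for a continuous outcome. It assumes a separate (stratified) intercept for each study and a single common treatment effect, which is a simpler specification than the random-effects models described above; random intercepts or random treatment slopes would need mixed-model software such as lme4. With stratified intercepts, the treatment coefficient from one regression on all participants equals the slope obtained after centring treatment and outcome within each study, which is what the code exploits. The function name and data are hypothetical.

```python
import math
from collections import defaultdict


def one_stage_stratified(records):
    """One regression across all participants: outcome = study intercept + theta * treated.

    records: iterable of (study, treated, outcome) with treated coded 0/1.
    Returns (theta, standard error), assuming a common residual variance.
    """
    by_study = defaultdict(list)
    for study, treated, outcome in records:
        by_study[study].append((float(treated), float(outcome)))

    centred = []
    sxx = sxy = 0.0
    for rows in by_study.values():
        t_bar = sum(t for t, _ in rows) / len(rows)
        y_bar = sum(y for _, y in rows) / len(rows)
        for t, y in rows:
            # Centring within study absorbs the study-specific intercept
            tc, yc = t - t_bar, y - y_bar
            centred.append((tc, yc))
            sxx += tc * tc
            sxy += tc * yc

    theta = sxy / sxx
    rss = sum((yc - theta * tc) ** 2 for tc, yc in centred)
    # Residual degrees of freedom: N minus one intercept per study minus the treatment term
    dof = len(centred) - len(by_study) - 1
    return theta, math.sqrt(rss / dof / sxx)


# Hypothetical participant rows from two trials with different baseline levels
records = [
    ("A", 1, 12.4), ("A", 1, 11.8), ("A", 1, 13.1), ("A", 0, 10.2), ("A", 0, 9.7), ("A", 0, 10.9),
    ("B", 1, 52.0), ("B", 1, 53.3), ("B", 0, 50.1), ("B", 0, 49.6),
]
theta, se = one_stage_stratified(records)
print(f"common treatment effect = {theta:.2f} (SE {se:.2f})")
```

Because every participant contributes to one likelihood, covariates and treatment-by-covariate terms can be added to the same model, which is where the one-stage framework earns its efficiency for interaction estimation.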
Data acquisition: the bottleneck most teams underestimate
Securing individual participant data is the rate-limiting step of any IPD project. The workflow is:
Inventory candidate studies from a systematic search (the search itself follows PRISMA 2020 and is no different from a conventional review).
Contact corresponding authors with a structured request describing the planned analysis, the variables required, and the data-sharing infrastructure.
Negotiate a data transfer agreement with each contributing institution, often involving institutional review board approvals on both sides.
Receive and validate the data, recomputing the trial's published primary analysis from the raw file as an integrity check.
Harmonise variables across trials, mapping outcome scales, covariate definitions, and time units to a common analytical schema.
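The integrity check in the validation step can be sketched as a simple comparison. This is a minimal illustration, assuming the trial's published primary analysis was an unadjusted mean difference; the function name, arm labels, tolerance, and numbers are hypothetical, and a real check would re-run whatever model the trial actually reported.

```python
from statistics import mean


def check_against_publication(rows, published_mean_difference, tolerance=0.05):
    """Recompute a trial's mean difference from raw rows and compare with the published value.

    rows: iterable of (arm, outcome) with arm equal to "treatment" or "control".
    tolerance: assumed acceptable absolute discrepancy, to be set in the protocol.
    """
    treated = [y for arm, y in rows if arm == "treatment"]
    control = [y for arm, y in rows if arm == "control"]
    if not treated or not control:
        raise ValueError("supplied file must contain participants in both arms")
    recomputed = mean(treated) - mean(control)
    discrepancy = recomputed - published_mean_difference
    return {
        "n_supplied": len(treated) + len(control),
        "recomputed": recomputed,
        "discrepancy": discrepancy,
        "reproduced": abs(discrepancy) <= tolerance,
    }


# Hypothetical supplied file and published estimate
supplied = [("treatment", 5.0), ("treatment", 7.0), ("control", 4.0), ("control", 6.0)]
print(check_against_publication(supplied, published_mean_difference=1.02))
```

Trials that fail the check are queried with the original investigators before harmonisation begins; comparing the number of participants supplied with the number reported is a useful companion check.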
Empirical reports place the typical response rate from corresponding authors at between 50 percent and 80 percent of contacted teams, with refusal rates highest for industry-sponsored trials and trials older than 15 years. Review protocols should declare a sensitivity analysis comparing the IPD-only pooled estimate with a hybrid estimate that incorporates aggregate data from non-responding trials, since the two can diverge meaningfully when non-response is non-random.
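The sensitivity analysis can be sketched as two pooled estimates side by side. This is a minimal illustration using common-effect inverse-variance pooling for brevity (a random-effects model would normally be used in practice); the function name and every number are hypothetical.

```python
import math


def inverse_variance_pool(estimates):
    """Pool (effect, standard error) pairs with inverse-variance weights."""
    weights = [1 / se ** 2 for _, se in estimates]
    pooled = sum(w * y for w, (y, _) in zip(weights, estimates)) / sum(weights)
    return pooled, math.sqrt(1 / sum(weights))


# Hypothetical log hazard ratios: estimates recomputed from supplied IPD ...
ipd_estimates = [(-0.40, 0.15), (-0.32, 0.20), (-0.45, 0.18)]
# ... and published aggregate estimates from trials that did not share data
aggregate_only = [(-0.05, 0.22), (0.02, 0.25)]

ipd_only = inverse_variance_pool(ipd_estimates)
hybrid = inverse_variance_pool(ipd_estimates + aggregate_only)
shift = hybrid[0] - ipd_only[0]

print(f"IPD-only: {ipd_only[0]:.3f} (SE {ipd_only[1]:.3f})")
print(f"Hybrid:   {hybrid[0]:.3f} (SE {hybrid[1]:.3f})")
print(f"Shift when non-responding trials are added: {shift:+.3f}")
```

A shift that is large relative to the standard error suggests that data availability is related to trial results, which is the scenario the protocol-declared sensitivity analysis is meant to expose.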
Software for IPD meta-analysis
The software ecosystem has matured substantially in the last decade. The main packages are:
Stata ipdmetan for two-stage IPD pooling with forest-plot output and meta-regression hooks.
Stata mvmeta for multivariate meta-analysis when outcomes are correlated.
R metafor for two-stage random-effects models with REML and a wide range of moderator and meta-regression options.
R ipdma and ipdmeta packages for one-stage hierarchical models including treatment-by-covariate interactions.
R lme4 or glmmTMB for one-stage generalised linear mixed models, especially for binary and count outcomes.
R survival with frailty terms or stratified Cox models for time-to-event one-stage analyses.
Reporting templates and replication code for the most-cited IPD methods papers are available through the Centre for Statistics in Medicine and through the Cochrane IPD Methods Group resource library.
Reporting under PRISMA-IPD
The PRISMA-IPD extension, published in JAMA in 2015 and therefore predating PRISMA 2020, provides a 33-item checklist that adds IPD-specific items to the standard PRISMA reporting items. The most consequential additions are:
A flow diagram showing the number of trials approached, the number that responded, the number that supplied IPD, and the participants contributed by each.
A description of the data-checking procedures used to verify that the supplied IPD reproduces the trial's published primary analysis.
A statement of how missing IPD trials were handled, including a sensitivity analysis comparing IPD-only and IPD-plus-aggregate estimates.
An explicit reporting of within-trial interaction tests for any subgroup or effect-modification analysis, with separate within-trial and across-trial estimates.
Reviews that fail to report within-trial interaction estimates are vulnerable to ecological bias and have been re-analysed by independent groups in several published critiques.
Common pitfalls
The most frequent errors in published IPD reviews include:
Analysing the pooled IPD as if it were one large trial, without accounting for clustering by study, which inflates type I error rates by an order of magnitude in some settings.
Pooling within-trial and across-trial covariate effects in one regression term, which conflates effect modification with confounding.
Selective inclusion of trials that supplied IPD without comparing characteristics with non-responding trials, biasing the pooled estimate toward the responder population.
Reporting subgroup p-values without an interaction test, a problem the EQUATOR Network and several editorials have repeatedly flagged.
Underpowered interaction analyses that are reported as "no evidence of effect modification" when the confidence interval is wide enough to include clinically important interactions.
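Several of these pitfalls are avoided by estimating the interaction inside each trial and pooling only those within-trial estimates, so that across-trial differences never enter the interaction term. The sketch below illustrates this for a continuous outcome and a binary covariate, assuming every trial has at least two participants in each arm-by-subgroup cell; the function names, the `cell` helper, and the data are hypothetical, and pooling is common-effect for brevity.

```python
import math
from collections import defaultdict
from statistics import mean, variance


def within_trial_interactions(records):
    """Per-trial treatment-by-subgroup interaction (difference of mean differences).

    records: (study, treated, subgroup, outcome) with treated and subgroup coded 0/1.
    Returns {study: (interaction, standard error)}.
    """
    cells = defaultdict(list)
    for study, treated, subgroup, outcome in records:
        cells[(study, subgroup, treated)].append(outcome)
    results = {}
    for study in sorted({key[0] for key in cells}):
        effects = []
        for subgroup in (0, 1):
            trt = cells[(study, subgroup, 1)]
            ctl = cells[(study, subgroup, 0)]
            diff = mean(trt) - mean(ctl)
            var = variance(trt) / len(trt) + variance(ctl) / len(ctl)
            effects.append((diff, var))
        # Interaction: treatment effect in subgroup 1 minus treatment effect in subgroup 0
        interaction = effects[1][0] - effects[0][0]
        results[study] = (interaction, math.sqrt(effects[0][1] + effects[1][1]))
    return results


def pool_interactions(results):
    """Inverse-variance pooling of the within-trial interaction estimates only."""
    weights = [1 / se ** 2 for _, se in results.values()]
    pooled = sum(w * est for w, (est, _) in zip(weights, results.values())) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return pooled, se, (pooled - 1.96 * se, pooled + 1.96 * se)


def cell(study, treated, subgroup, outcomes):
    """Hypothetical helper: participant rows for one arm-by-subgroup cell."""
    return [(study, treated, subgroup, y) for y in outcomes]


records = (
    cell("A", 1, 0, [3.0, 5.0, 4.0]) + cell("A", 0, 0, [2.0, 3.0, 1.0])
    + cell("A", 1, 1, [7.0, 9.0, 8.5]) + cell("A", 0, 1, [2.5, 4.0, 3.0])
    + cell("B", 1, 0, [4.0, 6.0, 5.5]) + cell("B", 0, 0, [3.0, 4.5, 3.5])
    + cell("B", 1, 1, [8.0, 6.5, 9.0]) + cell("B", 0, 1, [4.0, 3.0, 5.0])
)
pooled, se, ci = pool_interactions(within_trial_interactions(records))
print(f"pooled within-trial interaction = {pooled:.2f} (95% CI {ci[0]:.2f} to {ci[1]:.2f})")
```

Reporting the confidence interval rather than only a p-value makes it visible when an apparently null interaction is simply underpowered.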
When to commission an IPD meta-analysis
An IPD meta-analysis is the right design when the clinical or policy question depends on participant-level variation in treatment effect within trials, when the included trials used heterogeneous outcome definitions, or when an aggregate-data review has produced inconclusive or contested estimates. It is the wrong design when the question is purely about average treatment effect, when the trials used a single common outcome, or when the budget and timeline cannot absorb 18 months and a full statistical team. In those cases a conventional aggregate-data review with sensitivity analyses delivers most of the inferential value at a fraction of the cost.
For most thesis and journal-article timelines, an aggregate-data meta-analysis is the appropriate design. IPD reviews are typically funded as multi-year collaborations and benefit from a methodologist with prior IPD experience leading the analytic plan from the protocol stage. For teams who need that statistical lead but cannot recruit one in-house, our PhD-led meta-analysis service and full data analysis service handle one-stage and two-stage IPD pooling end-to-end.
WinBUGS, JAGS, or Stan for Bayesian one-stage models when prior information needs to be incorporated.