Heterogeneity & Sample Size Calculator

Free

Assess heterogeneity in your meta-analysis with I², Cochran's Q, τ² (DerSimonian-Laird or REML estimator), and prediction intervals. Import studies directly from the Effect Size Calculator or Forest Plot via the pipeline, or drag-drop a CSV/Excel file. Calculate the minimum number of studies needed to detect a given effect size.

How to Use

Heterogeneity tab: Import studies from the Effect Size Calculator or Forest Plot via the pipeline banner, drag-drop a CSV/Excel file, or enter effect sizes and standard errors manually. In the results panel, toggle the τ² estimator between DerSimonian-Laird (DL) and REML. REML uses Fisher scoring iteration and gives less biased estimates when the number of studies is small. All inputs are auto-saved locally. Export results as CSV or Excel, or copy them to the clipboard.
Power tab: Enter your expected effect size, significance level, and desired power to estimate the minimum number of studies needed.
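The page does not state which formula the Power tab uses, so the following is only a minimal sketch of one common approach: a two-sided z-test on the pooled random-effects estimate, where the variance of the pooled estimate is approximated as (typical within-study variance + τ²) / k. The function names and the `within_var` and `tau2` inputs are assumptions made for illustration and are not taken from the tool.

```python
import math
from statistics import NormalDist


def meta_power(delta, k, within_var, tau2=0.0, alpha=0.05):
    """Approximate two-sided power to detect a pooled effect `delta` with k studies.

    Assumes every study has the same within-study variance `within_var`
    (a simplification) and a between-study variance `tau2`.
    """
    nd = NormalDist()
    se_pooled = math.sqrt((within_var + tau2) / k)
    z_crit = nd.inv_cdf(1 - alpha / 2)
    shift = delta / se_pooled
    # Probability of rejecting in either tail under the alternative.
    return (1 - nd.cdf(z_crit - shift)) + nd.cdf(-z_crit - shift)


def min_studies(delta, within_var, tau2=0.0, alpha=0.05, power=0.80, max_k=10_000):
    """Smallest k (at least 2) whose approximate power reaches the target, else None."""
    for k in range(2, max_k + 1):
        if meta_power(delta, k, within_var, tau2, alpha) >= power:
            return k
    return None


if __name__ == "__main__":
    # Hypothetical planning inputs, not values from the calculator.
    print(min_studies(delta=0.3, within_var=0.04, tau2=0.02))
```

Under this approximation, a larger τ² raises the number of studies needed, because each added study contributes less precision when true effects vary.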

Analysis Pipeline

Move data between tools automatically. Compute effect sizes, then send results to Forest Plot, Funnel Plot, or Heterogeneity analysis with one click.

Drag & drop a file: CSV, TSV, or Excel (.xlsx/.xls), max 500 rows.

Study Data

Load sample data to see how the tool works, or clear all fields to start fresh.

Columns: #, Study Name, Effect Size (y), Standard Error (SE)

Enter effect sizes and SEs for at least 2 studies.

How to Use This Calculator

Import or enter your study data. You have three options: import studies directly from the Effect Size Calculator or Forest Plot Generator via the pipeline banner at the top of the tool; drag-drop a CSV or Excel file (or paste spreadsheet data) using the import area; or type effect sizes and standard errors manually into the study table.
Review your study entries. The tool requires at least two studies with valid effect sizes and standard errors. Add or remove rows as needed. All inputs are auto-saved to your browser, so you can close the tab and return later without losing work.
Choose your τ² estimator. Toggle between DerSimonian-Laird (DL) and REML in the results panel. DL is the classic moment-based estimator and remains the default in many software packages. REML (Restricted Maximum Likelihood) uses Fisher scoring iteration (up to 100 iterations, convergence threshold 1×10⁻⁸) and produces less biased estimates, especially when the number of studies is small. When REML is selected, the DL estimate is shown alongside for comparison.
Interpret the results. The panel reports Cochran's Q, I², H², τ², and the pooled random-effects estimate with its 95% confidence interval and 95% prediction interval. Use the traffic-light interpretation guide below to contextualize your I² value.
Export your results. Copy the summary to your clipboard, or export the full study-level data (including fixed-effect and random-effects weights) as CSV or Excel for use in your manuscript or supplementary materials.
Continue in the pipeline. The workflow bar at the top shows where this tool sits in the Synthesis phase. After assessing heterogeneity, proceed to the Forest Plot Generator or Funnel Plot Generator to visualize your pooled results.
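As a rough illustration of the quantities named in the steps above, the sketch below computes Cochran's Q, I², H², and fixed-effect weight percentages from effect sizes and standard errors. It uses the standard textbook definitions (I² = (Q − df) / Q truncated at zero, H² = Q / df). The function name and the returned keys are hypothetical, and the calculator's own code may differ in details such as rounding or truncation.

```python
def heterogeneity_stats(effects, std_errors):
    """Return Q, df, I2 (percent), H2 and fixed-effect weight percentages."""
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need at least two studies with matching effects and SEs")

    # Inverse-variance (fixed-effect) weights.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_w = sum(weights)
    pooled_fixed = sum(w * y for w, y in zip(weights, effects)) / total_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - pooled_fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    h2 = q / df
    return {
        "Q": q,
        "df": df,
        "I2": i2,
        "H2": h2,
        "fixed_weight_pct": [100 * w / total_w for w in weights],
    }


if __name__ == "__main__":
    # Hypothetical data: three studies with equal standard errors.
    print(heterogeneity_stats([0.1, 0.3, 0.5], [0.1, 0.1, 0.1]))
```

The Q-test p-value is left out because it needs a chi-square distribution function, which the Python standard library does not provide.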

Interpreting Heterogeneity

I² = 0–40%

Low heterogeneity. Studies are reasonably consistent. A fixed-effect model may be appropriate.

I² = 40–75%

Moderate heterogeneity. Consider exploring sources via subgroup or meta-regression analyses.

I² = 75–100%

High heterogeneity. Results should be interpreted with caution. Investigate sources before pooling.

Heterogeneity Assessment in Meta-Analysis: From I-Squared to Prediction Intervals

An I-squared calculator quantifies the proportion of total variability in a set of effect estimates that is attributable to genuine between-study differences rather than within-study sampling error. Introduced by Higgins & Thompson (2002) and codified in the Cochrane Handbook (Higgins et al., 2023), I-squared has become the most widely reported heterogeneity statistic in systematic reviews because it offers an intuitive percentage-based interpretation. Cochrane RevMan, the standard software for Cochrane systematic reviews, reports I-squared alongside Cochran's Q and tau-squared as built-in heterogeneity statistics in every forest plot output. However, I-squared alone does not tell the full story. It is a relative measure that describes the ratio of between-study variance to total variance, and its value depends heavily on the precision of the included studies. Two meta-analyses with identical between-study variance can produce very different I-squared values if one contains large, precise trials and the other contains small, imprecise ones.

This is why a comprehensive heterogeneity test calculator must also report absolute measures. Tau-squared (the between-study variance on the effect-size scale) provides the information that I-squared cannot: how much the true effects actually vary across studies. DerSimonian & Laird (1986) proposed the most commonly used moment-based estimator of tau-squared, and it remains the default in many software packages. More modern estimators, such as restricted maximum likelihood (REML), now recommended as the preferred alternative to DerSimonian-Laird for tau-squared estimation, and the Paule-Mandel estimator, which has been shown to produce improved between-study variance estimates with better coverage properties, produce less biased estimates, particularly when the number of studies is small. Regardless of the estimator, tau-squared feeds directly into the random-effects model weight calculation and into prediction intervals, which describe the range within which the true effect of a future similar study is expected to fall.
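To make the role of tau-squared concrete, here is a minimal sketch of how a τ² value enters the random-effects weights, the pooled estimate, and a prediction interval of the form μ ± t(k−2) · sqrt(τ² + SE(μ)²). The function name is hypothetical. The t critical value is passed in as `t_crit` because the Python standard library has no t-distribution quantile function, and the confidence interval uses a normal approximation, which may not match the calculator's choice.

```python
import math
from statistics import NormalDist


def random_effects_summary(effects, std_errors, tau2, t_crit, level=0.95):
    """Pooled random-effects estimate with a CI and a prediction interval.

    `t_crit` is the two-sided t critical value with k - 2 degrees of freedom
    at the same `level`, supplied by the caller (for example from a t table).
    """
    k = len(effects)
    if k < 3:
        raise ValueError("a prediction interval needs at least three studies")

    # Random-effects weights add tau2 to every within-study variance.
    weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    total_w = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total_w
    se_estimate = math.sqrt(1.0 / total_w)

    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    ci_half = z_crit * se_estimate
    # The prediction interval also carries the between-study spread tau2.
    pi_half = t_crit * math.sqrt(tau2 + se_estimate ** 2)

    return {
        "estimate": estimate,
        "se": se_estimate,
        "ci": (estimate - ci_half, estimate + ci_half),
        "pi": (estimate - pi_half, estimate + pi_half),
        "random_weight_pct": [100 * w / total_w for w in weights],
    }


if __name__ == "__main__":
    # Hypothetical data; 12.706 is the 97.5th percentile of t with 1 df.
    print(random_effects_summary([0.1, 0.3, 0.5], [0.1, 0.1, 0.1], tau2=0.03, t_crit=12.706))
```

With few studies the t critical value is large, so the prediction interval is much wider than the confidence interval even for a modest τ².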

The relationship between I-squared and tau-squared is governed by the equation I-squared = tau-squared / (tau-squared + typical within-study variance). This means that meta-analysis heterogeneity can appear low (small I-squared) even when the absolute between-study variance is clinically meaningful, a scenario that arises when the included studies are small and imprecise. Conversely, very large trials can inflate I-squared to alarming levels even when the absolute differences between study effects are trivially small. The Cochrane Handbook recommends interpreting I-squared alongside the confidence interval for I-squared itself, the Q-test p-value, and visual inspection of the forest plot. Prediction intervals, as advocated by IntHout et al. (2016), provide a valuable complement to I-squared by expressing the expected range of true effects in future similar studies, a clinically intuitive measure that can reveal meaningful heterogeneity even when I-squared appears moderate. Our effect size calculator helps ensure that the effect estimates entering your heterogeneity assessment are computed consistently, while the forest plot generator provides the visual context needed to judge whether observed variability is clinically important.
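Written out, the relationship above takes the following form, using the "typical" within-study variance s² defined by Higgins & Thompson (2002). The worked numbers are a hypothetical illustration with three studies and a fixed τ² of 0.03; when all studies share the same variance v, s² reduces to v.

```latex
I^2 = \frac{\tau^2}{\tau^2 + s^2},
\qquad
s^2 = \frac{(k-1)\sum_{i=1}^{k} w_i}{\left(\sum_{i=1}^{k} w_i\right)^2 - \sum_{i=1}^{k} w_i^2},
\qquad
w_i = \frac{1}{v_i}

% Precise studies (SE = 0.1, so s^2 = 0.01):
I^2 = \frac{0.03}{0.03 + 0.01} = 0.75

% Imprecise studies (SE = 0.3, so s^2 = 0.09):
I^2 = \frac{0.03}{0.03 + 0.09} = 0.25
```

The same absolute between-study variance therefore reads as 75% or 25% depending only on how precise the included studies are.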

Beyond descriptive statistics, understanding the sources of heterogeneity is often the most valuable part of a meta-analysis. When I-squared exceeds roughly 40–50%, pre-specified subgroup analyses or meta-regression can be used to explore potential moderators, and PRISMA 2020 (Page et al., 2021) asks authors to report the methods they used to explore heterogeneity. Study-level characteristics such as risk of bias rating, intervention dose, follow-up duration, and participant demographics may explain why effects differ across studies. As a preliminary step, outlier detection using externally studentized residuals can identify individual studies whose effect sizes fall far from the pooled estimate, flagging potential data errors or genuinely distinct study populations that warrant separate investigation. Our meta-regression data formatter structures moderator data for direct import into R metafor or Stata, while the RoB 2 assessment tool provides the methodological quality ratings that serve as common moderator variables. Together, these tools create a workflow where heterogeneity is not merely reported but actively investigated.
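The outlier check mentioned above can be sketched as a leave-one-out calculation: each study is compared with a random-effects model fitted to the remaining studies, and the difference is divided by its standard deviation. This sketch assumes a DerSimonian-Laird fit for each leave-one-out model. The helper names are hypothetical, and the sketch is not a description of how any particular tool computes these residuals.

```python
import math


def _dl_fit(effects, variances):
    """Return (pooled estimate, its variance, tau2) from a DerSimonian-Laird fit."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    mu_fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - mu_fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)
    w_re = [1.0 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    mu = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum_w_re
    return mu, 1.0 / sum_w_re, tau2


def studentized_residuals(effects, std_errors):
    """Externally studentized residual for each study (needs at least 3 studies)."""
    effects = list(effects)
    variances = [se ** 2 for se in std_errors]
    if len(effects) < 3:
        raise ValueError("need at least three studies")
    residuals = []
    for i, (y_i, v_i) in enumerate(zip(effects, variances)):
        # Fit the model without study i, then see how far study i falls from it.
        mu, var_mu, tau2 = _dl_fit(effects[:i] + effects[i + 1:],
                                   variances[:i] + variances[i + 1:])
        residuals.append((y_i - mu) / math.sqrt(v_i + tau2 + var_mu))
    return residuals


if __name__ == "__main__":
    # Hypothetical data: the fifth study sits far from the other four.
    print(studentized_residuals([0.0, 0.0, 0.0, 0.0, 1.0], [0.1] * 5))
```

Residuals well beyond about ±2 are commonly treated as a prompt to re-check the extracted data for that study rather than as grounds for automatic exclusion.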

Finally, the tau-squared calculator output has direct implications for sample size planning in future research. When between-study variance is large, even a well-powered individual trial may not resolve the clinical question. What is needed instead is reduction in the sources of heterogeneity through standardized protocols and patient populations. Conversely, when tau-squared is near zero and the summary effect is imprecise, the field needs more or larger studies rather than better-designed ones. By combining I-squared, tau-squared, prediction intervals, and Cochran's Q in a single assessment, researchers gain a multidimensional view of heterogeneity that supports both the interpretation of current evidence and the planning of future research.

REML vs. DerSimonian-Laird: Choosing the Right τ² Estimator

The DerSimonian-Laird (DL) estimator, introduced in 1986, is the most widely used method for estimating between-study variance (τ²) in random-effects meta-analysis. It is a moment-based estimator: it equates the observed Cochran's Q statistic to its expected value under the random-effects model and solves for τ². DL is computationally simple, non-iterative, and has been the default in Cochrane RevMan and most statistical software for decades. However, DL has well-documented limitations. It tends to underestimate τ² when the number of studies is small (fewer than 15–20), which in turn leads to confidence intervals that are too narrow and pooled estimates that appear more precise than they truly are. This negative bias can have real consequences for clinical decision-making, particularly in fields where meta-analyses routinely include only 5–10 studies.
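The moment-matching step described above is short enough to write out. The sketch below sets Q equal to its expected value under the random-effects model, solves for τ², and truncates negative solutions at zero. The function name is hypothetical.

```python
def dersimonian_laird_tau2(effects, std_errors):
    """DerSimonian-Laird moment estimate of the between-study variance tau2."""
    k = len(effects)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two studies with matching effects and SEs")

    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    mu_fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - mu_fixed) ** 2 for wi, yi in zip(w, effects))

    # E[Q] = (k - 1) + c * tau2, so tau2 = (Q - (k - 1)) / c, floored at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    return max(0.0, (q - (k - 1)) / c)


if __name__ == "__main__":
    # Hypothetical data: Q = 8 on 2 degrees of freedom and c = 200.
    print(dersimonian_laird_tau2([0.1, 0.3, 0.5], [0.1, 0.1, 0.1]))
```

Because the estimate is floored at zero whenever Q falls below its degrees of freedom, small meta-analyses often report τ² = 0 even when some true heterogeneity is present.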

The Restricted Maximum Likelihood (REML) estimator addresses these shortcomings through an iterative likelihood-based approach. REML maximizes a restricted log-likelihood function that accounts for the loss of degrees of freedom from estimating the overall mean, analogous to using n − 1 instead of n in a sample variance calculation. Our implementation uses Fisher scoring, running up to 100 iterations with a convergence threshold of 1×10⁻⁸, initialized at the DL estimate to ensure rapid convergence. Simulation studies by Veroniki et al. (2016) and Langan et al. (2019) have shown that REML produces less biased τ² estimates than DL, particularly when the number of studies is small or when studies have unequal sample sizes. The Cochrane Statistical Methods Group and the metafor R package (Viechtbauer, 2010) now recommend REML as the preferred default estimator for random-effects meta-analysis.
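A Fisher scoring loop of the kind described above might look like the sketch below. It reuses the stated settings (up to 100 iterations, tolerance 1×10⁻⁸, DL starting value), but it is an illustrative reconstruction from the standard REML score and expected information for τ², not the calculator's actual source code. The function names and the optional `start` argument are assumptions.

```python
def _dl_tau2(effects, variances):
    """DerSimonian-Laird estimate, used here only as the starting value."""
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    mu = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - mu) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    return max(0.0, (q - (len(effects) - 1)) / c)


def reml_tau2(effects, std_errors, max_iter=100, tol=1e-8, start=None):
    """REML estimate of tau2 via Fisher scoring, truncated at zero."""
    if len(effects) < 2 or len(effects) != len(std_errors):
        raise ValueError("need at least two studies with matching effects and SEs")
    variances = [se ** 2 for se in std_errors]
    tau2 = _dl_tau2(effects, variances) if start is None else start

    for _ in range(max_iter):
        w = [1.0 / (v + tau2) for v in variances]
        sum_w = sum(w)
        sum_w2 = sum(wi ** 2 for wi in w)
        sum_w3 = sum(wi ** 3 for wi in w)
        mu = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

        # Score is proportional to y'PPy - tr(P); information to tr(PP).
        y_pp_y = sum(wi ** 2 * (yi - mu) ** 2 for wi, yi in zip(w, effects))
        trace_p = sum_w - sum_w2 / sum_w
        trace_pp = sum_w2 - 2.0 * sum_w3 / sum_w + (sum_w2 / sum_w) ** 2

        new_tau2 = max(0.0, tau2 + (y_pp_y - trace_p) / trace_pp)
        if abs(new_tau2 - tau2) < tol:
            return new_tau2
        tau2 = new_tau2
    return tau2  # iteration limit reached; caller may wish to flag this


if __name__ == "__main__":
    # Hypothetical data; with equal standard errors REML and DL coincide.
    print(reml_tau2([0.1, 0.3, 0.5], [0.1, 0.1, 0.1]))
```

When all standard errors are equal, the REML solution is the sample variance of the effects minus the common within-study variance, which gives a convenient sanity check for any implementation.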

When should you use each? For meta-analyses with 20 or more reasonably homogeneous studies, DL and REML typically produce very similar τ² estimates, and the choice is unlikely to affect conclusions. For smaller meta-analyses, which represent the majority of published systematic reviews, REML is the safer choice because it reduces the risk of underestimating between-study variance. When you select REML in this calculator, the DL estimate is displayed alongside for comparison: if the two values are similar, you can be more confident in the result; if they diverge substantially, it signals that the DL estimate may be biased and REML should be preferred. In either case, always report which estimator was used, as recommended by PRISMA 2020 (Page et al., 2021) and the Cochrane Handbook (Higgins et al., 2023).

Frequently Asked Questions

What is I² (I-squared)?

I² describes the percentage of variability in effect estimates that is due to heterogeneity rather than sampling error. An I² of 0% means all variability is due to chance; 100% means all variability reflects true differences between studies. It was proposed by Higgins & Thompson (2002).

What is τ² (tau-squared)?

τ² is the between-study variance in a random-effects meta-analysis. Unlike I², τ² is on the scale of the effect size, making it useful for calculating prediction intervals. It's estimated using methods like DerSimonian-Laird or REML.

When should I use a random-effects model?

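Use random-effects when you expect genuine variation between studies (different populations, interventions, settings). Base the choice on that expectation rather than on the Q test alone, because tests for heterogeneity have low power when there are few studies. A non-zero I² or a significant Q test is further support for a random-effects model over fixed-effect.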

How many studies do I need for a meta-analysis?

Technically, you can pool 2+ studies. However, with fewer than 5 studies, estimates of between-study variance (τ²) are imprecise, and tests for heterogeneity have low power. Most methodologists recommend at least 5-10 studies for reliable random-effects results.

What does I² = 75% mean in a meta-analysis?

An I² of 75% means that 75% of the total variability in effect estimates is due to true differences between studies (heterogeneity) rather than sampling error. The Cochrane Handbook suggests: I² = 0–40% might not be important, 30–60% may represent moderate heterogeneity, 50–90% may represent substantial heterogeneity, and 75–100% represents considerable heterogeneity. Always interpret I² alongside τ² and the prediction interval.

What is the difference between I² and τ²?

I² is a percentage describing the proportion of variability due to heterogeneity (relative measure). τ² (tau-squared) is the absolute between-study variance in the true effect sizes, expressed on the effect size scale. I² depends on study precision and sample size; it increases mechanically as studies get larger. τ² is scale-dependent but precision-independent, making it more appropriate for comparing heterogeneity across meta-analyses.

When should I use a random-effects model instead of fixed-effect?

Use a random-effects model when you expect true effect sizes to vary across studies due to differences in populations, interventions, or settings, which is almost always the case in systematic reviews. The Cochrane Handbook recommends random-effects (e.g., DerSimonian-Laird) as the default unless there is strong reason to believe all studies estimate the same true effect. Fixed-effect models are appropriate when studies are very similar in design and population.

What is the REML estimator and when should I use it?

REML (Restricted Maximum Likelihood) is an iterative, likelihood-based estimator for τ² that accounts for the degrees of freedom lost when estimating the overall mean. It produces less biased estimates than DerSimonian-Laird, especially when the number of studies is small (fewer than 15–20). This calculator implements REML via Fisher scoring with up to 100 iterations and a convergence threshold of 1×10⁻⁸. REML is now recommended as the default by the Cochrane Statistical Methods Group and the metafor R package.

Can I import studies from other Research Gold tools?

Yes. If you have already computed effect sizes using the Effect Size Calculator or prepared data in the Forest Plot Generator, the pipeline banner at the top of the tool lets you import those studies directly with no re-entry needed. You can also drag-drop a CSV or Excel file, or paste spreadsheet data. All imported data can be edited in place before running the analysis.

Related Research Tools

Ready to visualize your pooled results? Our meta-analysis forest plot tool renders publication-ready forest plots with subgroup diamonds and prediction intervals. Before pooling, determine whether your study has adequate statistical power with the power analysis calculator, which estimates the probability of detecting a true effect at various sample sizes. To compute the standardized mean differences, odds ratios, or correlations you need as inputs, use our effect size calculator with built-in confidence intervals and variance estimates. When you observe substantial heterogeneity, identify which studies drive it using our GOSH plot generator, which computes the pooled estimate for every possible subset of studies.

Reviewed by

Dr. Sarah Mitchell

PhD, Biostatistics & Research Methodology

Dr. Sarah Mitchell holds a PhD in Biostatistics from Johns Hopkins Bloomberg School of Public Health and has over 15 years of experience in systematic review methodology and meta-analysis. She has authored or co-authored 40+ peer-reviewed publications in journals including the Journal of Clinical Epidemiology, BMC Medical Research Methodology, and Research Synthesis Methods. A former Cochrane Review Group statistician and current editorial board member of Systematic Reviews, Dr. Mitchell has supervised 200+ evidence synthesis projects across clinical medicine, public health, and social sciences. She reviews all Research Gold tools to ensure statistical accuracy and compliance with Cochrane Handbook and PRISMA 2020 standards.
