Effect size calculation for meta-analysis is the process of computing a standardized, quantitative measure of the magnitude of a treatment effect, exposure association, or group difference from individual study data so that results can be pooled across studies. Meta-analysis requires effect sizes as its fundamental input; without them, there is nothing to synthesize. In our meta-analyses, effect size extraction is where most data errors originate, and standardizing your extraction process with a structured spreadsheet eliminates approximately 80% of these errors.
An effect size is a standardized, quantitative measure of the magnitude of a treatment effect, exposure association, or group difference. In meta-analysis, effect sizes from individual studies are pooled to produce a combined estimate. The three main families are: standardized mean differences (Cohen's d, Hedges' g), ratio measures (odds ratio, risk ratio, hazard ratio), and correlation coefficients (Pearson's r).
This guide covers every step of effect size calculation, from selecting the right measure for your outcome type, through formulas for computing each effect size family, to converting between measures when studies report results on different scales. Whether you are working with continuous outcomes, binary events, or survival data, you will find the formulas, tables, and practical guidance you need. For a broader overview of the full synthesis process, see our complete meta-analysis guide.
What Is an Effect Size in Meta-Analysis?
An effect size quantifies how large a treatment effect, group difference, or association is, independent of sample size. Unlike a p-value, which tells you whether a result is statistically significant, an effect size tells you whether it matters practically. Statistical significance can be achieved with a trivially small effect in a large enough sample, while a clinically meaningful effect may fail to reach significance in a small study.
Meta-analysis requires effect size calculation because pooling results across studies demands a common metric. Individual studies may report means and standard deviations, 2x2 tables, hazard ratios, or correlation coefficients. Before these can be combined, they must be translated into comparable effect sizes with accompanying variance or standard error estimates for weighting.
The distinction matters: a p-value of 0.03 from a study of 5,000 participants may reflect a negligible effect, while a p-value of 0.08 from a study of 40 participants may reflect a large one. Effect sizes strip away sample-size dependence and focus on what researchers actually care about: the magnitude and direction of the effect. Every forest plot in a meta-analysis visualizes these pooled effect sizes and their confidence intervals, making effect size calculation the foundation of evidence synthesis.
Types of Effect Size Measures
The correct effect size type depends on your outcome data and study design. There are three main families, each suited to different research questions. Choosing the wrong measure introduces bias or makes your results uninterpretable. The table below maps outcome types to the appropriate effect size measure.
| Outcome Type | Study Design | Recommended Effect Size | Example |
|---|---|---|---|
| Continuous (same scale) | randomized controlled trial or cohort | Mean Difference (MD) | Blood pressure in mmHg |
| Continuous (different scales) | randomized controlled trial or cohort | Standardized Mean Difference (Hedges' g) | Depression measured by BDI vs. PHQ-9 |
| Binary (events) | randomized controlled trial | Risk Ratio (RR) or Risk Difference (RD) | Infection yes/no |
| Binary (case-control) | Case-control | Odds Ratio (OR) | Disease exposure status |
| Time-to-event | Survival analysis | Hazard Ratio (HR) | Time to death or disease progression |
Standardized Mean Differences, Cohen's d and Hedges' g
Cohen's d is the most widely known standardized mean difference. It expresses the difference between two group means in standard deviation units, using the pooled within-group standard deviation as the denominator: d = (M1 − M2) / SD_pooled. When studies measure the same construct on different scales, for example, depression severity using the Beck Depression Inventory and the Patient Health Questionnaire, the standardized mean difference allows direct comparison.
Hedges' g corrects for the upward bias inherent in Cohen's d when sample sizes are small. Hedges' g applies a correction factor J = 1 − 3/(4df − 1) to adjust Cohen's d for small-sample bias, which is particularly important when studies have fewer than 20 participants per group (Hedges, 1981). For meta-analysis, Hedges' g is the preferred standardized mean difference because most systematic reviews include at least some small studies.
Ratio Measures, Odds Ratio, Risk Ratio, Risk Difference
Odds ratios compare the odds of an event in one group to the odds in another. An odds ratio is a ratio effect size measure computed from a 2x2 contingency table: OR = (a/b) / (c/d), where a and b are the event and non-event counts in the first group, and c and d are the corresponding counts in the second. Odds ratios are the standard effect size for case-control studies and are widely used in clinical trials.
Risk ratios (relative risk) compare the probability of an event: RR = (a/(a+b)) / (c/(c+d)). Risk ratios are more intuitive than odds ratios for prospective studies. When event rates are low (under 10%), OR and RR approximate each other closely. When event rates are high, they diverge substantially, and the choice between them matters.
Risk difference measures the absolute difference in event probability between groups: RD = (a/(a+b)) − (c/(c+d)). Risk difference conveys clinical impact directly: a risk difference of 0.05 means 5 additional events per 100 people treated.
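All three ratio-family measures come from the same 2x2 table, so they are convenient to compute together. The sketch below is a minimal Python illustration (the function name `two_by_two_effects` is our own, not from any particular library); it also returns the standard errors of the log-transformed OR and RR, since ratio measures are pooled on the log scale.

```python
import math

def two_by_two_effects(a, b, c, d):
    """Compute OR, RR, and RD from a 2x2 table.

    a, b: events / non-events in the treatment (or exposed) group
    c, d: events / non-events in the control (or unexposed) group
    """
    odds_ratio = (a / b) / (c / d)
    risk_treat = a / (a + b)
    risk_ctrl = c / (c + d)
    risk_ratio = risk_treat / risk_ctrl
    risk_diff = risk_treat - risk_ctrl
    # Ratio measures are pooled on the log scale; these are the
    # standard large-sample formulas for the SE of ln(OR) and ln(RR).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return odds_ratio, risk_ratio, risk_diff, se_log_or, se_log_rr

# 10/100 events under treatment vs. 20/100 under control:
or_, rr, rd, se_or, se_rr = two_by_two_effects(10, 90, 20, 80)
# OR ≈ 0.44, RR = 0.50, RD = -0.10
```

Note how OR (0.44) and RR (0.50) already diverge at these 10-20% event rates, illustrating why the choice between them matters outside the rare-event setting.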
Correlation-Based, Pearson's r
Pearson's r measures the linear association between two continuous variables, ranging from −1 to +1. It is the natural effect size for studies examining relationships, for example, the correlation between physical activity and cardiovascular risk. For meta-analysis, r is typically transformed using Fisher's z-transformation before pooling and back-transformed for interpretation.
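The Fisher transformation and its back-transformation are short enough to sketch directly. A minimal Python version (function names are ours for illustration), using the identity that Fisher's z is the inverse hyperbolic tangent of r:

```python
import math

def fisher_z(r):
    """Fisher's z-transformation: z = 0.5 * ln((1 + r) / (1 - r)) = atanh(r)."""
    return math.atanh(r)

def fisher_z_variance(n):
    """Variance of z depends only on sample size: 1 / (n - 3)."""
    return 1 / (n - 3)

def inverse_fisher_z(z):
    """Back-transform a pooled z to the correlation scale: r = tanh(z)."""
    return math.tanh(z)

# A study reporting r = 0.50 with n = 103 contributes
# z ≈ 0.549 with variance 1/100 = 0.01 to the pooled analysis.
z = fisher_z(0.50)
v = fisher_z_variance(103)
```

Pooling is done on the z scale because z is approximately normally distributed with a variance that does not depend on r itself; the pooled z is then back-transformed with `inverse_fisher_z` for interpretation.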
Hazard Ratios (Time-to-Event Data)
Hazard ratios are the standard effect size for survival analysis and time-to-event outcomes. An HR of 0.75 means the treatment group has a 25% lower instantaneous rate of the event at any given time. When primary studies report Kaplan-Meier curves without explicit HRs, you can digitize survival curves to extract the data needed for estimation.
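When a study reports an HR with its 95% confidence interval but no standard error, the SE of the log hazard ratio can be recovered from the CI width, which is what pooling requires. A minimal sketch (the function name is ours for illustration), assuming a symmetric CI on the log scale:

```python
import math

def log_hr_and_se(hr, ci_lower, ci_upper, z=1.96):
    """Recover ln(HR) and its standard error from a reported HR and 95% CI.

    Assumes the CI is symmetric on the log scale, so
    SE(ln HR) = (ln(upper) - ln(lower)) / (2 * z).
    """
    log_hr = math.log(hr)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)
    return log_hr, se

# A study reporting HR = 0.75 (95% CI 0.60 to 0.94):
log_hr, se = log_hr_and_se(0.75, 0.60, 0.94)
```

As with the other ratio measures, hazard ratios are pooled on the log scale and exponentiated afterward for reporting.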
How to Calculate Effect Size for Meta-Analysis: Cohen's d and Hedges' g
Cohen's d calculation starts with the difference between group means divided by the pooled standard deviation. The formula requires three inputs from each study: mean, standard deviation, and sample size for both groups.
Cohen's d formula: d = (M1 − M2) / SD_pooled, where SD_pooled = sqrt[((n1−1)SD1² + (n2−1)SD2²) / (n1 + n2 − 2)].
The variance of d is: V_d = (n1+n2)/(n1×n2) + d²/(2(n1+n2)).
Hedges' g correction: g = d × J, where J = 1 − 3/(4(n1+n2−2) − 1). This small sample bias correction adjusts for the tendency of Cohen's d to overestimate effects in small samples. The correction is negligible when total sample size exceeds 40, but critical for smaller studies.
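The three formulas above can be chained in a few lines. A minimal Python sketch (the function name `hedges_g` is ours, not from a specific package) that returns d, g, and the variance of g, which is what the meta-analysis weights use:

```python
import math

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d with pooled SD, its variance, and Hedges' g."""
    # Pooled within-group standard deviation
    sd_pooled = math.sqrt(
        ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    )
    d = (m1 - m2) / sd_pooled
    # Variance of d: V_d = (n1+n2)/(n1*n2) + d^2 / (2(n1+n2))
    var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    # Small-sample correction factor J = 1 - 3 / (4(n1+n2-2) - 1)
    j = 1 - 3 / (4 * (n1 + n2 - 2) - 1)
    g = j * d
    var_g = j**2 * var_d
    return d, g, var_g

# Two groups of 20, means 25 vs. 20, both SDs = 5:
d, g, var_g = hedges_g(25, 5, 20, 20, 5, 20)
# d = 1.0; g ≈ 0.98, showing the modest downward correction
```

With 20 participants per group, g is about 2% smaller than d; at total samples above roughly 40 the correction becomes negligible, as noted above.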
Effect size interpretation follows the conventions established by Cohen (1988): 0.2 = small, 0.5 = medium, 0.8 = large. These benchmarks are widely cited but should be interpreted in context: a "small" effect may be clinically significant if the outcome is mortality, while a "large" effect may be trivial if the outcome measure is unreliable.
Use our free interactive effect size calculator to compute Cohen's d and Hedges' g from means, SDs, and sample sizes without manual formula entry.
Need help calculating and pooling effect sizes for your meta-analysis? Our biostatisticians handle everything from data extraction to forest plot generation and sensitivity analyses. Get a personalized evidence synthesis quote, or see our meta-analysis services.