The type of effect size displayed on a forest plot depends on the outcome being measured and the study designs included. Choosing the correct metric and scale is essential for accurate forest plot interpretation.
Mean Difference (MD) and Standardized Mean Difference (SMD)
When all studies measure outcomes on the same scale (e.g., blood pressure in mmHg), the forest plot displays mean differences with the line of no effect at zero. When studies use different scales to measure the same construct (e.g., pain measured by VAS and NRS), the standardized mean difference is used, typically reported as Cohen's d or Hedges' g. Hedges' g applies a correction for small-sample bias, making it the preferred choice for meta-analyses that include small studies (fewer than about 20 participants per group). Before creating a forest plot, calculate your effect sizes with our free effect size calculator.
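As a rough illustration of the small-sample correction described above, the sketch below computes Cohen's d and Hedges' g for two independent groups. It assumes the common approximate correction factor J = 1 - 3/(4·df - 1); the function name `hedges_g` and the pain scores are hypothetical and chosen only for illustration.

```python
import math

def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (cohen_d, hedges_g) for two independent groups."""
    df = n_t + n_c - 2
    # Pooled standard deviation across both groups
    sd_pooled = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / sd_pooled
    # Approximate small-sample correction factor (an assumption of this sketch)
    j = 1 - 3 / (4 * df - 1)
    return d, j * d

# Hypothetical pain scores with 12 participants per group
d, g = hedges_g(mean_t=4.1, sd_t=1.8, n_t=12, mean_c=5.6, sd_c=2.0, n_c=12)
print(f"Cohen's d = {d:.3f}, Hedges' g = {g:.3f}")
```

The correction always shrinks the estimate toward zero, and the shrinkage becomes negligible as group sizes grow.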
Odds Ratio (OR) and Risk Ratio (RR)
For dichotomous outcomes (event occurred or did not), forest plots display odds ratios or risk ratios. An odds ratio compares the odds of an event in the treatment group versus the control group. A risk ratio compares the probability of an event between the same two groups. Both use a line of no effect at 1, and values are plotted on a logarithmic scale to ensure symmetry: an OR of 0.5 (halved odds) and an OR of 2.0 (doubled odds) appear equidistant from 1.
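The difference between the two ratio measures, and the symmetry that the log scale provides, can be sketched from a single 2x2 table. The helper `ratio_measures` and the trial counts are hypothetical, and the sketch does not handle zero cells, which would need a continuity correction in practice.

```python
import math

def ratio_measures(events_t, total_t, events_c, total_c):
    """Return (odds_ratio, risk_ratio) from event counts and group sizes."""
    risk_t = events_t / total_t
    risk_c = events_c / total_c
    odds_t = events_t / (total_t - events_t)
    odds_c = events_c / (total_c - events_c)
    return odds_t / odds_c, risk_t / risk_c

# Hypothetical trial: 15/100 events with treatment, 30/100 with control
odds_ratio, risk_ratio = ratio_measures(15, 100, 30, 100)
print(f"OR = {odds_ratio:.3f}, RR = {risk_ratio:.3f}")

# On the log scale, halving and doubling sit the same distance from log(1) = 0
print(f"log(0.5) = {math.log(0.5):.3f}, log(2.0) = {math.log(2.0):.3f}")
```

Note that the odds ratio and risk ratio differ for the same data; the gap widens as the event becomes more common.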
Hazard Ratio (HR)
For time-to-event outcomes (survival analysis), forest plots display hazard ratios. Like OR and RR, the line of no effect is at 1 and values are plotted on a log scale. A hazard ratio below 1 typically indicates a protective effect (slower event occurrence in the treatment group).
| Measure | Outcome Type | Line of No Effect | Scale | Common Use |
|---|---|---|---|---|
| Mean Difference (MD) | Continuous (same scale) | 0 | Linear | Blood pressure, weight |
| Standardized Mean Difference (SMD) | Continuous (different scales) | 0 | Linear | Pain, depression scores |
| Odds Ratio (OR) | Dichotomous | 1 | Log | Case-control studies |
| Risk Ratio (RR) | Dichotomous | 1 | Log | RCTs, cohort studies |
| Hazard Ratio (HR) | Time-to-event | 1 | Log | Survival analysis |
Understanding Heterogeneity on a Forest Plot
Heterogeneity on a forest plot reveals whether the included studies are measuring the same underlying effect or producing genuinely different results. Visual and statistical cues work together to quantify this variation and determine whether the pooled effect size is a meaningful summary or an oversimplification.
Visual cues provide the first indication. When confidence intervals overlap substantially and most squares cluster near the diamond, heterogeneity is low. When intervals are scattered across the plot with minimal overlap, or when some studies show strong positive effects while others show negative effects, heterogeneity is high. In our meta-analysis work, the most common misinterpretation we encounter is researchers focusing on the diamond while ignoring I² values above 75%.
I-squared interpretation follows commonly cited thresholds for heterogeneity assessment:
| I² Value | Classification | Interpretation |
|---|---|---|
| 0-25% | Low | Studies are consistent; pooled estimate is reliable |
| 25-50% | Moderate | Some variation; investigate but pooled estimate is usually acceptable |
| 50-75% | Substantial | Meaningful variation; explore with subgroup analysis |
| >75% | Considerable | Studies disagree; pooled estimate should be interpreted with caution |
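These bands (low below 25%, moderate 25-50%, substantial 50-75%, considerable above 75%) are rough guides rather than strict cut-offs. The Cochrane Handbook itself gives overlapping ranges (0-40% might not be important, 30-60% moderate, 50-90% substantial, 75-100% considerable) and notes that the importance of a given I² also depends on the size and direction of the effects (Higgins et al., 2023).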
When heterogeneity is substantial or considerable, subgroup analysis and meta-regression can explore potential sources. Common sources include differences in study populations, interventions, outcome measurement, and follow-up duration. The Q-test provides a p-value for heterogeneity, but it has low statistical power with few studies and excessive power with many studies, so I² is generally more informative. Tau-squared quantifies the absolute amount of between-study variance in a random-effects model. For a deeper discussion, see our guide on understanding heterogeneity in meta-analysis.
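To make the relationship between Q, I², and tau-squared concrete, here is a minimal sketch that computes all three from study effect sizes and their variances. It assumes inverse-variance weighting and the DerSimonian-Laird moment estimator for tau-squared; other estimators (such as REML) give different values. The function name `heterogeneity` and the input data are hypothetical.

```python
def heterogeneity(effects, variances):
    """Return (Q, I2 in percent, tau2) using the DerSimonian-Laird estimator."""
    weights = [1 / v for v in variances]  # inverse-variance weights
    sum_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I2: share of total variation attributed to between-study differences
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # tau2: absolute between-study variance, truncated at zero
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    return q, i2, tau2

# Hypothetical log odds ratios and their variances from four studies
log_or = [-0.60, -0.10, -0.45, 0.20]
var = [0.04, 0.02, 0.09, 0.05]
q, i2, tau2 = heterogeneity(log_or, var)
print(f"Q = {q:.2f}, I2 = {i2:.1f}%, tau2 = {tau2:.3f}")
```

I² is a proportion and tau-squared is on the scale of the effect measure, which is why the two can tell different stories about the same data.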
In a random-effects meta-analysis, study weights are more evenly distributed than in a fixed-effect model, because the random-effects model accounts for both within-study and between-study variance (DerSimonian & Laird, 1986). This distinction matters because the model choice affects both the width of the diamond and the relative influence of each study.
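The effect of model choice on study weights and on the width of the diamond can be sketched as follows. The between-study variance `tau2 = 0.10` and the study variances are assumed values for illustration; in a real analysis tau-squared would be estimated from the data.

```python
import math

def weights_and_se(variances, tau2=0.0):
    """Return (percentage weights, standard error of the pooled estimate).

    tau2 = 0 reproduces fixed-effect inverse-variance weights; tau2 > 0 adds
    the between-study variance to every study, as in a random-effects model.
    """
    raw = [1 / (v + tau2) for v in variances]
    total = sum(raw)
    return [100 * w / total for w in raw], math.sqrt(1 / total)

# Hypothetical within-study variances: one large study and two smaller ones
variances = [0.01, 0.04, 0.25]
fixed_w, fixed_se = weights_and_se(variances)
random_w, random_se = weights_and_se(variances, tau2=0.10)

print("Fixed-effect weights (%): ", [round(w, 1) for w in fixed_w])
print("Random-effects weights (%):", [round(w, 1) for w in random_w])
print(f"Pooled SE: fixed = {fixed_se:.3f}, random = {random_se:.3f}")
```

Adding the same tau-squared to every study's variance pulls the weights closer together and increases the pooled standard error, which is what widens the diamond.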
Five recurring errors compromise forest plot interpretation across disciplines. Recognizing these pitfalls prevents flawed conclusions that can misguide clinical practice and research direction.
Confusing statistical significance with clinical significance. A diamond that does not cross the line of no effect indicates statistical significance, but the magnitude of the effect may be too small to matter clinically. A pooled standardized mean difference of 0.1 might be statistically significant with thousands of participants yet meaningless in practice. Always evaluate whether the effect size is clinically relevant, not just statistically distinguishable from zero.
Ignoring heterogeneity when the diamond looks favorable. A well-positioned diamond can create false confidence. If I² is 80%, the studies fundamentally disagree about the magnitude or even the direction of the effect. The diamond in that scenario is an average of conflicting findings, not a robust summary. Always check I-squared before drawing conclusions from the pooled effect size.
Over-interpreting results from few studies. A meta-analysis with two or three studies produces a forest plot that looks like any other. However, the pooled estimate from two studies is highly sensitive to each individual result, the heterogeneity statistics are unreliable, and the confidence interval may be misleadingly narrow if both studies happen to agree. Sensitivity analysis tests the robustness of the result by removing one study at a time; this is critical when the forest plot includes few studies.
Mistaking fixed-effect for random-effects forest plots. The same data can produce different diamonds depending on the model. A fixed-effect model assumes one true underlying effect, producing a narrower diamond driven by the largest studies. A random-effects model accounts for between-study heterogeneity, producing a wider diamond with more evenly distributed weights. Check which model was used before interpreting the result. Most systematic reviews use random-effects models, but not all.
Reading odds ratios or risk ratios on a linear scale instead of a log scale. Ratio measures (OR, RR, HR) should be plotted on a logarithmic scale. On a log scale, an OR of 0.5 and an OR of 2.0 are equidistant from 1. On a linear scale, they are not, creating a visual distortion that makes effects appear asymmetric. Properly constructed forest plots use log scales for ratio measures, but always verify this when reading unfamiliar plots.
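The leave-one-out sensitivity analysis mentioned above for meta-analyses with few studies can be sketched in a few lines. This version assumes fixed-effect inverse-variance pooling for simplicity; the function names and the data are hypothetical.

```python
def pooled_fixed(effects, variances):
    """Inverse-variance (fixed-effect) pooled estimate."""
    weights = [1 / v for v in variances]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)

def leave_one_out(effects, variances):
    """Pooled estimate with each study removed in turn."""
    results = []
    for i in range(len(effects)):
        rest_y = effects[:i] + effects[i + 1:]
        rest_v = variances[:i] + variances[i + 1:]
        results.append(pooled_fixed(rest_y, rest_v))
    return results

# Hypothetical standardized mean differences from three studies
smd = [-0.60, -0.10, -0.45]
var = [0.04, 0.02, 0.09]
print(f"All studies: {pooled_fixed(smd, var):.3f}")
for i, estimate in enumerate(leave_one_out(smd, var), start=1):
    print(f"Without study {i}: {estimate:.3f}")
```

If dropping a single study moves the pooled estimate substantially or flips its direction, the overall result rests heavily on that study.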
Several software options produce publication-ready forest plots, ranging from free tools requiring no installation to statistical packages used by biostatisticians worldwide. Your choice depends on your technical comfort level and the complexity of your analysis.
R (metafor and meta packages): R is the most flexible platform for forest plot generation. The metafor package by Viechtbauer (2010) provides full control over every visual element, from label formatting to color coding by subgroup. The meta package by Balduzzi et al. offers a simpler interface with sensible defaults. Both packages handle effect size visualization for all common measures (MD, SMD, OR, RR, HR) and support subgroup, cumulative, and leave-one-out forest plots.
Stata (metan command): Stata's metan and admetan commands generate forest plots with options for random-effects and fixed-effect models, subgroup analysis, and prediction intervals. Stata is widely used in epidemiology and health services research, and its forest plot output is accepted by most journals without modification.
RevMan (Cochrane's tool): Review Manager is Cochrane's free software for conducting systematic reviews. It produces standardized forest plots that match Cochrane Handbook formatting requirements. RevMan is the required tool for Cochrane reviews and provides a guided interface that does not require programming knowledge.
Our free online forest plot generator: Create your own publication-ready forest plot with our free forest plot generator. Enter your study data, select the effect size measure, and download a high-resolution figure suitable for journal submission. No software installation, coding, or subscription is required.
For researchers who want full control over their analysis, our complete meta-analysis guide covers the end-to-end process from data extraction through forest plot generation and interpretation. When your forest plot is ready, assess the certainty of the evidence using the GRADE summary of findings framework to contextualize your results within a broader evidence evaluation.
For step-by-step instructions on building your own forest plot with subgroups and cumulative analysis, see our forest plot creation guide.
A specialized variant, the cumulative meta-analysis forest plot, shows how the pooled estimate changes as each study is added chronologically.
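A minimal sketch of the calculation behind a cumulative forest plot is shown below: studies are sorted by publication year and the pooled estimate is recomputed after each one is added. It assumes fixed-effect inverse-variance pooling; the function name `cumulative_pooled` and the study data are hypothetical.

```python
def cumulative_pooled(studies):
    """studies: list of (year, effect, variance) tuples.

    Returns [(year, pooled estimate so far)] in chronological order.
    """
    ordered = sorted(studies, key=lambda s: s[0])
    sum_w = 0.0
    sum_wy = 0.0
    out = []
    for year, effect, variance in ordered:
        w = 1 / variance  # inverse-variance weight
        sum_w += w
        sum_wy += w * effect
        out.append((year, sum_wy / sum_w))
    return out

# Hypothetical log risk ratios from four trials
studies = [(2012, -0.15, 0.03), (2005, -0.50, 0.10),
           (2018, -0.22, 0.01), (2009, -0.30, 0.06)]
for year, estimate in cumulative_pooled(studies):
    print(f"Through {year}: pooled estimate = {estimate:.3f}")
```

Each row of a cumulative forest plot corresponds to one of these running estimates, so the plot shows when the evidence stabilized.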