Meta-analysis in R gives researchers access to the most powerful, flexible, and freely available tools for synthesizing quantitative evidence. The metafor package, created by Wolfgang Viechtbauer, is the gold standard for meta-analysis in R, cited in over 15,000 peer-reviewed publications and supporting more than 40 different effect size measures.

This guide walks through every step of conducting a meta-analysis in R using metafor, from installing packages and preparing data to generating forest plots, assessing heterogeneity, testing for publication bias, and running sensitivity analyses. Whether you are running your first meta-analysis or transitioning from other software, this tutorial covers the practical workflow with real code examples.

Setting Up R for Meta-Analysis

Before running any analysis, install the required packages. Open R or RStudio and run:

Install metafor (the core package): install.packages("metafor"). This installs metafor along with its dependencies. Load it with library(metafor).

Install meta (optional, for convenience functions): install.packages("meta"). The meta package provides higher-level wrapper functions like metabin() for binary outcomes and metacont() for continuous outcomes that some researchers find more intuitive.

Install dmetar (optional, for advanced diagnostics): This package by Mathias Harrer provides additional functions for outlier detection, influence analysis, and power calculations. Install from GitHub using devtools::install_github("MathiasHarrer/dmetar").

For forest plot customization, the base metafor forest() function is sufficient for most publications. If you need advanced visualizations, the ggplot2 and forestplot packages offer additional options.

Preparing Your Data for Meta-Analysis

Data preparation is the most critical and time-consuming step. Each included study must contribute an effect size on a common scale and its sampling variance (or standard error) for the analysis.

Common effect size types: the (log) odds ratio (measure="OR") or risk ratio (measure="RR") for binary outcomes, the standardized mean difference (measure="SMD", which applies the Hedges' g small-sample correction by default) for continuous outcomes, and Fisher's r-to-z transformed correlation (measure="ZCOR") for correlational data.

The escalc() function in metafor calculates effect sizes and their sampling variances from summary statistics. Note that ratio measures are returned on the log scale (measure="OR" yields log odds ratios). For a two-group comparison with binary outcomes:

dat <- escalc(measure="OR", ai=events_treat, bi=nonevents_treat, ci=events_control, di=nonevents_control, data=mydata)

For continuous outcomes using means and standard deviations:

dat <- escalc(measure="SMD", m1i=mean_treat, sd1i=sd_treat, n1i=n_treat, m2i=mean_control, sd2i=sd_control, n2i=n_control, data=mydata)

This eliminates manual calculation errors and automatically computes the sampling variance (vi) needed for the model.
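As a concrete, self-contained illustration, the following sketch uses dat.bcg, the BCG vaccine dataset that ships with metafor (13 trials with 2x2 table counts in the columns tpos, tneg, cpos, cneg):

```r
library(metafor)

# dat.bcg ships with metafor: 13 trials of the BCG vaccine against tuberculosis
data("dat.bcg", package = "metafor")

# Log risk ratios; ai/bi are treatment events/non-events, ci/di the control counts
dat <- escalc(measure = "RR",
              ai = tpos, bi = tneg, ci = cpos, di = cneg,
              data = dat.bcg)

# yi is the effect size (log RR), vi its sampling variance
head(dat[, c("trial", "yi", "vi")])
```

The returned data frame keeps the original columns and appends yi and vi, which feed directly into rma().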

Running the Meta-Analysis Model

The core function in metafor is rma(), which fits meta-analytic models:

Random-effects model (most common): res <- rma(yi, vi, data=dat, method="REML"). The REML (restricted maximum likelihood) estimator is the default and recommended method for estimating between-study variance (tau-squared). Alternatives include DerSimonian-Laird (method="DL"), which is simpler but can underestimate tau-squared with few studies.

Fixed-effect model: res <- rma(yi, vi, data=dat, method="FE"). Use this only when you believe all studies estimate the same underlying true effect, which is rarely justified in practice.

Interpreting the output: The summary() or print() of the rma object reports the pooled effect estimate, its standard error, z-value, p-value, and 95% confidence interval. It also reports tau-squared (between-study variance), I-squared (percentage of total variability due to heterogeneity), and the Q-test for heterogeneity.
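Putting the fit and its interpretation together, here is a minimal self-contained sketch using metafor's bundled dat.bcg dataset:

```r
library(metafor)

# Log risk ratios from the bundled BCG vaccine dataset
dat <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg,
              data = dat.bcg)

res <- rma(yi, vi, data = dat, method = "REML")
summary(res)   # pooled log risk ratio, tau^2, I^2, Q-test

# predict() back-transforms to the risk-ratio scale and reports the
# prediction interval alongside the confidence interval
predict(res, transf = exp, digits = 2)
```

Because the model is fit on log risk ratios, the transf = exp step is what makes the output readable as a risk ratio.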

Need help choosing the right model or interpreting your results? Our biostatisticians conduct meta-analyses in R every week, from data extraction through publication-ready output. Get a free quote and let us handle the statistical analysis while you focus on the clinical interpretation, or explore our full meta-analysis service.

Creating Forest Plots in R

The forest plot is the signature visualization of a meta-analysis. In metafor, generate one with:

forest(res, slab=dat$study_label, header=TRUE)

Key elements of a forest plot: each study appears as a square (sized by weight) with a horizontal line (95% confidence interval). The diamond at the bottom represents the pooled effect estimate with its confidence interval.

Customization options for publication-ready plots: slab for study labels, xlim and alim to control plot and axis limits, atransf=exp to display ratio measures on their natural scale, ilab to add columns of study-level data, cex to adjust text size, and mlab to label the summary diamond.

Always include both the confidence interval (precision of the average effect) and the prediction interval (expected range of effects across settings; add it with addpred=TRUE in forest()) in random-effects models.
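For instance, a publication-style forest plot for a log risk-ratio model (sketched with metafor's bundled dat.bcg data; all arguments shown are standard forest.rma options):

```r
library(metafor)

dat <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg,
              data = dat.bcg)
res <- rma(yi, vi, data = dat, method = "REML")

forest(res,
       slab    = paste(dat.bcg$author, dat.bcg$year),  # study labels
       atransf = exp,   # show the axis on the risk-ratio scale
       addpred = TRUE,  # append the prediction interval to the summary diamond
       header  = TRUE,  # column headers for studies and estimates
       mlab    = "Random-effects model (REML)")
```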

Assessing and Exploring Heterogeneity

Heterogeneity measures how much the true effect varies across studies. Key statistics: tau-squared (the estimated between-study variance, on the scale of the effect size), I-squared (the percentage of total variability due to heterogeneity rather than sampling error), the Q-test (a significance test for heterogeneity, which has low power with few studies), and the prediction interval (the range in which the true effect of a comparable new study is expected to fall).

Moderator analysis (meta-regression) explores sources of heterogeneity:

res_mod <- rma(yi, vi, mods = ~ year + risk_of_bias, data=dat)

This tests whether study-level covariates explain some of the between-study variance. For subgroup analysis, use categorical moderators:

res_mod <- rma(yi, vi, mods = ~ factor(subgroup), data=dat)

With few studies (fewer than 10 per covariate), meta-regression has low power and results should be interpreted cautiously.
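A sketch of both forms, using absolute latitude (ablat), publication year, and allocation method from metafor's bundled dat.bcg data as moderators (the classic example from the package documentation):

```r
library(metafor)

dat <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg,
              data = dat.bcg)

# Meta-regression with two continuous moderators
res_mod <- rma(yi, vi, mods = ~ ablat + year, data = dat)
summary(res_mod)   # QM jointly tests the moderators; pseudo-R^2 is reported

# Subgroup-style analysis with a categorical moderator (allocation method)
res_sub <- rma(yi, vi, mods = ~ factor(alloc), data = dat)
summary(res_sub)
```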

Testing for Publication Bias

Publication bias occurs when studies with statistically significant results are more likely to be published, potentially inflating the pooled estimate. Assess it using multiple complementary methods:

Funnel plot: funnel(res) plots each study's effect size against its precision (standard error). Asymmetry suggests potential bias, with missing studies in the lower-left region indicating suppressed non-significant findings.

Egger's regression test: regtest(res) tests for funnel plot asymmetry statistically. A significant result (p < 0.10) suggests small-study effects, though this test has low power with fewer than 10 studies.

Trim-and-fill method: trimfill(res) estimates the number of "missing" studies and provides an adjusted pooled estimate. This is a sensitivity analysis, not a definitive correction.

P-curve analysis and selection models (selmodel() in metafor) are more modern approaches that model the selection process directly and are increasingly recommended in methodological literature.

Report all publication bias assessments transparently, including their limitations with small numbers of studies.
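The checks above can be combined in a few lines; this sketch again uses the bundled dat.bcg data, and the selmodel() call assumes a reasonably recent metafor version (3.0 or later):

```r
library(metafor)

dat <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg,
              data = dat.bcg)
res <- rma(yi, vi, data = dat, method = "REML")

funnel(res)     # visual check for small-study asymmetry
regtest(res)    # Egger-type regression test (default predictor: standard error)
trimfill(res)   # trim-and-fill sensitivity estimate

# One-step selection model: assumes the chance of publication changes at p = .05
sel <- selmodel(res, type = "stepfun", steps = 0.05)
summary(sel)
```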

Sensitivity Analysis and Diagnostics

Leave-one-out analysis removes each study in turn and re-fits the model: leave1out(res). This identifies influential studies that disproportionately drive the pooled result. If removing a single study changes the conclusion, flag this in your report.

Influence diagnostics: influence(res) computes Cook's distance, DFFITS, hat values, and other measures for each study. Visualize with plot(influence(res)).

Baujat plot: baujat(res) shows each study's contribution to overall heterogeneity versus its influence on the pooled result. Studies in the upper-right quadrant are both heterogeneous and influential.

Cumulative meta-analysis: cumul(res, order=dat$year) adds studies one at a time in chronological order, revealing whether the evidence has stabilized or shifted over time.

Run sensitivity analyses for key methodological decisions: restricting to low risk of bias studies, using different effect size measures, and comparing random-effects estimators (REML vs. DL vs. Paule-Mandel).
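These diagnostics chain together naturally; a self-contained sketch with the bundled dat.bcg data (note that ordering cumul() by a variable assumes metafor 3.0 or later — older versions expect index vectors via order()):

```r
library(metafor)

dat <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg,
              data = dat.bcg)
res <- rma(yi, vi, data = dat, method = "REML")

leave1out(res)                 # re-fit with each study removed in turn
inf <- influence(res)          # Cook's distance, DFFITS, hat values, ...
plot(inf)
baujat(res)                    # heterogeneity contribution vs. influence
cumul(res, order = dat$year)   # cumulative meta-analysis in year order

# Compare tau^2 estimators as a sensitivity check
rma(yi, vi, data = dat, method = "DL")   # DerSimonian-Laird
rma(yi, vi, data = dat, method = "PM")   # Paule-Mandel
```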

Common Pitfalls and Best Practices

Do not use the fixed-effect model by default. Unless you have a strong theoretical reason to assume all studies estimate identical effects, the random-effects model is almost always appropriate. The choice between models affects both the pooled estimate and its confidence interval.

Always report the prediction interval alongside the confidence interval. A meta-analysis can show a significant average effect (narrow CI excluding zero) while the prediction interval includes zero, meaning the effect may not replicate in all settings.

Do not run meta-regression with too few studies. The commonly cited rule of thumb is at least 10 studies per covariate, though even this may be insufficient. With 5 studies total and 3 covariates, the analysis will run but results are meaningless.

Do not conflate statistical significance with clinical importance. A pooled odds ratio of 1.05 (95% CI: 1.01-1.09) may be statistically significant with enough studies but clinically irrelevant. Always interpret effect sizes in context.

Use REML for tau-squared estimation. The DerSimonian-Laird estimator, still the default in some older software, can underestimate tau-squared, especially with few studies. REML is the recommended default in metafor and most methodological guidance.

Save your analysis script. One of R's greatest advantages over point-and-click software is reproducibility. Your entire analysis, from data import through final forest plot, should be captured in a single script that any reviewer can re-run.

A complete meta-analysis workflow in R follows this sequence: (1) prepare your data file with study identifiers and raw summary statistics, (2) calculate standardized effect sizes with escalc(), (3) fit the random-effects model with rma(), (4) generate the forest plot with forest(), (5) assess heterogeneity with Q, I-squared, and prediction intervals, (6) explore heterogeneity sources with moderator analysis if pre-specified, (7) test for publication bias with funnel plots, Egger's test, and trim-and-fill, (8) run leave-one-out and influence diagnostics, and (9) compile all results for your manuscript.

For your protocol, specify the planned statistical methods including the effect size measure, model type, heterogeneity estimator, and pre-specified subgroup analyses. This prevents data-driven analytical decisions that inflate false-positive rates.