Identify sources of heterogeneity in your meta-analysis using all-subsets analysis (Olkin, Dahabreh & Trikalinos, 2012) with interactive study highlighting to pinpoint influential studies.
Load sample data to see how the tool works, or clear all fields to start fresh.
Drag and drop a file to import it. Supported formats: CSV, TSV, and Excel (.xlsx/.xls), up to 500 rows.
Input the effect size and standard error for each study in your meta-analysis. You can type values manually, paste from a spreadsheet, or import a CSV or Excel file. Each row represents one study contributing to the all-subsets analysis.
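As a rough illustration of the input format described above, the sketch below parses CSV text into one record per study. It assumes the file has a header row with the column names Study, Effect Size, and Standard Error (the labels used in the tool's input grid) and applies the 500-row limit; the function name `read_studies` is hypothetical, and the tool's actual import logic may differ.

```python
import csv
import io

MAX_ROWS = 500  # row limit stated for uploads


def read_studies(csv_text):
    """Return a list of (study, effect_size, standard_error) tuples."""
    reader = csv.DictReader(io.StringIO(csv_text))
    studies = []
    for row in reader:
        if len(studies) == MAX_ROWS:
            raise ValueError(f"more than {MAX_ROWS} rows")
        effect = float(row["Effect Size"])
        se = float(row["Standard Error"])
        if se <= 0:
            # Inverse-variance weights need a positive standard error.
            raise ValueError(f"standard error must be positive: {row['Study']}")
        studies.append((row["Study"].strip(), effect, se))
    return studies


# Made-up example rows, for illustration only.
sample = "Study,Effect Size,Standard Error\nTrial A,0.10,0.05\nTrial B,0.30,0.10\n"
print(read_studies(sample))
```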
Choose between a fixed-effect or random-effects (DerSimonian-Laird) model. Set the maximum number of subsets to compute. For 12 or fewer studies, all subsets are enumerated exhaustively. For larger datasets, random sampling approximates the full pattern.
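To make the model choice concrete, here is a minimal sketch of inverse-variance pooling under both models, using the standard DerSimonian-Laird method-of-moments estimator for the between-study variance. The function name `pool` and its return shape are assumptions for illustration, not the tool's internals.

```python
def pool(effects, std_errors, model="fixed"):
    """Return (pooled_estimate, i_squared_percent, tau_squared)."""
    if model not in ("fixed", "random"):
        raise ValueError("model must be 'fixed' or 'random'")
    weights = [1.0 / se ** 2 for se in std_errors]
    total_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if df > 0 and q > df:
        i2 = 100.0 * (q - df) / q
        # DerSimonian-Laird estimate of the between-study variance.
        c = total_w - sum(w ** 2 for w in weights) / total_w
        tau2 = (q - df) / c
    else:
        # Both statistics are truncated at zero; a single study has none.
        i2, tau2 = 0.0, 0.0
    if model == "fixed":
        return fixed, i2, tau2
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    return pooled, i2, tau2


# Made-up data: two equally precise studies with different effects.
print(pool([0.0, 2.0], [1.0, 1.0], model="random"))
```

When the estimated between-study variance is zero, the two models give the same pooled estimate, so the choice only matters for subsets that show heterogeneity.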
Click Generate to compute the pooled effect and I-squared for each subset of studies. The tool renders an interactive D3.js scatter plot with the pooled estimate on the x-axis and I-squared on the y-axis, where each point represents one subset.
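The computation behind the plot can be sketched as follows: enumerate every non-empty subset, pool it, and record one (pooled estimate, I-squared) point per subset. This is a simplified fixed-effect version for small k; the names `fixed_effect` and `gosh_points` are hypothetical, and assigning I-squared = 0 to single-study subsets is a convention assumed here.

```python
from itertools import combinations


def fixed_effect(effects, std_errors):
    """Return (pooled_estimate, i_squared_percent) for one subset."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared is truncated at zero and set to zero for a single study.
    i2 = 100.0 * (q - df) / q if df > 0 and q > df else 0.0
    return pooled, i2


def gosh_points(effects, std_errors):
    """Return (subset_indices, pooled_estimate, i_squared) for all subsets."""
    k = len(effects)
    points = []
    for size in range(1, k + 1):
        for subset in combinations(range(k), size):
            ys = [effects[i] for i in subset]
            ses = [std_errors[i] for i in subset]
            pooled, i2 = fixed_effect(ys, ses)
            points.append((subset, pooled, i2))
    return points


# Made-up data: three studies give 2^3 - 1 = 7 points.
for subset, pooled, i2 in gosh_points([0.0, 2.0, 1.0], [1.0, 1.0, 1.0]):
    print(subset, round(pooled, 3), round(i2, 1))
```

Each tuple corresponds to one point on the scatter plot, with the pooled estimate as the x-coordinate and I-squared as the y-coordinate.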
Use the dropdown to select a specific study. All subsets containing that study are colored differently, revealing whether its inclusion systematically shifts the pooled estimate or increases heterogeneity compared to subsets without it.
A single compact cluster indicates stable, homogeneous results. Multiple distinct clusters or a wide vertical spread suggests that certain studies create fundamentally different heterogeneity patterns. Elongated tails indicate one or more outliers pulling subsets toward higher I-squared.
Download the GOSH plot as a high-resolution PNG or PDF for your manuscript. Copy the auto-generated methods paragraph describing your all-subsets analysis, or export the R code for the metafor gosh() function for full reproducibility.
Unlike leave-one-out analysis that removes one study at a time, the GOSH plot examines every possible combination of studies. This reveals complex interactions where two or more studies jointly drive heterogeneity in ways that single-study removal cannot detect.
The study highlighting tool colors all subsets containing a selected study. If highlighted subsets cluster at higher I-squared values or shifted effect estimates, that study is a key driver of heterogeneity and should be examined through sensitivity analysis.
When the GOSH plot shows a single tight cluster, your meta-analysis results are robust. No individual study or combination of studies substantially alters the pooled estimate or heterogeneity statistics, supporting confidence in the overall finding.
Distinct clusters on a GOSH plot indicate that subsets of studies produce fundamentally different pooled estimates or heterogeneity levels. This often reveals latent subgroup structure that may be explained by clinical, methodological, or population differences.
For meta-analyses with more than about 12 to 15 studies, exhaustive enumeration of all subsets becomes impractical because the number of subsets doubles with every added study. A random sample of about 10,000 subsets is generally enough to capture the major features of the full GOSH pattern without lengthy computation.
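One way to switch between exhaustive enumeration and random sampling is sketched below. It assumes subsets are drawn uniformly at random without repetition, which may differ from the tool's sampler; the function name `choose_subsets` and its parameters are hypothetical, with defaults taken from the 12-study cutoff and 10,000-subset sample described on this page.

```python
import random
from itertools import combinations


def choose_subsets(k, max_subsets=10_000, exhaustive_limit=12, seed=0):
    """Return subsets of range(k) as index tuples."""
    total = 2 ** k - 1  # number of non-empty subsets
    if k <= exhaustive_limit or total <= max_subsets:
        # Small enough: enumerate every non-empty subset.
        return [s for size in range(1, k + 1)
                for s in combinations(range(k), size)]
    rng = random.Random(seed)
    seen = set()
    while len(seen) < max_subsets:
        # Each integer in 1 .. 2^k - 1 encodes one non-empty subset as a bitmask.
        seen.add(rng.randrange(1, 2 ** k))
    return [tuple(i for i in range(k) if (mask >> i) & 1)
            for mask in sorted(seen)]


print(len(choose_subsets(10)))  # exhaustive
print(len(choose_subsets(20)))  # sampled
```

Fixing the seed makes the sampled plot reproducible, which matters if the figure goes into a manuscript.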
Use the GOSH plot for comprehensive exploration and leave-one-out as a targeted confirmation tool. The GOSH plot identifies which studies create heterogeneity patterns, while leave-one-out quantifies the exact change in pooled effect and I-squared when each study is removed individually.
The GOSH plot (Olkin, Dahabreh, and Trikalinos, 2012) addresses a fundamental limitation of aggregate heterogeneity measures like I-squared and Cochran's Q: they tell you heterogeneity exists but not where it comes from. By computing the pooled estimate and I-squared for every possible subset of studies (or a random sample of 10,000 subsets for larger datasets), the graphical overview of study heterogeneity reveals whether high heterogeneity is driven by one outlier, a subgroup of discordant studies, or diffuse variability across the entire evidence base. A compact cluster indicates stable results, while multiple clusters or wide spread indicates specific studies creating distinct heterogeneity patterns.
The mathematical foundation of all-subsets meta-analysis is straightforward but computationally intensive. For k studies, there are 2^k minus 1 non-empty subsets. Each subset produces a pooled effect estimate and an I-squared value, creating one point on the GOSH scatter plot. The x-axis displays the pooled effect estimate for each subset, while the y-axis displays the corresponding I-squared statistic. This two-dimensional representation allows researchers to visualize the joint distribution of effect magnitude and heterogeneity across all possible study combinations, revealing patterns invisible to single-number summaries.
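For reference, the quantities plotted for each subset can be written out using the standard inverse-variance formulas under a fixed-effect model. The notation is an assumption introduced here: $y_i$ and $s_i$ are the effect size and standard error of study $i$, and $S$ is a non-empty subset of the $k$ studies.

```latex
\[
\hat{\theta}_S = \frac{\sum_{i \in S} w_i\, y_i}{\sum_{i \in S} w_i},
\qquad w_i = \frac{1}{s_i^{2}}
\]
\[
Q_S = \sum_{i \in S} w_i \left( y_i - \hat{\theta}_S \right)^{2}
\]
\[
I^2_S = \max\!\left( 0,\; \frac{Q_S - (|S| - 1)}{Q_S} \right) \times 100\%
\]
\[
\text{number of non-empty subsets} = 2^{k} - 1
\]
```

Each subset $S$ contributes the point $(\hat{\theta}_S, I^2_S)$ to the plot. For a single-study subset, $Q_S = 0$ and $I^2_S$ is taken to be 0.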
The highlight feature is the most practical aspect of GOSH analysis for heterogeneity source detection. When you select a study, all subsets containing it are colored differently. If highlighted points cluster at higher I-squared values or shifted pooled estimates, that study is an influential driver of heterogeneity. This approach surpasses leave-one-out sensitivity analysis because it captures interactions where two or more studies jointly drive variability in ways that single removal cannot detect. Olkin et al. (2012) demonstrated that GOSH analysis can identify studies whose exclusion would reduce I-squared from above 75% to below 25%, providing clear guidance for sensitivity analysis decisions.
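The comparison behind highlighting can be sketched numerically: split the subsets by whether they contain the selected study, then compare the two groups. The points below are made-up values for illustration only, and the helper names `highlight` and `summarize` are hypothetical.

```python
from statistics import mean


def highlight(points, study_index):
    """Split GOSH points by whether the subset includes study_index."""
    with_study = [p for p in points if study_index in p[0]]
    without_study = [p for p in points if study_index not in p[0]]
    return with_study, without_study


def summarize(group):
    """Return (mean pooled estimate, mean I-squared) for a group of points."""
    return mean(p[1] for p in group), mean(p[2] for p in group)


# Hypothetical (subset, pooled_estimate, i_squared) points for three studies.
points = [
    ((0,), 0.10, 0.0), ((1,), 0.90, 0.0), ((2,), 0.20, 0.0),
    ((0, 1), 0.50, 80.0), ((0, 2), 0.15, 0.0), ((1, 2), 0.55, 75.0),
    ((0, 1, 2), 0.40, 70.0),
]

with_study, without_study = highlight(points, study_index=1)
print("with study 1:   ", summarize(with_study))
print("without study 1:", summarize(without_study))
```

In this invented example, subsets containing study 1 sit at a higher mean I-squared and a shifted mean pooled estimate, which is the pattern the highlighting is meant to expose visually.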
Interpreting I-squared subsets requires understanding what cluster patterns mean clinically and statistically. A single compact cluster centered at low I-squared values indicates that your meta-analysis is genuinely homogeneous regardless of which studies are included. A single cluster at moderate to high I-squared suggests diffuse heterogeneity without clear outliers. Two or more distinct clusters typically indicate that a subset of studies represents a different population, intervention dose, or methodological approach. In such cases, formal subgroup analysis or meta-regression is warranted to test whether the cluster structure corresponds to a pre-specified moderator variable.
Outlier detection through GOSH analysis is more nuanced than simple statistical tests. A study identified as influential on the GOSH plot should not be automatically excluded. Instead, examine whether the study differs from others in population characteristics, intervention details, outcome definitions, or risk of bias ratings. If the influential study represents a legitimate subpopulation or methodological variant, its removal would reduce generalizability. The Cochrane Handbook recommends reporting results both with and without influential studies, noting the clinical rationale for any exclusion decisions.
The R code generated by this tool uses the metafor package (Viechtbauer, 2010) with the gosh() function, which implements the same all-subsets algorithm with options for exhaustive enumeration or random sampling. The generated code includes parallel processing support for large datasets and produces publication-quality plots that match the visual output of this tool. For reproducing results in a peer-reviewed manuscript, including the R code ensures full computational transparency.
Start with the forest plot generator to visualize individual study results and overall heterogeneity. Use the GOSH plot to decompose that heterogeneity into its sources. Confirm findings with the leave-one-out sensitivity analysis tool. If publication bias is also a concern, combine with the funnel plot and publication bias tool for converging evidence on influential studies. Quantify heterogeneity formally using the heterogeneity calculator to compute I-squared, tau-squared, and prediction intervals.
A GOSH (Graphical Overview of Study Heterogeneity) plot, introduced by Olkin, Dahabreh, and Trikalinos (2012), is a diagnostic visualization that reveals which individual studies drive heterogeneity in a meta-analysis. It works by computing the pooled effect estimate and I-squared statistic for every possible subset of studies (or a large random sample of subsets), then plotting the pooled estimate on the x-axis against I-squared on the y-axis. Each point represents one subset. Clusters, outliers, and spread patterns in the resulting scatter plot reveal whether specific studies are responsible for observed heterogeneity.
A GOSH plot with a single compact cluster suggests that heterogeneity is relatively stable regardless of which studies are included, meaning no single study dominates. Multiple distinct clusters or a wide spread along the I-squared axis indicates that certain studies substantially alter the heterogeneity when included or excluded. To identify influential studies, use the highlight feature to color subsets containing a specific study. If highlighting a study shows that its subsets cluster in a different region (particularly at higher I-squared values), that study is likely a key driver of heterogeneity and warrants further investigation through sensitivity analysis.
Use a GOSH plot whenever your meta-analysis shows substantial heterogeneity (for example, I-squared above 50%) and you want to understand which studies contribute most to that variability. It complements standard heterogeneity metrics like I-squared and Cochran's Q by providing a visual decomposition of heterogeneity across all possible study combinations. The GOSH plot is particularly useful before conducting leave-one-out sensitivity analyses, because it can reveal complex patterns (such as two or more studies jointly driving heterogeneity) that simple one-at-a-time removal would miss.
For k studies, there are 2^k minus 1 possible non-empty subsets. When k is small (up to about 12 to 15 studies), the tool can enumerate all subsets exhaustively. For example, 10 studies produce 1,023 subsets, which is computationally trivial. However, 20 studies would produce over 1 million subsets, and 30 studies over 1 billion. For larger datasets, the tool randomly samples subsets (default: 10,000) to approximate the full GOSH pattern. A sample of this size is generally enough to capture the major features of the full GOSH plot without exhaustive computation.
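The subset counts quoted above follow directly from the formula, as this short check shows (the function name `subset_count` is just for illustration):

```python
def subset_count(k):
    """Number of non-empty subsets of k studies."""
    return 2 ** k - 1


for k in (10, 20, 30):
    print(k, f"{subset_count(k):,}")
# 10 1,023
# 20 1,048,575
# 30 1,073,741,823
```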
A GOSH plot helps distinguish between diffuse heterogeneity (where variability is spread across many studies) and concentrated heterogeneity (where one or two studies create most of the variability). If the GOSH plot shows a single compact cluster, any heterogeneity is likely diffuse rather than driven by a few studies. If it shows distinct clusters or a tail of high-I-squared subsets associated with specific studies, those studies may be genuine outliers or may reflect a different population, intervention, or methodology. The GOSH plot identifies the pattern, but clinical and methodological judgment is needed to determine whether excluding an influential study is justified.
Leave-one-out analysis removes one study at a time and recalculates the pooled estimate and heterogeneity, producing k results for k studies. The GOSH plot is a much more comprehensive approach: it examines all possible combinations of studies (not just single removals), revealing interactions between studies that leave-one-out cannot detect. For example, two studies might individually contribute modest heterogeneity, but jointly create a large increase in I-squared. The GOSH plot would reveal this interaction as a distinct cluster, while leave-one-out would not. Use leave-one-out to quantify the effect of removing each study individually, and GOSH to investigate joint effects across combinations of studies.
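For contrast with the all-subsets approach, a minimal leave-one-out sketch under a fixed-effect model looks like this. It produces exactly k results, one per removed study; the function names are hypothetical and the data are made up.

```python
def fixed_effect(effects, std_errors):
    """Return (pooled_estimate, i_squared_percent)."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = 100.0 * (q - df) / q if df > 0 and q > df else 0.0
    return pooled, i2


def leave_one_out(effects, std_errors):
    """Return (removed_index, pooled_estimate, i_squared) for each study."""
    effects, std_errors = list(effects), list(std_errors)
    results = []
    for i in range(len(effects)):
        ys = effects[:i] + effects[i + 1:]
        ses = std_errors[:i] + std_errors[i + 1:]
        results.append((i, *fixed_effect(ys, ses)))
    return results


# Made-up data: the third study is discordant with the first two.
for removed, pooled, i2 in leave_one_out([0.0, 0.0, 3.0], [1.0, 1.0, 1.0]):
    print(removed, round(pooled, 2), round(i2, 1))
```

With one discordant study, removing it drops I-squared to zero, which leave-one-out catches. If two discordant studies were present, no single removal would clear the heterogeneity, and that is the case the all-subsets view is designed for.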
Visualize individual study results and the overall pooled estimate with our forest plot generator for meta-analysis. Conduct leave-one-out sensitivity analysis to confirm which studies identified by the GOSH plot substantially influence the pooled result when removed. Detect publication bias with contour-enhanced funnel plots, Egger's test, and trim-and-fill using our funnel plot and publication bias tool.
Reviewed by
Dr. Sarah Mitchell holds a PhD in Biostatistics from Johns Hopkins Bloomberg School of Public Health and has over 15 years of experience in systematic review methodology and meta-analysis. She has authored or co-authored 40+ peer-reviewed publications in journals including the Journal of Clinical Epidemiology, BMC Medical Research Methodology, and Research Synthesis Methods. A former Cochrane Review Group statistician and current editorial board member of Systematic Reviews, Dr. Mitchell has supervised 200+ evidence synthesis projects across clinical medicine, public health, and social sciences. She reviews all Research Gold tools to ensure statistical accuracy and compliance with Cochrane Handbook and PRISMA 2020 standards.