AMSTAR-2 Assessment Tool

Free

Assess the methodological quality of systematic reviews using the AMSTAR-2 framework (Shea et al., 2017) with auto-calculated confidence ratings for your umbrella review.

How to Use

Add one row per systematic review being appraised. Click each colored circle to cycle through judgments: + Yes, ~ Partial Yes, − No (and N/A for Q11, Q14, Q15). Items flagged with a marker in the tool are critical domains. The overall confidence rating is calculated automatically using the AMSTAR-2 algorithm (Shea et al., 2017). Use the group tabs to switch between Items 1-8 and Items 9-16, then export as a high-resolution PNG.

Load sample data to see how the tool works, or clear all fields to start fresh.

Legend: + Yes, ~ Partial Yes, − No

AMSTAR-2 Item Reference (critical domains are labeled)

Q1 PICO components in research questions
Q2 Protocol registered before review (critical)
Q3 Study design selection explained
Q4 Comprehensive literature search (critical)
Q5 Study selection in duplicate
Q6 Data extraction in duplicate
Q7 List of excluded studies with justifications (critical)
Q8 Included studies described in detail
Q9 Risk of bias assessed satisfactorily (critical)
Q10 Funding sources of included studies reported
Q11 Appropriate statistical methods for meta-analysis (critical)
Q12 Risk of bias impact assessed
Q13 Risk of bias in results interpretation (critical)
Q14 Heterogeneity discussed satisfactorily
Q15 Publication bias investigated (critical)
Q16 Conflicts of interest reported


AMSTAR 2 (Shea et al., 2017, BMJ) • Generated with Research Gold

Next step

One review appraised. Want the full umbrella review run for you?

Comprehensive search of reviews, AMSTAR-2 across all included reviews, overlap analysis, and a publication-ready manuscript.

Our promise: Free rework on search, screening, or synthesis if reviewers push back.

Quote in minutes • Pay only after you approve scope • PhD methodologist • PRISMA 2020 + Cochrane Handbook • NDA available on request


Timeline

Most projects deliver in under 2 weeks. We confirm an exact date in your quote.

If reviewers push back

If reviewers question the search, screening, or synthesis, we rework the section free.

Confidentiality

NDA available on request before scope discussion. Your data, study design, and manuscript stay private either way.

How to Use This Tool

Enter Review Details

Click Add Review to create a row for each systematic review included in your umbrella review or overview of reviews. Enter the study identifier (e.g., Author Year) so your assessment table maps directly to your PRISMA flow diagram and characteristics of included reviews table.

Rate All 16 Items

For each review, click through the 16 AMSTAR-2 items across two tabs (Items 1-8 and Items 9-16). Cycle judgments between Yes, Partial Yes, No, and Not Applicable. Each item addresses a specific methodological feature such as protocol registration, search strategy, or statistical methods.

Check the 7 Critical Domains

Pay special attention to Items 2, 4, 7, 9, 11, 13, and 15, which are marked as critical. A No on any critical item will reduce the overall confidence to Low or Critically Low. These items represent methodological steps where flaws most commonly produce misleading conclusions in systematic reviews.

View Confidence Ratings

The tool automatically calculates the overall confidence rating (High, Moderate, Low, or Critically Low) for each review using the AMSTAR-2 algorithm. Under the published scheme (Shea et al., 2017), High means no critical flaws and at most one non-critical weakness. Moderate means more than one non-critical weakness but no critical flaws. Low means one critical flaw, with or without non-critical weaknesses. Critically Low means two or more critical flaws.
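The rating rules can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual source: the function name `overall_confidence`, the judgment codes (`Y`, `PY`, `N`, `NA`), and the `max_weaknesses_for_high` parameter are assumptions introduced here. The default of 1 follows the published scheme in Shea et al. (2017), where High tolerates a single non-critical weakness; pass 0 for a stricter reading. Counting a Partial Yes on a critical item as a non-critical weakness is also an assumption.

```python
# Critical domains as listed on this page.
CRITICAL_ITEMS = {2, 4, 7, 9, 11, 13, 15}


def overall_confidence(judgments, max_weaknesses_for_high=1):
    """Return the overall confidence rating for one review.

    judgments: dict mapping item number (1-16) to 'Y', 'PY', 'N', or 'NA'.
    max_weaknesses_for_high: hypothetical knob; 1 matches the published
    scheme, 0 gives a stricter reading of "High".
    """
    critical_flaws = 0
    weaknesses = 0
    for item, judgment in judgments.items():
        if judgment == "N" and item in CRITICAL_ITEMS:
            critical_flaws += 1      # a "No" on a critical domain
        elif judgment in ("N", "PY"):
            weaknesses += 1          # assumption: every other N or PY is a weakness
        # 'Y' and 'NA' contribute nothing

    if critical_flaws >= 2:
        return "Critically Low"
    if critical_flaws == 1:
        return "Low"
    if weaknesses > max_weaknesses_for_high:
        return "Moderate"
    return "High"


# Usage with a hypothetical review: everything Yes except Item 7.
example = {item: "Y" for item in range(1, 17)}
example[7] = "N"
print(overall_confidence(example))
```

Because the critical-flaw checks come first, no number of Yes judgments on other items can lift a review with a critical flaw above Low.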

Generate Summary Visualizations

Review the summary bar chart showing the distribution of Yes, Partial Yes, and No judgments across all included reviews for each item. This visualization reveals which methodological features are consistently weak across the body of included reviews.
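The counts behind such a bar chart can be tallied as follows. This is a sketch under stated assumptions: the data layout (a dict of review IDs mapping to item judgments), the judgment codes, and the function names `item_distribution` and `item_proportions` are hypothetical and are not taken from the tool.

```python
from collections import Counter


def item_distribution(reviews):
    """Count judgments per AMSTAR-2 item across all reviews.

    reviews: dict mapping a review ID to a dict of {item number: judgment}.
    Returns a dict mapping each item (1-16) to a Counter of judgments.
    """
    dist = {item: Counter() for item in range(1, 17)}
    for judgments in reviews.values():
        for item, judgment in judgments.items():
            dist[item][judgment] += 1
    return dist


def item_proportions(counter):
    """Convert one item's counts to proportions (empty dict if no data)."""
    total = sum(counter.values())
    if total == 0:
        return {}
    return {judgment: count / total for judgment, count in counter.items()}


# Usage with two hypothetical reviews rated on Items 1 and 2 only.
sample = {
    "Review A": {1: "Y", 2: "N"},
    "Review B": {1: "PY", 2: "N"},
}
dist = item_distribution(sample)
print(dict(dist[2]))
```

An item where most reviews score No stands out immediately in these counts, which is the pattern the bar chart is meant to expose.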

Export for Publication

Download the assessment table and summary bar chart as high-resolution PNG files for your manuscript supplementary materials. The visualizations are formatted to meet journal submission requirements and can be included directly in your umbrella review without additional formatting.

Want a PhD methodologist to handle the whole project?

Get a complete umbrella review with AMSTAR-2 quality assessment across all included reviews. Free rework on search, screening, or synthesis if reviewers push back. Pay only after you approve scope.


Key Takeaways for AMSTAR-2 Assessment

16 items with 7 critical domains drive the rating

AMSTAR-2 distinguishes between critical and non-critical items. The 7 critical domains (Items 2, 4, 7, 9, 11, 13, 15) represent methodological steps where flaws are most likely to invalidate the review's conclusions. A single critical flaw reduces confidence to Low, and two or more critical flaws produce Critically Low. Non-critical weaknesses can only reduce confidence to Moderate at worst.

The confidence rating algorithm is categorical, not additive

Unlike the original AMSTAR, which produced a numeric score (0-11), AMSTAR-2 uses a categorical algorithm. High confidence requires no critical flaws and at most one non-critical weakness. Moderate allows more than one non-critical weakness but no critical flaws. This means a review could have 9 non-critical weaknesses and still rate Moderate, while a single critical flaw drops it to Low regardless of other strengths.

Non-critical flaws vs critical flaws have different impacts

Non-critical items (1, 3, 5, 6, 8, 10, 12, 14, 16) cover important but less decisive methodological features such as study design justification, data extraction methods, and conflict of interest reporting. A No on these items counts as a weakness but cannot alone reduce confidence below Moderate. Multiple non-critical weaknesses together move the rating from High to Moderate.

PICO components and protocol registration are foundational

Item 1 (research question using PICO) and Item 2 (protocol registered before the review, critical) establish whether the review was planned a priori. Protocol registration protects against outcome switching and selective reporting. Reviews without protocols often modify their scope, inclusion criteria, or outcomes during data collection, introducing bias that cannot be detected from the final publication alone.

Literature search adequacy determines what evidence enters the review

Item 4 (critical) requires a comprehensive search strategy including at least two databases, keyword and index term searches, and supplementary strategies such as reference list checking, expert contact, or grey literature searching. An inadequate search may miss relevant studies, distorting the pooled estimate and reducing the generalizability of the review's conclusions.

Overall confidence does not equal overall quality of evidence

AMSTAR-2 assesses how well the systematic review was conducted, not the certainty of the evidence it synthesizes. A review can receive High confidence (excellent methodology) while its conclusions remain uncertain because the primary studies are few, small, or heterogeneous. Pair AMSTAR-2 with GRADE to communicate both the trustworthiness of the review process and the certainty of the findings.

AMSTAR-2 Methodology for Umbrella Reviews

AMSTAR-2 (Shea et al., 2017), published in the BMJ, replaced the original AMSTAR with a more nuanced 16-item instrument that identifies 7 critical domains and produces categorical confidence ratings (High, Moderate, Low, Critically Low) rather than numeric scores. A single critical flaw reduces confidence to Low regardless of performance on other items. Two or more critical flaws result in Critically Low. This design ensures that fundamental methodological weaknesses are never masked by strengths in minor areas. The original AMSTAR (Shea et al., 2007) used an 11-item scale that produced a numeric score, but this approach allowed reviews with serious flaws in critical areas to receive acceptable scores if they performed well on non-critical items.

The 7 critical items represent methodological steps where flaws most commonly produce misleading conclusions. Protocol registration (Item 2) prevents post-hoc modifications to eligibility criteria and outcomes. A comprehensive literature search (Item 4) ensures all relevant evidence enters the review. Justification for excluded studies (Item 7) provides transparency about what was left out and why. The risk of bias assessment technique (Item 9) ensures appropriate tools were applied to primary studies. Appropriate statistical methods (Item 11) guards against invalid pooling. Accounting for risk of bias in interpreting results (Item 13) prevents overconfident conclusions from flawed studies. Publication bias investigation (Item 15) assesses whether missing studies may have distorted the pooled estimate.

Umbrella reviews (also called overviews of reviews) synthesize evidence from multiple systematic reviews on the same or related topics. AMSTAR-2 is the standard tool for classifying the quality of included reviews in these higher-order syntheses. After classifying all included reviews, conduct sensitivity analyses restricted to reviews rated High or Moderate confidence to test robustness. If conclusions change substantially when Critically Low reviews are excluded, the overall evidence base may be unreliable. This approach mirrors the sensitivity analysis strategy used in primary meta-analyses when removing high risk of bias studies.
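The first step of that sensitivity analysis, splitting the included reviews by confidence rating, might look like the sketch below. The function name `split_by_confidence`, the `keep` parameter, and the review IDs are hypothetical. Re-running the synthesis on the retained subset is left to whatever method your umbrella review uses.

```python
def split_by_confidence(ratings, keep=("High", "Moderate")):
    """Partition reviews for a sensitivity analysis.

    ratings: dict mapping a review ID to its overall confidence rating.
    keep: ratings retained in the restricted analysis.
    Returns (retained_ids, excluded_ids), each sorted for stable reporting.
    """
    retained = sorted(rid for rid, rating in ratings.items() if rating in keep)
    excluded = sorted(rid for rid, rating in ratings.items() if rating not in keep)
    return retained, excluded


# Usage with hypothetical review IDs and ratings.
ratings = {
    "Lee 2020": "High",
    "Garcia 2019": "Critically Low",
    "Okafor 2021": "Moderate",
    "Smith 2018": "Low",
}
retained, excluded = split_by_confidence(ratings)
print("Retained:", retained)
print("Excluded:", excluded)
```

Reporting both lists makes it clear which reviews drive any change in conclusions between the full and restricted analyses.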

AMSTAR-2 complements GRADE for umbrella reviews but addresses different questions. AMSTAR-2 assesses how well each included systematic review was conducted (methodological quality of the review process), while GRADE assesses the certainty of evidence for specific outcomes across the primary studies. A systematic review can receive High AMSTAR-2 confidence but still present Low GRADE certainty evidence if the primary studies are small, inconsistent, or indirect. Conversely, a Critically Low AMSTAR-2 review might contain high-certainty GRADE evidence if the primary studies are large and consistent, though the review methodology raises concerns about completeness and bias.

The distinction between Partial Yes and No responses is important for several AMSTAR-2 items. Partial Yes typically indicates that the review authors attempted but did not fully achieve the methodological standard. For example, Item 4 (comprehensive search) receives Partial Yes if at least two databases were searched but supplementary search strategies (grey literature, reference checking) were not employed. These partial responses count as non-critical weaknesses in the confidence algorithm, meaning they can lower confidence from High to Moderate but not further.

For primary studies within your included reviews, use the RoB 2 tool for randomized trials or our GRADE evidence certainty tool to rate overall certainty. Ensure transparent reporting with the PRISMA checklist tool. When your umbrella review identifies gaps in the evidence base, plan future primary systematic reviews using structured frameworks and visualize their meta-analytic results with our forest plot generator.

Frequently Asked Questions

What is AMSTAR-2?

AMSTAR-2 (A MeaSurement Tool to Assess systematic Reviews, version 2) is a 16-item critical appraisal instrument developed by Shea et al. (2017) and published in the BMJ. It evaluates the methodological quality of systematic reviews, including those with or without meta-analysis. Unlike the original AMSTAR, AMSTAR-2 identifies 7 critical domains and uses an overall confidence rating system (High, Moderate, Low, Critically Low) rather than a numeric score. It is the standard tool for umbrella reviews and overviews of reviews.

What are the 7 critical domains in AMSTAR-2?

The 7 critical domains are: Item 2 (protocol registered before the review), Item 4 (comprehensive literature search strategy), Item 7 (list of excluded studies with justifications), Item 9 (satisfactory risk of bias assessment technique), Item 11 (appropriate statistical methods for meta-analysis), Item 13 (risk of bias accounted for in interpreting results), and Item 15 (adequate investigation of publication bias when a quantitative synthesis was performed). A flaw in any critical domain has a disproportionate impact on the overall confidence rating.

How does AMSTAR-2 calculate the overall confidence rating?

AMSTAR-2 uses a categorical rating system. High confidence means no critical flaws and at most one non-critical weakness. Moderate confidence means more than one non-critical weakness but no critical flaws. Low confidence means one critical flaw, with or without non-critical weaknesses. Critically Low confidence means more than one critical flaw, with or without non-critical weaknesses. A critical flaw is a 'No' response on any of the 7 critical domains. A non-critical weakness is a 'No' or 'Partial Yes' on a non-critical item.

When should I use AMSTAR-2 vs ROBIS?

Both AMSTAR-2 and ROBIS assess the quality of systematic reviews, but they differ in scope and structure. AMSTAR-2 evaluates 16 methodological items and produces an overall confidence rating, making it well suited for umbrella reviews that need a standardized, transparent quality classification across many included reviews. ROBIS (Whiting et al., 2016) focuses on risk of bias across 4 domains with signaling questions and is often preferred when reviewers want a domain-based risk of bias judgment rather than a global quality rating. Many umbrella review protocols specify AMSTAR-2 because its categorical output (High to Critically Low) integrates naturally into GRADE-CERQual and evidence mapping frameworks.

Can AMSTAR-2 be used for systematic reviews without meta-analysis?

Yes. AMSTAR-2 was explicitly designed to assess systematic reviews both with and without meta-analysis. For reviews that do not include a meta-analysis, Items 11 (appropriate statistical methods for meta-analysis), 14 (satisfactory discussion of heterogeneity), and 15 (publication bias investigation) can be marked as Not Applicable. The remaining 13 items still provide a comprehensive assessment of the review methodology, and the overall confidence rating is calculated from the applicable items.
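One way to handle Not Applicable judgments is to validate where they appear and then work only with the remaining items. The sketch below assumes the tool's convention that N/A is offered on Items 11, 14, and 15. The names `NA_ALLOWED` and `applicable_items` and the judgment codes are hypothetical.

```python
# Items on which this tool offers a Not Applicable option.
NA_ALLOWED = {11, 14, 15}


def applicable_items(judgments):
    """Return the sorted item numbers that were actually rated.

    judgments: dict mapping item number (1-16) to 'Y', 'PY', 'N', or 'NA'.
    Raises ValueError if 'NA' is used on an item that does not allow it.
    """
    for item, judgment in judgments.items():
        if judgment == "NA" and item not in NA_ALLOWED:
            raise ValueError(f"Item {item} cannot be marked Not Applicable")
    return sorted(item for item, judgment in judgments.items() if judgment != "NA")


# Usage: a hypothetical review without meta-analysis.
narrative_review = {item: "Y" for item in range(1, 17)}
for item in NA_ALLOWED:
    narrative_review[item] = "NA"

print(len(applicable_items(narrative_review)))  # 13 items remain
```

Rejecting N/A on other items guards against a critical domain such as Item 2 being skipped by accident.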

How do I report AMSTAR-2 results in an umbrella review?

Present AMSTAR-2 results in a summary table showing the per-review judgment for each of the 16 items and the overall confidence rating. Include a summary bar chart showing the proportion of reviews rated Yes, Partial Yes, and No for each item. Report the distribution of overall confidence ratings across all included reviews (e.g., 5 High, 8 Moderate, 3 Low, 2 Critically Low). State in the methods section that AMSTAR-2 (Shea et al., 2017) was used and that two reviewers independently assessed each review. Sensitivity analyses restricted to reviews rated High or Moderate confidence can strengthen the robustness of your conclusions.
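The distribution sentence for the results section can be generated from the list of overall ratings. This is a small sketch; the function name `rating_summary` is an assumption, and the sample counts simply reuse the illustrative figures given in the answer.

```python
from collections import Counter

# Report ratings from best to worst, not in order of first appearance.
RATING_ORDER = ["High", "Moderate", "Low", "Critically Low"]


def rating_summary(ratings):
    """Format the distribution of overall confidence ratings for reporting.

    ratings: iterable of overall ratings, one per included review.
    Ratings with no reviews are reported with a count of 0.
    """
    counts = Counter(ratings)
    return ", ".join(f"{counts[rating]} {rating}" for rating in RATING_ORDER)


# Usage with the illustrative distribution from the text.
ratings = ["High"] * 5 + ["Moderate"] * 8 + ["Low"] * 3 + ["Critically Low"] * 2
print(rating_summary(ratings))  # 5 High, 8 Moderate, 3 Low, 2 Critically Low
```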

Related Research Tools

Assess risk of bias in the primary randomized trials within your included reviews using the RoB 2 assessment tool for randomized controlled trials. Rate the certainty of evidence for each outcome using the GRADE evidence certainty tool. Ensure your umbrella review reporting meets PRISMA standards with the PRISMA checklist tool for scoping and systematic reviews.

Reviewed by

Dr. Sarah Mitchell

PhD, Biostatistics & Research Methodology

Dr. Sarah Mitchell holds a PhD in Biostatistics from Johns Hopkins Bloomberg School of Public Health and has over 15 years of experience in systematic review methodology and meta-analysis. She has authored or co-authored 40+ peer-reviewed publications in journals including the Journal of Clinical Epidemiology, BMC Medical Research Methodology, and Research Synthesis Methods. A former Cochrane Review Group statistician and current editorial board member of Systematic Reviews, Dr. Mitchell has supervised 200+ evidence synthesis projects across clinical medicine, public health, and social sciences. She reviews all Research Gold tools to ensure statistical accuracy and compliance with Cochrane Handbook and PRISMA 2020 standards.


Quality Assessment Is Complex. Our Experts Handle It Daily.

We conduct full risk of bias assessments, GRADE evaluations, and complete systematic reviews with rigorous methodology that satisfies peer reviewers. Most projects deliver in under 2 weeks.


4.9 / 5 across 1,194+ projects • Quote in minutes • PRISMA 2020 + Cochrane Handbook • PhD methodologist • Pay only after you approve scope • NDA available on request

