Assess risk of bias and applicability concerns in diagnostic test accuracy studies using the QUADAS-2 framework (Whiting et al., 2011), the recommended tool for Cochrane DTA reviews.
Add your diagnostic accuracy studies and enter their names. Click each colored circle to cycle through judgments: + Low, − High, ? Unclear, N/A. The table has two sections: Risk of Bias (4 domains) and Applicability Concerns (3 domains). Domain 4 (Flow and Timing) has no applicability assessment per the QUADAS-2 framework.
Load sample data to see how the tool works, or clear all fields to start fresh.
| Study | D1 Patient Selection | D2 Index Test | D3 Reference Standard | D4 Flow and Timing | A1 Patient Selection | A2 Index Test | A3 Reference Standard |
|---|---|---|---|---|---|---|---|

D1–D4: Risk of Bias domains. A1–A3: Applicability Concerns domains.
QUADAS-2 (Whiting PF et al., 2011) • Generated with Research Gold
Click Add Study to create a row for each primary diagnostic accuracy study included in your systematic review. Enter study identifiers (e.g., Author Year) so the traffic light table clearly maps to your PRISMA flow diagram and data extraction forms.
Assess whether consecutive or random enrollment was used, whether a case-control design was avoided, and whether inappropriate exclusions were applied. Assign Low, High, or Unclear risk of bias, then rate the applicability of the patient population to your review question.
Evaluate whether the index test results were interpreted without knowledge of the reference standard results and whether a pre-specified threshold was used. Consider whether the conduct of the index test, its technology, or its interpretation differs from your review question for applicability.
Determine whether the reference standard correctly classifies the target condition and whether its results were interpreted without knowledge of the index test results. Assess applicability by considering whether the reference standard definition matches your review question.
Evaluate whether all patients received the same reference standard, whether all patients were included in the analysis, and whether the interval between the index test and reference standard was appropriate. This domain has no applicability concern rating.
Review the traffic light plot showing per-study, per-domain judgments and the summary bar chart showing proportions at each risk level. Download both as high-resolution PNGs for your manuscript and copy the auto-generated methods paragraph.
Need this done professionally? Get a complete systematic review or meta-analysis handled end-to-end.
Get a Free Quote

QUADAS-2 produces seven separate judgments per study: four risk of bias ratings (Patient Selection, Index Test, Reference Standard, Flow and Timing) and three applicability concern ratings (Patient Selection, Index Test, Reference Standard). Flow and Timing has no applicability rating because it relates to study conduct rather than relevance to the review question.
Each domain includes signaling questions that prompt assessors to consider specific methodological features before assigning a judgment. These questions are answered Yes, No, or Unclear, and the pattern of answers informs the overall domain judgment. This structured approach improves inter-rater reliability compared to global assessments without guidance.
QUADAS-2 does not prescribe a formal algorithm for an overall risk of bias judgment across all domains. Most systematic reviews classify a study as high overall risk if any single domain is rated high. However, review teams should pre-specify their decision rule and apply it consistently across all included studies.
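The common "worst domain" decision rule described above can be sketched in a few lines of Python. This is an illustration only — QUADAS-2 itself prescribes no algorithm, and the study judgments below are hypothetical; pre-specify whatever rule your protocol adopts.

```python
# Illustrative "worst domain" rule for an overall risk-of-bias judgment.
# QUADAS-2 does not mandate this rule; the example judgments are hypothetical.

RISK_DOMAINS = ["Patient Selection", "Index Test", "Reference Standard", "Flow and Timing"]

def overall_risk(judgments: dict) -> str:
    """High if any domain is High; Unclear if any domain is Unclear
    (and none High); otherwise Low."""
    ratings = [judgments[d] for d in RISK_DOMAINS]
    if "High" in ratings:
        return "High"
    if "Unclear" in ratings:
        return "Unclear"
    return "Low"

study = {  # hypothetical judgments for one included study
    "Patient Selection": "Low",
    "Index Test": "Unclear",
    "Reference Standard": "Low",
    "Flow and Timing": "Low",
}
print(overall_risk(study))  # → Unclear
```

Whatever rule you choose, the key requirement is that it is stated in the protocol before assessment begins and applied identically to every study.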
An Unclear judgment indicates insufficient information to determine whether bias is present. High proportions of Unclear ratings suggest inadequate reporting in the primary studies rather than absence of bias. When interpreting results, consider conducting sensitivity analyses that treat Unclear as either Low or High to test the impact on your conclusions.
QUADAS-2 findings directly inform sensitivity analyses in DTA meta-analysis. When pooling sensitivity and specificity using bivariate or HSROC models, restrict analyses to studies at low risk of bias to assess whether the pooled estimate changes. A substantial shift indicates that methodological quality is influencing the accuracy estimates.
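A minimal sketch of the restriction step, with invented 2x2 counts: compare a pooled estimate from all studies against one from studies rated low risk in every domain. Real analyses would refit a bivariate random-effects model (Reitsma et al., 2005) rather than summing counts across studies as done here for brevity.

```python
# Hypothetical sensitivity analysis: restrict pooling to studies judged
# low risk of bias in all QUADAS-2 domains. All counts are invented, and
# direct summing of 2x2 counts is a simplification for illustration only.

studies = [
    # (name, all_domains_low, TP, FP, FN, TN) — hypothetical data
    ("Author 2018", True,  45,  5,  5, 45),
    ("Author 2019", False, 50,  2,  0, 48),
    ("Author 2021", True,  40, 10, 10, 40),
]

def pooled_sens_spec(rows):
    tp = sum(r[2] for r in rows); fp = sum(r[3] for r in rows)
    fn = sum(r[4] for r in rows); tn = sum(r[5] for r in rows)
    return tp / (tp + fn), tn / (tn + fp)

all_sens, all_spec = pooled_sens_spec(studies)
low_only = [r for r in studies if r[1]]
low_sens, low_spec = pooled_sens_spec(low_only)
print(f"All studies:   sens={all_sens:.2f}, spec={all_spec:.2f}")
print(f"Low risk only: sens={low_sens:.2f}, spec={low_spec:.2f}")
```

If the two sets of estimates diverge substantially, report both and discuss the likely direction of bias.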
The summary bar chart shows what proportion of included studies are at Low, High, or Unclear risk for each domain. Domains where most studies are High or Unclear warrant particular attention in your discussion section and may justify downgrading certainty of evidence within the GRADE framework for diagnostic tests.
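The data behind such a summary bar chart is simply a per-domain tally. A short sketch, using hypothetical judgments for four studies:

```python
# Per-domain proportions of studies at each risk level, as plotted in a
# QUADAS-2 summary bar chart. Judgments are hypothetical.
from collections import Counter

judgments = {  # domain -> per-study risk-of-bias ratings (hypothetical)
    "Patient Selection":  ["Low", "Low", "High", "Unclear"],
    "Index Test":         ["Low", "Unclear", "Unclear", "High"],
    "Reference Standard": ["Low", "Low", "Low", "Low"],
    "Flow and Timing":    ["High", "High", "Low", "Unclear"],
}

proportions = {}
for domain, ratings in judgments.items():
    counts = Counter(ratings)
    n = len(ratings)
    proportions[domain] = {lvl: counts.get(lvl, 0) / n
                           for lvl in ("Low", "High", "Unclear")}
    print(domain, proportions[domain])
```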
QUADAS-2 (Whiting et al., 2011) organizes quality assessment into four domains: Patient Selection, Index Test, Reference Standard, and Flow and Timing. Each domain uses signaling questions to guide transparent judgments of risk of bias, while the first three domains also receive applicability concern ratings. This structured approach replaced the original 14-item QUADAS checklist (Whiting et al., 2003), which lacked a clear pathway from item responses to domain-level conclusions. PRISMA-DTA (McInnes et al., 2018) requires authors to present QUADAS-2 results for all included studies in diagnostic test accuracy systematic reviews.
The Patient Selection domain evaluates whether the study enrolled a consecutive or random sample and whether it avoided inappropriate exclusions that could distort accuracy estimates. Case-control designs, where known diseased patients are compared with known healthy controls, tend to overestimate diagnostic accuracy because spectrum effects are eliminated. The Index Test domain focuses on whether test interpretation was blinded to the reference standard result and whether a pre-specified threshold was applied. Post-hoc threshold selection inflates sensitivity and specificity by optimizing the cutoff to the available data rather than validating it prospectively.
The Reference Standard domain assesses whether the reference test correctly classifies the target condition and whether its results were interpreted independently of the index test. Incorporation bias, where the index test forms part of the reference standard, artificially inflates agreement between the two tests. The Flow and Timing domain evaluates whether all enrolled patients received the same reference standard and whether the time interval between tests was appropriate. Differential verification (using different reference standards for different patients) and excessive delay between tests can both distort accuracy estimates in unpredictable directions.
After completing your assessment, investigate the influence of study quality on pooled accuracy by conducting sensitivity analyses restricted to studies at low risk of bias. This is particularly important when constructing summary receiver operating characteristic (SROC) curves or pooling sensitivity and specificity using bivariate models (Reitsma et al., 2005). Calculate diagnostic accuracy metrics for individual studies using our diagnostic accuracy calculator. For randomized controlled trials, use the RoB 2 assessment tool, and for non-randomized intervention studies, use the ROBINS-I tool.
QUADAS-2 findings feed directly into GRADE assessments of diagnostic evidence (Schünemann et al., 2020, Cochrane Handbook, Chapter 8). A high proportion of studies at high risk of bias provides grounds for downgrading certainty by one or two levels, depending on the severity and consistency of the bias across included studies. Present both the traffic light table (per-study domain judgments) and the summary bar chart (proportion at each risk level per domain) in your manuscript, as recommended by Cochrane DTA guidelines.
Two independent reviewers should complete assessments for every study, with disagreements resolved by discussion or a third reviewer. Pilot the tailored signaling questions on 3 to 5 studies before full application to ensure consistent interpretation across your review team. Document any modifications to the standard signaling questions in your protocol and supplementary materials. Visualize your pooled diagnostic accuracy results alongside your QUADAS-2 findings using our forest plot generator to present sensitivity and specificity estimates with confidence intervals for each included study.
QUADAS-2 (Quality Assessment of Diagnostic Accuracy Studies, revised) is a structured tool for assessing risk of bias and applicability concerns in primary diagnostic accuracy studies included in systematic reviews. Developed by Whiting et al. (2011) and published in Annals of Internal Medicine, QUADAS-2 replaced the original QUADAS checklist (2003) with a domain-based framework that uses signaling questions to guide transparent, reproducible judgments. It is the recommended tool for all Cochrane diagnostic test accuracy (DTA) reviews.
QUADAS-2 assesses four domains: Patient Selection, Index Test, Reference Standard, and Flow and Timing. The first three domains are evaluated for both risk of bias and applicability concerns, producing seven separate judgments per study (four risk of bias ratings plus three applicability ratings). The fourth domain, Flow and Timing, is assessed only for risk of bias because it relates to the conduct of the study rather than the relevance of its findings to the review question.
Risk of bias refers to methodological flaws in how the study was designed, conducted, or analyzed that could distort estimates of diagnostic accuracy (for example, partial verification bias or differential verification). Applicability concerns refer to the degree to which the study matches the review question in terms of patient characteristics, index test conduct, or reference standard definition. A study can have low risk of bias but high applicability concern if it was well conducted but enrolled a population that differs meaningfully from the target population of the review.
Use QUADAS-2 when your systematic review evaluates the accuracy of a diagnostic or screening test (sensitivity, specificity, predictive values, likelihood ratios, AUC). Use RoB 2 when your review evaluates the effect of an intervention in randomized controlled trials. The two tools address fundamentally different study designs: QUADAS-2 is built for cross-sectional or cohort diagnostic accuracy studies, while RoB 2 is built for randomized experiments. If your review includes both diagnostic accuracy studies and intervention trials, apply the appropriate tool to each study type.
Report QUADAS-2 results using a traffic light table showing per-study, per-domain judgments (Low, High, or Unclear risk of bias and Low, High, or Unclear applicability concern) and a summary bar chart showing the proportion of studies at each risk level for each domain. PRISMA-DTA (McInnes et al., 2018) requires authors to present risk of bias and applicability results for all included studies. Include the completed assessment as a supplementary table and reference the QUADAS-2 citation (Whiting et al., 2011) in your methods section.
Yes. QUADAS-2 is appropriate for any study that evaluates the accuracy of a test against a reference standard, whether the test is used for diagnostic confirmation, screening, staging, monitoring, or prognosis. The key requirement is that the study reports data from which two-by-two tables (true positives, false positives, false negatives, true negatives) can be constructed. The signaling questions may need tailoring to the specific review question, and Cochrane encourages review teams to pilot the tailored tool on a subset of studies before full application.
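The standard accuracy metrics follow directly from the two-by-two table mentioned above. A minimal sketch with hypothetical counts:

```python
# Accuracy metrics derivable from a 2x2 table (TP, FP, FN, TN).
# The counts in the example call are hypothetical.
def accuracy_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    sens = tp / (tp + fn)          # true positive rate
    spec = tn / (tn + fp)          # true negative rate
    return {
        "sensitivity": sens,
        "specificity": spec,
        "PPV": tp / (tp + fp),     # positive predictive value
        "NPV": tn / (tn + fn),     # negative predictive value
        "LR+": sens / (1 - spec),  # positive likelihood ratio
        "LR-": (1 - sens) / spec,  # negative likelihood ratio
    }

print(accuracy_metrics(tp=90, fp=10, fn=10, tn=90))
```

Note that predictive values depend on prevalence in the study sample, which is one reason spectrum effects (Patient Selection domain) matter for applicability.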
Calculate sensitivity, specificity, positive and negative likelihood ratios, and AUC for your diagnostic accuracy studies using our diagnostic accuracy calculator. For randomized controlled trials, assess bias with the RoB 2 assessment tool. For non-randomized studies of interventions, use the ROBINS-I tool for non-randomized studies. Visualize your pooled diagnostic accuracy estimates with our forest plot generator for meta-analysis.
Reviewed by
Dr. Sarah Mitchell holds a PhD in Biostatistics from Johns Hopkins Bloomberg School of Public Health and has over 15 years of experience in systematic review methodology and meta-analysis. She has authored or co-authored 40+ peer-reviewed publications in journals including the Journal of Clinical Epidemiology, BMC Medical Research Methodology, and Research Synthesis Methods. A former Cochrane Review Group statistician and current editorial board member of Systematic Reviews, Dr. Mitchell has supervised 200+ evidence synthesis projects across clinical medicine, public health, and social sciences. She reviews all Research Gold tools to ensure statistical accuracy and compliance with Cochrane Handbook and PRISMA 2020 standards.
Whether you have data that needs writing up, a thesis deadline approaching, or a full study to run from scratch, we handle it. Average turnaround: 2-4 weeks.