Compute sensitivity, specificity, PPV, NPV, likelihood ratios, diagnostic odds ratio, Youden's index, accuracy, and F1 score from a 2×2 table with Wilson score 95% confidence intervals.
Enter the four cells of a 2×2 diagnostic table comparing index test results against the reference standard.
Se = TP/(TP+FN), Sp = TN/(TN+FP), PPV = TP/(TP+FP), NPV = TN/(TN+FN). Wilson score 95% CIs for all proportions.
Fill in the true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN) from your diagnostic study.
See sensitivity, specificity, PPV, NPV, likelihood ratios, DOR, Youden’s index, accuracy, prevalence, and F1 score computed instantly.
All proportions include Wilson score 95% CIs. Likelihood ratios and DOR use log-based CIs for correct coverage.
Copy all results to your clipboard for reporting, load the example data to explore, or reset to start fresh.
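The proportion formulas above, together with the Wilson score interval, can be sketched in a few lines of Python; the 2×2 cell counts here are purely illustrative:

```python
from math import sqrt

def wilson_ci(successes, n, z=1.96):
    """Wilson score 95% CI for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return centre - half, centre + half

# Hypothetical 2x2 table from a diagnostic study
tp, fp, fn, tn = 90, 10, 10, 90

se  = tp / (tp + fn)   # sensitivity
sp  = tn / (tn + fp)   # specificity
ppv = tp / (tp + fp)   # positive predictive value
npv = tn / (tn + fn)   # negative predictive value

lo, hi = wilson_ci(tp, tp + fn)
print(f"Se = {se:.2f} (95% CI {lo:.3f} to {hi:.3f})")
```

Unlike the naive Wald interval, the Wilson interval never extends outside [0, 1] and behaves sensibly when a cell count is zero or a proportion is near 100%, which is common in diagnostic studies.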
Increasing sensitivity typically decreases specificity and vice versa. The optimal balance depends on the clinical context: screening tests favor high sensitivity (to catch all cases), while confirmatory tests favor high specificity (to minimize false positives).
Unlike PPV and NPV, likelihood ratios do not depend on disease prevalence. LR+ > 10 and LR− < 0.1 provide strong diagnostic evidence (Jaeschke et al., 1994). This makes LRs more generalizable across populations than predictive values.
The diagnostic odds ratio combines sensitivity and specificity into one measure (DOR = (TP×TN)/(FP×FN)). A DOR of 1 means no discrimination. DOR is useful for meta-analysis of diagnostic tests but does not indicate the direction of the trade-off.
Even an excellent test (Se = 0.99, Sp = 0.99) has a PPV of only 50% when prevalence is 1%. Always report prevalence alongside PPV/NPV, or use likelihood ratios to communicate test performance across populations.
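That arithmetic follows directly from Bayes' theorem; a minimal sketch (the prevalence values are illustrative):

```python
def ppv_from_prevalence(se, sp, prev):
    """PPV via Bayes' theorem: P(disease | positive test)."""
    return se * prev / (se * prev + (1 - sp) * (1 - prev))

# An excellent test (Se = Sp = 0.99) at three prevalences
for prev in (0.01, 0.10, 0.50):
    print(f"prevalence {prev:.0%}: PPV = {ppv_from_prevalence(0.99, 0.99, prev):.1%}")
```

At 1% prevalence the PPV is exactly 50%: for every 10,000 people tested, the 99 true positives are matched by 99 false positives from the much larger disease-free group.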
A diagnostic accuracy calculator transforms a 2×2 contingency table — comparing index test results against a reference standard — into the full set of metrics that clinicians and systematic reviewers require. The sensitivity specificity calculator computes the probability of a positive test given disease is present (sensitivity) and the probability of a negative test given disease is absent (specificity). These paired metrics are the foundation of Cochrane Diagnostic Test Accuracy (DTA) Reviews (Deeks et al., 2023), which pool sensitivity and specificity separately using bivariate or hierarchical summary ROC models to account for the threshold effect — the inherent trade-off between sensitivity and specificity as the positivity threshold changes. The bivariate model (Reitsma et al., 2005) jointly estimates pooled sensitivity and specificity with their correlation, while the hierarchical summary ROC (HSROC) model (Rutter & Gatsonis, 2001) parameterizes the underlying ROC curve directly; both approaches are implemented in software such as RevMan, Meta-DiSc, and the R mada package.
The likelihood ratio calculator derives LR+ (sensitivity / (1 − specificity)) and LR− ((1 − sensitivity) / specificity) from the same table. Likelihood ratios express how much a test result changes the pre-test probability of disease: LR+ above 10 or LR− below 0.1 provides strong diagnostic evidence (Deeks & Altman, 2004). The diagnostic odds ratio (DOR = LR+ / LR−) is a single summary of test discriminative ability, while Youden's J index (sensitivity + specificity − 1) identifies the optimal threshold when multiple cut-points are evaluated. The Fagan nomogram provides a visual method for Bayesian updating: by drawing a line from the pre-test probability through the likelihood ratio, clinicians can read off the post-test probability directly, making LRs immediately actionable at the point of care. Predictive values — PPV and NPV — depend on disease prevalence in the tested population, which is why this tool requires the 2×2 cell counts rather than pre-computed sensitivity and specificity.
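Those ratio formulas reduce to a few lines; a sketch that assumes sensitivity and specificity have already been computed from the 2×2 table:

```python
def lr_summary(se, sp):
    """LR+, LR-, DOR, and Youden's J from sensitivity and specificity."""
    lr_pos = se / (1 - sp)    # how much a positive result raises the odds
    lr_neg = (1 - se) / sp    # how much a negative result lowers the odds
    return {"LR+": lr_pos, "LR-": lr_neg,
            "DOR": lr_pos / lr_neg, "J": se + sp - 1}
```

For example, a test with Se = Sp = 0.90 gives LR+ = 9, LR− ≈ 0.11, DOR = 81, and J = 0.80.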
This 2×2 table calculator connects directly to the broader evidence synthesis workflow. The same contingency table structure underlies our chi-square and Fisher's exact test calculator, which tests statistical association between test classification and disease status. When test-and-treat strategies are evaluated, likelihood ratios can be combined with baseline risk estimates in our NNT calculator to determine the number of patients who need to be tested and treated for one additional patient to benefit. For Bayesian approaches to updating diagnostic probabilities, our Bayes factor calculator quantifies evidence strength under competing hypotheses.
When reporting diagnostic accuracy in a systematic review, PRISMA-DTA (McInnes et al., 2018) requires presenting sensitivity and specificity for each study alongside 95% confidence intervals, paired forest plots (one for sensitivity, one for specificity), and if applicable, a summary ROC curve. Individual study forest plots can be generated using our forest plot generator. Always report the reference standard used, the spectrum of disease severity in the study population, and whether index test interpretation was blinded to the reference standard result — factors that the QUADAS-2 risk of bias tool evaluates for diagnostic accuracy studies. Spectrum bias (when the enrolled case mix differs from the target clinical population) and partial verification bias (when only test-positive patients receive the reference standard) are among the most common threats to DTA study validity and should be explicitly assessed and reported.
Sensitivity (true positive rate) is the probability that a test correctly identifies patients who have the disease: Se = TP/(TP+FN). Specificity (true negative rate) is the probability that a test correctly identifies patients who do not have the disease: Sp = TN/(TN+FP). A highly sensitive test rarely misses true cases (few false negatives), while a highly specific test rarely misclassifies healthy individuals (few false positives).
Likelihood ratios quantify how much a test result changes the probability of disease. A positive likelihood ratio (LR+) above 10 provides strong evidence to rule in a diagnosis, while an LR+ of 5–10 provides moderate evidence (Jaeschke et al., 1994). A negative likelihood ratio (LR−) below 0.1 provides strong evidence to rule out a diagnosis. LR values close to 1 indicate the test provides little diagnostic information.
Positive predictive value (PPV) is the probability that a person with a positive test truly has the disease. PPV depends not only on the test’s sensitivity and specificity but also on the prevalence (pre-test probability) of the disease in the population being tested. Even a highly specific test will have a low PPV when used in a low-prevalence population because most positive results will be false positives. This is why screening tests perform differently in different populations.
The diagnostic odds ratio (DOR = LR+/LR−) is a single summary measure that combines sensitivity and specificity into one number. It is particularly useful in meta-analyses of diagnostic test accuracy (DTA reviews) because it can be pooled across studies using standard meta-analytic methods. A DOR of 1 indicates the test has no discriminatory power, while higher values indicate better discriminatory ability. However, DOR does not convey the trade-off between sensitivity and specificity.
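Because the DOR's sampling distribution is approximately normal on the log scale, its confidence interval is computed there and back-transformed; a sketch using the standard log-scale standard error (the cell counts are hypothetical):

```python
from math import exp, log, sqrt

def dor_with_ci(tp, fp, fn, tn, z=1.96):
    """Diagnostic odds ratio with a log-scale 95% CI."""
    dor = (tp * tn) / (fp * fn)
    se_log = sqrt(1/tp + 1/fp + 1/fn + 1/tn)   # SE of ln(DOR)
    return dor, exp(log(dor) - z * se_log), exp(log(dor) + z * se_log)

dor, lo, hi = dor_with_ci(90, 10, 10, 90)
print(f"DOR = {dor:.0f} (95% CI {lo:.1f} to {hi:.1f})")
```

Note the asymmetry of the interval around the point estimate, which is expected for a ratio measure; a symmetric interval on the raw scale would have poor coverage and could extend below zero.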
Verification bias (also called work-up bias) occurs when not all patients receive the reference standard, and the decision to verify depends on the test result. This typically inflates sensitivity and deflates specificity. To address it, report whether all patients received the reference standard, consider using correction methods (e.g., Begg & Greenes adjustment), and flag potential bias in your quality assessment using tools like QUADAS-2. This calculator assumes all patients received both the index test and the reference standard.
Sensitivity is the probability of a positive test given the patient has the disease (TP/(TP+FN)), while PPV is the probability of disease given a positive test (TP/(TP+FP)). Sensitivity is a fixed test property, whereas PPV depends on disease prevalence. A test with 99% sensitivity can have a PPV below 10% in low-prevalence populations.
Multiply the pre-test odds by the likelihood ratio to get post-test odds, then convert back to probability. Pre-test odds = prevalence / (1 − prevalence). Post-test odds = pre-test odds × LR. Post-test probability = post-test odds / (1 + post-test odds). This Bayesian updating process is often visualized using a Fagan nomogram.
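Those three steps, written as a small helper (the function name is mine):

```python
def post_test_probability(pretest_prob, lr):
    """Update a pre-test probability with a likelihood ratio via odds."""
    pre_odds = pretest_prob / (1 - pretest_prob)   # probability -> odds
    post_odds = pre_odds * lr                      # Bayes update on the odds scale
    return post_odds / (1 + post_odds)             # odds -> probability

# e.g. 20% pre-test probability, positive result with LR+ = 9
p = post_test_probability(0.20, 9)
print(f"post-test probability = {p:.1%}")   # roughly 69%
```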
Youden’s index (J = sensitivity + specificity − 1) summarizes a test’s discriminative ability in a single number ranging from 0 (useless) to 1 (perfect). It is most useful when selecting an optimal cut-point on a ROC curve: the threshold that maximizes J balances sensitivity and specificity equally. For clinical decisions where one error type is costlier, weighted alternatives are preferred.
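Cut-point selection by maximizing J can be sketched as follows; the (threshold, Se, Sp) triples below are invented for illustration:

```python
# Hypothetical (threshold, sensitivity, specificity) at candidate cut-points
candidates = [
    (1.0, 0.90, 0.55),
    (2.0, 0.82, 0.71),
    (3.0, 0.70, 0.88),
    (4.0, 0.55, 0.95),
]

def youden_j(se, sp):
    """Youden's J statistic: Se + Sp - 1."""
    return se + sp - 1

best = max(candidates, key=lambda t: youden_j(t[1], t[2]))
print(f"optimal cut-point: {best[0]} (J = {youden_j(best[1], best[2]):.2f})")
```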
Our biostatisticians specialize in Cochrane DTA reviews, SROC curves, bivariate meta-analysis, and QUADAS-2 quality assessment for your diagnostic accuracy systematic review.