Score the methodological quality of cohort and case-control studies using the Newcastle-Ottawa Scale. Assign stars across selection, comparability, and outcome/exposure domains with multi-study support and CSV export.
Add studies and name them. For each study, select the option that best describes it for every NOS item. Options that award a star are marked with a star icon. Each study can earn up to 9 stars. Quality ratings: 0-3 Low, 4-6 Moderate, 7-9 High.
Representativeness of the exposed cohort
Selection of the non-exposed cohort
Ascertainment of exposure
Demonstration that outcome of interest was not present at start of study
Comparability of cohorts on the basis of the design or analysis
Comparability — additional factor
Assessment of outcome
Was follow-up long enough for outcomes to occur?
Adequacy of follow-up of cohorts
Select the appropriate tab for your study design: Cohort Studies or Case-Control Studies. Each tab presents the relevant NOS items for that design.
Add a row for each study in your review and enter the study identifier. Click a study row to expand its assessment form.
For each NOS item, select the option that best describes the study. Options marked with a star icon contribute to the total score.
Review the summary table showing stars per domain and overall quality ratings. Export results as CSV or copy the formatted text to your clipboard.
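The export step above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the dictionary keys and column layout are assumptions chosen to mirror the summary table described here.

```python
import csv
import io

def export_csv(studies):
    """Write one summary row per study: identifier, domain stars, total, rating.

    `studies` is a list of dicts with hypothetical keys ("id", "selection",
    "comparability", "outcome"); the real tool's export format may differ.
    """
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["Study", "Selection", "Comparability",
                     "Outcome/Exposure", "Total", "Rating"])
    for s in studies:
        total = s["selection"] + s["comparability"] + s["outcome"]
        # Conventional three-tier interpretation (not a validated cutoff).
        rating = "High" if total >= 7 else "Moderate" if total >= 4 else "Low"
        writer.writerow([s["id"], s["selection"], s["comparability"],
                         s["outcome"], total, rating])
    return buf.getvalue()
```

A row such as `Smith 2019,3,2,3,8,High` can then be pasted directly into the summary table of a review manuscript.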
Each NOS star represents a specific methodological criterion that the study meets. Stars are awarded for secure exposure ascertainment, appropriate control selection, adequate follow-up, and other quality indicators. The maximum 9 stars span selection (4), comparability (2), and outcome/exposure (3).
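The star arithmetic is simple enough to state as code. The following sketch encodes the domain maxima (Selection 4, Comparability 2, Outcome/Exposure 3) and the conventional 0-3/4-6/7-9 tiers; the function name is illustrative, not part of any library.

```python
def quality_rating(selection: int, comparability: int, outcome_exposure: int) -> str:
    """Map NOS domain stars to the conventional three-tier rating.

    Domain maxima follow the scale: Selection 0-4, Comparability 0-2,
    Outcome/Exposure 0-3 (9 stars total). The tier cutoffs are
    conventions, not empirically validated thresholds.
    """
    if not (0 <= selection <= 4 and 0 <= comparability <= 2
            and 0 <= outcome_exposure <= 3):
        raise ValueError("domain stars out of range")
    total = selection + comparability + outcome_exposure
    if total >= 7:
        return "High"
    if total >= 4:
        return "Moderate"
    return "Low"
```

Note that a study scoring 4+2+1 and one scoring 2+1+4 would be invalid or very different profiles, which is why the domain breakdown matters more than the total.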
While 7-9 stars is commonly considered high quality, 4-6 moderate, and 0-3 low, these thresholds are conventions rather than validated cutoffs. Use NOS scores to conduct sensitivity analyses — re-run your meta-analysis restricted to high-quality studies to test whether results are robust to study quality.
A total score of 6 can arise from very different quality profiles. Always report the domain-level breakdown (selection, comparability, outcome/exposure) so readers can identify whether specific quality concerns apply to your evidence base.
NOS inter-rater reliability is moderate. Best practice requires two independent reviewers to score each study, with discrepancies resolved through discussion or a third reviewer. Report the initial agreement rate (e.g., using Cohen's kappa) in your methods section.
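For reporting the agreement rate, Cohen's kappa can be computed directly from the two reviewers' tier assignments. The sketch below uses the standard definition, kappa = (p_o − p_e) / (1 − p_e), where p_o is observed agreement and p_e is chance agreement from the raters' marginal frequencies.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical scores (e.g. High/Moderate/Low).

    p_o = observed proportion of agreement; p_e = agreement expected by
    chance given each rater's marginal category frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and equal length")
    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / (n * n)
    if p_e == 1.0:  # both raters used a single identical category
        return 1.0
    return (p_o - p_e) / (1 - p_e)
```

Established implementations such as scikit-learn's `cohen_kappa_score` give the same result and may be preferable in a real analysis pipeline.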
Observational studies — cohort, case-control, and cross-sectional designs — form the backbone of evidence in many systematic reviews, particularly when randomized controlled trials are impractical or unavailable. The Newcastle-Ottawa Scale calculator implements the star-based scoring system developed by Wells et al. (2000) at the Universities of Newcastle (Australia) and Ottawa (Canada) to provide a standardized, reproducible method for appraising the methodological quality of these study designs. The tool lets reviewers assign up to 9 stars across three categories: Selection (up to 4 stars), Comparability (up to 2 stars), and Outcome for cohort studies or Exposure for case-control studies (up to 3 stars). Each star represents a specific methodological criterion that the study satisfies, and the total score serves as a summary indicator of study quality that can be reported alongside effect estimates in meta-analyses. For cross-sectional studies, which fall outside the original NOS scope, a modified version proposed by Modesti et al. (2016) adapts the star-based framework to address the unique methodological concerns of cross-sectional designs, including response rate and standardized outcome measurement.
The Selection category evaluates whether the exposed and non-exposed cohorts (or cases and controls) were drawn from comparable populations and whether exposure or outcome ascertainment methods were reliable. The Comparability category — the only one whose single item can award up to two stars — assesses whether the study controlled for the most important confounders, typically age, sex, and other variables central to the research question. The Outcome (or Exposure) category examines how the outcome was measured, whether follow-up was sufficiently long and complete, and whether objective assessment methods were used. The Cochrane Handbook for Systematic Reviews of Interventions (Higgins et al., 2023) acknowledges NOS as one of several validated tools for observational study quality, while PRISMA 2020 (Page et al., 2021) requires all systematic reviews to present results of quality or risk of bias assessments for every included study. It is worth noting that AMSTAR 2 (Shea et al., 2017) assesses the quality of systematic reviews themselves rather than primary studies — a distinct purpose that should not be confused with the NOS, which evaluates individual observational study methodology. Systematic review management platforms such as Covidence and DistillerSR now include built-in quality assessment modules that support NOS scoring alongside other appraisal tools, streamlining the workflow for review teams.
A common interpretation framework classifies studies scoring 7-9 stars as high quality, 4-6 as moderate quality, and 0-3 as low quality, although these thresholds are conventions rather than empirically validated cutoffs. Best practice involves using NOS scores in sensitivity analyses — for instance, restricting a meta-analysis to high-quality studies to determine whether the pooled effect remains robust. NOS scores can also serve as a covariate in meta-regression, testing whether study quality moderates the treatment effect across studies. In dose-response meta-analyses, quality-stratified pooling by NOS score is particularly important because poorly designed studies may distort the shape of the exposure-response curve. Similarly, funnel plot asymmetry can be stratified by NOS score to disentangle publication bias from small-study effects driven by lower methodological quality. Because inter-rater reliability for NOS has been reported as moderate in validation studies (Lo et al., 2014), dual independent scoring followed by consensus resolution is essential. Reviewers should calculate and report the initial agreement rate, ideally using Cohen's kappa as an inter-rater reliability measure, in their methods section.
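The quality-restricted sensitivity analysis described above can be sketched as a fixed-effect inverse-variance pooling, first over all studies and then restricted to those at or above the NOS threshold. The tuple layout and function names are assumptions for illustration; real analyses would typically use a random-effects model in dedicated software.

```python
def pool_fixed_effect(effects, variances):
    """Inverse-variance fixed-effect pooled estimate and its variance."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return pooled, 1.0 / sum(weights)

def sensitivity_by_nos(studies, threshold=7):
    """Pool all studies, then re-pool restricted to NOS total >= threshold.

    `studies` is a list of (effect, variance, nos_total) tuples
    (a hypothetical shape). Comparing the two pooled estimates shows
    whether the result is robust to excluding lower-quality studies.
    """
    all_pool, _ = pool_fixed_effect([s[0] for s in studies],
                                    [s[1] for s in studies])
    high = [s for s in studies if s[2] >= threshold]
    high_pool, _ = pool_fixed_effect([s[0] for s in high],
                                     [s[1] for s in high])
    return all_pool, high_pool
```

If the pooled estimate shifts materially when low- and moderate-quality studies are dropped, study quality is likely influencing the result and should be discussed explicitly.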
Choosing the right quality assessment tool depends on the study design and the level of detail required. NOS provides a relatively quick, star-based scoring system well-suited for large reviews with many observational studies. For a more granular, domain-based evaluation of non-randomized comparative studies, the ROBINS-I bias assessment for non-randomized studies uses signaling questions across seven domains with four judgment levels. Randomized trials should be appraised with the Cochrane RoB 2 risk of bias tool rather than NOS, as the two instruments address fundamentally different bias mechanisms. For reviews incorporating qualitative or prevalence data, the JBI critical appraisal checklists offer design-specific item sets from the Joanna Briggs Institute (Aromataris & Munn, 2020). Regardless of which instrument you choose, documenting the complete scoring rationale in your data extraction form ensures transparency and reproducibility across your review team.
The Newcastle-Ottawa Scale is a widely used tool for assessing the quality of non-randomized studies (cohort and case-control) in systematic reviews and meta-analyses. Developed by Wells et al., it assigns stars across three categories: Selection, Comparability, and Outcome (for cohort studies) or Exposure (for case-control studies). A study can earn a maximum of 9 stars, with higher scores indicating better methodological quality.
While there is no universally agreed threshold, a common interpretation uses three tiers: 7-9 stars indicates high quality, 4-6 stars indicates moderate quality, and 0-3 stars indicates low quality. Some systematic reviews use the NOS as a continuous variable in meta-regression to assess whether study quality moderates the pooled effect estimate. Always report the specific domain scores alongside the total.
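Using NOS as a continuous covariate in meta-regression amounts to a weighted regression of effect size on total stars, with inverse-variance weights. The sketch below is a minimal fixed-effect version written from the standard weighted-least-squares formulas; real meta-regressions typically use random-effects models (e.g. the `rma` function in R's metafor package).

```python
def wls_meta_regression(effects, variances, nos_scores):
    """Weighted least squares of effect size on NOS total (weights = 1/variance).

    Returns (intercept, slope); a slope far from zero suggests study
    quality moderates the effect. Minimal fixed-effect sketch only.
    """
    w = [1.0 / v for v in variances]
    sw = sum(w)
    xbar = sum(wi * x for wi, x in zip(w, nos_scores)) / sw
    ybar = sum(wi * y for wi, y in zip(w, effects)) / sw
    sxx = sum(wi * (x - xbar) ** 2 for wi, x in zip(w, nos_scores))
    sxy = sum(wi * (x - xbar) * (y - ybar)
              for wi, x, y in zip(w, nos_scores, effects))
    slope = sxy / sxx
    return ybar - slope * xbar, slope
```

A significance test on the slope (omitted here) would then indicate whether the quality-effect association is stronger than expected by chance.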
Both versions share the same structure with three categories and similar principles, but they differ in specific items. The cohort version evaluates outcome assessment (independent blind assessment, record linkage, follow-up adequacy), while the case-control version evaluates exposure ascertainment (secure records, structured interviews, same method for cases and controls, non-response rate). The Comparability category is identical in both versions.
The original NOS was designed for cohort and case-control studies. An adapted version for cross-sectional studies has been proposed by various authors, but it is not part of the validated original scale. For cross-sectional studies, consider using the JBI critical appraisal checklist for analytical cross-sectional studies instead, which was specifically designed for this study design.
Key limitations include: (1) the scoring is somewhat subjective, as different reviewers may interpret criteria differently; (2) the scale assigns equal weight to all items, even though some may be more important for specific research questions; (3) the threshold cutoffs for low/moderate/high quality are not evidence-based; (4) inter-rater reliability has been reported as moderate in validation studies. Despite these limitations, NOS remains one of the most commonly used quality assessment tools for observational studies in systematic reviews.
Scores of 7–9 stars are generally considered high quality, 4–6 moderate quality, and 0–3 low quality. However, these thresholds are conventions, not empirically validated cutoffs. Some reviews define their own thresholds based on the research question. Always report the individual domain scores alongside the total, as a study can score 7 overall while having a critical weakness in one domain.
The NOS was developed by Wells et al. at the Universities of Newcastle and Ottawa but has limited formal validation. Inter-rater reliability has been reported as moderate (Lo et al., 2014). Despite these limitations, NOS remains the most widely used quality assessment tool for observational studies in systematic reviews. The Cochrane Handbook acknowledges NOS but recommends ROBINS-I as a more rigorous alternative for non-randomized intervention studies.
The original NOS was designed for cohort and case-control studies only. A modified version for cross-sectional studies was proposed by Modesti et al. (2016), adapting the selection, comparability, and outcome domains. If your review includes cross-sectional studies, use the modified NOS or consider the JBI critical appraisal checklist for cross-sectional studies as an alternative.
Including randomized trials in your review? Use our RoB 2 tool for randomized trials to create traffic-light summary tables across 5 bias domains. For non-randomized comparative studies, the ROBINS-I assessment for non-randomized studies provides a more detailed 7-domain evaluation with signaling questions. For qualitative and mixed-methods studies, explore our JBI critical appraisal checklists covering multiple study designs.
Our methodologists can conduct dual-independent NOS scoring, ROBINS-I assessments, and comprehensive quality appraisal for your entire systematic review with full consensus documentation.