Quality assessment is not optional in a systematic review. Every included study must be evaluated for methodological rigor, and the tools you choose must match the study designs in your review. This guide provides a direct side-by-side comparison of the four tools you are most likely to encounter: RoB 2, the Newcastle-Ottawa Scale (NOS), JBI checklists, and GRADE.
All four are available as free online instruments: Risk of Bias Tool, Newcastle-Ottawa Scale, JBI Checklist, and GRADE Evidence Tool.
What These Tools Actually Measure
RoB 2 and NOS measure study-level risk of bias: how likely is it that the methods introduced systematic error?
JBI checklists measure methodological quality, especially for qualitative and mixed-methods research where the concept of bias operates differently.
GRADE measures evidence certainty at the outcome level across a body of studies. Risk of bias findings from the other tools feed into GRADE as one of five domains.
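Mixing these up is a category error: a study-level rating such as an NOS score says nothing on its own about how certain the evidence is for an outcome across studies. NOS is an input to GRADE's risk of bias domain, not a substitute for GRADE.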
RoB 2: The Standard for RCTs
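Five domains: the randomization process, deviations from intended interventions, missing outcome data, measurement of the outcome, and selection of the reported result. Each domain, and the study overall, is judged Low, Some concerns, or High. RoB 2 is the Cochrane standard for reviews of randomized trials.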
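One way to see how the five domain judgments roll up into a single study-level rating is the minimal sketch below. It assumes a simple worst-domain rule (any High gives High, otherwise any Some concerns gives Some concerns). That rule is an assumption for illustration: the RoB 2 guidance also lets reviewers rate a study High when several domains raise some concerns, which this sketch does not attempt to automate. The function name and dictionary keys are hypothetical.

```python
from typing import Dict

# Domain keys are illustrative labels for the five RoB 2 domains.
ROB2_DOMAINS = (
    "randomization",
    "deviations_from_interventions",
    "missing_data",
    "outcome_measurement",
    "selection_of_reported_results",
)
LEVELS = ("Low", "Some concerns", "High")


def rob2_overall(judgments: Dict[str, str]) -> str:
    """Return an overall judgment using a worst-domain rule (an assumption)."""
    missing = [d for d in ROB2_DOMAINS if d not in judgments]
    if missing:
        raise ValueError(f"Missing domain judgments: {missing}")
    values = [judgments[d] for d in ROB2_DOMAINS]
    invalid = [v for v in values if v not in LEVELS]
    if invalid:
        raise ValueError(f"Unrecognized judgments: {invalid}")
    if "High" in values:
        return "High"
    if "Some concerns" in values:
        return "Some concerns"
    return "Low"


example = {d: "Low" for d in ROB2_DOMAINS}
example["missing_data"] = "Some concerns"
print(rob2_overall(example))
```

Treat the result as a starting point for discussion between reviewers, not as a replacement for their judgment.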
Use our free Risk of Bias Tool for RoB 2 assessments.
Newcastle-Ottawa Scale: Observational Studies
Star-rating system across three categories: Selection, Comparability, and Outcome (for cohort studies) or Exposure (for case-control studies), with a maximum of 9 stars. A commonly used convention treats 7 or more stars as high quality, 5-6 as moderate, and below 5 as low.
Key limitation: the star system creates a false sense of precision. NOS is best used for domain-level comparison rather than total-score ranking.
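The sketch below shows how a star total and its threshold label can be reported alongside the per-category breakdown, so the domain-level picture is not lost. The per-category caps (4, 2, and 3 stars) come from the NOS form itself, not from the text above, and the function and key names are hypothetical.

```python
from typing import Dict

# Maximum stars per NOS category (4 + 2 + 3 = 9).
NOS_MAX_STARS = {"selection": 4, "comparability": 2, "outcome_or_exposure": 3}


def nos_summary(stars: Dict[str, int]) -> dict:
    """Summarize one study's NOS stars by category, total, and threshold label."""
    for category, cap in NOS_MAX_STARS.items():
        value = stars[category]
        if not 0 <= value <= cap:
            raise ValueError(f"{category} must be between 0 and {cap}, got {value}")
    total = sum(stars[c] for c in NOS_MAX_STARS)
    # Conventional thresholds: 7+ high, 5-6 moderate, below 5 low.
    if total >= 7:
        label = "high"
    elif total >= 5:
        label = "moderate"
    else:
        label = "low"
    by_category = {c: f"{stars[c]}/{cap}" for c, cap in NOS_MAX_STARS.items()}
    return {"by_category": by_category, "total": total, "label": label}


print(nos_summary({"selection": 3, "comparability": 1, "outcome_or_exposure": 2}))
```

Two studies with the same total can have very different category profiles, which is why the breakdown is returned next to the total.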
Our Newcastle-Ottawa Scale tool calculates star ratings automatically.
JBI Checklists: Qualitative and Mixed-Methods
JBI covers study designs NOS and RoB 2 do not: qualitative research, cross-sectional studies, case series, prevalence studies. Each item is rated Yes, No, Unclear, or Not applicable with no aggregate score.
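Because JBI items have no aggregate score, results are usually easier to read item by item. The following is a minimal sketch, assuming appraisals are stored as a nested dictionary of study, item, and response; the item labels ("Q1", "Q2") and study names are placeholders, not actual JBI item wording.

```python
from collections import Counter
from typing import Dict

JBI_RESPONSES = ("Yes", "No", "Unclear", "Not applicable")


def tally_by_item(appraisals: Dict[str, Dict[str, str]]) -> Dict[str, Counter]:
    """Count responses per checklist item across studies, with no overall score."""
    tallies: Dict[str, Counter] = {}
    for study, items in appraisals.items():
        for item, response in items.items():
            if response not in JBI_RESPONSES:
                raise ValueError(f"{study}, {item}: unrecognized response {response!r}")
            tallies.setdefault(item, Counter())[response] += 1
    return tallies


# Hypothetical appraisals for three studies on a two-item checklist.
appraisals = {
    "Study A": {"Q1": "Yes", "Q2": "Unclear"},
    "Study B": {"Q1": "Yes", "Q2": "No"},
    "Study C": {"Q1": "No", "Q2": "Not applicable"},
}
for item, counts in tally_by_item(appraisals).items():
    print(item, dict(counts))
```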
Our JBI Checklist tool covers all major JBI checklist types.
GRADE: Evidence Certainty Across Studies
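Five downgrading domains: risk of bias, inconsistency, indirectness, imprecision, publication bias. Three upgrading domains for observational evidence: large magnitude of effect, dose-response gradient, and plausible residual confounding that would work against the observed effect.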
Certainty levels: High, Moderate, Low, Very low. RCT evidence starts at High; observational starts at Low.
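The bookkeeping behind a GRADE rating can be sketched as follows: start from the level set by study design, subtract one or two levels per downgrading domain, add any upgrades for observational evidence, and clamp to the four-level scale. This is an illustration only. GRADE is a structured judgment rather than arithmetic, and the function name, the 0/1/2 step encoding, and the single `upgrades` count are assumptions made for the example.

```python
from typing import Dict

CERTAINTY_LEVELS = ("Very low", "Low", "Moderate", "High")
DOWNGRADE_DOMAINS = {
    "risk_of_bias",
    "inconsistency",
    "indirectness",
    "imprecision",
    "publication_bias",
}
# Starting index into CERTAINTY_LEVELS: RCTs start High, observational Low.
START_LEVEL = {"rct": 3, "observational": 1}


def grade_certainty(design: str, downgrades: Dict[str, int], upgrades: int = 0) -> str:
    """Return a certainty label for one outcome (a sketch, not a GRADE algorithm)."""
    if design not in START_LEVEL:
        raise ValueError(f"Unknown design: {design!r}")
    unknown = set(downgrades) - DOWNGRADE_DOMAINS
    if unknown:
        raise ValueError(f"Unknown domains: {sorted(unknown)}")
    # Assumed encoding: 0 = no concern, 1 = serious, 2 = very serious.
    if any(step not in (0, 1, 2) for step in downgrades.values()):
        raise ValueError("Each downgrade must be 0, 1, or 2 levels")
    if upgrades < 0:
        raise ValueError("upgrades cannot be negative")
    level = START_LEVEL[design] - sum(downgrades.values())
    if design == "observational":
        level += upgrades  # upgrades are ignored for RCT evidence here
    level = max(0, min(level, len(CERTAINTY_LEVELS) - 1))
    return CERTAINTY_LEVELS[level]


print(grade_certainty("rct", {"risk_of_bias": 1, "imprecision": 1}))
```

In practice the reasons for each step belong in the footnotes of the Summary of Findings table.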
Use our GRADE Evidence Tool for GRADE assessments and Summary of Findings tables.
Decision Flowchart
- RCT --> RoB 2
- Cohort or case-control --> NOS or ROBINS-I
- Cross-sectional, qualitative, or case series --> JBI
- After individual assessments --> GRADE for each pooled outcome
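The flowchart can also be expressed as a small lookup, which is handy when tagging studies in a data extraction sheet. This is a minimal sketch; the design labels and the `select_tools` name are hypothetical, and GRADE is left out because it applies per outcome after the study-level assessments are done.

```python
from typing import List

# Study design -> candidate study-level appraisal tools, following the flowchart.
TOOL_BY_DESIGN = {
    "rct": ["RoB 2"],
    "cohort": ["NOS", "ROBINS-I"],
    "case-control": ["NOS", "ROBINS-I"],
    "cross-sectional": ["JBI"],
    "qualitative": ["JBI"],
    "case series": ["JBI"],
}


def select_tools(design: str) -> List[str]:
    """Return the candidate tools for a study design label."""
    key = design.strip().lower()
    if key not in TOOL_BY_DESIGN:
        raise ValueError(f"No tool mapped for design: {design!r}")
    return TOOL_BY_DESIGN[key]


for design in ("RCT", "Cohort", "Qualitative"):
    print(design, "-->", " or ".join(select_tools(design)))
```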
For non-randomized studies where you need more granular assessment, our ROBINS-I Tool offers the seven-domain framework.
Key Takeaways
- RoB 2 is for randomized controlled trials only. It is the Cochrane standard.
- NOS is for cohort and case-control studies. Aggregate scores should be interpreted cautiously.
- JBI checklists cover study designs NOS and RoB 2 do not, including qualitative research.
- GRADE operates at the outcome level across all studies, not at the individual study level.
- Never use GRADE as a substitute for study-level bias assessment. They measure different things.
- Use all four tools as free online instruments without software installation.
FAQ
Can I use the same tool for both RCTs and observational studies?
No. RoB 2 is designed specifically for RCTs. Using RoB 2 on a cohort study is a methodological error that reviewers will flag. Use NOS, ROBINS-I, or JBI for observational studies.
Do I need to complete GRADE if my review has no meta-analysis?
Yes. GRADE can be applied to reviews that synthesize results narratively. The Cochrane Handbook recommends completing GRADE and acknowledging that imprecision and inconsistency are harder to assess without quantitative synthesis.
What is the difference between ROBINS-I and NOS?
ROBINS-I is more structured: seven domains, each judged on a four-level scale (Low, Moderate, Serious, Critical). NOS uses a star-rating system across three broad categories. ROBINS-I is preferred for Cochrane reviews; NOS is widely used in non-Cochrane journals.
How do I translate NOS star ratings into a GRADE risk of bias judgment?
There is no validated formula. Report the distribution of NOS scores and make a narrative judgment. Describe your reasoning in the GRADE footnotes.
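Reporting the distribution can be as simple as the sketch below, which summarizes total NOS scores across studies. The study names and scores are made up for illustration, and the function deliberately stops short of producing a GRADE judgment, since that step is narrative.

```python
from collections import Counter
from statistics import median
from typing import Dict


def nos_distribution(scores: Dict[str, int]) -> dict:
    """Summarize total NOS scores (0-9) across studies for narrative reporting."""
    for study, score in scores.items():
        if not 0 <= score <= 9:
            raise ValueError(f"{study}: NOS total must be between 0 and 9, got {score}")
    values = list(scores.values())
    counts = Counter(values)
    return {
        "n_studies": len(values),
        "median": median(values),
        "range": (min(values), max(values)),
        "counts": dict(sorted(counts.items())),
    }


# Hypothetical scores for four included studies.
scores = {"Study A": 8, "Study B": 6, "Study C": 7, "Study D": 4}
print(nos_distribution(scores))
```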
Should I include quality assessment results in a table or figure?
Yes to both. A table provides transparency for reviewers. A summary bar chart provides a visual overview. GRADE findings should appear in a Summary of Findings table.