Risk of bias in a systematic review is a domain-based evaluation of whether systematic errors in study design, conduct, or reporting may distort results. Unlike quality scoring, risk of bias assessment examines specific domains (selection, performance, detection, attrition, and reporting bias) using validated tools such as RoB 2 (for RCTs), ROBINS-I (for non-randomized studies), and the Newcastle-Ottawa Scale (for observational studies).
Every systematic review must include a formal risk of bias assessment of its included studies. The Cochrane Handbook (Higgins et al., 2023, Chapter 8) mandates domain-based evaluation because a single high-risk domain can invalidate an otherwise well-conducted study. Whether you are reviewing randomized controlled trials, cohort studies, or qualitative research, selecting the correct bias assessment tool and applying it consistently determines whether your review's conclusions can be trusted. Need to assess risk of bias? Use our free online RoB tool for randomized controlled trials or ROBINS-I risk of bias tool for non-randomized studies.
What Is Risk of Bias in a Systematic Review?
Risk of bias is not the same as reporting quality or methodological quality. Reporting quality evaluates whether a study describes what was done (assessed by tools like CONSORT or STROBE). Methodological quality is a broader, less precise term that can encompass everything from study design to statistical analysis. Risk of bias specifically asks: did systematic errors in the study's design, conduct, or analysis produce results that deviate from the truth?
The Cochrane Handbook identifies five core bias domains that apply across study designs:
- Selection bias: systematic differences between groups at baseline, caused by flawed randomization or allocation concealment
- Performance bias: systematic differences in care provided to groups, typically caused by lack of blinding of participants and personnel
- Detection bias: systematic differences in how outcomes are measured, caused by lack of blinding of outcome assessors
- Attrition bias: systematic differences in withdrawals and dropouts between groups, leading to incomplete outcome data
- Reporting bias: systematic differences between reported and unreported results, including selective outcome reporting
Each bias domain is assessed independently because problems in one domain can completely undermine a study's findings regardless of how well other domains were handled. Risk of bias assessment is a core component of a systematic review. It is not optional; it is foundational to the integrity of the evidence synthesis.
Why Risk of Bias Assessment Matters
Risk of bias assessment directly determines whether your systematic review's conclusions are credible. Without it, readers and guideline developers cannot judge whether the studies supporting your findings are trustworthy.
Three sources make formal risk of bias evaluation a requirement in practice. The Cochrane Handbook mandates it for all Cochrane reviews and provides detailed guidance on tool selection and application. PRISMA 2020 (Page et al., 2021) requires authors to describe the risk of bias methods used and present results for each included study. Most peer-reviewed journals now reject systematic reviews that lack risk of bias assessment, making it a practical requirement for publication.
Risk of bias is the first domain evaluated in the GRADE framework for assessing certainty of evidence. When studies contributing to a body of evidence have high risk of bias, GRADE downgrades the certainty of evidence, potentially moving it from "high" to "moderate" or lower. This directly affects whether clinical guidelines classify a recommendation as strong or conditional. Because risk of bias is an input to the GRADE assessment, your domain-level judgments have consequences far beyond the systematic review itself.
In our systematic reviews, we have found that calibration sessions between reviewers before full assessment reduce disagreements by approximately 40%, making the process faster and more reliable.
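One way to track reviewer disagreement before and after a calibration session is a chance-corrected agreement statistic. The sketch below computes Cohen's kappa for two reviewers' judgments on the same set of items. It is an illustration only: the text above does not say which agreement measure was used, and the function name `cohens_kappa` and the sample judgments are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two reviewers rating the same items.

    Each argument is a list of categorical judgments (for example
    "low", "some concerns", "high"), in the same item order.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both reviewers must rate the same non-empty set of items")

    n = len(rater_a)
    # Proportion of items on which the two reviewers agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each reviewer's category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both reviewers used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical domain-level judgments from two reviewers on six studies.
reviewer_1 = ["low", "low", "high", "some concerns", "low", "high"]
reviewer_2 = ["low", "some concerns", "high", "some concerns", "low", "low"]
print(round(cohens_kappa(reviewer_1, reviewer_2), 3))  # 0.478
```

Computing the statistic on a pilot batch, holding the calibration session, and recomputing on a second batch gives a simple before-and-after comparison.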
Choosing the Right Risk of Bias Tool
Which risk of bias tool should I use? Use RoB 2 for randomized controlled trials, ROBINS-I for non-randomized studies of interventions, the Newcastle-Ottawa Scale for observational cohort and case-control studies, and JBI checklists for qualitative or other study types.
The decision depends entirely on the study design of your included studies. Many systematic reviews include multiple study designs, requiring you to use more than one tool. The table below maps each tool to its intended use:
| Tool | Study Design | Domains Assessed | Scoring Method | Developer |
|---|---|---|---|---|
| RoB 2 | Randomized controlled trials | 5 domains | Low risk / Some concerns / High risk | Cochrane (Sterne et al., 2019) |
| ROBINS-I | Non-randomized studies of interventions | 7 domains | Low / Moderate / Serious / Critical / No information | Cochrane (Sterne et al., 2016) |
| Newcastle-Ottawa Scale | Observational (cohort, case-control) | 3 categories (8 items) | Star-based (max 9 stars) | Wells et al. |
| JBI Checklist | Qualitative, cross-sectional, prevalence, and others | Varies by checklist | Yes / No / Unclear / Not applicable | JBI (Joanna Briggs Institute) |
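For a review that includes several study designs, the mapping in the table can be written down as a small lookup. This is a minimal sketch: the design labels used as dictionary keys and the function name `tools_for_review` are illustrative assumptions, not an official taxonomy.

```python
# Study design -> tool, following the table above.
# The design labels are illustrative keys, not a standard vocabulary.
TOOL_BY_DESIGN = {
    "randomized controlled trial": "RoB 2",
    "non-randomized study of interventions": "ROBINS-I",
    "cohort": "Newcastle-Ottawa Scale",
    "case-control": "Newcastle-Ottawa Scale",
    "qualitative": "JBI Checklist",
    "cross-sectional": "JBI Checklist",
    "prevalence": "JBI Checklist",
}


def tools_for_review(study_designs):
    """Return the distinct tools needed for a list of study designs,
    in the order the designs first appear."""
    tools = []
    for design in study_designs:
        key = design.strip().lower()
        if key not in TOOL_BY_DESIGN:
            raise KeyError(f"No tool mapped for study design: {design!r}")
        tool = TOOL_BY_DESIGN[key]
        if tool not in tools:
            tools.append(tool)
    return tools


print(tools_for_review(["Cohort", "randomized controlled trial", "case-control"]))
# ['Newcastle-Ottawa Scale', 'RoB 2']
```

Raising an error for an unmapped design, rather than falling back to a default tool, keeps the choice of tool an explicit decision by the review team.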
RoB 2 -- For Randomized Controlled Trials
RoB 2 (revised Cochrane risk of bias tool) is the standard for assessing randomized controlled trials. It replaced the original Cochrane RoB tool in 2019 and provides a more structured, signaling-question-based approach. RoB 2 assesses risk of bias in randomized controlled trials across five domains, each evaluated through a series of signaling questions that guide the assessor to a domain-level judgment.
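The step from five domain-level judgments to an overall judgment can be sketched as a "worst domain wins" rule. This is a simplification under stated assumptions: the published RoB 2 guidance also allows an overall "high" when several domains raise some concerns, which is a reviewer judgment this sketch does not model. The labels `D1` to `D5` and the function name `rob2_overall` are placeholders.

```python
ROB2_LEVELS = ("low", "some concerns", "high")


def rob2_overall(domain_judgments):
    """Simplified overall judgment from five RoB 2 domain judgments.

    domain_judgments maps a domain label to one of ROB2_LEVELS.
    Assumption: the overall judgment equals the worst domain judgment.
    """
    if len(domain_judgments) != 5:
        raise ValueError("RoB 2 expects exactly five domain judgments")
    for domain, judgment in domain_judgments.items():
        if judgment not in ROB2_LEVELS:
            raise ValueError(f"Unknown judgment for {domain}: {judgment!r}")

    values = list(domain_judgments.values())
    if "high" in values:
        return "high"
    if "some concerns" in values:
        return "some concerns"
    return "low"


# Hypothetical trial: four domains at low risk, one with some concerns.
trial = {"D1": "low", "D2": "low", "D3": "some concerns", "D4": "low", "D5": "low"}
print(rob2_overall(trial))  # some concerns
```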
ROBINS-I -- For Non-Randomized Studies of Interventions
ROBINS-I (Risk Of Bias In Non-randomized Studies of Interventions) was developed for studies that compare health outcomes across groups but lack randomization. ROBINS-I assesses risk of bias in non-randomized studies by evaluating seven domains organized into pre-intervention, at-intervention, and post-intervention categories. Because non-randomized studies lack the built-in protection against confounding that randomization provides, ROBINS-I is necessarily more complex than RoB 2 and demands careful consideration of each study's analytical approach to confounding control. For a full walkthrough, see our ROBINS-I assessment guide.
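As a rough illustration of how seven domain judgments combine, the sketch below takes the most severe domain judgment as the overall judgment and reports "no information" only when nothing serious or critical has been identified. This is an assumption-laden simplification of the ROBINS-I guidance, and the function name `robins_i_overall` is hypothetical.

```python
# Ordered from least to most severe.
SEVERITY = ["low", "moderate", "serious", "critical"]


def robins_i_overall(domain_judgments):
    """Simplified overall ROBINS-I judgment from seven domain judgments.

    Assumption: the overall judgment is the most severe domain judgment;
    "no information" is returned only if no domain is serious or critical.
    """
    allowed = set(SEVERITY) | {"no information"}
    if len(domain_judgments) != 7:
        raise ValueError("ROBINS-I expects exactly seven domain judgments")
    for judgment in domain_judgments:
        if judgment not in allowed:
            raise ValueError(f"Unknown judgment: {judgment!r}")

    rated = [j for j in domain_judgments if j in SEVERITY]
    worst = max(rated, key=SEVERITY.index) if rated else None

    if worst in ("serious", "critical"):
        return worst
    if "no information" in domain_judgments:
        return "no information"
    return worst


# Hypothetical cohort study with one serious domain.
print(robins_i_overall(["low", "moderate", "low", "serious", "low", "low", "moderate"]))
# serious
```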
Newcastle-Ottawa Scale -- For Observational Studies
The Newcastle-Ottawa Scale is widely used for cohort and case-control studies. It is simpler than ROBINS-I and uses a star-based scoring system rather than domain-level judgments. NOS is particularly common in public health and epidemiology reviews where observational designs predominate. Its ease of use makes it accessible to reviewers without extensive methodological training, though this simplicity comes at the cost of less detailed domain-level insight compared to ROBINS-I.
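The star-based scoring can be sketched as a bounded tally across the three NOS categories. The per-category maxima used here (4 for selection, 2 for comparability, 3 for outcome or exposure) come from the NOS manual rather than from the text above, and the function name `nos_total` is hypothetical.

```python
# Maximum stars per NOS category; "outcome" stands in for "exposure"
# when the study is case-control.
NOS_MAX_STARS = {"selection": 4, "comparability": 2, "outcome": 3}


def nos_total(stars):
    """Validate per-category stars and return the total (0 to 9)."""
    if set(stars) != set(NOS_MAX_STARS):
        raise ValueError(f"Expected categories: {sorted(NOS_MAX_STARS)}")
    for category, maximum in NOS_MAX_STARS.items():
        value = stars[category]
        if not isinstance(value, int) or not 0 <= value <= maximum:
            raise ValueError(f"{category} must be an integer from 0 to {maximum}")
    return sum(stars.values())


# Hypothetical cohort study.
print(nos_total({"selection": 3, "comparability": 2, "outcome": 2}))  # 7
```

Reporting the per-category stars alongside the total preserves some of the domain-level detail that a single summed score hides.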
JBI Checklists -- For Qualitative and Other Study Types
JBI (Joanna Briggs Institute) provides critical appraisal checklists for study types that RoB 2, ROBINS-I, and NOS do not cover. These include qualitative studies, cross-sectional studies, prevalence studies, case reports, and diagnostic accuracy studies. Each checklist is tailored to the methodological features of its target study design. JBI checklists use a Yes/No/Unclear/Not applicable format and are freely available from the JBI website, making them the most versatile option for mixed-methods systematic reviews.
Assessing observational studies? Use our Newcastle-Ottawa Scale calculator or JBI critical appraisal tool.