ROBINS-I assessment is the standard approach for evaluating risk of bias in non-randomized studies of interventions. Developed by Sterne and colleagues in 2016 and endorsed by Cochrane, the ROBINS-I tool provides a structured framework for judging whether a cohort study, case-control study, or other non-randomized design produces results that can be trusted for clinical and policy decisions. If your systematic review includes any non-randomized evidence, ROBINS-I is the tool you need.
ROBINS-I stands for Risk Of Bias In Non-randomized Studies of Interventions. Unlike simple quality checklists that assign points, ROBINS-I asks assessors to reason through seven specific bias domains and reach a judgement for each one. The overall risk of bias rating for a study is determined by its worst domain-level judgement. This makes ROBINS-I more rigorous and more demanding than older tools, but it also produces more transparent and reproducible assessments.
What Is ROBINS-I?
ROBINS-I is a Cochrane tool for assessing bias in cohort, case-control, and other non-randomized study designs (Sterne et al., 2016). It evaluates seven domains (confounding, selection of participants, classification of interventions, deviations from intended interventions, missing data, measurement of outcomes, and selection of the reported result), benchmarked against a hypothetical target trial.
The tool was published in the BMJ by Sterne, Hernan, Reeves, Savovic, Berkman, Viswanathan, and colleagues as part of an international effort to bring the same rigor to non-randomized study assessment that the Cochrane Risk of Bias tool (now RoB 2) brought to randomized controlled trials. Before ROBINS-I, reviewers relied on ad-hoc scales and checklists that lacked a coherent theoretical foundation. ROBINS-I changed that by grounding every judgement in a comparison against an ideal randomized experiment.
The tool applies to any study that compares two or more intervention groups using non-randomized data. This includes prospective and retrospective cohort studies, case-control studies, controlled before-after designs, interrupted time series with a comparison group, and cross-sectional studies comparing exposed and unexposed populations. It does not apply to single-arm studies, uncontrolled case series, or diagnostic accuracy studies.
ROBINS-I is recommended in the Cochrane Handbook Chapter 25 (Higgins et al., 2023) as the preferred tool whenever a systematic review includes non-randomized studies of interventions. Its results feed directly into GRADE assessments, where risk of bias is one of five domains that determine the overall certainty of evidence.
The Target Trial Framework
The most important concept in ROBINS-I is the target trial framework. Before assessing any study, you must specify the hypothetical randomized controlled trial that would ideally answer your review question. Every domain-level judgement then asks: how does this study deviate from that ideal experiment?
The target trial specification includes the eligible population, the experimental and comparator interventions, the randomization and allocation procedure, the follow-up period, the primary outcome and how it would be measured, and the analysis plan. You do not need a detailed protocol for a trial that will never run. You need enough specificity to anchor your bias judgements.
This framework improves upon ad-hoc checklists because it forces assessors to think causally. Rather than asking "did the study control for confounders?" in the abstract, ROBINS-I asks "what confounders would be balanced by randomization in the target trial, and did the study adequately adjust for them?" The target trial makes the standard of comparison explicit, reducing subjective disagreement between assessors.
You specify one target trial for your entire review, not one per included study. Every study is then evaluated against the same benchmark. This ensures consistency and allows meaningful comparison of bias judgements across studies. Write the target trial specification into your systematic review protocol before you begin data extraction.
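As a concrete sketch, the specification can be recorded as structured fields in the protocol. The field names below mirror the components listed above; every value is hypothetical and not drawn from any real review.

```python
# Hypothetical target trial specification for a review protocol.
# Field names follow the components described above; all concrete
# values are illustrative.
target_trial = {
    "eligible_population": "Adults aged 40-75 with newly diagnosed type 2 diabetes",
    "experimental_intervention": "Drug X, 10 mg daily",
    "comparator_intervention": "Usual care without Drug X",
    "assignment": "1:1 individual randomization with concealed allocation",
    "follow_up": "24 months from intervention start",
    "primary_outcome": "All-cause mortality at 24 months, from registry linkage",
    "analysis_plan": "Intention-to-treat; hazard ratio from a Cox model",
}

# Every included study is then assessed against this single benchmark.
```

Writing the specification in this explicit, field-by-field form makes it easy to reference each component during domain-level assessment.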
The target trial concept comes from the work of Hernan and Robins on causal inference. It recognizes that non-randomized studies are attempting to emulate a trial that was never conducted. The degree to which a study succeeds in that emulation, or fails, determines its risk of bias.
The 7 ROBINS-I Domains
ROBINS-I evaluates seven distinct sources of bias. Each domain is assessed through a series of signalling questions that guide the assessor toward a domain-level judgement. The domains are ordered to follow the chronological structure of a study, from the conditions present before intervention assignment through to the reporting of results.
Domain 1: Bias Due to Confounding
Confounding occurs when a prognostic factor that predicts the outcome also influences which intervention a participant receives. In a randomized trial, randomization balances measured and unmeasured confounders across groups. In a non-randomized study, this balance is absent by default.
The signalling questions ask whether the study identified relevant confounders, whether those confounders were measured validly, and whether appropriate analytical methods (such as regression adjustment, propensity score matching, or instrumental variable analysis) were used to control for them. A study that provides no adjustment for key confounders will typically receive a serious or critical risk of bias judgement for this domain.
Confounding is the domain where non-randomized studies are most vulnerable. Even well-designed observational studies with rigorous adjustment can only control for measured confounders; residual and unmeasured confounding remain threats. For this reason, a low risk of bias rating for confounding requires strong justification, for example a design in which the intervention is assigned at a clear threshold (creating a natural experiment) or an analysis that uses an instrumental variable approach.
Domain 2: Bias in Selection of Participants
Selection bias in ROBINS-I refers to bias that arises when participant inclusion in the study is related to both the intervention and the outcome. This is not the same as external validity or generalizability. It concerns whether the selection process created a distorted comparison between intervention groups.
For example, if a cohort study excludes participants who experienced the outcome before the study start date but does so differentially between groups, the resulting comparison is biased. Similarly, if participants are selected into the study based on their intervention status and their outcome status simultaneously, the analysis will produce misleading estimates.
The signalling questions ask whether the start of follow-up coincided with intervention assignment, whether selection was related to the intervention and outcome, and whether appropriate adjustments were made for any selection-related biases.
Domain 3: Bias in Classification of Interventions
Misclassification of intervention status occurs when the study incorrectly assigns participants to intervention groups. In a well-conducted RCT, intervention assignment is unambiguous. In non-randomized studies, intervention status may be determined from medical records, self-report, or administrative databases, each of which introduces the possibility of error.
This domain asks whether intervention status was well-defined and whether classification was based on information collected at or before the time of intervention. Differential misclassification, where errors in classification are related to the outcome, is particularly problematic because it can bias the effect estimate in either direction.
Domain 4: Bias Due to Deviations From Intended Interventions
Once participants are assigned to an intervention group, they may not receive or adhere to that intervention as intended. Deviations can include switching between groups, co-interventions that differ between groups, or non-adherence to the assigned treatment protocol.
ROBINS-I distinguishes between two types of effect: the effect of assignment to intervention (analogous to intention-to-treat analysis in a trial) and the effect of starting and adhering to intervention (analogous to per-protocol analysis). The signalling questions differ depending on which effect your review targets.
For the effect of assignment, the key concern is whether deviations from intended interventions occurred and whether they were balanced across groups. For the effect of adherence, the questions ask whether appropriate statistical methods (such as inverse probability weighting) were used to account for deviations.
Domain 5: Bias Due to Missing Data
Missing data can bias results when the proportion of participants with missing outcome data is substantial and when missingness is related to the true value of the outcome. In a well-conducted trial, missing data are minimized through active follow-up and reported transparently. In non-randomized studies, missing data may be more prevalent and less well-documented.
The signalling questions ask whether outcome data were available for all or nearly all participants, whether the proportion of missing data was similar across intervention groups, and whether the study used appropriate methods to handle missing data (such as multiple imputation). A study that excludes 30% of participants without explanation or analysis of the impact will typically receive a serious risk of bias judgement for this domain.
Domain 6: Bias in Measurement of Outcomes
Bias in outcome measurement occurs when the method of measuring the outcome is influenced by knowledge of the intervention received. In a blinded RCT, outcome assessors do not know which treatment participants received. In most non-randomized studies, blinding is absent.
The signalling questions ask whether the outcome measure was objective or subjective, whether assessors were blinded to intervention status, and whether measurement methods were comparable across groups. Objective outcomes (such as mortality or laboratory values) are less susceptible to measurement bias than subjective outcomes (such as pain scales or functional assessments), even without blinding.
Domain 7: Bias in Selection of the Reported Result
Reporting bias at the study level occurs when the reported result is selected from among multiple analyses, outcomes, or subgroups based on the direction or magnitude of the effect. A study that measures an outcome at multiple time points but reports only the most favorable one introduces selection bias in the reported result.
The signalling questions ask whether the outcome and analysis plan were pre-specified, whether multiple outcome measures or analytical methods were used, and whether there is evidence that the reported result was selected on the basis of the findings. Pre-registration of the study protocol provides protection against this domain, though it is less common in non-randomized research than in randomized trials.
Judgement Categories and the Overall ROBINS-I Assessment Rating
Each domain receives one of five judgement categories. These categories are more granular than the "low, some concerns, high" scale used in RoB 2 for randomized trials, reflecting the additional complexity of assessing non-randomized evidence.
| Judgement | Meaning | Implication |
|---|---|---|
| Low risk of bias | Comparable to a well-conducted RCT for this domain | No concern about bias from this source |
| Moderate risk of bias | Sound for a non-randomized study but cannot be considered comparable to a well-conducted RCT | Some concern, but unlikely to seriously alter results |
| Serious risk of bias | Important problems exist in this domain | Results may be meaningfully biased |
| Critical risk of bias | The study is too problematic in this domain to provide useful evidence | Results should not be used for the comparison of interest |
| No Information | Insufficient reporting to make a judgement | Cannot determine whether bias exists |
The overall risk of bias judgement for a study is determined by the most severe domain-level judgement. If any single domain is rated as critical, the overall judgement is critical. If the worst domain-level rating is serious, the overall rating is serious. This conservative approach reflects the principle that a chain is only as strong as its weakest link: a study with excellent outcome measurement but no confounding adjustment still produces biased estimates.
When multiple domains receive a moderate risk of bias rating, the overall judgement may be elevated to serious if the combined effect of moderate concerns across several domains is judged to substantially lower confidence in the result. This requires assessor judgement and should be documented transparently.
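The worst-domain rule, including the handling of No Information and the optional elevation of several Moderate ratings, can be sketched as a short function. The severity ordering comes from the table above; the function name and the elevation threshold are illustrative choices, since ROBINS-I leaves that elevation to assessor judgement.

```python
# Severity ordering from least to most severe, per the judgement table above.
SEVERITY = ["Low", "Moderate", "Serious", "Critical"]

def overall_robins_i(domain_judgements, moderate_elevation_threshold=3):
    """Derive an overall ROBINS-I rating from the seven domain judgements.

    Implements the worst-domain rule. Elevating multiple Moderate ratings
    to Serious is an assessor judgement; the threshold here is illustrative.
    """
    informative = [j for j in domain_judgements if j != "No Information"]
    worst_idx = max((SEVERITY.index(j) for j in informative), default=-1)
    # A "No Information" domain blocks a favourable overall rating, but a
    # Serious or Critical domain still dominates the overall judgement.
    if len(informative) < len(domain_judgements) and worst_idx < SEVERITY.index("Serious"):
        return "No Information"
    worst = SEVERITY[worst_idx]
    if worst == "Moderate" and informative.count("Moderate") >= moderate_elevation_threshold:
        return "Serious"  # combined moderate concerns judged substantial
    return worst
```

For example, a study rated Low in six domains and Serious in one receives an overall Serious rating under this rule.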
It is important to distinguish between "No Information" and "Low risk of bias." Studies that fail to report relevant details should not receive favorable ratings by default. A study that does not describe its confounding adjustment strategy receives a "No Information" rating for the confounding domain, not a "Low risk" rating. Treating incomplete reporting as low risk inflates confidence in the evidence.
ROBINS-I vs RoB 2: When to Use Which
The choice between ROBINS-I and RoB 2 depends entirely on the study design, not on reviewer preference. RoB 2 is designed for individually randomized trials and cluster-randomized trials. ROBINS-I is designed for non-randomized studies that compare two or more intervention groups. Using the wrong tool produces meaningless results.
| Feature | ROBINS-I | RoB 2 |
|---|---|---|
| Study designs | Cohort, case-control, cross-sectional, controlled before-after | Individually randomized, cluster-randomized trials |
| Number of domains | 7 | 5 |
| Judgement categories | Low, Moderate, Serious, Critical, NI | Low, Some Concerns, High |
| Target trial required | Yes | No (randomization provides the benchmark) |
| Confounding domain | Yes (Domain 1) | Not applicable |
| Assessment time per study | 20–40 minutes | 15–25 minutes |
| Overall judgement rule | Worst domain determines overall | Algorithm-based overall judgement |
ROBINS-I is inherently more demanding than RoB 2 because non-randomized studies face threats that randomization eliminates. The confounding domain alone adds a layer of complexity that does not exist in RoB 2. Additionally, the five-category judgement scale in ROBINS-I (compared to three in RoB 2) requires more nuanced reasoning.
When a systematic review includes both randomized and non-randomized studies, you should use RoB 2 for the RCTs and ROBINS-I for the non-randomized studies. The results can then be synthesized using a framework like GRADE, where both tools feed into the risk of bias domain of the evidence certainty assessment. For detailed guidance on RoB 2, see our RoB 2 assessment guide. For a broader overview of risk of bias across study designs, see our complete risk of bias guide.
Presenting ROBINS-I Results
Clear presentation of ROBINS-I results is essential for transparency and reproducibility. Three approaches are commonly used: traffic light plots, summary bar charts, and narrative synthesis tables.
Traffic light plots display the domain-level judgement for each study in a color-coded matrix. Green indicates low risk, yellow indicates moderate risk, red indicates serious risk, and dark red indicates critical risk. Grey indicates no information. The robvis R package (McGuinness & Higgins, 2020) generates publication-ready traffic light plots from a structured dataset. These plots provide an immediate visual overview of where the evidence is strong and where it is compromised.
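The colour convention can be expressed as a simple lookup over the structured data such tools consume. The sketch below is only an illustration of that mapping; the names are not the robvis interface, which applies these colours automatically.

```python
# Traffic light colour convention described above. The function name and
# data layout are illustrative, not the robvis API.
COLOURS = {
    "Low": "green",
    "Moderate": "yellow",
    "Serious": "red",
    "Critical": "darkred",
    "No Information": "grey",
}

def traffic_light_row(study_name, domain_judgements):
    """One row of a traffic light matrix: the study name followed by the
    colour for each of the seven domains (D1-D7, in order)."""
    return [study_name] + [COLOURS[j] for j in domain_judgements]
```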
Summary bar charts aggregate the proportion of studies at each risk of bias level across domains. They show, for example, that 80% of included studies have serious or critical risk of confounding bias while 60% have low risk of outcome measurement bias. These charts help identify systematic weaknesses in the evidence base.
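The aggregation behind such a chart is straightforward to sketch; the studies, domains, and judgements below are entirely hypothetical.

```python
from collections import Counter

# Hypothetical domain-level judgements: one dict per study, mapping a
# domain name to its ROBINS-I judgement. Real reviews would carry all
# seven domains; two are shown for brevity.
assessments = {
    "Study A": {"Confounding": "Serious", "Outcome measurement": "Low"},
    "Study B": {"Confounding": "Critical", "Outcome measurement": "Low"},
    "Study C": {"Confounding": "Serious", "Outcome measurement": "Moderate"},
    "Study D": {"Confounding": "Moderate", "Outcome measurement": "Low"},
    "Study E": {"Confounding": "Serious", "Outcome measurement": "Serious"},
}

def judgement_proportions(assessments, domain):
    """Proportion of studies at each risk-of-bias level for one domain."""
    counts = Counter(study[domain] for study in assessments.values())
    n = len(assessments)
    return {level: count / n for level, count in counts.items()}

props = judgement_proportions(assessments, "Confounding")
# Here 4 of 5 studies (80%) are Serious or Critical for confounding.
```

These per-domain proportions are exactly the data a summary bar chart plots, one stacked bar per domain.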
Narrative synthesis tables combine the visual judgements with text explanations for each domain-level rating. This is particularly important when judgements are contentious or when the reasoning behind a rating is not obvious from the study alone. The Cochrane Handbook recommends documenting the rationale for each domain-level judgement, including references to specific study characteristics that informed the rating.
Your ROBINS-I results should be presented in the Results section of your systematic review, typically alongside the study characteristics table. They should also be referenced in the Discussion when interpreting findings, and in the GRADE evidence profile if you are conducting a formal certainty of evidence assessment. Use our ROBINS-I assessment tool to generate structured output that can be directly imported into your review.
Common ROBINS-I Assessment Mistakes
Even experienced reviewers make errors when applying ROBINS-I. Awareness of these common mistakes improves the reliability and credibility of your bias assessment.
Using ROBINS-I for randomized trials. ROBINS-I is exclusively for non-randomized studies. If a study randomized participants to intervention groups, use RoB 2 regardless of other design features. Applying ROBINS-I to an RCT produces inappropriate judgements because the confounding domain assumes the absence of randomization.
Ignoring the confounding domain. Some reviewers treat confounding as an afterthought, assigning moderate or low risk without examining whether the study adjusted for the key confounders specified in the target trial. Confounding is the most consequential domain for non-randomized studies. It demands careful attention to what confounders were measured, how they were controlled, and whether residual confounding remains plausible.
Not specifying the target trial. Assessing ROBINS-I domains without a clearly defined target trial removes the anchor that makes judgements consistent and reproducible. If two assessors have different mental models of the ideal experiment, they will reach different domain-level judgements. Write the target trial into your protocol and reference it explicitly during assessment.
Treating No Information as low risk. When a study does not report enough detail to assess a domain, the correct rating is No Information, not Low risk. Absent information about confounding adjustment is not evidence of adequate confounding adjustment. This mistake inflates the apparent quality of poorly reported studies and undermines the integrity of the bias assessment.
Assessing domains in isolation. While each domain is judged separately, assessors should consider interactions between domains. A study with serious missing data and serious outcome measurement bias may have compounding problems that are worse than either domain alone. The overall judgement should reflect these interdependencies.
Failing to calibrate across studies. The same standard should apply to every study in your review. If you rate one study as having serious risk of confounding because it did not adjust for age, you must apply the same judgement to every study that fails to adjust for age. Inconsistent calibration reduces the credibility of your assessment and invites criticism during peer review.
Conflating ROBINS-I with the Newcastle-Ottawa Scale. ROBINS-I and the Newcastle-Ottawa Scale serve different purposes. ROBINS-I assesses risk of bias in intervention studies through a domain-based, judgement-driven process. The Newcastle-Ottawa Scale provides a simpler, points-based quality assessment suitable for observational studies that are not focused on intervention effects. Choosing the right tool depends on your review question and the type of studies included.
Accurate ROBINS-I assessment requires training, practice, and dual-reviewer consensus. Disagreements between reviewers should be resolved through discussion or by consulting a third reviewer. Document all judgements and their rationale so that your assessment is transparent and auditable. The structured approach that ROBINS-I provides, when applied correctly, produces risk of bias assessments that strengthen the credibility of your systematic review and provide a solid foundation for evidence-based conclusions.