ROBINS-I assessment is the standard approach for evaluating risk of bias in non-randomized studies of interventions. Developed by Sterne and colleagues in 2016 and endorsed by Cochrane, the ROBINS-I tool provides a structured framework for judging whether a cohort study, case-control study, or other non-randomized design produces results that can be trusted for clinical and policy decisions. If your systematic review includes any non-randomized evidence, ROBINS-I is the tool you need.
ROBINS-I stands for Risk Of Bias In Non-randomized Studies of Interventions. Unlike simple quality checklists that assign points, ROBINS-I asks assessors to reason through seven specific bias domains and reach a judgement for each one. The overall risk of bias rating for a study is determined by its worst domain-level judgement. This makes ROBINS-I more rigorous and more demanding than older tools, but it also produces more transparent and reproducible assessments.
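The worst-domain rule can be sketched as a small function. This is a simplified illustration in Python, not part of the ROBINS-I tool itself; the full guidance also allows escalation when several domains are judged serious, which this sketch omits.

```python
# ROBINS-I judgement categories, ordered from least to most concerning.
SEVERITY = ["Low", "Moderate", "Serious", "Critical"]

def overall_risk_of_bias(domain_judgements):
    """Overall rating = the worst (most severe) domain-level judgement."""
    return max(domain_judgements, key=SEVERITY.index)

# Example: a single "Serious" domain makes the whole study "Serious",
# no matter how clean the other six domains are.
judgements = ["Low", "Moderate", "Serious", "Low", "Low", "Moderate", "Low"]
print(overall_risk_of_bias(judgements))  # Serious
```

The asymmetry is deliberate: strong performance in six domains cannot compensate for a critical weakness in the seventh.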
What Is ROBINS-I?
The tool was published in the BMJ by Sterne, Hernán, Reeves, Savović, Berkman, Viswanathan, and colleagues as part of an international effort to bring the same rigor to non-randomized study assessment that the Cochrane Risk of Bias tool (now RoB 2) brought to randomized controlled trials. Before ROBINS-I, reviewers relied on ad-hoc scales and checklists that lacked a coherent theoretical foundation. ROBINS-I changed that by grounding every judgement in a comparison against an ideal randomized experiment.
The tool applies to any study that compares two or more intervention groups using non-randomized data. This includes prospective and retrospective cohort studies, case-control studies, controlled before-after designs, interrupted time series with a comparison group, and cross-sectional studies comparing exposed and unexposed populations. It does not apply to single-arm studies, uncontrolled case series, or diagnostic accuracy studies.
ROBINS-I is recommended in the Cochrane Handbook Chapter 25 (Higgins et al., 2023) as the preferred tool whenever a systematic review includes non-randomized studies of interventions. Its results feed directly into GRADE assessments, where risk of bias is one of five domains that determine the overall certainty of evidence.
The Target Trial Framework
The most important concept in ROBINS-I is the target trial framework. Before assessing any study, you must specify the hypothetical randomized controlled trial that would ideally answer your review question. Every domain-level judgement then asks: how does this study deviate from that ideal experiment?
The target trial specification includes the eligible population, the experimental and comparator interventions, the randomization and allocation procedure, the follow-up period, the primary outcome and how it would be measured, and the analysis plan. You do not need a detailed protocol for a trial that will never run. You need enough specificity to anchor your bias judgements.
This framework improves upon ad-hoc checklists because it forces assessors to think causally. Rather than asking "did the study control for confounders?" in the abstract, ROBINS-I asks "what confounders would be balanced by randomization in the target trial, and did the study adequately adjust for them?" The target trial makes the standard of comparison explicit, reducing subjective disagreement between assessors.
You specify one target trial for your entire review, not one per included study. Every study is then evaluated against the same benchmark. This ensures consistency and allows meaningful comparison of bias judgements across studies. Write the target trial specification into your systematic review protocol before you begin data extraction.
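One way to keep the benchmark explicit in a protocol is to record the specification as structured data. The sketch below is a hypothetical example: the field names, the clinical scenario, and the `deviates_from_target` helper are all illustrative assumptions, not part of the ROBINS-I tool.

```python
# A minimal, hypothetical target trial specification as it might be
# recorded in a review protocol. All content here is illustrative.
target_trial = {
    "population": "Adults aged 40-75 with newly diagnosed type 2 diabetes",
    "intervention": "Drug A initiated within 3 months of diagnosis",
    "comparator": "Drug B initiated within 3 months of diagnosis",
    "assignment": "1:1 randomization at diagnosis, concealed allocation",
    "follow_up": "5 years from initiation",
    "outcome": "First major cardiovascular event (adjudicated)",
    "analysis": "Intention-to-treat hazard ratio",
}

def deviates_from_target(study, trial=target_trial):
    """List target-trial components a study fails to emulate clearly."""
    return [component for component in trial if not study.get(component)]

# A study that never pre-specified its analysis deviates on that component.
study = {component: True for component in target_trial}
study["analysis"] = None
print(deviates_from_target(study))  # ['analysis']
```

Because every included study is checked against the same dictionary, the comparison across studies stays consistent, which is the point of specifying one target trial for the whole review.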
The target trial concept comes from the work of Hernán and Robins on causal inference. It recognizes that non-randomized studies are attempting to emulate a trial that was never conducted. The degree to which a study succeeds in that emulation, or fails, determines its risk of bias.
The 7 ROBINS-I Domains
ROBINS-I evaluates seven distinct sources of bias. Each domain is assessed through a series of signalling questions that guide the assessor toward a domain-level judgement. The domains are ordered to follow the chronological structure of a study, from the conditions present before intervention assignment through to the reporting of results.
Domain 1: Bias Due to Confounding
Confounding occurs when a prognostic factor that predicts the outcome also influences which intervention a participant receives. In a randomized trial, randomization balances measured and unmeasured confounders across groups. In a non-randomized study, this balance is absent by default.
The signalling questions ask whether the study identified relevant confounders, whether those confounders were measured validly, and whether appropriate analytical methods (such as regression adjustment, propensity score matching, or instrumental variable analysis) were used to control for them. A study that provides no adjustment for key confounders will typically receive a serious or critical risk of bias judgement for this domain.
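To see why adjustment matters, the sketch below compares a crude risk difference with one standardized over a single measured confounder (disease severity). The data are entirely hypothetical and chosen so the confounding is easy to see; the gap between the two estimates is the visible part of the problem this domain asks about.

```python
# Hypothetical counts keyed by (severity stratum, treated?) -> (events, n).
# Treated patients are concentrated in the severe stratum, where events
# are more common regardless of treatment.
counts = {
    ("mild",   True):  (1, 10),  ("mild",   False): (4, 40),
    ("severe", True):  (20, 40), ("severe", False): (5, 10),
}

def crude_risk_difference(counts):
    """Risk difference ignoring the confounder entirely."""
    e1 = sum(e for (_, t), (e, n) in counts.items() if t)
    n1 = sum(n for (_, t), (e, n) in counts.items() if t)
    e0 = sum(e for (_, t), (e, n) in counts.items() if not t)
    n0 = sum(n for (_, t), (e, n) in counts.items() if not t)
    return e1 / n1 - e0 / n0

def standardized_risk_difference(counts):
    """Average within-stratum risk differences, weighted by stratum size."""
    strata = {s for (s, _) in counts}
    total = sum(n for (_, n) in counts.values())
    rd = 0.0
    for s in strata:
        e1, n1 = counts[(s, True)]
        e0, n0 = counts[(s, False)]
        rd += ((n1 + n0) / total) * (e1 / n1 - e0 / n0)
    return rd

print(round(crude_risk_difference(counts), 2))         # 0.24
print(round(standardized_risk_difference(counts), 2))  # 0.0
```

The crude comparison suggests the treatment causes harm; within each severity stratum the risks are identical. Real adjustment methods (regression, propensity scores) generalize this idea to many confounders at once, but they can only ever balance what was measured.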
Confounding is the domain where non-randomized studies are most vulnerable. Even well-designed observational studies with rigorous adjustment can only control for measured confounders; residual and unmeasured confounding remain threats. For this reason, a low risk of bias rating for confounding requires strong justification, such as when the intervention is assigned at a clear threshold (creating a natural experiment) or when the study uses an instrumental variable approach.
Domain 2: Bias in Selection of Participants
Selection bias in ROBINS-I refers to bias that arises when participant inclusion in the study is related to both the intervention and the outcome. This is not the same as external validity or generalizability. It concerns whether the selection process created a distorted comparison between intervention groups.
For example, if a cohort study excludes participants who experienced the outcome before the study start date but does so differentially between groups, the resulting comparison is biased. Similarly, if participants are selected into the study based on their intervention status and their outcome status simultaneously, the analysis will produce misleading estimates.
The signalling questions ask whether the start of follow-up coincided with intervention assignment, whether selection was related to the intervention and outcome, and whether appropriate adjustments were made for any selection-related biases.
Domain 3: Bias in Classification of Interventions
Misclassification of intervention status occurs when the study incorrectly assigns participants to intervention groups. In a well-conducted RCT, intervention assignment is unambiguous. In non-randomized studies, intervention status may be determined from medical records, self-report, or administrative databases, each of which introduces the possibility of error.
This domain asks whether intervention status was well-defined and whether classification was based on information collected at or before the time of intervention. Differential misclassification, where errors in classification are related to the outcome, is particularly problematic because it can bias the effect estimate in either direction.
Domain 4: Bias Due to Deviations From Intended Interventions
Once participants are assigned to an intervention group, they may not receive or adhere to that intervention as intended. Deviations can include switching between groups, co-interventions that differ between groups, or non-adherence to the assigned treatment protocol.
ROBINS-I distinguishes between two types of effect: the effect of assignment to intervention (analogous to intention-to-treat analysis in a trial) and the effect of starting and adhering to intervention (analogous to per-protocol analysis). The signalling questions differ depending on which effect your review targets.
For the effect of assignment, the key concern is whether deviations from intended interventions occurred and whether they were balanced across groups. For the effect of adherence, the questions ask whether appropriate statistical methods (such as inverse probability weighting) were used to account for deviations.
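The mechanics of inverse probability weighting can be sketched in a few lines. The adherence probabilities below are hypothetical and taken as given; in a real per-protocol analysis they would be estimated from covariates with a fitted model, and the data here are invented for illustration only.

```python
# Each record: (adhered to treatment?, probability of the adherence pattern
# actually observed given covariates, outcome). Probabilities are
# hypothetical; in practice they come from a fitted model.
records = [
    (True, 0.8, 1.0), (True, 0.5, 0.0), (True, 0.8, 1.0),
    (False, 0.5, 1.0), (False, 0.8, 0.0), (False, 0.5, 0.0),
]

def ipw_mean(records, adhered):
    """Weighted mean outcome in one group, each person weighted by 1/p.

    Up-weighting people whose observed behaviour was unlikely rebuilds a
    pseudo-population in which adherence is independent of the measured
    covariates.
    """
    pairs = [(1.0 / p, y) for a, p, y in records if a == adhered]
    return sum(w * y for w, y in pairs) / sum(w for w, _ in pairs)

effect = ipw_mean(records, True) - ipw_mean(records, False)
print(round(effect, 3))  # 0.175
```

Assessors do not need to rerun such analyses; the point is to recognize whether a study targeting the effect of adhering used a method of this kind, rather than simply discarding non-adherent participants.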
Domain 5: Bias Due to Missing Data
Missing data can bias results when the proportion of participants with missing outcome data is substantial and when missingness is related to the true value of the outcome. In a well-conducted trial, missing data are minimized through active follow-up and reported transparently. In non-randomized studies, missing data may be more prevalent and less well-documented.
The signalling questions ask whether outcome data were available for all or nearly all participants, whether the proportion of missing data was similar across intervention groups, and whether the study used appropriate methods to handle missing data (such as multiple imputation). A study that excludes 30% of participants without explanation or analysis of the impact will typically receive a serious risk of bias judgement for this domain.
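A screening check along these lines can flag studies that need a closer look. The numeric thresholds below are illustrative assumptions for the sketch; ROBINS-I itself does not fix cut-offs, and the final judgement always rests on reasoning about why the data are missing.

```python
def missingness_flags(miss_a, n_a, miss_b, n_b,
                      max_prop=0.10, max_gap=0.05):
    """Flag two concerns: high overall missingness, or imbalance between arms.

    The 10% and 5-percentage-point thresholds are illustrative assumptions,
    not part of the ROBINS-I tool.
    """
    prop_a = miss_a / n_a
    prop_b = miss_b / n_b
    flags = []
    if max(prop_a, prop_b) > max_prop:
        flags.append("substantial missing outcome data")
    if abs(prop_a - prop_b) > max_gap:
        flags.append("missingness differs between arms")
    return flags

# 30% missing in one arm versus 8% in the other: both concerns are raised.
print(missingness_flags(30, 100, 8, 100))
```

A study triggering both flags, like the unexplained 30% exclusion described above, warrants scrutiny of whether missingness could relate to the outcome itself.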
Domain 6: Bias in Measurement of Outcomes
Bias in outcome measurement occurs when the method of measuring the outcome is influenced by knowledge of the intervention received. In a blinded RCT, outcome assessors do not know which treatment participants received. In most non-randomized studies, blinding is absent.
The signalling questions ask whether the outcome measure was objective or subjective, whether assessors were blinded to intervention status, and whether measurement methods were comparable across groups. Objective outcomes (such as mortality or laboratory values) are less susceptible to measurement bias than subjective outcomes (such as pain scales or functional assessments), even without blinding.
Domain 7: Bias in Selection of the Reported Result
Reporting bias at the study level occurs when the reported result is selected from among multiple analyses, outcomes, or subgroups based on the direction or magnitude of the effect. A study that measures an outcome at multiple time points but reports only the most favorable one introduces selection bias in the reported result.
The signalling questions ask whether the outcome and analysis plan were pre-specified, whether multiple outcome measures or analytical methods were used, and whether there is evidence that the reported result was selected on the basis of the findings. Pre-registration of the study protocol provides protection against this domain, though it is less common in non-randomized research than in randomized trials.
Need ROBINS-I assessments for the non-randomized studies in your review? Our team conducts rigorous bias assessments with dual-reviewer calibration and transparent domain-level justifications. Get a personalized project cost estimate, or explore our systematic review support services.