A systematic review protocol is the single most important document you will write before launching your review. It locks in your methods, shields your work from bias, and tells the research community exactly what you plan to do and why. Without a protocol, even a well-executed review can be dismissed as retrospectively engineered, and rightly so.

This guide walks you through every section of a rigorous protocol, aligned with the Cochrane Handbook for Systematic Reviews of Interventions, PRISMA 2020, the JBI Manual for Evidence Synthesis, and GRADE certainty-of-evidence standards. By the end, you will have a clear roadmap from research question to PROSPERO registration, with practical examples and decision frameworks at every step.


Why Your Systematic Review Needs a Protocol

A protocol is not bureaucratic paperwork. It is a methodological contract that separates credible evidence synthesis from cherry-picked narrative reviews. Registering or publishing your protocol before beginning data collection delivers four critical advantages that directly improve the quality and credibility of your final review.

Bias prevention is the most important function. Pre-specifying inclusion criteria, outcomes, and analysis methods removes the temptation to adjust decisions after seeing the data. This is the same logic behind clinical trial registration, and journals increasingly hold systematic reviews to the same standard. Empirical studies comparing reviews with and without registered protocols consistently find fewer discrepancies between planned and reported methods when protocols exist.

Transparency allows peer reviewers, journal editors, and readers to compare your final manuscript against your original plan. Any deviations must be justified, which raises the bar for intellectual honesty. A registered protocol creates a public timestamp that proves your methods were established before the results were known.

Reduced duplication benefits the entire research community. PROSPERO and other registries are searchable, so registering your protocol alerts other teams working on similar questions. This encourages collaboration instead of redundancy and helps funders avoid financing duplicate work.

Team alignment is an often-overlooked practical benefit. A detailed protocol forces your co-authors to agree on methods before the work begins. Disagreements about eligibility criteria, outcome definitions, or synthesis approaches are far cheaper to resolve at the protocol stage than mid-review when you have already screened thousands of records.

If your goal is publication in a high-impact journal, a registered protocol is effectively a prerequisite. Cochrane, The Lancet, JAMA, and BMJ all expect or require protocol registration for systematic reviews.


Step 1: Formulate Your Research Question with PICO

Every strong protocol begins with a precisely framed question. The PICO framework is the gold standard for intervention reviews, and it directly shapes every downstream decision in your protocol, from eligibility criteria to search strategy to outcome analysis.

| Element | Definition | Example |
|---|---|---|
| P (Population) | Who are the participants? | Adults aged 18+ with type 2 diabetes mellitus |
| I (Intervention) | What treatment or exposure is being studied? | Continuous glucose monitoring (CGM) devices |
| C (Comparator) | What is the alternative being compared? | Self-monitored blood glucose (SMBG) testing |
| O (Outcome) | What results are measured? | HbA1c reduction at 12 months, hypoglycaemic episodes |

For qualitative or scoping reviews, use PCC (Population, Concept, Context) instead. For diagnostic accuracy reviews, use PIRD (Population, Index test, Reference standard, Diagnosis). The key principle remains the same regardless of framework: every word in your question constrains your eligibility criteria, your search strategy, and your analysis plan.

Common Mistakes When Framing the Research Question

Researchers frequently make three errors at this stage that create cascading problems throughout the review. First, framing the question too broadly results in thousands of irrelevant hits during screening and forces you to either narrow retroactively (which looks like post-hoc decision-making) or spend weeks screening records that should never have been retrieved.

Second, omitting the comparator makes it impossible to assess relative effectiveness. A question like "Does CBT reduce anxiety?" lacks the specificity needed for a meaningful synthesis. "Does CBT reduce anxiety compared to waitlist control in adults with generalised anxiety disorder?" defines exactly what you are comparing and for whom.

Third, specifying only a single outcome is a mistake when your review should address multiple endpoints. Most clinical questions involve both efficacy outcomes and safety outcomes, and your protocol should declare both primary and secondary endpoints upfront. This prevents the appearance of outcome switching later.

Pro Tip: Test your PICO by converting it to a search query. If your research question does not translate naturally into database search terms, it probably needs refinement. Try running a preliminary search in PubMed using your PICO elements as search concepts. If you retrieve fewer than 50 results, your question may be too narrow. If you retrieve more than 10,000, it is almost certainly too broad. This quick calibration exercise takes 15 minutes and can save weeks of wasted effort.


Step 2: Define Eligibility Criteria

Your eligibility criteria operationalise each PICO element into concrete inclusion and exclusion rules. The goal is to write criteria so explicit and unambiguous that any trained reviewer on your team can apply them consistently without needing to consult you for clarification.

Inclusion Criteria

Structure your inclusion criteria around these six domains, with specific operational definitions for each: population (diagnostic criteria, age range, clinical setting), intervention (what qualifies, including minimum dose or duration), comparator (acceptable control conditions), outcomes (required outcome measures and time points), study design (eligible designs, such as randomised controlled trials only), and publication characteristics (language, date range, publication status).

Exclusion Criteria

Exclusion criteria should address study types you will not consider (case reports, editorials, conference abstracts without full data), populations that overlap with but differ from your target (e.g., excluding gestational diabetes when your focus is type 2 diabetes), and interventions that are conceptually distinct from your intervention of interest. Every exclusion should be justified with a clear rationale.

Pro Tip: Pilot your criteria on 50 records before finalising. Have two reviewers independently screen a random sample of 50 title-abstract records using your draft criteria. If they disagree on more than 10 percent of records, your criteria have ambiguities that need to be resolved before full screening begins. Piloting typically takes half a day and reveals problems that would otherwise surface hundreds of hours into the review.


Step 3: Develop a Reproducible Search Strategy

The search strategy is the engine of your review. A poorly designed search will either miss relevant studies (undermining the comprehensiveness that defines a systematic review) or bury you under thousands of irrelevant records (wasting months of screening effort). Developing a strong search strategy is both a science and a craft, and it benefits enormously from expert input.

Building the Search Step by Step

The process follows a logical sequence that translates your PICO concepts into a comprehensive, reproducible database query:

  1. Identify the core concepts from your PICO question. Most searches have two to four main concept blocks. For the CGM example, your concepts would be "type 2 diabetes," "continuous glucose monitoring," and potentially "glycaemic control."

  2. List all synonyms for each concept. Include controlled vocabulary terms (MeSH headings for MEDLINE, Emtree descriptors for Embase), free-text keywords, spelling variations (American and British English), abbreviations, and brand names where relevant. Cast a wide net because you can always deduplicate, but you cannot find studies you did not search for.

  3. Combine synonyms within each concept using OR to create a comprehensive block for that concept. Then combine the concept blocks using AND to produce results that address all elements of your question simultaneously.

  4. Apply validated filters judiciously. The Cochrane Highly Sensitive Search Strategy for randomised controlled trials is a well-validated design filter. Ad-hoc filters for language, date, or study type should be used sparingly and documented with justification. Over-filtering is one of the most common causes of missed studies in systematic reviews.
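Applied to the CGM example, the OR-within, AND-between pattern from steps 2 and 3 produces a query skeleton like the one below. This is an illustrative sketch in PubMed syntax; the exact MeSH headings and keywords would need to be verified against the current thesaurus and expanded considerably for a real search:

```
("Diabetes Mellitus, Type 2"[Mesh] OR "type 2 diabetes"[tiab]
    OR "type II diabetes"[tiab] OR T2DM[tiab])
AND
("Blood Glucose Self-Monitoring"[Mesh] OR "continuous glucose monitoring"[tiab]
    OR CGM[tiab] OR "flash glucose monitoring"[tiab])
```

Each parenthesised block is one concept; adding a third block (e.g., for glycaemic control outcomes) would simply append another AND-ed group.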

Minimum Database Requirements

The Cochrane Handbook recommends searching at a minimum these core databases, though your specific topic may require additional sources:

| Database | Coverage | Access |
|---|---|---|
| MEDLINE (via PubMed or Ovid) | Biomedical and life sciences, 1946–present | Free (PubMed) or institutional (Ovid) |
| Embase (via Ovid or Elsevier) | Biomedical with strong pharmacology and international coverage | Institutional subscription |
| CENTRAL (Cochrane Library) | Controlled trials extracted from multiple databases | Free or institutional |
| PsycINFO | Psychology, behavioural sciences, mental health | Institutional subscription |
| CINAHL | Nursing and allied health | Institutional subscription |

Supplementary Search Methods

Database searching alone is insufficient for a comprehensive systematic review. You should also conduct:

  1. Reference list checking of all included studies (backward citation searching).

  2. Forward citation searching using tools like Google Scholar or Scopus.

  3. Grey literature searching through clinical trial registries (ClinicalTrials.gov, WHO ICTRP), dissertation databases (ProQuest), and conference proceedings.

  4. Direct contact with subject experts to identify unpublished or in-progress studies.

Pro Tip: Involve a research librarian early and use the PRESS checklist. An information specialist can peer-review your search strategy using the PRESS (Peer Review of Electronic Search Strategies) checklist, which evaluates translation across databases, subject headings, spelling, syntax, and Boolean logic. Studies consistently show that librarian-reviewed searches retrieve significantly more relevant records while maintaining precision. If your institution does not have a medical librarian, many university libraries offer consultation services to external researchers for a modest fee.


Step 4: Plan Screening and Selection

Your screening process must be described in enough operational detail that someone outside your team could replicate it independently. This section of your protocol should address every phase of the screening workflow, the tools you will use, and how you will handle disagreements.

Two-Phase Screening Process

Phase 1: Title and abstract screening is where the majority of records are excluded. At least two reviewers should independently screen all records retrieved by the search strategy. Before beginning full screening, conduct a calibration exercise on a batch of 50 to 100 records. During calibration, both reviewers screen the same records, compare their decisions, discuss any disagreements, and refine the application of eligibility criteria until you achieve an acceptable level of agreement (typically Cohen's kappa above 0.80).
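Cohen's kappa is straightforward to compute during calibration. A minimal sketch with hypothetical screening decisions from two reviewers; note this pair lands just below the 0.80 threshold, so they would discuss disagreements and re-calibrate:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two reviewers' include/exclude decisions."""
    n = len(rater_a)
    # Observed proportion of records where the two reviewers agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement under independence: sum over categories of p_a * p_b
    pa, pb = Counter(rater_a), Counter(rater_b)
    expected = sum((pa[c] / n) * (pb[c] / n) for c in set(rater_a) | set(rater_b))
    return (observed - expected) / (1 - expected)

# Hypothetical calibration batch of 10 records screened by both reviewers
a = ["include", "exclude", "exclude", "include", "exclude",
     "exclude", "include", "exclude", "exclude", "exclude"]
b = ["include", "exclude", "exclude", "include", "exclude",
     "include", "include", "exclude", "exclude", "exclude"]
print(round(cohens_kappa(a, b), 3))  # → 0.783
```

Screening platforms such as Covidence report this statistic automatically, but computing it yourself on the calibration batch is a useful sanity check.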

Phase 2: Full-text screening involves retrieving and reading the complete publication for every record that passed title-abstract screening. Two reviewers should independently assess each full-text article against your eligibility criteria. For every excluded full-text, record a specific reason for exclusion that maps to one of your criteria (wrong population, wrong intervention, wrong comparator, wrong outcome, wrong study design, or wrong publication type). These exclusion reasons will populate the bottom box of your PRISMA 2020 flow diagram.

Conflict Resolution and Documentation

Specify your conflict resolution process: disagreements should be resolved first by discussion between the two reviewers, then by adjudication from a third senior reviewer if consensus cannot be reached. Name the screening software you will use (Covidence, Rayyan, EPPI-Reviewer, or a custom spreadsheet) and state that you will track records through each phase and report results using the PRISMA 2020 flow diagram template.
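The bookkeeping that feeds the PRISMA 2020 flow diagram is simple arithmetic, but scripting it keeps the counts consistent across manuscript drafts. A sketch with entirely hypothetical numbers:

```python
# Hypothetical record counts flowing through the two screening phases
identified = {"MEDLINE": 1450, "Embase": 1980, "CENTRAL": 410}
total_identified = sum(identified.values())

duplicates_removed = 1140
screened = total_identified - duplicates_removed          # enter title/abstract screening

excluded_title_abstract = 2450
full_text_assessed = screened - excluded_title_abstract   # enter full-text screening

# Every full-text exclusion carries a reason mapped to an eligibility criterion
full_text_exclusions = {"wrong population": 89, "wrong intervention": 47,
                        "wrong comparator": 12, "wrong study design": 66,
                        "wrong publication type": 18}
included = full_text_assessed - sum(full_text_exclusions.values())

print(screened, full_text_assessed, included)  # → 2700 250 18
```

Each value maps onto one box of the flow diagram, and the per-reason exclusion tallies populate the "reports excluded" box.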


Step 5: Design Data Extraction

Data extraction translates the information in each included study into a structured dataset ready for synthesis. A well-designed, pilot-tested extraction form prevents missing fields, reduces inconsistent coding between reviewers, and saves significant time during the analysis phase.

Essential Extraction Fields

Your extraction form should capture these categories of information for every included study:

| Category | Fields |
|---|---|
| Study identifiers | First author, publication year, journal, country, funding source |
| Design and setting | Study design, randomisation method, blinding, single vs. multicentre, clinical setting |
| Participants | Sample size per arm, age (mean/SD or median/IQR), sex distribution, baseline severity, diagnostic criteria |
| Intervention details | Specific intervention, dose/intensity, frequency, duration, delivery mode, co-interventions |
| Comparator details | Specific comparator, dose/intensity, frequency, duration |
| Outcome data | Means and SDs, event counts, time points, adjusted vs. unadjusted estimates |
| Risk of bias | Domain-level judgements from your chosen tool |

Extraction Best Practices

Use a standardised electronic form in a tool like REDCap, Google Forms, or your review management software. Have two reviewers independently extract data from every study, or at minimum have one reviewer extract and a second reviewer verify every data point against the source publication. Pilot the extraction form on three to five diverse studies before proceeding. Piloting invariably reveals missing fields, ambiguous categories, and inconsistencies in how reviewers interpret the form.
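As a sketch of how the extraction fields might be operationalised in code rather than a spreadsheet, here is a minimal, hypothetical record structure. A real form would carry the full field set from the table above:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ExtractionRecord:
    # Study identifiers
    first_author: str
    year: int
    country: str
    funding_source: str
    # Design and setting
    design: str            # e.g. "parallel RCT"
    multicentre: bool
    # Participants (per arm)
    n_intervention: int
    n_comparator: int
    mean_age: Optional[float] = None   # None until extracted or confirmed unreported
    # Outcome data: one entry per outcome/time point
    outcomes: dict = field(default_factory=dict)

rec = ExtractionRecord("Smith", 2021, "UK", "public", "parallel RCT", True, 120, 118)
rec.outcomes["HbA1c_12mo"] = {"mean_diff": -0.4, "sd": 1.1}
```

Typed fields like these make it easy to spot missing data points when comparing two reviewers' independent extractions.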

Pro Tip: Pre-specify your approach to missing data before you encounter it. Define in your protocol whether you will contact study authors for missing data (and how many attempts you will make), use statistical imputation methods, or exclude studies with insufficient data. Making this decision after seeing which studies have missing data introduces bias. Most protocols specify that the review team will email corresponding authors up to twice with a two-week window before classifying data as unavailable.


Step 6: Specify Risk-of-Bias Assessment

The risk-of-bias assessment evaluates the internal validity of each included study and determines how much confidence you should place in its findings. Selecting the right tool is critical because different study designs have different sources of bias, and each tool evaluates different domains.

Choosing the Right Tool

| Study Design | Recommended Tool | Key Domains Assessed |
|---|---|---|
| Randomised controlled trials | RoB 2 (Cochrane, revised) | Randomisation, deviations, missing data, measurement, selection |
| Non-randomised intervention studies | ROBINS-I | Confounding, selection, classification, deviations, missing data, measurement, reporting |
| Cohort and case-control studies | Newcastle-Ottawa Scale | Selection, comparability, outcome/exposure ascertainment |
| Diagnostic accuracy studies | QUADAS-2 | Patient selection, index test, reference standard, flow and timing |
| Qualitative studies | JBI Critical Appraisal | Philosophical perspective, methodology, representation, interpretation |

How Risk of Bias Should Inform Your Synthesis

Your protocol must specify not only which tool you will use, but how the assessments will influence your analysis. At minimum, plan a sensitivity analysis that excludes studies rated at high risk of bias. You should also present risk-of-bias results in a summary figure (the "traffic light" plot for RoB 2) and incorporate risk-of-bias ratings into your GRADE certainty-of-evidence assessment. State that at least two reviewers will independently assess risk of bias for every included study, with disagreements resolved by discussion or third-reviewer adjudication.


Step 7: Plan Your Synthesis Approach

The synthesis section is where your protocol moves from information gathering to analysis. Pre-specifying this section is essential because it prevents the most damaging form of bias in evidence synthesis: choosing the analytical method that produces the most favourable or interesting result after seeing the data.

Narrative Synthesis

If quantitative pooling is not feasible due to excessive clinical or methodological heterogeneity, describe how you will organise, present, and interpret findings narratively. Use the Synthesis Without Meta-analysis (SWiM) reporting guideline to structure this section. Even without a meta-analysis, you should present results in structured tables, group studies by meaningful categories, describe the direction and magnitude of effects across studies, and explain why quantitative pooling was not appropriate.

Meta-Analysis Specification

If you plan to conduct a meta-analysis, your protocol must pre-specify every analytical decision. Leaving these choices unspecified allows (consciously or unconsciously) the selection of methods that favour particular results.

Effect measure: Justify your choice of risk ratio, odds ratio, mean difference, standardised mean difference, or hazard ratio based on the outcome type and the measurement scales used across your included studies.

Statistical model: Specify whether you will use a random-effects model (DerSimonian-Laird, restricted maximum likelihood, or the Hartung-Knapp-Sidik-Jonkman adjustment) or a fixed-effect model (inverse variance). In most systematic reviews, a random-effects model is appropriate because clinical and methodological heterogeneity between studies is expected.

Heterogeneity assessment: State that you will report the I-squared statistic, Cochran's Q test, tau-squared, and prediction intervals. Pre-define the thresholds that will trigger further investigation. The Cochrane Handbook suggests that I-squared above 50 percent warrants exploration through subgroup analysis or meta-regression.
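The random-effects pooling and heterogeneity statistics described above can be sketched in a few lines. This is a minimal, dependency-free implementation of the DerSimonian-Laird estimator; the trial effects and variances are hypothetical:

```python
import math

def dersimonian_laird(effects, variances):
    """Random-effects pooling with DerSimonian-Laird tau^2, Cochran's Q, and I^2."""
    k = len(effects)
    w = [1 / v for v in variances]                       # fixed-effect (inverse-variance) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))   # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)                   # between-study variance
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]           # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {"pooled": pooled, "se": se, "tau2": tau2, "I2": i2, "Q": q}

# Hypothetical mean differences in HbA1c from five trials, with equal variances
res = dersimonian_laird([-0.9, -0.1, -0.8, 0.1, -0.6],
                        [0.02, 0.02, 0.02, 0.02, 0.02])
print(round(res["pooled"], 3), round(res["I2"], 1))  # → -0.46 89.6
```

An I-squared near 90 percent, as here, would trigger the pre-specified investigation of heterogeneity through subgroup analysis or meta-regression. In practice you would use an established package (e.g., metafor in R) rather than hand-rolled code, but the protocol should name the estimator either way.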

Subgroup analyses: List every subgroup variable you plan to explore (e.g., age group, intervention dose, study quality, geographic region) and limit the number to avoid data dredging. A common guideline is no more than one subgroup analysis per 10 included studies.

Sensitivity analyses: Describe at least one planned sensitivity analysis. Common approaches include excluding studies at high risk of bias, using alternative effect size calculations, restricting to studies that meet a specific methodological threshold, or using a different statistical model.

Publication bias: If you expect 10 or more studies contributing to a meta-analysis, specify that you will assess publication bias using funnel plot inspection and a statistical test (Egger's test for continuous outcomes, Peters' test for binary outcomes).
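Egger's test, in its original formulation, is an ordinary least-squares regression of the standardised effect on precision, where an intercept far from zero suggests funnel-plot asymmetry. A minimal sketch with hypothetical data; in practice you would compare the t statistic to a t distribution with k − 2 degrees of freedom:

```python
import math

def egger_test(effects, ses):
    """Egger's regression intercept for funnel-plot asymmetry (simple OLS sketch)."""
    z = [y / s for y, s in zip(effects, ses)]   # standardised effects
    x = [1 / s for s in ses]                    # precisions
    k = len(x)
    mx, mz = sum(x) / k, sum(z) / k
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z)) / sxx
    intercept = mz - slope * mx
    # Residual variance and the standard error of the intercept
    resid_ss = sum((zi - (intercept + slope * xi)) ** 2 for xi, zi in zip(x, z))
    s2 = resid_ss / (k - 2)
    se_intercept = math.sqrt(s2 * (1 / k + mx ** 2 / sxx))
    return intercept, intercept / se_intercept  # (intercept, t statistic)

# Hypothetical effects and standard errors from four studies
intercept, t = egger_test([0.5, 0.4, 0.8, 0.2], [0.1, 0.2, 0.3, 0.4])
```

Note that with only four studies the test has essentially no power; this is why the 10-study minimum above matters.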

Certainty of evidence: State that you will use the GRADE approach to rate the certainty of the body of evidence for each primary outcome, evaluating risk of bias, inconsistency, indirectness, imprecision, and publication bias.


Step 8: Register Your Protocol on PROSPERO

PROSPERO (International Prospective Register of Systematic Reviews) is a free, searchable registry maintained by the Centre for Reviews and Dissemination at the University of York. Registration provides your protocol with a permanent, citable record that timestamps your methods before data collection begins.

The Registration Process

The registration workflow involves creating an account on the PROSPERO website, completing the structured registration form (which mirrors the sections of your protocol), and submitting for editorial review. Approval typically takes two to three weeks, after which your protocol receives a unique registration number (e.g., CRD42026000001) that you cite in your manuscript, grant applications, and any related publications.

What PROSPERO Accepts

PROSPERO currently accepts systematic reviews with health-related outcomes (broadly defined to include public health, health services, and social care interventions), and only at the protocol stage: data extraction must not have started at the time of registration. For review types outside PROSPERO's scope, including scoping reviews, the Open Science Framework (OSF) accepts any type of review protocol, and journals such as Systematic Reviews, BMJ Open, and JMIR Research Protocols publish protocol manuscripts.

Pro Tip: Version-control your protocol and maintain a dated changelog. PROSPERO allows amendments at any stage, and legitimate changes are expected as you learn more about the literature. But reviewers and journal editors want to see exactly what changed, when, and why. Keep a dated changelog at the end of your protocol document that records every amendment with its rationale. A transparent revision history strengthens the credibility of your final review and demonstrates methodological integrity.


The PRISMA-P Quality Check

Before submitting your protocol for registration or publication, audit it against the PRISMA-P (Preferred Reporting Items for Systematic Review and Meta-Analysis Protocols) checklist. This 17-item checklist covers three major sections:

| Section | Items Covered |
|---|---|
| Administrative information | Title, registration status, authors, contact details, funding, role of funder |
| Introduction | Rationale, specific objectives |
| Methods | Eligibility criteria, information sources, search strategy, study selection, data extraction, risk of bias, synthesis plan, meta-bias assessment, certainty of evidence |

Use PRISMA-P as both a drafting guide during protocol development and a final completeness audit before submission. Many journals require a completed PRISMA-P checklist as a supplementary file at protocol manuscript submission. Working through the checklist systematically also helps you identify gaps in your protocol that you might not have noticed otherwise.


From Protocol to Published Review

Your protocol is a living document, but changes must be transparent and justified. As your review progresses from protocol through data collection, synthesis, and manuscript preparation, preserve the integrity of your pre-specified methods: record every deviation with its date and rationale, update your PROSPERO record when your methods change, and report all deviations explicitly in the final manuscript rather than quietly rewriting the plan to match what you did.

A well-crafted protocol does more than satisfy journal requirements. It is the intellectual backbone of your review, the document that proves your findings were discovered rather than manufactured. Invest the time to get it right, and every subsequent stage of your systematic review will be stronger, faster, and more defensible.