The PRISMA flow diagram is a four-phase visual that documents how records are identified, screened, assessed for eligibility, and finally included in a systematic review, providing a transparent accounting of every record entering the pipeline and every reason for exclusion at the full-text stage. The diagram is required or strongly expected by journals that endorse the PRISMA reporting standard, expected by peer reviewers as standalone evidence of methodological transparency, and used by readers as one of the fastest ways to judge whether a review's selection process was reproducible. This guide covers the anatomy of the PRISMA 2020 diagram, the differences from the older PRISMA 2009 version, the official tools for building it, the numbers reviewers most often get wrong, and the placement and styling conventions that journals expect in 2026.
The diagram itself is a visual instrument. The numbers it carries are the load-bearing content. A diagram that looks pretty but reports numbers that contradict those given elsewhere in the manuscript under the PRISMA 2020 reporting checklist fails the transparency standard that the diagram exists to enforce. Reviewers writing manuscripts should treat the flow diagram and the checklist as a coupled pair, never one without the other.
PRISMA 2020 Versus PRISMA 2009: What Changed and Why
The original PRISMA 2009 flow diagram was a four-phase chart: identification, screening, eligibility, included. Records came in from databases and other sources, duplicates were removed, abstracts were screened, full texts were assessed, and the final included set fed into the synthesis. The structure has been retained in 2020 but the diagram has changed in three substantive ways that reviewers must implement.
The terminology distinction between records and reports. In PRISMA 2009 the word "studies" or "records" was used loosely throughout the diagram. PRISMA 2020 distinguishes records (individual bibliographic entries returned by a database search) from reports (the published documents a record points to: journal articles, conference papers, preprints, theses, grey literature). The distinction matters because a single study can have multiple reports (a primary publication, a conference abstract, a thesis, a follow-up extension), and multiple records can point to the same report (different database indexing of the same paper). The 2020 diagram tracks records during identification and screening, then reports during eligibility, and studies at the inclusion node.
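The records, reports, and studies hierarchy can be sketched as a small data model. This is only an illustration of the counting logic, not part of the PRISMA template: the `Record` class, the `REPORT_TO_STUDY` mapping, and every identifier in it are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One bibliographic entry returned by a search (hypothetical model)."""
    source: str      # database or register that returned the entry
    report_id: str   # the report (document) this entry points to


# Hypothetical mapping from each report to the study it describes.
REPORT_TO_STUDY = {
    "journal-article": "trial-A",
    "conference-abstract": "trial-A",  # second report of the same study
    "thesis": "trial-B",
}

records = [
    Record("PubMed", "journal-article"),
    Record("Embase", "journal-article"),   # same report, indexed twice
    Record("Embase", "conference-abstract"),
    Record("PubMed", "thesis"),
]

# Several records can point to one report; several reports to one study.
reports = {record.report_id for record in records}
studies = {REPORT_TO_STUDY[report_id] for report_id in reports}

n_records, n_reports, n_studies = len(records), len(reports), len(studies)
print(f"records={n_records}, reports={n_reports}, studies={n_studies}")
```

In this toy data set four records collapse to three reports and two studies, which is why the three counts appear at different nodes of the diagram.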
The database-and-register branch versus the other-methods branch. The 2020 diagram explicitly splits the identification phase into two parallel columns: one column for records identified from databases and trial registers (the structured-search route) and a separate column for records identified from other methods (citation searching, hand-searching reference lists, contacting authors, organizational websites, social media). Each branch tracks its own reports sought for retrieval, full-text assessments, and exclusions, with the two streams converging only at the final included node; in the official template the title-and-abstract screening box appears only in the databases-and-registers column. The split addresses a long-standing criticism that the 2009 diagram pooled the two routes and obscured the contribution of each source.
Optional sub-fields and updated guidance for living reviews and updates. The 2020 diagram template provides optional fields for records identified from a previous version of the review (when the review is an update) and supports living-review reporting where additional records are tracked in cumulative updates. The complete narrative of what changed between the two PRISMA versions appears in the dedicated guide to what changed in the PRISMA 2020 update; the present discussion is limited to the flow diagram specifically.
The PRISMA 2020 statement was published in BMJ in March 2021 by Matthew Page, David Moher, and colleagues, and the flow diagram template was released alongside the statement at the official prisma-statement.org website. Reviews completed before 2021 used the 2009 template and were correct to do so at the time. Reviews published in 2026 are expected to use the 2020 diagram unless a methodological reason demands otherwise.
Anatomy of the 2020 Diagram
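The 2020 template labels three horizontal bands (identification, screening, included) and two parallel vertical branches (databases and registers, other methods). The full-text eligibility assessment sits inside the screening band, so the four-step logic is unchanged; the walkthrough below treats eligibility as its own step for clarity. Each cell in the matrix corresponds to a specific count that must appear somewhere in the manuscript or appendices.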
Identification phase, left branch (databases and registers). Two boxes. The first reports the number of records identified from each database and register, broken down by source (PubMed n = 1,234; Embase n = 2,456; Cochrane CENTRAL n = 567; total n = 4,257). The second reports the number of records removed before screening, with sub-categories for duplicates removed, records marked ineligible by automation tools (machine learning de-duplication, language filter), and records removed for other reasons (book chapters, editorials, if pre-screening filtered them).
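The two identification boxes reduce to a pair of sums. The sketch below reuses the per-database counts from the example above and the 1,341 duplicates used in the arithmetic section of this guide; the zero counts for the other two removal sub-categories are placeholders, and all variable names are illustrative.

```python
# Box 1: records identified, broken down by source (example figures).
identified = {
    "PubMed": 1234,
    "Embase": 2456,
    "Cochrane CENTRAL": 567,
}

# Box 2: records removed before screening, by sub-category.
removed_before_screening = {
    "duplicates": 1341,
    "marked ineligible by automation tools": 0,  # placeholder value
    "other reasons": 0,                          # placeholder value
}

total_identified = sum(identified.values())
total_removed = sum(removed_before_screening.values())
records_screened = total_identified - total_removed

for source, n in identified.items():
    print(f"{source} (n = {n:,})")
print(f"Total identified (n = {total_identified:,})")
print(f"Removed before screening (n = {total_removed:,})")
print(f"Records screened (n = {records_screened:,})")
```

Keeping the per-source and per-reason breakdowns as the primary data and deriving the totals from them avoids the common mistake of typing a total that no longer matches its parts.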
Identification phase, right branch (other methods). A single box reporting the number of records identified from other methods, with optional sub-categories for citation searching (forward and backward), grey literature, contacting authors, organizational websites, and other named sources. The 2020 template lists the standard categories but reviewers can add custom rows for unusual sources.
Screening phase, both branches. Up to two boxes per branch. The first, which in the official template appears only in the databases-and-registers branch, reports the number of records screened (title and abstract) and the number of records excluded at this stage. The second, present in both branches, reports the number of reports sought for retrieval (i.e., full texts the reviewers attempted to obtain) and the number of reports not retrieved (full texts unavailable despite inter-library loan and author contact). The distinction is important: a reviewer who excluded a record at screening because the abstract clearly failed the eligibility criteria is in a different position from a reviewer who could not find the full text at all.
Eligibility phase, both branches. Two boxes per branch. The first reports the number of reports assessed for eligibility (full-text review). The second reports the number of reports excluded with a structured list of reasons (wrong study design, wrong population, wrong intervention, wrong comparator, wrong outcome, conference abstract only, full text in unsupported language, retracted). Each reason gets its own count.
Included phase, convergence node. The two branches merge at the final box, which reports the number of studies included in the review and the number of reports of included studies. A study with two reports (primary publication plus follow-up) counts as one study but two reports. The 2020 standard reports both.
The Three Official 2020 Variants
The PRISMA 2020 statement publishes three official flow-diagram templates. Reviewers must pick the one that matches the review type, not the one that looks easiest.
Variant 1: New searches only. The standard four-phase diagram with the two-branch identification (databases-and-registers plus other-methods). Used for first-time systematic reviews where no prior version exists.
Variant 2: New searches plus updates. Adds a column for records identified from a previous version of the review, with sub-rows for studies included previously, studies excluded previously, and new records identified by the update search. Used when the review is an update of an existing published review.
Variant 3: Living reviews and scoping reviews. The PRISMA-ScR (Scoping Review) extension has its own variant, with a simplified flow diagram that does not require reasons for exclusion at the eligibility phase (since scoping reviews do not formally exclude on quality grounds). Living systematic reviews track records cumulatively across multiple update cycles.
Numbers You Must Report at Each Node
A reviewer auditing a flow diagram should be able to add the numbers and arrive at internally consistent totals. The arithmetic that must hold:
Records identified - records removed before screening = records screened. If 4,257 records were identified and 1,341 duplicates were removed, then 2,916 records must appear at the screening node.
Records screened - records excluded at title/abstract = reports sought for retrieval. If 2,916 records were screened and 2,612 were excluded at title and abstract, then 304 reports must have been sought for retrieval.
Reports sought - reports not retrieved = reports assessed for eligibility. If 304 reports were sought and 18 could not be retrieved, then 286 reports must appear at the eligibility node.
Reports assessed - reports excluded = reports of included studies. If 286 reports were assessed and 248 were excluded, then 38 reports correspond to included studies, which may collapse to (say) 32 unique included studies after merging multiple reports of the same study.
The arithmetic must hold in both branches separately and in the merged inclusion node. Because every value is a whole-number count, there is no rounding to absorb a mismatch: even a discrepancy of one record should be traced to its source and corrected or explained, and anything larger is a red flag.
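The four subtraction rules above can be checked mechanically. The following is a minimal sketch, assuming the counts for one branch are held in a plain dictionary; the key names and the `check_branch` helper are hypothetical. It reports every mismatch it finds and leaves the judgement about seriousness to the reader.

```python
def check_branch(counts: dict) -> list:
    """Return a description of each arithmetic mismatch in one branch."""
    # Each rule reads: start - removed must equal expected.
    rules = [
        ("records_identified", "records_removed_before_screening", "records_screened"),
        ("records_screened", "records_excluded", "reports_sought"),
        ("reports_sought", "reports_not_retrieved", "reports_assessed"),
        ("reports_assessed", "reports_excluded", "reports_included"),
    ]
    problems = []
    for start, removed, expected in rules:
        computed = counts[start] - counts[removed]
        if computed != counts[expected]:
            problems.append(
                f"{start} - {removed} = {computed}, "
                f"but {expected} = {counts[expected]}"
            )
    return problems


# The worked example from the text.
example = {
    "records_identified": 4257,
    "records_removed_before_screening": 1341,
    "records_screened": 2916,
    "records_excluded": 2612,
    "reports_sought": 304,
    "reports_not_retrieved": 18,
    "reports_assessed": 286,
    "reports_excluded": 248,
    "reports_included": 38,
}
example_problems = check_branch(example)

# A deliberately broken copy: one node was mistyped.
broken = dict(example, reports_sought=300)
broken_problems = check_branch(broken)

print(example_problems)
for line in broken_problems:
    print(line)
```

A single mistyped node breaks two rules at once (the one that produces it and the one that consumes it), which is a quick way to locate the offending box.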
Reasons for Exclusion at the Full-Text Phase
The exclusion-with-reasons block at the eligibility phase is the most-scrutinized part of the diagram. The 2020 standard requires that every report excluded at full-text review be categorized by reason, with at least one reason per report and a count per reason. The typical reason categories:
Wrong study design (e.g., a narrative review when only original studies are eligible).
Wrong population (e.g., paediatric studies when adults were the target).
Wrong intervention (e.g., a different drug class).
Wrong comparator (e.g., active control when placebo was specified).
Wrong outcome (e.g., a surrogate marker when the protocol required clinical endpoints).
Wrong setting (e.g., outpatient when only inpatient was eligible).
Wrong language (when language was an exclusion criterion).
Conference abstract only (no full report available).
Duplicate publication (same data as an included study).
Retracted publication.
Preprint not peer-reviewed (if the protocol excluded preprints).
Insufficient data for extraction (the paper was eligible by criteria but did not report the data needed for synthesis).
The categories used should match the inclusion and exclusion criteria stated in the methods section and the protocol. A reviewer who lists a reason in the flow diagram that was not in the protocol is signalling a post hoc decision that needs to be disclosed. The narrative of the methods section should briefly explain how the reasons were assigned; the prose framing is covered in the separate methods section walkthrough with PRISMA.
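Tallying the reasons and comparing them with the protocol can be sketched in a few lines. The report identifiers, the reason strings, and the protocol list below are invented for illustration; in practice they would come from the screening log and the registered protocol.

```python
from collections import Counter

# Exclusion reasons pre-specified in the protocol (illustrative).
protocol_reasons = {
    "wrong study design",
    "wrong population",
    "wrong outcome",
}

# One primary reason per report excluded at full-text review (illustrative).
full_text_decisions = {
    "report-01": "wrong population",
    "report-02": "wrong study design",
    "report-03": "wrong population",
    "report-04": "insufficient data for extraction",
}

reason_counts = Counter(full_text_decisions.values())
reports_excluded = len(full_text_decisions)

# The per-reason counts must add up to the exclusion box total.
assert sum(reason_counts.values()) == reports_excluded

# Reasons used in the diagram but absent from the protocol need disclosure.
post_hoc = sorted(set(reason_counts) - protocol_reasons)

for reason, n in reason_counts.most_common():
    print(f"{reason} (n = {n})")
print("Not in protocol:", post_hoc)
```

Recording exactly one primary reason per report keeps the per-reason counts summing to the box total; if a review allows multiple reasons per report, the sum check above no longer applies and the diagram should say so.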