A systematic review in psychology synthesizes empirical evidence on behavioral phenomena, psychological interventions, and cognitive processes using transparent, reproducible methods that address the unique challenges of behavioral science research. Psychology's emphasis on effect size estimation, its diverse research designs ranging from randomized controlled trials to experience sampling studies, and the ongoing replication crisis make evidence synthesis both critically important and methodologically demanding.
The Replication Crisis and Why Systematic Reviews Matter More Than Ever
The Open Science Collaboration (2015) famously found that only 36% of replication attempts produced statistically significant results, fundamentally shaking confidence in individual study results. This finding, published in Science, transformed how the field approaches evidence. Individual studies, no matter how well designed, carry uncertainty. Systematic reviews and meta-analyses aggregate evidence across multiple studies, providing more reliable effect estimates and revealing whether findings hold across diverse samples, settings, and operationalizations.
The replication crisis also introduced important methodological concepts that systematic reviewers must understand. P-hacking (selective reporting of significant results), HARKing (hypothesizing after results are known), and publication bias toward statistically significant findings all inflate effect sizes in the published literature. A rigorous psychology systematic review must actively assess and correct for these distortions using tools such as funnel plots, Egger's regression, trim-and-fill analysis, and p-curve analysis.
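Egger's regression, mentioned above, is straightforward to implement: regress each study's standardized effect (effect divided by its standard error) on its precision (one over the standard error); an intercept that departs from zero suggests funnel plot asymmetry. The sketch below uses NumPy and hypothetical effect sizes and standard errors chosen to show the classic pattern of small studies reporting larger effects.

```python
import numpy as np

def egger_test(effects, ses):
    """Egger's regression test for funnel plot asymmetry.

    Regresses standardized effects (effect / SE) on precision (1 / SE);
    a non-zero intercept suggests small-study effects such as
    publication bias. Returns the intercept and its standard error.
    """
    effects = np.asarray(effects, dtype=float)
    ses = np.asarray(ses, dtype=float)
    y = effects / ses              # standardized effects (z-scores)
    x = 1.0 / ses                  # precision
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # OLS fit
    resid = y - X @ beta
    n, k = X.shape
    sigma2 = resid @ resid / (n - k)               # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[0], np.sqrt(cov[0, 0])             # intercept, its SE

# Hypothetical data: small studies (large SEs) report larger effects,
# the classic asymmetry pattern.
effects = [0.8, 0.6, 0.5, 0.35, 0.3]
ses = [0.40, 0.30, 0.20, 0.10, 0.05]
intercept, intercept_se = egger_test(effects, ses)
```

With this constructed dataset the intercept is clearly positive, flagging asymmetry; in practice you would compare the intercept to its standard error (a t-test with n − 2 degrees of freedom) before drawing conclusions.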
The American Psychological Association (APA) now strongly endorses systematic reviews and meta-analyses as the highest level of evidence for psychological practice guidelines. The APA's JARS-Quant and JARS-Qual reporting standards provide detailed guidance on what to include in quantitative and qualitative synthesis reports.
Searching the Psychology Literature Effectively
Psychology research is distributed across clinical, social, developmental, cognitive, organizational, and educational subfields, each with distinct publication outlets.
PsycINFO, produced by the American Psychological Association and accessed through its PsycNET platform (among others), is the essential psychology database. Its controlled vocabulary (Thesaurus of Psychological Index Terms) uses discipline-specific descriptors like "Cognitive Behavioral Therapy," "Self Efficacy," and "Emotional Regulation" that differ from MeSH terms in PubMed. Mapping between PsycINFO descriptors and MeSH is necessary when translating your search strategy across databases.
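To illustrate what that mapping looks like in practice, here is a hedged, illustrative translation of one concept across the two databases (the exact descriptor expansions and field tags should be verified against each platform's current thesaurus):

```
PsycINFO (Ovid syntax):
    exp Cognitive Behavior Therapy/ OR "cognitive behavio* therapy".ti,ab.

PubMed (MeSH syntax):
    "Cognitive Behavioral Therapy"[Mesh] OR "cognitive behavio* therapy"[tiab]
```

Note that the preferred index term differs between the two vocabularies, so simply copying a search string from one database to the other will silently miss indexed records.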
PubMed/MEDLINE captures clinical psychology and health psychology research published in medical journals. Embase adds European and pharmacological literature relevant to psychopharmacology reviews.
ERIC (Education Resources Information Center) is essential for educational psychology, school-based interventions, and learning science reviews. Sociological Abstracts and Social Science Citation Index capture social psychology research published outside traditional psychology journals.
ProQuest Dissertations and Theses is uniquely valuable in psychology because doctoral dissertations often contain null results not published in journals, providing a critical check against publication bias. Including dissertations can meaningfully change meta-analytic effect estimates.
Effect Sizes in Psychology: Choosing and Interpreting the Right Measure
Psychology relies heavily on effect size estimation as the primary currency of meta-analysis. The choice of effect size measure depends on the type of outcome and study design.
Cohen's d and Hedges' g are standard for comparing group means. Hedges' g corrects for small-sample bias and is preferred in meta-analysis. Jacob Cohen (1988) proposed benchmarks of 0.2 (small), 0.5 (medium), and 0.8 (large), but these should be interpreted in context, not as universal thresholds. A small effect on a large-scale public health intervention may be more practically significant than a large effect in a laboratory study.
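The small-sample correction behind Hedges' g is a simple multiplicative factor, J = 1 − 3/(4·df − 1), applied to Cohen's d. A minimal sketch, using hypothetical group means, SDs, and sample sizes:

```python
import math

def cohens_d(m1, m2, sd1, sd2, n1, n2):
    """Cohen's d: mean difference divided by the pooled SD."""
    s_pooled = math.sqrt(
        ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    )
    return (m1 - m2) / s_pooled

def hedges_g(d, n1, n2):
    """Small-sample correction (Hedges, 1981): g = J * d."""
    df = n1 + n2 - 2
    j = 1 - 3 / (4 * df - 1)
    return j * d

# Hypothetical two-group study:
# treatment M=24.5, SD=6.0, n=20; control M=20.0, SD=6.5, n=20
d = cohens_d(24.5, 20.0, 6.0, 6.5, 20, 20)
g = hedges_g(d, 20, 20)   # slightly smaller than d
```

Because J is always below 1, g is always a little smaller than d; the difference is negligible for large samples but matters for the small-n studies common in experimental psychology.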
Pearson's r (correlation coefficient) is the natural effect size for correlational studies, common in personality, social, and developmental psychology. Fisher's z transformation is recommended for meta-analytic pooling, with back-transformation for interpretation.
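The z-transform-pool-back-transform workflow can be sketched in a few lines. This is a fixed-effect pooling example with hypothetical correlations and sample sizes; each study's z is weighted by n − 3, the inverse of the sampling variance of z.

```python
import math

def fisher_z(r):
    """Fisher's z transformation: z = atanh(r)."""
    return math.atanh(r)

def pool_correlations(rs, ns):
    """Fixed-effect pooling of correlations on the z scale.

    Var(z) = 1 / (n - 3), so each study is weighted by n - 3;
    the weighted mean z is back-transformed with tanh.
    """
    weights = [n - 3 for n in ns]
    z_bar = sum(w * fisher_z(r) for w, r in zip(weights, rs)) / sum(weights)
    return math.tanh(z_bar)

# Hypothetical correlations from three studies
r_pooled = pool_correlations([0.30, 0.45, 0.25], [50, 120, 80])
```

A random-effects model, which adds a between-study variance component to each weight, is usually more defensible in psychology given the heterogeneity of samples and measures.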
Odds ratios and risk ratios appear in clinical psychology reviews evaluating treatment response (e.g., remission from depression). These are standard in clinical trials and can be converted to standardized mean differences when a review must pool binary and continuous outcomes together.
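One common conversion uses the logistic approximation d = ln(OR) · √3 / π (Chinn, 2000), which assumes the binary outcome arises from dichotomizing an underlying logistic distribution. A minimal sketch with a hypothetical odds ratio:

```python
import math

def log_or_to_d(odds_ratio):
    """Convert an odds ratio to an approximate Cohen's d via the
    logistic-distribution approximation: d = ln(OR) * sqrt(3) / pi.
    """
    return math.log(odds_ratio) * math.sqrt(3) / math.pi

# Hypothetical trial: OR = 2.5 for remission under treatment vs. control
d_equiv = log_or_to_d(2.5)
```

The approximation is reasonable for moderate odds ratios but degrades when the underlying event is very rare or very common, so report a sensitivity analysis if converted effects drive your pooled estimate.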
The critical issue in psychology is that published effect sizes are inflated compared to true population effects. Schaefer et al. (2019) estimated that the average published psychology effect size is roughly twice the true effect due to publication bias and questionable research practices. Meta-analysts must account for this through sensitivity analyses and bias-correction methods.