Evaluate whether a set of statistically significant findings contains evidential value or shows signs of p-hacking using p-curve analysis (Simonsohn et al., 2014). Enter p-values directly or compute them from t, F, chi-squared, or z statistics. Run binomial and continuous right-skew tests, a flatness test against 33% power, and visualize the p-value distribution with an interactive D3.js histogram. Import CSV/Excel data, auto-generate a methods paragraph, and export R code for the dmetar package.
Enter p-values directly (one per row), or switch to t, F, chi-squared, or z input mode. Each test statistic is automatically converted to a two-tailed p-value. Import from CSV/Excel or paste spreadsheet data.
Click Analyze to filter significant p-values (p < 0.05) and run all three tests: the binomial right-skew test, the Stouffer continuous right-skew test, and the flatness test against 33% power.
The histogram shows the distribution of significant p-values across five equal-width bins (0 to 0.01 through 0.04 to 0.05). Compare the observed distribution against the flat null expectation and the 33% power curve.
A significant right-skew test means the findings contain evidential value. A significant flatness test means the evidential value is inadequate. The overall conclusion integrates both results.
Copy a publication-ready methods paragraph summarizing the p-curve analysis. Export reproducible R code for the dmetar package that runs the full analysis in RStudio.
Download the p-curve plot as a high-resolution PNG. Export the results table as CSV or Excel. Copy test statistics for your manuscript.
Need this done professionally? Get a complete systematic review or meta-analysis handled end-to-end.
When the underlying effect is genuine, statistically significant p-values cluster near zero rather than spreading uniformly. A significantly right-skewed p-curve provides strong evidence that the set of findings reflects true effects rather than noise or p-hacking.
A p-curve that is flat (uniform) or left-skewed (concentrated near 0.05) suggests that the significant results were not generated by real effects. This pattern is consistent with p-hacking, selective reporting, or chance alone.
Funnel plots and Egger's test examine the relationship between effect sizes and precision. P-curve takes a different approach by focusing exclusively on the distribution of significant p-values. Using both methods together provides a more thorough assessment of evidential integrity.
P-curve analysis uses only p-values below 0.05 from the set of studies. Non-significant results are excluded because the method is specifically designed to evaluate the distribution pattern among significant findings.
The flatness test compares the observed p-curve against the expected distribution under 33% statistical power. This benchmark was chosen by Simonsohn et al. because it represents the minimum level of power that would still produce a meaningfully right-skewed distribution of significant p-values.
Best practice is to report both the right-skew test (evidence of real effects) and the flatness test (evidence that the evidential value is inadequate) in your manuscript. This dual reporting gives readers a complete picture of the evidential value of the included studies.
P-curve analysis was introduced by Simonsohn, Nelson, and Simmons (2014) as a diagnostic tool for evaluating the evidential value of a set of statistically significant findings. The method addresses a fundamental question in meta-analysis: do the reported significant results reflect genuine underlying effects, or could they be the product of selective reporting and p-hacking? Traditional publication bias methods like funnel plots and regression-based tests (Egger et al., 1997; Begg and Mazumdar, 1994) focus on the relationship between effect sizes and precision. P-curve offers a complementary lens by examining the shape of the distribution of significant p-values themselves.
The core insight behind p-curve is straightforward: when a true effect exists and studies have adequate power, the distribution of significant p-values (those below 0.05) should be right-skewed, with most p-values clustering near zero. Under the null hypothesis of no effect, significant p-values follow a uniform distribution between 0 and 0.05. When researchers engage in p-hacking, exploiting researcher degrees of freedom to push p-values just below 0.05, the distribution becomes left-skewed with a concentration of values near the significance threshold.
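To make this concrete, the short R simulation below (with an arbitrary effect size and per-group sample size chosen purely for illustration) draws two-sample t-tests under the null and under a true effect, then bins the significant p-values into the same five bins the tool plots. Under the null every bin holds roughly 20% of the significant p-values; under a true effect the mass piles up below 0.01.

```r
# Illustrative simulation: shape of the p-curve under the null vs. a true effect.
# The effect size (d = 0.5) and n = 50 per group are arbitrary illustrative choices.
set.seed(42)

sim_p <- function(d, n, reps = 10000) {
  replicate(reps, {
    x <- rnorm(n)            # control group
    y <- rnorm(n, mean = d)  # treatment group
    t.test(x, y)$p.value     # two-tailed Welch t-test
  })
}

bin_pcurve <- function(p) {
  sig <- p[p < 0.05]  # p-curve uses only the significant results
  prop.table(table(cut(sig, breaks = seq(0, 0.05, by = 0.01))))
}

bin_pcurve(sim_p(d = 0,   n = 50))  # null: roughly 20% per bin (flat)
bin_pcurve(sim_p(d = 0.5, n = 50))  # true effect: mass concentrates below 0.01
```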
This tool implements three complementary tests from the p-curve framework. The binomial right-skew test checks whether more than 50% of significant p-values fall below 0.025. Under the null hypothesis, exactly 50% should fall in each half, so a significant excess in the lower half indicates right-skew. The continuous right-skew test uses Stouffer's method to combine evidence across all p-values. Each significant p-value is transformed to a uniform scale (pp = p / 0.05), then converted to a z-score via the inverse normal distribution. The combined Stouffer Z-statistic follows a standard normal distribution under the null, providing a continuous measure of right-skew. The flatness test evaluates whether the observed p-curve is flatter than expected under 33% statistical power. If the flatness test is significant, the evidential value is deemed inadequate, suggesting the significant findings may not reflect true underlying effects.
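As a minimal base-R sketch of the two right-skew tests described above (the example p-values and variable names are our own; the official p-curve app also works from raw test statistics):

```r
# Right-skew tests on a vector of p-values: a minimal base-R sketch.
# `pvals` is hypothetical input; only values below .05 enter the analysis.
pvals <- c(0.001, 0.004, 0.012, 0.019, 0.030, 0.041, 0.002, 0.008)
sig   <- pvals[pvals < 0.05]

# 1) Binomial right-skew test: under the null, significant p-values are
#    uniform on (0, .05), so 50% are expected to fall below .025.
binom.test(sum(sig < 0.025), length(sig), p = 0.5, alternative = "greater")

# 2) Continuous right-skew test (Stouffer): rescale each significant p-value
#    to a uniform pp-value, convert to a z-score, and combine.
pp <- sig / 0.05                      # uniform on (0, 1) under the null
z  <- qnorm(pp)                       # strongly negative when p-values are tiny
z_stouffer <- sum(z) / sqrt(length(z))
p_rightskew <- pnorm(z_stouffer)      # small value => significant right-skew
z_stouffer; p_rightskew
```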
Many published studies report test statistics rather than exact p-values. This tool supports direct conversion from t-statistics (with degrees of freedom), F-statistics (with numerator and denominator degrees of freedom), chi-squared statistics (with degrees of freedom), and z-statistics. All conversions use two-tailed p-values to maintain consistency. You can also import data from CSV or Excel files for batch processing, or paste tab-separated data directly from a spreadsheet.
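The conversions themselves are one-liners in base R; the statistic values below are placeholders. F and chi-squared statistics are already direction-free, so their upper-tail probability corresponds to the two-tailed test of the underlying effect:

```r
# Two-tailed p-values from common test statistics (base R; values are placeholders).
2 * pt(abs(2.31), df = 28, lower.tail = FALSE)    # t(28) = 2.31
pf(5.62, df1 = 1, df2 = 84, lower.tail = FALSE)   # F(1, 84) = 5.62 (upper tail)
pchisq(7.90, df = 1, lower.tail = FALSE)          # chi-squared(1) = 7.90 (upper tail)
2 * pnorm(abs(2.17), lower.tail = FALSE)          # z = 2.17
```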
P-curve analysis works best when combined with other methods for evaluating the integrity of a body of evidence. Use our funnel plot and publication bias tool for visual inspection and formal tests of funnel plot asymmetry (Egger's test, Begg's test, trim-and-fill). Visualize individual study estimates with the forest plot generator to assess the overall pattern of results. Calculate individual study effect sizes with the effect size calculator before conducting your meta-analysis. Test the stability of your pooled estimate with the leave-one-out sensitivity analysis tool.
After running the analysis, the tool generates a publication-ready methods paragraph that reports the number of significant p-values analyzed, the results of the right-skew and flatness tests with exact p-values, and the overall conclusion about evidential value. For full reproducibility, the R code generator produces a script using the dmetar package (Harrer et al., 2021), which includes the pcurve() function for comprehensive p-curve analysis. The generated code includes your p-values and is ready to paste into RStudio.
Important caveats apply to p-curve analysis. The method requires a sufficient number of significant p-values (ideally 20 or more) for reliable inference. P-curve cannot distinguish between genuine effects with low power and effects inflated by p-hacking when sample sizes are very small. The method assumes that the selected studies represent a meaningful set of tests of the same or similar hypotheses. Mixing studies testing fundamentally different hypotheses can distort the p-curve shape. Always interpret p-curve results alongside visual inspection of the histogram and in the context of the broader evidence.
P-curve analysis (Simonsohn, Nelson, and Simmons, 2014) is a method for evaluating whether a set of statistically significant findings contains evidential value or shows signs of p-hacking. It examines the distribution of significant p-values (those below 0.05) from a collection of studies. If the effects studied are real, significant p-values should cluster near zero (right-skewed distribution). If there is no real effect, or if researchers have engaged in p-hacking, the distribution should be flat or left-skewed (clustered near 0.05).
The right-skew test evaluates whether the distribution of significant p-values is right-skewed, meaning there are more very small p-values than would be expected under the null hypothesis. This tool computes two versions: (1) a binomial test checking whether more than 50% of significant p-values fall below 0.025, and (2) a continuous test using Stouffer's method, which transforms each p-value to a uniform scale and then combines z-scores. A significant right-skew test (p < 0.05) indicates the set of findings contains evidential value, meaning the underlying effects are likely real.
The flatness test (also called the test for inadequate evidential value) evaluates whether the p-curve is flatter than would be expected if the studies had 33% statistical power. A flat or left-skewed p-curve suggests that the significant results may have been obtained through selective reporting, p-hacking, or other questionable research practices rather than genuine effects. If the flatness test is significant (p < 0.05), this indicates that the evidential value in the set of studies is inadequate.
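For intuition about the mechanics, here is a simplified base-R sketch of the flatness test for the special case where every result is (or has been converted to) a z statistic. The published method derives a separate 33%-power noncentrality parameter for each test family, so treat this as a didactic approximation, not the exact computation the tool performs:

```r
# Flatness test sketch for z statistics only (didactic approximation).
pvals <- c(0.012, 0.019, 0.030, 0.041, 0.024)   # hypothetical significant p-values
z_obs <- qnorm(1 - pvals / 2)                   # two-tailed p -> |z|

# Noncentrality that gives exactly 33% power at the two-tailed .05 threshold:
# P(Z > 1.96 | mean = ncp33) = 1/3.
ncp33 <- qnorm(0.975) - qnorm(2/3)              # ~1.53

# Conditional on significance, pp33 is uniform on (0, 1) when power is 33%;
# p-values near .05 (a flat curve) yield pp33 values near 0.
pp33 <- 3 * (pnorm(z_obs, mean = ncp33) - 2/3)
pp33 <- pmin(pmax(pp33, 1e-10), 1 - 1e-10)      # numerical guard

z_flat <- sum(qnorm(pp33)) / sqrt(length(pp33)) # Stouffer combination
p_flat <- pnorm(z_flat)                         # small => flatter than 33% power
z_flat; p_flat
```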
You can enter p-values directly (one per row), or provide test statistics that the tool will convert to p-values. Supported test statistics include t-statistics with degrees of freedom, F-statistics with numerator and denominator degrees of freedom, chi-squared statistics with degrees of freedom, and z-statistics. You can also import data from a CSV or Excel file using the drag-and-drop uploader.
Simonsohn et al. (2014) recommend a minimum of approximately 20 statistically significant p-values for reliable p-curve analysis. With fewer studies, the binomial and continuous tests may lack sufficient statistical power to detect right-skew or flatness. However, even with as few as 5 to 10 significant p-values, the p-curve histogram can provide useful visual information about the distribution pattern.
Funnel plots and Egger's regression test detect publication bias by examining the relationship between effect sizes and their precision (standard errors). P-curve takes a fundamentally different approach: it examines only statistically significant p-values and tests whether they are distributed in a way consistent with real effects. P-curve can detect p-hacking and selective reporting even when traditional publication bias tests show no asymmetry. The two approaches are complementary, and combining them provides a more complete picture of the integrity of a body of evidence.
Yes. After running the analysis, you can copy a ready-to-run R script that uses the dmetar package (Harrer et al., 2021) for p-curve analysis. The generated code includes your p-values and calls the pcurve() function, which produces the p-curve plot and all statistical tests. You can paste the code directly into RStudio for full reproducibility.
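As a rough sketch of what such a script looks like (the data frame is hypothetical, and the code the tool generates may differ in detail), note that dmetar::pcurve() expects a meta-analysis object created with the meta package:

```r
# Minimal sketch of a dmetar p-curve script (hypothetical data).
# install.packages("meta"); dmetar is installed from GitHub (see Harrer et al., 2021).
library(meta)
library(dmetar)

dat <- data.frame(
  study = paste("Study", 1:5),
  TE    = c(0.42, 0.35, 0.51, 0.28, 0.44),   # hypothetical effect sizes
  seTE  = c(0.15, 0.12, 0.18, 0.11, 0.16)    # hypothetical standard errors
)

m <- metagen(TE = TE, seTE = seTE, studlab = study, data = dat, sm = "SMD")
pcurve(m)  # p-curve plot plus right-skew and flatness tests
```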
Detect publication bias with funnel plots and formal asymmetry tests using our funnel plot and publication bias tool. Visualize individual study estimates with our forest plot generator for meta-analysis. Calculate individual study effect sizes before your analysis with our effect size calculator for SMD, OR, and RR.
Reviewed by
Dr. Sarah Mitchell holds a PhD in Biostatistics from Johns Hopkins Bloomberg School of Public Health and has over 15 years of experience in systematic review methodology and meta-analysis. She has authored or co-authored 40+ peer-reviewed publications in journals including the Journal of Clinical Epidemiology, BMC Medical Research Methodology, and Research Synthesis Methods. A former Cochrane Review Group statistician and current editorial board member of Systematic Reviews, Dr. Mitchell has supervised 200+ evidence synthesis projects across clinical medicine, public health, and social sciences. She reviews all Research Gold tools to ensure statistical accuracy and compliance with Cochrane Handbook and PRISMA 2020 standards.
Whether you have data that needs writing up, a thesis deadline approaching, or a full study to run from scratch, we handle it. Average turnaround: 2-4 weeks.