Compute approximate Bayes factors for one-sample t-tests, two-sample t-tests, correlations, and binomial proportions. Interpret evidence strength using Jeffreys' classification scale.
Test whether a sample mean differs from a hypothesized value. Uses the BIC approximation (Wagenmakers, 2007).
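The BIC approximation can be sketched as follows. For a one-sample t-test, exp(ΔBIC/2) reduces to a closed form in t and n; the function below is a minimal illustration under that assumption, not the calculator's exact implementation.

```python
import math

def bic_bf10_one_sample(t, n):
    """BIC-approximate Bayes factor for a one-sample t-test
    (Wagenmakers, 2007). Returns BF10, the evidence for H1 over H0."""
    nu = n - 1  # degrees of freedom
    # BF01 = exp(dBIC10 / 2) reduces to this closed form for the t-test
    bf01 = math.sqrt(n) * (1 + t**2 / nu) ** (-n / 2)
    return 1 / bf01

# Example: t(29) = 2.5 from a sample of n = 30
print(round(bic_bf10_one_sample(2.5, 30), 2))  # ≈ 3.41
```

Note that at t = 0 the formula gives BF10 = 1/√n, so larger samples that show no effect yield progressively stronger evidence for the null.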
Select the appropriate test: one-sample t-test, two-sample t-test, correlation, or binomial proportion.
Provide the summary statistics required for each test (e.g., mean, SD, n for t-tests, or r and n for correlations).
See BF10, BF01, and log10(BF10) along with the Jeffreys scale classification of evidence strength.
Copy the Bayes factor, interpretation, and all computed values to your clipboard for reporting.
Unlike p-values, which provide a binary significant/not-significant decision, Bayes factors give a continuous measure of evidence strength. A BF10 of 2.5 tells you the data are 2.5 times more likely under H1 than under H0. Jeffreys (1961) and Kass & Raftery (1995) provide guidelines for interpreting these values, but they are not rigid cutoffs.
A key advantage of Bayesian hypothesis testing is the ability to quantify evidence in favor of H0. A BF10 of 0.1 (equivalently, BF01 = 10) means the data are 10 times more likely under H0 than H1. P-values cannot provide evidence for the null; they can only fail to reject it.
The Bayes factor depends on the prior distribution assigned to parameters under H1. Different priors yield different Bayes factors. Sensitivity analysis (varying the prior width) is recommended to ensure conclusions are robust. The BIC approximation used here is relatively insensitive to prior specification.
In systematic reviews, reporting both p-values and Bayes factors provides a more complete picture. When p-values hover near 0.05 or results are inconclusive, Bayes factors can clarify whether the evidence genuinely supports an effect, genuinely supports the null, or is simply ambiguous.
An online Bayes factor calculator quantifies the relative evidence that observed data provide for one statistical hypothesis over another. Unlike p-values, which can only reject the null hypothesis, the Bayes factor (BF₁₀) expresses how many times more likely the data are under the alternative hypothesis than under the null. This distinction makes Bayesian hypothesis testing particularly valuable in systematic reviews: when a meta-analysis yields a non-significant p-value, researchers cannot distinguish between "no effect exists" and "insufficient evidence to detect an effect." The Bayes factor resolves this ambiguity by providing a continuous measure of evidential support in both directions, a property Dienes (2014) calls "the evidential advantage of Bayesian inference." Replication Bayes factors extend this logic further by quantifying whether a new study's data are consistent with the effect size reported in an original publication, providing a formal framework for assessing the replicability of published findings.
Bayes factor interpretation follows the scale proposed by Harold Jeffreys (1961) and refined by Kass and Raftery (1995). A BF₁₀ between 1 and 3 provides anecdotal evidence for the alternative hypothesis, 3–10 provides moderate evidence, 10–30 provides strong evidence, 30–100 provides very strong evidence, and values above 100 provide decisive evidence. Conversely, BF₁₀ values below 1 favor the null hypothesis on the same scale (BF₀₁ = 1/BF₁₀). In practice, a BF₁₀ of 0.1 means the data are 10 times more likely under H₀: strong evidence that no meaningful effect exists. This two-directional interpretation is critical for systematic reviews assessing treatment futility or equivalence. For nested model comparison, the Savage-Dickey density ratio offers an elegant computational shortcut: it evaluates the posterior density at the point null relative to the prior density, avoiding the need for marginal likelihood integration.
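The Savage-Dickey shortcut is easiest to see with a conjugate model. The sketch below tests a binomial proportion against θ = 0.5 under a uniform Beta(1, 1) prior: because the posterior is also a Beta distribution, BF₀₁ is simply the posterior density at 0.5 divided by the prior density at 0.5. The specific counts are illustrative.

```python
from scipy import stats

def savage_dickey_bf01(k, n, theta0=0.5):
    """BF01 for H0: theta = theta0 vs H1: theta ~ Beta(1, 1),
    via the Savage-Dickey density ratio (posterior / prior at theta0)."""
    prior = stats.beta(1, 1)
    posterior = stats.beta(1 + k, 1 + n - k)  # conjugate Beta update
    return posterior.pdf(theta0) / prior.pdf(theta0)

# 60 successes in 100 trials: is the coin fair?
print(round(savage_dickey_bf01(60, 100), 2))  # ≈ 1.1, essentially uninformative
```

With 50/100 successes the same function returns BF₀₁ of about 8, illustrating how Bayesian testing can accumulate evidence *for* the null.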
This Bayesian hypothesis testing tool supports multiple test types: one-sample and two-sample t-tests for comparing means, correlation tests for assessing relationships, and proportion tests for categorical outcomes. Each uses the default Cauchy prior (width r = √2/2 for effect sizes, as recommended by Rouder et al., 2009) but allows custom prior specification. The choice of prior matters: wider priors spread probability over larger effect sizes, making it harder for small effects to generate strong Bayes factors. Sensitivity analysis across multiple prior widths — examining how BF₁₀ changes as the prior scales from narrow (r = 0.5) to wide (r = 1.5) — provides a robustness check analogous to the leave-one-out sensitivity analysis used in frequentist meta-analysis. JASP, the open-source statistical software with built-in Bayes factor computation, automates this robustness analysis through its "BF robustness check" plot, making prior sensitivity accessible to researchers without programming experience. A key advantage of Bayesian inference is that it supports sequential updating — because Bayes factors do not depend on the stopping rule, researchers can accumulate evidence as new studies appear without inflating error rates, unlike sequential frequentist testing that requires alpha-spending corrections.
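The robustness check described above can be sketched numerically. The function below computes the JZS Bayes factor for a one-sample t-test with a Cauchy(0, r) prior on effect size (Rouder et al., 2009), using a standard one-dimensional integral representation, then varies the prior width r; the t and n values in the loop are illustrative, and this is a sketch rather than the calculator's implementation (which uses the BIC approximation).

```python
import numpy as np
from scipy import integrate

def jzs_bf10(t, n, r=np.sqrt(2) / 2):
    """JZS Bayes factor for a one-sample t-test (Rouder et al., 2009),
    with a Cauchy(0, r) prior on effect size delta under H1."""
    nu = n - 1
    # Marginal likelihood under H0 (up to a constant shared with H1)
    m0 = (1 + t**2 / nu) ** (-(nu + 1) / 2)
    # Under H1: delta | g ~ N(0, g), with g ~ InverseGamma(1/2, r^2/2),
    # which is equivalent to the Cauchy(0, r) prior after integrating out g
    def integrand(g):
        return ((1 + n * g) ** -0.5
                * (1 + t**2 / ((1 + n * g) * nu)) ** (-(nu + 1) / 2)
                * r / np.sqrt(2 * np.pi) * g ** -1.5 * np.exp(-r**2 / (2 * g)))
    m1, _ = integrate.quad(integrand, 0, np.inf)
    return m1 / m0

# Robustness check: how does BF10 move as the prior widens?
for r in (0.5, np.sqrt(2) / 2, 1.0, 1.5):
    print(f"r = {r:.3f}: BF10 = {jzs_bf10(t=2.5, n=30, r=r):.2f}")
```

As the loop shows, for a moderate effect a wider prior yields a smaller BF₁₀, because it spreads prior mass over large effect sizes the data do not support; this is exactly the pattern JASP's robustness plot visualizes.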
In the context of evidence synthesis, Bayes factors complement rather than replace traditional frequentist statistics. When your meta-analysis yields a pooled effect near the null, computing the Bayes factor for the pooled estimate helps distinguish between absence of evidence and evidence of absence — a distinction that the Cochrane Handbook (Higgins et al., 2023) acknowledges is impossible with p-values alone. Use our effect size calculator to compute standardized effect measures as inputs for Bayesian analysis, and our p-value to confidence interval converter when you need to reconstruct standard errors from published test statistics. For sample size planning, our power analysis calculator estimates the number of participants needed for adequate frequentist power — the Bayesian equivalent (design analysis) uses similar inputs but optimizes for expected Bayes factor rather than Type I error rate. For researchers conducting Bayesian meta-analysis, Röver (2020) provides a practical framework for specifying informative priors derived from historical data, improving precision when the number of studies is small.
A Bayes factor (BF10) quantifies the relative evidence provided by the data for one hypothesis over another. BF10 = 5 means the data are 5 times more likely under the alternative hypothesis (H1) than under the null hypothesis (H0). Conversely, BF01 = 1/BF10 quantifies evidence for H0. Unlike p-values, Bayes factors allow you to quantify evidence in favor of the null hypothesis, not just against it.
Harold Jeffreys proposed a widely used scale for interpreting Bayes factors: BF10 > 100 = Decisive, 30–100 = Very Strong, 10–30 = Strong, 3–10 = Substantial, 1–3 = Anecdotal evidence for H1, and the inverse ranges for evidence supporting H0. Some researchers use the modified Kass & Raftery (1995) scale with slightly different thresholds. These are guidelines, not strict cutoffs.
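The Jeffreys scale described above maps directly to a small classification helper; the thresholds below follow the ranges listed in this section, with evidence for H0 handled by inverting the Bayes factor.

```python
def classify_bf(bf10):
    """Label a Bayes factor on Jeffreys' (1961) scale.
    Evidence for H0 mirrors the H1 thresholds via BF01 = 1/BF10."""
    for_h0 = bf10 < 1
    bf = 1 / bf10 if for_h0 else bf10
    if bf > 100:
        label = "Decisive"
    elif bf > 30:
        label = "Very strong"
    elif bf > 10:
        label = "Strong"
    elif bf > 3:
        label = "Substantial"
    else:
        label = "Anecdotal"
    return f"{label} evidence for {'H0' if for_h0 else 'H1'}"

print(classify_bf(2.5))   # Anecdotal evidence for H1
print(classify_bf(0.05))  # Strong evidence for H0
```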
Bayes factors and p-values answer fundamentally different questions. A p-value is the probability of obtaining data as extreme or more extreme than observed, assuming H0 is true. A Bayes factor is the ratio of the probability of the data under H1 to the probability under H0. A significant p-value (e.g., p < 0.05) does not always correspond to strong Bayesian evidence, and vice versa. Bayes factors incorporate prior information and provide a continuous measure of evidence strength.
For t-tests, this calculator uses BIC-based approximations (Wagenmakers, 2007) that are relatively robust to prior specification. For the binomial test, it uses a uniform Beta(1,1) prior on the proportion. These are general-purpose defaults suitable for exploratory analysis. For confirmatory research or when strong prior information exists, consider using specialized software like JASP or the BayesFactor R package with informed priors.
Bayes factors are particularly useful in systematic reviews when: (1) you want to distinguish between “no evidence of an effect” and “evidence of no effect”; (2) sequential analysis is needed as studies accumulate; (3) you want to incorporate prior evidence from earlier reviews; (4) traditional null hypothesis testing yields ambiguous results near the significance threshold. Bayesian meta-analysis is increasingly recommended by Cochrane and other organizations as a complement to frequentist approaches.
A p-value measures the probability of observing data at least as extreme as the result, assuming the null hypothesis is true. A Bayes factor quantifies the relative evidence for one hypothesis over another, without assuming either is true. Unlike p-values, Bayes factors can provide evidence in favor of the null hypothesis and are not affected by optional stopping.
A Bayes factor of 1 means the data are equally likely under both the null and alternative hypotheses — the evidence is completely uninformative. BF > 3 provides moderate evidence for the alternative; BF > 10 provides strong evidence. BF < 1/3 provides moderate evidence for the null. Values between 1/3 and 3 are considered inconclusive (Jeffreys, 1961).
Approximate conversions exist but are problematic because p-values and Bayes factors measure fundamentally different things. The "minimum Bayes factor" bound (Sellke, Bayarri & Berger, 2001) shows that for p < 1/e, the Bayes factor in favor of H0 satisfies BF01 ≥ –e × p × ln(p), or equivalently BF10 ≤ –1/(e × p × ln(p)). Thus p = 0.05 corresponds to a maximum BF10 of only about 2.5, far weaker evidence than commonly assumed.
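The bound is a one-liner to evaluate; the sketch below applies the Sellke, Bayarri & Berger formula to a few conventional p-values.

```python
import math

def max_bf10_from_p(p):
    """Upper bound on BF10 implied by a p-value
    (Sellke, Bayarri & Berger, 2001); valid only for p < 1/e."""
    if not 0 < p < 1 / math.e:
        raise ValueError("bound holds only for 0 < p < 1/e")
    return -1 / (math.e * p * math.log(p))

print(round(max_bf10_from_p(0.05), 2))   # ≈ 2.46: far short of "strong" evidence
print(round(max_bf10_from_p(0.005), 2))  # even p = 0.005 caps BF10 below 14
```

This is why a "just significant" result near p = 0.05 can at best constitute anecdotal-to-substantial Bayesian evidence against the null.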
Our biostatisticians can perform Bayesian meta-analyses with informative priors, model comparison, and full sensitivity analyses for your systematic review.