Required sample sizes for Bayesian analysis
What is this about?
For frequentist hypothesis testing, several power calculators (e.g. G*Power1, powerandsamplesize.com2) are available to determine the minimum sample size needed to detect an effect with a given probability (statistical power). No such calculators are available (to my knowledge) for the Bayesian versions of t-tests, ANOVA, and linear correlations returning a Bayes factor.3 Calculating statistical power for these techniques for given sample sizes cannot be done analytically4 but can be done using simulation, and to save everyone a lot of time I will summarize my findings below.
Minimum sample sizes needed
Bayesian t-test
The analyses below show the minimum number of subjects needed per condition as a function of the true effect size in the population. This is based on a Bayes factor threshold of 3, a statistical power of 80% (i.e. you have an 80% probability of finding a BF10 > 3), a noninformative Jeffreys prior placed on the variance of the normal population, and a Cauchy prior (JZS) placed on the standardized effect size with scale \(\sqrt{2}/2\).5 Note that these sample sizes are somewhat more conservative than a standard (frequentist) t-test.6 In the rightmost column you can find the required sample size for a standard (frequentist) t-test to reject the null hypothesis using a desired significance level of .05.
True effect size7 | Cohen’s d | Bayesian sample size | Frequentist sample size |
---|---|---|---|
0.00 (null) | 1028 | N/A | |
Very small | 0.01 | ~425,000 | 156,656 |
0.10 | 3,050 | 1,567 | |
Small | 0.20 | 667 | 392 |
0.30 | 275 | 175 | |
0.40 | 147 | 98 | |
Medium | 0.50 | 91 | 63 |
0.60 | 62 | 44 | |
0.70 | 45 | 32 | |
Large | 0.80 | 34 | 25 |
0.90 | 27 | 20 | |
1.00 | 22 | 16 | |
Very large | 1.20 | 16 | 11 |
Huge | 2.00 | 7 | 4 |
Bayesian test for linear correlation
The analyses below show the minimum number of paired observations (data points) needed as a function of the true correlation \(\rho\) in the population. This is based on a Bayes factor threshold of 3, a statistical power of 80% (i.e. you have an 80% probability of finding a BF10 > 3), noninformative priors assumed for the population means and variances of the two populations, and a \(\operatorname{Beta}(3,3)\) prior distribution is assumed for \(\rho\).5
True effect size9 | \(\rho\) | Bayesian sample size | Frequentist sample size |
---|---|---|---|
0.00 (null) | 2458 | N/A | |
Small | 0.10 | 1,300 | 782 |
0.20 | 273 | 193 | |
Medium | 0.30 | 109 | 84 |
0.40 | 54 | 46 | |
Large | 0.50 | 32 | 29 |
0.60 | 21 | 19 | |
0.70 | 15 | 13 | |
0.80 | 11 | 9 | |
0.90 | 8 | 6 |
Software used
Simulations and analyses were performed using Richard D. Morey’s BayesFactor package and the cluster computing snowfall package for R.
-
https://www.psychologie.hhu.de/arbeitsgruppen/allgemeine-psychologie-und-arbeitspsychologie/gpower ↩
-
Note that the usefulness of Bayes factors is subject of discussion. Some authors (e.g. Kruschke) argue that precise description of posterior distributions is a better idea. ↩
-
Some would go so far as to argue that statistical power in the sense used in frequentist statistics is meaningless in a Bayesian framework. ↩
-
These are standard, noninformative priors. If you have a reasonable alternative prior, or test against a null interval instead of a point null you could improve power. ↩ ↩2
-
See Rouder, J.N., Speckman, P.L., Sun, D., et al. (2009). Bayesian t tests for accepting and rejecting the null hypothesis. Psychonomic Bulletin & Review, 16, 225–237. doi:10.3758/PBR.16.2.225 ↩
-
Labels taken from Sawilowsky, S. (2009). New effect size rules of thumb. Journal of Modern Applied Statistical Methods, 8, 467–474. doi:10.22237/jmasm/1257035100 ↩
-
This concerns the sample size needed for BF01 > 3 in the case where the true effect size is zero. ↩ ↩2
-
Labels taken from Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.) ↩