I do not know about you, but when I look at sample size determination formulae, they look like gibberish. It is a case of “the more you look, the less you see”! I signed up for the role of a clinician, not a statistician; yet reviewers of research proposals have a knack for insisting that I include a sample size calculation formula in my research proposal, that the formula be appropriate and that the calculation be error free. A number of times, I must confess, I have so frustrated reviewers on the sample size issue that they told me in annoyance (so I perceived) to consult a statistician!

Eventually, the experience forced me to fire three questions at published statisticians: Why use a sample size calculation formula? How does one select the right formula from the available array? And how can one ensure that the calculation is error free?

Why use a sample size calculation formula?
I am thirsty. I have a 30-liter bucket of table water at my disposal. How much of the water should I drink? 
“Drink as much as satisfies you,” you say. 
Good answer.
“How much will that be?” I ask, trying to elicit a precise figure from you.
You scratch your head and conclude that you cannot give an exact figure unless you derive a formula that includes the variables you consider determinants of how much water will satisfy my thirst. Some of the determinants you may think of (assuming all are measurable) are my level of thirst, my stomach capacity, my level of greed, the environmental temperature, whether I am hungry at the same time and have a plate of rice set before me, the past record of how much I drank in similar conditions, and so on. With everything factored into the equation, you may come up with a figure. You and I know that the figure is, at best, approximate: in reality, a little more or a little less may be needed to satisfy my thirst at that point in time.

In a similar way, statisticians (and mathematicians) have derived formulae to determine how many units (the sample size) need to be selected from all available units (the population) to satisfy something similar to thirst, which, in research terms, is called statistical “power”. Power is the ability of a study’s statistical test to detect an effect (Fitzner & Heckinger, 2010). In other words, it is the probability that a statistical test will yield a statistically significant difference when there is a true difference (Stokes, 2014). In plain English, if the test in a study has a power of 100%, it will detect a significant difference between the effects of two variables in the study every time there is a true difference between the effects. If the power is 90%, the test will detect the difference nine times out of ten when there is a true difference. You can imagine what a power of 80%, 50% or even 0% means. The scientific community does not insist that the power of a statistical test in a study be 100%; instead, a power of 80% or 90% is acceptable (Dell, Holleran, & Ramakrishnan, 2002; Fitzner & Heckinger, 2010). Since, for a given effect size, variability and significance level, statistical power is driven mainly by the sample size, the scientific community recommends that the minimum sample size for any study should give a power of 80% or 90%. The purpose of a sample size formula is to give a verifiable minimum sample size that achieves the pre-specified statistical power. Studies that lack power and sample size analyses are read and interpreted with caution by the scientific community (Fitzner & Heckinger, 2010).
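
To make the link between power and sample size concrete, here is a minimal Python sketch. It is my own illustration, not taken from the cited papers. It assumes a two-sided z-test comparing the means of two independent groups of equal size with a known common standard deviation; the function name power_two_means and the example numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_means(n_per_group, delta, sd, alpha=0.05):
    """Approximate power of a two-sided z-test for a difference in means.

    Assumes two independent groups of equal size and a known common
    standard deviation (an illustrative simplification).
    """
    z = NormalDist()                       # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)      # critical value for two-sided alpha
    se = sd * sqrt(2 / n_per_group)        # standard error of the difference
    shift = abs(delta) / se                # true difference in standard-error units
    # Probability that the test statistic lands in either rejection region
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical inputs: true difference of 0.5 units, standard deviation of 1
for n in (20, 64, 100):
    print(n, round(power_two_means(n, delta=0.5, sd=1.0), 2))
```

Under these assumed inputs, power climbs as the number per group climbs, which is why a sample size formula solves for the smallest number that reaches 80% or 90%.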

Basis for selecting a sample size formula (SSF) that gives acceptable power
A researcher’s life would be quite easy if there were a one-size-fits-all formula to calculate the minimum sample size for all studies. This is not so (Charan & Biswas, 2013). There is a myriad of formulae, and a researcher has to decide on the one suitable for the study. Should the decision be based on “abracadabra”? Definitely not. Should it be based on the study type? Should it be based on the study design (by the way, study type is an aspect of study design)? Should it be based on the objective or hypothesis of the study? Should it be based on the statistical procedure intended to test the hypothesis or to achieve the objective of the study? Yes, yes, yes and yes.

The first consideration in selecting the right SSF, but by no means the only one, is the statistical procedure used to test a hypothesis. Every study has at least one hypothesis, which may be implicit or explicit. An implicit hypothesis is not stated but implied; for example, in a descriptive cross-sectional survey to determine the mean age of a population by calculating the mean age of a sample, the implicit null hypothesis is that there is no significant difference between the mean age of the sample and that of the population. The SSF for a study to determine a mean in a population is different from the SSF for a proportion in the same study and the same population (Charan & Biswas, 2013). If the proposed study has multiple hypotheses, the statistical procedure that tests the primary hypothesis should be used as the basis for the sample size formula (Guo & Pandis, 2015; ICH, 1998). Other considerations include the study type, study design and data type. The SSF for comparing means in a case-control study is different from that for an intervention study (Charan & Biswas, 2013). The SSFs for repeated measures and independent measures designs are different. The SSF for a two-arm randomized controlled trial (RCT) depends on the data type (continuous or dichotomous) and the type of test (non-inferiority, equivalence or superiority) (Zhong, 2009).

Statistical procedures and sample size formulae
The following is a list of commonly used SSFs. You can check the corresponding references to learn more about them. If you are a clinician caught up in research who is not interested in complex calculations, you may skip the list and go straight to the next section on software for calculating sample size.

Proportion, descriptive survey (Charan & Biswas, 2013)
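
A commonly used form of this formula is n = Z² × p × (1 − p) / d², where Z is the standard normal value for the chosen confidence level, p is the expected proportion and d is the absolute margin of error. I believe this corresponds to the form in the cited reference, but treat that as an assumption and check the paper. A minimal Python sketch (the function name and the example values are hypothetical):

```python
from math import ceil
from statistics import NormalDist


def sample_size_proportion(p, d, confidence=0.95):
    """Sample size to estimate a single proportion in a descriptive survey.

    p: expected proportion (0 to 1); d: absolute margin of error.
    Assumes a large population and simple random sampling.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    return ceil(z ** 2 * p * (1 - p) / d ** 2)          # round up to whole units


# Hypothetical inputs: expected proportion 50%, margin of error 5%
print(sample_size_proportion(p=0.5, d=0.05))
```

Using p = 0.5 when the expected proportion is unknown gives the largest, and therefore the most conservative, sample size for a given margin of error.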



Comparing proportions, case-control study with two independent group design (Charan & Biswas, 2013)
Note: Sample size = sample size per group

Comparing proportions, Cohort study with two independent group design (Charan & Biswas, 2013)
Note: Sample size = sample size per group

Comparing proportions, intervention studies with two independent group design (Charan & Biswas, 2013)
Note: Sample size = sample size per group
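
For two independent groups of equal size, one widely used form is n = (Zα/2 + Zβ)² × [p1(1 − p1) + p2(1 − p2)] / (p1 − p2)² per group. I am assuming this is the form intended for intervention studies in the cited reference; please check the paper before relying on it. A minimal Python sketch (the function name and the example values are hypothetical):

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size to compare two independent proportions.

    Assumes equal group sizes and a two-sided test at level alpha.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided significance level
    z_beta = z.inv_cdf(power)            # power = 1 - beta
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2)


# Hypothetical inputs: 50% of events in one group against 30% in the other
print(sample_size_two_proportions(p1=0.5, p2=0.3))
```

The result is the number needed in each group, so the total for the study is twice that figure.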

Testing a proportion using the weighted sign test when binary observations are dependent within a cluster - equal weights to observations (Ahn, Hu, & Schucany, 2011)

Testing a proportion using the weighted sign test when binary observations are dependent within a cluster - equal weights to clusters (Ahn, Hu, & Schucany, 2011)

Testing a proportion using the weighted sign test when binary observations are dependent within a cluster - optimal weights that minimize the variance of the estimator (Ahn, Hu, & Schucany, 2011)

Mean of a sample from a large population (Kamangar & Islami, 2013)
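
A commonly used form for estimating a single mean is n = Z² × SD² / d², where SD is the expected standard deviation and d is the acceptable absolute error in the estimated mean. I am assuming this matches the cited reference. A minimal Python sketch (the function name and the example values are hypothetical):

```python
from math import ceil
from statistics import NormalDist


def sample_size_mean(sd, d, confidence=0.95):
    """Sample size to estimate a single mean from a large population.

    sd: expected standard deviation; d: absolute margin of error,
    in the same units as sd.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * sd ** 2 / d ** 2)


# Hypothetical inputs: standard deviation of 15 units, margin of error of 3 units
print(sample_size_mean(sd=15, d=3))
```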

Comparing means, cross-sectional study with two independent samples (Kamangar & Islami, 2013)

Comparing means, case-control study with two independent group design (Charan & Biswas, 2013)
Note: Sample size = sample size per group

Comparing means, intervention studies with two independent group design (Charan & Biswas, 2013)
Note: Sample size = sample size per group
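
For two independent groups of equal size, a widely used form is n = 2 × SD² × (Zα/2 + Zβ)² / d² per group, where d is the difference in means that the study should be able to detect. I am assuming this is the form intended in the cited reference. A minimal Python sketch (the function name and the example values are hypothetical):

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_means(sd, d, alpha=0.05, power=0.80):
    """Per-group sample size to compare the means of two independent groups.

    sd: common standard deviation; d: difference in means to detect.
    Assumes equal group sizes and a two-sided test at level alpha.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * sd ** 2 * (z_alpha + z_beta) ** 2 / d ** 2)


# Hypothetical inputs: standard deviation of 10 units, difference of 5 units
print(sample_size_two_means(sd=10, d=5))
```

Halving the difference to be detected roughly quadruples the required sample size, because d is squared in the denominator.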

SPn and Vn Log-rank tests used in case-cohort study (Cai & Zeng, 2004)


Paired right-censored data based on the difference of Kaplan-Meier estimates (Su, Li, & Shyr, 2014).
Note: applicable to 2-sided alpha

Treatment effect on multivariate event times (Chen, Ibrahim, & Chu, 2014)


Statistical method using dichotomous variables in non-inferiority RCT that has two comparison groups, both having same sample size (Zhong, 2009)

Statistical method using continuous variables in non-inferiority RCT that has two comparison groups, both having same sample size (Zhong, 2009)
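
As I recall them from the cited paper, the two non-inferiority formulae have the form N = 2 × [(z(1−α) + z(1−β)) / δ0]² × s² for a continuous variable and N = 2 × [(z(1−α) + z(1−β)) / δ0]² × p × (1 − p) for a dichotomous variable, where δ0 is the non-inferiority margin and N is the size of each group. Treat this as an assumption and confirm it against the paper. A minimal Python sketch (the function names and the example values are hypothetical):

```python
from math import ceil
from statistics import NormalDist


def _z_term(alpha, power, margin):
    """Shared factor: ((z_(1-alpha) + z_(1-beta)) / margin) squared."""
    z = NormalDist()
    return ((z.inv_cdf(1 - alpha) + z.inv_cdf(power)) / margin) ** 2


def noninferiority_continuous(sd, margin, alpha=0.025, power=0.90):
    """Per-group size for a continuous outcome (one-sided alpha)."""
    return ceil(2 * _z_term(alpha, power, margin) * sd ** 2)


def noninferiority_dichotomous(p, margin, alpha=0.025, power=0.90):
    """Per-group size for a dichotomous outcome with expected proportion p."""
    return ceil(2 * _z_term(alpha, power, margin) * p * (1 - p))


# Hypothetical inputs
print(noninferiority_continuous(sd=10, margin=5))
print(noninferiority_dichotomous(p=0.7, margin=0.1))
```

Note that alpha here is one-sided, which is why the sketch defaults to 0.025 instead of 0.05.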

Statistical method using dichotomous variables in equivalence RCT that has two comparison groups, both having same sample size (Zhong, 2009)

Statistical method using continuous variables in equivalence RCT that has two comparison groups, both having same sample size (Zhong, 2009)

Statistical method using dichotomous variables in statistical superiority RCT that has two comparison groups, both having same sample size (Zhong, 2009)

Statistical method using continuous variables in statistical superiority RCT that has two comparison groups, both having same sample size (Zhong, 2009)

Statistical method using dichotomous variables in clinical superiority RCT that has two comparison groups, both having same sample size (Zhong, 2009)

Statistical method using continuous variables in clinical superiority RCT that has two comparison groups, both having same sample size (Zhong, 2009)

Sample size formula for time-averaged difference to allow for missing data, general correlation structures, and unequal sample sizes between study groups (Zhang & Ahn, 2012)

Comparing effect of intervention in (a two-group) randomized controlled trial (Kamangar & Islami, 2013)
Note: n = sample size for each group

Repeated measure ANOVA, as in multiple post-randomization measurements design (Morgan & Case, 2013)

 

Sample size calculation software
Simple sample size calculations can be done manually with the given formulae, but for complex formulae software should be used (Charan & Biswas, 2013). I do not know whether you got to this section after going through the preceding list of formulae or skipped the list. Whatever the case, to use any of the software packages appropriately, you need to understand the basic terms used in the respective SSF, such as confidence interval, effect size, level of significance, standard deviation, α error and β error. You may find explanations of these terms in the software’s documentation or elsewhere. When you use any of the software in your work, ensure that you reference both the software and the SSF it uses.

Free installable

Free online

 

Name: OpenEpi
Website: http://www.openepi.com/Menu/OE_Menu.htm
Citation: Dean AG, Sullivan KM, Soe MM. OpenEpi: Open Source Epidemiologic Statistics for Public Health, Version. www.OpenEpi.com, updated 2015/05/04, accessed 2015/09/09.

SN | Test method / design | Link | Formula stated | Formula referenced
1 | Proportion in a population | Access link | Yes | Yes
2 | Unmatched case-control study (comparing proportions) | Access link | Yes | Yes
3 | Cohort / RCT / Cross-sectional | Access link | Yes | Yes
4 | Mean difference | Access link | Yes | Yes

 

Name: Biomath
Website: http://www.biomath.info/power/index.htm
Citation: Ramakrishnan R, Holleran S. Biomath. New York: Division of Biomathematics/Biostatistics, Department of Pediatrics, Columbia University Medical Center [cited 2015 9/9]; Available from: http://www.biomath.info/.

SN | Test method / design | Link | Formula stated | Formula referenced
1 | One group: paired t-test | Access link | No | Yes (see bottom of page, Access link)
2 | One group: one-sample t-test | Access link | No | Yes (see bottom of page, Access link)
3 | One group: chi-square test on proportion | Access link | No | Yes (see bottom of page, Access link)
4 | One group: correlation | Access link | No | Yes (see bottom of page, Access link)
5 | Two groups: t-test on group means | Access link | No | Yes (see bottom of page, Access link)
6 | Two groups: chi-square test on proportions | Access link | No | Yes (see bottom of page, Access link)
7 | Two groups: tests for non-inferiority, chi-square | Access link | No | Yes (see bottom of page, Access link)
8 | Two groups: tests for non-inferiority, t-test | Access link | No | Yes (see bottom of page, Access link)

 

 

Name: EpiTools
Website: http://epitools.ausvet.com.au/content.php?page=SampleSize
Citation: AusVet. Epi Tools - Sample size calculations. Australia: AusVet Animal Health Services; [cited 2015 9/9]; Available from: http://epitools.ausvet.com.au/content.php?page=SampleSize

SN | Test method / design | Link | Formula stated | Formula referenced
1 | To estimate a single proportion | Access link | Yes (see Access link) | Not specifically, but references available
2 | To estimate a single mean | Access link | No | Not specifically, but references available
3 | Two proportions | Access link | No | Not specifically, but references available
4 | Two means with equal sample size and equal variances | Access link | No | Not specifically, but references available
5 | Two means with unequal sample size and unequal variances | Access link | No | Not specifically, but references available
6 | To estimate true prevalence (at animal or herd level) | Access link | No | Yes (see bottom of page, Access link)
7 | Sample size for a cohort study | Access link | No | Not specifically, but references available
8 | Sample size for a case-control study | Access link | No | Not specifically, but references available

 

 

Name: Power Analysis for ANOVA Designs
Website: http://www.math.yorku.ca/SCS/Online/power/
Citation: Friendly M. Power Analysis for ANOVA Designs. Canada: Department of Mathematics and Statistics, York University; [cited 2015 9/9]; Available from: http://www.math.yorku.ca/SCS/Online/power/.

SN | Test method / design | Link | Formula stated | Formula referenced
1 | Factorial ANOVA design | Access link | No | No, but based on SAS macro program, fpower.sas

 

Name: Free Statistics Calculators
Website: http://www.danielsoper.com/statcalc3/
Citation: Soper D. Free Statistics Calculators. USA [cited 2015 9/9]; Available from: http://www.danielsoper.com/statcalc3/

SN | Test method / design | Link | Formula stated | Formula referenced
1 | A-priori Sample Size Calculator for Hierarchical Multiple Regression | Access link | Yes (same Access link) | Yes (same Access link)
2 | A-priori Sample Size Calculator for Multiple Regression | Access link | Yes (same Access link) | Yes (same Access link)
3 | A-priori Sample Size Calculator for Structural Equation Models | Access link | Yes (same Access link) | Yes (same Access link)
4 | A-priori Sample Size Calculator for Student t-Tests | Access link | Yes (same Access link) | Yes (same Access link)

 

 

References

 Ahn, C., Hu, F., & Schucany, W. R. (2011). Sample Size Calculation for Clustered Binary Data with Sign Tests Using Different Weighting Schemes. Stat Biopharm Res, 3(1), 65-72. doi: 10.1198/sbr.2010.10021

 Cai, J., & Zeng, D. (2004). Sample size/power calculation for case-cohort studies. Biometrics, 60(4), 1015-1024. doi: 10.1111/j.0006-341X.2004.00257.x

 Charan, J., & Biswas, T. (2013). How to calculate sample size for different study designs in medical research? Indian J Psychol Med, 35(2), 121-126. doi: 10.4103/0253-7176.116232

 Chen, L. M., Ibrahim, J. G., & Chu, H. (2014). Sample size determination in shared frailty models for multivariate time-to-event data. J Biopharm Stat, 24(4), 908-923. doi: 10.1080/10543406.2014.901346

 Dattalo, P. (2009). A review of software for sample size determination. Eval Health Prof, 32(3), 229-248. doi: 10.1177/0163278709338556

 Dell, R. B., Holleran, S., & Ramakrishnan, R. (2002). Sample size determination. ILAR J, 43(4), 207-213.

 Fitzner, K., & Heckinger, E. (2010). Sample size calculation and power analysis: a quick review. Diabetes Educ, 36(5), 701-707. doi: 10.1177/0145721710380791

 Gogtay, N. J. (2010). Principles of sample size calculation. Indian J Ophthalmol, 58(6), 517-518. doi: 10.4103/0301-4738.71692

 Guo, Y., & Pandis, N. (2015). Sample-size calculation for repeated-measures and longitudinal studies. Am J Orthod Dentofacial Orthop, 147(1), 146-149. doi: 10.1016/j.ajodo.2014.10.009

 ICH. (1998). ICH harmonised tripartite guideline - statistical principles for clinical trials. Retrieved from http://www.ich.org/fileadmin/Public_Web_Site/ICH_Products/Guidelines/Efficacy/E9/Step4/E9_Guideline.pdf

 Kamangar, F., & Islami, F. (2013). Sample size calculation for epidemiologic studies: principles and methods. Arch Iran Med, 16(5), 295-300. doi: 013165/aim.0010

 Kim, J., & Seo, B. S. (2013). How to calculate sample size and why. Clin Orthop Surg, 5(3), 235-242. doi: 10.4055/cios.2013.5.3.235

 Morgan, T. M., & Case, L. D. (2013). Conservative Sample Size Determination for Repeated Measures Analysis of Covariance. Ann Biom Biostat, 1(1).

 Stokes, L. (2014). Sample size calculation for a hypothesis test. JAMA, 312(2), 180-181. doi: 10.1001/jama.2014.8295

 Su, P. F., Li, C. I., & Shyr, Y. (2014). Sample size determination for paired right-censored data based on the difference of Kaplan-Meier estimates. Comput Stat Data Anal, 74, 39-51. doi: 10.1016/j.csda.2013.12.006

 Zhang, S., & Ahn, C. (2012). Sample size calculation for time-averaged differences in the presence of missing data. Contemp Clin Trials, 33(3), 550-556.

 Zhong, B. (2009). How to calculate sample size in randomized controlled trial? J Thorac Dis, 1(1), 51-54.