### Who should use this website?

If you are a clinical researcher trying to determine how many subjects to include in your study or you have another question related to sample size or power calculations, we developed this website for you. Our approach is based on Chapters 5 and 6 in the 4th edition of Designing Clinical Research (DCR-4), but the material and calculators provided here go well beyond an introductory textbook on clinical research methods.

### Nothing focuses the researcher’s mind like a sample size calculation!

Despite flaws with the traditional approach to sample size calculation (1), it has real value in forcing clarity about your study design. Before you can calculate your sample size, you must

- identify your predictors and outcomes and choose which of these are the primary predictor and outcome,
- decide how you will measure your variables,
- specify a clinically significant effect size, and
- estimate the variance of continuous measurements or, for categorical measurements, the proportions in the different categories.

The estimation requirement will often prompt you to return to the prior literature with greater focus.

### Dichotomous versus continuous?

If you can quantify your predictor or outcome variable with a number such as body weight in kilograms, blood pressure in mm of Hg, or serum glucose in mg/dL, then it is **continuous**. Even if it is an integer count or score such as cigarettes per day or Glasgow Coma Scale, consider it continuous. If the variable classifies the subject into one of several unordered groups such as blood type or race, then it is categorical. Categorical variables with two possible values (e.g., dead or alive) are **dichotomous**. In coding dichotomous (yes/no) variables, make 0 represent *no* or *absent* and 1 represent *yes* or *present*. For the purpose of sample size calculations, you can and often should make a dichotomous variable out of a variable with many possible categories by combining or excluding groups. You can also make a continuous variable dichotomous by choosing a numerical cutpoint between “positive” and “negative”. Ordinal variables report ordered categories such as mild, moderate, or severe pain. You can also make a continuous variable ordinal by dividing the range of results into several intervals. For your initial sample size calculation, limit your variable choices to continuous or dichotomous.
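The recoding described above can be sketched in a few lines of Python. This is only an illustration: the function names, the choice to count a value equal to the cutpoint as "positive", and the example glucose cutpoints are assumptions, not part of the calculators on this site.

```python
from bisect import bisect_right

def dichotomize(value, cutpoint):
    """Return 1 (yes/present) if value is at or above the cutpoint, else 0 (no/absent).

    Treating a value exactly equal to the cutpoint as positive is an assumption;
    choose whichever convention matches your clinical definition.
    """
    return 1 if value >= cutpoint else 0

def to_ordinal(value, boundaries):
    """Map a continuous value to an ordered category index (0, 1, 2, ...).

    `boundaries` must be sorted in ascending order; a value equal to a
    boundary falls into the higher category.
    """
    return bisect_right(boundaries, value)

# Hypothetical example: serum glucose in mg/dL with illustrative cutpoints.
glucose_values = [92, 110, 126, 140]
positive = [dichotomize(g, 126) for g in glucose_values]
category = [to_ordinal(g, [100, 126]) for g in glucose_values]
```

For an initial sample size calculation you would use only the dichotomous version (or the original continuous variable), as recommended above.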

### Are you doing an experiment?

The classic experimental study design is the randomized controlled trial (RCT) in which you randomly assign eligible subjects to receive either the experimental intervention or the control intervention. You measure the outcome variable in both groups and compare. In this case, your (dichotomous) predictor variable is the group assignment: experimental or control. If your outcome variable is continuous, you will compare group means. You need to specify the difference in group means (effect size) that you want to detect as well as the standard deviation of the outcome measure. Initially, you should assume that the standard deviation of the outcome measure is the same in the experimental and control groups. If your outcome is dichotomous, you will compare group proportions. You need to specify the proportion with the outcome in the control group and the difference in proportions that you want to detect or, equivalently, the proportion with the outcome in the experimental group.
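To show how these inputs turn into a sample size, here is a minimal Python sketch using the standard normal-approximation formulas for two equal-sized groups. It assumes a two-sided test and no continuity correction. The function names and the default alpha and power are illustrative choices, and the calculators on this site may use different or more exact methods.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group_means(delta, sd, alpha=0.05, power=0.80):
    """Subjects per group to detect a difference in means `delta`,
    assuming the same standard deviation `sd` in both groups."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided alpha
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)

def n_per_group_proportions(p0, p1, alpha=0.05, power=0.80):
    """Subjects per group to detect control proportion `p0` versus
    experimental proportion `p1` (no continuity correction)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p0 + p1) / 2  # pooled proportion under the null hypothesis
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p0 * (1 - p0) + p1 * (1 - p1)))
    return ceil((numerator / (p1 - p0)) ** 2)

# Hypothetical inputs: detect a difference of half a standard deviation,
# or a change in the outcome proportion from 20% to 30%.
n_means = n_per_group_means(delta=5.0, sd=10.0)
n_props = n_per_group_proportions(p0=0.20, p1=0.30)
```

Both functions make the trade-off visible: halving the effect size you want to detect roughly quadruples the required sample size.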

### Understanding observational study designs

If you are not doing an experiment, you are doing an observational study. The main observational study designs are cohort, cross-sectional, and case-control. If the predictor and outcome are both dichotomous, you will compare the proportion with the outcome in the two different predictor groups (“predictor +” and “predictor –”). One way to do this is to divide the proportion with the outcome in the “predictor +” group by the proportion with the outcome in the “predictor –” group. The name of this *relative* measure of association between predictor and outcome depends on the study design. Although you ultimately compare two proportions, our calculators allow you to start by specifying the proportion with the outcome in the “predictor –” group and the relative measure of association that you would like to detect.
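Starting from the “predictor –” proportion and a relative measure of association, the second proportion follows by simple algebra. The Python sketch below illustrates that conversion for a ratio of proportions and for an odds ratio. The function names are hypothetical, and this is not necessarily how the calculators implement it.

```python
def p1_from_ratio(p0, ratio):
    """Proportion in the 'predictor +' group from the 'predictor -'
    proportion p0 and a ratio of proportions (P1 / P0)."""
    p1 = ratio * p0
    if not 0 <= p1 <= 1:
        raise ValueError("ratio * p0 must be between 0 and 1")
    return p1

def p1_from_odds_ratio(p0, odds_ratio):
    """Proportion in the 'predictor +' group from the 'predictor -'
    proportion p0 and an odds ratio."""
    odds1 = odds_ratio * p0 / (1 - p0)  # odds in the 'predictor +' group
    return odds1 / (1 + odds1)          # convert odds back to a proportion

# Hypothetical example: 10% with the outcome in the 'predictor -' group.
p1_ratio = p1_from_ratio(0.10, 2.0)
p1_odds = p1_from_odds_ratio(0.10, 2.0)
```

When the outcome is uncommon, the two conversions give similar values; they diverge as the proportion with the outcome grows.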

**Relative Measures of Association (Dichotomous Predictor and Outcome)**

| | Bad Outcome | No Bad Outcome | Total |
|---|---|---|---|
| Predictor + | a | b | a + b |
| Predictor – | c | d | c + d |
| Total | a + c | b + d | N = a + b + c + d |

Risk given Predictor +: $P_1 = a / (a + b)$

Risk given Predictor –: $P_0 = c / (c + d)$

Relative risk (or relative prevalence):

$$\frac{P_1}{P_0} = \frac{a/(a+b)}{c/(c+d)}$$

Odds ratio:

$$\frac{P_1/(1-P_1)}{P_0/(1-P_0)} = \frac{a/b}{c/d} = \frac{a/c}{b/d}$$

Set up your 2×2 table’s columns so the bad outcome is on the left and the absence of the bad outcome is on the right; set up the rows so that the presence of the predictor is on top and the absence of the predictor is on the bottom. Then, if your relative measure of association is greater than 1, the predictor is harmful.
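As a quick check on the table layout, the following Python sketch computes both relative measures from the four cell counts, with a, b, c, and d arranged as described above. The function name and the example counts are hypothetical.

```python
def relative_measures(a, b, c, d):
    """Return (ratio of proportions, odds ratio) for a 2x2 table.

    a: predictor +, bad outcome      b: predictor +, no bad outcome
    c: predictor -, bad outcome      d: predictor -, no bad outcome
    """
    p1 = a / (a + b)              # proportion with bad outcome, predictor +
    p0 = c / (c + d)              # proportion with bad outcome, predictor -
    ratio = p1 / p0               # relative risk or relative prevalence
    odds_ratio = (a / b) / (c / d)
    return ratio, odds_ratio

# Hypothetical counts: 30 of 100 exposed and 10 of 100 unexposed
# subjects have the bad outcome.
ratio, odds_ratio = relative_measures(a=30, b=70, c=10, d=90)
```

With these counts both measures are greater than 1, and the odds ratio is farther from 1 than the ratio of proportions, which is typical when the outcome is not rare.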

### Observational Study Designs with Relative Measures of Association

**Cohort Study:** Measure each subject’s predictor status, follow over time, and assess the outcome variable. Relative measure of association: relative risk, a.k.a. relative incidence or relative cumulative incidence.

**Cross-sectional Study:** Measure each subject’s predictor and outcome status at a single point in time. Usually the study population is defined by a characteristic other than the predictor or outcome, such as clinical presentation or geographical location. Relative measure of association: relative prevalence (calculated the same way as the relative risk).

**Case-control Study:** Identify two separate groups of subjects, one group with the outcome (cases) and one without the outcome (controls), and compare predictor status. Relative measure of association: odds ratio.

In case-control studies, the ratio of controls to cases is determined by the investigator. In cross-sectional and cohort studies, the split between “predictor +” and “predictor –” subjects is rarely 50-50, so all of our sample size calculators allow unequal group sizes.
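To illustrate what unequal group sizes do to the total, here is a minimal Python sketch for a continuous outcome compared between two groups. It uses the standard normal-approximation formula with an allocation ratio. The function name, the ratio parameter, and the defaults are assumptions for illustration, and the site's calculators may use different methods.

```python
from math import ceil
from statistics import NormalDist

def group_sizes_unequal(delta, sd, ratio=1.0, alpha=0.05, power=0.80):
    """Return (n1, n0) to detect a difference in means `delta` with common
    standard deviation `sd`, where ratio = n0 / n1 (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # With equal groups (ratio = 1) the factor (1 + 1/ratio) equals 2.
    n1 = ceil((1 + 1 / ratio) * ((z_alpha + z_beta) * sd / delta) ** 2)
    n0 = ceil(ratio * n1)
    return n1, n0

# Hypothetical inputs: half a standard deviation difference,
# first with equal groups and then with twice as many in group 0.
equal = group_sizes_unequal(delta=5.0, sd=10.0, ratio=1.0)
two_to_one = group_sizes_unequal(delta=5.0, sd=10.0, ratio=2.0)
```

Moving away from a 50-50 split shrinks the smaller group but increases the total number of subjects needed for the same power.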

### Studies of diagnostic test accuracy

These are descriptive studies with special sample size considerations.

1. Bacchetti P. Current sample size conventions: flaws, harms, and alternatives. BMC Med. 2010;8:17.