Sample Size Calculator for a SMART Design with Censored Data


This web application calculates the sample size for comparing two-stage adaptive treatment strategies in a SMART trial using the weighted log rank test. In these trials, all subjects are randomized to one of two initial treatments, denoted by A1 and A2; the probability that a subject is assigned to A1 is denoted by p1. All subjects with insufficient response to the first-stage treatment (non-responders) are re-randomized to one of two second-stage treatments: those who initially receive A1 and do not respond are further randomized to B11 or B12; those who initially receive A2 and do not respond are further randomized to B21 or B22. For non-responders to A1, the probability of being randomized to B11 is p21, and for non-responders to A2, the probability of being randomized to B21 is p22. Responding subjects are not re-randomized; all responders to Aj (j = 1, 2) are offered the same treatment, denoted by Cj (usually a continuation of the first-stage treatment or a maintenance/relapse prevention treatment). The timing for observation of non-response, along with the criteria for non-response, should be defined in the study protocol. Non-response may be assessed at a fixed point in time (e.g. 2 months after study entry) or at regular intervals, or non-response may itself be a time-to-event, such as the time until two unexcused counseling sessions are missed or the time until a second drug-positive urine sample is collected. The trial design is depicted in the graph below. The primary outcome of interest is a failure time.

There are 4 adaptive treatment strategies in this trial, denoted by 11, 12, 21, 22. Strategy jk is the strategy in which Aj is offered first, then Bjk is offered to non-responding subjects and Cj is offered to responders, where j, k can be either 1 or 2 (see graph below).
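As an illustration of this labelling, the four strategies can be enumerated programmatically. This is only a minimal Python sketch; the `Strategy` class and its field names are hypothetical and are not part of the applet.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Strategy:
    """One two-stage adaptive treatment strategy, labelled jk (hypothetical helper)."""
    j: int  # index of the first-stage treatment Aj
    k: int  # index of the second-stage treatment Bjk offered to non-responders

    @property
    def label(self) -> str:
        return f"{self.j}{self.k}"

    def describe(self) -> str:
        return (f"Strategy {self.label}: offer A{self.j} first; "
                f"offer B{self.j}{self.k} to non-responders "
                f"and C{self.j} to responders")


# j and k each take the value 1 or 2, which gives four strategies.
STRATEGIES = [Strategy(j, k) for j, k in product((1, 2), repeat=2)]

for strategy in STRATEGIES:
    print(strategy.describe())
```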


To use the sample size calculator, enter the following information in the appropriate boxes to the right:

  • p1, p21, p22: first and second stage randomization probabilities, respectively (see the graph above).
  • α : the significance level of the test (0 < α < 1).
  • 1-β : the desired power of the test (0 < β < 1).
  • strategies compared: the two strategies you want to compare. Refer to the definitions in the second paragraph in the Description.
  • ξ : the hazard ratio for T1 versus T2, where T1 and T2 are the (random) times to event if a subject followed the first and the second of the two strategies compared, respectively.
  • Pobs : the probability of observing an event before the end of study among subjects following the first strategy entered above as “strategies compared”.


  • N : the sample size necessary to detect the difference of two treatment strategies.
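The ranges listed above can be checked before the values are submitted. The following Python sketch is a hypothetical helper, not the applet's own code. Two of its checks are assumptions that this page does not state: that ξ must be positive and different from 1 (a hazard ratio of 1 leaves no difference to detect), and that Pobs must be greater than 0 and at most 1.

```python
def validate_inputs(p1, p21, p22, alpha, beta, xi, p_obs):
    """Return a list of problems with the calculator inputs (empty if none)."""
    problems = []

    # Randomization probabilities: both arms must be reachable.
    for name, value in (("p1", p1), ("p21", p21), ("p22", p22)):
        if not 0 < value < 1:
            problems.append(f"{name} must lie strictly between 0 and 1")

    # Significance level and type II error rate (power is 1 - beta).
    if not 0 < alpha < 1:
        problems.append("alpha must lie strictly between 0 and 1")
    if not 0 < beta < 1:
        problems.append("beta must lie strictly between 0 and 1")

    # Assumption: a hazard ratio is positive, and a value of 1 means
    # there is no difference between the strategies to detect.
    if xi <= 0 or xi == 1:
        problems.append("xi must be positive and different from 1")

    # Assumption: Pobs is a probability and some events must be observable.
    if not 0 < p_obs <= 1:
        problems.append("Pobs must be greater than 0 and at most 1")

    return problems


print(validate_inputs(0.5, 0.5, 0.5, alpha=0.05, beta=0.2, xi=0.7, p_obs=0.6))
```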


An Example

In this example, suppose children with ADHD are first randomized to either a behavioral modification therapy (BT) or a medication (Med), with equal probability. Beginning at 2 months and every month thereafter, each child's classroom behavior is assessed and compared to a prespecified criterion. Exceeding the criterion is interpreted as a sign of non-response; the non-responding children are then re-randomized, with equal probability, to either an intensification of the current treatment (more intensive behavioral therapy, BT+, or higher dose medication, Med+) or a combined treatment (behavioral therapy and medication, BT&Med). Children who do not show signs of non-response continue on their initial treatment (see the figure below). Suppose the outcome of interest is time until a major school disciplinary event.

For the sample size calculation, you need to first specify Aj, Bjk and Cj, j = 1, 2. In this example, the first and second stage treatments are: A1 = BT, A2 = Med, B11 = BT+, B12 = BT&Med, C1 = BT, B21 = Med+, B22 = BT&Med, and C2 = Med.

After this, you can determine p1, p21 and p22. In this example, p1 is the probability that a subject is initially randomized to behavioral treatment, since treatment A1 is the behavioral therapy. Also, p21 is the probability that a subject not responding to behavioral therapy is randomized to more intensive behavioral therapy, and p22 is the probability that a subject not responding to medicine is randomized to higher dose medication. In this example, we have p1 =p21 =p22 =0.5.

Then you need to determine which two strategies you want to compare. For example, suppose you want to compare strategies 11 and 22. Strategy 11 is: offer behavioral therapy first, and offer more intensive behavioral therapy if the subject does not respond; and stay on the original behavioral therapy if the subject responds. Strategy 22 is: offer medicine first, then offer combined treatment (add behavioral therapy) to nonresponding subjects; responding subjects stay on the medication.
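To see how the randomization probabilities determine how many children end up on treatment sequences consistent with each strategy, the short Python sketch below may help. The response rates are hypothetical values chosen purely for illustration; they are not inputs to the calculator, and the function name is an assumption.

```python
def fraction_consistent(p_first, p_second, response_rate):
    """Expected fraction of enrolled subjects whose treatment sequence
    is consistent with one strategy.

    p_first       -- probability of receiving the strategy's first-stage treatment
    p_second      -- probability that a non-responder receives the strategy's
                     second-stage treatment
    response_rate -- probability of responding to the first-stage treatment
                     (hypothetical; not a calculator input)
    """
    # Responders are consistent with the strategy without re-randomization;
    # non-responders must also be assigned the matching second-stage treatment.
    return p_first * (response_rate + (1.0 - response_rate) * p_second)


p1 = p21 = p22 = 0.5
response_rate_bt = 0.4   # hypothetical response rate to BT
response_rate_med = 0.5  # hypothetical response rate to Med

# Strategy 11: BT first, BT+ for non-responders.
f11 = fraction_consistent(p1, p21, response_rate_bt)
# Strategy 22: Med first, BT&Med for non-responders.
f22 = fraction_consistent(1.0 - p1, 1.0 - p22, response_rate_med)

print(f"strategy 11: {f11:.3f}, strategy 22: {f22:.3f}")
```

With these hypothetical response rates, about 35% of enrolled children would follow a sequence consistent with strategy 11. Responders to BT are consistent with both strategy 11 and strategy 12, so the fractions for the four strategies sum to more than 1.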

Finally, specify the probability that you observe a major school disciplinary event during the study period among children assigned strategy 11, where “observe an event” on a subject implies that the subject has not dropped out of the study when the event occurs.


This web applet calculates the sample size necessary to detect a meaningful difference between two adaptive treatment strategies (also called adaptive interventions or dynamic treatment regimens) in a SMART trial [1] with two stages. The primary outcome is a failure time and the sample size calculator is based on the weighted log rank test with time independent weights given in [2] (also see [3]).

This sample size calculator can be used to size a SMART trial for comparing two strategies beginning with different first-stage treatments (e.g. 11 versus 21 or 11 versus 22 or 12 versus 21, or 12 versus 22). Often investigators compare the two strategies that can be viewed as most extreme in terms of intensity and burden or two strategies that represent opposing clinical approaches to treatment. The primary outcome, a failure time, may be censored before or at the end of study. The failure time under strategy jk is denoted by Tjk. This is the failure time if the subject had followed strategy jk.
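The admissible comparisons listed above are exactly the pairs of strategies whose first-stage treatments differ. A minimal Python sketch, with hypothetical variable names, that reproduces the list:

```python
from itertools import combinations

# Strategy labels jk: the first character is the first-stage treatment index j.
labels = ["11", "12", "21", "22"]

# Keep only the pairs that begin with different first-stage treatments.
valid_pairs = [(a, b) for a, b in combinations(labels, 2) if a[0] != b[0]]

print(valid_pairs)
# [('11', '21'), ('11', '22'), ('12', '21'), ('12', '22')]
```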

In deriving the sample size formula, we make the following working assumptions:

  1. the planned follow-up time is the same for all subjects (denoted by τ),
  2. the censoring time is independent of both the failure time and non-response/response,
  3. the failure times under the two strategies to be compared, for example, T11 and T22, have proportional hazards.

Assumption 1 is common in trials in mental health, substance abuse and related areas. Assumption 2, the independent censoring assumption, is the usual assumption in standard failure time studies. Assumption 3 permits the specification of an effect size. Under Assumption 3, the effect size for the comparison of the two strategies is taken to be the hazard ratio. However, the use of the weighted log rank test in the data analysis does not require Assumptions 1 and 3 (these assumptions are only used to size the study).
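To make Assumption 3 concrete: under proportional hazards, the survival function for one strategy is a power of the survival function for the other. The Python sketch below uses this relation to translate an event probability by time τ under the first strategy into the corresponding probability under the second. It rests on two assumptions that are not taken from this page: the hazard ratio is read as the hazard of the second strategy divided by the hazard of the first, and no subject drops out before τ. The function name is hypothetical.

```python
def event_prob_second_strategy(p_event_first, hazard_ratio):
    """Probability of an event by time tau under the second strategy.

    Assumes proportional hazards with
        hazard_ratio = (hazard under second strategy) / (hazard under first),
    so that S2(t) = S1(t) ** hazard_ratio, and assumes no dropout before tau.
    """
    if not 0 <= p_event_first <= 1:
        raise ValueError("p_event_first must be a probability")
    if hazard_ratio <= 0:
        raise ValueError("hazard_ratio must be positive")
    survival_first = 1.0 - p_event_first
    return 1.0 - survival_first ** hazard_ratio


# Example: half of the subjects have an event by tau under the first
# strategy, and the hazard under the second strategy is twice as large.
print(event_prob_second_strategy(0.5, 2.0))
```

If the hazard ratio is defined in the opposite direction, the exponent becomes its reciprocal.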

This sample size calculator usually results in conservative sample sizes. This occurs because, in deriving the sample size formula, the variances involved in the test statistics are replaced by their upper bounds. We find that it is easier to elicit the information needed to size the study when we use these upper bounds [2]. The degree of conservatism depends on the percentage of subjects re-randomized in the second stage: the higher the percentage of subjects re-randomized, the less conservative the sample size [2].

To improve power in the data analysis, we recommend using the weighted log rank test with time dependent weights [2]. This test is more powerful than the weighted log rank test with time independent weights (see [2] and [4] for details).
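For orientation, the sketch below shows one common form of time-independent weight. The weights themselves are not spelled out on this page; the code assumes the inverse-probability-of-randomization weights that are standard in the SMART literature, so the test statistic in [2] should be consulted for the exact definition. The function name and argument conventions are hypothetical.

```python
def time_independent_weight(first, second, j, k, p1, p21, p22):
    """Inverse-probability weight of one subject for strategy jk.

    first  -- first-stage treatment received: 1 (A1) or 2 (A2)
    second -- None if the subject was not re-randomized, otherwise
              1 or 2 for the second-stage treatment B(first)1 or B(first)2
    Assumed form: a subject consistent with strategy jk is weighted by the
    inverse of the probability of the randomizations actually received;
    all other subjects get weight 0.
    """
    if first != j:
        return 0.0
    p_first = p1 if j == 1 else 1.0 - p1
    if second is None:
        # Not re-randomized: only the first-stage randomization applies.
        return 1.0 / p_first
    if second != k:
        return 0.0
    p_b1 = p21 if j == 1 else p22            # probability of Bj1
    p_second = p_b1 if k == 1 else 1.0 - p_b1
    return 1.0 / (p_first * p_second)


# Weights for strategy 11 with p1 = p21 = p22 = 0.5.
print(time_independent_weight(1, None, 1, 1, 0.5, 0.5, 0.5))  # responder on A1
print(time_independent_weight(1, 1, 1, 1, 0.5, 0.5, 0.5))     # non-responder given B11
print(time_independent_weight(1, 2, 1, 1, 0.5, 0.5, 0.5))     # non-responder given B12
```

A time-dependent version would, under the same assumption, use only the first-stage factor until the time of re-randomization and the full product afterwards.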

Although the description above assumes that only non-responders are re-randomized, this sample size calculator can also be used if responding, instead of non-responding, subjects are re-randomized. In that case strategy jk is: offer treatment Aj in the first stage; offer second-stage treatment Bjk if the subject meets the response criteria; and offer a fixed treatment Cj (e.g. continue on current treatment or provide a salvage treatment) if the subject does not meet the response criteria. Everything else is the same as above.


[1]. Murphy S.A. An experimental design for the development of adaptive treatment strategies. Statistics in Medicine 2005; 24:1455-1481.

[2]. Li Z. and Murphy S.A. Sample size formulae for two-stage randomized trials with survival outcomes. Biometrika 2011; 98(3):503-518.

[3]. Guo X. Statistical analysis in two stage randomization designs in clinical trials. Unpublished PhD thesis, Department of Statistics, North Carolina State University, 2005.

[4]. Guo X. and Tsiatis A.A. A weighted risk set estimator for survival distributions in two-stage randomization designs with censored survival data. The International Journal of Biostatistics 2005; 1(1):1-15.

How to Cite This Work

If you use this applet in your own research, we would greatly appreciate it if you cite one or more of the articles listed in the references section of this web page. Thank you very much!