Introduction to Adaptive Designs
Minh Huynh, Ph.D. Aaron Heuser, Ph.D. Chunxiao Zhou, Ph.D.
FUNDAMENTAL DESIGN PRINCIPLES, CASE STUDIES, AND HANDS-ON PRACTICE
Course Objectives
1. Gain familiarity with the basic principles of adaptive designs
2. See examples of adaptive designs in practice
3. Understand the strengths and weaknesses of adaptive methods, and when to use them and when not to
4. See the latest applications of adaptive designs
5. Obtain a starter toolkit for using adaptive designs in your work
What is an adaptive design?
1. A research protocol in which some features are adaptive:
- At pre-determined points in the study, the data are analyzed
- Some design aspects of the protocol may change depending on the latest findings from the data
- These aspects include sample size, treatment group randomization and assignment, dose, and treatment arms
2. A research protocol in which pre-planned design changes are aimed at improving power, increasing efficiency, reducing cost, reducing time, or addressing ethical issues. Possible adaptive features include sample size, randomization, dose, treatment arms, and study duration.
Key Design Principles
1. Data collected during the study are not only for answering the research question
2. These data are very useful for the conduct of the study itself
3. They are typically available sooner than study outcomes
4. Flexible designs based on population response will result in more efficient and more powerful studies than fixed designs
What is NOT an adaptive design?
1. Ad-hoc changes in design
2. Changes in design not based on accumulated data
3. Changes in design based on accumulated data, but not part of the pre-established design
4. Stopping rules, sample size changes, or dose escalations based on other factors not related to the pre-established design
(Diagram)
Conventional: Design, then Data Collection, then Data Analysis.
Adaptive: Initial Design, then Data Collection, then Interim Analysis; if the interim findings warrant it (Yes), revise the protocol and continue data collection; otherwise (No), proceed to Data Analysis.
Advantages of Adaptive Designs:
1. Reduce the number of subjects
2. Reduce subject exposure time
3. Shorten the overall length of the study
4. Lower cost
Disadvantages of Adaptive Designs:
1. Require statistical expertise, including some Bayesian concepts
2. May require longer protocol planning time and regulatory approval
3. Require near real-time data entry
4. Require a high level of coordination for multi-arm studies
5. May yield biased estimates if not done properly
Clinical Science
Study Type | Key Objectives | Key Design Consideration(s) | Applicable?
Phase I Clinical Trial | Optimal dosing, minimum toxicity, minimum risks | Maximum tolerable dose identified | Yes
Phase II Clinical Trial | Establish proof of principle, measure treatment effect, make treatment comparisons | Ineffective treatments identified with the minimum sample size | Yes
Phase III Clinical Trial | Large-scale study of therapeutic effects of treatment, with follow-up | Impact identified using double-blind, randomized, placebo-controlled subjects | Yes
Phase IV Clinical Trial | Post-introduction monitoring for adverse events | Detection of rare adverse events not previously identified | No
Social Science
Study Type | Key Objectives | Key Design Consideration(s) | Applicable?
Randomized Controlled Impact Evaluation | Determine whether a social intervention works; measure treatment effects | Determine treatment effect with minimum bias | Yes
Quasi-experimental Impact Evaluation | Determine whether a social intervention works; measure treatment effects | Find useful comparison groups to use in place of controls; determine treatment effect with minimum bias | Yes
Survey Methodology: Real-time Dynamic Survey Management | Design, implement, and manage a survey with minimal resources | Obtain intended coverage using minimum resources | Yes
Network Sampling | Design, implement, and manage a survey to reach rare or hard-to-reach populations | Obtain intended coverage using fewer resources than conventional sampling designs | Yes
Common adaptive features:
1. Sample Size Re-estimation: (a) early stopping, (b) interim sample size adjustments. Ensures the study is completed with the minimum number of subjects.
2. Treatment Group Randomization: (a) pick-the-winner, (b) arm dropping or switching. Ensures the largest number of subjects get the most effective treatment.
3. Treatment Intensity Escalation: (a) continual reassessment, (b) adaptive dose-finding algorithms. Ensures the maximum benefit with the minimum cost or harm.
Simple Adaptive Stopping Rule
Suppose you conduct a study with binary outcomes and observe 13 successes among 65 subjects. Based on a binomial distribution, q = 13/65 = .2 and SE = sqrt( q(1 - q)/n ) ≈ .05. The 95% confidence interval is thus [ q - 1.96 × SE, q + 1.96 × SE ] = [.1, .3]   (1)
Simple Adaptive Stopping Rule Suppose your policy advisors tell you that a success rate of 35% or more is a policy relevant outcome. At any time, you can use (1) to monitor your study as shown below
(Figure: the interval [q_L, q_U] around q, plotted against the .35 threshold.)
You can thus stop the trial for futility when the entire interval [q_L, q_U] lies below .35.
Simple Adaptive Stopping Rule
What is wrong with this? You don't know the optimal time to stop: stopping too early may bias the estimate of [q_L, q_U], while stopping too late is costly as n increases. Still, this simple method is easy to use, and you can stop the trial at any time for futility when the entire interval [q_L, q_U] lies below some target P.
Hands-on Practice
Suppose you conduct a study to evaluate whether a new job search strategy will help unemployed individuals. Your study has the budget to enroll 700 people. You decide to take a look after 100 people have been enrolled and received the intervention, and see that 25 of them were placed. Your policy advisors tell you that the treatment needs to be at least 35% effective to be worth pursuing. What should you do?
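One way to work this exercise is to apply the interval rule (1) at the interim look. A minimal sketch in Python (the function name and the 1.96 normal cut-off are illustrative assumptions):

```python
from math import sqrt

def futility_check(successes, n, target, z=1.96):
    """Stop for futility when the CI upper bound falls below the target rate."""
    q = successes / n
    se = sqrt(q * (1 - q) / n)          # normal-approximation standard error
    lo, hi = q - z * se, q + z * se     # interval (1)
    return (lo, hi), hi < target

# Interim look: 25 placements among the first 100 enrollees, 35% policy target
(lo, hi), stop = futility_check(25, 100, 0.35)
```

Here the interval is roughly [.165, .335]; since the upper bound falls below the .35 target, this rule suggests stopping for futility, subject to the early-stopping caveats discussed above.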
Optimal Adaptive Sample Size: Rapid Enrollment Design, Fleming's 2-Stage, Ivanova's 2-Stage, Gehan's 2-Stage, Simon's 2-Stage
Gehan’s 2-Stage Design
One of the earliest designs for phase II clinical trials with binary response, by Gehan (1961).
At the end of the first stage, the study may be stopped for futility. This method minimizes the sample size under H0 for a given target effectiveness and a given α-level.
(Diagram) Stage 1: if the rejection rule is met, reject the treatment; otherwise proceed to Stage 2. Stage 2: if the rejection rule is met, reject the treatment.
Gehan’s 2-Stage Design
Suppose you have a new treatment that either works or does not work at all.
This treatment is administered to everyone (uncontrolled), and it is worth pursuing only if it can hit the 20% mark (probability of success = .2).
Since you do not know if this will work, you would like the minimal sample size required to see whether this new treatment is worth pursuing.
Gehan's 2-Stage Design
Let p = .2 be the target success rate, and let X_j = 1 if treatment is successful for subject j, 0 otherwise. Let Y1 = Σ_{j=1}^{n1} X_j be the number of successes among the first n1 subjects, so Y1 ~ Bin(n1, p).
Then the probability of getting 0 successes after the first n1 subjects is Pr(Y1 = 0 | p) = (1 - p)^n1.
Let α be a small number, such as .05. Then the probability of getting 0 successes after the first n1 subjects, given p = .2, is Pr(Y1 = 0 | p = .2) = (1 - .2)^n1 = 0.8^n1 ≤ .05, which gives n1 ≥ 14. In general, when p is any minimally accepted target, the number of subjects required for stage 1 is n1 = ceil( log(α) / log(1 - p) ).
Gehan's 2-Stage Design
This can allow the study to end early.
NOTE: p can be set arbitrarily high to end any study early!
Step 1: Set the target success rate p.
Step 2: If there are 0 responses among the first n1 = ceil( log(0.05) / log(1 - p) ) subjects, then stop the study for futility.
Step 3: Else continue to the end of the study.
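The stage-1 sample-size formula can be computed directly. A small sketch (the helper name is an illustrative choice):

```python
from math import ceil, log

def gehan_stage1_n(p, alpha=0.05):
    """Smallest n1 such that (1 - p)**n1 <= alpha, i.e. the chance of seeing
    0 successes in n1 subjects is small enough to justify stopping."""
    return ceil(log(alpha) / log(1 - p))

n1 = gehan_stage1_n(0.2)   # slide example: p = .2 gives n1 = 14
```

For p = .2 this reproduces the n1 ≥ 14 result derived above, since 0.8^14 ≈ .044 ≤ .05 while 0.8^13 ≈ .055 > .05.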
Hands-on Practice
Suppose you conduct a study to evaluate whether a new job search strategy will help unemployed individuals. Your study has the budget to enroll 700 people. Your policy advisors tell you that the treatment needs to be at least 35% effective to be worth pursuing. Use Gehan's 2-stage design to come up with a stopping strategy.
Optimal Adaptive Sample Size: Rapid Enrollment Design, Fleming's 2-Stage, Ivanova's 2-Stage, Gehan's 2-Stage, Simon's 2-Stage
Simon’s 2-Stage Design
Originally designed for phase II clinical trials with binary response.
Suitable for a one-arm (uncontrolled) study testing H0: p ≤ p0. At the end of the first stage, the study may be stopped for futility. This method minimizes the sample size under H0 for a given α-level.
(Diagram) Stage 1: if the rejection rule is met, reject the treatment; otherwise proceed to Stage 2. Stage 2: if the rejection rule is met, reject the treatment.
Simon’s 2-Stage Design
Suppose you have a new treatment that either works or does not work at all.
This treatment is administered to everyone (uncontrolled). You currently have a treatment that works 20% of the time (p0 = .2).
Your policy advisors tell you that your new treatment is worth pursuing only if it can hit the 40% mark (p1 = .4).
Since you do not know if this will work, you would like the minimal sample size required to test this new treatment.
Simon’s 2-Stage Design
Simon's method allows you to enroll some subjects, administer treatment, stop to take a look, and decide whether to proceed.
Can we do better than the traditional design?
Simon's 2-Stage Design
Let n1 = the number of subjects in the 1st stage and n2 = the number of subjects in the 2nd stage.
Y1 = the number of successes among the n1 stage-1 subjects, Y1 ~ Bin(n1, p)
Y2 = the number of successes among the n2 stage-2 subjects, Y2 ~ Bin(n2, p)
r1 = the number of stage-1 successes at or below which we terminate the study
r = the number of total successes at or below which we reject the treatment
Simon's 2-Stage Design: Type I and Type II Error Constraints
The treatment is defined to be non-promising if y1 ≤ r1, or if (y1 > r1) and (y1 + y2 ≤ r); it is promising if (y1 > r1) and (y1 + y2 > r), or already promising at stage 1 if y1 > r.
Simon's 2-Stage Design: Type I and Type II Error Constraints
Thus we need
Prob( promising | p ≤ p0 ) ≤ α and Prob( promising | p ≥ p1 ) ≥ 1 - β.
At the boundaries, we need
Prob( (y1 > r1) and (y1 + y2 > r) | p = p0 ) = α;
Prob( (y1 > r1) and (y1 + y2 > r) | p = p1 ) = 1 - β.   (2)
Simon's 2-Stage Design
Assuming independence between Y1 and Y2,
Prob{ (y1 > r1) and (y1 + y2 > r) | p } = Σ_{y1 > r1} Σ_{y2 > r - y1} b(y1 | n1, p) b(y2 | n2, p)
= Σ_{y1 > r1} Σ_{y2 > r - y1} C(n1, y1) p^y1 (1 - p)^(n1 - y1) C(n2, y2) p^y2 (1 - p)^(n2 - y2).
Simon's 2-Stage Design
Simon's method chooses the set (n1, n2, r1, r) to minimize the sample size, subject to the Type I and Type II error probability constraints in (2).
Simon's 2-Stage Design
(n1, n2, r1, r) can be chosen to meet one of two optimality criteria:
(1) Optimal: minimizes the expected sample size under the null, E(N | H0), where E(N | H0) = n1 + n2 × Prob(proceed to Stage 2 | H0) = n1 + n2 × Prob( r1 + 1 ≤ y1 ≤ r | p = p0 );
(2) Minimax: minimizes the maximum trial size, n = n1 + n2.
Step 1: Specify p0, p1, α, and β. Step 2: For each value of the total sample size n, and each value of n1 in the range (1, n - 1), determine the integer values of r1 and r which satisfy the error constraints and minimize E(N) when p = p0.
To compute the design, enter alpha, power, and the current and new treatment success rates into the calculator at http://cancer.unc.edu/biostatistics/program/ivanova/SimonsTwoStageDesign.aspx
Calculator output:
Total sample size = 54; total Stage 1 sample size = 19.
End the study and reject the treatment after Stage 1 if successes ≤ 4/19.
Reject the treatment after Stage 2 if successes ≤ 15/54.
Expected sample size under H0 is 30.4. Probability of stopping early under H0 is .6733.
Simon’s 2-Stage Design From these calculations, the resulting adaptive design is
Step 1: Observe 19 subjects.
Step 2: If 4 or fewer subjects receive successful treatment, stop the study and reject the new treatment.
Step 3: If 5 or more subjects receive successful treatment, go to Stage 2.
Step 4: Observe 35 more subjects and count the total successes.
Step 5: If 15 or fewer subjects in total receive successful treatment, fail to reject H0 and reject the new treatment.
Step 6: If 16 or more subjects in total receive successful treatment, reject H0, suggesting the new treatment is better than the existing treatment.
Simon's 2-Stage Design
If these steps and stopping rules are followed, the expected sample size is only 30.4 subjects, and the probability of stopping early is .6733. This is a very good efficiency gain!
NOTE: This design stops early for futility, not for efficacy.
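These operating characteristics can be checked directly from the binomial formulas above, without re-running the optimization search. A sketch (function names are ours; the design 4/19, 15/54 is the calculator output):

```python
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_sf(r, n, p):
    """P(Y > r) for Y ~ Bin(n, p)."""
    return sum(binom_pmf(k, n, p) for k in range(max(r + 1, 0), n + 1))

def simon_characteristics(n1, r1, n, r, p):
    """PET, P(declare promising), and expected sample size at success rate p."""
    n2 = n - n1
    pet = sum(binom_pmf(k, n1, p) for k in range(r1 + 1))   # stop after stage 1
    promising = sum(binom_pmf(k, n1, p) * binom_sf(r - k, n2, p)
                    for k in range(r1 + 1, n1 + 1))         # y1 > r1 and y1+y2 > r
    en = n1 + n2 * (1 - pet)                                # expected sample size
    return pet, promising, en

pet0, alpha, en0 = simon_characteristics(19, 4, 54, 15, p=0.20)
_, power, _ = simon_characteristics(19, 4, 54, 15, p=0.40)
```

Evaluating at p0 = .2 reproduces PET ≈ .6733 and E(N|H0) ≈ 30.4, matching the calculator output, with Type I error at most .05; evaluating at p1 = .4 gives the power.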
Hands-on Practice
Suppose you conduct a Phase II clinical trial to test whether a new drug is effective in treating the Zika virus. You would like α = 0.05 and power = 90%. Your Section Chief informs you that the previous candidate drug had a 20% response rate, and for the new drug to move to Phase III you need a response rate of at least 40%. You have a research budget that can enroll no more than 100 subjects. (a) Design a Simon 2-stage trial using both the optimal and minimax criteria. (b) How far below the 100-subject mark do you expect to be? (c) What is the probability that this will happen (i.e., that you will have an early stop)?
Bayesian Phase II Design with Posterior Probability
Uses successive predictive probabilities of success as stopping rules.
Data accrual is monitored continuously to make adaptive decisions.
The trial can be stopped for either efficacy or futility. Suited for situations where a new treatment is being considered for further study, but you do not have (or do not want to use) a large number of subjects.
Simple to implement for studies with binary outcomes.
Bayesian Phase II Design with Posterior Probability
Let p1 = the response rate for the new, experimental treatment, p1 ~ Beta(α1, β1).
Let p0 = the response rate for the currently existing treatment, p0 ~ Beta(α0, β0).
At any stage of the study, let Y = the number of successes among the n subjects treated, Y ~ Bin(n, p1). Because of the conjugacy between the binomial and beta distributions, the posterior distribution of p1 given Y = y is again a beta distribution: p1 | Y = y ~ Beta(α1 + y, β1 + n - y).
Bayesian Phase II Design with Posterior Probability
Let f(p; α, β) be the p.d.f. of p ~ Beta(α, β) and let F(p; α, β) = ∫_0^p f(y; α, β) dy be its corresponding c.d.f.; then we can compute
Pr( p1 > p0 + ε | y ) = ∫_0^(1-ε) [ 1 - F(p + ε; α1 + y, β1 + n - y) ] f(p; α0, β0) dp,
where 0 < ε < 1 is the minimally acceptable improvement in the new treatment's response rate compared to the standard treatment.
Bayesian Phase II Design with Posterior Probability
Now we are almost ready to specify the design. Let
θU = the upper probability cut-off, θU ∈ [.95, .99];
θL = the lower probability cut-off, θL ∈ [.01, .05];
Un = the smallest integer y such that Pr(p1 > p0 | y) ≥ θU;
Ln = the largest integer y such that Pr(p1 > p0 + ε | y) ≤ θL.
Bayesian Phase II Design with Posterior Probability
Step 0: Let N = the maximum number of subjects you can enroll.
Step 1: Enroll n subjects, and observe y, the number of successes.
Step 2: If y ≥ Un, end Phase II for efficacy: the treatment is promising. If y ≤ Ln, terminate the study for futility: the treatment is not promising.
Step 3: If Ln < y < Un and n < N, continue and enroll the next subject.
Step 4: If n reaches N before y crosses either stopping boundary, the study is inconclusive.
Bayesian Phase II Design with Posterior Probability
Example 3a: Suppose the most we can enroll in a study is N = 40, the prior for the current treatment is p0 ~ Beta(α0, β0) = Beta(15, 35), and the prior for the new treatment is p1 ~ Beta(α1, β1) = Beta(.6, 1.4).
Bayesian Phase II Design with Posterior Probability
Using θL = .05 and ε = 0 for futility stopping, we enroll our subjects and monitor each of them. Each pair (Ln, n) is a stopping rule: stop the study if the number of successes after enrolling n subjects is less than or equal to Ln. For example, (2, 18) means that after enrolling 18 subjects, if there are 2 or fewer successes, the study stops for futility.
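The posterior probability Pr(p1 > p0 + ε | y) above is a beta integral; a quick Monte Carlo sketch can approximate it with the Example 3a priors (the function name, draw count, and seed are illustrative assumptions):

```python
import random

def prob_new_beats_current(y, n, eps=0.0, draws=50_000, seed=1):
    """Monte Carlo estimate of Pr(p1 > p0 + eps | y successes in n subjects),
    using the Example 3a priors: p0 ~ Beta(15, 35), p1 ~ Beta(0.6, 1.4)."""
    rng = random.Random(seed)
    a1, b1 = 0.6 + y, 1.4 + n - y      # posterior for the new treatment
    hits = 0
    for _ in range(draws):
        p1 = rng.betavariate(a1, b1)   # draw from the posterior of p1
        p0 = rng.betavariate(15, 35)   # draw from the prior of p0
        if p1 > p0 + eps:
            hits += 1
    return hits / draws

pr = prob_new_beats_current(y=2, n=18)   # the (2, 18) stopping rule above
```

For the (2, 18) rule, the estimated probability should come out small (around the θL = .05 boundary), consistent with stopping for futility at that point.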
Bayesian Predictive Monitoring with Posterior Probability
One particular feature of the Bayesian paradigm is that one can obtain predictions based on the posterior predictive distribution. Frequentist predictive methods use conditional probability based on a particular value of a model parameter; Bayesian predictive methods average these probabilities over the parameter space given the observed data.
Lee and Liu (2008) used this concept to derive a method, Predictive Probability Monitoring.
Bayesian Predictive Monitoring with Posterior Probability
Let n = the current number of subjects, 1 ≤ n ≤ Nmax, where Nmax is the maximum planned sample size. Let X = the number of successes among the n treated patients, X ~ Bin(n, p1). Assume the prior distribution of the success rate p1 is Beta(a0, b0). The posterior distribution of p1 is then p1 | X = x ~ Beta(a0 + x, b0 + n - x).
Bayesian Predictive Monitoring with Posterior Probability
Let Y denote the number of successes among the m = N - n future recruits, 0 ≤ Y ≤ m. Given the current data x, Y follows a beta-binomial distribution, Y | x ~ Beta-Bin(N - n, a0 + x, b0 + n - x), with probability mass function
Pr(Y = y | x) = C(N - n, y) × B(a0 + x + y, b0 + N - x - y) / B(a0 + x, b0 + n - x),
where B(·,·) is the beta function.
Bayesian Predictive Monitoring with Posterior Probability
Thus the posterior distribution of the success rate, given y and x, is p1 | Y = y, X = x ~ Beta(a0 + y + x, b0 + N - y - x). The criterion for declaring the treatment promising is Prob(p1 > p0 | y, x) ≥ θT, where p0 is the previous success rate and θT is some threshold.
Bayesian Predictive Monitoring with Posterior Probability
Lee and Liu next defined the Predictive Probability (PP) as
PP ≡ Σ_{i=0}^{m} Pr(Y = i | X = x) × I[ Pr(p1 > p0 | Y = i, X = x) ≥ θT ],
where I[·] is an indicator function and Pr(Y = i | X = x) is the probability of observing i responses among the m future patients given the current data X = x.
Bayesian Predictive Monitoring with Posterior Probability
PP is the predictive probability of obtaining a positive result by the end of the study.
A high PP means that the treatment is likely to be declared efficacious by the end of the study, given the current data.
A low PP suggests that the treatment may not have sufficient activity. Therefore, PP can be used to determine whether the trial should be stopped early for efficacy or futility based on the current data. We are almost there…
Bayesian Predictive Monitoring with Posterior Probability
Next, choose two cut-offs θL (a small number) and θU (a large number), both in (0, 1).
1. If PP > θU, stop the study: the new treatment is promising.
2. If PP < θL, stop the study: the new treatment is not promising.
3. Otherwise, continue the study until Nmax is reached.
To illustrate, we use a hands-on example and software from the MD Anderson Cancer Center Software Repository.
Hands-On, Part A
Suppose you have a Phase II trial, where you have observed the first 10 subjects. Your trial has a prior distribution of success rates of Beta(.6, .4), and you have a budget for a maximum of 50 subjects. Suppose you set the threshold θT at 0.9, and would like the Type I error to be 0.05 and the power to be 0.9. You also would not stop for efficacy unless θU = 1.0. Use the Lee and Liu (2008) predictive probability method to design your trial.
Software inputs:
- The number of patients in the first cohort evaluated for response when PP interim decisions start to be implemented
- How many incoming subjects there are at any given time
- The maximum sample size
- Choose one of these design options
- Set your thresholds here; note the flexible starting point, ending point, and step size
- The success rate of the current treatment
- The target success rate for your trial
- The desired alpha and power levels
- Information from previous trial(s). What if you do not have any information?
What if your trial is the first, and you do not have any prior information?
Recall our assumption that the prior distribution of the response rate p follows a beta distribution, Beta(a0, b0).
The quantity a0/(a0 + b0) reflects the prior mean, while the size of a0 + b0 indicates how informative the prior is, with a0 = the number of prior successes and b0 = the number of prior failures.
Thus, a0 + b0 can be considered a measure of the amount of information contained in the prior.
Alternatively, you can start the trial and use the first observed successes and failures as a0 and b0.
Software outputs:
- Suitable ranges of θL and θT under an Nmax that satisfy the constrained Type I and Type II error rates
- Subject number
- First (negative) rejection region: the maximum number of patients with a positive response at which to terminate the trial. If the number of responses is less than or equal to this boundary, stop the trial and reject the alternative hypothesis.
- Second (positive) rejection region: the minimum number of patients with a positive response at which to terminate the trial and reject the null hypothesis. If the number of responses is greater than this boundary, stop the trial and reject the null hypothesis. If this number is greater than what you have in the sample, keep going: you cannot reject the null.
- The probability of making the negative decision under the null hypothesis
- The probability of making the positive decision under the null hypothesis
- The probability of continuing the trial under the null hypothesis (the same information is also computed under the alternative)
- PET: the probability of early termination for futility (PP < θL), the probability of early termination for efficacy (PP > θU), and the total probability of early termination
- The expected number of subjects under the null
- The highest Type I and Type II error rates within the suitable range of θL and θT
What are some major weaknesses of the methods we discussed?
Erratic accrual rate: Suppose a few subjects have been treated at a leisurely pace and then a long line of patients appears. The early data then drive the probabilities for everyone, and additional information does not get to play a role.
Accrual rate mismatch: If subjects are accruing faster than information is accruing, adaptive learning is compromised; if subjects are treated at a faster rate than outcomes can be recorded, there can be no adaptation to the data.
What are some major weaknesses of the methods we discussed?
Single-arm biases: Many promising drugs eventually fail in Phase III trials, even though they showed efficacy in Phase II trials. The reason: single-arm trials can introduce bias, because there can be significant but unobserved differences in patient populations, study criteria, and medical facilities between the current and previous studies. For a better assessment, predictive monitoring in randomized Phase II trials should be used.
Predictive Monitoring in Randomized Phase II Trials
Main idea: the posterior predictive probability can be used to monitor a study by predicting the outcome of the study at the time all subjects have been enrolled. If there is a high predictive probability that a definitive conclusion (e.g., superiority or futility) would be reached by the end of the study, then the study can be stopped earlier.
Predictive Monitoring in Randomized Phase II Trials
Suppose you have a two-arm trial. Let pk = the success rate for treatment k, pk ~ Beta(ak, bk), k = 1, 2; Nk = the maximum sample size planned for arm k; and Yk = the number of successes among the nk treated subjects in arm k, 1 ≤ nk ≤ Nk, with Yk ~ Bin(nk, pk). Thus, as before, the posterior distribution of pk is
pk | Yk = yk ~ Beta(ak + yk, bk + nk - yk), k = 1, 2.
Predictive Monitoring in Randomized Phase II Trials
Let Xk = the number of future successes among the remaining Nk - nk subjects in arm k. Then, as before, the posterior predictive distribution of Xk given Yk = yk is beta-binomial:
Pr(Xk = xk | yk) = C(Nk - nk, xk) × B(ak + yk + xk, bk + Nk - yk - xk) / B(ak + yk, bk + nk - yk).
Predictive Monitoring in Randomized Phase II Trials
Next, let H0: p1 = p2 versus H1: p1 ≠ p2. For each pair of future data (X1 = x1, X2 = x2), we can determine whether this hypothesis test would show a significant difference by the time the study concludes.
Predictive Monitoring in Randomized Phase II Trials
The predictive probability of rejecting H0 is found by summing over all possible future outcomes (x1, x2):
Pr( significant difference at end of study | data ) = Σ_{x1=0}^{N1-n1} Σ_{x2=0}^{N2-n2} P(x1 | y1) P(x2 | y2) I[ Reject H0 ],   (6)
where I[·] is an indicator function indicating whether a binomial test of two proportions for H0: p1 = p2 is significant.
Predictive Monitoring in Randomized Phase II Trials
We would reject H0 if |Z| ≥ z_{α/2}, where
Z = ( q1 - q2 ) / sqrt( q(1 - q)(1/N1 + 1/N2) ), with qk = (xk + yk)/Nk and q the pooled success rate,
and z_{α/2} is the 100(1 - α/2)th percentile of the standard normal distribution. This is a hybrid frequentist/Bayesian approach; see Yin (2012).
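The rejection rule can be written out directly. In this sketch, q is taken as the pooled success rate over both arms (an assumption, since the transcript garbles that definition), and the counts are hypothetical:

```python
from math import sqrt

def two_prop_z(x1, y1, N1, x2, y2, N2):
    """Pooled two-proportion z statistic at the end of the study,
    using totals: current successes y_k plus future successes x_k."""
    q1, q2 = (x1 + y1) / N1, (x2 + y2) / N2
    q = (x1 + y1 + x2 + y2) / (N1 + N2)      # pooled success rate (assumption)
    return (q1 - q2) / sqrt(q * (1 - q) * (1 / N1 + 1 / N2))

z = two_prop_z(x1=8, y1=5, N1=20, x2=3, y2=2, N2=20)   # hypothetical counts
reject = abs(z) >= 1.96                                 # z_{alpha/2} at alpha = .05
```

With these hypothetical counts (13/20 vs 5/20 total successes), z ≈ 2.54 and H0 would be rejected at α = .05.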
Predictive Monitoring in Randomized Phase II Trials
We can also employ a fully Bayesian interim monitoring procedure using predictive probability. Given current data (y1, y2) and future data (x1, x2), we compute the posterior probability
Pr( p1 > p2 | y1, y2, x1, x2 ) = ∫_0^1 ∫_{p2}^1 f(p1 | y1, x1) f(p2 | y2, x2) dp1 dp2,
where f(pk | yk, xk) is the probability density function of pk with distribution pk ~ Beta(ak + xk + yk, bk + Nk - xk - yk), k = 1, 2.
Predictive Monitoring in Randomized Phase II Trials
Treatment 1 is declared superior if Pr(p1 > p2 | y1, y2, x1, x2) ≥ θU, where θU is the usual threshold. But since the future data (x1, x2) have not yet been observed, we use the predictive probability
Σ_{x1=0}^{N1-n1} Σ_{x2=0}^{N2-n2} P(x1 | y1) P(x2 | y2) I[ Pr(p1 > p2 | y1, y2, x1, x2) ≥ θU ].
Predictive Monitoring in Randomized Phase II Trials
Yin (2012) illustration: For the frequentist hypothesis test, use a two-sided binomial test at α = 0.05. For the Bayesian method, use Beta(0.2, 0.8) prior distributions for p1 and p2, and set the threshold θU = 0.95.
N | y1/n1 (Arm 1) | y2/n2 (Arm 2) | Freq. Pr(favoring Arm 1) | Bayes Pr(favoring Arm 1) | Freq. Pr(favoring Arm 2) | Bayes Pr(favoring Arm 2)
40 | 5/10 | 2/10 | 0.5062 | 0.6702 | <0.0001 | <0.0001
60 | 5/10 | 2/10 | 0.6266 | 0.7225 | 0.0005 | 0.0013
80 | 5/10 | 2/10 | 0.6915 | 0.7567 | 0.002 | 0.0037
100 | 5/10 | 2/10 | 0.7291 | 0.7815 | 0.004 | 0.0065
100 | 10/20 | 4/20 | 0.8415 | 0.8999 | <0.0001 | <0.0001
100 | 15/30 | 6/30 | 0.9306 | 0.9735 | <0.0001 | <0.0001
100 | 20/40 | 8/40 | 0.991 | 0.9993 | <0.0001 | <0.0001
N = 40: Enroll 20 subjects in each arm to compare the two treatments. Suppose that after 10 patients were treated in each arm, 5 patients responded in Arm 1 and 2 patients responded in Arm 2.
The predictive probability of picking Arm 1 is 51% for the frequentist method and 67% for the Bayesian method; the predictive probability of picking Arm 2 is close to zero.
N = 60-100: As Nmax increases, the predictive probability of choosing Arm 1 increases, but reaches a plateau. These calculations show that there is value in adding data, but the conclusion is unchanged beyond a certain level of success.
Adaptive Randomization
For a more objective comparison of different treatments, subjects should be randomized to the two (or more) arms.
Randomization can use a fixed probability or an outcome-based adaptive probability; the latter is called adaptive randomization (AR).
AR is beneficial because each new subject has a higher probability of being assigned to the better-performing arm.
Yin, Chen and Lee (2012) proposed a method combining PP and AR for trial monitoring.
Adaptive Randomization
Yin, Chen and Lee (2012) proposed the use of a tuning parameter λ. Each new subject is randomized into Arm 1 with probability π(b, λ), a function of b = Pr(p1 > p2 | y1, y2) and the tuning parameter λ. This method is implemented in the AR software by the M.D. Anderson Cancer Center, Department of Biostatistics.
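The slide's randomization formula is garbled in this transcript; as an assumption, this sketch uses a common power-transform tuning of this kind (in the style of Thall and Wathen, 2007): π(b, λ) = b^λ / ( b^λ + (1 - b)^λ ), which gives equal randomization at λ = 0 and randomization probability b at λ = 1.

```python
import random

def arm1_probability(b, lam):
    """Power-transform randomization probability for Arm 1, where
    b = Pr(p1 > p2 | data) and lam >= 0 is the tuning parameter."""
    num = b ** lam
    return num / (num + (1 - b) ** lam)

def assign(b, lam, rng):
    """Randomize one subject: returns arm 1 or arm 2."""
    return 1 if rng.random() < arm1_probability(b, lam) else 2
```

The tuning parameter controls how aggressively the better-performing arm is favored: λ = 0 ignores the data entirely, while large λ pushes assignment probabilities toward 0 or 1.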
Topics covered so far: Key Design Principles; Basic Concepts in Adaptive Design; Simple Adaptive Stopping Rule; Gehan's 2-Stage Design; Simon's 2-Stage Design; Bayesian Predictive Monitoring with Posterior Probability; Predictive Monitoring in Randomized Phase II Trials; Adaptive Randomization.
Coming up: Dose-escalation Designs; Adaptive Sampling Methods; Hands-on Practice; Starter Kit; General Concerns with Adaptive Design.
Key Concepts in Adaptive Survey Design
"Adaptive survey design" can mean two different things. The first is adapting the design to improve survey quality metrics, such as response rate, sample balance, non-response error, stability/quality of estimates, sampling errors, etc.
Key Concepts for Dynamic Adaptive Survey Management
The second meaning: data are collected during the conduct of the survey, and at pre-determined points in the survey implementation, these data are examined to guide the rest of the implementation, in order to improve cost, efficiency, and precision. This is sometimes referred to as dynamic survey management, and it is the meaning we discuss in depth.
Key Concepts
The data collected and examined fall into four groups, including:
- auxiliary statistics, alternative modes, and previous response data
- response propensity, interviewer observations, time and travel, survey progress rate, and Web survey metrics
- the stability/quality of estimates
Key Concepts After the data are examined, the following survey design features can be changed
R-indicators and their use
An important tool for dynamic survey management is the R-indicator. First introduced by Schouten and Cobben (2007), it measures the extent to which the survey response deviates from a representative response. Schouten and Cobben introduced three measures, and many variants have been introduced since. R-indicators are used to track how a survey is performing over time, so that corrections or adjustments can be made.
R-indicators and their use
The basic R-indicator has the form R(ρ) = 1 - 2·S(ρ), where S(ρ) is the standard deviation of the individual response propensities ρ_i given the auxiliary variables X. An important variant is the partial R-indicator, first introduced by Schouten et al. (2010), which allows the representativeness of subgroups to be measured over time.
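A minimal sketch of the basic R-indicator (the function name is ours; this version uses the population standard deviation of the estimated propensities):

```python
from math import sqrt

def r_indicator(propensities):
    """Basic R-indicator: R(rho) = 1 - 2 * S(rho), where S is the
    (population) standard deviation of individual response propensities."""
    n = len(propensities)
    mean = sum(propensities) / n
    s = sqrt(sum((p - mean) ** 2 for p in propensities) / n)
    return 1 - 2 * s

perfect = r_indicator([0.5, 0.5, 0.5, 0.5])   # identical propensities give R = 1
```

Identical propensities mean a perfectly representative response (R = 1), while highly variable propensities drive R toward 0, flagging a survey that is drifting away from representativeness.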
Example from Miller et al. (2013): a dynamic survey management intervention in the National Survey of College Graduates.
Sampling Hard-to-Reach Populations
Examples: the homeless, sex workers, HIV/AIDS patients, drug users… These populations are hidden, invisible, or vulnerable, and connected through social networks. Traditional sampling methods may not be efficient here. One alternative: Adaptive Web Sampling (Steven Thompson).
Colorado Springs HIV/AIDS Study (Potterat 1993)
yi = 1 for an injection drug user and yi = 0 for a non-injection drug user; for drug-using relationships, Wij = 1 if there is a link between i and j, and Wij = 0 otherwise.
Adaptive Web Sampling (Steven Thompson, 2006). Main idea: follow a link + random jump.
Design-based: probability enters only through the design; there is no probability model for the population.
Selection of units depends on the observed values of interest. Variables of interest: node variables + link variables.
Flexible: balances depth and width. Two previous methods: the random walk (depth) and snowball sampling (width).
Sampling in networks
Population: units 1, 2, …, N. Variables of interest: node variables y1, y2, …, yN, and link variables (weights) wij, i, j = 1, 2, …, N.
Sample: (S, yS), where S = (units, pairs of units).
Example: a population of units 1, 2, …, 30 (figure: a 30-node network), with yi = 1 or yi = 0 for each unit, and Wij = 1 if there is a link between i and j, Wij = 0 otherwise.
Random-walk sampling:
Initial sample (sample size = 1): select a node at random; S0 = {7}.
First wave (sample size = 2): select a node at random among all nodes connected to the node selected in the previous step; S1 = {7, 10}.
Second wave (sample size = 3): same rule; S2 = {7, 10, 15}.
Third wave (sample size = 4): same rule; S3 = {7, 10, 1, 18}.
Fourth wave (sample size = 5): same rule; S4 = {7, 10, 1, 18, 19}.
Fifth wave (sample size = 6): same rule; S5 = {7, 10, 1, 18, 19, 17}.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 21 20 22 23 24 25 26 27 28 29 30
Snowball example (width):
Initial sample, sample size 5: S0 = {7, 11, 18, 20, 24}
First wave, sample size 20: S1 = {7, 11, 18, 20, 24, 4, 6, 8, 10, 12, 13, 14, 15, 16, 17, 19, 22, 23, 26, 27}
Second wave, sample size 25: S2 = {7, 11, 18, 20, 24, 4, 6, 8, 10, 12, 13, 14, 15, 16, 17, 19, 22, 23, 26, 27, 2, 3, 9, 21, 30}
Adaptive web sampling: the next unit or set of units is selected by following a link or by making a random jump, each with some probability.
Example walk: random jump → follow a link → follow a link → random jump → follow a link → follow a link.
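The follow-link / random-jump selection step can be sketched as follows. This is a simplification: Thompson's design assigns specific selection probabilities within an active set, which are collapsed here into a single hypothetical p_follow parameter.

```python
import random

def next_unit(sample, W, N, p_follow=0.8, rng=random):
    """Pick the next unit: with probability p_follow, follow a link out
    of the current sample; otherwise jump to a random unsampled unit."""
    out_links = [j for i in sample for j in range(N)
                 if W[i][j] == 1 and j not in sample]
    unsampled = [j for j in range(N) if j not in sample]
    if out_links and rng.random() < p_follow:
        return rng.choice(out_links)   # follow a link
    return rng.choice(unsampled)       # random jump

# Toy 5-node ring network: each unit is linked to its two neighbors
N = 5
W = [[1 if abs(i - j) in (1, N - 1) else 0 for j in range(N)]
     for i in range(N)]
sample = {0}
for _ in range(3):
    sample.add(next_unit(sample, W, N))
print(sorted(sample))
```

The random jump guarantees the walk cannot get trapped in one network component, which is the point of mixing the two moves.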
Design-based estimation
Main idea: start with some preliminary estimator, then improve it by Rao-Blackwellization.
Sufficient statistic: a statistic T(X) is sufficient for the parameter Θ iff P(X | T, Θ) = P(X | T). A statistic T(X) is minimal sufficient iff (i) T(X) is sufficient and (ii) for any other sufficient statistic U(X), T(X) is a function of U(X).
In network sampling, the minimal sufficient statistic is d_r = {(i, y_i, w_i+, w_ij) : i, j are sampled units}, i.e., the distinct units and their associated values of interest, where w_i+ is the out-degree.
Rao-Blackwellization
Rao-Blackwell theorem: the conditional expectation of the original estimator given a sufficient statistic is always as good as or better than the original:
û = E(û0 | d_r) = Σ_paths û0(S) P(S | d_r), with E[(û − μ)²] ≤ E[(û0 − μ)²].
Note that the sample S includes order information.
Completeness ensures that the distributions corresponding to different values of the parameter are distinct.
If the conditioning statistic is both complete and sufficient, and the original estimator is unbiased, then the Rao-Blackwell estimator is the unique best unbiased estimator.
In network sampling, the minimal sufficient statistic d_r is not complete, so there is no unique best estimator.
Four Rao-Blackwellized estimators (there is no unique best estimator), used to estimate the proportion of injection drug users:
Estimator based on the initial sample mean: û01 = (1/n0) Σ_{i∈S0} y_i, or û01 = (1/N) Σ_{i∈S0} y_i/π_i.
Improved estimator by Rao-Blackwellization: û1 = E(û01 | d_r) = Σ_paths û01(S) P(S | d_r), where the paths are all permutations of the sampled units.
Similarly, the other three estimators and their Rao-Blackwellized improvements are based on conditional probabilities, a composite conditional generalized ratio, and a composite conditional mean of ratios.
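For a sample small enough to enumerate, the Rao-Blackwellized estimator can be computed exactly. A minimal sketch, assuming a toy design in which every ordering of the reduced data is equally likely (real adaptive designs induce unequal path probabilities, supplied here through the path_prob argument):

```python
from itertools import permutations

def rao_blackwellize(units, y, estimator, path_prob):
    """û = Σ_paths û0(S) P(S | d_r): average the preliminary estimator
    over all orderings (paths) of the sampled units."""
    paths = list(permutations(units))
    weights = [path_prob(p) for p in paths]
    total = sum(weights)
    return sum(estimator(p, y) * w for p, w in zip(paths, weights)) / total

# Preliminary estimator: the y-value of the first unit drawn
def u0(path, y):
    return y[path[0]]

y = {7: 1, 10: 0, 15: 1}   # sampled units with observed values
# Toy assumption: all orderings equally likely given the reduced data
u_hat = rao_blackwellize(y.keys(), y, u0, path_prob=lambda p: 1.0)
print(u_hat)
```

Under the equal-probability assumption the average over orderings collapses to the sample mean; with design-based path probabilities the two generally differ.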
Rao-Blackwellization
Computational issue: going through all possible sample paths (all n! permutations) is prohibitively expensive!
Solution: MCMC. Target: generate a Markov chain of permutations X0, X1, X2, … with stationary distribution p(x | d_r).
Metropolis-Hastings algorithm: (1) generate a tentative permutation t_k; (2) set x_k = t_k with probability α, else x_k = x_{k−1}, where α = min{ [p(t_k)/p(x_{k−1})] · [p_t(x_{k−1} | t_k)/p_t(t_k | x_{k−1})], 1 }.
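The Metropolis-Hastings step can be sketched with a random-transposition proposal, which is symmetric, so the acceptance ratio reduces to p(t_k)/p(x_{k−1}). The target distribution below is a toy stand-in for p(x | d_r):

```python
import math
import random

def mh_permutations(units, log_p, steps=2000, rng=random):
    """Markov chain over permutations with stationary distribution
    proportional to exp(log_p(x)), via random transposition proposals."""
    x = list(units)
    chain = []
    for _ in range(steps):
        t = x[:]                       # propose: swap two positions
        i, j = rng.randrange(len(t)), rng.randrange(len(t))
        t[i], t[j] = t[j], t[i]
        # symmetric proposal => alpha = min(p(t)/p(x), 1)
        if math.log(rng.random() + 1e-300) < log_p(t) - log_p(x):
            x = t
        chain.append(tuple(x))
    return chain

# Toy unnormalized target: favor permutations that start with unit 7
log_p = lambda perm: 1.0 if perm[0] == 7 else 0.0
rng = random.Random(0)
chain = mh_permutations([7, 10, 15, 18], log_p, steps=4000, rng=rng)
frac_start_7 = sum(p[0] == 7 for p in chain) / len(chain)
print(frac_start_7)
```

Averaging the preliminary estimator over the states of this chain approximates the Rao-Blackwellized estimator without enumerating all n! paths.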
Simulation results (Thompson 2006). Left column: the four original estimators; right column: the improved estimators after Rao-Blackwellization.
Maximum Tolerated Dose
The maximum tolerated dose (MTD) is the highest dose whose probability of dose-limiting toxicity does not exceed a pre-specified target level.
Dose Finding
Dose Finding – 3 + 3 Design
The 3 + 3 design seeks the MTD while limiting both the number of patients exposed to excessive toxicity and the number treated at an ineffective dose.
Dose Finding – 3 + 3 Design
1. Current dose level = j (administered to 3 patients).
2. N(j) = number experiencing dose-limiting toxicity (DLT).
   a. If N(j) = 0, then escalate the dose to j + 1 and go to step 1.
   b. Else if N(j) = 1, treat 3 more patients at dose level j.
      i. If N(j) = 1, escalate the dose to j + 1.
      ii. Else if N(j) = 2, the trial ends and MTD = j − 1.
      iii. Else, treat 3 more patients at dose j − 1.
Dose Finding – 3 + 3 Design
Flowchart summary: 3 patients at dose level j. DLT = 0/3: escalate to j + 1. DLT = 1/3: treat 3 more patients at dose j; then DLT = 1/6: escalate to j + 1; DLT = 2/6 or more: MTD = j − 1. DLT > 1/3: de-escalate to j − 1.
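The 3 + 3 rules are easy to simulate, which is how their operating characteristics are usually studied. A sketch with hypothetical true toxicity probabilities; for simplicity it declares MTD = j − 1 directly rather than running the extra de-escalation cohort from the flowchart:

```python
import random

def three_plus_three(true_tox, rng=random):
    """Simulate one 3+3 dose-escalation trial.

    true_tox: true DLT probability at each dose level (known only to
    the simulator). Returns the selected MTD index, or -1 if even the
    lowest dose is too toxic."""
    j = 0
    while True:
        n_dlt = sum(rng.random() < true_tox[j] for _ in range(3))
        if n_dlt == 1:                       # 1/3: expand the cohort
            n_dlt += sum(rng.random() < true_tox[j] for _ in range(3))
            if n_dlt > 1:
                return j - 1                 # >= 2/6: MTD is previous dose
        elif n_dlt >= 2:
            return j - 1                     # >= 2/3: MTD is previous dose
        if j == len(true_tox) - 1:
            return j                         # highest dose reached
        j += 1                               # 0/3 or 1/6: escalate

rng = random.Random(0)
mtds = [three_plus_three([0.05, 0.10, 0.25, 0.45, 0.60], rng)
        for _ in range(1000)]
print(sum(m == 2 for m in mtds) / len(mtds))
```

Repeating the trial many times shows how often each dose is declared the MTD under an assumed toxicity curve, exactly the kind of simulation the FDA guidance discussed later asks for.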
Dose Finding – Continual Reassessment Model
Common dose-toxicity models for the CRM, with dose levels d_1 < … < d_K and parameter β:
Power (empiric) model: ρ_k(β) = q_k^exp(β), where q_1 < … < q_K is a pre-specified skeleton of prior toxicity probabilities.
Logistic model: ρ_k(β) = exp(x + β d_k) / (1 + exp(x + β d_k)), for some fixed x.
Hyperbolic tangent model: ρ_k(β) = [(tanh(d_k) + 1) / 2]^β.
Dose Finding – Continual Reassessment Model
Given y_k toxicities among n_k patients treated at dose k, the likelihood under the power model is
L(β) ∝ ∏_{k=1}^{K} [q_k^exp(β)]^{y_k} [1 − q_k^exp(β)]^{n_k − y_k}.
Dose Finding – Continual Reassessment Model
With prior g(β), the posterior mean toxicity probability at dose k is
ρ̂_k = ∫ q_k^exp(β) L(β) g(β) dβ / ∫ L(β) g(β) dβ.
After each cohort, recalculate the posterior means at all dose levels.
Dose Finding – Continual Reassessment Model
The next dose is the level with posterior mean toxicity closest to the target ϱ_U:
k* = argmin_{k∈{1,…,K}} |ρ̂_k − ϱ_U|
A common restriction is that each dose change (de-escalation or escalation) never exceeds one dose level.
Dose Finding – Continual Reassessment Model
1. Treat the first cohort at the lowest dose level.
2. Denote the current dose level as k0 and, based on the observed data, obtain the posterior mean toxicity estimates ρ̂_1, …, ρ̂_K.
3. Find the dose level with toxicity probability closest to ϱ_U: k* = argmin_{k∈{1,…,K}} |ρ̂_k − ϱ_U|
   a. If k0 > k*, de-escalate to dose k0 − 1;
   b. If k0 < k*, escalate to dose k0 + 1;
   c. Otherwise, the dose remains the same.
4. Once the maximum sample size has been reached, set the MTD as the dose with toxicity probability closest to ϱ_U.
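The posterior quantities in steps 2 and 3 reduce to one-dimensional integrals over β, which can be approximated on a grid. A sketch under the power model ρ_k(β) = q_k^exp(β) with a normal prior on β; the skeleton, prior standard deviation, and data below are all hypothetical:

```python
import math

def crm_posterior_means(skeleton, n_tox, n_pat, prior_sd=1.34):
    """Posterior mean toxicity at each dose under the power model
    rho_k(beta) = q_k ** exp(beta), via grid integration over beta."""
    grid = [i / 100 for i in range(-400, 401)]   # beta in [-4, 4]
    def loglike(b):
        e = math.exp(b)
        ll = 0.0
        for q, y, n in zip(skeleton, n_tox, n_pat):
            p = q ** e
            ll += y * math.log(p) + (n - y) * math.log(1 - p)
        return ll
    # unnormalized posterior weight at each grid point (normal prior)
    w = [math.exp(loglike(b) - (b / prior_sd) ** 2 / 2) for b in grid]
    z = sum(w)
    return [sum(wi * q ** math.exp(b) for wi, b in zip(w, grid)) / z
            for q in skeleton]

skeleton = [0.05, 0.12, 0.25, 0.40]   # hypothetical prior skeleton
rho = crm_posterior_means(skeleton, n_tox=[0, 1, 2, 0], n_pat=[3, 3, 3, 0])
target = 0.25
k_star = min(range(len(rho)), key=lambda k: abs(rho[k] - target))
print(rho, k_star)
```

After each cohort the data vectors are updated and the function is re-run; the next dose is k_star, subject to the one-level escalation restriction.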
Dose Finding – Continual Reassessment Model
Safety stopping rule: stop the trial if even the lowest dose appears too toxic, i.e., if
P(ρ_1 > ϱ_U | Observed) = ∫_{−∞}^{log(log ϱ_U / log q_1)} g(β | Observed) dβ > υ
for some pre-specified threshold υ.
Dose Finding – Bayesian Model Averaging CRM
Motivation: before extending the CRM, note the following:
The choice of a single skeleton q_1, …, q_K adds subjectivity to the design.
Verifying the correctness of one pre-specified skeleton may be impossible.
Dose Finding – Bayesian Model Averaging CRM
The Bayesian model averaging CRM (BMA-CRM) runs a collection of CRMs in parallel, one per candidate skeleton.
Each skeleton represents a different prior guess as to the toxicity level of the drug.
Estimates are combined across models with weights proportional to the model fit.
Dose Finding – Bayesian Model Averaging CRM
For model M_l with skeleton q_{l1} < … < q_{lK}, the toxicity probabilities are ρ_{lk}(β_l) = q_{lk}^exp(β_l), and
the likelihood is L(β_l | M_l) ∝ ∏_{k=1}^{K} [q_{lk}^exp(β_l)]^{y_k} [1 − q_{lk}^exp(β_l)]^{n_k − y_k}.
Dose Finding – Bayesian Model Averaging CRM
With L candidate models, take prior model probabilities P(M_l) = 1/L (uniform distribution) if there is no prior information.
The marginal likelihood of M_l is L_l = ∫ L(β_l | M_l) g(β_l | M_l) dβ_l.
The posterior model probability is
P(M_l | Observed) = L_l P(M_l) / Σ_{j=1}^{L} L_j P(M_j).
Dose Finding – Bayesian Model Averaging CRM
Let ρ̂_{l1}, …, ρ̂_{lK} be the posterior means of the toxicity probabilities under model M_l, where
ρ̂_{lk} = ∫ q_{lk}^exp(β_l) L(β_l | M_l) g(β_l | M_l) dβ_l / ∫ L(β_l | M_l) g(β_l | M_l) dβ_l.
The BMA estimate averages across models:
ρ̂_k = Σ_{l=1}^{L} ρ̂_{lk} P(M_l | Observed).
Dose Finding – Bayesian Model Averaging CRM
In the BMA estimate ρ̂_k, each ρ̂_{lk} is assigned the weight P(M_l | Observed).
Better-fitting skeletons therefore automatically receive more weight.
Dose escalation then proceeds as in the CRM, based on ρ̂_k.
Dose Finding – Bayesian Model Averaging CRM
Patients are treated in cohorts, one cohort at a time.
1. The first cohort is administered the lowest treatment level.
2. Denote the current dose level as k0 and, based on observed data, obtain the BMA estimates ρ̂_1, …, ρ̂_K for the toxicity probability at all dose levels.
Dose Finding – Bayesian Model Averaging CRM
3. Find the dose level with toxicity probability closest to ϱ_U: k* = argmin_{k∈{1,…,K}} |ρ̂_k − ϱ_U|
   a. If k0 > k*, de-escalate to dose k0 − 1;
   b. If k0 < k*, escalate to dose k0 + 1;
   c. Otherwise, the dose remains the same.
4. Once the maximum sample size has been reached, set the MTD as the dose with toxicity probability closest to ϱ_U.
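The BMA machinery can be sketched by reusing the same grid integration per model: each model's marginal likelihood is approximated by the normalizing constant of its grid posterior, and since every model shares the same grid and β prior, these constants are comparable under a uniform model prior. Skeletons and data are hypothetical:

```python
import math

def marginal_and_means(skeleton, n_tox, n_pat, prior_sd=1.34):
    """Grid approximation of the marginal likelihood and posterior
    mean toxicities for one CRM power model q_k ** exp(beta)."""
    grid = [i / 100 for i in range(-400, 401)]
    def loglike(b):
        e = math.exp(b)
        return sum(y * math.log(q ** e) + (n - y) * math.log(1 - q ** e)
                   for q, y, n in zip(skeleton, n_tox, n_pat))
    w = [math.exp(loglike(b)) * math.exp(-(b / prior_sd) ** 2 / 2)
         for b in grid]
    z = sum(w)   # proportional to the marginal likelihood L_l
    means = [sum(wi * q ** math.exp(b) for wi, b in zip(w, grid)) / z
             for q in skeleton]
    return z, means

# Two hypothetical skeletons (parallel CRMs), uniform model prior
skeletons = [[0.05, 0.12, 0.25, 0.40], [0.10, 0.20, 0.35, 0.50]]
n_tox, n_pat = [0, 1, 2, 0], [3, 3, 3, 0]
results = [marginal_and_means(s, n_tox, n_pat) for s in skeletons]
total = sum(z for z, _ in results)
weights = [z / total for z, _ in results]   # P(M_l | Observed)
rho = [sum(w * m[k] for w, (_, m) in zip(weights, results))
       for k in range(4)]
print(weights, rho)
```

The resulting ρ̂_k feed into step 3 exactly as in the single-model CRM; the only change is that the weights shift toward whichever skeleton fits the accumulating data better.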
Escalation With Overdose Control (EWOC)
EWOC escalates doses quickly while controlling the predicted probability of overdosing: each dose is chosen so that the posterior probability of exceeding the MTD stays below a pre-specified threshold of overdose proportion (the feasibility bound α).
Escalation With Overdose Control (EWOC)
For patient j with dose ℓ_j and binary toxicity outcome y_j, model the relationship between ℓ_j and y_j by P(y_j = 1 | ℓ_j) = G(γ0 + γ1 ℓ_j), for a link function G.
Escalation With Overdose Control (EWOC)
L(y_1, …, y_n | γ0, γ1) = ∏_{j=1}^{n} G(γ0 + γ1 ℓ_j)^{y_j} [1 − G(γ0 + γ1 ℓ_j)]^{1 − y_j}
The parameters (γ0, γ1) can be re-expressed in terms of the MTD ℓ* and the toxicity probability of the lowest dose ℓ0:
ϱ_U = P(y = 1 | dose = ℓ*) = G(γ0 + γ1 ℓ*)
ρ0 = P(y = 1 | dose = ℓ0) = G(γ0 + γ1 ℓ0)
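A grid-based sketch of the EWOC dose selection, assuming a logistic link G and a flat prior over a bounded (γ0, γ1) grid (both are assumptions; the slides leave G generic). The next dose is the α-quantile of the posterior distribution of the MTD, which is what keeps the posterior overdose probability below α:

```python
import math

def ewoc_next_dose(doses, y, alpha=0.25, target=0.33):
    """Escalation with overdose control: grid posterior over logistic
    parameters, then return the alpha-quantile of the MTD posterior.
    doses, y: doses given so far and their 0/1 toxicity outcomes."""
    g0_grid = [i / 10 for i in range(-60, 1)]   # intercept gamma_0
    g1_grid = [i / 10 for i in range(1, 41)]    # slope gamma_1 > 0
    post = []
    for g0 in g0_grid:
        for g1 in g1_grid:
            ll = 0.0
            for x, yi in zip(doses, y):
                p = 1 / (1 + math.exp(-(g0 + g1 * x)))
                ll += math.log(p if yi else 1 - p)
            # the MTD solves G(g0 + g1 * mtd) = target
            mtd = (math.log(target / (1 - target)) - g0) / g1
            post.append((mtd, math.exp(ll)))    # flat prior on the grid
    post.sort()                                  # order by MTD value
    total = sum(w for _, w in post)
    acc = 0.0
    for mtd, w in post:                          # alpha-quantile of MTD
        acc += w
        if acc >= alpha * total:
            return mtd

# Hypothetical trial so far: two patients at dose 1.0 (no toxicity),
# one patient at dose 1.5 (toxicity observed)
next_dose = ewoc_next_dose([1.0, 1.0, 1.5], [0, 0, 1])
print(next_dose)
```

Choosing the α-quantile rather than the posterior mean is the overdose-control step: the posterior probability that the administered dose exceeds the MTD is at most α.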
FDA Draft Guidance (2010): Cautions
Adaptive designs should be evaluated in terms of their performance and characteristics: control of the Type I error rate, minimization of the impact of adaptation-associated statistical or operational bias on the estimates of treatment effects, the potential increase of Type II error rates, and the interpretability of study results.
Control of Study-Wide Type I Errors
Adaptive designs often allow for early rejection of various hypotheses, for changing the final hypothesis to be tested, or for increasing the sample size.
These repeated interim analyses represent multiplicities that have the potential to inflate the Type I error rate, so the design must account for them explicitly.
Statistical Bias in Estimates of Treatment Effect Associated with Study Design Adaptations
Interim analyses examine multiple study outcomes, chief among them the treatment effects.
Because interim estimates are based on smaller samples, some bias may exist.
Adaptations also tend to overstate the treatment effect, because studies usually steer subjects toward the arms with the highest observed impact; this is another source of bias.
Potential for Increased Type II Error Rate
An inflated Type II error rate means the study may fail to detect a treatment effect when one actually exists.
This can occur when the design is changed in some way due to an interim data analysis.
In particular, some treatments with a delayed impact may be erroneously dropped due to futility.
The Role of Simulations
When multiple adaptation rules are simultaneously considered in the adaptive process, it is difficult to assess design performance analytically.
Simulations performed before conducting the study can help evaluate the various trial design options.
Simulations can also explore scenarios that might occur when the study is actually conducted, so that proper adaptations can be designed in advance.
Things to compare across designs
Operating characteristics such as Type I error rate, power, expected sample size, and study duration, evaluated under different values of the chosen threshold.
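Such comparisons are typically run by Monte Carlo. A toy sketch comparing a fixed-sample design with a two-stage version that stops early for futility, reporting empirical power and expected sample size (the test statistic, threshold, and effect size are all hypothetical):

```python
import random
import statistics

def simulate(two_stage, effect, n=100, reps=2000, futility_z=0.0,
             rng=random):
    """Monte Carlo operating characteristics of a one-sample z-style
    test (outcome sd assumed 1): returns (power, expected sample size)."""
    hits, sizes = 0, []
    for _ in range(reps):
        first = [rng.gauss(effect, 1) for _ in range(n // 2)]
        z1 = statistics.mean(first) * (n // 2) ** 0.5
        if two_stage and z1 < futility_z:
            sizes.append(n // 2)          # stop early for futility
            continue
        second = [rng.gauss(effect, 1) for _ in range(n - n // 2)]
        z = statistics.mean(first + second) * n ** 0.5
        sizes.append(n)
        hits += z > 1.96                  # one-sided final test
    return hits / reps, statistics.mean(sizes)

rng = random.Random(1)
power_fixed, ess_fixed = simulate(False, effect=0.3, rng=rng)
power_adapt, ess_adapt = simulate(True, effect=0.3, rng=rng)
print(power_fixed, ess_fixed, power_adapt, ess_adapt)
```

Running the same comparison under effect = 0 checks the Type I error rate; varying the futility threshold traces out the power / sample-size trade-off across designs.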
Things you need to do to get started
Demonstrate, typically by simulation, the improvement over competing designs and over the traditional fixed design.
Pre-specify the adaptation rules and the timing of interim analyses.
What We Covered