Bayesian Decision Theory with applications to Experimental Design


Slide 1

Bayesian Decision Theory with applications to Experimental Design

Robbie Peck University of Bath

Slide 2

Overview

Bayesian Decision Theory through example
◮ Motivating Example
◮ Ingredients
◮ Special cases
◮ Gain functions
◮ Bayes Decision Rule

Dynamic Programming: Sequential Decision Theory

Application to Experimental Design
◮ Setting the picture of a Phase II/III program
◮ Decision 2
◮ Decision 1

Slide 3

The Umbrella Conundrum

◮ You can take the umbrella, or not take it.
◮ It may or may not rain during the day.
◮ Do not take the umbrella, and it rains → you get wet.
◮ Take the umbrella, and it does not rain → you have to carry it around all day.
◮ You may look at the sky, or see the weather forecast, which may help inform your decision.

Slide 4

Ingredients

State of nature θ ∈ Θ, with associated prior πθ(·).
Data x ∈ X, with likelihood πx(· ; θ).
Action α ∈ A.
Decision rule d : X → A.

The state of nature is unknown, and the observed data may depend upon the state of nature. The decision rule stipulates which action to take given the observed data.

Slide 5

Ingredients

In the umbrella example:
State of nature: Θ := {rain occurs, rain does not occur}.
Data: X := {no clouds, few clouds, many clouds}, or [0, 1].
Action: A := {take umbrella, do not take umbrella}.
Decision rule d : X → A, for example:
◮ d(x) = take umbrella ∀x
◮ d(x) = do not take umbrella ∀x
◮ d(x) = take umbrella if x ∈ {few clouds, many clouds}, do not take umbrella if x = no clouds

Slide 6

No data, Equally weighted losses case...

Suppose we have no data X. Further, suppose there is a bijection γ : A → Θ between actions and states of nature, with incorrect actions weighted equally, e.g. α = take umbrella ⇒ γ(α) = rain.

Slide 7

No data, Equally weighted losses case...

Suppose we have no data X. Further, suppose there is a bijection γ : A → Θ between actions and states of nature, with incorrect actions weighted equally, e.g. α = take umbrella ⇒ γ(α) = rain.

Optimal decision rule d: take action α ⇔ α maximises πθ(γ(α)).

So if the prior gives a weighting of πθ(rain) < 0.5, we never take the umbrella!

Slide 8

... suppose we have data

The posterior probability may govern our decision:

π(θ | x) = πx(x | θ) πθ(θ) / π(x),  where  π(x) = ∫Θ πx(x | θ) πθ(θ) dθ.

Slide 9

... suppose we have data

The posterior probability may govern our decision:

π(θ | x) = πx(x | θ) πθ(θ) / π(x),  where  π(x) = ∫Θ πx(x | θ) πθ(θ) dθ.

By minimising the average probability of error

P(error) = ∫−∞^∞ P(error | x) π(x) dx,   (1)

one obtains

d(x) = argmax_{α ∈ A} π(γ(α) | x).

Uniform likelihoods ⇒ the decision relies only on the prior. Uniform prior ⇒ the decision relies only on the likelihood. (This is the Bayes decision rule in the case of equal losses.)
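The equal-losses rule can be sketched numerically. A minimal example for the umbrella problem, where the prior and likelihood values are illustrative assumptions (the slides give no numeric likelihoods):

```python
# Bayes decision rule under equal losses: pick the action whose associated
# state of nature has the highest posterior probability.
# Prior and likelihood numbers below are assumed for illustration only.

prior = {"rain": 0.25, "no rain": 0.75}

# likelihood[x][theta] = pi_x(x | theta), illustrative values
likelihood = {
    "no clouds":   {"rain": 0.1, "no rain": 0.6},
    "few clouds":  {"rain": 0.3, "no rain": 0.3},
    "many clouds": {"rain": 0.6, "no rain": 0.1},
}

# gamma: the bijection from actions to states of nature
gamma = {"take umbrella": "rain", "do not take umbrella": "no rain"}

def posterior(x):
    # pi(theta | x) = pi_x(x | theta) pi_theta(theta) / pi(x)
    unnorm = {th: likelihood[x][th] * prior[th] for th in prior}
    z = sum(unnorm.values())
    return {th: p / z for th, p in unnorm.items()}

def decide(x):
    # argmax over actions of the posterior probability of gamma(alpha)
    post = posterior(x)
    return max(gamma, key=lambda a: post[gamma[a]])
```

With these numbers, many clouds tilt the posterior towards rain, so `decide("many clouds")` returns "take umbrella", while `decide("no clouds")` returns "do not take umbrella".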

Slide 10

The need for gain functions

... but not taking an umbrella when it rains is worse than taking an umbrella when it does not rain! We introduce gain functions to complete our theory.

Slide 11

Gain functions

The gain function describes the gain of each action.

G(α ; θ) : A × Θ → R is the gain incurred by taking action α when the state of nature is θ. In the case of equal costs, G(αi, θj) = δi,j for suitably ordered α and θ.

The expected gain G : A → R, given observed data x, is defined as

G(α | x) = ∫Θ G(α ; θ) π(θ | x) dθ.   (2)

Slide 12

Bayes Decision Rule

Defining the overall gain of a decision rule d as

∫X G(d(x) | x) π(x) dx,   (3)

choosing the decision rule d that maximises the overall gain gives us Bayes Decision Rule:

d(x) = argmax_{α ∈ A} G(α | x) = argmax_{α ∈ A} ∫Θ G(α ; θ) π(θ | x) dθ.   (4)
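A minimal sketch of this rule for the discrete umbrella problem. The gain values follow the table on the gain-function slide, with signs assumed where the extraction is illegible; the posterior probabilities are passed in as already computed:

```python
# Bayes decision rule with a gain function: choose the action maximising the
# posterior-expected gain G(alpha | x) = sum_theta G(alpha; theta) pi(theta | x).
# Gain values loosely follow the slide's table; signs are assumptions.
gain = {
    ("take umbrella", "rain"): 0.1,
    ("take umbrella", "no rain"): -0.1,
    ("do not take umbrella", "rain"): -1.0,
    ("do not take umbrella", "no rain"): 1.0,
}

def expected_gain(action, post):
    # post maps each state of nature to its posterior probability pi(theta | x)
    return sum(g * post[theta] for (a, theta), g in gain.items() if a == action)

def bayes_rule(post):
    actions = {a for a, _ in gain}
    return max(actions, key=lambda a: expected_gain(a, post))
```

The asymmetric gains do what the motivation slide asks: a posterior only mildly favouring rain is already enough to make taking the umbrella optimal, because getting wet is penalised far more heavily than carrying the umbrella needlessly.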

Slide 13

Back to the umbrella problem

◮ Prior on the state of nature:

πθ(θ) = 0.25 if θ = rain occurs, 0.75 if θ = no rain occurs.

◮ Gain function G(·, ·) takes the following form:

θ \ action α    take umbrella    do not take umbrella
it rains            0.1                 −1
no rain            −0.1                  1
Slide 14

Back to the umbrella problem

◮ We observe some data x ∈ X relating to the prevalence of clouds in the sky, on the continuous scale from 0 to 1.
◮ The likelihood of cloud prevalence x ∈ X = [0, 1] given θ is: [figure of likelihoods not reproduced]
Slide 15

Bayes decision rule in this case is

d(x) = argmax_{α ∈ A} Σ_{θ ∈ {rain, no rain}} G(α ; θ) π(θ | x),   (5)

with (∗) denoting the inner sum Σθ G(α ; θ) π(θ | x).
Slide 16

Bayes decision rule in this case is

d(x) = argmax_{α ∈ A} Σ_{θ ∈ {rain, no rain}} G(α ; θ) π(θ | x),   (5)

with (∗) denoting the inner sum. Plotting (∗) for each α ∈ A:

Slide 17

Bayes decision rule in this case is

d(x) = argmax_{α ∈ A} Σ_{θ ∈ {rain, no rain}} G(α ; θ) π(θ | x),   (5)

with (∗) denoting the inner sum. Plotting (∗) for each α ∈ A, Bayes decision rule is

d(x) = take umbrella if x ≥ 0.4, do not take umbrella if x < 0.4.
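The threshold can be located numerically by scanning for the x at which the two expected-gain curves cross. A sketch with assumed linear likelihoods on [0, 1] (the slides' likelihoods are not reproduced here, so this crossing point differs from their x = 0.4):

```python
# Find where the posterior-expected gains of the two actions cross.
# Prior matches the slides; likelihoods and hence the threshold are assumed.
prior = {"rain": 0.25, "no rain": 0.75}
gain = {("take", "rain"): 0.1, ("take", "no rain"): -0.1,
        ("skip", "rain"): -1.0, ("skip", "no rain"): 1.0}

def lik(x, theta):
    # assumed densities on [0, 1]: rain likelier when the sky is cloudier
    return 2 * x if theta == "rain" else 2 * (1 - x)

def exp_gain(action, x):
    unnorm = {th: lik(x, th) * prior[th] for th in prior}
    z = sum(unnorm.values())
    return sum(gain[(action, th)] * p / z for th, p in unnorm.items())

# scan a grid for the first x where "take" is at least as good as "skip"
threshold = next(x / 1000 for x in range(1001)
                 if exp_gain("take", x / 1000) >= exp_gain("skip", x / 1000))
```

With these assumed likelihoods the crossing happens where the posterior probability of rain reaches 0.5, at x = 0.75; the slides' earlier threshold of 0.4 reflects their own likelihood curves.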

Slide 18

The sequential decision problem

When making decisions sequentially, the decisions you make at each stage

◮ determine interim loss or gain, and
◮ affect the ability to make decisions at further stages.

Slide 19

The sequential decision problem

Dynamic programming (or backward induction) approach: Find the optimal decision rule at the last stage, then work backwards stage by stage, keeping track of the optimal decision rule and the expected payoff when this rule is applied in each stage.
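The backward-induction idea can be shown on a toy two-stage problem (an illustrative example, not taken from the slides), where the stage-1 action carries an immediate cost and determines which stage-2 options are available:

```python
# Backward induction: solve the last stage first, then fold its optimal
# value back into the stage-1 comparison.  All numbers are illustrative.

# stage2[state] = {action: gain at the final stage}
stage2 = {
    "small trial": {"stop": 0.0, "continue": 1.0},
    "large trial": {"stop": 0.5, "continue": 2.0},
}
# stage1[action] = (immediate gain, resulting stage-2 state)
stage1 = {
    "cheap design":  (-1.0, "small trial"),
    "costly design": (-1.5, "large trial"),
}

# Step 1: optimal value of each stage-2 state.
value2 = {state: max(acts.values()) for state, acts in stage2.items()}

# Step 2: stage-1 action maximising immediate gain plus optimal stage-2 value.
best1 = max(stage1, key=lambda a: stage1[a][0] + value2[stage1[a][1]])
```

Note that the costlier stage-1 action wins here (−1.5 + 2.0 = 0.5 versus −1.0 + 1.0 = 0.0): evaluating stage 1 in isolation would have picked the wrong design, which is exactly why the stages must be solved backwards.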

Slide 20

Setting the picture of a Phase II/III program

Often we have several treatments that show promise. We require a program that:

◮ selects the most promising treatment (Phase II), and
◮ builds up evidence of the efficacy of that treatment (Phase III).

Optimising the overall program is a complicated problem: the best way to design Phase II depends on how one uses the results of Phase II in designing Phase III.

Slide 21

Phase III design, given Phase II data. Phase II design.

Slide 22

Phase III design, given Phase II data. Phase II design.

Slide 23

Statistical Model (in Phase II)

Prior: θ ∼ N(µ0, Σ0).   (6)

Likelihood: θ̂1 | θ ∼ N(θ, Σ), where I1 = (n1^(t)/σ²)(1 + K^(−1/2))^(−1), and Σ is the K × K matrix with diagonal entries I1^(−1) and off-diagonal entries σ²/(√K n1^(t)).   (7)

Posterior:

θi | θ̂1 ∼ N( [(Σ0^(−1) + Σ^(−1))^(−1) (Σ^(−1) θ̂1 + Σ0^(−1) µ0)]i , [(Σ0^(−1) + Σ^(−1))^(−1)]ii ).   (8)
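The posterior in (8) is the standard conjugate-normal update: the posterior precision is the sum of the prior and data precisions, and the posterior mean is their precision-weighted combination. A small sketch with illustrative numbers:

```python
# Conjugate normal posterior update as in (8): prior theta ~ N(mu0, Sigma0),
# estimate hat_theta1 | theta ~ N(theta, Sigma).
import numpy as np

def normal_posterior(mu0, Sigma0, Sigma, theta_hat):
    prior_prec = np.linalg.inv(Sigma0)           # Sigma0^{-1}
    data_prec = np.linalg.inv(Sigma)             # Sigma^{-1}
    post_cov = np.linalg.inv(prior_prec + data_prec)
    # precision-weighted combination of the estimate and the prior mean
    post_mean = post_cov @ (data_prec @ theta_hat + prior_prec @ mu0)
    return post_mean, post_cov
```

In one dimension with unit prior and data variances, an estimate of 2 against a prior mean of 0 gives a posterior mean of 1 with variance 0.5, the usual halfway shrinkage.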

Slide 24

Decision 2

For given X1 = x1, choose i∗ and n2 to maximise

∫R  E[ G(X2, θi∗) | θi∗, X1 = x1 ]  πθi∗|X1(θi∗ | X1 = x1)  dθi∗,   (9)

where the first factor is the expected gain given θi∗ and the Phase II data, and the second is the posterior density of θi∗.

Define the gain function G for the program with

◮ a large ’reward’ for rejecting the null hypothesis, and
◮ a small ’penalty’ for testing each patient.
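A Monte Carlo sketch of Decision 2 under illustrative assumptions: pick the treatment i∗ with the best posterior mean, then pick the Phase III sample size n2 maximising the posterior-expected gain, taken as a reward for rejecting the null minus a per-patient cost. The power formula assumes a one-sided z-test on a normal endpoint with known σ, and all constants are made up for illustration:

```python
# Decision 2 sketch: optimise (i*, n2) against the posterior from Phase II.
import math
import random

REWARD, COST_PER_PATIENT, SIGMA, Z_ALPHA = 1000.0, 0.1, 1.0, 1.645

def power(theta, n2):
    # P(reject H0 | theta) for n2 patients per arm under the assumed z-test
    se = SIGMA * math.sqrt(2.0 / n2)
    return 0.5 * math.erfc((Z_ALPHA - theta / se) / math.sqrt(2))

def expected_gain(post_mean, post_sd, n2, ndraws=2000):
    # crude Monte Carlo approximation of the integral in (9)
    random.seed(0)
    draws = [random.gauss(post_mean, post_sd) for _ in range(ndraws)]
    mean_reward = sum(REWARD * power(t, n2) for t in draws) / ndraws
    return mean_reward - COST_PER_PATIENT * 2 * n2

def decision2(post_means, post_sds, n2_grid=range(50, 500, 50)):
    # select the most promising treatment, then grid-search n2
    i_star = max(range(len(post_means)), key=lambda i: post_means[i])
    n2 = max(n2_grid,
             key=lambda n: expected_gain(post_means[i_star], post_sds[i_star], n))
    return i_star, n2
```

The grid search makes the trade-off explicit: larger n2 buys power (and hence expected reward) at a linear per-patient cost.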

Slide 25

Decision 2

Slide 26

Decision 2

Bayes decision rule as a function of the posterior mean of θi∗:

Slide 27

Decision 1

Choose n1^(t) to maximise

∫RK  E[ G(X1, X2, θi∗) | θ ]  πθ(θ)  dθ,   (10)

where the first factor is the expected gain given θ and the second is the prior density.
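Decision 1 can be sketched the same way, one level up: for each candidate Phase II sample size n1, draw θ from the prior, simulate a Phase II estimate, apply a simple go/no-go rule for Phase III, and average the resulting gains. The go/no-go rule and all constants are illustrative assumptions, not the slides' actual program:

```python
# Decision 1 sketch: prior-expected gain of each candidate Phase II size n1.
import math
import random

def decision1(n1_grid, prior_mean, prior_sd,
              sigma=1.0, reward=1000.0, cost=0.1, n2=100, ndraws=500):
    random.seed(1)
    best_n1, best_gain = None, -math.inf
    for n1 in n1_grid:
        total = 0.0
        for _ in range(ndraws):
            theta = random.gauss(prior_mean, prior_sd)        # draw from prior
            est = random.gauss(theta, sigma / math.sqrt(n1))  # Phase II estimate
            g = -cost * n1                                    # cost of Phase II
            if est > 0:                                       # "go" to Phase III
                se = sigma * math.sqrt(2.0 / n2)
                pwr = 0.5 * math.erfc((1.645 - theta / se) / math.sqrt(2))
                g += reward * pwr - cost * 2 * n2
            total += g
        if total / ndraws > best_gain:
            best_n1, best_gain = n1, total / ndraws
    return best_n1
```

This is the nesting the slides describe: each simulated Phase II outcome feeds the downstream decision, so the value of n1 is only defined through the rule applied afterwards.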

Slide 28

Decision 1

Equation (10) evaluated for selected values of the Phase II sample size n1^(t).

Slide 29

Using Combination Testing and GSDs

◮ Use of Phase II data in the final hypothesis test (Combination Testing).
◮ Use of early stopping boundaries in Phase III (Group Sequential Designs).

Slide 30

Opportunities of this approach

◮ Quantify the value of Combination Testing and Group Sequential Designs.
◮ Identify how prior assumptions change the optimal decision rules.

Slide 31

Thank you for your attention.
