D-optimal designs for dependent binary variables, Daniel Bruce

slide-1
SLIDE 1

D-optimal designs for dependent binary variables

Daniel Bruce

daniel.bruce@stat.su.se

Department of Statistics, Stockholm University

– p. 1/17

slide-2
SLIDE 2

What is optimal design?

Experimental design under an optimality criterion; see Atkinson and Donev (1992).
No single optimality criterion applies in general.
Criteria based on the information matrix: A-, D-, and E-optimality.
D-optimality minimizes the determinant of the inverse of the information matrix, which can be interpreted as minimizing the volume of the confidence ellipsoid of the parameters.
Discrimination between models: T-optimality.

– p. 2/17

slide-9
SLIDE 9

Advantage and Disadvantage

Advantage: better precision in the parameter estimates, or a smaller number of required observations with precision maintained.
Disadvantage: optimal designs depend on the unknown parameters, which are what we want to estimate.
Approaches to handle the parameter dependence:
  • Sequential designs
  • Optimum on average designs
  • Minimax designs

– p. 3/17

slide-11
SLIDE 11

The model

Two dependent, identically distributed binary variables, denoted $S_1$ and $S_2$.
Complicated dependence structure.
Identically distributed variables lead to a trinomial model.

Let $S = S_1 + S_2$ and $P(S = s) = \pi_s$ for $s = 0, 1, 2$.
Multivariate generalized linear model (MGLM); see Fahrmeir and Tutz (2001).
Response vector $Y = (Y_1 \;\; Y_2)^T$, where

$$Y_i = \begin{cases} 1, & \text{if } S = i \\ 0, & \text{otherwise} \end{cases} \qquad \text{for } i = 1, 2.$$

– p. 4/17

slide-12
SLIDE 12

The model

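Expected value

$$\mu = E\left[(Y_1 \;\; Y_2)^T\right] = (\pi_1 \;\; \pi_2)^T$$

Link function

$$g\left((\pi_1 \;\; \pi_2)^T\right) = \left(\ln\frac{\pi_1}{\pi_0} \;\;\; \ln\frac{\pi_2}{\pi_0}\right)^T = \eta$$

Linear predictor

$$\eta = (\eta_1 \;\; \eta_2)^T = (\alpha_1 + \beta_1 x \;\;\; \alpha_2 + \beta_2 x)^T = \mathbf{x}\theta,$$

where

$$\mathbf{x} = \begin{pmatrix} 1 & 0 & x & 0 \\ 0 & 1 & 0 & x \end{pmatrix} \quad \text{and} \quad \theta = (\alpha_1 \;\; \alpha_2 \;\; \beta_1 \;\; \beta_2)^T.$$

The response probabilities are

$$\pi_0 = \frac{1}{1 + e^{\eta_1} + e^{\eta_2}}, \qquad \pi_1 = \frac{e^{\eta_1}}{1 + e^{\eta_1} + e^{\eta_2}}, \qquad \pi_2 = \frac{e^{\eta_2}}{1 + e^{\eta_1} + e^{\eta_2}}.$$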
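The mapping from the linear predictor to the three probabilities can be sketched in a few lines of Python. This is only an illustration of the formulas above; the function name `trinomial_probs` and the chosen value of x are assumptions made for the example, not part of the presentation.

```python
import numpy as np

def trinomial_probs(theta, x):
    """Return (pi0, pi1, pi2) for S at design point x.

    theta = (alpha1, alpha2, beta1, beta2), in the order used on the slide.
    """
    alpha1, alpha2, beta1, beta2 = theta
    # Category 0 is the reference category, so its linear predictor is 0.
    eta = np.array([0.0, alpha1 + beta1 * x, alpha2 + beta2 * x])
    # Subtracting the maximum before exponentiating avoids overflow;
    # the ratios, and hence the probabilities, are unchanged.
    w = np.exp(eta - eta.max())
    return w / w.sum()

# Illustrative evaluation (parameter values borrowed from the second example
# in the slides; x = 2.0 is an arbitrary choice).
probs = trinomial_probs((-1.0, -9.0, 1.1, 1.3), x=2.0)
```

The log-ratio `log(probs[1] / probs[0])` recovers $\eta_1$, which is a quick way to confirm that the link function and its inverse agree.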

– p. 5/17

slide-13
SLIDE 13

Likelihood, score

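Likelihood function

$$L(\theta \mid y) = \prod_{i=1}^{n} \left\{ \pi_{1i}^{y_{1i}} \, \pi_{2i}^{y_{2i}} \, (1 - \pi_{1i} - \pi_{2i})^{(1 - y_{1i})(1 - y_{2i})} \right\}$$

The loglikelihood function

$$l(\theta \mid y) = \sum_{i=1}^{n} \left\{ y_{1i}\alpha_1 + y_{2i}\alpha_2 + x_i (y_{1i}\beta_1 + y_{2i}\beta_2) - \ln\left(1 + e^{\eta_{1i}} + e^{\eta_{2i}}\right) \right\}$$

Score

$$u(\theta) = \left(\frac{\partial \eta}{\partial \theta}\right)^T \left(\frac{\partial \pi}{\partial \eta}\right)^T \left(\frac{\partial l}{\partial \pi}\right)^T,$$

which simplifies to

$$u(\theta) = \sum_{i=1}^{n} \begin{pmatrix} y_{1i} - \pi_{1i} \\ y_{2i} - \pi_{2i} \\ x_i (y_{1i} - \pi_{1i}) \\ x_i (y_{2i} - \pi_{2i}) \end{pmatrix}$$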
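As a sanity check on the simplified score, the sketch below codes the loglikelihood and the score as written above and compares the score with a central finite-difference gradient. The data `x`, `y` and the parameter vector `theta` are made up for illustration; they are not taken from the presentation.

```python
import numpy as np

def eta_matrix(theta, x):
    """Linear predictors (eta1, eta2) for every observation, shape (n, 2)."""
    a1, a2, b1, b2 = theta
    return np.column_stack([a1 + b1 * x, a2 + b2 * x])

def loglik(theta, x, y):
    """Loglikelihood; y has columns (y1, y2) with at most one 1 per row."""
    eta = eta_matrix(theta, x)
    return float(np.sum((y * eta).sum(axis=1) - np.log1p(np.exp(eta).sum(axis=1))))

def score(theta, x, y):
    """Simplified score vector in the order (alpha1, alpha2, beta1, beta2)."""
    e = np.exp(eta_matrix(theta, x))
    pi = e / (1.0 + e.sum(axis=1, keepdims=True))   # columns: pi1, pi2
    r = y - pi                                      # residuals y - pi
    return np.array([r[:, 0].sum(), r[:, 1].sum(),
                     (x * r[:, 0]).sum(), (x * r[:, 1]).sum()])

# Hypothetical toy data: S = 0, 1, 2, 1 at four x values.
x = np.array([-1.0, 0.0, 1.0, 2.0])
y = np.array([[0, 0], [1, 0], [0, 1], [1, 0]], dtype=float)
theta = np.array([0.2, -0.5, 0.3, 0.7])

# Central finite differences of the loglikelihood, one parameter at a time.
h = 1e-6
numeric = np.array([
    (loglik(theta + h * unit, x, y) - loglik(theta - h * unit, x, y)) / (2 * h)
    for unit in np.eye(4)
])
analytic = score(theta, x, y)
```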

– p. 6/17

slide-14
SLIDE 14

Information matrix, criterion function

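Fisher information

$$I(\theta, x) = E\left[u(\theta)\,u^T(\theta)\right] = \sum_{i=1}^{n} \begin{pmatrix} \pi_1(1-\pi_1) & -\pi_1\pi_2 & x\pi_1(1-\pi_1) & -x\pi_1\pi_2 \\ -\pi_1\pi_2 & \pi_2(1-\pi_2) & -x\pi_1\pi_2 & x\pi_2(1-\pi_2) \\ x\pi_1(1-\pi_1) & -x\pi_1\pi_2 & x^2\pi_1(1-\pi_1) & -x^2\pi_1\pi_2 \\ -x\pi_1\pi_2 & x\pi_2(1-\pi_2) & -x^2\pi_1\pi_2 & x^2\pi_2(1-\pi_2) \end{pmatrix},$$

where $x$, $\pi_1$, and $\pi_2$ in the summand are evaluated at observation $i$.

Standardized information matrix

$$M(\theta, x) = \frac{I(\theta, x)}{n}$$

Criterion function

$$\psi\{M(\theta, \xi)\} = \ln\left|M^{-1}(\theta, \xi)\right|$$

Standardized variance of the predicted response, see Silvey (1980):

$$d(x, \xi) = \operatorname{tr}\left\{M^{-1}(\theta, \xi)\, I(\theta, x)\right\} \qquad \forall x \in X$$

$\xi^*$ is D-optimal iff $d(x, \xi^*) \le p$ for all $x \in X$, where $p$ is the number of parameters.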

– p. 7/17

slide-15
SLIDE 15

Examples of the model

[Figure: four panels (Plot 1 to Plot 4) showing P(S=0), P(S=1), and P(S=2) as functions of x.]

Four examples of different probability distributions for S. The parameters are $\theta_1^T = (-2, -9, 0.3, 1)$, $\theta_2^T = (-1, -9, 1.1, 1.3)$, $\theta_3^T = (-1, -5, 1, 2)$, and $\theta_4^T = (-3, -1, 0.5, 1)$.

– p. 8/17

slide-16
SLIDE 16

Examples of the model

[Figure: P(S=0), P(S=1), P(S=2), and d(x, ξ*) plotted against x.]

The probability distribution for S given $\alpha_1 = -1$, $\alpha_2 = -9$, $\beta_1 = 1.1$, and $\beta_2 = 1.3$, together with the standardized variance of the predicted response, $d(x, \xi^*)$, for the design

$$\xi^* = \begin{Bmatrix} -0.4719 & 2.3431 & 32.3787 & 47.6213 \\ 0.2514 & 0.2598 & 0.2422 & 0.2466 \end{Bmatrix}$$

The first row lists the design points and the second row their weights.
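The design above can be checked numerically against the criterion $d(x, \xi^*) \le p$ with $p = 4$. The following is a minimal sketch, assuming the per-observation information is $\mathbf{x}^T V \mathbf{x}$ with $V$ the covariance matrix of $(Y_1, Y_2)$, which reproduces the information matrix of the trinomial model; the helper names and the evaluation grid are assumptions made for this example.

```python
import numpy as np

def point_information(theta, x):
    """Per-observation Fisher information at design point x (4 x 4)."""
    a1, a2, b1, b2 = theta
    eta = np.array([a1 + b1 * x, a2 + b2 * x])
    shift = max(0.0, eta.max())              # guards against overflow in exp
    e = np.exp(eta - shift)
    pi = e / (np.exp(-shift) + e.sum())      # (pi1, pi2)
    V = np.diag(pi) - np.outer(pi, pi)       # covariance matrix of (Y1, Y2)
    X = np.array([[1.0, 0.0, x, 0.0],
                  [0.0, 1.0, 0.0, x]])       # eta = X @ theta
    return X.T @ V @ X

def design_information(theta, points, weights):
    """Standardized information matrix M(theta, xi) of a weighted design."""
    return sum(w * point_information(theta, x) for x, w in zip(points, weights))

def standardized_variance(theta, x, M):
    """d(x, xi) = tr(M^{-1} I(theta, x))."""
    return float(np.trace(np.linalg.solve(M, point_information(theta, x))))

theta = (-1.0, -9.0, 1.1, 1.3)
points = [-0.4719, 2.3431, 32.3787, 47.6213]
weights = [0.2514, 0.2598, 0.2422, 0.2466]

M = design_information(theta, points, weights)
d_support = [standardized_variance(theta, x, M) for x in points]

# Largest d(x, xi) on an illustrative grid; for a D-optimal design this
# should not exceed p = 4, up to the rounding of the published design.
grid = np.linspace(-10.0, 70.0, 801)
d_max = max(standardized_variance(theta, x, M) for x in grid)
```

For any design, the weighted average of $d$ over the support points equals $p$ exactly, because $\sum_i w_i \operatorname{tr}\{M^{-1} I(\theta, x_i)\} = \operatorname{tr}\{M^{-1} M\}$. That identity is a useful check on the implementation before looking at `d_max`.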

– p. 8/17

slide-17
SLIDE 17

Symmetric model

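The restriction $\beta_2 = 2\beta_1$ gives a model with symmetry properties.

The log-odds ratio is

$$\ln\Omega = \ln 4 + \alpha_2 - 2\alpha_1$$

Define $x_0$ as

$$x_0 = \arg\max_{x \in X} P(S = 1).$$

Then

$$x_0 = -\frac{\alpha_2}{2\beta_1}, \qquad P_{x = x_0}(S = 1) = \frac{1}{1 + \sqrt{\Omega}}$$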
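The three statements above can be checked numerically. The sketch below assumes the odds ratio is taken from the 2 x 2 table of $(S_1, S_2)$ with $P(0,0) = \pi_0$, $P(1,1) = \pi_2$, and $P(0,1) = P(1,0) = \pi_1/2$, which gives $\Omega = 4\pi_0\pi_2/\pi_1^2$; the parameter values are arbitrary illustrations, not values from the presentation.

```python
import math

def probs(alpha1, alpha2, beta1, x):
    """(pi0, pi1, pi2) in the symmetric model, i.e. with beta2 = 2 * beta1."""
    e1 = math.exp(alpha1 + beta1 * x)
    e2 = math.exp(alpha2 + 2.0 * beta1 * x)
    denom = 1.0 + e1 + e2
    return 1.0 / denom, e1 / denom, e2 / denom

def log_odds_ratio(alpha1, alpha2, beta1, x):
    """ln(Omega) from the 2 x 2 table of (S1, S2) at design point x."""
    p0, p1, p2 = probs(alpha1, alpha2, beta1, x)
    return math.log(p0 * p2 / (p1 / 2.0) ** 2)

# Hypothetical parameter values chosen only for the check.
alpha1, alpha2, beta1 = 1.0, -0.5, 0.8

ln_omega = math.log(4.0) + alpha2 - 2.0 * alpha1   # formula from the slide
x0 = -alpha2 / (2.0 * beta1)                       # maximizer of P(S = 1)
p1_at_x0 = probs(alpha1, alpha2, beta1, x0)[1]
```

Under the restriction, `log_odds_ratio` returns the same value at every x, which is what makes $\ln\Omega$ a property of the model rather than of the design point.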

– p. 9/17

slide-18
SLIDE 18

Examples of the symmetric model

[Figure: four panels (Plot 1 to Plot 4) showing P(S=0), P(S=1), and P(S=2) as functions of x.]

Probability distribution for S as a function of x. The log-odds ratios are $\ln\Omega_1 = -4.61$, $\ln\Omega_2 = -1.61$, $\ln\Omega_3 = 2.39$, and $\ln\Omega_4 = 20.39$.

– p. 10/17

slide-20
SLIDE 20

Examples of the symmetric model

[Figure: four panels (Plot 1 to Plot 4) showing d(x, ξ*) against x.]

d(x, ξ*) for the different examples of D-optimal designs

[Figure: ln Ω axis divided into regions labelled 4-point design, 3-point design, and 2-point design, with the values 0.15 and 4.07 marked.]

Number of design points given the log-odds ratio

– p. 10/17

slide-24
SLIDE 24

2-point design

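Proposed design

$$\xi^* = \begin{Bmatrix} -\dfrac{\alpha_2}{2\beta_1} - \dfrac{c}{\beta_1} & -\dfrac{\alpha_2}{2\beta_1} + \dfrac{c}{\beta_1} \\ 0.5 & 0.5 \end{Bmatrix}$$

Assume that $\beta_1 = 1$ and $\alpha_2 = 0$.
  • $\beta_1$ is a scale parameter.
  • If $\alpha_2 \neq 0$, a proposed design with the same $\ln\Omega$ can be derived.

The information matrix

$$M(\alpha_1, c) = \frac{1}{2}\left(I(\alpha_1, -c) + I(\alpha_1, c)\right).$$

The determinant of M is

$$|M(\alpha_1, c)| = \frac{c^2 e^{\alpha_1 - 6c}\left(e^{\alpha_1} + e^{\alpha_1 + 2c} + 4e^{c}\right)}{\left(1 + e^{\alpha_1 - c} + e^{-2c}\right)^5}$$

The derivative of the determinant of M with respect to c is

$$\frac{d|M(\alpha_1, c)|}{dc} = \frac{c\,e^{\alpha_1 - 4c}\left[\,2\left(1 + e^{\alpha_1 - c} + e^{-2c}\right)\left[e^{-2c}\left(e^{\alpha_1} + e^{\alpha_1 + 2c} + 4e^{c}\right) - c\left(3e^{\alpha_1 - 2c} + 2e^{\alpha_1} + 10e^{-c}\right)\right] + 5c\,e^{-3c}\left(e^{\alpha_1} + 2e^{-c}\right)\left(e^{\alpha_1} + e^{\alpha_1 + 2c} + 4e^{c}\right)\right]}{\left(1 + e^{\alpha_1 - c} + e^{-2c}\right)^6}.$$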
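The closed-form determinant can be cross-checked numerically. The sketch below builds $M(\alpha_1, c)$ from per-observation information matrices under the assumptions $\alpha_2 = 0$ and $\beta_1 = 1$, treating the symmetric model as having the three parameters $(\alpha_1, \alpha_2, \beta_1)$ so that $\partial\eta/\partial\theta$ has rows $(1, 0, x)$ and $(0, 1, 2x)$. That parameterization is inferred from the restriction $\beta_2 = 2\beta_1$ rather than stated explicitly in the presentation, and the test values of $\alpha_1$ and $c$ are arbitrary.

```python
import numpy as np

def point_information(alpha1, x):
    """3 x 3 information at x for (alpha1, alpha2, beta1), alpha2 = 0, beta1 = 1."""
    eta = np.array([alpha1 + x, 2.0 * x])
    e = np.exp(eta)
    pi = e / (1.0 + e.sum())                 # (pi1, pi2)
    V = np.diag(pi) - np.outer(pi, pi)       # covariance matrix of (Y1, Y2)
    X = np.array([[1.0, 0.0, x],
                  [0.0, 1.0, 2.0 * x]])      # d eta / d(alpha1, alpha2, beta1)
    return X.T @ V @ X

def det_numeric(alpha1, c):
    """|M| for the equally weighted two-point design at -c and +c."""
    M = 0.5 * (point_information(alpha1, -c) + point_information(alpha1, c))
    return float(np.linalg.det(M))

def det_closed_form(alpha1, c):
    """Determinant formula as given on the slide."""
    num = c**2 * np.exp(alpha1 - 6 * c) * (
        np.exp(alpha1) + np.exp(alpha1 + 2 * c) + 4 * np.exp(c))
    den = (1 + np.exp(alpha1 - c) + np.exp(-2 * c)) ** 5
    return float(num / den)

# Arbitrary illustrative values.
alpha1, c = -1.0, 0.8
numeric = det_numeric(alpha1, c)
closed = det_closed_form(alpha1, c)
```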

– p. 11/17

slide-25
SLIDE 25

2-point design

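If $\alpha_2 = 0$ then $\ln\Omega \to \infty$ when $\alpha_1 \to -\infty$.

Setting $\dfrac{d|M(\alpha_1, c)|}{dc} = 0$ yields

$$2\left(1 + e^{\alpha_1 - c} + e^{-2c}\right)\left[e^{-2c}\left(e^{\alpha_1} + e^{\alpha_1 + 2c} + 4e^{c}\right) - c\left(3e^{\alpha_1 - 2c} + 2e^{\alpha_1} + 10e^{-c}\right)\right] + 5c\,e^{-3c}\left(e^{\alpha_1} + 2e^{-c}\right)\left(e^{\alpha_1} + e^{\alpha_1 + 2c} + 4e^{c}\right) = 0.$$

Let $\alpha_1 \to -\infty$ and it follows that

$$c = \frac{2\left(1 + e^{-2c}\right)}{5\left(1 - e^{-2c}\right)} \;\Rightarrow\; c \approx 0.6778$$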
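The limiting equation has no closed-form solution, but it is easy to solve numerically. The sketch below uses plain bisection on $f(c) = c - 2(1 + e^{-2c})/(5(1 - e^{-2c}))$; the bracketing interval and the helper names are choices made for this illustration, not part of the presentation.

```python
import math

def rhs(c):
    """Right-hand side of the limiting equation for c."""
    return 2.0 * (1.0 + math.exp(-2.0 * c)) / (5.0 * (1.0 - math.exp(-2.0 * c)))

def bisect(f, lo, hi, tol=1e-12):
    """Root of f on [lo, hi], assuming f changes sign exactly once there."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid    # root lies in the upper half
        else:
            hi = mid                 # root lies in the lower half
    return 0.5 * (lo + hi)

# f(c) = c - rhs(c) is negative at c = 0.1 and positive at c = 2.0.
c_limit = bisect(lambda c: c - rhs(c), 0.1, 2.0)
```

Since the right-hand side decreases in c while the left-hand side increases, the root is unique, so any bracketing interval with a sign change leads to the same value.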

– p. 12/17

slide-26
SLIDE 26

Evaluation of the 2-point design

[Figure: two panels plotted against ln Ω (axis from 5 to 25): c, and max(d(x, ξ(c))).]

– p. 13/17

slide-29
SLIDE 29

Evaluation of the 2-point design

D-efficiency,

$$D_{\text{eff}} = \left(\frac{|M(\theta, \xi(c))|}{|M(\theta, \xi^*)|}\right)^{1/p}$$

[Figure: D_eff plotted against ln Ω.]

D-efficiency for designs with c = 0.6778 for different parameter values
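The D-efficiency formula translates directly into code. The sketch below is a generic helper that takes two information matrices of the same size; the function name and the 2 x 2 example matrix are hypothetical and serve only to illustrate the scaling behaviour of the measure.

```python
import numpy as np

def d_efficiency(M_design, M_optimal):
    """(|M_design| / |M_optimal|) ** (1 / p), with p the number of parameters."""
    p = M_design.shape[0]
    # slogdet works on the log scale, which is safer for small determinants.
    _, logdet_design = np.linalg.slogdet(M_design)
    _, logdet_optimal = np.linalg.slogdet(M_optimal)
    return float(np.exp((logdet_design - logdet_optimal) / p))

# Hypothetical information matrix used only to illustrate the scaling.
M_opt = np.array([[2.0, 0.5],
                  [0.5, 1.0]])

eff_same = d_efficiency(M_opt, M_opt)          # a design compared with itself
eff_half = d_efficiency(0.5 * M_opt, M_opt)    # same design, half the information
```

The exponent $1/p$ is what makes the measure proportional to sample size: halving the information matrix halves the efficiency regardless of the number of parameters.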

– p. 14/17

slide-33
SLIDE 33

4-point design

$$\xi^* = \begin{Bmatrix} -\dfrac{\alpha_1}{\beta_1} - \dfrac{c}{\beta_1} & -\dfrac{\alpha_1}{\beta_1} + \dfrac{c}{\beta_1} & \dfrac{\alpha_1 - \alpha_2}{\beta_1} - \dfrac{c}{\beta_1} & \dfrac{\alpha_1 - \alpha_2}{\beta_1} + \dfrac{c}{\beta_1} \\ 0.25 & 0.25 & 0.25 & 0.25 \end{Bmatrix}$$

More complex expression for the determinant of M:

$$M(\alpha_1, c) = \frac{1}{4}\left\{I(-\alpha_1, -c) + I(-\alpha_1, c) + I(\alpha_1, -c) + I(\alpha_1, c)\right\}$$

Set $\alpha_2 = 0$ and $\beta_1 = 1$ and numerically obtain a D-optimal c.

[Figure: two panels plotted against ln Ω (axis from −25 to −5): c, and max(d(x, ξ(c))).]

– p. 15/17

slide-34
SLIDE 34

4-point design


[Figure: D_eff plotted against ln Ω.]

D-efficiency for designs with c = 1.229 for different parameter values

– p. 15/17

slide-38
SLIDE 38

Independent variables

$S_1$ and $S_2$ are independent $\Leftrightarrow \ln\Omega = 0$

Parameter restrictions: $\alpha_2 = 2\alpha_1 - \ln 4$ and $\beta_2 = 2\beta_1$

$S \sim \mathrm{Bin}(2, \pi)$, where $\pi = \dfrac{e^{\eta_1}}{2 + e^{\eta_1}}$

[Figure: P(S=0), P(S=1), and P(S=2) plotted against x.]

The probability distribution for S given $\alpha_1 = -4$ and $\beta_1 = 1$
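That the two parameter restrictions reduce the trinomial model to a binomial one can be confirmed numerically. The sketch below evaluates both forms at $\alpha_1 = -4$ and $\beta_1 = 1$, the values from the figure caption; the function names and the grid of x values are assumptions made for the illustration.

```python
import math

def trinomial_probs(alpha1, beta1, x):
    """(pi0, pi1, pi2) with alpha2 = 2*alpha1 - ln 4 and beta2 = 2*beta1."""
    e1 = math.exp(alpha1 + beta1 * x)
    e2 = math.exp((2.0 * alpha1 - math.log(4.0)) + 2.0 * beta1 * x)
    denom = 1.0 + e1 + e2
    return [1.0 / denom, e1 / denom, e2 / denom]

def binomial_probs(alpha1, beta1, x):
    """Bin(2, pi) probabilities with pi = e^eta1 / (2 + e^eta1)."""
    e1 = math.exp(alpha1 + beta1 * x)
    pi = e1 / (2.0 + e1)
    return [(1.0 - pi) ** 2, 2.0 * pi * (1.0 - pi), pi ** 2]

alpha1, beta1 = -4.0, 1.0
xs = [0.0, 2.5, 5.0, 7.5, 10.0]

# Largest absolute difference between the two forms over the grid.
max_diff = max(
    abs(t - b)
    for x in xs
    for t, b in zip(trinomial_probs(alpha1, beta1, x),
                    binomial_probs(alpha1, beta1, x))
)
```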

– p. 16/17

slide-40
SLIDE 40

Independent variables

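Theorem: For the independent model the locally D-optimal design for arbitrary parameter values is

$$\xi^* = \begin{Bmatrix} \dfrac{\ln 2 - \alpha_1}{\beta_1} - \dfrac{c}{\beta_1} & \dfrac{\ln 2 - \alpha_1}{\beta_1} + \dfrac{c}{\beta_1} \\ 1/2 & 1/2 \end{Bmatrix},$$

where c is the solution to the equation

$$c = \frac{e^{c} + 1}{e^{c} - 1}, \qquad c \approx 1.5434.$$

The information matrix of the design is

$$M(\theta, \xi^*) = \frac{2e^{c}}{(1 + e^{c})^2} \begin{pmatrix} 1 & \dfrac{\ln 2 - \alpha_1}{\beta_1} \\ \dfrac{\ln 2 - \alpha_1}{\beta_1} & \dfrac{1}{\beta_1^2}\left((\ln 2)^2 - 2\alpha_1 \ln 2 + \alpha_1^2 + c^2\right) \end{pmatrix}$$

with determinant

$$|M(\theta, \xi^*)| = \frac{4c^2 e^{2c}}{\beta_1^2 (1 + e^{c})^4}$$

and derivative

$$\frac{d|M(\theta, \xi^*)|}{dc} = \frac{8c\,e^{2c}\left(1 + e^{c} + c - c\,e^{c}\right)}{\beta_1^2 (1 + e^{c})^5}.$$

Equating to zero yields $1 + e^{c} + c - c\,e^{c} = 0$, that is,

$$c = \frac{e^{c} + 1}{e^{c} - 1}.$$

The solution to the equation, $c \approx 1.5434$, maximizes $|M(\theta, \xi^*)|$.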
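The value of c in the theorem can be reproduced numerically. The sketch below finds the root of $1 + e^{c} + c - c\,e^{c}$ by bisection and then compares the determinant at the root with nearby values; $\beta_1 = 1$ is assumed, since $\beta_1$ only rescales the determinant, and the bracketing interval $[1, 2]$ is a choice made for this illustration.

```python
import math

def stationarity(c):
    """Factor of d|M|/dc that determines its sign for c > 0."""
    return 1.0 + math.exp(c) + c - c * math.exp(c)

def determinant(c, beta1=1.0):
    """|M(theta, xi*)| = 4 c^2 e^{2c} / (beta1^2 (1 + e^c)^4)."""
    return 4.0 * c**2 * math.exp(2.0 * c) / (beta1**2 * (1.0 + math.exp(c)) ** 4)

def bisect(f, lo, hi, tol=1e-12):
    """Root of f on [lo, hi], assuming f changes sign exactly once there."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# stationarity(1) is positive and stationarity(2) is negative.
c_star = bisect(stationarity, 1.0, 2.0)

# Determinant at the root and slightly to either side of it.
det_at_root = determinant(c_star)
det_below = determinant(c_star - 0.05)
det_above = determinant(c_star + 0.05)
```

Because the derivative changes sign from positive to negative at the root, the stationary point is a maximum, which the comparison with the two neighbouring values illustrates.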

– p. 17/17