D-optimal designs for dependent binary variables
Daniel Bruce
daniel.bruce@stat.su.se
Department of Statistics, Stockholm University
What is optimal design? Experimental design under an optimality criterion, see Atkinson and Donev
Let $S = S_1 + S_2$ and $P(S = s) = \pi_s$ for $s = 0, 1, 2$.
Multivariate generalized linear model (MGLM), see Fahrmeir and Tutz (2001).
Response vector $Y = (Y_1, Y_2)^T$, where
\[ Y_i = \begin{cases} 1, & \text{if } S = i \\ 0, & \text{otherwise} \end{cases} \qquad \text{for } i = 1, 2. \]
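Expected value
\[ \mu = E\left[ (Y_1, Y_2)^T \right] = (\pi_1, \pi_2)^T \]
Link function
\[ g(\mu) = \left( \ln\frac{\pi_1}{\pi_0}, \; \ln\frac{\pi_2}{\pi_0} \right)^T = \eta \]
Linear predictor
\[ \eta = (\eta_1, \eta_2)^T = (\alpha_1 + \beta_1 x, \; \alpha_2 + \beta_2 x)^T = \mathbf{x}\theta, \]
where
\[ \mathbf{x} = \begin{pmatrix} 1 & 0 & x & 0 \\ 0 & 1 & 0 & x \end{pmatrix} \quad \text{and} \quad \theta = (\alpha_1, \alpha_2, \beta_1, \beta_2)^T \]
\[ \pi_0 = \frac{1}{1 + e^{\eta_1} + e^{\eta_2}}, \qquad \pi_1 = \frac{e^{\eta_1}}{1 + e^{\eta_1} + e^{\eta_2}}, \qquad \pi_2 = \frac{e^{\eta_2}}{1 + e^{\eta_1} + e^{\eta_2}} \]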
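As a minimal sketch of the inverse link, the three category probabilities can be computed directly from $\theta$ and $x$. The function name `category_probs` and the overflow guard are illustrative choices and not part of the talk.

```python
import math

def category_probs(theta, x):
    """Return (pi0, pi1, pi2) for theta = (alpha1, alpha2, beta1, beta2)."""
    a1, a2, b1, b2 = theta
    eta1 = a1 + b1 * x
    eta2 = a2 + b2 * x
    # Shift by the largest linear predictor before exponentiating so that
    # large |x| cannot overflow; the probability ratios are unchanged.
    shift = max(0.0, eta1, eta2)
    e0 = math.exp(-shift)
    e1 = math.exp(eta1 - shift)
    e2 = math.exp(eta2 - shift)
    total = e0 + e1 + e2
    return e0 / total, e1 / total, e2 / total

# Example call with an arbitrary parameter vector and design point.
pi0, pi1, pi2 = category_probs((-1.0, -5.0, 1.0, 2.0), 2.0)
```

By construction $\ln(\pi_1/\pi_0)$ and $\ln(\pi_2/\pi_0)$ recover $\eta_1$ and $\eta_2$, which is the link function stated above.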
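Likelihood function
\[ L(\theta \mid y) = \prod_{i=1}^{n} \left\{ \pi_{1i}^{y_{1i}} \, \pi_{2i}^{y_{2i}} \, (1 - \pi_{1i} - \pi_{2i})^{(1 - y_{1i})(1 - y_{2i})} \right\} \]
The log-likelihood function
\[ l(\theta \mid y) = \sum_{i=1}^{n} \left\{ y_{1i}\alpha_1 + y_{2i}\alpha_2 + x_i \left( y_{1i}\beta_1 + y_{2i}\beta_2 \right) - \ln\left( 1 + e^{\eta_{1i}} + e^{\eta_{2i}} \right) \right\} \]
Score
\[ u(\theta) = \left( \frac{\partial \eta}{\partial \theta} \right)^T \left( \frac{\partial \pi}{\partial \eta} \right)^T \left( \frac{\partial l}{\partial \pi} \right)^T \]
which simplifies to
\[ u(\theta) = \sum_{i=1}^{n} \begin{pmatrix} y_{1i} - \pi_{1i} \\ y_{2i} - \pi_{2i} \\ x_i (y_{1i} - \pi_{1i}) \\ x_i (y_{2i} - \pi_{2i}) \end{pmatrix} \]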
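The simplified score can be checked numerically against a finite-difference gradient of the log-likelihood. The sketch below is only an illustration: the data set `DATA`, the parameter vector `THETA`, and all function names are made up for the check and do not come from the talk.

```python
import math

def log_lik(theta, data):
    """Log-likelihood l(theta | y) for observations (x, y1, y2)."""
    a1, a2, b1, b2 = theta
    total = 0.0
    for x, y1, y2 in data:
        eta1, eta2 = a1 + b1 * x, a2 + b2 * x
        total += y1 * eta1 + y2 * eta2
        total -= math.log(1.0 + math.exp(eta1) + math.exp(eta2))
    return total

def score(theta, data):
    """Analytic score vector in the order (alpha1, alpha2, beta1, beta2)."""
    a1, a2, b1, b2 = theta
    u = [0.0, 0.0, 0.0, 0.0]
    for x, y1, y2 in data:
        e1, e2 = math.exp(a1 + b1 * x), math.exp(a2 + b2 * x)
        denom = 1.0 + e1 + e2
        r1, r2 = y1 - e1 / denom, y2 - e2 / denom   # residuals y - pi
        u[0] += r1
        u[1] += r2
        u[2] += x * r1
        u[3] += x * r2
    return u

def numeric_score(theta, data, h=1e-6):
    """Central finite-difference gradient of the log-likelihood."""
    grad = []
    for j in range(4):
        up, down = list(theta), list(theta)
        up[j] += h
        down[j] -= h
        grad.append((log_lik(up, data) - log_lik(down, data)) / (2.0 * h))
    return grad

# Hypothetical observations (x, y1, y2); at most one of y1, y2 equals 1.
DATA = [(-1.0, 0, 0), (0.5, 1, 0), (2.0, 0, 1), (1.0, 1, 0)]
THETA = (-0.5, -1.0, 0.8, 1.2)
MAX_GAP = max(abs(a - b) for a, b in zip(score(THETA, DATA), numeric_score(THETA, DATA)))
```

`MAX_GAP` should be tiny if the analytic score and the log-likelihood agree.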
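Fisher information
\[ I(\theta, x) = E\left[ u(\theta)\, u^T(\theta) \right] = \sum_{i=1}^{n} \begin{pmatrix} \pi_1(1 - \pi_1) & -\pi_1\pi_2 & x\pi_1(1 - \pi_1) & -x\pi_1\pi_2 \\ -\pi_1\pi_2 & \pi_2(1 - \pi_2) & -x\pi_1\pi_2 & x\pi_2(1 - \pi_2) \\ x\pi_1(1 - \pi_1) & -x\pi_1\pi_2 & x^2\pi_1(1 - \pi_1) & -x^2\pi_1\pi_2 \\ -x\pi_1\pi_2 & x\pi_2(1 - \pi_2) & -x^2\pi_1\pi_2 & x^2\pi_2(1 - \pi_2) \end{pmatrix} \]
Standardized information matrix
\[ M(\theta, x) = \frac{I(\theta, x)}{n} \]
Criterion function
\[ \psi\{M(\theta, \xi)\} = \ln\left| M^{-1}(\theta, \xi) \right| \]
Standardized variance of the predicted response, see Silvey (1980)
\[ d(x, \xi) = \operatorname{tr}\left\{ M^{-1}(\theta, \xi)\, I(\theta, x) \right\} \qquad \forall x \in \mathcal{X} \]
$\xi^*$ is D-optimal iff $d(x, \xi^*) \le p$ for all $x \in \mathcal{X}$, where $p$ is the number of parameters in $\theta$.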
[Figure: four panels (Plot 1 to Plot 4) showing $P(S=0)$, $P(S=1)$ and $P(S=2)$ as functions of $x$.]
Four examples of different probability distributions for $S$. The parameters are $\theta_1^T = (-2, -9, 0.3, 1)$, $\theta_2^T = (-1, -9, 1.1, 1.3)$, $\theta_3^T = (-1, -5, 1, 2)$, and $\theta_4^T = (-3, -1, 0.5, 1)$.
[Figure: $P(S=0)$, $P(S=1)$, $P(S=2)$ and $d(x, \xi^*)$ plotted against $x$.]
The probability distribution for $S$ given $\alpha_1 = -1$, $\alpha_2 = -9$, $\beta_1 = 1.1$ and $\beta_2 = 1.3$, together with the standardized variance of the predicted response, $d(x, \xi^*)$, for the design
\[ \xi^* = \begin{Bmatrix} -0.4719 & 2.3431 & 32.3787 & 47.6213 \\ 0.2514 & 0.2598 & 0.2422 & 0.2466 \end{Bmatrix} \]
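To make $d(x, \xi)$ concrete, the sketch below evaluates it at the support points of the design quoted above, using only the Python standard library. It assumes the per-observation information matrix has the form $\mathbf{x}^T V \mathbf{x}$, with $V$ the multinomial covariance of $(Y_1, Y_2)$; all function names and the small Gauss-Jordan inverse are illustrative helpers, not the author's code.

```python
import math

THETA = (-1.0, -9.0, 1.1, 1.3)                  # parameters quoted above
POINTS = (-0.4719, 2.3431, 32.3787, 47.6213)    # support points of xi*
WEIGHTS = (0.2514, 0.2598, 0.2422, 0.2466)      # design weights of xi*

def probs(theta, x):
    """Return (pi1, pi2) with a shift that guards against overflow."""
    a1, a2, b1, b2 = theta
    eta1, eta2 = a1 + b1 * x, a2 + b2 * x
    shift = max(0.0, eta1, eta2)
    e0, e1, e2 = math.exp(-shift), math.exp(eta1 - shift), math.exp(eta2 - shift)
    total = e0 + e1 + e2
    return e1 / total, e2 / total

def info(theta, x):
    """Per-observation Fisher information as a 4x4 list of lists."""
    p1, p2 = probs(theta, x)
    v = [[p1 * (1 - p1), -p1 * p2], [-p1 * p2, p2 * (1 - p2)]]
    f = [[1.0, 0.0, x, 0.0], [0.0, 1.0, 0.0, x]]   # rows of the design matrix
    return [[sum(f[a][i] * v[a][b] * f[b][j] for a in range(2) for b in range(2))
             for j in range(4)] for i in range(4)]

def design_information(theta, points, weights):
    """Weighted sum of the per-point information matrices."""
    m = [[0.0] * 4 for _ in range(4)]
    for x, w in zip(points, weights):
        ix = info(theta, x)
        for i in range(4):
            for j in range(4):
                m[i][j] += w * ix[i][j]
    return m

def inverse(a):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(a)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        d = aug[col][col]
        aug[col] = [v / d for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def d_value(theta, x, m_inv):
    """d(x, xi) = tr{M^{-1} I(theta, x)}."""
    ix = info(theta, x)
    return sum(m_inv[i][j] * ix[j][i] for i in range(4) for j in range(4))

M_INV = inverse(design_information(THETA, POINTS, WEIGHTS))
D_AT_SUPPORT = [d_value(THETA, x, M_INV) for x in POINTS]
```

For any design, the weighted average of $d$ over its support equals $p = 4$, because $\sum_i w_i \operatorname{tr}\{M^{-1} I(\theta, x_i)\} = \operatorname{tr}(M^{-1} M)$. For a D-optimal design the individual values in `D_AT_SUPPORT` would each be expected to lie close to 4, up to the rounding of the quoted design.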
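The restriction $\beta_2 = 2\beta_1$ gives a model with symmetry properties.
The log-odds ratio
\[ \ln\Omega = \ln 4 + \alpha_2 - 2\alpha_1 \]
Define $x_0$ as
\[ x_0 = \arg\max_{x \in \mathcal{X}} P(S = 1). \]
\[ x_0 = \frac{-\alpha_2}{2\beta_1}, \qquad P_{x = x_0}(S = 1) = \frac{1}{1 + \sqrt{\Omega}} \]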
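These closed forms can be checked with a simple grid search. The sketch below uses $\theta_3 = (-1, -5, 1, 2)$ from the earlier examples, which satisfies $\beta_2 = 2\beta_1$; the grid range and step are arbitrary illustrative choices.

```python
import math

ALPHA1, ALPHA2, BETA1 = -1.0, -5.0, 1.0   # theta_3 from the examples; beta2 = 2 * beta1

def prob_s1(x):
    """P(S = 1) at design point x under the restricted model."""
    eta1 = ALPHA1 + BETA1 * x
    eta2 = ALPHA2 + 2.0 * BETA1 * x
    return math.exp(eta1) / (1.0 + math.exp(eta1) + math.exp(eta2))

LOG_OMEGA = math.log(4.0) + ALPHA2 - 2.0 * ALPHA1        # log-odds ratio
X0 = -ALPHA2 / (2.0 * BETA1)                             # closed-form maximiser
P_MAX = 1.0 / (1.0 + math.sqrt(math.exp(LOG_OMEGA)))     # closed-form maximum

# Brute-force maximiser on a grid from -10 to 10 in steps of 0.001.
GRID = [-10.0 + 0.001 * k for k in range(20001)]
X_GRID = max(GRID, key=prob_s1)
```

With these parameters the grid maximiser should agree with $x_0 = 2.5$ to within the grid step, and `prob_s1(X0)` should match `P_MAX`.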
[Figure: four panels (Plot 1 to Plot 4) showing $P(S=0)$, $P(S=1)$ and $P(S=2)$ as functions of $x$.]
Probability distribution for $S$ as a function of $x$. The log-odds ratios are $\ln\Omega_1 = -4.61$, $\ln\Omega_2 = -1.61$, $\ln\Omega_3 = 2.39$, and $\ln\Omega_4 = 20.39$.
[Figure: four panels (Plot 1 to Plot 4) showing $d(x, \xi^*)$ as a function of $x$.]
$d(x, \xi^*)$ for the different examples of D-optimal designs
[Diagram: an axis of increasing $\ln\Omega$, divided in order into a 4-point design region, a 3-point design region and a 2-point design region.]
Number of design points given the log-odds ratio
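Proposed design
\[ \xi^* = \begin{Bmatrix} \dfrac{-\alpha_2}{2\beta_1} - \dfrac{c}{\beta_1} & \dfrac{-\alpha_2}{2\beta_1} + \dfrac{c}{\beta_1} \\ 0.5 & 0.5 \end{Bmatrix} \]
Assume that $\beta_1 = 1$ and $\alpha_2 = 0$.
$\beta_1$ acts as a scale parameter.
If $\alpha_2 \neq 0$, a proposed design with the same $\ln\Omega$ can be derived.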
The information matrix
\[ M(\alpha_1, c) = \frac{1}{2}\left( I(\alpha_1, -c) + I(\alpha_1, c) \right). \]
The determinant of $M$ is
\[ |M(\alpha_1, c)| = \frac{c^2 e^{\alpha_1 - 6c} \left( e^{\alpha_1} + e^{\alpha_1 + 2c} + 4e^{c} \right)}{\left( 1 + e^{\alpha_1 - c} + e^{-2c} \right)^5} \]
Derivative of the determinant of $M$ with respect to $c$ is
\[ \frac{d\,|M(\alpha_1, c)|}{dc} = \frac{c\, e^{\alpha_1 - 4c} \Big[ 2\left( 1 + e^{\alpha_1 - c} + e^{-2c} \right) \left[ e^{-2c}\left( e^{\alpha_1} + e^{\alpha_1 + 2c} + 4e^{c} \right) - c\left( 3e^{\alpha_1 - 2c} + 2e^{\alpha_1} + 10e^{-c} \right) \right] + 5c\, e^{-3c} \left( e^{\alpha_1} + 2e^{-c} \right)\left( e^{\alpha_1} + e^{\alpha_1 + 2c} + 4e^{c} \right) \Big]}{\left( 1 + e^{\alpha_1 - c} + e^{-2c} \right)^6}. \]
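If $\alpha_2 = 0$ then $\ln\Omega \to \infty$ when $\alpha_1 \to -\infty$.
Setting $\dfrac{d\,|M(\alpha_1, c)|}{dc} = 0$ yields
\[ 2\left( 1 + e^{\alpha_1 - c} + e^{-2c} \right) \left[ e^{-2c}\left( e^{\alpha_1} + e^{\alpha_1 + 2c} + 4e^{c} \right) - c\left( 3e^{\alpha_1 - 2c} + 2e^{\alpha_1} + 10e^{-c} \right) \right] + 5c\, e^{-3c} \left( e^{\alpha_1} + 2e^{-c} \right)\left( e^{\alpha_1} + e^{\alpha_1 + 2c} + 4e^{c} \right) = 0. \]
Let $\alpha_1 \to -\infty$ and it follows that
\[ c = \frac{2\left( 1 + e^{-2c} \right)}{5\left( 1 - e^{-2c} \right)} \quad \Rightarrow \quad c \approx 0.6778 \]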
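The limiting equation has no closed-form solution, but it is easy to solve numerically. A minimal sketch using plain bisection follows; the bracket $[0.1, 3]$ and the function names are illustrative assumptions.

```python
import math

def limit_equation(c):
    """c - 2(1 + e^{-2c}) / (5(1 - e^{-2c})); zero at the limiting value of c."""
    return c - 2.0 * (1.0 + math.exp(-2.0 * c)) / (5.0 * (1.0 - math.exp(-2.0 * c)))

def bisect(f, lo, hi, tol=1e-10):
    """Bisection for a root of f on [lo, hi], assuming a sign change on the bracket."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid    # root lies in the upper half
        else:
            hi = mid                 # root lies in the lower half
    return 0.5 * (lo + hi)

C_LIMIT = bisect(limit_equation, 0.1, 3.0)
```

The left-hand side is negative at $c = 0.1$ and positive at $c = 3$, so the bracket contains the root, and `C_LIMIT` should agree with the value $c \approx 0.6778$ quoted above.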
[Figure: two panels plotted against $\ln\Omega$: $c$, and $\max(d(x, \xi(c)))$.]
D-efficiency,
\[ D_{\mathrm{eff}} = \left( \frac{|M(\theta, \xi(c))|}{|M(\theta, \xi^*)|} \right)^{1/p} \]
[Figure: $D_{\mathrm{eff}}$ plotted against $\ln\Omega$.]
D-efficiency for designs with $c = 0.6778$ for different parameter values
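A rough sketch of this kind of efficiency calculation, using the closed-form determinant of the two-point design with $\alpha_2 = 0$ and $\beta_1 = 1$. Two assumptions are made here that the slides do not state: $p = 3$ (the parameters $\alpha_1$, $\alpha_2$, $\beta_1$ under $\beta_2 = 2\beta_1$), and the reference design is the best two-point design found by a grid search over $c$ rather than the true $\xi^*$. When the true optimum has more than two points, this sketch overstates the efficiency.

```python
import math

C_FIXED = 0.6778   # limiting value of c quoted in the slides
P = 3              # assumed number of parameters under beta2 = 2 * beta1

def det_two_point(alpha1, c):
    """|M(alpha1, c)| for the two-point design with alpha2 = 0 and beta1 = 1."""
    num = c ** 2 * math.exp(alpha1 - 6.0 * c) * (
        math.exp(alpha1) + math.exp(alpha1 + 2.0 * c) + 4.0 * math.exp(c))
    den = (1.0 + math.exp(alpha1 - c) + math.exp(-2.0 * c)) ** 5
    return num / den

def best_c(alpha1, lo=0.01, hi=5.0, steps=5000):
    """Grid-search maximiser of the determinant over c (illustrative bounds)."""
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return max(grid, key=lambda c: det_two_point(alpha1, c))

def d_efficiency(alpha1):
    """D-efficiency of c = C_FIXED relative to the best two-point design."""
    reference = det_two_point(alpha1, best_c(alpha1))
    return (det_two_point(alpha1, C_FIXED) / reference) ** (1.0 / P)

EFFICIENCIES = {a: d_efficiency(a) for a in (-30.0, -5.0, -2.0)}
```

For very negative $\alpha_1$ (large $\ln\Omega$) the efficiency should approach 1, since $c = 0.6778$ is the limiting optimum.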
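\[ \xi^* = \begin{Bmatrix} \dfrac{-\alpha_1}{\beta_1} - \dfrac{c}{\beta_1} & \dfrac{-\alpha_1}{\beta_1} + \dfrac{c}{\beta_1} & \dfrac{\alpha_1 - \alpha_2}{\beta_1} - \dfrac{c}{\beta_1} & \dfrac{\alpha_1 - \alpha_2}{\beta_1} + \dfrac{c}{\beta_1} \\ 0.25 & 0.25 & 0.25 & 0.25 \end{Bmatrix} \]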
More complex expression for the determinant of $M$
\[ M(\alpha_1, c) = \frac{1}{4}\left\{ I(-\alpha_1, -c) + I(-\alpha_1, c) + I(\alpha_1, -c) + I(\alpha_1, c) \right\} \]
Set $\alpha_2 = 0$ and $\beta_1 = 1$ and numerically obtain a D-optimal $c$.
[Figure: two panels plotted against $\ln\Omega$: $c$, and $\max(d(x, \xi(c)))$.]
[Figure: $D_{\mathrm{eff}}$ plotted against $\ln\Omega$.]
D-efficiency for designs with $c = 1.229$ for different parameter values
$S_1$ and $S_2$ are independent $\Leftrightarrow$ $\ln\Omega = 0$
Parameter restrictions: $\alpha_2 = 2\alpha_1 - \ln 4$ and $\beta_2 = 2\beta_1$
\[ S \sim \mathrm{Bin}(2, \pi) \quad \text{where} \quad \pi = \frac{e^{\eta_1}}{2 + e^{\eta_1}} \]
[Figure: $P(S=0)$, $P(S=1)$ and $P(S=2)$ plotted against $x$.]
The probability distribution for $S$ given $\alpha_1 = -4$ and $\beta_1 = 1$
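The claim that the restricted model reduces to a binomial distribution can be verified numerically. The sketch below compares the multinomial-logit probabilities under $\alpha_2 = 2\alpha_1 - \ln 4$, $\beta_2 = 2\beta_1$ with $\mathrm{Bin}(2, \pi)$ probabilities, using the quoted values $\alpha_1 = -4$ and $\beta_1 = 1$; the function names are illustrative.

```python
import math

ALPHA1, BETA1 = -4.0, 1.0   # values quoted for the plotted example

def restricted_probs(x):
    """(pi0, pi1, pi2) from the logit model under the independence restrictions."""
    eta1 = ALPHA1 + BETA1 * x
    eta2 = (2.0 * ALPHA1 - math.log(4.0)) + 2.0 * BETA1 * x
    denom = 1.0 + math.exp(eta1) + math.exp(eta2)
    return 1.0 / denom, math.exp(eta1) / denom, math.exp(eta2) / denom

def binomial_probs(x):
    """(P(S=0), P(S=1), P(S=2)) for S ~ Bin(2, pi) with pi = e^eta1 / (2 + e^eta1)."""
    eta1 = ALPHA1 + BETA1 * x
    pi = math.exp(eta1) / (2.0 + math.exp(eta1))
    return (1.0 - pi) ** 2, 2.0 * pi * (1.0 - pi), pi ** 2

# Largest discrepancy between the two parameterisations over x = 0, 1, ..., 10.
MAX_DIFF = max(
    abs(a - b)
    for x in range(11)
    for a, b in zip(restricted_probs(float(x)), binomial_probs(float(x)))
)
```

The agreement follows from $1 + e^{\eta_1} + e^{2\eta_1}/4 = (1 + e^{\eta_1}/2)^2$, so `MAX_DIFF` should be at the level of floating-point rounding.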
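Theorem: For the independent model the locally D-optimal design for arbitrary parameter values is
\[ \xi^* = \begin{Bmatrix} \dfrac{\ln 2 - \alpha_1}{\beta_1} - \dfrac{c}{\beta_1} & \dfrac{\ln 2 - \alpha_1}{\beta_1} + \dfrac{c}{\beta_1} \\ 1/2 & 1/2 \end{Bmatrix}, \]
where $c$ is the solution to the equation
\[ c = \frac{e^c + 1}{e^c - 1}, \qquad c \approx 1.5434 \]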
\[ M(\theta, \xi^*) = \frac{2e^c}{(1 + e^c)^2} \begin{pmatrix} 1 & \dfrac{\ln 2 - \alpha_1}{\beta_1} \\ \dfrac{\ln 2 - \alpha_1}{\beta_1} & \dfrac{1}{\beta_1^2}\left( (\ln 2)^2 - 2\alpha_1 \ln 2 + \alpha_1^2 + c^2 \right) \end{pmatrix} \]
\[ |M(\theta, \xi^*)| = \frac{4c^2 e^{2c}}{\beta_1^2 (1 + e^c)^4} \]
\[ \frac{d\,|M(\theta, \xi^*)|}{dc} = \frac{8c\, e^{2c} \left( 1 + e^c + c - c e^c \right)}{\beta_1^2 (1 + e^c)^5} \]
Equating to zero yields
\[ c = \frac{e^c + 1}{e^c - 1} \]
The solution to the equation, $c \approx 1.5434$, maximizes $|M(\theta, \xi^*)|$.
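As a numerical cross-check of the last step, the determinant can be maximised directly over $c$ instead of solving the stationarity equation. The sketch below uses a ternary search on $c^2 e^{2c} / (1 + e^c)^4$, dropping the constant factor $4/\beta_1^2$, which does not affect the maximiser; the search interval is an illustrative assumption.

```python
import math

def det_shape(c):
    """|M(theta, xi*)| up to the constant factor 4 / beta1^2."""
    return c ** 2 * math.exp(2.0 * c) / (1.0 + math.exp(c)) ** 4

def ternary_max(f, lo, hi, iterations=200):
    """Ternary search for the maximiser of a unimodal function on [lo, hi]."""
    for _ in range(iterations):
        third = (hi - lo) / 3.0
        left, right = lo + third, hi - third
        if f(left) < f(right):
            lo = left     # maximum lies to the right of `left`
        else:
            hi = right    # maximum lies to the left of `right`
    return 0.5 * (lo + hi)

C_OPT = ternary_max(det_shape, 0.1, 5.0)
# Residual of the stationarity equation c = (e^c + 1) / (e^c - 1) at the maximiser.
RESIDUAL = C_OPT - (math.exp(C_OPT) + 1.0) / (math.exp(C_OPT) - 1.0)
```

The maximiser found this way should agree with $c \approx 1.5434$, and the residual of the stationarity equation should be close to zero.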