HODGES-LEHMANN INVERSE LIKELIHOOD ESTIMATES (HLES), KJELL DOKSUM

SLIDE 1

HODGES-LEHMANN INVERSE LIKELIHOOD ESTIMATES (HLE’S)

KJELL DOKSUM
DEPT. OF STATISTICS, UNIVERSITY OF WISCONSIN-MADISON
COLUMBIA UNIVERSITY

4TH LEHMANN SYMPOSIUM, RICE UNIV.
MAY 11, 2011

KJELL DOKSUM

  • DEPT. OF STAT. AT UW-MADISON

HLE’s 1/31

SLIDE 2

Figure: Javier Rojo

SLIDE 3

ACKNOWLEDGEMENTS

AKI OZEKI, UNIV. OF WISCONSIN

SLIDE 4

OUTLINE

1 SOME LIKELIHOODS
2 ASYMPTOTIC DISTRIBUTIONS OF HLE’s
3 MINIMAX RESULTS
4 ONE STEP ESTIMATORS

SLIDE 5

WHY HL-ESTIMATORS?

1 IN LINEAR REGRESSION MODELS WITH ERROR ∼ F, THE HL NORMAL SCORES ESTIMATE IS ASYMPTOTICALLY MORE EFFICIENT THAN THE LEAST SQUARES ESTIMATE, UNIFORMLY IN F.

2 SCHOLZ’S THEOREM. FOR EACH ONE SAMPLE ESTIMATE THAT CAN BE WRITTEN AS A LINEAR COMBINATION OF ORDER STATISTICS, THERE IS A HL-ESTIMATE THAT IS ASYMPTOTICALLY MORE EFFICIENT.

SLIDE 6

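SOME LIKELIHOODS

X = DATA = $(Y, Z)$, $Y \in \mathbb{R}$, $Z \in \mathbb{R}^p$.

$\theta = (\beta \in \mathbb{R}^p, \Lambda \in \mathcal{F})$ = PARAMETER

LIKELIHOOD = $\prod_i p(x_i; \theta)$

COX LIK = $\prod_i \dfrac{\lambda(y_i; \beta \mid z_i)}{\sum_{j \ge i} \lambda(y_i; \beta \mid z_j)}$

EMPIRICAL LIK = $\prod_i p(x_i; \theta)$, $\sum_i p(x_i; \theta) = 1$.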

SLIDE 7

SOME LIKELIHOODS

PROFILE EMPIRICAL (PE) LIKELIHOOD

$$L_{PE}(\beta) = \sup_{\Lambda} \prod_i p(x_i; \theta) \tag{1}$$

$\hat\beta_{PE} = \arg\max_{\beta} L_{PE}(\beta)$

THIS ESTIMATE $\hat\beta_{PE}$ IS A FUNCTION OF THE RANKS $R_1, \dots, R_n$ OF $Y_1, \dots, Y_n$.

SLIDE 8

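SOME LIKELIHOODS

HOEFFDING LIK ≡

$$L_H(r(y); \beta) = P(R = r) = \frac{1}{n!}\, E_0\!\left[\prod_i \frac{p(V^{(r_i)}; \theta \mid z_i)}{p(V^{(r_i)}; \theta_0 \mid z_i)}\right],$$

WHERE $r_i \equiv r(y_i) \equiv \mathrm{RANK}(y_i)$, AND $V^{(1)} < \cdots < V^{(n)}$ ARE $p(v; \theta_0 \mid z)$ ORDER STATISTICS.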

SLIDE 9

RANK LIKELIHOOD ESTIMATOR

EXAMPLE: $Y_i = z_i^T\beta + \epsilon_i$, $\epsilon_i \sim F$, IID.

FORWARD RANK MLE = $\arg\max_{\beta} L_H(r(y); \beta)$ = KP MLE

KP = KALBFLEISCH-PRENTICE (1973)

$\hat\beta_{KP}$ SOLVES $\nabla_\beta L_H(r(y); \beta) = 0$

SLIDE 10

RANK LIKELIHOOD ESTIMATOR

BECAUSE $\mathrm{RANK}(\Lambda(y_i)) = \mathrm{RANK}(y_i)$ FOR INCREASING $\Lambda$, $\hat\beta_{KP}$ APPLIES TO THE SEMIPARAMETRIC TRANSFORMATION MODEL.

$\hat\beta_{KP}$ IS A FUNCTION OF THE RANKS OF $Y_i$, AS IS THE COX ESTIMATE.

SLIDE 11

HODGES-LEHMANN INVERSE MLE (1963)

DEFINITION: IN THE LINEAR MODEL, $\hat\beta_{HL}$ SOLVES

$$\nabla_\beta L_H(r(y - z^T\beta^*); \beta)\big|_{\beta=0} = 0$$

FIRST COMPUTE $\nabla_\beta L_H(r(y); \beta)\big|_{\beta=0}$, THEN CONSTRUCT AN ESTIMATING EQUATION IN $\beta^*$ BY REPLACING $y$ WITH $y - z^T\beta^*$.

HERE $y - z^T\beta$ IS THE "INVERSE" OF $y = z^T\beta + \epsilon$.

SLIDE 12

HODGES-LEHMANN INVERSE MLE (1963)

HL INVERSE LIK. EST: ALIGN RANKS OF RESIDUALS WITH THE "BASELINE" RANKS USING HOEFFDING LIKELIHOOD.

EXAMPLE: TWO SAMPLE CASE, LOGISTIC SHIFT MODEL, $F_2(x) = F_1(x - \Delta)$:

$\nabla_\Delta L_H\big|_{\Delta=0}$ = WILCOXON STAT, $\hat\Delta_{HL} = \mathrm{med}(X_{2j} - X_{1i})$.
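The two-sample estimate med(X2j − X1i) is easy to compute directly. The following is a minimal sketch, not code from the talk; the function name `hl_shift_estimate` and the toy data are illustrative assumptions.

```python
from statistics import median


def hl_shift_estimate(x1, x2):
    """Two-sample Hodges-Lehmann shift estimate.

    Returns the median of all pairwise differences x2[j] - x1[i].
    The name and signature are illustrative, not from the slides.
    """
    diffs = [b - a for a in x1 for b in x2]
    return median(diffs)


# Toy data (made up): the second sample is the first shifted by 3.
sample1 = [1.0, 2.0, 3.0]
sample2 = [4.0, 5.0, 6.0]
print(hl_shift_estimate(sample1, sample2))  # 3.0
```

Forming all pairwise differences costs O(mn) memory, which is fine for small samples; larger problems would need a selection algorithm instead of a full sort.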

SLIDE 13

GENERAL HODGES-LEHMANN INVERSE MLE

MODEL: $Y = h(\epsilon, z, \beta)$. LET $g(y; z, \beta)$ BE THE SOLUTION (INVERSE) FOR $\epsilon$ OF $h(\epsilon, z, \beta) = y$.

$\hat\beta_{HL}$ SOLVES $\nabla_\beta L_H[g(y; z, \beta^*); \beta]\big|_{\beta=0} = 0$

SLIDE 14

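GENERAL HODGES-LEHMANN INVERSE MLE

IN THE EXAMPLE $Y_i = z_i^T\beta + \epsilon$, $\nabla_\beta L_H[r(y - z^T\beta^*); \beta]\big|_{\beta=0}$ ARE $p$ LINEAR RANK STATISTICS

$$T_{nj}(\beta^*) = \sum_{i=1}^{n} (z_{ij} - \bar z_j)\, a_n(r_i(\beta^*)), \quad j = 1, \dots, p$$

WHERE $r_i(\beta^*) = \mathrm{RANK}(y_i - z_i^T\beta^*)$, AND

$$a_n(r) = a\!\left(\frac{r}{n+1}\right), \qquad a(u) = -\frac{f'}{f}\big(F^{-1}(u)\big)$$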
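To make the linear rank statistics concrete, here is a small pure-Python sketch that evaluates T_nj(β*) for a trial value β*. The helper names (`ranks`, `linear_rank_statistics`, `wilcoxon_score`) and the toy data are assumptions for illustration; ties are broken by position, which presumes continuous errors.

```python
def ranks(values):
    """Ranks 1..n of the values (ties broken by position)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r


def wilcoxon_score(r, n):
    """Score a_n(r) = r / (n + 1), the logistic (Wilcoxon) choice."""
    return r / (n + 1)


def linear_rank_statistics(y, z, beta_star, score=wilcoxon_score):
    """T_nj(beta*) = sum_i (z_ij - zbar_j) * a_n(rank(y_i - z_i^T beta*))."""
    n, p = len(y), len(z[0])
    resid = [y[i] - sum(z[i][j] * beta_star[j] for j in range(p))
             for i in range(n)]
    r = ranks(resid)
    zbar = [sum(row[j] for row in z) / n for j in range(p)]
    return [sum((z[i][j] - zbar[j]) * score(r[i], n) for i in range(n))
            for j in range(p)]


# Toy data (made up), roughly y = 2 z: the statistic changes sign
# between beta* = 0 and beta* = 4, bracketing the HL estimate.
z = [[0.0], [1.0], [2.0], [3.0]]
y = [0.1, 2.3, 3.9, 6.2]
print(linear_rank_statistics(y, z, [0.0]))
print(linear_rank_statistics(y, z, [4.0]))
```

Because the statistic is a step function of β*, an exact zero may not exist; in practice one looks for the sign change.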

SLIDE 15

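GENERAL HODGES-LEHMANN INVERSE MLE

SCALE MODEL: $Y = \epsilon \exp[z^T\beta]$, $\epsilon \sim F$, SO $\epsilon = Y / \exp[z^T\beta]$.

THE SCORES ARE $a_n(\mathrm{RANK}(y_i / \exp[z_i^T\beta]))$, WITH

$$a_n(r) = a_1\!\left(\frac{r}{n+1}\right), \qquad a_1(u) = -F^{-1}(u)\,\frac{f'}{f}\big(F^{-1}(u)\big) - 1$$

HERE $\hat\beta_{HL}$ SOLVES

$$\nabla_\beta L_H[r(y_i / \exp[z_i^T\beta^*]); \beta]\big|_{\beta=0} = 0$$

WHICH IS EQUIVALENT TO

$$\sum_{i=1}^{n} (z_{ij} - \bar z_j)\, a_n(r_i(\beta^*)) = 0, \quad j = 1, \dots, p$$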
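As a check on the scale-model score function, the exponential case can be worked out by hand. This is a sketch assuming F is the standard exponential distribution, so that $f(x) = e^{-x}$ for $x > 0$:

```latex
\begin{aligned}
\frac{f'}{f}(x) &= -1, \qquad F^{-1}(u) = -\log(1-u), \\
a_1(u) &= -F^{-1}(u)\cdot(-1) - 1 = -\log(1-u) - 1 .
\end{aligned}
```

Since $\sum_i (z_{ij} - \bar z_j) = 0$, the additive constant $-1$ drops out of the estimating equation, so the scores $-\log(1 - r/(n+1))$ lead to the same estimate.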

SLIDE 16

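GENERAL HODGES-LEHMANN INVERSE MLE

LINEAR MODELS:

EX1: $\epsilon \sim$ LOGISTIC $\Rightarrow a_n(r) = \dfrac{r}{n+1}$

EX2: $\epsilon \sim$ NORMAL $\Rightarrow a_n(r) = \Phi^{-1}\!\left(\dfrac{r}{n+1}\right)$, NORMAL SCORES

SCALE MODEL:

EX3: $\epsilon \sim$ EXP $\Rightarrow a_n(r) = -\log\!\left(1 - \dfrac{r}{n+1}\right)$,

$\hat\beta_{HL}$ IS THE LOGISTIC SCORES ESTIMATOR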
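The three score functions in EX1 to EX3 can be written down in a few lines. This sketch uses only the Python standard library; the function names are illustrative assumptions, and the normal quantile comes from `statistics.NormalDist().inv_cdf`.

```python
import math
from statistics import NormalDist


def logistic_score(r, n):
    """EX1: a_n(r) = r / (n + 1)."""
    return r / (n + 1)


def normal_score(r, n):
    """EX2: a_n(r) = Phi^{-1}(r / (n + 1)), the normal scores."""
    return NormalDist().inv_cdf(r / (n + 1))


def exponential_score(r, n):
    """EX3 (scale model): a_n(r) = -log(1 - r / (n + 1))."""
    return -math.log(1 - r / (n + 1))


# Scores for a sample of size n = 5 (illustrative only).
n = 5
for r in range(1, n + 1):
    print(r, logistic_score(r, n), normal_score(r, n), exponential_score(r, n))
```

Dividing by n + 1 rather than n keeps the argument strictly inside (0, 1), so the normal quantile and the logarithm are always finite.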

SLIDE 17

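ASYMPTOTICS

THEOREM (JAECKEL 1972): IN THE $Y_i = z_i^T\beta + \epsilon$ MODEL,

1 THE HL ESTIMATE $\hat\beta_{HL}$ IS A MAXIMIZER OF

$$S(\beta) = \prod_{i=1}^{n} \exp\!\left[-(y_i - z_i^T\beta)\cdot a_n(\mathrm{RANK}(y_i - z_i^T\beta))\right]$$

2 HERE, FOR NONDECREASING SCORES THAT SUM TO ZERO, $-\log[S(\beta)]$ IS NONNEGATIVE, CONTINUOUS AND CONVEX, SO $\log[S(\beta)]$ IS CONTINUOUS AND CONCAVE.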
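Maximizing S(β) is the same as minimizing the dispersion D(β) = −log S(β) = Σ (yi − ziβ)·an(RANK(yi − ziβ)). The sketch below does this for a single covariate with centered Wilcoxon scores, using ternary search, which is valid because D is convex. The function names, the search bracket and the toy data are assumptions, not part of the talk.

```python
def ranks(values):
    """Ranks 1..n of the values (ties broken by position)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r


def centered_wilcoxon_score(r, n):
    """Wilcoxon score centered so that the scores sum to zero."""
    return r / (n + 1) - 0.5


def dispersion(beta, y, z):
    """D(beta) = sum_i e_i * a_n(rank(e_i)) with e_i = y_i - z_i * beta."""
    n = len(y)
    resid = [yi - zi * beta for yi, zi in zip(y, z)]
    r = ranks(resid)
    return sum(e * centered_wilcoxon_score(ri, n) for e, ri in zip(resid, r))


def minimize_dispersion(y, z, lo=-10.0, hi=10.0, iters=200):
    """Ternary search for a minimizer of the convex function D on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if dispersion(m1, y, z) <= dispersion(m2, y, z):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2


# Noise-free toy data y = 1 + 2 z: the minimizer should be the slope 2.
z = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0 + 2.0 * zi for zi in z]
print(minimize_dispersion(y, z))
```

The intercept does not need to be fitted here: with scores that sum to zero, adding a constant to every residual leaves D unchanged.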

SLIDE 18

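ASYMPTOTICS

IN THE LINEAR MODEL, LET $\varphi(u, f_0) = -\dfrac{f_0'}{f_0}\big(F_0^{-1}(u)\big)$. THEN,

$$\sqrt{n}(\hat\beta_{HL} - \beta) \to N\!\left(0,\ \frac{\int_0^1 [\varphi(u, f_0) - \bar\varphi]^2\,du}{\left(\int_0^1 \varphi(u, f_0)\,\varphi(u, f)\,du\right)^2}\,\Sigma^{-1}\right)$$

WHERE $\bar\varphi = \int_0^1 \varphi(u, f_0)\,du$, $\Sigma = \lim_{n\to\infty} n^{-1} Z^T Z$, $Z$ = CENTERED DESIGN MATRIX, $\epsilon_i \sim F$, $f = F'$.

HERE $f_0(\cdot)$ GENERATES $L_H(\cdot)$ AND $\hat\beta_{HL}$. $f(\cdot)$ IS THE TRUE DENSITY OF $\epsilon$.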

SLIDE 19

ASYMPTOTIC LINEAR MODEL EXAMPLES:

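1 $F_0$ = LOGISTIC

$$\sqrt{n}(\hat\beta_{HL} - \beta) \to N\!\left(0,\ \frac{1}{12\left(\int_{-\infty}^{\infty} f^2(x)\,dx\right)^2}\,\Sigma^{-1}\right)$$

2 $F_0$ = NORMAL$(0, \sigma^2)$

$$\sqrt{n}(\hat\beta_{HL} - \beta) \to N\!\left(0,\ \frac{\Sigma^{-1}}{\left(\int_0^1 \Phi^{-1}(u)\,\varphi(u, f)\,du\right)^2}\right)$$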
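The logistic constant can be checked against the general variance factor $\int_0^1[\varphi(u,f_0)-\bar\varphi]^2du \,/\, (\int_0^1 \varphi(u,f_0)\varphi(u,f)du)^2$. The following is a sketch assuming $f_0$ is the standard logistic density and that $f(x) \to 0$ as $x \to \pm\infty$, so the boundary term in the integration by parts vanishes:

```latex
\begin{aligned}
\varphi(u, f_0) &= 2u - 1, \qquad \bar\varphi = 0, \qquad
\int_0^1 (2u-1)^2\,du = \tfrac{1}{3}, \\
\int_0^1 \varphi(u, f_0)\,\varphi(u, f)\,du
  &= -\int_{-\infty}^{\infty} \big(2F(x) - 1\big)\, f'(x)\,dx
   = 2\int_{-\infty}^{\infty} f^2(x)\,dx, \\
\frac{1/3}{\big(2\int f^2\big)^2} &= \frac{1}{12\big(\int f^2(x)\,dx\big)^2}.
\end{aligned}
```

The second line uses the substitution $u = F(x)$ together with $\varphi(F(x), f) = -f'(x)/f(x)$.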

SLIDE 20

ASYMPTOTIC INEQUALITY

HODGES-LEHMANN (56) CONJECTURE. CHERNOFF-SAVAGE (58) THEOREM.

IF $\hat\beta_{HL}$ IS BASED ON SCORES DERIVED BY TAKING $f_0 = N(0, 1)$, AND IF $\hat\beta_{MLE}$ IS THE MLE FOR THE MODEL WITH $\epsilon \sim N(0, \sigma^2)$, THEN

ASYMPTOTIC VARIANCE$_F(\hat\beta_{HL})$ ≤ ASYMPTOTIC VARIANCE$_F(\hat\beta_{MLE})$

WHERE $F$ = TRUE DIST. OF $\epsilon$. EQUALITY ONLY WHEN $F = N(0, \sigma^2)$.

SLIDE 21

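IN THE AFT MODEL WITH $\epsilon \sim F$, THE HL EXPONENTIAL SCORES STATISTIC SATISFIES

$$\sqrt{n}(\hat\beta_{HL} - \beta) \to N\!\left(0,\ \frac{1}{\left(\int_0^{\infty} t\,\lambda(t)\,dF(t)\right)^2}\,\Sigma^{-1}\right)$$

WHERE $\lambda(t) = f(t)/[1 - F(t)]$.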

SLIDE 22

NAIVE MINIMAX THEORY

RESULT: THE COX ESTIMATE IS ASYMPTOTICALLY MINIMAX FOR THE PROPORTIONAL HAZARD (PH) MODEL: $\lambda(y; z) = \lambda_0(y)\,e^{z^T\beta}$

PROOF:

STEP A: THE COX ESTIMATE IS OPTIMAL FOR THE EXPONENTIAL MODEL,

$$\inf_{\hat\beta} R_E(\beta, \hat\beta) = R_E(\beta, \hat\beta_C) \tag{2}$$

STEP B: THE PH MODEL CAN BE WRITTEN AS $\Lambda_0(Y) \sim$ EXP-DISTR$(z^T\beta)$ WHERE $\Lambda_0$, INCREASING, IS THE BASELINE CUMULATIVE HAZARD FUNCTION.

SLIDE 23

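NAIVE MINIMAX THEORY

THE COX ESTIMATE $\hat\beta_C$ IS INVARIANT, $\hat\beta_C(y) = \hat\beta_C(\Lambda_0(y))$, SO IT HAS CONSTANT RISK,

$$\sup_F R_F(\beta, \hat\beta_C) = R_E(\beta, \hat\beta_C), \quad F(y \mid z) \in \mathrm{PH} \tag{3}$$

STEP C: SINCE THE EXP MODEL IS PH,

$$\sup_F R_F(\beta, \hat\beta) \ge R_E(\beta, \hat\beta), \quad F(y \mid z) \in \mathrm{PH} \tag{4}$$

STEP D: (2),(3),(4) ⇒

$$\sup_F R_F(\beta, \hat\beta_C) = \inf_{\hat\beta} \sup_F R_F(\beta, \hat\beta).$$

QED.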
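Step D compresses a short chain of inequalities. Spelled out with the slide labels (2), (3), (4), and with every supremum taken over $F(y \mid z) \in$ PH, it might read as follows (a sketch of the standard argument, not text from the talk):

```latex
\sup_F R_F(\beta, \hat\beta_C)
  \overset{(3)}{=} R_E(\beta, \hat\beta_C)
  \overset{(2)}{=} \inf_{\hat\beta} R_E(\beta, \hat\beta)
  \overset{(4)}{\le} \inf_{\hat\beta} \sup_F R_F(\beta, \hat\beta)
  \le \sup_F R_F(\beta, \hat\beta_C).
```

The first and last terms coincide, so every inequality in the chain is an equality, which is the minimax statement.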

SLIDE 24

NAIVE MINIMAX THEORY

NON-NAIVE PROOF: PAGE 332 OF BICKEL, KLAASSEN, RITOV, WELLNER.

SLIDE 25

NAIVE MINIMAX THEORY

RESULT: THE HL EXP SCORES EST IS MINIMAX FOR THE IHR ACCELERATED FAILURE TIME MODEL (IHRAFT)

$Y = Y_0 \exp(z^T\beta)$, $Y_0 \sim F$, WITH $F \in$ IHR = INCR. HAZARD RATE

PROOF:

STEP A: THE HL EXP. SC. ESTIMATE IS OPTIMAL FOR THE EXPONENTIAL MODEL,

$$\inf_{\hat\beta} R_E(\beta, \hat\beta) = R_E(\beta, \hat\beta_{HL}) \tag{5}$$

SLIDE 26

NAIVE MINIMAX THEORY

STEP B: THE EXP MODEL IS LEAST FAVORABLE FOR $\hat\beta_{HL}$,

$$\sup_F R_F(\beta, \hat\beta_{HL}) = R_E(\beta, \hat\beta_{HL}), \quad F(y \mid z) \in \mathrm{IHRAFT} \tag{6}$$

SLIDE 27

NAIVE MINIMAX THEORY

STEP C: SINCE THE EXP MODEL IS IHRAFT,

$$\sup_F R_F(\beta, \hat\beta) \ge R_E(\beta, \hat\beta), \quad F(y \mid z) \in \mathrm{IHRAFT} \tag{7}$$

STEP D: (5),(6),(7) ⇒

$$\sup_F R_F(\beta, \hat\beta_{HL}) = \inf_{\hat\beta} \sup_F R_F(\beta, \hat\beta).$$

QED.

TO PROVE STEP B, USE DOKSUM (1967); ARGUMENT BASED ON VAN ZWET ORDERINGS.

SLIDE 28

ASYMPTOTIC MINIMAX RESULT

CONSIDER $f_0$ = LOGISTIC, SO

$$a_n(r_i) = \frac{r_i}{n+1} \tag{8}$$

THEN $\hat\beta_{HL}$ IS ASYMPTOTICALLY MINIMAX OVER THE CLASS OF DISTRIBUTIONS WITH (VAN ZWET TYPE) LIGHTER TAILS THAN THE LOGISTIC DISTRIBUTION.

SLIDE 29

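ONE STEP ESTIMATORS

LET $\hat\tau$ BE A CONSISTENT ESTIMATOR OF

$$\tau = \frac{1}{\int_0^1 \varphi(u, f_0)\,\varphi(u, f)\,du} \tag{9}$$

AND LET $\hat\beta_{LSE}$ BE THE LSE OF $\beta$. DEFINE

$$\hat\beta_{HL} = \hat\beta_{LSE} + \hat\tau \cdot (Z^T Z)^{-1} \cdot T_n(\mathrm{RANK}(Y - Z\hat\beta_{LSE})) \tag{10}$$

THEN,

$$\sqrt{n}(\hat\beta_{HL} - \beta) \to N(0, \Gamma) \tag{11}$$

JURECKOVA (69), KRAFT AND VAN EEDEN (72), HETTMANSPERGER, MCKEAN, TSIATIS, ETC.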
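For a single covariate, the one-step update (10) reduces to scalar arithmetic. The sketch below assumes Wilcoxon scores r/(n + 1) and takes the estimate of τ as an input supplied by the caller, since the slides do not say how it is obtained; all function names and the toy data are illustrative assumptions.

```python
def ranks(values):
    """Ranks 1..n of the values (ties broken by position)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r


def lse_slope(y, z):
    """Least squares slope of y on a single covariate z (with intercept)."""
    n = len(y)
    zbar, ybar = sum(z) / n, sum(y) / n
    szz = sum((zi - zbar) ** 2 for zi in z)
    szy = sum((zi - zbar) * (yi - ybar) for zi, yi in zip(z, y))
    return szy / szz


def rank_statistic(y, z, beta):
    """T_n(beta) = sum_i (z_i - zbar) * rank(y_i - z_i * beta) / (n + 1)."""
    n = len(y)
    zbar = sum(z) / n
    r = ranks([yi - zi * beta for yi, zi in zip(y, z)])
    return sum((zi - zbar) * ri / (n + 1) for zi, ri in zip(z, r))


def one_step_hl(y, z, tau_hat):
    """One-step estimate: LSE + tau_hat * (Z^T Z)^{-1} * T_n at the LSE.

    tau_hat is assumed to be a consistent estimate of tau from (9);
    how to obtain it is outside the scope of this sketch.
    """
    n = len(y)
    zbar = sum(z) / n
    szz = sum((zi - zbar) ** 2 for zi in z)  # centered Z^T Z for p = 1
    b0 = lse_slope(y, z)
    return b0 + tau_hat * rank_statistic(y, z, b0) / szz


# Toy data (made up); tau_hat = 1.0 is an arbitrary placeholder value.
z = [0.0, 1.0, 2.0, 3.0]
y = [0.0, 2.5, 3.5, 6.0]
print(one_step_hl(y, z, tau_hat=1.0))
```

When the rank statistic at the least squares fit is already zero, the update leaves the least squares estimate unchanged, whatever value of τ is plugged in.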

SLIDE 30

GMT MODEL

IN THE MODEL $Y_i = z_i^T\beta + \epsilon_i$ THERE EXISTS $G : \mathbb{R} \to \mathbb{R}$, INCREASING, SUCH THAT

$$G(y_i - z_i^T\beta), \quad i = 1, \dots, n \tag{12}$$

ARE IID. HERE $G$ IS UNKNOWN.

SLIDE 31

SUMMARY

1 SOME LIKELIHOODS
2 ASYMPTOTIC DISTRIBUTIONS OF HLE’s
3 MINIMAX RESULTS
4 ONE STEP ESTIMATORS