Saddlepoint-Based Bootstrap Inference in Dependent Data Settings

SLIDE 1

Saddlepoint-Based Bootstrap Inference in Dependent Data Settings

  • Alex Trindade, Dept. of Mathematics & Statistics, Texas Tech University
  • Rob Paige, Missouri University of Science and Technology
  • Indika Wickramasinghe, Eastern New Mexico University (former student)
  • Pratheepa Jeganathan, Texas Tech University (current student)

June 2014

alex.trindade@ttu.edu ( Dept. of Mathematics & Statistics, Texas Tech University Rob Paige, Missouri SPBB Inference Under Dependence June 2014 1 / 21

SLIDE 2

Outline

1. Overview of SPBB inference: Saddlepoint-Based Bootstrap
  • An approximate parametric bootstrap for scalar parameter θ

2. Application 1: Spatial Regression Models
  • Classic application of SPBB
  • Better performance than asymptotic-based CIs

3. Application 2: MA(1) Model
  • Challenging; extend methodology in two directions
  • Extension 1: Non-Monotone QEEs
  • Extension 2: Non-Gaussian QEEs

SLIDE 3

Saddlepoint-Based Bootstrap (SPBB) Inference

Pioneered by Paige, Trindade, & Fernando (SJS, 2009). SPBB:

  • is an approximate percentile parametric bootstrap;
  • replaces (slow) MC simulation with a (fast) saddlepoint approximation (SPA);
  • applies to estimators that are roots of a QEE (quadratic estimating equation);
  • enjoys near-exact performance;
  • is orders of magnitude faster than the bootstrap;
  • may be the only alternative to the bootstrap if no exact or asymptotic procedures exist.

Idea:

  • relate the distribution of the QEE $\Psi(\theta)$ to that of its root, the estimator $\hat\theta$;
  • under normality of the data, the MGF of the QEE has a closed form;
  • use it to saddlepoint-approximate the distribution of the estimator (PDF or CDF);
  • pivot the CDF to get a CI... numerically!
  • leads to 2nd-order accurate CIs: coverage error is $O(n^{-1})$.
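As a worked illustration of the closed-form MGF step, here is a sketch for the mean-zero case only. The symbols $A$, $\Sigma$, $\lambda_i$, $K$, $\hat s$, $\hat w$, $\hat u$ are notation introduced here, and the Lugannani-Rice form of the SPA is an assumption: it is a standard choice, but the slides do not say which variant the authors use.

```latex
% Assumption: x ~ N(0, \Sigma), \Psi = x^\top A x with A symmetric,
% and \lambda_1, ..., \lambda_n the eigenvalues of A\Sigma.
\begin{align*}
M_\Psi(s) &= E\!\left[e^{s\,x^\top A x}\right]
           = \left|I_n - 2sA\Sigma\right|^{-1/2}
           = \prod_{i=1}^{n} (1 - 2s\lambda_i)^{-1/2},
           \qquad 1 - 2s\lambda_i > 0 \ \text{for all } i, \\
K(s) &= \log M_\Psi(s) = -\tfrac{1}{2}\sum_{i=1}^{n} \log(1 - 2s\lambda_i). \\
\intertext{With $\hat s$ solving the saddlepoint equation $K'(\hat s) = 0$, the Lugannani--Rice approximation to $F_\Psi(0)$ is}
\hat F_\Psi(0) &= \Phi(\hat w) + \phi(\hat w)\left(\frac{1}{\hat w} - \frac{1}{\hat u}\right),
\qquad \hat w = \operatorname{sgn}(\hat s)\sqrt{-2K(\hat s)},
\qquad \hat u = \hat s\sqrt{K''(\hat s)}.
\end{align*}
```

Only the eigenvalues $\lambda_i$ are needed, so each CDF evaluation costs one eigen-decomposition and a one-dimensional root search rather than a Monte Carlo loop.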

SLIDE 4

SPBB: An Approximate Parametric Bootstrap

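  • Goal: $F_{\hat\theta}(\hat\theta_{obs})$, where $\hat\theta$ solves $\Psi(\theta) = 0$. Intractable! (And bootstrap too expensive...)
  • $\Psi(\theta)$ monotone: $F_{\hat\theta}(\hat\theta_{obs}) = F_{\Psi(\hat\theta_{obs})}(0)$
  • SPA via MGF of $\Psi(\theta)$: $\hat F_{\Psi(\hat\theta_{obs})}(0)$
  • Pivot: $(\theta_L, \theta_U)$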

SLIDE 5

Application 1: Spatial Regression Models (Lattice Data)

Jeganathan, Paige, & Trindade (under review)

  • Spatial process $y = [y(s_1), \ldots, y(s_n)]^\top$ observed at sites $\{s_1, \ldots, s_n\}$.
  • Under stationarity & isotropy, correlation is modeled via a spatial dependence parameter $\rho$ and a spatial weights matrix $W$.
  • 3 main correlation structures for the regression model $y = X\beta + z$, $z \sim N(0, \sigma^2 V_\rho)$:

  • SAR: $z = \rho W z + \varepsilon$, with $V_\rho = (I_n - \rho W)^{-1}(I_n - \rho W^\top)^{-1}$
  • CAR: $E[z(s_i) \mid z(s_j) : s_j \in N(s_i)] = \rho \sum_j w_{ij} z(s_j)$, with $V_\rho = (I_n - \rho W)^{-1}$
  • SMA: $z = \rho W \varepsilon + \varepsilon$, with $V_\rho = (I_n + \rho W)(I_n + \rho W^\top)$
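The three structures above can be sketched in a few lines. This is a minimal pure-Python illustration, not the authors' code: the helper names (`rook_weights`, `sar_precision`, `car_precision`, `sma_covariance`) and the rook-neighbour binary weights on a small grid are assumptions made for the example. To avoid matrix inversion, the SAR and CAR cases return the precision matrix $V_\rho^{-1}$, which follows directly from the formulas on the slide.

```python
def rook_weights(nrow, ncol):
    """Binary weights (assumed rook scheme): w_ij = 1 if sites i, j share an edge."""
    n = nrow * ncol
    W = [[0.0] * n for _ in range(n)]
    for r in range(nrow):
        for c in range(ncol):
            i = r * ncol + c
            if c + 1 < ncol:                      # right-hand neighbour
                W[i][i + 1] = W[i + 1][i] = 1.0
            if r + 1 < nrow:                      # neighbour below
                W[i][i + ncol] = W[i + ncol][i] = 1.0
    return W


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def eye_plus(W, c):
    """Return I + c * W."""
    n = len(W)
    return [[(1.0 if i == j else 0.0) + c * W[i][j] for j in range(n)]
            for i in range(n)]


def sar_precision(W, rho):
    """V_rho^{-1} = (I - rho W')(I - rho W) for the SAR model."""
    B = eye_plus(W, -rho)
    return matmul(transpose(B), B)


def car_precision(W, rho):
    """V_rho^{-1} = I - rho W for the CAR model (W symmetric)."""
    return eye_plus(W, -rho)


def sma_covariance(W, rho):
    """V_rho = (I + rho W)(I + rho W') for the SMA model."""
    B = eye_plus(W, rho)
    return matmul(B, transpose(B))


W = rook_weights(2, 2)          # 4 sites, each with 2 neighbours
P_sar = sar_precision(W, 0.1)
P_car = car_precision(W, 0.1)
V_sma = sma_covariance(W, 0.1)
```

Working with the precision matrix is a design choice for the sketch only; any routine that needs $V_\rho$ itself for SAR or CAR would have to invert these matrices.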

SLIDE 6

QEEs for ML and REML Estimators of ρ

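  • IRWGLS estimate for $z$ (fixed $\rho$): $\hat z \equiv r = P_{C(X)^\perp}\, y$, with

$$y \sim N(X\beta, \sigma^2 V_\rho) \implies r \sim N\left(0, \sigma^2 V_\rho P_{C(X)^\perp}^\top\right)$$

  • Leads to the following QEEs for the estimators $\hat\rho_{ML}$ & $\hat\rho_{REML}$:

$$\Psi_{ML}(\rho) = r^\top \left[ \mathrm{Tr}\left(V_\rho^{-1}\dot V_\rho\right) V_\rho^{-1} - n\, V_\rho^{-1}\dot V_\rho V_\rho^{-1} \right] r$$

$$\Psi_{REML}(\rho) = r^\top \left[ \mathrm{Tr}\left(V_\rho^{-1}\dot V_\rho\right) V_\rho^{-1} - \mathrm{Tr}\left(P_{C(X)}\dot V_\rho V_\rho^{-1}\right) V_\rho^{-1} - (n - q)\, V_\rho^{-1}\dot V_\rho V_\rho^{-1} \right] r$$

  • Theorem: ML QEE is biased; REML QEE is unbiased.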

SLIDE 7

Approximations for Distribution of $\hat\rho_{ML}$: CAR Model

[Figure: six density panels, ρ = 0.05, 0.15, 0.2 for each of n = 36 and n = 100; horizontal axis from −0.3 to 0.3, vertical axis Density. Curves: Asymptotic Normal (solid), Saddlepoint Approx (dashed), Empirical (histogram).]

SLIDE 8

Empirical Coverage Probabilities and Average Lengths of 95% SPBB & Asymptotic CIs for $\hat\rho_{ML}$: CAR Model

Coverages
Sample size   ρ0=0.05         ρ0=0.15         ρ0=0.2
              SPBB    ASYM    SPBB    ASYM    SPBB    ASYM
n=16          0.941   0.876   0.934   0.895   0.928   0.866
n=36          0.959   0.907   0.941   0.908   0.906   0.895
n=100         0.946   0.925   0.950   0.932   0.946   0.931

Lengths
Sample size   ρ0=0.05         ρ0=0.15         ρ0=0.2
              SPBB    ASYM    SPBB    ASYM    SPBB    ASYM
n=16          0.480   0.575   0.514   0.628   0.440   0.517
n=36          0.402   0.443   0.346   0.378   0.286   0.311
n=100         0.263   0.272   0.214   0.221   0.159   0.165

SLIDE 9

Real Dataset 1: Mercer & Hall (1911) Wheat Yield

  • Yield of grain collected in summer of 1910 over n = 500 plots (20 × 25 grid).
  • Mean trend removed via median polish.
  • Available in R package spdep as “wheat”.
  • Cressie (1985, 1993): use spatial binary weights matrix W with polynomial neighborhood structure in SAR model.

Table: MLEs and 95% SPBB & ASYM CIs for ρ0.

Estimation Method   SAR model        CAR model        SMA model
MLE                 0.603            0.078            0.077
SPBB 95% CI         (0.477, 0.726)   (0.067, 0.084)   (0.055, 0.098)
ASYM 95% CI         (0.478, 0.727)   (0.069, 0.088)   (0.055, 0.098)

SLIDE 10

Real Dataset 2: Eire county blood group A percentages

  • Available in R package spdep as “eire”.
  • Percentage of a sample with blood type A collected over n = 26 counties in Eire.
  • Covariates: towns (towns/unit area) and pale (1=within, 0=beyond).
  • Used by Cliff & Ord (1973) to illustrate spatial dependence in SAR and CAR models (binary weights matrix W with neighborhood structure as in spdep).

Estimation Method   SAR model         CAR model         SMA model
MLE                 0.313             0.078             0.055
SPBB 95% CI         (−0.190, 0.769)   (−0.167, 0.195)   (−0.078, 0.209)
ASYM 95% CI         (−0.151, 0.776)   (−0.128, 0.283)   (−0.083, 0.192)

SLIDE 11

Application 2: MA(1) Model

Paige, Trindade, & Wickramasinghe (AISM, 2014)

Challenging...; extend methodology in two directions.

Extension 1: Non-Monotone QEEs

  • Problem: non-monotone QEEs invalidate SPBB
  • Solution: double-SPA & importance sampling

Extension 2: Non-Gaussian QEEs

  • Problem: key to SPBB is a QEE with tractable MGF
  • Solution: elliptically contoured distributions, and some tricks

SLIDE 12

The MA(1): World’s Simplest Model?

Model: $X_t = \theta_0 Z_{t-1} + Z_t$, $Z_t \sim \text{iid}\,(0, \sigma^2)$, $|\theta_0| \le 1$

Uses:

  • special case of more general ARMA models;
  • perhaps most useful in testing if data have been over-differenced: if we difference WN we get an MA(1) with $\theta_0 = -1$, since $X_t = Z_t \implies Y_t \equiv X_t - X_{t-1} = Z_t - Z_{t-1}$;
  • connection with unit-root tests in econometrics (Tanaka, 1990; Davis et al., 1995; Davis & Dunsmuir, 1996; Davis & Song, 2011).

Inference: complicated...

  • common estimators (MOME, LSE, MLE) have mixed distributions: point masses at ±1 and continuous over (−1, 1);
  • LSE & MLE are roots of polynomials of degree ≈ 2n.
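The over-differencing remark and the point masses at ±1 can both be seen in a small simulation. The sketch below uses the standard textbook form of the method-of-moments estimator (solve $\hat\rho(1) = \theta/(1+\theta^2)$ for $|\theta| \le 1$, and set the estimate to ±1 when $|\hat\rho(1)| > 1/2$); that the slides use exactly this convention is an assumption, and the function names are made up for the example.

```python
import math
import random


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation, without mean correction (mean-zero model)."""
    num = sum(x[i] * x[i + 1] for i in range(len(x) - 1))
    den = sum(v * v for v in x)
    return num / den


def mome_from_rho(r):
    """Invert rho = theta / (1 + theta^2) on [-1, 1]; clamp to +/-1 otherwise."""
    if abs(r) > 0.5:
        return math.copysign(1.0, r)      # source of the point masses at +/-1
    if r == 0.0:
        return 0.0
    return (1.0 - math.sqrt(1.0 - 4.0 * r * r)) / (2.0 * r)


# Difference Gaussian white noise: Y_t = Z_t - Z_{t-1} is an MA(1) with theta0 = -1.
random.seed(0)
z = [random.gauss(0.0, 1.0) for _ in range(5001)]
y = [z[i + 1] - z[i] for i in range(5000)]

rho_hat = lag1_autocorr(y)            # theory: rho(1) = -1 / (1 + 1) = -0.5
theta_hat = mome_from_rho(rho_hat)
print(rho_hat, theta_hat)
```

Because the map from $\hat\rho(1)$ to $\hat\theta$ is very steep near $|\hat\rho(1)| = 1/2$, small sampling noise in $\hat\rho(1)$ moves $\hat\theta$ a long way, which is one reason inference near the unit root is hard.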

SLIDE 13

Unification of Parameter Estimators

Theorem

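For $|\theta| < 1$, MOME, LSE, and MLE are all roots of a QEE, $\Psi(\theta) = x^\top A_\theta x$, where the symmetric matrix $A_\theta$ in each case is:

  • MOME (QEE is monotone): $A_\theta = (1 + \theta^2) J_n - 2\theta I_n$
  • LSE (QEE not monotone...): $A_\theta = \Omega_\theta^{-1} [J_n + 2\theta I_n] \Omega_\theta^{-1}$
  • MLE (QEE not monotone...): $A_\theta = \text{function}(\theta, I_n, J_n, \Omega_\theta^{-1})$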
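A short derivation shows where the MOME matrix comes from. It assumes that $J_n$ denotes the $n \times n$ matrix with ones on the first sub- and super-diagonals and zeros elsewhere; the slide does not define $J_n$, but this reading reproduces the stated $A_\theta$ exactly.

```latex
% Assumption: (J_n)_{ij} = 1 if |i - j| = 1, and 0 otherwise.
\begin{align*}
\hat\rho(1) &= \frac{\sum_{t=1}^{n-1} x_t x_{t+1}}{\sum_{t=1}^{n} x_t^2}
             = \frac{x^\top J_n x}{2\, x^\top x}, \\
\hat\rho(1) = \frac{\theta}{1 + \theta^2}
  &\iff (1 + \theta^2)\, x^\top J_n x - 2\theta\, x^\top x = 0 \\
  &\iff x^\top \left[ (1 + \theta^2) J_n - 2\theta I_n \right] x = 0 .
\end{align*}
```

So matching the sample and model lag-1 autocorrelations is the same as finding a root of a quadratic form in the data, which is what puts the MOME in the QEE framework.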

SLIDE 14

SPA densities of estimators: MOME, LSE, MLE, AN

[Figure: four panels of SPA densities over (−1, 1), for (n=10, θ=0.4), (n=10, θ=0.8), (n=20, θ=0.4), (n=20, θ=0.8). Curves: AN, MLE, CLSE, MOME.]

SLIDE 15

95% CI Coverages & Lengths for MOME (Gaussian Noise)

Settings      Coverage Probability      Average Length
n    θ0       SPBB    Boot    AN        SPBB    Boot    AN
10   0.4      0.940   0.432   0.997     1.484   1.438   0.561
10   0.8      0.948   0.358   0.259     1.336   1.653   1.300
20   0.4      0.953   0.717   1.000     1.095   1.560   0.334
20   0.8      0.960   0.524   0.693     1.005   1.692   1.616

SLIDE 16

Extension 1: Non-Monotone Estimating Equations

  • Monotonicity of QEE is key (Daniels, 1983).
  • Skovgaard (1990) & Spady (1991) give an expression for the PDF of $\hat\theta$ where the Jacobian does not require monotonicity of $\Psi(t)$ in $t$;
  • but it involves an intractable conditional expectation...
  • Solution: double-SPA & importance sampling.

SLIDE 17

Example: Density of MLE (Gaussian Noise)

[Figure: four density panels over (−1, 1), for (n=10, θ=0.4), (n=10, θ=0.8), (n=20, θ=0.4), (n=20, θ=0.8). Curves: Skovgaard (solid), Daniels (dashed), Empirical (histogram).]

SLIDE 18

Extension 2: Non-Gaussian QEEs

  • General problem with SPBB: need QEEs with tractable MGF...
  • One solution: elliptically contoured (EC) distributions.
  • Relies on an appropriate weighting function $w(t)$ (Provost & Cheong, 2002).

Theorem

With $M_N$ the MVN MGF:

$$M_{EC}(s; \mu, \Sigma) = \int_0^\infty w(t)\, M_N(s; \mu, \Sigma/t)\, dt$$

SLIDE 19

SPBB approx PDFs of MOME in Laplace MA(1)

[Figure: four panels, Density over (−1, 1), for (n=5, θ=0.4), (n=10, θ=0.4), (n=5, θ=0.8), (n=10, θ=0.8).]

SLIDE 20

Key References

  • Butler, R.W. (2007), Saddlepoint Approximations With Applications, New York: Cambridge University Press.
  • Paige, R.L. and Trindade, A.A. (2008), “Practical Small Sample Inference for Single Lag Subset Autoregressive Models”, J. Statist. Plan. Inf., 138, 1934–1949.
  • Paige, R.L., Trindade, A.A. and Fernando, P.H. (2009), “Saddlepoint-based bootstrap inference for quadratic estimating equations”, Scand. J. Stat., 36, 98–111.
  • Paige, R.L., Trindade, A.A. and Wickramasinghe, R.I.P., “Extensions of Saddlepoint-Based Bootstrap Inference”, Ann. Instit. Statist. Math., to appear.
  • Jeganathan, P., Paige, R.L. and Trindade, A.A., “Saddlepoint-Based Bootstrap Inference for Spatial Dependence in the Lattice Process”, under review.

SLIDE 21

SPBB Details: Key Steps

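  • Estimator $\hat\theta$ of $\theta_0$ solves the QEE $\Psi(\theta) = x^\top A_\theta x = 0$.
  • Assume: $x \sim N(\mu, \Sigma) \implies$ closed form for MGF of QEE.
  • QEE monotone (e.g., decreasing) in $\theta$ implies:

$$F_{\hat\theta}(t) = P(\hat\theta \le t) = P(\Psi(t) \le 0) = F_{\Psi(t)}(0)$$

  • Nuisance parameter $\lambda$: substitute the conditional MLE, $\hat\lambda_\theta$.
  • Now: accurately approximate the distribution of $\hat\theta$ via SPA:

$$F_{\hat\theta}(t; \theta_0, \lambda_0) \approx \hat F_{\hat\theta}\left(t; \theta_0, \hat\lambda_{\theta_0}\right) = \hat F_{\Psi(t)}\left(0; \theta_0, \hat\lambda_{\theta_0}\right)$$

  • CI $(\theta_L, \theta_U)$ produced by pivoting the SPA of the CDF:

$$\hat F_{\Psi(\hat\theta_{obs})}\left(0; \theta_L, \hat\lambda_{\theta_L}\right) = 1 - \frac{\alpha}{2}, \qquad \hat F_{\Psi(\hat\theta_{obs})}\left(0; \theta_U, \hat\lambda_{\theta_U}\right) = \frac{\alpha}{2}$$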
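The key steps above can be sketched end to end for the simplest case in the talk, the MOME in a Gaussian MA(1). This is an illustrative sketch under stated assumptions, not the authors' implementation: it takes $J_n$ to be the matrix with ones on the first off-diagonals (eigenvalues $2\cos(k\pi/(n+1))$), uses the Lugannani-Rice form of the SPA, assumes the approximate CDF is decreasing in $\theta_0$ so that bisection can do the pivoting, and clamps the limits to $[-1, 1]$. The function names are hypothetical. There is no nuisance parameter here because $\sigma^2$ cancels in $P(\Psi(t) \le 0)$.

```python
import math


def qee_eigenvalues(t, theta0, n):
    """Eigenvalues of Sigma @ A_t, with A_t = (1+t^2) J_n - 2t I_n and
    Sigma = (1+theta0^2) I_n + theta0 J_n (unit noise variance).
    Both matrices are polynomials in J_n, so eigenvalues multiply."""
    out = []
    for k in range(1, n + 1):
        c = 2.0 * math.cos(k * math.pi / (n + 1))      # eigenvalue of J_n
        out.append((1.0 + theta0 ** 2 + theta0 * c) * ((1.0 + t ** 2) * c - 2.0 * t))
    return out


def spa_cdf(t, theta0, n):
    """Lugannani-Rice approximation to P(theta_hat <= t) = P(Psi(t) <= 0)."""
    lam = qee_eigenvalues(t, theta0, n)

    def K(s):   # cumulant generating function of Psi(t)
        return -0.5 * sum(math.log1p(-2.0 * s * l) for l in lam)

    def K1(s):  # first derivative
        return sum(l / (1.0 - 2.0 * s * l) for l in lam)

    def K2(s):  # second derivative
        return sum(2.0 * l * l / (1.0 - 2.0 * s * l) ** 2 for l in lam)

    # K is finite on (1/(2 min lam), 1/(2 max lam)) and K' is increasing there,
    # so the saddlepoint equation K'(s) = 0 can be solved by bisection.
    a, b = 0.5 / min(lam), 0.5 / max(lam)
    for _ in range(200):
        m = 0.5 * (a + b)
        if K1(m) < 0.0:
            a = m
        else:
            b = m
    s = 0.5 * (a + b)
    w = math.copysign(math.sqrt(max(0.0, -2.0 * K(s))), s)
    if abs(w) < 1e-5:
        # Removable singularity when 0 is (almost) the mean of Psi(t).
        k3 = sum(8.0 * l ** 3 for l in lam)
        return 0.5 + k3 / (6.0 * math.sqrt(2.0 * math.pi) * K2(0.0) ** 1.5)
    u = s * math.sqrt(K2(s))
    Phi = 0.5 * (1.0 + math.erf(w / math.sqrt(2.0)))
    phi = math.exp(-0.5 * w * w) / math.sqrt(2.0 * math.pi)
    return Phi + phi * (1.0 / w - 1.0 / u)


def spbb_ci(theta_obs, n, alpha=0.05):
    """Pivot the approximate CDF in theta0 (assumed decreasing in theta0)."""
    def solve(target):
        def f(th0):
            return spa_cdf(theta_obs, th0, n) - target
        lo, hi = -1.0, 1.0
        if f(lo) <= 0.0:      # target not reached even at theta0 = -1
            return lo
        if f(hi) >= 0.0:      # target not reached even at theta0 = +1
            return hi
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if f(mid) > 0.0:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    return solve(1.0 - alpha / 2.0), solve(alpha / 2.0)


theta_L, theta_U = spbb_ci(theta_obs=0.4, n=20)
print(theta_L, theta_U)
```

Each CDF evaluation is a one-dimensional root search over closed-form eigenvalues, so the whole interval costs a few thousand cheap function calls; this is the sense in which the SPA replaces the Monte Carlo loop of a parametric bootstrap.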
