Robust inference based on flexible parametric families of distributions


Robust inference based on flexible parametric families of distributions. Adelchi Azzalini (Università di Padova, Italia). ICORS, Parma, June 2009.


SLIDE 1

Outline Skew-symmetric distributions ST Flexible likelihood Numerical illustrations Closing

Robust inference based on flexible parametric families of distributions

Adelchi Azzalini

(Università di Padova, Italia)

ICORS, Parma, June 2009

SLIDE 2

Outline of the talk

  • skew-symmetric families of distributions
  • flexible likelihood for robust inference
  • some numerical comparisons

SLIDE 3

Skew-symmetric distributions — Introduction

A generator of distributions

  • context: families of continuous distributions on R^d
  • start from a density f0 symmetric around 0: f0(x) = f0(−x) (x ∈ R^d)
  • choose a real-valued w(x) such that w(−x) = −w(x)
  • choose a scalar cdf G(·) with symmetric pdf G′(·)
  • then f(x) = 2 f0(x) G{w(x)} is a skew-symmetric pdf
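The recipe above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`skew_symmetric_pdf`, `integrate`) are hypothetical, and the logistic cdf for G and the odd cubic for w are example choices that do not come from the slides. The numerical integral is there to check that the result is a proper density.

```python
import math
from statistics import NormalDist


def skew_symmetric_pdf(f0, G, w):
    """Build f(x) = 2 * f0(x) * G(w(x)) from the three ingredients."""
    def f(x):
        return 2.0 * f0(x) * G(w(x))
    return f


def integrate(f, lo, hi, n=20000):
    """Composite trapezoidal rule on [lo, hi] with n equal steps."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


# Illustrative ingredients (assumptions, not taken from the slides):
# f0 = N(0, 1) density, G = logistic cdf, w = an odd cubic polynomial.
def logistic_cdf(u):
    return 1.0 / (1.0 + math.exp(-u))


def odd_w(x):
    return 1.5 * x - 0.5 * x ** 3


f = skew_symmetric_pdf(NormalDist().pdf, logistic_cdf, odd_w)
total_mass = integrate(f, -10.0, 10.0)
print(f"integral of f over [-10, 10]: {total_mass:.6f}")
```

The integral is close to 1 for any odd w because f(x) + f(−x) = 2 f0(x), so the skewing factor only moves mass between x and −x.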

SLIDE 4

Basic case: skew-normal distribution (d = 1)

Choose N(0, 1) ingredients: f0(x) = ϕ(x), G = Φ, w(x) = α x and get f(x) = 2 ϕ(x) Φ(αx)

[Figure: skew-normal densities; left panel: α = −2, −5, −20; right panel: α = 2, 5, 20]
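A minimal Python sketch of the skew-normal case follows. The density is the one stated on the slide. The sampler uses the sign-flip construction, which is a standard property of skew-symmetric densities rather than something shown in the talk: draw X from f0 and keep it with probability G{w(X)}, otherwise return −X. The function names and the seed are hypothetical. The closed-form mean δ√(2/π), with δ = α/√(1 + α²), is a known result used here only as a check.

```python
import math
import random
from statistics import NormalDist, fmean

_N01 = NormalDist()


def sn_pdf(x, alpha):
    """Skew-normal density f(x) = 2 * phi(x) * Phi(alpha * x)."""
    return 2.0 * _N01.pdf(x) * _N01.cdf(alpha * x)


def sn_sample(alpha, rng):
    """Sign-flip construction: keep X ~ N(0, 1) with probability Phi(alpha * X), else return -X."""
    x = rng.gauss(0.0, 1.0)
    return x if rng.random() < _N01.cdf(alpha * x) else -x


def sn_mean(alpha):
    """Mean of the skew-normal: delta * sqrt(2 / pi), delta = alpha / sqrt(1 + alpha^2)."""
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    return delta * math.sqrt(2.0 / math.pi)


rng = random.Random(2009)  # arbitrary fixed seed, for reproducibility only
alpha = 5.0
draws = [sn_sample(alpha, rng) for _ in range(50_000)]
print("sample mean:", round(fmean(draws), 3))
print("theoretical mean:", round(sn_mean(alpha), 3))
```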

SLIDE 5

Regulate both skewness and kurtosis

Select f0 from a symmetric family with adjustable tails. Interesting cases:

  • Exponential power (Subbotin, 1923): $f_0(x) \propto \exp\left(-\frac{|x|^{\nu}}{\nu}\right)$
  • Student’s t: $f_0(x) \propto \left(1 + \frac{x^2}{\nu}\right)^{-\frac{\nu+d}{2}}$

In both cases ν regulates the tail thickness.
Various options for the skewing factor.

SLIDE 6

Skew-t distribution (case d = 1)

Let Z ∼ Skew-normal(α); then a natural form of skew-t (ST) variate is

$$X = \frac{Z}{\sqrt{\chi^2_{\nu}/\nu}}$$

The density is

$$f(x) = 2\, t_{\nu}(x)\, T_{\nu+1}\{w(x)\}, \qquad w(x) = \alpha x \sqrt{\frac{\nu+1}{\nu+x^2}}$$

Note: f(x) is of skew-symmetric type.
Note: a multivariate version exists.
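The skew-t density above can be evaluated with the standard library alone, as in the sketch below. It assumes that t_ν and T_{ν+1} denote the Student's t pdf with ν degrees of freedom and the cdf with ν + 1 degrees of freedom. The cdf is obtained by Simpson's rule, which is a numerical shortcut chosen for self-containment rather than the method used in the talk, and all function names are hypothetical.

```python
import math


def t_pdf(x, nu):
    """Student's t density with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - 0.5 * (nu + 1) * math.log1p(x * x / nu))


def t_cdf(x, nu, n=200):
    """Student's t cdf: 1/2 plus the signed area over [0, |x|] (Simpson's rule, n even)."""
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / n
    s = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 + math.copysign(s * h / 3.0, x)


def st_pdf(x, alpha, nu):
    """Skew-t density f(x) = 2 * t_nu(x) * T_{nu+1}(w(x))."""
    w = alpha * x * math.sqrt((nu + 1.0) / (nu + x * x))
    return 2.0 * t_pdf(x, nu) * t_cdf(w, nu + 1.0)


# Check numerically that the density integrates to about 1 (trapezoidal rule).
alpha, nu = 2.0, 5.0
lo, hi, steps = -50.0, 50.0, 2000
h = (hi - lo) / steps
mass = 0.5 * (st_pdf(lo, alpha, nu) + st_pdf(hi, alpha, nu))
for i in range(1, steps):
    mass += st_pdf(lo + i * h, alpha, nu)
mass *= h
print(f"integral of the ST density over [{lo}, {hi}]: {mass:.4f}")
```

Setting α = 0 gives w(x) = 0 and T(0) = 1/2, so the density reduces to the symmetric t_ν.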

SLIDE 7

Skew-t distribution: example of densities

[Figure: skew-t densities; left panel: ν = 5, legend entries t5, α = 0, 2, 5, 20; right panel: ν = 1, legend entries t1, α = 0, 2, 5, 20]

SLIDE 8

A flexible distribution

Consider ST as a general-purpose tool for statistical modelling.
It combines high flexibility for skewness and for the tails:
  • α regulates skewness (α ∈ R^d)
  • ν regulates the tail thickness (ν > 0)
Make use of the tail parameter to accommodate “outliers”, possibly non-symmetrically distributed.
(Ideal in the d-dimensional case: a tail parameter for each component)

SLIDE 9

Regression models with ST errors

  • fitted model: y = x⊤β + ε, ε ∼ (scale factor) × ST
  • estimate parameters via MLE (or Bayesian approach, according to taste)
  • adjust the intercept because E{ST} ≠ 0
  • various options:
      intercept = β̂0 + E{ε} . . . needs ν̂ > 1
      intercept = β̂0 + median(ε) . . . use this
      others . . .
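The two intercept adjustments can be illustrated as follows. This is a sketch under assumptions, not the procedure used in the talk: the values of α, ν, the scale factor and β̂0 are hypothetical stand-ins for fitted quantities, the median is approximated by Monte Carlo, and the closed-form mean δ√(ν/π) Γ((ν − 1)/2)/Γ(ν/2), with δ = α/√(1 + α²), is the known mean of the standard ST for ν > 1. Draws use the representation X = Z/√(χ²ν/ν) with Z and χ²ν independent.

```python
import math
import random
from statistics import NormalDist, fmean, median

_N01 = NormalDist()


def st_sample(alpha, nu, rng):
    """One skew-t draw via X = Z / sqrt(V / nu), Z skew-normal, V chi-square(nu), independent."""
    z = rng.gauss(0.0, 1.0)
    if rng.random() >= _N01.cdf(alpha * z):  # sign-flip construction of a skew-normal draw
        z = -z
    v = rng.gammavariate(nu / 2.0, 2.0)      # chi-square(nu) = Gamma(shape nu/2, scale 2)
    return z / math.sqrt(v / nu)


def st_mean(alpha, nu):
    """Mean of the standard ST; it exists only for nu > 1."""
    if nu <= 1:
        raise ValueError("E{ST} exists only for nu > 1")
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    b_nu = math.sqrt(nu / math.pi) * math.exp(
        math.lgamma((nu - 1) / 2) - math.lgamma(nu / 2))
    return delta * b_nu


# Hypothetical fitted values, for illustration only.
alpha_hat, nu_hat, scale_hat, beta0_hat = 3.0, 5.0, 2.0, 1.0

rng = random.Random(1)  # arbitrary fixed seed
eps = [scale_hat * st_sample(alpha_hat, nu_hat, rng) for _ in range(50_000)]

intercept_mean = beta0_hat + scale_hat * st_mean(alpha_hat, nu_hat)  # needs nu_hat > 1
intercept_median = beta0_hat + median(eps)                           # Monte Carlo median
print("mean-adjusted intercept:  ", round(intercept_mean, 3))
print("median-adjusted intercept:", round(intercept_median, 3))
```

The median-based adjustment stays defined for any ν > 0, which is one reason to prefer it when the fitted tail parameter is small.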

SLIDE 10

Flexible distribution approach vs M-estimation

  • M-estimates converge to the solution of a non-linear equation: λ(θ) := E{ψ(X, θ)} = 0
  • In the simple location case: λ(θ) := E{ψ(X − θ)} = 0
  • What are we estimating? If the error distribution is not symmetric, there is no explicit solution.
  • In the “robust likelihood” approach we estimate the parameters of the error distribution.
  • Note: there is empirical evidence that real data have asymmetric outliers.
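To make the "what are we estimating?" point concrete, the sketch below solves the sample version of λ(θ) = 0 for a location M-estimate. Huber's ψ with tuning constant 1.345 is an assumed example of ψ, since the slides do not name one. The residuals are left unscaled for simplicity, and the data are made up to have an asymmetric right tail.

```python
from statistics import fmean, median


def huber_psi(r, k=1.345):
    """Huber's psi: identity on [-k, k], clipped to +/- k outside."""
    return max(-k, min(k, r))


def lambda_hat(theta, xs, psi=huber_psi):
    """Sample version of lambda(theta) = E{psi(X - theta)}."""
    return sum(psi(x - theta) for x in xs) / len(xs)


def m_estimate_location(xs, psi=huber_psi, tol=1e-10):
    """Solve lambda_hat(theta) = 0 by bisection.

    psi is non-decreasing, so lambda_hat is non-increasing in theta and
    changes sign between min(xs) and max(xs).
    """
    lo, hi = min(xs), max(xs)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lambda_hat(mid, xs, psi) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Made-up sample with two large positive values (asymmetric "outliers").
xs = [-1.2, -0.7, -0.3, 0.0, 0.2, 0.5, 0.9, 1.4, 6.0, 8.5]
theta_hat = m_estimate_location(xs)
print("median:", median(xs), " M-estimate:", round(theta_hat, 4), " mean:", fmean(xs))
```

On this sample the M-estimate falls strictly between the median and the mean, so it matches neither familiar summary of the distribution.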

SLIDE 11

A simple regression example (Yohai, 1987)

[Figure: international phone calls from Belgium (Yohai, 1987); x-axis: year; y-axis: calls; points marked N and T; fitted lines: LS, MM]

SLIDE 12

A simple regression example (Yohai, 1987)

[Figure: international phone calls from Belgium (Yohai, 1987); x-axis: year; y-axis: calls; points marked N and T; fitted lines: LS, MM, ST]

SLIDE 13

A classical benchmark: stackloss data

(loss function) = $\sum_{i=1}^{n} |y_i - \hat{y}_i|^p$

  p       0.5      1        2
  LS      30.1     49.7     178.8
  MM      27.1     45.3     222.8
  LTS     25.9     44.7     241.7
  ST      25.0     43.4     240.0

(n = 21 with 3 covariates)
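The loss used in the table is straightforward to compute. The sketch below uses a hypothetical helper `lp_loss` and made-up observed and fitted values, not the stackloss data.

```python
import math


def lp_loss(y, y_hat, p):
    """Sum over i of |y_i - y_hat_i|^p."""
    return sum(abs(yi - fi) ** p for yi, fi in zip(y, y_hat))


# Made-up observed and fitted values, for illustration only.
y = [1.0, 2.0, 3.0]
y_hat = [1.0, 1.0, 5.0]

for p in (0.5, 1, 2):
    print(f"p = {p}: loss = {lp_loss(y, y_hat, p):.4f}")
```

Small p downweights large residuals and p = 2 is the least-squares criterion, which is consistent with LS having the smallest loss in the p = 2 column of the table.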

SLIDE 14

Regression with contaminated normal errors

Simulate data from the model y = β0 + β1x + ε, where ε ∼ (1 − π) N(0, 1) + π N(µ1, 3)

  • β0 = β1 = 2
  • π = 0.05, 0.10
  • µ1 = 2.5, 5, 10
  • replicates: 10^4 in each case

[Figure: distribution of errors]
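A scaled-down version of this simulation, restricted to least squares, might look like the sketch below. Several details are assumptions because the slides do not state them: the 3 in N(µ1, 3) is treated as a standard deviation (exposed as the parameter `sd1`), the design is an equally spaced grid of n = 50 points on [0, 10], and only 500 replicates are used to keep the run short. The MM, LTS and ST fits are not reproduced here.

```python
import math
import random


def draw_error(pi, mu1, rng, sd1=3.0):
    """One draw from (1 - pi) N(0, 1) + pi N(mu1, sd1^2); sd1 = 3 is an assumption."""
    if rng.random() < pi:
        return rng.gauss(mu1, sd1)
    return rng.gauss(0.0, 1.0)


def least_squares(x, y):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def rmse_ls(pi, mu1, n=50, replicates=500, beta0=2.0, beta1=2.0, seed=0):
    """Root mean square error of the LS estimates of beta0 and beta1."""
    rng = random.Random(seed)
    x = [10.0 * i / (n - 1) for i in range(n)]  # assumed design, not from the slides
    se0 = se1 = 0.0
    for _ in range(replicates):
        y = [beta0 + beta1 * xi + draw_error(pi, mu1, rng) for xi in x]
        b0, b1 = least_squares(x, y)
        se0 += (b0 - beta0) ** 2
        se1 += (b1 - beta1) ** 2
    return math.sqrt(se0 / replicates), math.sqrt(se1 / replicates)


rmse_low = rmse_ls(pi=0.05, mu1=2.5)
rmse_high = rmse_ls(pi=0.10, mu1=10.0)
print("pi = 0.05, mu1 = 2.5 :", [round(v, 3) for v in rmse_low])
print("pi = 0.10, mu1 = 10  :", [round(v, 3) for v in rmse_high])
```

Under this setup the LS intercept absorbs the mean of the error term, π µ1, so its error grows with both the contamination share and the shift.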

SLIDE 15

Simulation: Root Mean Square Error for β0

[Figure: root mean square error of β̂0 against µ1 for LS, MM, LTS and ST; left panel: contamination = 5%; right panel: contamination = 10%]

SLIDE 16

Simulation: Root Mean Square Error for β1

[Figure: root mean square error of β̂1 against µ1 for LS, MM, LTS and ST; left panel: contamination = 5%; right panel: contamination = 10%]

SLIDE 17

Summary

  • ST and other flexible families of distributions allow regulation of skewness and kurtosis
  • the corresponding likelihood inference appears reliable even when used outside the parametric class
  • advantages are:
      a probability model is fitted to the data
      the quantities being estimated are explicitly known

SLIDE 18

References & resources

  • Genton, M. G. (2004). Skew-elliptical distributions. . . Edited volume.
  • Azzalini, A. (2005). Scand. J. Stat., vol. 32. Review paper with discussion.
  • Azzalini, A. & Genton, M. G. (2008). Robust likelihood methods based on the skew-t and related distributions. Int. Statist. Rev., 76, 106–129.
  • Resources: http://azzalini.stat.unipd.it/SN/