Lecture 12: Extreme Value Theory (Applied Statistics 2015)

SLIDE 1

A real problem Extreme Value Theory

Lecture 12: Extreme Value Theory

Applied Statistics 2015

SLIDE 2

This problem concerns the safety of a sea dike. There have been 1965 storms during 122 years at the station of Hoek van Holland. The corresponding sea water levels have been recorded: $x_1, \ldots, x_{1965}$.

[Figure: sea water levels $x_i$ plotted against the storm index $i$.]

SLIDE 3

Let $X$ denote the sea water level when a storm happens, and let $F$ be its distribution function. Assuming that the data are realizations of a random sample from $X$, we want to answer the following two questions.

Question 1: Let $y$ be the height of the current dike, say $y = 500$ cm. How do we estimate the probability of a flood, i.e. $p_0 := P(X > y) = 1 - F(y)$?

Question 2: Let $p = 10^{-5}$. How do we estimate the quantile $x_{1-p}$ such that $P(X > x_{1-p}) = p$?

SLIDE 4

The failure of the empirical distribution function

The two questions concern the estimation of a distribution function and of a quantile. Can we use the empirical estimators that we discussed in Lecture 1?

[Figure: EDF of sea water level, $F_n(x)$ against $x$.]

We have $\hat{p}_0 = 1 - \hat{F}_n(500) = 0$, which underestimates the probability, and $\hat{x}_{1-p} = x_{n,n} = 409$, which underestimates the quantile.
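The failure can be reproduced on any sample. The sketch below is only an illustration: the data are synthetic (a gamma sample standing in for the sea water levels, which are not available here), and the helper names are made up for this example. It shows that the empirical exceedance probability is exactly zero beyond the sample maximum, and that the empirical quantile at level $1 - 10^{-5}$ is the sample maximum whenever $n < 10^5$.

```python
import numpy as np

def empirical_exceedance(sample, y):
    """Empirical estimate of P(X > y): the fraction of observations above y."""
    sample = np.asarray(sample, dtype=float)
    return float(np.mean(sample > y))

def empirical_quantile(sample, prob):
    """Smallest order statistic x with F_n(x) >= prob."""
    ordered = np.sort(np.asarray(sample, dtype=float))
    n = ordered.size
    idx = int(np.ceil(prob * n)) - 1
    return float(ordered[min(max(idx, 0), n - 1)])

# Synthetic stand-in for the 1965 recorded levels (an assumption, not the real data).
rng = np.random.default_rng(0)
levels = rng.gamma(shape=4.0, scale=30.0, size=1965)

# Any level above the sample maximum gets empirical probability zero.
p_hat = empirical_exceedance(levels, levels.max() + 1.0)

# With n = 1965 < 1/p, the empirical (1 - p)-quantile is just the maximum.
q_hat = empirical_quantile(levels, 1 - 1e-5)
```

Both estimators are capped by the largest observation, which is why they cannot say anything about levels that have not yet been observed.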

SLIDE 5

We aim to estimate the probability of an event that has never occurred, and we do not want to assume any parametric model. Is it mission impossible? Extreme value theory is particularly useful for statistical inference on rare events:

⋄ a crash of a stock market
⋄ a large insurance claim
⋄ an extreme temperature
⋄ a wind storm

The key is to zoom in on the tail of $F$.

SLIDE 6

We start with the distribution of $X_{n,n}$, the maximum of a random sample $X_1, X_2, \ldots, X_n$:

$$P(X_{n,n} \le x) = P(X_1 \le x, \ldots, X_n \le x) = F^n(x).$$

What can we say about $F^n(x)$ as $n \to \infty$? Let $x^* = \sup\{x : F(x) < 1\}$ be the right endpoint of the distribution of $X$. The maximum $X_{n,n}$ converges in probability to $x^*$, so the limit of its distribution function is degenerate:

$$\lim_{n\to\infty} F^n(x) = \begin{cases} 0, & \text{if } x < x^*; \\ 1, & \text{if } x \ge x^*. \end{cases}$$

To get a non-degenerate limit distribution, a normalization is necessary.
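A quick numerical illustration of the degenerate limit, assuming (purely for this example) that $X$ is uniform on $(0, 1)$, so that $F(x) = x$ on the unit interval and $x^* = 1$. The function name is hypothetical.

```python
def cdf_of_maximum(cdf_value, n):
    """P(X_{n,n} <= x) = F(x)^n, given the value F(x) of the parent cdf."""
    return cdf_value ** n

sample_sizes = (10, 1_000, 100_000)

# Just below the endpoint, F(0.99) = 0.99 and F^n(0.99) collapses to 0.
below = [cdf_of_maximum(0.99, n) for n in sample_sizes]

# At the endpoint, F(1) = 1 and F^n(1) stays at 1 for every n.
at_endpoint = [cdf_of_maximum(1.0, n) for n in sample_sizes]
```

However close $x$ is to $x^*$, the value $F^n(x)$ is eventually driven to 0, so without rescaling the limit carries no information about the tail.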

SLIDE 7

Definition

Suppose there exist sequences of constants $a_n > 0$ and $b_n$ real such that, for a non-degenerate distribution function $G$,

$$\lim_{n\to\infty} P\left(\frac{X_{n,n} - b_n}{a_n} \le x\right) = \lim_{n\to\infty} F^n(a_n x + b_n) = G(x). \tag{1}$$

$G$ is the so-called extreme value distribution. $F$ is said to be in the domain of attraction of $G$. Notation: $F \in D(G)$. The tail behavior of $F$ is very much captured by $G$.

Compare (1) with the central limit theorem: assuming that $E(X_1^2) < \infty$,

$$\lim_{n\to\infty} P\left(\frac{\sum_{i=1}^{n} X_i - nE(X_1)}{\sqrt{n\,\mathrm{Var}(X_1)}} \le x\right) = \Phi(x).$$
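As a concrete instance of (1), consider the standard exponential distribution, $F(x) = 1 - e^{-x}$. This example is not in the slides; it uses the well-known choice $a_n = 1$, $b_n = \log n$, for which $F^n(x + \log n) = (1 - e^{-x}/n)^n \to \exp(-e^{-x})$. The sketch below, with hypothetical function names, compares the exact distribution of the normalized maximum with that limit.

```python
import math

def normalized_max_cdf(x, n):
    """P(X_{n,n} - log n <= x) for n i.i.d. standard exponential variables."""
    u = x + math.log(n)          # a_n * x + b_n with a_n = 1, b_n = log n
    if u <= 0:
        return 0.0               # the exponential cdf is 0 on the negative axis
    return (1.0 - math.exp(-u)) ** n

def gumbel_cdf(x):
    """Limit distribution exp(-exp(-x))."""
    return math.exp(-math.exp(-x))

# Approximation error at x = 1 for increasing sample sizes.
errors = [abs(normalized_max_cdf(1.0, n) - gumbel_cdf(1.0)) for n in (10, 1_000, 1_000_000)]
```

The error shrinks as $n$ grows, in the same way that a standardized sum approaches $\Phi$ in the central limit theorem.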

SLIDE 8

Which probability distribution functions $G$ can occur as a limit in (1)?

What are necessary and sufficient conditions on the initial distribution $F$ for it to be in the max-domain of attraction of some $G$?

SLIDE 9

Extreme value distributions

Theorem

(Fisher and Tippett (1928), Gnedenko (1943)) The class of extreme value distribution functions is $G_\gamma(ax + b)$ with $a > 0$ and $b$ real, where

$$G_\gamma(x) = \exp\left(-(1 + \gamma x)^{-1/\gamma}\right), \quad 1 + \gamma x > 0,$$

with $\gamma$ real, and where for $\gamma = 0$ the right-hand side is read as $\exp(-\exp(-x))$.

$\gamma$ is called the extreme value index. It is a key parameter in the statistics of extremes: it characterizes the heaviness of the tail of the distribution.
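A minimal sketch of $G_\gamma$ as a function, to make the role of the convention at $\gamma = 0$ concrete. The function name and the way values outside the support are handled (0 to the left of the support, 1 to the right) are choices made for this illustration.

```python
import math

def gev_cdf(x, gamma):
    """Standard extreme value distribution G_gamma(x)."""
    if gamma == 0.0:
        return math.exp(-math.exp(-x))       # Gumbel case, the limit as gamma -> 0
    z = 1.0 + gamma * x
    if z <= 0.0:
        # Outside the support: left of it when gamma > 0, right of it when gamma < 0.
        return 0.0 if gamma > 0 else 1.0
    return math.exp(-z ** (-1.0 / gamma))

# The gamma = 0 formula is the continuous limit of the general one.
near_zero = gev_cdf(0.5, 1e-8)
gumbel = gev_cdf(0.5, 0.0)
```

For $\gamma < 0$ the distribution has the finite right endpoint $-1/\gamma$, and for $\gamma > 0$ it has the finite left endpoint $-1/\gamma$.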

SLIDE 10

Extreme value index

According to the sign of $\gamma$, the distributions fall into three categories.

For $\gamma > 0$, the distribution has a heavy right tail: the right endpoint is infinite and the moments of order greater than $1/\gamma$ do not exist. Examples of distributions in this domain of attraction are the Cauchy, Student and Pareto distributions.

For $\gamma = 0$, the distribution has a light right tail. The right endpoint can be either finite or infinite, and moments of any order exist. Examples are the normal and Gamma distributions.

For $\gamma < 0$, the distribution has a finite right endpoint. The uniform distribution is one example.

SLIDE 11

Peaks over threshold

Theorem

$F \in D(G_\gamma)$ if and only if there exists a positive function $g$ such that, for all $x$ with $1 + \gamma x > 0$,

$$\lim_{t \uparrow x^*} P\left(\frac{X - t}{g(t)} > x \,\middle|\, X > t\right) = (1 + \gamma x)^{-1/\gamma}. \tag{2}$$

Let $H_\gamma(x) = 1 - (1 + \gamma x)^{-1/\gamma}$, a generalized Pareto distribution (GPD). This theorem is the basis of the peaks over threshold approach in EVT: the conditional distribution of $\frac{X - t}{g(t)}$ given that $X > t$ has a GPD limit distribution.
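A worked example may help here. It is not taken from the slides: assume $X$ has an exact Pareto tail, $1 - F(x) = x^{-\alpha}$ for $x \ge 1$ with $\alpha > 0$, and take $\gamma = 1/\alpha$ and $g(t) = \gamma t$. Then, for $x \ge 0$ and $t \ge 1$,

$$P\left(\frac{X - t}{g(t)} > x \,\middle|\, X > t\right) = \frac{1 - F(t + \gamma t x)}{1 - F(t)} = \left(\frac{t(1 + \gamma x)}{t}\right)^{-\alpha} = (1 + \gamma x)^{-1/\gamma}.$$

In this special case (2) holds with equality for every threshold $t \ge 1$, not only in the limit. For other distributions in $D(G_\gamma)$ the relation is only approximate, and it improves as $t$ moves towards $x^*$.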

SLIDE 12

Approximation of a tail probability

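Roughly,

$$P(X > y \mid X > t) \approx \left(1 + \gamma\,\frac{y - t}{g(t)}\right)^{-1/\gamma}$$

for large $t$ and $y > t$. Choose $t = X_{n-k,n}$, where $k \ll n$. Since $P(X > y) = P(X > t)\,P(X > y \mid X > t)$ for $y > t$, and about $k$ of the $n$ observations exceed $X_{n-k,n}$, we have

$$P(X > y) \approx \left(1 - F(X_{n-k,n})\right)\left(1 + \gamma\,\frac{y - X_{n-k,n}}{g(X_{n-k,n})}\right)^{-1/\gamma} \approx \frac{k}{n}\left(1 + \gamma\,\frac{y - X_{n-k,n}}{g(X_{n-k,n})}\right)^{-1/\gamma}. \tag{3}$$

This approximation is valid for any $y > X_{n-k,n}$, and even for $y > X_{n,n}$.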
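Approximation (3) translates directly into a small function. The sketch below is one possible way to code it; the function name, the argument names and the numerical inputs are all illustrative assumptions, with `threshold` standing for $X_{n-k,n}$ and `scale` for $g(X_{n-k,n})$.

```python
import math

def tail_probability(y, threshold, k, n, gamma, scale):
    """Approximation (3): (k/n) * (1 + gamma * (y - threshold) / scale)^(-1/gamma)."""
    if y < threshold:
        raise ValueError("approximation (3) is only meant for y above the threshold")
    excess = (y - threshold) / scale
    if gamma == 0.0:
        return (k / n) * math.exp(-excess)   # limit of the formula as gamma -> 0
    z = 1.0 + gamma * excess
    if z <= 0.0:
        return 0.0                           # gamma < 0 and y beyond the fitted endpoint
    return (k / n) * z ** (-1.0 / gamma)

# Hypothetical inputs, chosen only for illustration.
at_threshold = tail_probability(200.0, threshold=200.0, k=200, n=2000, gamma=0.1, scale=25.0)
further_out = tail_probability(500.0, threshold=200.0, k=200, n=2000, gamma=0.1, scale=25.0)
```

At the threshold itself the function returns $k/n$, the empirical exceedance probability, and it then decays smoothly, so it assigns a positive probability to levels above the sample maximum.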

SLIDE 13

Approximation of a high quantile

For a given small $p$, we want to find an approximation for $x_{1-p}$ such that $P(X > x_{1-p}) = p$. From (3),

$$p \approx \frac{k}{n}\left(1 + \gamma\,\frac{x_{1-p} - X_{n-k,n}}{g(X_{n-k,n})}\right)^{-1/\gamma}.$$

Hence,

$$x_{1-p} \approx X_{n-k,n} + g(X_{n-k,n})\,\frac{\left(\frac{k}{np}\right)^{\gamma} - 1}{\gamma}.$$

In order to estimate a tail probability or a high quantile, we need estimators of $\gamma$ and $g(X_{n-k,n})$.
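The inversion behind "Hence" can be spelled out in three steps. Writing $t = X_{n-k,n}$ for brevity and assuming $\gamma \neq 0$:

$$\frac{np}{k} \approx \left(1 + \gamma\,\frac{x_{1-p} - t}{g(t)}\right)^{-1/\gamma} \;\Longrightarrow\; \left(\frac{k}{np}\right)^{\gamma} \approx 1 + \gamma\,\frac{x_{1-p} - t}{g(t)} \;\Longrightarrow\; x_{1-p} \approx t + g(t)\,\frac{\left(\frac{k}{np}\right)^{\gamma} - 1}{\gamma}.$$

For $\gamma = 0$ the same steps applied to the exponential form of the tail give $x_{1-p} \approx t + g(t)\log\frac{k}{np}$, which is also the limit of the expression above as $\gamma \to 0$.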

SLIDE 14

Estimation

Approximately, the exceedances of the upper order statistics $X_{n-k+1,n}, \ldots, X_{n,n}$ over the threshold $X_{n-k,n}$ can be viewed as a sample from a GPD with $g(X_{n-k,n})$ as the scale parameter and $\gamma$ as the shape parameter. Estimators of $\gamma$ and $g(X_{n-k,n})$ can be naturally developed based on those upper order statistics. There are the MLE and the moment estimator, among others; see Chapter 3 of Extreme Value Theory: An Introduction.

Question 1: The estimator of a tail probability is given by

$$\hat{p}_0 = \frac{k}{n}\left(1 + \hat{\gamma}\,\frac{y - X_{n-k,n}}{\hat{a}_{nk}}\right)^{-1/\hat{\gamma}},$$

where $\hat{\gamma}$ and $\hat{a}_{nk}$ are the estimators of $\gamma$ and $g(X_{n-k,n})$, respectively.

Question 2: The estimator of a high quantile is given by

$$\hat{x}_{1-p} = X_{n-k,n} + \hat{a}_{nk}\,\frac{\left(\frac{k}{np}\right)^{\hat{\gamma}} - 1}{\hat{\gamma}}.$$
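The slides name the moment estimator without writing it out. The sketch below assumes the usual form built from the first two empirical moments of the log-exceedances, $M^{(j)} = \frac{1}{k}\sum_{i=0}^{k-1}(\log X_{n-i,n} - \log X_{n-k,n})^j$, with $\hat{\gamma} = M^{(1)} + \hat{\gamma}_-$, $\hat{\gamma}_- = 1 - \frac{1}{2}\left(1 - (M^{(1)})^2 / M^{(2)}\right)^{-1}$ and scale estimate $X_{n-k,n}\,M^{(1)}(1 - \hat{\gamma}_-)$. Whether this matches the course's exact definition is an assumption; the function name and the synthetic Pareto data are illustrative only, and the estimator requires a positive threshold.

```python
import numpy as np

def moment_estimates(sample, k):
    """Return (threshold, gamma_hat, scale_hat) from the k largest observations."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    threshold = x[n - k - 1]                      # X_{n-k,n}
    if threshold <= 0:
        raise ValueError("the log-based moments need a positive threshold")
    log_excess = np.log(x[n - k:]) - np.log(threshold)
    m1 = np.mean(log_excess)                      # M^(1), the Hill estimator
    m2 = np.mean(log_excess ** 2)                 # M^(2)
    gamma_minus = 1.0 - 0.5 / (1.0 - m1 ** 2 / m2)
    gamma_hat = m1 + gamma_minus
    scale_hat = threshold * m1 * (1.0 - gamma_minus)
    return float(threshold), float(gamma_hat), float(scale_hat)

# Synthetic check on a Pareto sample with tail index alpha = 2, i.e. gamma = 0.5.
rng = np.random.default_rng(1)
pareto_sample = rng.pareto(2.0, size=20_000) + 1.0
threshold, gamma_hat, scale_hat = moment_estimates(pareto_sample, k=1_000)
```

For an exact Pareto tail the scale function is $g(t) = \gamma t$, so on this synthetic sample the ratio `scale_hat / threshold` should also be close to 0.5. The choice of $k$ trades bias against variance and is not addressed by this sketch.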

SLIDE 15

Choose $k = 200$. Moment estimates: $\hat{\gamma} = 0.052$ and $\hat{g}(X_{n-k,n}) = 25.59$. The estimate of $P(X > 500)$ is $1.238 \times 10^{-5}$. For $p = 10^{-5}$, $\hat{x}_{1-p} = 508$. The figure plots the estimates of $P(X > y)$ against $y$.

[Figure: Tail probability of sea water level, estimated $P(X > y)$ against $y$.]
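The reported numbers can be checked against each other. The slide does not state the threshold $X_{n-k,n}$, so the sketch below back-calculates an implied threshold from the reported tail probability at 500 and then feeds it into the quantile formula. This is only a consistency check on rounded figures, not a reproduction of the original computation, and the function names are hypothetical.

```python
def excess_term(prob, k, n, gamma, scale):
    """scale * ((k / (n * prob))^gamma - 1) / gamma, the distance above the threshold."""
    return scale * ((k / (n * prob)) ** gamma - 1.0) / gamma

def implied_threshold(y, p_y, k, n, gamma, scale):
    """Threshold that makes approximation (3) return p_y at level y."""
    return y - excess_term(p_y, k, n, gamma, scale)

def high_quantile(p, threshold, k, n, gamma, scale):
    """Quantile approximation: threshold plus the GPD excess at tail probability p."""
    return threshold + excess_term(p, k, n, gamma, scale)

# Values reported on the slide.
n, k = 1965, 200
gamma_hat, scale_hat = 0.052, 25.59

# Not reported on the slide: inferred here from P(X > 500) = 1.238e-5.
t_implied = implied_threshold(500.0, 1.238e-5, k, n, gamma_hat, scale_hat)

# Quantile for p = 1e-5 using the implied threshold.
q = high_quantile(1e-5, t_implied, k, n, gamma_hat, scale_hat)
```

Under these assumptions `q` comes out between 508 and 509, in line with the reported $\hat{x}_{1-p} = 508$ up to rounding of the inputs.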

SLIDE 16

We use upper order statistics, i.e., observations in the tail of the sample, to estimate the tail of the population. Based on the estimated tail distribution, we do statistical inference for extreme events which can be outside the range of the data.

[Figure: sea water levels $x_i$ plotted against the index $i$.]

[Figure: Histogram of sea water level, density against data.]

SLIDE 17

Master Projects

Extreme value theory is an amazing topic! There are a lot of challenging and interesting theoretical problems. Moreover, statistics of extremes has extensive applications in various fields such as finance, actuarial science, environmental science, hydrology and climate change. If you want to write your master thesis on this topic, please contact j.j.cai@tudelft.nl.

SLIDE 18

A large part of this course is based on the materials developed by Dr. Frank van der Meulen from TU Delft and Dr. Rui Castro from TU Eindhoven. All the mistakes are Juan’s.

Any comments and suggestions for this course will be highly appreciated. Please send me emails. I have already received some during the lectures, so thanks!
