 
Change Point Analysis of Extreme Values

Goedele Dierckx
Economische Hogeschool Sint Aloysius, Brussels, Belgium
e-mail: goedele.dierckx@hubrussel.be

Jef L. Teugels
Katholieke Universiteit Leuven, Belgium & EURANDOM, Technical University Eindhoven, the Netherlands
e-mail: jef.teugels@wis.kuleuven.be

Change Point Analysis of Extreme Values – TIES 2008
Overview

1. Introduction
2. Test statistic
   (a) Construction
   (b) Extreme value situation
   (c) Asymptotics
   (d) Practical procedure
3. Examples
   (a) Simulation
   (b) Malaysian Stock Index
       • Classical approach
       • Improved approach
   (c) Nile data
   (d) Swiss-Re catastrophic data
4. Conclusions
5. References
1. INTRODUCTION

We start with an example in which a change point has occurred: 987 measurements of the daily stock market returns of the Malaysian Stock Index, January 1995 to December 1998, a period that covers the Asian financial crisis of July 1997.

[Figure: daily returns (vertical axis from -0.1 to 0.2) against observation number 1, 250, 500, 750, 987, corresponding to the dates 1/1/95, 10/1/96, 14/1/97, 15/1/98, 31/12/98.]

Changes in
• the distribution?
• the parameters of a distribution?
  ◦ central behavior?
  ◦ tail behavior?
2. TEST STATISTIC

2.a. Construction of Test Statistic

Start with a sample $X_1, \ldots, X_{m^*}, X_{m^*+1}, \ldots, X_n$, where $X_i$ has density function $f(x; \theta_i, \eta)$. Csörgő and Horváth (1997) test whether $\theta_i$ changes at some point $m^*$:

$$H_0: \theta_1 = \theta_2 = \ldots = \theta_n \quad \text{versus} \quad H_1: \theta_1 = \ldots = \theta_{m^*} \neq \theta_{m^*+1} = \ldots = \theta_n \ \text{for some } m^*,$$

using the test statistic

$$Z_n = \sqrt{\max_{1 \le m < n} (-2 \log \Lambda_m)},$$

where

$$\Lambda_m = \frac{\sup_{\theta,\eta} \prod_{i=1}^{n} f(X_i; \theta, \eta)}{\sup_{\theta,\tau,\eta} \prod_{i=1}^{m} f(X_i; \theta, \eta) \prod_{i=m+1}^{n} f(X_i; \tau, \eta)}.$$
Example

For the exponential distribution, where $X_i$ has mean $\theta_i$,

$$-2 \log \Lambda_m = 2 \left[ -m \log\left( \frac{1}{m} \sum_{i=1}^{m} X_i \right) - (n-m) \log\left( \frac{1}{n-m} \sum_{i=m+1}^{n} X_i \right) + n \log\left( \frac{1}{n} \sum_{i=1}^{n} X_i \right) \right].$$

For large $n$, $m$ and $n-m$ one can expect 'normal' behaviour, expressed in terms of Brownian motions.
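The exponential likelihood ratio can be evaluated directly from the three sample means. The following is a minimal Python sketch under stated assumptions, not the authors' code: the function names `neg2_log_lambda_exp` and `scan_exponential` are hypothetical, the data are assumed to be strictly positive, and the toy data are made up. The scan returns the largest value of $-2 \log \Lambda_m$ and the split where it occurs; $Z_n$ is then obtained from that maximum.

```python
import math

def neg2_log_lambda_exp(x, m):
    """Return -2 log Lambda_m for exponential data split after position m."""
    n = len(x)
    if not 1 <= m < n:
        raise ValueError("m must satisfy 1 <= m < n")
    mean_left = sum(x[:m]) / m            # (1/m) * sum of X_1..X_m
    mean_right = sum(x[m:]) / (n - m)     # (1/(n-m)) * sum of X_{m+1}..X_n
    mean_all = sum(x) / n                 # (1/n) * sum of X_1..X_n
    return 2.0 * (n * math.log(mean_all)
                  - m * math.log(mean_left)
                  - (n - m) * math.log(mean_right))

def scan_exponential(x):
    """Return (largest -2 log Lambda_m, the split m where it is attained)."""
    n = len(x)
    best_m = max(range(1, n), key=lambda m: neg2_log_lambda_exp(x, m))
    return neg2_log_lambda_exp(x, best_m), best_m

# Toy data (made up): the mean jumps from 1 to 5 after the fourth observation.
x = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]
stat, m_hat = scan_exponential(x)
```

On the toy data the scan picks the split after the fourth observation, which is where the mean was changed.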
2.b. Extreme Value Situation

Assume that $X_{n,n}$ is the maximum in a sample of $n$ independent random variables with a common distribution.

Maximum domain of attraction condition:

$$\lim_{n \to \infty} P\left( \frac{X_{n,n} - b_n}{a_n} \le x \right) = G_\gamma(x).$$

Under very weak conditions we get the approximation

$$P(X_{n,n} \le y) \approx G_\gamma\left( \frac{y - b_n}{a_n} \right),$$

where $\gamma$ is a real-valued extreme value index and

$$G_\gamma(x) = \exp\left\{ -(1 + \gamma x)_+^{-1/\gamma} \right\}$$

is an extremal law. When $\gamma > 0$ we end up with heavy right-tailed distributions, the Pareto-Fréchet case.
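To make the extremal law concrete, here is a small Python sketch of $G_\gamma$ with the positive-part convention. The function name `extremal_cdf` is hypothetical, and the $\gamma = 0$ branch uses the standard Gumbel limit $\exp(-e^{-x})$, which the slide does not spell out.

```python
import math

def extremal_cdf(x, gamma):
    """G_gamma(x) = exp(-(1 + gamma*x)_+^(-1/gamma)); Gumbel limit at gamma = 0."""
    if gamma == 0.0:
        return math.exp(-math.exp(-x))
    t = 1.0 + gamma * x
    if t <= 0.0:
        # Outside the support: at or below the lower endpoint when gamma > 0,
        # at or above the upper endpoint when gamma < 0.
        return 0.0 if gamma > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / gamma))

# Heavy-tailed (Pareto-Frechet) case, gamma > 0.
p = extremal_cdf(0.0, 1.0)
```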
We concentrate on changes in the parameters that describe the tail of distributions appearing in extreme value analysis.

• $X$ has a Pareto-type distribution with parameter $\theta = \gamma$ when the relative excesses of $X$ over a high threshold $u$, given that $X$ exceeds $u$, satisfy the condition

$$P\left( \frac{X}{u} > x \,\Big|\, X > u \right) \to x^{-1/\gamma}, \quad u \to \infty.$$

• More generally, $X$ follows a Generalized Pareto distribution (GPD) with parameter $\theta = (\gamma, \sigma)$ if the absolute excesses over a high threshold $u$ satisfy the condition

$$P(X - u > x \mid X > u) \to \left( 1 + \frac{\gamma x}{\sigma} \right)^{-1/\gamma}, \quad u \to \infty.$$
For a high threshold $u$, the log-excess $\log(X_i/u)$ of a Pareto-type variable with extreme value index $\gamma_i$, given $X_i > u$, is approximately exponential with mean $\gamma_i$.

• The most classical approach to estimating the extreme value index $\gamma > 0$ is the Hill estimator:

$$H_{k,n} = \frac{1}{k} \sum_{i=1}^{k} \log X_{n-i+1,n} - \log X_{n-k,n}.$$

Hence only a segment of the available data is used.

• The choice of $k$ is important. Alternatively, we look at the extremes above a threshold $u = X_{n-k,n}$. The Hill estimator has
  ◦ small bias but large variance for small $k$,
  ◦ large bias but small variance for large $k$.

As a compromise we select $k$ such that the empirical mean squared error is minimal.
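The Hill estimator is short enough to write out in full. The sketch below is an illustration only: the name `hill_estimator` is hypothetical, the data are assumed positive, and the usage example draws from an exact Pareto law via $X = U^{-\gamma}$, which is a simplifying assumption and not one of the data sets in the talk.

```python
import math
import random

def hill_estimator(x, k):
    """Hill estimator H_{k,n}: mean of the k largest log-values minus log X_{n-k,n}."""
    n = len(x)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    xs = sorted(x)                        # xs[j-1] is the order statistic X_{j,n}
    threshold = xs[n - k - 1]             # X_{n-k,n}
    if threshold <= 0:
        raise ValueError("the threshold X_{n-k,n} must be positive")
    top_logs = sum(math.log(xs[n - i]) for i in range(1, k + 1))  # X_{n-i+1,n}
    return top_logs / k - math.log(threshold)

# Usage on an exact Pareto sample with gamma = 0.5 (arbitrary seed).
rng = random.Random(1)
pareto = [(1.0 - rng.random()) ** -0.5 for _ in range(2000)]
gamma_hat = hill_estimator(pareto, k=200)
```

Evaluating `hill_estimator` over a range of `k` values is one way to see the bias and variance trade-off described in the bullet list.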
1. Pareto-type density

Suppose $X_1, \ldots, X_m, X_{m+1}, \ldots, X_n$ are independent and Pareto-type distributed. We denote the extreme value index of $X_i$ by $\gamma_i$. To determine whether the index $\gamma$ changes, we perform the test

$$H_0: \gamma_1 = \gamma_2 = \ldots = \gamma_n = \gamma \quad \text{versus} \quad H_1: \gamma_1 = \ldots = \gamma_{m^*} \neq \gamma_{m^*+1} = \ldots = \gamma_n \ \text{for some } m^*.$$

Hence

$$Z_n = \sqrt{\max_{1 \le m < n} (-2 \log \Lambda_m)},$$

where in turn

$$\log \Lambda_m = \left( k_1 \log H_{k_1,m} + (k - k_1) \log H_{k-k_1,n-m} - k \log H_{k,n} \right) + \frac{1}{H_{k,n}} \left( k_1 H_{k_1,m} + (k - k_1) H_{k-k_1,n-m} - k H_{k,n} \right).$$
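A rough Python sketch of the scan over $m$ for the Pareto-type case follows. It rests on assumptions that the slide does not state: $k_1$ is taken to be the number of exceedances of the common threshold $u = X_{n-k,n}$ among $X_1, \ldots, X_m$, and both segment estimates are computed as mean log-excesses over that same $u$. Under that choice $k_1 H_{k_1,m} + (k - k_1) H_{k-k_1,n-m} = k H_{k,n}$, so the second bracket of $\log \Lambda_m$ is zero and only the logarithmic terms remain. The function name `pareto_change_scan` and the parameter `k_min` are hypothetical.

```python
import math

def pareto_change_scan(x, k, k_min=1):
    """Return (largest -2 log Lambda_m, split m), or None if no split is admissible."""
    n = len(x)
    if not 1 <= k < n or k_min < 1:
        raise ValueError("need 1 <= k < n and k_min >= 1")
    u = sorted(x)[n - k - 1]                       # common threshold X_{n-k,n}
    if u <= 0:
        raise ValueError("the threshold must be positive")
    # Log-excess over u for exceedances, None for the other observations.
    log_exc = [math.log(v / u) if v > u else None for v in x]
    k_all = sum(e is not None for e in log_exc)
    if k_all == 0:
        return None
    s_all = sum(e for e in log_exc if e is not None)
    h_all = s_all / k_all                          # estimate from the whole sample
    best, k1, s1 = None, 0, 0.0
    for m in range(1, n):
        e = log_exc[m - 1]
        if e is not None:
            k1 += 1
            s1 += e
        k2 = k_all - k1
        if k1 < k_min or k2 < k_min:
            continue                               # too few exceedances in a segment
        h1, h2 = s1 / k1, (s_all - s1) / k2
        stat = -2.0 * (k1 * math.log(h1) + k2 * math.log(h2)
                       - k_all * math.log(h_all))
        if best is None or stat > best[0]:
            best = (stat, m)
    return best

# Toy data (made up): small exceedances first, large exceedances later.
toy = [1.0, 2.0, 1.0, 2.0, 1.0, 8.0, 1.0, 8.0, 1.0]
stat, m_hat = pareto_change_scan(toy, k=4)
```

Because the weighted average of the two segment estimates equals the overall estimate, concavity of the logarithm keeps every value of the statistic non-negative in this sketch.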
2. GPD

Suppose now that $X_i$ is GPD with parameters $\theta_i = (\gamma_i, \sigma_i)$. To perform the test

$$H_0: \theta_1 = \theta_2 = \ldots = \theta_n \quad \text{versus} \quad H_1: \theta_1 = \ldots = \theta_{m^*} \neq \theta_{m^*+1} = \ldots = \theta_n \ \text{for some } m^*,$$

we use as test statistic $Z_n = \sqrt{\max_{1 \le m < n} (-2 \log \Lambda_m)}$, where

$$-2 \log \Lambda_m = 2 \left[ L_{k_1}(\hat\theta_{k_1}) + L^+_{k_1}(\hat\theta^+_{k_1}) - L_k(\hat\theta_k) \right],$$

$$L_m(\hat\theta_m) = -m \log \hat\sigma_m - \left( \frac{1}{\hat\gamma_m} + 1 \right) \sum_{i=1}^{m} \log\left( 1 + \frac{\hat\gamma_m}{\hat\sigma_m} x_i \right),$$

$$L^+_m(\hat\theta^+_m) = -(n-m) \log \hat\sigma^+_m - \left( \frac{1}{\hat\gamma^+_m} + 1 \right) \sum_{i=m+1}^{n} \log\left( 1 + \frac{\hat\gamma^+_m}{\hat\sigma^+_m} x_i \right).$$

The likelihood estimators $(\hat\gamma_m, \hat\sigma_m)$ and $(\hat\gamma^+_m, \hat\sigma^+_m)$, based on $X_1, X_2, \ldots, X_m$ and $X_{m+1}, \ldots, X_n$ respectively, are obtained by numerical procedures.
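The GPD log-likelihood itself is easy to evaluate once parameter values are available; the numerical maximisation is the hard part and is left out here. The sketch below is an assumption-laden illustration: the names `gpd_loglik` and `neg2_log_lambda_from_fits` are hypothetical, the inputs are taken to be excesses over the threshold, and the case $\gamma = 0$ is not handled.

```python
import math

def gpd_loglik(excesses, gamma, sigma):
    """-m log(sigma) - (1/gamma + 1) * sum(log(1 + gamma * x / sigma)), gamma != 0."""
    if sigma <= 0 or gamma == 0:
        raise ValueError("need sigma > 0 and gamma != 0")
    total = 0.0
    for x in excesses:
        t = 1.0 + gamma * x / sigma
        if t <= 0.0:
            return -math.inf              # x lies outside the support for these values
        total += math.log(t)
    return -len(excesses) * math.log(sigma) - (1.0 / gamma + 1.0) * total

def neg2_log_lambda_from_fits(loglik_left, loglik_right, loglik_all):
    """Combine three maximised log-likelihoods into -2 log Lambda_m."""
    return 2.0 * (loglik_left + loglik_right - loglik_all)

# Single excess x = 1 with gamma = sigma = 1 gives -2 log 2.
value = gpd_loglik([1.0], gamma=1.0, sigma=1.0)
```

In practice the three log-likelihoods passed to `neg2_log_lambda_from_fits` would come from a numerical optimiser applied to the left segment, the right segment and the full sample.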
2.c. Asymptotics

Using the procedure suggested by Csörgő and Horváth we have:

Theorem. Suppose $X_1, \ldots, X_m, X_{m+1}, \ldots, X_n$ are independent and identically distributed. We set the threshold at $u = X_{n-k,n}$. Define

$$Z_n = \sqrt{\max_{c_n \le m < n - d_n} (-2 \log \Lambda_m)},$$

with $-2 \log \Lambda_m$ as before. Let $n, k \to \infty$ such that $k/n \to 0$. Let further $c_n$ and $d_n$ be intermediate sequences for which $c_n/n \to 0$ and $d_n/n \to 0$. Then, under $H_0$ of our test,

$$Z_n \xrightarrow{d} \begin{cases} \sup_{0 \le t < 1} \sqrt{\dfrac{B^2(t)}{t(1-t)}} & \text{if Pareto-type}, \\[2ex] \sup_{0 \le t < 1} \sqrt{\dfrac{B_2^2(t)}{t(1-t)}} & \text{if GPD}. \end{cases}$$

$B(t)$ is a Brownian bridge; $B_2(t)$ is a sum of two independent Brownian bridges.
2.d. Practical Procedure

Consecutive steps:

1. Check the Pareto-type behavior of the data with Q-Q plots.
2. Select a threshold $u$, or the value $k = k_{opt,n}$ that minimizes the asymptotic mean squared error of the Hill estimator. We choose the optimal threshold $u = X_{n-k_{opt,n},\,n}$.
3. (a) Define $c_n$ as the smallest number such that at least $k_{min} = (\log k_{opt,n})^{3/2}$ of the data points $X_1, \ldots, X_{c_n}$ are larger than $u$.
   (b) Define $d_n$ as the smallest number such that at least $k_{min}$ of the data points $X_{n-d_n+1}, \ldots, X_n$ are larger than $u$.
4. Repeat the next steps for all $m$ from $c_n$ up to $n - d_n$.
   (a) Split the data into two groups, $X_1, X_2, \ldots, X_m$ and $X_{m+1}, \ldots, X_n$.
   (b) Calculate $-2 \log \Lambda_m$.
5. Calculate $Z_n = \sqrt{\max_{c_n \le m < n - d_n} (-2 \log \Lambda_m)}$ and compare $Z_n$ with the critical values for sample size $k$.
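Step 3 of the procedure, the choice of the trimming bounds $c_n$ and $d_n$, can be sketched as follows. This is one possible reading of the step, with hypothetical names (`trimming_bounds`, `smallest_count`) and made-up toy data; it returns `None` for a bound when the data never contain enough exceedances.

```python
import math

def trimming_bounds(x, u, k_opt):
    """Return (k_min, c_n, d_n) with k_min = (log k_opt)^(3/2)."""
    k_min = math.log(k_opt) ** 1.5

    def smallest_count(values):
        # Smallest j such that at least k_min of the first j values exceed u.
        exceedances = 0
        for j, v in enumerate(values, start=1):
            if v > u:
                exceedances += 1
            if exceedances >= k_min:
                return j
        return None

    c_n = smallest_count(x)                # scan from the start of the series
    d_n = smallest_count(reversed(x))      # scan from the end of the series
    return k_min, c_n, d_n

# Toy data (made up) with threshold u = 1; k_opt = 3 gives k_min of about 1.15,
# so two exceedances are needed on each side.
toy = [0.0, 5.0, 0.0, 5.0, 0.0, 0.0, 5.0, 5.0, 0.0]
k_min, c_n, d_n = trimming_bounds(toy, u=1.0, k_opt=3)
```

The scan in step 4 would then run over `m` from `c_n` up to `len(x) - d_n`.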
3. EXAMPLES

3.a. Simulation

We simulate 1000 data sets of size $n$ (with $n = 100$ and $n = 500$) from the Burr distribution Burr($\beta, \tau, \lambda$), with parameters as given by

$$P(X > x) = \left( \frac{\beta}{\beta + x^{\tau}} \right)^{\lambda},$$

an example of a Pareto-type distribution with $\gamma = (\lambda\tau)^{-1}$. The rejection probabilities are given below.

                H0 true   H0 false
                γ = 1     γ1 = 1    γ1 = 2    γ1 = 1    γ1 = .5
  n     m*                γ2 = 2    γ2 = 1    γ2 = .5   γ2 = 2
  100   20      .096      .191      .460      .486      .182
  100   50      .075      .517      .512      .519      .559
  500   50      .029      .181      .782      .799      .144
  500   100     .044      .378      .955      .951      .645
  500   250     .019      .894      .951      .966      .909
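Data with a change in the extreme value index can be generated from the Burr survival function by inverse transform sampling. The sketch below is illustrative only: the slides do not say which Burr parameters were varied, so it assumes $\beta = 1$ and $\tau = 1$ are held fixed and $\lambda = 1/(\gamma\tau)$ is changed at $m^*$. The function names and the seed are hypothetical.

```python
import random

def burr_survival(x, beta, tau, lam):
    """P(X > x) = (beta / (beta + x**tau))**lam for x >= 0."""
    return (beta / (beta + x ** tau)) ** lam

def burr_from_uniform(u, beta, tau, lam):
    """Solve burr_survival(x) = u for x, where 0 < u <= 1."""
    return (beta * (u ** (-1.0 / lam) - 1.0)) ** (1.0 / tau)

def simulate_with_change(n, m_star, gamma1, gamma2, rng, beta=1.0, tau=1.0):
    """First m_star values have index gamma1, the remaining n - m_star have gamma2.

    Assumption: gamma is set through lam = 1 / (gamma * tau), beta and tau fixed.
    """
    data = []
    for i in range(n):
        gamma = gamma1 if i < m_star else gamma2
        lam = 1.0 / (gamma * tau)
        u = 1.0 - rng.random()             # uniform on (0, 1]
        data.append(burr_from_uniform(u, beta, tau, lam))
    return data

rng = random.Random(2008)                  # arbitrary seed
sample = simulate_with_change(n=100, m_star=20, gamma1=1.0, gamma2=2.0, rng=rng)
```

Repeating the call 1000 times and applying a change point test to each sample would give an empirical rejection probability of the kind reported in the table.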
The corresponding median of $\hat{m}$ is given in the table below.

                H0 false
                γ1 = 1    γ1 = 2    γ1 = 1    γ1 = .5
  n     m*      γ2 = 2    γ2 = 1    γ2 = .5   γ2 = 2
  100   20      48        21        45        21
  100   50      55        44        56        45
  500   50      175       47        92        48
  500   100     139       97        107       97
  500   250     252       247       252       248
[Figure: boxplots of $\hat{m}$ for the Burr cases with $n = 500$ and $m^* = 100$; cases numbered 1 to 5 on the horizontal axis, vertical axis from 100 to 400.]
3.b. Malaysian Stock Index: Classical approach

The figure below indicates that the data are Pareto-type distributed. If we accept that July 1997 was a change point, then the data before that date give an extreme value index $\gamma_1$ between 0.1 and 0.2, while those after that date give $\gamma_2$ around 0.5. The mean squared error of the Hill estimator based on the whole data set attains a local minimum for the threshold $u = X_{987-224,\,987} = 0.0099$, so that $k = k_{opt} = 224$.

[Figure: vertical axis from 0.0 to 1.0, horizontal axis from 0 to 500.]