

  1. Quantile Estimation. Peter J. Haas, CS 590M: Simulation, Spring Semester 2020 (1 / 20)

  2. Quantile Estimation (2 / 20)
     - Definition and Examples
     - Point Estimates
     - Confidence Intervals
     - Further Comments
     - Checking Normality
     - Bootstrap Confidence Intervals

  3. Quantiles (3 / 20)
     [Figure: density f_X(x) with the point q marked; 1% of the area lies to the left of q, 99% to the right]
     Example: Value-at-Risk
     - X = return on investment; we want to measure downside risk
     - q = return such that P(worse return than q) ≤ 0.01
     - q is called the 0.01-quantile of X
     - A "probabilistic worst-case scenario"

  4. Quantile Definition (4 / 20)
     Definition of the p-quantile q_p: q_p = F_X^{-1}(p) (for 0 < p < 1)
     - When F_X is continuous and increasing: solve F_X(q) = p
     - In general: use our generalized definition of F^{-1} (as in the inversion method)
     Alternative definition of the p-quantile: q_p = min{ q : F_X(q) ≥ p }
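The sketch below (not from the slides) illustrates the alternative definition q_p = min{ q : F_X(q) ≥ p } for a discrete distribution; the helper name discrete_quantile and the fair-die example are my own.

```python
# Minimal sketch: q_p = min{ q : F_X(q) >= p } for a discrete distribution
# given as (value, probability) pairs.
import numpy as np

def discrete_quantile(values, probs, p):
    """Smallest value q with F_X(q) >= p, i.e. the generalized inverse of the cdf."""
    order = np.argsort(values)
    cdf = np.cumsum(np.asarray(probs, dtype=float)[order])
    idx = np.searchsorted(cdf, p, side="left")   # first index where cdf >= p
    return np.asarray(values)[order][idx]

# Fair die, p = 0.4: F(2) = 1/3 < 0.4 <= F(3) = 1/2, so q_0.4 = 3
print(discrete_quantile([1, 2, 3, 4, 5, 6], [1/6] * 6, 0.4))
```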

  5. Example: Robust Statistics (5 / 20)
     Median
     - Median = q_0.5
     - Alternative to the mean as a measure of central tendency
     - Robust to outliers
     Inter-quartile range (IQR)
     - Robust measure of dispersion
     - IQR = q_0.75 − q_0.25
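A quick illustration (mine, not the slides') of why the median and IQR are called robust; note that np.quantile interpolates between order statistics rather than using the ⌈np⌉ rule introduced later.

```python
# Mean vs. median and standard deviation vs. IQR on data with one gross outlier.
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 1000.0])          # one gross outlier
q25, q50, q75 = np.quantile(x, [0.25, 0.5, 0.75])
print("mean:", x.mean(), " median:", q50)           # mean is dragged toward the outlier
print("std: ", x.std(ddof=1), " IQR:", q75 - q25)   # IQR is unaffected by it
```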

  6. Quantile Estimation: Definition and Examples · Point Estimates · Confidence Intervals · Further Comments · Checking Normality · Bootstrap Confidence Intervals (6 / 20)

  7. Point Estimate of a Quantile (7 / 20)
     - Given i.i.d. observations X_1, ..., X_n ∼ F
     - Natural choice is the p-th sample quantile: Q_n = F̂_n^{-1}(p)
       - I.e., the generalized inverse of the empirical cdf F̂_n(x) = (1/n) #{ i : X_i ≤ x }
       - Q: Can you ever use the simple (non-generalized) inverse here?
     - Equivalently, sort the data as X_(1) ≤ X_(2) ≤ ... ≤ X_(n) and set Q_n = X_(j), where j = ⌈np⌉
     - Ex: q_0.5 for {6, 8, 4, 2}: the sorted data is 2, 4, 6, 8 and j = ⌈4 · 0.5⌉ = 2, so Q_n = X_(2) = 4
     - Other definitions are possible (e.g., interpolating between values), but we will stick with the above definitions
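A minimal sketch of this sample-quantile rule, checked against the slide's {6, 8, 4, 2} example:

```python
# p-th sample quantile: Q_n = X_(j) with j = ceil(n * p).
import math

def sample_quantile(data, p):
    xs = sorted(data)              # X_(1) <= X_(2) <= ... <= X_(n)
    j = math.ceil(len(xs) * p)     # j = ceil(n p)
    return xs[j - 1]               # j-th order statistic (1-based)

print(sample_quantile([6, 8, 4, 2], 0.5))   # sorted: 2, 4, 6, 8; j = 2, so Q_n = 4
```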

  8. Quantile Estimation: Definition and Examples · Point Estimates · Confidence Intervals · Further Comments · Checking Normality · Bootstrap Confidence Intervals (8 / 20)

  9. Confidence Interval Attempt #1: Direct Use of the CLT (9 / 20)
     CLT for Quantiles (Bahadur Representation)
     - Suppose that X_1, ..., X_n are i.i.d. with pdf f_X. Then for large n, Q_n is approximately N(q_p, σ_p²/n), where σ_p = √(p(1−p)) / f_X(q_p)
     Can derive via the Delta Method for stochastic root-finding:
     - Recall: to find θ̄ such that E[g(X, θ̄)] = 0
     - Point estimate θ_n solves (1/n) Σ_{i=1}^n g(X_i, θ_n) = 0
     - For large n, we have θ_n ≈ N(θ̄, σ²/n), where σ² = Var[g(X, θ̄)] / c² with c = E[∂g(X, θ̄)/∂θ]
     - For quantile estimation, take g(X, θ) = I(X ≤ θ) − p
     - Then θ̄ = q_p and θ_n = Q_n, since E[g(X, θ̄)] = P(X ≤ θ̄) − p = 0
     - c = E[∂g(X, θ̄)/∂θ] = ∂E[g(X, θ̄)]/∂θ = ∂(F_X(θ̄) − p)/∂θ = f_X(θ̄)
     - Var[g(X, θ̄)] = E[g(X, θ̄)²] = E[I² − 2pI + p²] = E[I − 2pI + p²] = p − 2p² + p² = p(1 − p), since I² = I and E[I] = p

  10. Confidence Interval Attempt #1: Direct Use of the CLT (10 / 20)
     CLT for Quantiles (Bahadur Representation)
     - Suppose that X_1, ..., X_n are i.i.d. with pdf f_X. Then for large n, Q_n is approximately N(q_p, σ_p²/n), where σ_p = √(p(1−p)) / f_X(q_p)
     - So if we can find an estimator s_n of σ_p, then a 100(1−δ)% CI is [ Q_n − z_δ s_n/√n , Q_n + z_δ s_n/√n ]
     - Problem: estimating a pdf f_X is hard (e.g., need to choose the "bandwidth" for a "kernel density estimator")
     - So we want to avoid estimation of σ_p
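As a numerical sanity check (my own, using an Exp(1) population so that f_X and hence σ_p are known in closed form, which is exactly what is unavailable with real simulation output), the CLT-based interval does achieve roughly its nominal coverage:

```python
# Coverage check of the CLT interval Q_n +/- z * sigma_p / sqrt(n) for an Exp(1)
# population, where sigma_p = sqrt(p(1-p)) / f_X(q_p) is computable exactly.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
p, n, delta, reps = 0.9, 2000, 0.05, 1000

q_p = stats.expon.ppf(p)                                  # true 0.9-quantile
sigma_p = np.sqrt(p * (1 - p)) / stats.expon.pdf(q_p)
z = stats.norm.ppf(1 - delta / 2)

covered = 0
for _ in range(reps):
    x = np.sort(rng.exponential(size=n))
    Q_n = x[int(np.ceil(n * p)) - 1]                      # sample quantile X_(ceil(np))
    half = z * sigma_p / np.sqrt(n)
    covered += (Q_n - half <= q_p <= Q_n + half)

print("empirical coverage:", covered / reps)              # should be near 0.95
```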

  11. Confidence Interval Attempt #2: Sectioning (11 / 20)
     - Assume that n = mk and divide X_1, ..., X_n into m sections of k observations each
     - m is small (around 10–20) and k is large
     - Let Q_n(i) be the estimator of q_p based on the data in the i-th section
     - Observe that Q_n(1), ..., Q_n(m) are i.i.d.
     - By the prior CLT, each Q_n(i) is approximately distributed as N(q_p, σ_p²/k)
     - For i.i.d. normals, the standard 100(1−δ)% CI for the mean is
         [ Q̄_n − t_{m−1,δ} √(v_n/m) , Q̄_n + t_{m−1,δ} √(v_n/m) ]
       where
       - Q̄_n = (1/m) Σ_{i=1}^m Q_n(i)
       - v_n = (1/(m−1)) Σ_{i=1}^m (Q_n(i) − Q̄_n)²
       - t_{m−1,δ} is the 1 − (δ/2) quantile of the Student-t distribution with m−1 degrees of freedom
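A minimal sketch of the sectioning interval (function names and the Exp(1) test data are my own, not part of the slides):

```python
# Sectioning CI for the p-quantile: split the data into m sections, estimate q_p
# in each, and apply the usual Student-t interval to the m section estimates.
import numpy as np
from scipy import stats

def sample_quantile(x, p):
    xs = np.sort(x)
    return xs[int(np.ceil(len(xs) * p)) - 1]

def sectioning_ci(x, p, m=10, delta=0.05):
    k = len(x) // m
    q = np.array([sample_quantile(x[i*k:(i+1)*k], p) for i in range(m)])
    q_bar = q.mean()                        # Q_bar_n
    v_n = q.var(ddof=1)                     # sample variance of the section estimates
    half = stats.t.ppf(1 - delta / 2, df=m - 1) * np.sqrt(v_n / m)
    return q_bar, (q_bar - half, q_bar + half)

rng = np.random.default_rng(0)
print(sectioning_ci(rng.exponential(size=10_000), p=0.9))   # true q_0.9 is about 2.303
```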

  12. Sectioning: So What's the Problem? (12 / 20)
     - Can show, as with nonlinear functions of means, that E[Q_n] ≈ q_p + b/n + c/n²
     - It follows that E[Q_n(i)] ≈ q_p + b/k + c/k² = q_p + mb/n + m²c/n²
     - So E[Q̄_n] ≈ q_p + mb/n + m²c/n²
     - Bias of Q̄_n is roughly m times larger than the bias of Q_n!

  13. Attempt #3: Sectioning + Jackknifing (13 / 20)
     Sectioning + Jackknifing: General Algorithm for a Statistic α
     1. Generate n = mk i.i.d. observations X_1, ..., X_n
     2. Divide the observations into m sections, each of size k
     3. Compute the point estimator α_n based on all observations
     4. For i = 1, 2, ..., m:
        4.1 Compute the estimator α̃_n(i) using all observations except those in section i
        4.2 Form the pseudovalue α_n(i) = m α_n − (m−1) α̃_n(i)
     5. Compute the point estimator: α_n^J = (1/m) Σ_{i=1}^m α_n(i)
     6. Set v_n^J = (1/(m−1)) Σ_{i=1}^m (α_n(i) − α_n^J)²
     7. Compute the 100(1−δ)% CI: [ α_n^J − t_{m−1,δ} √(v_n^J/m) , α_n^J + t_{m−1,δ} √(v_n^J/m) ]
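A minimal sketch of Steps 1–7 with the statistic passed in as a function; the helper names and the Exp(1) example are my own, not the course code.

```python
# Sectioning + jackknife ("delete-k jackknife") CI for a general statistic alpha.
import numpy as np
from scipy import stats

def sample_quantile(x, p):
    xs = np.sort(x)
    return xs[int(np.ceil(len(xs) * p)) - 1]

def jackknife_ci(x, statistic, m=10, delta=0.05):
    n = len(x)
    k = n // m
    alpha_n = statistic(x)                                   # Step 3: estimate from all n points
    pseudo = np.empty(m)
    for i in range(m):                                       # Step 4
        keep = np.concatenate([x[:i*k], x[(i+1)*k:]])        # all data except section i
        pseudo[i] = m * alpha_n - (m - 1) * statistic(keep)  # pseudovalue
    alpha_J = pseudo.mean()                                  # Step 5
    v_J = pseudo.var(ddof=1)                                 # Step 6
    half = stats.t.ppf(1 - delta / 2, df=m - 1) * np.sqrt(v_J / m)
    return alpha_J, (alpha_J - half, alpha_J + half)         # Step 7

rng = np.random.default_rng(1)
print(jackknife_ci(rng.exponential(size=10_000), lambda d: sample_quantile(d, 0.9)))
```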

  14. Application to Quantile Estimation (14 / 20)
     - Q̃_n(i) = quantile estimate ignoring section i
     - Clearly, Q̃_n(i) has the same distribution as Q_{(m−1)k}, so E[Q̃_n(i)] ≈ q_p + b/((m−1)k) + c/((m−1)²k²)
     - It follows that, for the pseudovalue α_n(i), E[α_n(i)] = E[ m Q_n − (m−1) Q̃_n(i) ] ≈ q_p − c/(m(m−1)k²)
     - Averaging does not affect bias, so, since n = mk, the jackknifed estimator satisfies E[Q_n^J] = q_p + O(1/n²)
     - The general procedure is also called the "delete-k jackknife"

  15. Quantile Estimation: Definition and Examples · Point Estimates · Confidence Intervals · Further Comments · Checking Normality · Bootstrap Confidence Intervals (15 / 20)

  16. Further Comments (16 / 20)
     A confession
     - There exist special-purpose methods for quantile estimation [Sections 2.6.1 and 2.6.3 in the Serfling book]
     - We focus on sectioning + jackknife because the method is general
     - Can also use the bias-elimination method from the prior lecture
     Conditioning the data for q_p when p ≈ 1
     - Fix r > 1 and get n = rmk i.i.d. observations X_1, ..., X_n
     - Divide the data into blocks of size r
     - Set Y_j = maximum value in the j-th block, for 1 ≤ j ≤ mk
     - Run the quantile estimation procedure on Y_1, ..., Y_{mk}
     - Key observation: F_Y(q_p) = P(max(X_1, ..., X_r) ≤ q_p) = P(X ≤ q_p)^r = [F_X(q_p)]^r = p^r
     - So the p-quantile for X equals the p^r-quantile for Y
     - Ex: if r = 50, then q_0.99 for X equals q_0.61 for Y (since 0.99^50 ≈ 0.61)
     - Often, the reduction in sample size outweighs the cost of the extra runs
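A minimal sketch of this block-maximum trick for an extreme quantile; the function name and the Exp(1) example (true q_0.99 = −ln 0.01 ≈ 4.61) are my own.

```python
# Estimate q_p of X for p near 1 by estimating the p**r quantile of block maxima Y.
import numpy as np

def sample_quantile(x, p):
    xs = np.sort(x)
    return xs[int(np.ceil(len(xs) * p)) - 1]

def quantile_via_block_maxima(x, p, r):
    mk = len(x) // r                              # number of complete blocks
    y = x[:mk * r].reshape(mk, r).max(axis=1)     # Y_j = max of the j-th block of size r
    return sample_quantile(y, p ** r)             # p-quantile of X = p**r-quantile of Y

rng = np.random.default_rng(2)
x = rng.exponential(size=500_000)
print(quantile_via_block_maxima(x, p=0.99, r=50))   # should be near 4.61
```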

  17. Quantile Estimation: Definition and Examples · Point Estimates · Confidence Intervals · Further Comments · Checking Normality · Bootstrap Confidence Intervals (17 / 20)

  18. Checking Normality (18 / 20)
     Undercoverage
     - E.g., when a "95% confidence interval" for the mean only brackets the mean 70% of the time
     - Due to failure of the CLT at finite sample sizes
     - Note: if the data is truly normal, then there is no error in the CI for the mean
     Simple diagnostics
     - Skewness (measures asymmetry; equals 0 for a normal distribution)
       - Definition: skewness(X) = E[(X − E(X))³] / (Var X)^{3/2}
       - Estimator: [ (1/n) Σ_{i=1}^n (X_i − X̄_n)³ ] / [ (1/n) Σ_{i=1}^n (X_i − X̄_n)² ]^{3/2}
     - Kurtosis (measures fatness of tails; equals 0 for a normal distribution)
       - Definition: kurtosis(X) = E[(X − E(X))⁴] / (Var X)² − 3
       - Estimator: [ (1/n) Σ_{i=1}^n (X_i − X̄_n)⁴ ] / [ (1/n) Σ_{i=1}^n (X_i − X̄_n)² ]² − 3
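The two estimators above translate directly to code; the Exp(1) comparison (population skewness 2 and excess kurtosis 6) is my own illustration.

```python
# Sample skewness and (excess) kurtosis diagnostics; both are ~0 for normal data.
import numpy as np

def skewness(x):
    d = x - x.mean()
    return (d**3).mean() / (d**2).mean() ** 1.5

def kurtosis(x):
    d = x - x.mean()
    return (d**4).mean() / (d**2).mean() ** 2 - 3

rng = np.random.default_rng(3)
z = rng.normal(size=100_000)
e = rng.exponential(size=100_000)
print(skewness(z), kurtosis(z))   # both near 0
print(skewness(e), kurtosis(e))   # near 2 and 6 for Exp(1)
```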

  19. Quantile Estimation: Definition and Examples · Point Estimates · Confidence Intervals · Further Comments · Checking Normality · Bootstrap Confidence Intervals (19 / 20)

  20. Bootstrap Confidence Intervals (20 / 20)
     The general method works for quantiles (no normality assumptions needed)
     Bootstrap Confidence Intervals (Pivot Method)
     1. Run the simulation n times to get D = {X_1, ..., X_n}
     2. Compute Q_n = sample quantile based on D (the estimate of the "real-world" quantity q_p)
     3. Compute a bootstrap sample D* = {X*_1, ..., X*_n} by resampling from D with replacement
     4. Set Q*_n = sample quantile based on D* (the "bootstrap-world" estimate)
     5. Set the pivot π* = Q*_n − Q_n
     6. Repeat Steps 3–5 B times to create π*_1, ..., π*_B
     7. Sort the pivots to obtain π*_(1) ≤ π*_(2) ≤ ... ≤ π*_(B)
     8. Set l = ⌈(1 − δ/2) B⌉ and u = ⌈(δ/2) B⌉
     9. Return the 100(1−δ)% CI [ Q_n − π*_(l) , Q_n − π*_(u) ]
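A minimal sketch of Steps 1–9 for the p-quantile; the function names and the Exp(1) data are my own.

```python
# Pivot-method bootstrap CI for the p-quantile; no normality assumption is used.
import numpy as np

def sample_quantile(x, p):
    xs = np.sort(x)
    return xs[int(np.ceil(len(xs) * p)) - 1]

def bootstrap_pivot_ci(x, p, B=1000, delta=0.05, rng=None):
    rng = rng or np.random.default_rng()
    n = len(x)
    Q_n = sample_quantile(x, p)                        # Step 2: real-world estimate
    pivots = np.empty(B)
    for b in range(B):                                 # Steps 3-6
        x_star = rng.choice(x, size=n, replace=True)   # bootstrap sample D*
        pivots[b] = sample_quantile(x_star, p) - Q_n   # pivot = Q*_n - Q_n
    pivots.sort()                                      # Step 7
    l = int(np.ceil((1 - delta / 2) * B))              # Step 8
    u = int(np.ceil((delta / 2) * B))
    return Q_n, (Q_n - pivots[l - 1], Q_n - pivots[u - 1])   # Step 9

rng = np.random.default_rng(4)
print(bootstrap_pivot_ci(rng.exponential(size=5_000), p=0.9, rng=rng))
```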
