Quantile Estimation

Peter J. Haas
CS 590M: Simulation
Spring Semester 2020

Quantile Estimation

◮ Definition and Examples
◮ Point Estimates
◮ Confidence Intervals
◮ Further Comments
◮ Checking Normality
◮ Bootstrap Confidence Intervals

Quantiles

(Figure: pdf f_X(x), with 1% of the probability mass below q and 99% above.)

Example: Value-at-Risk

◮ X = return on investment; want to measure downside risk
◮ q = return s.t. P(worse return than q) ≤ 0.01
◮ q is called the 0.01-quantile of X
◮ "Probabilistic worst-case scenario"

Quantile Definition

Definition of p-quantile q_p

q_p = F_X^{-1}(p)  (for 0 < p < 1)

◮ When F_X is continuous and increasing: solve F_X(q) = p
◮ In general: use our generalized definition of F^{-1} (as in the inversion method)

Alternative Definition of p-quantile q_p

q_p = min { q : F_X(q) ≥ p }
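The alternative definition can be sketched directly in code. This is a minimal illustration (not from the slides; the function name is mine): computing q_p = min{q : F_X(q) ≥ p} for a small discrete distribution by scanning the cdf.

```python
# A minimal sketch (not from the slides) of the generalized-inverse definition
# q_p = min{q : F_X(q) >= p}, applied to a small discrete distribution.
def quantile_discrete(values, probs, p):
    """Return min{q : F(q) >= p} for a discrete distribution."""
    cdf = 0.0
    for v, pr in sorted(zip(values, probs)):
        cdf += pr
        if cdf >= p - 1e-12:        # small tolerance for float round-off
            return v
    return max(values)

# Fair die: F(3) = 0.5, so q_0.5 = 3, while q_0.51 jumps to 4
print(quantile_discrete(range(1, 7), [1/6] * 6, 0.5),
      quantile_discrete(range(1, 7), [1/6] * 6, 0.51))  # -> 3 4
```

Note how the quantile jumps between p = 0.5 and p = 0.51: this is exactly why the generalized inverse (the min) is needed when F_X has flat pieces or jumps.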


Example: Robust Statistics

(Figure: distribution sketch with the median and the inter-quartile range marked.)

Median

◮ Median = q_{0.5}
◮ Alternative to the mean as a measure of central tendency
◮ Robust to outliers

Inter-quartile range (IQR)

◮ Robust measure of dispersion
◮ IQR = q_{0.75} − q_{0.25}


Point Estimate of Quantile

◮ Given i.i.d. observations X_1, ..., X_n ~ F
◮ Natural choice is the pth sample quantile: Q_n = F̂_n^{-1}(p)
◮ I.e., the generalized inverse of the empirical cdf F̂_n
◮ Q: Can you ever use the simple (non-generalized) inverse here?
◮ Equivalently, sort the data as X_(1) ≤ X_(2) ≤ ... ≤ X_(n) and set
  Q_n = X_(j), where j = ⌈np⌉
◮ Ex: q_{0.5} for {6, 8, 4, 2} = X_(2) = 4
◮ Other definitions are possible (e.g., interpolating between values), but we will stick with the above definitions
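The sort-and-index recipe above is two lines of code. A minimal sketch (the function name is mine, not from the slides):

```python
import math

# A minimal sketch of the slide's point estimate Q_n = X_(j) with j = ceil(n*p)
# (function name is mine).
def sample_quantile(data, p):
    xs = sorted(data)
    j = math.ceil(len(xs) * p)     # 1-based index of the order statistic
    return xs[j - 1]

# Slide example: sorted {6, 8, 4, 2} is (2, 4, 6, 8); j = ceil(4 * 0.5) = 2
print(sample_quantile([6, 8, 4, 2], 0.5))  # -> 4
```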


Confidence Interval Attempt #1: Direct Use of CLT

CLT for Quantiles (Bahadur Representation)
Suppose that X_1, ..., X_n are i.i.d. with pdf f_X. Then for large n,
Q_n ~ N(q_p, σ²/n), with σ = √(p(1 − p)) / f_X(q_p)

Can derive via the Delta Method for stochastic root-finding:

◮ Recall: to find θ̄ such that E[g(X, θ̄)] = 0
◮ Point estimate θ_n solves (1/n) Σ_{i=1}^n g(X_i, θ_n) = 0
◮ For large n, we have θ_n ≈ N(θ̄, σ²/n), where σ² = Var[g(X, θ̄)]/c² with c = E[∂g(X, θ̄)/∂θ]
◮ For quantile estimation take g(X, θ) = I(X ≤ θ) − p
◮ θ̄ = q_p and θ_n = Q_n, since E[g(X, θ̄)] = P(X ≤ θ̄) − p = 0
◮ E[∂g(X, θ̄)/∂θ] = ∂E[g(X, θ̄)]/∂θ = ∂(F_X(θ̄) − p)/∂θ = f_X(θ̄)
◮ Var[g(X, θ̄)] = E[g(X, θ̄)²] = E[I² − 2pI + p²] = E[I − 2pI + p²] = p − 2p² + p² = p(1 − p)
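The CLT's variance formula can be checked by simulation. A minimal sketch (not from the slides): for Exp(1) at p = 0.5 we have q_{0.5} = ln 2 and f_X(q_{0.5}) = 1/2, so σ = √(0.25)/0.5 = 1 and Var[Q_n] should be about 1/n.

```python
import math
import random

# A minimal sketch (not from the slides): Monte Carlo check of the quantile CLT
# for Exp(1), where q_0.5 = ln 2, f_X(q_0.5) = 1/2, sigma = sqrt(.25)/.5 = 1.
random.seed(1)
n, reps, p = 400, 2000, 0.5
j = math.ceil(n * p)                      # order-statistic index, as on the slide
qs = []
for _ in range(reps):
    xs = sorted(random.expovariate(1.0) for _ in range(n))
    qs.append(xs[j - 1])                  # one realization of Q_n
mean_q = sum(qs) / reps
var_q = sum((q - mean_q) ** 2 for q in qs) / (reps - 1)
print(round(mean_q, 2), round(var_q * n, 2))  # mean near ln 2 ~ 0.69, n*Var near 1
```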

Confidence Interval Attempt #1: Direct Use of CLT (continued)

◮ So if we can find an estimator s_n of σ, then by the CLT above a 100(1 − δ)% CI is
  [Q_n − z_δ s_n/√n, Q_n + z_δ s_n/√n]
◮ Problem: estimating a pdf f_X is hard (e.g., need to choose a "bandwidth" for a "kernel density estimator")
◮ So we want to avoid estimation of σ

Confidence Interval Attempt #2: Sectioning

◮ Assume that n = mk and divide X_1, ..., X_n into m sections of k observations each
◮ m is small (around 10–20) and k is large
◮ Let Q_n(i) be the estimator of q_p based on the data in the ith section
◮ Observe that Q_n(1), ..., Q_n(m) are i.i.d.
◮ By the prior CLT, each Q_n(i) is approx. distributed as N(q_p, σ²/k)
◮ For i.i.d. normals, the standard 100(1 − δ)% CI for the mean is
  [Q̄_n − t_{m−1,δ} √(v_n/m), Q̄_n + t_{m−1,δ} √(v_n/m)]
  where
  ◮ Q̄_n = (1/m) Σ_{i=1}^m Q_n(i)
  ◮ v_n = (1/(m−1)) Σ_{i=1}^m (Q_n(i) − Q̄_n)²
  ◮ t_{m−1,δ} is the 1 − (δ/2) quantile of the Student-t distribution with m − 1 degrees of freedom
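The sectioning recipe is straightforward to code. A minimal sketch (function and variable names are mine, not from the slides); t_crit must be the 1 − δ/2 Student-t quantile with m − 1 degrees of freedom, supplied by the caller.

```python
import math
import random

# A minimal sketch of the sectioning CI (names are mine, not from the slides).
def sample_quantile(xs, p):
    xs = sorted(xs)
    return xs[math.ceil(len(xs) * p) - 1]

def sectioning_ci(data, p, m, t_crit):
    k = len(data) // m
    ests = [sample_quantile(data[i * k:(i + 1) * k], p) for i in range(m)]  # Q_n(i)
    qbar = sum(ests) / m
    vn = sum((q - qbar) ** 2 for q in ests) / (m - 1)
    half = t_crit * math.sqrt(vn / m)
    return qbar - half, qbar + half

random.seed(2)
data = [random.expovariate(1.0) for _ in range(10 * 500)]   # n = mk = 5000
lo, hi = sectioning_ci(data, 0.5, m=10, t_crit=2.262)       # t_9 0.975-quantile
print(round(lo, 3), round(hi, 3))                           # should bracket ln 2
```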

Sectioning: So What's the Problem?

◮ Can show, as with nonlinear functions of means, that
  E[Q_n] ≈ q_p + b/n + c/n²
◮ It follows that
  E[Q_n(i)] ≈ q_p + b/k + c/k² = q_p + mb/n + m²c/n²
◮ So
  E[Q̄_n] ≈ q_p + mb/n + m²c/n²
◮ Bias of Q̄_n is roughly m times larger than the bias of Q_n!


Attempt #3: Sectioning + Jackknifing

Sectioning + Jackknifing: General Algorithm for a Statistic α

1. Generate n = mk i.i.d. observations X_1, ..., X_n
2. Divide the observations into m sections, each of size k
3. Compute the point estimator α_n based on all observations
4. For i = 1, 2, ..., m:
   4.1 Compute the estimator α̃_n(i) using all observations except those in section i
   4.2 Form the pseudovalue α_n(i) = mα_n − (m − 1)α̃_n(i)
5. Compute the point estimator: α_n^J = (1/m) Σ_{i=1}^m α_n(i)
6. Set v_n^J = (1/(m−1)) Σ_{i=1}^m (α_n(i) − α_n^J)²
7. Compute the 100(1 − δ)% CI:
   [α_n^J − t_{m−1,δ} √(v_n^J/m), α_n^J + t_{m−1,δ} √(v_n^J/m)]

Application to Quantile Estimation

◮ Q̃_n(i) = quantile estimate ignoring section i
◮ Clearly, Q̃_n(i) has the same distribution as Q_{(m−1)k}, so
  E[Q̃_n(i)] ≈ q_p + b/((m−1)k) + c/((m−1)²k²)
◮ It follows that, for the pseudovalue α_n(i),
  E[α_n(i)] = E[mQ_n − (m − 1)Q̃_n(i)] ≈ q_p − c/((m−1)mk²)
◮ Averaging does not affect the bias, so, since n = mk,
  E[Q_n^J] = q_p + O(1/n²)
◮ The general procedure is also called the "delete-k jackknife"


Further Comments

A confession

◮ There exist special-purpose methods for quantile estimation
  [Sections 2.6.1 and 2.6.3 in the Serfling book]
◮ We focus on sectioning + jackknife because the method is general
◮ Can also use the bias-elimination method from the prior lecture

Conditioning the data for q_p when p ≈ 1

◮ Fix r > 1 and get n = rmk i.i.d. observations X_1, ..., X_n
◮ Divide the data into blocks of size r
◮ Set Y_j = maximum value in the jth block, for 1 ≤ j ≤ mk
◮ Run the quantile-estimation procedure on Y_1, ..., Y_mk
◮ Key observation: F_Y(q_p) = [F_X(q_p)]^r = p^r
◮ So the p-quantile for X equals the p^r-quantile for Y
◮ Ex: if r = 50, then q_{0.99} for X equals q_{0.61} for Y (0.99^50 ≈ 0.605)
◮ Often, the reduction in sample size outweighs the cost of the extra runs
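The block-maximum trick can be sketched in a few lines. A minimal illustration (not from the slides; the function name is mine), estimating q_{0.99} of Exp(1), whose true value is −ln(0.01) = ln 100 ≈ 4.605.

```python
import math
import random

# A minimal sketch of the block-maximum trick for p near 1 (names are mine):
# with blocks of size r, the p-quantile of X equals the p**r-quantile of
# Y = block maximum, since F_Y(q) = F_X(q)**r.
def block_max_quantile(data, p, r):
    ys = sorted(max(data[i:i + r]) for i in range(0, len(data), r))
    j = math.ceil(len(ys) * p ** r)     # target quantile shifts from p to p**r
    return ys[j - 1]

random.seed(4)
data = [random.expovariate(1.0) for _ in range(50 * 2000)]   # n = r * mk
# q_0.99 of Exp(1) is -ln(0.01) = ln(100), about 4.605
print(round(block_max_quantile(data, 0.99, 50), 2))
```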


Checking Normality

Undercoverage

◮ E.g., when a "95% confidence interval" for the mean only brackets the mean 70% of the time
◮ Due to failure of the CLT at finite sample sizes
◮ Note: if the data is truly normal, then there is no error in the CI for the mean

Simple diagnostics

◮ Skewness (measures asymmetry; equals 0 for the normal)
  ◮ Definition: skewness(X) = E[(X − E(X))³] / (Var X)^{3/2}
  ◮ Estimator: [n^{-1} Σ_{i=1}^n (X_i − X̄_n)³] / [n^{-1} Σ_{i=1}^n (X_i − X̄_n)²]^{3/2}
◮ Kurtosis (measures fatness of tails; equals 0 for the normal)
  ◮ Definition: kurtosis(X) = E[(X − E(X))⁴] / (Var X)² − 3
  ◮ Estimator: [n^{-1} Σ_{i=1}^n (X_i − X̄_n)⁴] / [n^{-1} Σ_{i=1}^n (X_i − X̄_n)²]² − 3
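The two estimators translate directly to code. A minimal sketch (function names are mine, not from the slides), checked on simulated normal data, where both diagnostics should come out near 0.

```python
import random

# A minimal sketch of the slide's skewness and excess-kurtosis estimators
# (function names are mine).
def skewness(xs):
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

def kurtosis(xs):
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3        # subtract 3 so normal data gives ~0

random.seed(5)
xs = [random.gauss(0.0, 1.0) for _ in range(20000)]
print(round(skewness(xs), 1), round(kurtosis(xs), 1))  # both near 0 for normal data
```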


Bootstrap Confidence Intervals

The general method works for quantiles (no normality assumptions needed).

Bootstrap Confidence Intervals (Pivot Method)

1. Run the simulation n times to get D = {X_1, ..., X_n}
2. Compute Q_n = sample quantile based on D
3. Compute a bootstrap sample D* = {X*_1, ..., X*_n} (drawn with replacement from D)
4. Set Q*_n = sample quantile based on D*
5. Set the pivot π* = Q*_n − Q_n
6. Repeat Steps 3–5 B times to create π*_1, ..., π*_B
7. Sort the pivots to obtain π*_(1) ≤ π*_(2) ≤ ... ≤ π*_(B)
8. Set l = ⌈(1 − δ/2)B⌉ and u = ⌈(δ/2)B⌉
9. Return the 100(1 − δ)% CI [Q_n − π*_(l), Q_n − π*_(u)]
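Steps 1–9 can be sketched as follows. A minimal illustration (function names are mine, not from the slides), again tested on Exp(1) with p = 0.5, where the true quantile is ln 2.

```python
import math
import random

# A minimal sketch of the pivot-method bootstrap CI from the slide
# (function names are mine).
def sample_quantile(xs, p):
    return sorted(xs)[math.ceil(len(xs) * p) - 1]

def bootstrap_pivot_ci(data, p, B, delta):
    n = len(data)
    qn = sample_quantile(data, p)
    # each pivot is Q*_n - Q_n for one resample D* drawn with replacement
    pivots = sorted(
        sample_quantile([random.choice(data) for _ in range(n)], p) - qn
        for _ in range(B)
    )
    l = math.ceil((1 - delta / 2) * B)      # index of the upper pivot
    u = math.ceil((delta / 2) * B)          # index of the lower pivot
    return qn - pivots[l - 1], qn - pivots[u - 1]

random.seed(6)
data = [random.expovariate(1.0) for _ in range(2000)]
lo, hi = bootstrap_pivot_ci(data, 0.5, B=400, delta=0.05)
print(round(lo, 3), round(hi, 3))           # should bracket ln 2, about 0.693
```

Note that the large pivot π*_(l) is subtracted to form the lower endpoint and the small pivot π*_(u) to form the upper one, exactly as in Step 9.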