Confidence Bands for Distribution Functions: The Law of the Iterated - PowerPoint PPT Presentation

Confidence Bands for Distribution Functions: The Law of the Iterated Logarithm and Shape Constraints Lutz Duembgen (Bern) Jon A. Wellner (Seattle) Petro Kolesnyk (Bern) Ralf Wilke (Copenhagen) November 2014

I. The LIL for Brownian Motion and Bridge II. A General LIL for Sub-Exponential Processes III. Implications for the Uniform Empirical Process III.1 Goodness-of-Fit Tests III.2 Confidence Bands IV. Bi-Log-Concave Distribution Functions V. Bi-Log-Concave Binary Regression

I. The LIL for Brownian Motion and Bridge Standard Brownian motion W = ( W ( t )) t ≥ 0 LIL for BM : ± W ( t ) lim sup � = 1 a.s. 2 t log log( t − 1 ) t ↓ 0 ± W ( t ) lim sup � = 1 a.s. 2 t log log( t ) t ↑∞

Refined half of LIL for BM : For any constant ν > 3 / 2, � W ( t ) 2 � − log log( t + t − 1 ) − ν log log log( t + t − 1 ) lim 2 t t →{ 0 , ∞} = −∞ a.s.

Reformulation for standard Brownian bridge U = ( U ( t )) t ∈ (0 , 1) : � � t (0 , 1) ∋ t �→ logit( t ) := log ∈ R , 1 − t e x R ∋ x �→ ℓ ( x ) := ∈ (0 , 1) . 1 + e x

Refined half of LIL for BB : For arbitrary constants ν > 3 / 2, � � U ( t ) 2 2 t (1 − t ) − C ( t ) − ν D ( t ) < ∞ sup a.s. t ∈ (0 , 1) where � � � 1 1 + logit( t ) 2 / 2 ≈ log log C ( t ) := log t (1 − t ) � � � 1 1 + C ( t ) 2 / 2 D ( t ) := log ≈ log log log t (1 − t ) as t → { 0 , 1 } .

II. A General LIL for Sub-Exponential Processes Nonnegative stochastic process X = ( X ( t )) t ∈T on T ⊂ (0 , 1) . Locally uniform sub-exponentiality: LUSE 0 : For arbitrary a ∈ R , c ≥ 0 and η ≥ 0, � � X ( t ) ≥ η ≤ M exp( − L ( c ) η ) , I P sup t ∈ [ ℓ ( a ) ,ℓ ( a + c )] where M ≥ 1 and L : [0 , ∞ ) → [0 , 1] satisfies L ( c ) = 1 − O ( c ) as c ↓ 0

Refinement for ζ ∈ [0 , 1]: LUSE ζ : For arbitrary a ∈ R , c ≥ 0 and η ≥ 0, � � ≤ M exp( − L ( c ) η ) I P sup X ( t ) ≥ η max(1 , L ( c ) η ) ζ , t ∈ [ ℓ ( a ) ,ℓ ( a + c )] with M and L ( · ) as in LUSE 0 . Example : U ( t ) 2 X ( t ) := 2 t (1 − t ) satisfies LUSE 1 / 2 with L ( c ) = e − c . M = 2 and

Proposition. Suppose that X satisfies LUSE ζ . For any L o ∈ (0 , 1) and ν > 2 − ζ there exists a constant M o = M o ( M , L ( · ) , ζ, L o , ν ) ≥ 1 such that � � I P sup ( X − C − ν D ) ≥ η ≤ M o exp( − L o η ) T for arbitary η ≥ 0.

III. Implications for the Uniform Empirical Process Let U 1 , U 2 , . . . , U n be i.i.d. ∼ Unif [0 , 1]. Auxiliary function K : [0 , 1] × (0 , 1) → [0 , ∞ ], � x � � 1 − x � K ( x , p ) := x log + (1 − x ) log p 1 − p i.e. Kullback-Leibler divergence between Bin (1 , x ) and Bin (1 , p ). Two key properties: ( x − p ) 2 � � K ( x , p ) = 1 + o (1) as x → p . 2 p (1 − p ) �� 2 c p (1 − p ) + c K ( x , p ) ≤ c = ⇒ | x − p | ≤ � 2 c x (1 − x ) + c

Implication 1 : Uniform empirical distribution function n � G n ( t ) := 1 � 1 [ U i ≤ t ] n i =1 Lemma 1. The process X n = ( X n ( t )) t ∈ (0 , 1) with X n ( t ) := n K ( � G n ( t ) , t ) satisfies LUSE 0 with M = 2 and L ( c ) = e − c . Theorem 1. For any fixed ν > 2, � � � � U ( t ) 2 sup X n − C − ν D → L sup 2 t (1 − t ) − C ( t ) − ν D ( t ) . (0 , 1) t ∈ (0 , 1)

Main ingredients for proofs: ◮ � � � G n ( t ) / t t ∈ (0 , 1] is a reverse martingale. ◮ Exponential transform and Doob’s inequality for submartingales. ◮ Analytical properties of K ( · , · ). ◮ Donsker’s invariance for uniform empirical process.

Implication 2 : Uniform order statistics 0 < U n :1 < U n :2 < · · · < U n : n < 1 . i T n := { t n 1 , t n 2 , . . . , t nn } with t ni := I E( U n : i ) = n + 1 . Lemma 2. The process ˜ X n = ( ˜ X n ( t )) t ∈T n with ˜ X n ( t ni ) := ( n + 1) K ( t ni , U n : i ) satisfies LUSE 0 with M = 2 and L ( c ) = e − c . Theorem 2. For any fixed ν > 2, � � � ˜ � U ( t ) 2 sup X n − C − ν D → L sup 2 t (1 − t ) − C ( t ) − ν D ( t ) . T n t ∈ (0 , 1)

Main ingredients for proofs: ◮ � � n U n : i / t ni i =1 is a reverse martingale. ◮ Exponential transform and Doob’s inequality for submartingales. ◮ Connection between Beta and Gamma distributions. ◮ Analytical properties of K ( · , · ). ◮ Donsker’s invariance principle for uniform quantile process.

Some realizations of ˜ X n for n = 5000 and ν = 3: 2 1 0 X n ( t ) -1 -2 -3 0.0 0.2 0.4 0.6 0.8 1.0 t

2 1 0 X n ( t ) -1 -2 -3 0.0 0.2 0.4 0.6 0.8 1.0 t

3 2 1 X n ( t ) 0 -1 -2 -3 0.0 0.2 0.4 0.6 0.8 1.0 t

2 1 0 X n ( t ) -1 -2 -3 0.0 0.2 0.4 0.6 0.8 1.0 t

Distribution function of arg max t ˜ X n ( t ): 1.0 0.8 0.6 0.4 0.2 0.0 0.0 0.2 0.4 0.6 0.8 1.0 t

III.1 Goodness-of-Fit Tests Let X 1 , X 2 , . . . , X n be i.i.d. with unknown c.d.f. F on R . Empirical c.d.f. n � F n ( x ) := 1 � 1 [ X i ≤ x ] . n i =1 Testing problem: H o : F ≡ F o versus H A : F �≡ F o .

Berk–Jones (1979) proposed the test statistic n K ( � T n ( F o ) := sup F n , F o ) R with critical value n K n ( � κ BJ n ,α := (1 − α ) − quantile of sup G n ( t ) , t ) t ∈ (0 , 1) � � = log log( n ) + O log log log( n ) .

New proposal: � � n K ( � T n ( F o ) := sup F n , F o ) − C ( F o ) − ν D ( F o ) R with critical value κ new := (1 − α ) − quantile of n ,α � � n K ( � G n ( t ) , t ) − C ( t ) − ν D ( t ) sup t ∈ (0 , 1) → (1 − α ) − quantile of � � U ( t ) 2 sup 2 t (1 − t ) − C ( t ) − ν D ( t ) . t ∈ (0 , 1)

Power. For any fixed κ > 0, � � I P F n T n ( F o ) > κ → 1 as √ n | F n − F o | sup � → ∞ . (1 + C ( F o )) F o (1 − F o ) + C ( F o ) / √ n R

Special case: Detecting heterogeneous Gaussian mixtures (Ingster 1997, 1998; Donoho–Jin 2004) Setting 1: F o := Φ , F n := (1 − ε n ) Φ + ε n Φ( · − µ n ) with ε n = n − β + o (1) , β ∈ (1 / 2 , 1) , µ n → ∞ .

Theorem. For any fixed κ > 0, � � → 1 I P F n T n ( F o ) > κ provided that � � β − 1 / 2 if β ≤ 3 / 4 , µ n = 2 r log( n ) with r > � 1 − √ 1 − β � 2 if β ≥ 3 / 4 .

Setting 2 (Contiguous alternatives): F o := Φ , � � 1 − π Φ + π F n := √ n √ n Φ( · − µ ) , π, µ > 0 . Optimal level- α test of F o versus F n has asymptotic power � � Φ − 1 ( α ) + π 2 (exp( µ 2 ) − 1) Φ . 4

� Theorem. Let µ = 2 s log(1 /π ) for fixed s > 0. As π ↓ 0, � � � Φ − 1 ( α ) + π 2 (exp( µ 2 ) − 1) α if s < 1 , Φ → 4 1 if s > 1 , while for any fixed κ > 0, � � → I P F n T n ( F o ) > κ 1 if s > 1 .

III.2 Confidence Bands Owen (1995) proposed (1 − α )-confidence band � � n K ( � F n , F ) ≤ κ BJ F : sup . n ,α R New proposal: With order statistics X n :1 ≤ X n :2 ≤ · · · ≤ X n : n , � � � � F : max ( n + 1) K ( t ni , F ( X n : i )) − C ( t ni ) − ν D ( t ni ) ≤ ˜ κ n ,α 1 ≤ i ≤ n

Resulting bounds for F ( x ): With confidence 1 − α , on [ X n : i , X n : i +1 ), 0 ≤ i ≤ n ,   [ a BJO , b BJO ] with Owen’s (1995) proposal , ni ni F ∈  [ a new , b new ] with new proposal , ni ni while F n = s ni := i � n .

i �→ a new , s ni , b new n = 500: ni ni 1.0 0.8 0.6 0.4 0.2 0.0 0 100 200 300 400 500

i �→ a ∗ ni − s ni , b ∗ n = 500: ni − s ni 0.05 0.00 -0.05 0 100 200 300 400 500

i �→ a ∗ ni − s ni , b ∗ n = 2000: ni − s ni 0.04 0.02 0.00 -0.02 -0.04 0 500 1000 1500 2000

i �→ a ∗ ni − s ni , b ∗ n = 8000: ni − s ni 0.02 0.01 0.00 -0.01 -0.02 0 2000 4000 6000 8000

Theorem. For any fixed α ∈ (0 , 1), b new − a new ni ni max → 1 , b BJO − a BJO 0 ≤ i ≤ n ni ni while � 2 log log n 0 ≤ i ≤ n ( b BJO − a BJO max ) = (1 + o (1)) , ni ni n ) = O ( n − 1 / 2 ) . 0 ≤ i ≤ n ( b new − a new max ni ni

IV. Bi-Log-Concave Distribution Functions Shape constraint 1: Log-concave density. F has density f = e φ with φ : R → [ −∞ , ∞ ) concave. Shape constraint 2: Bi-log-concave distribution function. Both log( F ) and log(1 − F ) are concave. • Log-concave density = ⇒ bi-log-concave c.d.f. • A bi-log-concave c.d.f. may have arbitrarily many modes!

Theorem. Let J ( F ) := { x ∈ R : 0 < F ( x ) < 1 } � = ∅ . Four equivalent statements: ◮ F bi-log-concave. ◮ F has a density f . On J ( F ), f f f = F ′ > 0 , ց and ր . F 1 − F ◮ F has a bounded density f . On J ( F ), − f 2 ≤ f ′ ≤ f 2 f = F ′ > 0 and F . 1 − F ◮ F has a density f s.t. for arbitrary x ∈ J ( F ) and t ∈ R ,  � f �   ≤ F ( x ) exp F ( x ) · t , F ( x + t ) � � f   ≥ 1 − (1 − F ( x )) exp − 1 − F ( x ) · t .

1.5 − log ( 1 − F ) 1.0 0.5 F 0.0 1 + log ( F ) -0.5 -4 -2 0 2 4

0.4 0.3 f ( 1 − F ) f F 0.2 f = F ' 0.1 0.0 -4 -2 0 2 4

0.2 f 2 F 0.1 f ' 0.0 -0.1 − f 2 ( 1 − F ) -0.2 -4 -2 0 2 4

Confidence Bands for Distribution Functions: The Law of the Iterated - PowerPoint PPT Presentation

Confidence Bands for Distribution Functions: The Law of the Iterated Logarithm and Shape Constraints Lutz Duembgen (Bern) Jon A. Wellner (Seattle) Petro Kolesnyk (Bern) Ralf Wilke (Copenhagen) November 2014 I. The LIL for Brownian Motion and

Lecture 11: Nonparametric Regression (3) Confidence Bands Applied Statistics 2015 1 / 21

THE LISTING PRESENTATION A Natural Close! CONFIDENCE CONFIDENCE CONFIDENCE CONFIDENCE Hi

CS70: Jean Walrand: Lecture 29. Confidence? Confidence? Confidence is essential is many

The New 630m and 2200m Amateur Bands Rudy Severns N6LF, WD2XSH/20 n6lf@arrl.net,

Seaborne Royal Observer Corps From time immemorial when bands of men fought bands of men there

Three Evil Kings GODS FAITHFULNESS IN JUDGMENT 2 Kings 23:2924:9 Jeremiah 22 JOSIAH

Creating Confidence Intervals using Excel 2013 XL8A-V0R XL8A-V0R XL8A-V0R Create Confidence

Creating Confidence Intervals using Excel 2010 5/08/2015 V0M V0M V0M Create Confidence

STAT 113 Confidence Intervals Colin Reimer Dawson Oberlin College October 3, 2017 1 / 51

1. Normal distribution 2. Geometric distribution 3. Binomial distribution 4.

A kernel estimator and associated confidence bands in the Spektor-Lord-Willis problem Bogdan

Institute of Law Institute of Law Institute of Law Institute of Law Law Made Simple

Statement of Ohms Law Circuit diagram of Ohms Law Formula of Ohms Law Ohms law in

More on Functions Thomas Schwarz, SJ Marquette University Functions of Functions Functions

Elementary Functions Part 1, Functions Lecture 1.4a, Symmetries of Functions: Even and Odd

Elementary Functions Part 1, Functions Lecture 1.1b, Functions defined by equations Dr. Ken W.

Law of the iterated logarithm for pure jump L evy processes Elena Shmileva, St.Petersburg

Foundations of Artificial Intelligence 13. Acting under Uncertainty Maximizing Expected Utility

18.175: Lecture 17 Poisson random variables Scott Sheffield MIT 18.175 Lecture 16 1 Outline More

18.175: Lecture 18 Poisson random variables Scott Sheffield MIT 18.175 Lecture 18 1 Outline Extend

Large deviations principles for lacunary sums Nina Gantert joint work, soon to be finished with

Local regularity, multifractal analysis and boundary behavior of harmonic functions Eugenia

Computability, Complexity and Randomness 2016 Permutations of the integers do not induce

A family of self-avoiding walks on the Sierpi nski gasket Takafumi Otsuka Departmant of