
The Central Limit Theorem: More of the Story. Steven Janke, November 2015 (PowerPoint presentation).



  1. The Central Limit Theorem: More of the Story. Steven Janke, November 2015.

  2. Central Limit Theorem

     Theorem (Central Limit Theorem). Let $X_1, X_2, \ldots$ be a sequence of independent and identically distributed random variables, each with expectation $\mu$ and variance $\sigma^2$. Then the distribution of
     $$Z_n = \frac{X_1 + X_2 + \cdots + X_n - n\mu}{\sigma\sqrt{n}}$$
     converges to the distribution of a standard normal random variable:
     $$\lim_{n \to \infty} P(Z_n \le x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2}\, dy.$$
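As a quick numerical companion to the theorem statement, here is a minimal simulation sketch (assuming NumPy is available; the exponential(1) choice for the $X_i$ is purely illustrative, so $\mu = \sigma = 1$). It compares the empirical distribution of $Z_n$ with the standard normal CDF at a few points.

```python
# Minimal sketch: standardized sums of i.i.d. exponential(1) variables
# (mu = 1, sigma = 1) compared with the standard normal CDF.
from math import erf, sqrt
import numpy as np

rng = np.random.default_rng(0)
n, reps = 500, 20_000
x = rng.exponential(scale=1.0, size=(reps, n))      # X_1, ..., X_n per replicate
z = (x.sum(axis=1) - n * 1.0) / (1.0 * sqrt(n))     # Z_n = (S_n - n*mu)/(sigma*sqrt(n))

std_normal_cdf = lambda t: 0.5 * (1 + erf(t / sqrt(2)))
for t in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(t, (z <= t).mean(), std_normal_cdf(t))    # empirical vs. limiting CDF
```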

  3. Central Limit Theorem: Applications

     The sampling distribution of the mean is approximately normal. The distribution of experimental errors is approximately normal. But why the normal distribution?

  4. Benford's Law

     In an arbitrary table of data, such as populations or lake areas,
     $$P[\text{Leading digit is } d] = \log_{10}\left(1 + \frac{1}{d}\right).$$

     Data: List of 60 Tallest Buildings

     Lead Digit   Meters   Feet    Benford
     1            0.433    0.300   0.301
     2            0.117    0.133   0.176
     3            0.150    0.133   0.125
     4            0.100    0.100   0.097
     5            0.067    0.167   0.079
     6            0.017    0.083   0.067
     7            0.033    0.033   0.058
     8            0.083    0.017   0.051
     9            0.000    0.033   0.046
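A small sketch of how one might check a data column against Benford's law. The helper names and the sample `heights` values are placeholders, not the actual 60-building data set behind the table above.

```python
# Compare leading-digit frequencies of a data list with Benford's law,
# P[leading digit is d] = log10(1 + 1/d).
import math
from collections import Counter

def leading_digit(x: float) -> int:
    return int(f"{abs(x):e}"[0])          # first digit via scientific notation

def benford_table(data) -> None:
    counts = Counter(leading_digit(v) for v in data)
    for d in range(1, 10):
        observed = counts.get(d, 0) / len(data)
        expected = math.log10(1 + 1 / d)
        print(d, round(observed, 3), round(expected, 3))

# Placeholder heights in meters; substitute the real 60-building list.
benford_table([828, 632, 601, 599, 555, 541, 530, 509, 492, 484])
```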

  5. Benford Justification

     Simon Newcomb (1881); Frank Benford (1938).

     "Proof" arguments:
     - Positional number system
     - Densities
     - Scale invariance
     - Scale and base unbiased (Hill 1995)

  6. Central Limit Theorem

     Theorem (Central Limit Theorem). Let $X_1, X_2, \ldots$ be a sequence of independent and identically distributed random variables, each with expectation $\mu$ and variance $\sigma^2$. Then the distribution of
     $$Z_n = \frac{X_1 + X_2 + \cdots + X_n - n\mu}{\sigma\sqrt{n}}$$
     converges to the distribution of a standard normal random variable:
     $$\lim_{n \to \infty} P(Z_n \le x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2}\, dy.$$

  7. Central Limit Theorem Proof

     Proof sketch: Let $Y_i = X_i - \mu$. The moment generating function of $Y_i$ is $M_{Y_i}(t) = E e^{t Y_i}$. The MGF of $Z_n$ is
     $$M_{Z_n}(t) = \left[ M_{Y_1}\!\left(\frac{t}{\sigma\sqrt{n}}\right) \right]^n, \qquad \lim_{n \to \infty} \ln M_{Z_n}(t) = \frac{t^2}{2}.$$
     The MGF of the standard normal is $e^{t^2/2}$. Since the MGFs converge, the distributions converge (Lévy Continuity Theorem).
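To see the proof sketch in action for one concrete case (an assumption made here, not part of the slide): if $X_i \sim \text{Exp}(1)$, then $\mu = \sigma = 1$, $Y = X - 1$, and $M_Y(t) = e^{-t}/(1-t)$ for $t < 1$, so $\ln M_{Z_n}(t) = n \ln M_Y(t/\sqrt{n})$ should approach $t^2/2$.

```python
# Numerical check that ln M_{Z_n}(t) -> t^2/2 for X_i ~ Exp(1).
import math

def log_mgf_Zn(t: float, n: int) -> float:
    s = t / math.sqrt(n)                  # argument t/(sigma*sqrt(n)) with sigma = 1
    return n * (-s - math.log(1 - s))     # n * ln M_Y(s), where M_Y(s) = e^{-s}/(1-s)

t = 0.7
for n in (10, 100, 1_000, 10_000):
    print(n, log_mgf_Zn(t, n), t**2 / 2)  # the two columns converge
```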

  8. Counter-Examples

     - Moment problem: a lognormal random variable is not determined by its moments.
     - No first moment: a Cauchy random variable has no MGF and $E|X| = \infty$, so the CLT does not hold (see the simulation below).
     - No second moment: $f(x) = \dfrac{1}{|x|^3}$ for $|x| \ge 1$.
     - Pairwise independence is not sufficient for the CLT.
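A small simulation sketch of the Cauchy counter-example (NumPy assumed): because the mean of $n$ i.i.d. standard Cauchy variables is again standard Cauchy, the spread of the sample mean does not shrink as $n$ grows.

```python
# The interquartile range of the Cauchy sample mean stays near 2 for every n.
import numpy as np

rng = np.random.default_rng(1)
for n in (10, 100, 1_000):
    means = rng.standard_cauchy(size=(5_000, n)).mean(axis=1)
    q1, q3 = np.percentile(means, [25, 75])
    print(n, round(q3 - q1, 3))           # no concentration, unlike the CLT setting
```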

  9. De Moivre's Theorem (1733)

     Each $X_i$ is Bernoulli (0 or 1), and $S_n = X_1 + \cdots + X_n$.
     $$b(k) = P[S_n = k] = \binom{n}{k} \frac{1}{2^n}$$
     Stirling's formula, $n! \approx \sqrt{2\pi n}\, n^n e^{-n}$, gives
     $$b(n/2) \approx \sqrt{\frac{2}{\pi n}}, \qquad \log\frac{b(n/2 + d)}{b(n/2)} \approx -\frac{2d^2}{n}, \qquad b(n/2 + d) \approx \sqrt{\frac{2}{\pi n}}\, e^{-2d^2/n}.$$
     $$\lim_{n \to \infty} P\left[a \le \frac{S_n - n/2}{\sqrt{n}/2} \le b\right] = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-x^2/2}\, dx$$
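A short check of De Moivre's local approximation against exact symmetric binomial probabilities (the choice $n = 1000$ is just for illustration).

```python
# b(n/2 + d) = C(n, n/2 + d)/2^n versus sqrt(2/(pi*n)) * exp(-2 d^2 / n).
import math

def exact(n: int, d: int) -> float:
    return math.comb(n, n // 2 + d) / 2**n

def approx(n: int, d: int) -> float:
    return math.sqrt(2 / (math.pi * n)) * math.exp(-2 * d * d / n)

n = 1000
for d in (0, 5, 10, 20):
    print(d, exact(n, d), approx(n, d))   # the approximation tracks the exact value
```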

  10. Laplace (1810)

      Dealt with the independent and identically distributed case, starting with discrete variables. Consider $X_i$ with $p_k = P[X_i = k/m]$ for $k = -m, -m+1, \ldots, m-1, m$.
      Generating function: $T(t) = \sum_{k=-m}^{m} p_k t^k$.
      $q_j = P[\sum X_i = j/m]$ is the coefficient of $t^j$ in $T(t)^n$.
      Substitute $e^{ix}$ for $t$ and recall that $\frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-itx} e^{isx}\, dx = \delta_{ts}$. Then
      $$q_j = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ijx} \left[ \sum_{k=-m}^{m} p_k e^{ikx} \right]^n dx.$$
      Now expand $e^{ikx}$ in a power series around 0 and use the fact that the mean of $X_i$ is zero.
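Here is a numerical sketch of Laplace's device (NumPy assumed; the probability vector `p`, $m = 2$, and $n = 3$ are made-up illustrations). It recovers $q_j$ both as a coefficient of $T(t)^n$ and via the inversion integral.

```python
# q_j as a coefficient of T(t)^n versus the integral
# (1/2pi) * int_{-pi}^{pi} e^{-ijx} T(e^{ix})^n dx.
import numpy as np

m, n = 2, 3
p = np.array([0.1, 0.2, 0.4, 0.2, 0.1])        # p_k for k = -m, ..., m (sums to 1)

coeffs = np.array([1.0])
for _ in range(n):
    coeffs = np.convolve(coeffs, p)            # exponents now run from -n*m to n*m

j = 1
q_coefficient = coeffs[j + n * m]              # coefficient of t^j in T(t)^n

# Riemann sum over one full period of the (periodic) integrand.
x = np.linspace(-np.pi, np.pi, 20_000, endpoint=False)
T = sum(p[k + m] * np.exp(1j * k * x) for k in range(-m, m + 1))
q_integral = (np.exp(-1j * j * x) * T**n).sum().real * (x[1] - x[0]) / (2 * np.pi)

print(q_coefficient, q_integral)               # the two values agree
```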

  11. Why Normal?

      The normal characteristic function:
      $$f(u) = \frac{1}{\sqrt{2\pi}} \int e^{iux} e^{-x^2/2}\, dx = e^{-u^2/2}$$
      With $f$ now the characteristic function of the $X_i$ (mean 0, variance $\sigma^2$),
      $$f_{S_n/\sigma\sqrt{n}}(u) = E\left[e^{iu(S_n/\sigma\sqrt{n})}\right] = \left(f\!\left(\tfrac{u}{\sigma\sqrt{n}}\right)\right)^n = \left(1 - \frac{\sigma^2 u^2}{2\sigma^2 n} + o\!\left(\tfrac{u^2}{n}\right)\right)^n = \left(1 - \frac{u^2}{2n} + o\!\left(\tfrac{u^2}{n}\right)\right)^n \to e^{-u^2/2}.$$
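A numerical illustration of the characteristic-function argument for one assumed case: $X_i$ uniform on $[-1, 1]$, so the characteristic function is $\sin(u)/u$ and $\sigma^2 = 1/3$.

```python
# (phi(u/(sigma*sqrt(n))))^n should approach exp(-u^2/2).
import numpy as np

sigma = np.sqrt(1 / 3)
phi = lambda u: np.sinc(u / np.pi)        # np.sinc(x) = sin(pi*x)/(pi*x), i.e. sin(u)/u here

u = 1.5
for n in (5, 50, 500, 5_000):
    print(n, phi(u / (sigma * np.sqrt(n)))**n, np.exp(-u**2 / 2))
```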

  12. Lévy Continuity Theorem

      If distribution functions $F_n$ converge to $F$, then the corresponding characteristic functions $f_n$ converge to $f$. Conversely, if $f_n$ converges to a function $g$ continuous at 0, then $F_n$ converges to a distribution $F$ (whose characteristic function is $g$).

      Proof sketch: The first direction is the Helly-Bray theorem. The set $\{e^{iux}\}$ is a separating set for distribution functions. In both directions, continuity points and the mass of the $F_n$ are critical.

  13. History

      - Laplace never presented a general CLT statement (he was concerned with limiting probabilities for particular problems).
      - Concern over convergence led Poisson to improvements (the not-identically-distributed case).
      - Dirichlet and Cauchy changed the conception of analysis (epsilon/delta arguments).
      - Counter-examples uncovered limitations.
      - Chebyshev proved the CLT using convergence of moments (Markov and Liapounov were his students).
      - First rigorous proof: Liapounov (1900). The CLT holds for independent (but not necessarily identically distributed) $X_i$ if
        $$\frac{\sum E|X_j|^3}{\left[\sum E X_j^2\right]^{3/2}} = \frac{\sum E|X_j|^3}{s_n^3} \to 0$$
        (a numerical sketch of this condition follows below).
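To make Liapounov's condition concrete, here is a sketch for one assumed independent (non-identical) sequence, $X_j$ uniform on $[-\sqrt{j}, \sqrt{j}]$, where $E|X_j|^3 = j^{3/2}/4$ and $E X_j^2 = j/3$; the ratio tends to 0.

```python
# Liapounov ratio  sum E|X_j|^3 / s_n^3  for X_j ~ Uniform[-sqrt(j), sqrt(j)].
import numpy as np

for n in (10, 100, 1_000, 10_000):
    j = np.arange(1, n + 1)
    third_moments = j**1.5 / 4            # E|X_j|^3 for uniform[-a, a] with a = sqrt(j)
    s_n = np.sqrt((j / 3).sum())          # s_n^2 = sum of the variances a_j^2 / 3
    print(n, third_moments.sum() / s_n**3)
```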

  14. Liapounov Proof

      Assume $\dfrac{\sum E|X_j|^3}{\left[\sum E X_j^2\right]^{3/2}} = \dfrac{\sum E|X_j|^3}{s_n^3} \to 0$.

      $$g_n(u) = \prod_{k=1}^{n} f_k(u/s_n) = \prod_{k=1}^{n} \left[1 + (f_k(u/s_n) - 1)\right]$$
      $$f_k(u/s_n) = 1 - \frac{u^2}{2 s_n^2}\left(\sigma_k^2 + \delta_k(u/s_n)\right)$$
      $$|f_k(u/s_n) - 1| \le \frac{u^2 \sigma_k^2}{2 s_n^2} \;\Rightarrow\; \sum_k |f_k(u/s_n) - 1| \le \frac{u^2}{2}$$
      $$\sigma_k^3 = \left(E X_k^2\right)^{3/2} \le E|X_k|^3 \;\Rightarrow\; \sup_{k \le n} \frac{\sigma_k}{s_n} \to 0 \;\Rightarrow\; \sup_{k \le n} |f_k(u/s_n) - 1| \to 0$$
      Use $\log(1+z) = z(1 + \theta z)$, where $|\theta| \le 1$ for $|z| \le \tfrac{1}{2}$:
      $$\log g_n(u) = \sum_{k=1}^{n} (f_k(u/s_n) - 1) + \theta \sum_{k=1}^{n} (f_k(u/s_n) - 1)^2$$

  15. Liapounov Proof, Continued

      $$\log g_n(u) = \sum_{k=1}^{n} (f_k(u/s_n) - 1) + \theta \sum_{k=1}^{n} (f_k(u/s_n) - 1)^2$$
      $$\left| \theta \sum_{k=1}^{n} (f_k(u/s_n) - 1)^2 \right| \le \sup_k |f_k(u/s_n) - 1| \cdot \sum_{k=1}^{n} |f_k(u/s_n) - 1| \to 0$$
      $$f_k(u/s_n) - 1 = -\frac{u^2 \sigma_k^2}{2 s_n^2} + \theta_k \frac{u^3}{s_n^3} E|X_k|^3$$
      $$\sum_{k=1}^{n} (f_k(u/s_n) - 1) = -\frac{u^2}{2} + \theta\, \frac{u^3 \sum_{k=1}^{n} E|X_k|^3}{s_n^3} \to -\frac{u^2}{2}$$

  16. Lindeberg (1922)

      Theorem (Central Limit Theorem). Let the variables $X_i$ be independent with $E X_i = 0$ and $E X_i^2 = \sigma_i^2$. Let $s_n$ be the standard deviation of the sum $S_n$, let $F$ be the distribution of $S_n / s_n$, and let $F_k$ be the distribution of $X_k$. With $\Phi(x)$ the normal distribution, if
      $$\frac{1}{s_n^2} \sum_k \int_{|x| \ge \epsilon s_n} x^2\, dF_k \to 0,$$
      then
      $$\sup_x |F(x) - \Phi(x)| \le 5\epsilon.$$
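A sketch of checking Lindeberg's condition for the same assumed sequence as above, $X_k$ uniform on $[-\sqrt{k}, \sqrt{k}]$: for a uniform $[-a, a]$ variable, $\int_{|x| \ge c} x^2\, dF = (a^3 - c^3)/(3a)$ when $c < a$ and $0$ otherwise, so the normalized sum shrinks and eventually equals 0 once every support lies inside $(-\epsilon s_n, \epsilon s_n)$.

```python
# Lindeberg sum (1/s_n^2) * sum_k int_{|x| >= eps*s_n} x^2 dF_k for uniform X_k.
import numpy as np

def lindeberg_sum(n: int, eps: float) -> float:
    a = np.sqrt(np.arange(1, n + 1))                       # half-widths of the supports
    s_n = np.sqrt((a**2 / 3).sum())
    c = eps * s_n
    tails = np.where(c < a, (a**3 - c**3) / (3 * a), 0.0)  # exact truncated second moments
    return tails.sum() / s_n**2

for n in (10, 100, 500, 1_000):
    print(n, lindeberg_sum(n, eps=0.1))
```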

  17. Lindeberg Proof

      Pick an auxiliary function $f$. For an arbitrary distribution $V$, define $F(x) = \int f(x - t)\, dV(t)$. With $\phi$ the normal density, define $\Psi(x) = \int f(x - t)\, \phi(t)\, dt$. A Taylor expansion of $f$ to third order gives
      $$|F(x) - \Psi(x)| < k \int |x|^3\, dV(x).$$
      With $U_i$ the distribution of $X_i$, set
      $$F_1(x) = \int f(x - t)\, dU_1(t), \quad \ldots, \quad F_n(x) = \int F_{n-1}(x - t)\, dU_n(t).$$
      Note that $U(x) = \int \cdots \int U_1(x - t_2 - \cdots - t_n)\, dU_2(t_2) \cdots dU_n(t_n)$, where $U$ is the distribution of the sum. By selecting $f$ carefully,
      $$|U(x) - \Phi(x)| < 3 \left( \sum_{i=1}^{n} \int |x|^3\, dU_i(x) \right)^{1/4}.$$

  18. Still: Why Normal?

      Let $X = X_1 + X_2$, where $X$ is $N(0,1)$ and $X_1$ is independent of $X_2$. Then
      $$f(u) = f_1(u) f_2(u) = e^{-u^2/2}.$$
      $e^{-u^2/2}$ is an entire, non-vanishing function and $|f_1(z)| \le e^{c|z|^2}$, so the Hadamard factorization theorem implies $\log f_1(u)$ is a polynomial in $u$ of degree at most 2. Since $f_1$ is a characteristic function, $f_1(0) = 1$, $f_1(u) = \overline{f_1(-u)}$, and it is bounded. Hence $\log f_1(u) = iua + bu^2$, which is the general form of the normal characteristic function.

  19. Feller-Lévy (1935)

      Theorem (Final Central Limit Theorem). Let the variables $X_i$ be independent with $E X_i = 0$ and $E X_i^2 = \sigma_i^2$. Let $S_n = \sum_{1}^{n} X_i$ and $s_n^2 = \sum_{1}^{n} \sigma_k^2$. Let $\Phi$ be the normal distribution and $F_k$ the distribution of $X_k$. Then as $n \to \infty$,
      $$P[S_n / s_n \le x] \to \Phi(x) \quad \text{and} \quad \max_{k \le n} \frac{\sigma_k}{s_n} \to 0$$
      if and only if for every $\epsilon > 0$
      $$\frac{1}{s_n^2} \sum_k \int_{|x| \ge \epsilon s_n} x^2\, dF_k \to 0.$$
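Finally, a sketch of why the negligibility condition $\max_k \sigma_k / s_n \to 0$ matters, under an assumed sequence not taken from the slides: independent normal $X_k$ with $\sigma_k^2 = 2^k$. The last term never becomes negligible, so $\max_k \sigma_k / s_n$ stays near $1/\sqrt{2}$ and the Lindeberg sum does not vanish, even though $S_n / s_n$ happens to be exactly standard normal in this case.

```python
# max_k sigma_k / s_n and the Lindeberg sum for sigma_k^2 = 2^k (normal X_k).
import math

def normal_tail_second_moment(sigma: float, c: float) -> float:
    # E[X^2 ; |X| >= c] for X ~ N(0, sigma^2)
    u = c / sigma
    phi = math.exp(-u * u / 2) / math.sqrt(2 * math.pi)
    upper_tail = 0.5 * (1 - math.erf(u / math.sqrt(2)))
    return sigma**2 * 2 * (u * phi + upper_tail)

eps = 0.1
for n in (5, 10, 20, 40):
    sigmas = [math.sqrt(2.0**k) for k in range(1, n + 1)]
    s_n = math.sqrt(sum(s * s for s in sigmas))
    lindeberg = sum(normal_tail_second_moment(s, eps * s_n) for s in sigmas) / s_n**2
    print(n, max(sigmas) / s_n, lindeberg)    # neither quantity tends to 0
```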

