
18.175: Lecture 12
DeMoivre-Laplace and weak convergence
Scott Sheffield, MIT

Outline
- DeMoivre-Laplace limit theorem
- Weak convergence
- Characteristic functions

DeMoivre-Laplace limit theorem
- Let $X_i$ be i.i.d. random variables. Write $S_n = \sum_{i=1}^n X_i$.
- Suppose each $X_i$ is 1 with probability $p$ and 0 with probability $q = 1 - p$.
- DeMoivre-Laplace limit theorem: $\lim_{n \to \infty} P\{a \le \frac{S_n - np}{\sqrt{npq}} \le b\} = \Phi(b) - \Phi(a)$.
- Here $\Phi(b) - \Phi(a) = P\{a \le Z \le b\}$ when $Z$ is a standard normal random variable.
- $\frac{S_n - np}{\sqrt{npq}}$ describes the "number of standard deviations that $S_n$ is above or below its mean".
- Proof idea: use binomial coefficients and Stirling's formula.
- Question: Does a similar statement hold if the $X_i$ are i.i.d. from some other law?
- Central limit theorem: Yes, if they have finite variance.
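The limit statement can be checked numerically. The sketch below is illustrative only: it computes the exact binomial probability of the normalized event and compares it with $\Phi(b) - \Phi(a)$. The function names and the choice $p = 0.3$, $a = -1$, $b = 1$ are assumptions made for the example, not part of the lecture.

```python
import math

def binom_pmf(n, p, k):
    """Exact P(S_n = k) when S_n is a sum of n i.i.d. Bernoulli(p) variables."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def std_normal_cdf(x):
    """Phi(x), expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normalized_prob(n, p, a, b):
    """Exact P(a <= (S_n - np)/sqrt(npq) <= b)."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    lo = max(math.ceil(mean + a * sd), 0)   # smallest k inside the window
    hi = min(math.floor(mean + b * sd), n)  # largest k inside the window
    return sum(binom_pmf(n, p, k) for k in range(lo, hi + 1))

# Illustrative parameters (assumed for this example).
p, a, b = 0.3, -1.0, 1.0
target = std_normal_cdf(b) - std_normal_cdf(a)
for n in (10, 100, 1000):
    print(n, normalized_prob(n, p, a, b), target)
```

The exact probabilities fluctuate for small $n$ because $S_n$ is integer valued, and settle near $\Phi(1) - \Phi(-1)$ as $n$ grows.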

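Local $p = 1/2$ DeMoivre-Laplace limit theorem
- Stirling: $n! \sim n^n e^{-n} \sqrt{2\pi n}$, where $\sim$ means the ratio tends to one.
- Theorem: If $2k/\sqrt{2n} \to x$ then $P(S_{2n} = 2k) \sim (\pi n)^{-1/2} e^{-x^2/2}$.
- Here $S_{2n}$ is a sum of $2n$ i.i.d. steps equal to $\pm 1$ with probability $1/2$ each, so $S_{2n}$ is always even.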
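Both asymptotic statements on this slide can be tested directly. The sketch below assumes $S_{2n}$ is a simple random walk with $\pm 1$ steps, so that $P(S_{2n} = 2k) = \binom{2n}{n+k} 2^{-2n}$; the helper names are hypothetical.

```python
import math

def stirling(n):
    """Stirling's approximation n^n e^{-n} sqrt(2 pi n)."""
    return n**n * math.exp(-n) * math.sqrt(2 * math.pi * n)

def walk_pmf(n, k):
    """Exact P(S_{2n} = 2k): n + k of the 2n steps must be +1."""
    return math.comb(2 * n, n + k) / 4**n

def local_approx(n, k):
    """(pi n)^{-1/2} e^{-x^2/2} with x = 2k / sqrt(2n)."""
    x = 2 * k / math.sqrt(2 * n)
    return math.exp(-x * x / 2) / math.sqrt(math.pi * n)

# Ratio n! / Stirling(n) should approach one.
for n in (10, 50, 100):
    print(n, math.factorial(n) / stirling(n))

# Ratio exact / approximation along 2k / sqrt(2n) = 1.
for n, k in ((50, 5), (5000, 50)):
    print(n, k, walk_pmf(n, k) / local_approx(n, k))
```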

Weak convergence
- Let $X$ be a random variable and $X_n$ a sequence of random variables.
- Say $X_n$ converge in distribution or converge in law to $X$ if $\lim_{n \to \infty} F_{X_n}(x) = F_X(x)$ at all $x \in \mathbb{R}$ at which $F_X$ is continuous.
- Also say that the $F_n = F_{X_n}$ converge weakly to $F = F_X$.
- Example: $X_i$ chosen from $\{-1, 1\}$ with i.i.d. fair coin tosses: then $n^{-1/2} \sum_{i=1}^n X_i$ converges in law to a normal random variable (mean zero, variance one) by DeMoivre-Laplace.
- Example: If $X_n$ is equal to $1/n$ a.s. then $X_n$ converge weakly to an $X$ equal to 0 a.s. Note that $\lim_{n \to \infty} F_n(0) \ne F(0)$ in this case, since $F_n(0) = 0$ for every $n$ while $F(0) = 1$. This is allowed because 0 is not a continuity point of $F$.
- Example: If $X_i$ are i.i.d. then the empirical distributions converge a.s. to the law of $X_1$ (Glivenko-Cantelli).
- Example: Let $X_n$ be the $n$th largest of $2n + 1$ points chosen i.i.d. from a fixed law.
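The point-mass example shows why the definition only asks for convergence at continuity points of $F_X$. A minimal sketch, with hypothetical helper names, evaluates the distribution functions of $X_n = 1/n$ and of $X = 0$ at a few points:

```python
def cdf_point_mass(c):
    """Distribution function of a random variable equal to c almost surely."""
    return lambda x: 1.0 if x >= c else 0.0

F = cdf_point_mass(0.0)  # limit law: X = 0 a.s.

def F_n(n):
    """Distribution function of X_n = 1/n a.s."""
    return cdf_point_mass(1.0 / n)

for x in (-0.5, 0.0, 0.01, 0.5):
    values = [F_n(n)(x) for n in (1, 10, 100, 1000)]
    print(x, values, F(x))
```

At every $x \ne 0$ the values $F_n(x)$ eventually agree with $F(x)$. At $x = 0$ they stay at 0 while $F(0) = 1$, which does not contradict weak convergence.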

Convergence results
- Theorem: If $F_n \to F_\infty$, then we can find corresponding random variables $Y_n$ on a common measure space so that $Y_n \to Y_\infty$ almost surely.
- Proof idea: Take $\Omega = (0, 1)$ and $Y_n(x) = \sup\{y : F_n(y) < x\}$ for $x \in \Omega$.
- Theorem: $X_n \Rightarrow X_\infty$ if and only if for every bounded continuous $g$ we have $E g(X_n) \to E g(X_\infty)$.
- Proof idea: Define the $X_n$ on a common sample space so that they converge a.s., then use the bounded convergence theorem.
- Theorem: Suppose $g$ is measurable and its set of discontinuity points has $\mu_{X_\infty}$ measure zero. Then $X_n \Rightarrow X_\infty$ implies $g(X_n) \Rightarrow g(X_\infty)$.
- Proof idea: Define the $X_n$ on a common sample space so that they converge a.s., then use the bounded convergence theorem.
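The coupling in the first proof idea can be made concrete. The sketch below is an illustration under stated assumptions: it takes $F_n$ to be the exponential distribution function with rate $1 + 1/n$ (an example chosen here, not from the lecture), approximates $Y_n(x) = \sup\{y : F_n(y) < x\}$ by bisection on an assumed bracket $[0, 100]$, and checks that $Y_n(x)$ approaches $Y_\infty(x)$ for fixed $x \in (0, 1)$.

```python
import math

def quantile(F, x, lo=0.0, hi=100.0, iters=80):
    """Approximate sup{y : F(y) < x} by bisection, assuming it lies in (lo, hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if F(mid) < x:
            lo = mid  # mid is still in the set {y : F(y) < x}
        else:
            hi = mid
    return lo

def exp_cdf(rate):
    """Distribution function of an exponential law with the given rate."""
    return lambda y: 1.0 - math.exp(-rate * y) if y > 0 else 0.0

F_inf = exp_cdf(1.0)
for x in (0.1, 0.5, 0.9):
    coupled = [quantile(exp_cdf(1.0 + 1.0 / n), x) for n in (1, 10, 1000)]
    print(x, coupled, quantile(F_inf, x))
```

All the $Y_n$ are functions of the same uniform point $x$, which is what places them on a common space.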

Compactness
- Theorem: Every sequence $F_n$ of distribution functions has a subsequence $F_{n(k)}$ converging to a right continuous nondecreasing $F$, so that $\lim_{k \to \infty} F_{n(k)}(y) = F(y)$ at all continuity points of $F$.
- The limit may not be a distribution function.
- Need a "tightness" assumption to make that the case. Say $\mu_n$ are tight if for every $\epsilon > 0$ we can find an $M$ so that $\mu_n([-M, M]) > 1 - \epsilon$ for all $n$. Define tightness analogously for the corresponding real random variables or distribution functions.
- Theorem: Every subsequential limit of the $F_n$ above is the distribution function of a probability measure if and only if the $F_n$ are tight.
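A standard way to see how the limit can fail to be a distribution function is to let mass escape to infinity. The sketch below uses the example $\mu_n = \tfrac12 \delta_0 + \tfrac12 \delta_n$, chosen here for illustration, and reads tightness as: for every $\epsilon > 0$ there is an $M$ with $\mu_n([-M, M]) > 1 - \epsilon$ for all $n$.

```python
def F_n(n, y):
    """Distribution function of mu_n = (1/2) delta_0 + (1/2) delta_n."""
    if y < 0:
        return 0.0
    return 0.5 if y < n else 1.0

def mass_in_window(n, M):
    """mu_n([-M, M]): the atom at 0 always counts, the atom at n only if n <= M."""
    return 0.5 + (0.5 if n <= M else 0.0)

# Pointwise limit of F_n(y) is 1/2 for every y >= 0, so it never reaches 1.
for y in (0.0, 10.0, 1000.0):
    print(y, [F_n(n, y) for n in (1, 100, 10**6)])

# For any fixed M, some mu_n puts only mass 1/2 in [-M, M]: not tight.
for M in (10, 1000):
    print(M, min(mass_in_window(n, M) for n in range(1, 10 * M)))
```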

Total variation norm
- If we have two probability measures $\mu$ and $\nu$, we define the total variation distance between them to be $\|\mu - \nu\| := \sup_B |\mu(B) - \nu(B)|$.
- Intuitively, if two measures are close in the total variation sense, then (most of the time) a sample from one measure looks like a sample from the other.
- Convergence in total variation norm is much stronger than weak convergence.
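For measures on a finite set the supremum can be computed by brute force, and it equals half the $\ell^1$ distance between the probability mass functions. The sketch below checks this on two small hand-picked distributions (assumed for the example) and then shows that point masses at $1/n$ stay at total variation distance 1 from the point mass at 0, even though they converge weakly.

```python
from itertools import combinations

def tv_bruteforce(mu, nu):
    """sup over all subsets B of |mu(B) - nu(B)| for pmfs given as dicts."""
    points = sorted(set(mu) | set(nu))
    best = 0.0
    for r in range(len(points) + 1):
        for B in combinations(points, r):
            diff = abs(sum(mu.get(x, 0.0) for x in B)
                       - sum(nu.get(x, 0.0) for x in B))
            best = max(best, diff)
    return best

def tv_half_l1(mu, nu):
    """Half the l1 distance between the pmfs; equals the supremum above."""
    points = set(mu) | set(nu)
    return 0.5 * sum(abs(mu.get(x, 0.0) - nu.get(x, 0.0)) for x in points)

mu = {0: 0.5, 1: 0.3, 2: 0.2}
nu = {0: 0.2, 1: 0.3, 2: 0.4, 3: 0.1}
print(tv_bruteforce(mu, nu), tv_half_l1(mu, nu))

# Point masses at 1/n versus the point mass at 0.
for n in (1, 10, 1000):
    print(n, tv_half_l1({1.0 / n: 1.0}, {0.0: 1.0}))
```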

Characteristic functions
- Let $X$ be a random variable.
- The characteristic function of $X$ is defined by $\phi(t) = \phi_X(t) := E[e^{itX}]$. Like $M(t)$ except with $i$ thrown in.
- Recall that by definition $e^{it} = \cos(t) + i \sin(t)$.
- Characteristic functions are similar to moment generating functions in some ways.
- For example, $\phi_{X+Y} = \phi_X \phi_Y$, just as $M_{X+Y} = M_X M_Y$, if $X$ and $Y$ are independent.
- And $\phi_{aX}(t) = \phi_X(at)$ just as $M_{aX}(t) = M_X(at)$.
- And if $X$ has an $m$th moment then $\phi_X^{(m)}(0) = i^m E[X^m]$, that is, $E[X^m] = i^{-m} \phi_X^{(m)}(0)$.
- But characteristic functions have an advantage: they are well defined at all $t$ for all random variables $X$.
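These properties are easy to verify for finitely supported random variables. The sketch below represents a law as a dict from values to probabilities (a representation chosen for this example) and checks multiplicativity under independent sums, the scaling rule, and the first-moment identity in the form $\phi_X'(0) = i E[X]$, using a central difference for the derivative.

```python
import cmath

def char_fn(pmf, t):
    """phi_X(t) = E[exp(i t X)] for a finitely supported law."""
    return sum(prob * cmath.exp(1j * t * x) for x, prob in pmf.items())

def convolve(pmf_x, pmf_y):
    """Law of X + Y when X and Y are independent."""
    out = {}
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            out[x + y] = out.get(x + y, 0.0) + px * py
    return out

X = {0: 0.7, 1: 0.3}    # Bernoulli(0.3)
Y = {-1: 0.5, 1: 0.5}   # fair +/-1 coin
t, a = 0.8, 2.5

# phi_{X+Y}(t) versus phi_X(t) * phi_Y(t)
lhs = char_fn(convolve(X, Y), t)
rhs = char_fn(X, t) * char_fn(Y, t)
print(lhs, rhs)

# phi_{aX}(t) versus phi_X(a t)
aX = {a * x: prob for x, prob in X.items()}
print(char_fn(aX, t), char_fn(X, a * t))

# E[X] recovered from phi'(0) = i E[X]
h = 1e-5
deriv = (char_fn(X, h) - char_fn(X, -h)) / (2 * h)
mean_from_phi = (deriv / 1j).real
print(mean_from_phi)
```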

Continuity theorems
- Lévy's continuity theorem: if $\lim_{n \to \infty} \phi_{X_n}(t) = \phi_X(t)$ for all $t$, then $X_n$ converge in law to $X$.
- By this theorem, we can prove the weak law of large numbers by showing $\lim_{n \to \infty} \phi_{A_n}(t) = \phi_\mu(t) = e^{it\mu}$ for all $t$, where $A_n = n^{-1} \sum_{i=1}^n X_i$ is the average of i.i.d. $X_i$ with mean $\mu$. In the special case that $\mu = 0$, this amounts to showing $\lim_{n \to \infty} \phi_{A_n}(t) = 1$ for all $t$.
- Moment generating analog: if moment generating functions $M_{X_n}(t)$ are defined for all $t$ and $n$ and $\lim_{n \to \infty} M_{X_n}(t) = M_X(t)$ for all $t$, then $X_n$ converge in law to $X$.
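The weak law argument can be watched numerically in a simple case. The sketch below assumes $A_n$ denotes the average of $n$ i.i.d. Bernoulli($p$) variables, so that $\phi_{A_n}(t) = \phi_X(t/n)^n$ by the product and scaling rules, and compares it with $e^{it\mu}$ for $\mu = p$. The parameter values are illustrative.

```python
import cmath

def phi_bernoulli(p, t):
    """Characteristic function of a Bernoulli(p) variable: q + p e^{it}."""
    return (1 - p) + p * cmath.exp(1j * t)

def phi_average(p, n, t):
    """phi_{A_n}(t) = phi_X(t/n)^n for the average A_n of n i.i.d. copies."""
    return phi_bernoulli(p, t / n) ** n

p, t = 0.3, 2.0
limit = cmath.exp(1j * t * p)  # e^{i t mu} with mu = p
for n in (1, 10, 100, 10_000, 1_000_000):
    print(n, abs(phi_average(p, n, t) - limit))
```

The printed gap shrinks roughly like $1/n$, consistent with the second-order term $t^2 pq / (2n)$ in the expansion of $\phi_X(t/n)^n$.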

MIT OpenCourseWare http://ocw.mit.edu 18.175 Theory of Probability Spring 2014 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
