
Statistics: Asymptotic Theory. Shiu-Sheng Chen, Department of Economics, National Taiwan University - PowerPoint PPT Presentation



  1. Statistics: Asymptotic Theory. Shiu-Sheng Chen, Department of Economics, National Taiwan University, Fall 2019. [Shiu-Sheng Chen (NTU Econ), Statistics, Fall 2019, slide 1 of 28]

  2. Asymptotic Theory: Motivation. Asymptotic theory (or large-sample theory) aims at answering the question: what happens as we gather more and more data? In particular, given a random sample {X_1, X_2, X_3, …, X_n} and a statistic T_n = t(X_1, X_2, …, X_n), what is the limiting behavior of T_n as n → ∞?

  3. Asymptotic Theory: Motivation. Why ask such a question? For instance, given a random sample {X_i}_{i=1}^n ∼ i.i.d. N(µ, σ²), we know that X̄_n ∼ N(µ, σ²/n). However, if {X_i}_{i=1}^n ∼ i.i.d. (µ, σ²) without the normality assumption, what is the distribution of X̄_n? We simply do not know. Is it possible to find a good approximation of the distribution of X̄_n as n → ∞? Yes! This is where asymptotic theory kicks in.

  4. Section 1: Preliminary Knowledge.

  5. Preliminary Knowledge. Topics: limits of real sequences, the Markov Inequality, and the Chebyshev Inequality.

  6. Preliminary Knowledge: Limit of a Real Sequence. Definition (Limit): if for every ε > 0 there exists an integer N(ε) such that |b_n − b| < ε for all n > N(ε), then we say that the sequence of real numbers {b_n} converges to the limit b, denoted lim_{n→∞} b_n = b.

  7. Preliminary Knowledge: Markov Inequality. Theorem (Markov Inequality): suppose that X is a random variable such that P(X ≥ 0) = 1. Then for every real number m > 0, P(X ≥ m) ≤ E(X)/m.
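The bound is easy to check by simulation. A minimal Python sketch (not part of the slides; the Exponential(1) variable, which has E(X) = 1, is chosen purely as an illustration):

```python
import random

random.seed(0)

# X ~ Exponential(1) is nonnegative with E(X) = 1.
n = 100_000
xs = [random.expovariate(1.0) for _ in range(n)]

m = 2.0
empirical = sum(x >= m for x in xs) / n   # estimate of P(X >= m)
markov_bound = (sum(xs) / n) / m          # E(X)/m, estimated from the sample

# Markov: the empirical tail probability never exceeds E(X)/m.
print(empirical, markov_bound)
```

Here the true tail probability is e^(−2) ≈ 0.135, well below the bound E(X)/m = 0.5; Markov's bound is valid but often loose.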

  8. Preliminary Knowledge: Chebyshev Inequality. Theorem (Chebyshev Inequality): let Y have mean E(Y) and variance Var(Y). Then for every number ε > 0, P(|Y − E(Y)| ≥ ε) ≤ Var(Y)/ε². Proof: let X = [Y − E(Y)]²; then P(X ≥ 0) = 1 and E(X) = Var(Y), so the result follows by applying the Markov Inequality with m = ε².
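As an illustration (not from the slides), a Python sketch checking Chebyshev's bound for a Uniform(0, 1) variable, where E(Y) = 0.5 and Var(Y) = 1/12:

```python
import random

random.seed(1)

# Y ~ Uniform(0, 1): E(Y) = 0.5, Var(Y) = 1/12.
n = 100_000
ys = [random.random() for _ in range(n)]

eps = 0.4
empirical = sum(abs(y - 0.5) >= eps for y in ys) / n  # estimate of P(|Y - E(Y)| >= eps)
chebyshev_bound = (1 / 12) / eps**2                   # Var(Y)/eps^2

print(empirical, chebyshev_bound)
```

The true probability here is 0.2, comfortably below the Chebyshev bound of roughly 0.52.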

  9. Section 2: Modes of Convergence.

  10. Modes of Convergence: Types of Convergence. For a sequence of random variables, we consider three modes of convergence: convergence in probability, convergence in distribution, and convergence in mean square.

  11. Modes of Convergence: Convergence in Probability. Definition (Converge in Probability): let {Y_n} be a sequence of random variables and let Y be another random variable. If for every ε > 0, P(|Y_n − Y| < ε) → 1 as n → ∞, then we say that Y_n converges in probability to Y, denoted Y_n →p Y. Equivalently, P(|Y_n − Y| ≥ ε) → 0 as n → ∞.

  12. Modes of Convergence: Convergence in Probability. Example: {X_i}_{i=1}^n ∼ i.i.d. Bernoulli(0.5), and compute Y_n = X̄_n = (∑_i X_i)/n. In this case, Y_n →p 0.5. [Figure: running sample mean over 1,000 coin tosses, converging to 0.5.]
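The coin-toss figure can be reproduced in a few lines of Python (a sketch, not the original code behind the slide):

```python
import random

random.seed(42)

# Running sample mean Y_n = (X_1 + ... + X_n)/n of i.i.d. Bernoulli(0.5) tosses.
n = 1000
total = 0
running_mean = []
for i in range(1, n + 1):
    total += random.randint(0, 1)
    running_mean.append(total / i)

# Early values wander, but the mean settles near 0.5 as n grows.
print(running_mean[9], running_mean[99], running_mean[-1])
```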

  13. Modes of Convergence: Convergence in Distribution. Definition (Converge in Distribution): let {Y_n} be a sequence of random variables with distribution functions F_{Y_n}(y) (denoted F_n(y) for simplicity), and let Y be another random variable with distribution function F_Y(y). If lim_{n→∞} F_n(y) = F_Y(y) at all y for which F_Y(y) is continuous, then we say that Y_n converges in distribution to Y, denoted Y_n →d Y. F_Y(y) is called the limiting distribution of Y_n.

  14. Modes of Convergence: Convergence in Mean Square. Definition (Converge in Mean Square): let {Y_n} be a sequence of random variables and let Y be another random variable. If E(Y_n − Y)² → 0 as n → ∞, then we say that Y_n converges in mean square to Y, denoted Y_n →ms Y. It is also called convergence in quadratic mean.

  15. Section 3: Important Theorems.

  16. Important Theorems. Theorem: Y_n →ms c if and only if lim_{n→∞} E(Y_n) = c and lim_{n→∞} Var(Y_n) = 0. Proof: it can be shown that E(Y_n − c)² = E([Y_n − E(Y_n)]²) + [E(Y_n) − c]² = Var(Y_n) + [E(Y_n) − c]², and both terms vanish exactly when the two limit conditions hold.
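The decomposition used in the proof, E(Y − c)² = Var(Y) + [E(Y) − c]², also holds exactly when sample moments stand in for population ones; a quick Python check (illustrative only):

```python
import random
import statistics

random.seed(7)

# Check E(Y - c)^2 = Var(Y) + (E(Y) - c)^2, with sample moments
# replacing population ones (the identity is then exact up to rounding).
c = 1.0
ys = [random.gauss(1.2, 0.5) for _ in range(50_000)]

mean_y = statistics.fmean(ys)
var_y = statistics.pvariance(ys, mu=mean_y)          # mean squared deviation about mean_y
mse = statistics.fmean((y - c) ** 2 for y in ys)     # mean squared deviation about c

print(mse, var_y + (mean_y - c) ** 2)
```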

  17. Important Theorems. Theorem: if Y_n →ms Y then Y_n →p Y. Proof: note that P(|Y_n − Y|² ≥ 0) = 1, and by the Markov Inequality, P(|Y_n − Y| ≥ k) = P(|Y_n − Y|² ≥ k²) ≤ E(|Y_n − Y|²)/k², which tends to 0 by mean-square convergence.

  18. Important Theorems: Weak Law of Large Numbers (WLLN). Theorem (WLLN): given a random sample {X_i}_{i=1}^n with σ² = Var(X_1) < ∞, let X̄_n denote the sample mean, and note that E(X̄_n) = E(X_1) = µ. Then X̄_n →p µ. Proof: (1) by the Chebyshev Inequality, or (2) by convergence in mean square. The sample mean X̄_n gets closer (in the probability sense) to the population mean µ as the sample size increases. That is, if we use X̄_n as a guess of the unknown µ, the sample mean makes a good guess.
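The WLLN can be seen numerically: P(|X̄_n − µ| ≥ ε) shrinks as n grows. A Python sketch (illustrative; the Exponential distribution with µ = 2 is an arbitrary choice):

```python
import random
import statistics

random.seed(3)

# Estimate P(|Xbar_n - mu| >= eps) for increasing n, with X ~ Exponential(mean 2).
mu, eps, trials = 2.0, 0.2, 500

def sample_mean(n):
    return statistics.fmean(random.expovariate(1 / mu) for _ in range(n))

freqs = {}
for n in (10, 100, 1000):
    freqs[n] = sum(abs(sample_mean(n) - mu) >= eps for _ in range(trials)) / trials

# The frequency of large deviations falls toward 0 as n increases.
print(freqs)
```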

  19. Important Theorems: WLLN for Other Moments. Note that the WLLN can be thought of as (∑_{i=1}^n X_i)/n = (X_1 + X_2 + ⋯ + X_n)/n →p E(X_1). Let Y = X², and by the WLLN, (∑_{i=1}^n Y_i)/n = (Y_1 + Y_2 + ⋯ + Y_n)/n →p E(Y_1). Hence, (∑_{i=1}^n X_i²)/n = (X_1² + X_2² + ⋯ + X_n²)/n →p E(X_1²).
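For instance (an illustration, not from the slides), with X ∼ N(0, 1) we have E(X²) = Var(X) + [E(X)]² = 1, and the sample mean of squares converges there:

```python
import random
import statistics

random.seed(11)

# WLLN applied to Y_i = X_i^2: (1/n) * sum X_i^2 ->p E(X^2) = 1 for X ~ N(0, 1).
n = 200_000
mean_sq = statistics.fmean(random.gauss(0, 1) ** 2 for _ in range(n))

print(mean_sq)
```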

  20. Important Theorems: An Application of the WLLN. Example: assume W_n ∼ Binomial(n, µ) and let Y_n = W_n/n. Then Y_n →p µ. Why? Since W_n = ∑_i X_i with X_i ∼ i.i.d. Bernoulli(µ), E(X_1) = µ and Var(X_1) = µ(1 − µ), the result follows by the WLLN.

  21. Important Theorems: Central Limit Theorem (CLT). Theorem (CLT): let {X_i}_{i=1}^n be a random sample with E(X_1) = µ < ∞ and Var(X_1) = σ² < ∞. Then Z_n = (X̄_n − E(X̄_n))/√Var(X̄_n) = √n(X̄_n − µ)/σ →d N(0, 1). If a random sample is taken from any distribution with mean µ and variance σ², regardless of whether the distribution is discrete or continuous, the distribution of Z_n will be approximately standard normal in large samples.
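A Python sketch of the CLT in action (illustrative): draws come from a skewed Exponential(1) distribution, yet the standardized sample mean behaves like N(0, 1):

```python
import math
import random
import statistics

random.seed(5)

# Z_n = sqrt(n) * (Xbar_n - mu) / sigma for X ~ Exponential(1): mu = sigma = 1.
mu, sigma, n = 1.0, 1.0, 400

def z_stat():
    xbar = statistics.fmean(random.expovariate(1.0) for _ in range(n))
    return math.sqrt(n) * (xbar - mu) / sigma

zs = [z_stat() for _ in range(2000)]

# Under N(0, 1), P(Z <= 0) = 0.5; the empirical fraction should be close.
frac = sum(z <= 0 for z in zs) / len(zs)
print(frac)
```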

  22. Important Theorems: CLT. Using the notation of asymptotic distributions, (X̄_n − µ)/√(σ²/n) ∼A N(0, 1), or X̄_n ∼A N(µ, σ²/n), where ∼A denotes an asymptotic distribution (A for "asymptotically").

  23. Important Theorems: An Application of the CLT. Example: assume {X_i} ∼ i.i.d. Bernoulli(µ); then (X̄_n − µ)/√(µ(1 − µ)/n) →d N(0, 1). Why? Since E(X̄_n) = µ and Var(X̄_n) = σ²/n = µ(1 − µ)/n.

  24. Important Theorems: Continuous Mapping Theorem (CMT). Theorem (CMT): given Y_n →p Y and g(⋅) continuous, then g(Y_n) →p g(Y). Proof: omitted here. Examples: if Y_n →p Y, then 1/Y_n →p 1/Y, Y_n² →p Y², and √Y_n →p √Y.
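A small Python illustration (not from the slides) of the CMT with g(y) = y²: since X̄_n →p 0.5 for Uniform(0, 1) draws, g(X̄_n) →p 0.25:

```python
import random
import statistics

random.seed(13)

# Ybar_n ->p 0.5 for Uniform(0, 1) draws; by the CMT, Ybar_n**2 ->p 0.25.
n = 100_000
ybar = statistics.fmean(random.random() for _ in range(n))
g_of_ybar = ybar ** 2

print(ybar, g_of_ybar)
```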

  25. Important Theorems. Theorem: given W_n →p W and Y_n →p Y, then W_n + Y_n →p W + Y and W_n Y_n →p WY. Proof: omitted here.

  26. Important Theorems: Slutsky Theorem. Theorem: given W_n →d W and Y_n →p c, where c is a constant, then W_n + Y_n →d W + c, W_n Y_n →d cW, and W_n/Y_n →d W/c for c ≠ 0. Proof: omitted here.
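A classic use of the Slutsky Theorem is the t-statistic: W_n = √n(X̄_n − µ)/σ →d N(0, 1) and σ/S_n →p 1, so √n(X̄_n − µ)/S_n →d N(0, 1). A Python sketch (illustrative):

```python
import math
import random
import statistics

random.seed(17)

# t-statistic: sqrt(n) * (Xbar - mu) / S_n, with the sample sd S_n replacing sigma.
mu, sigma, n = 0.0, 2.0, 300

def t_stat():
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    return math.sqrt(n) * (statistics.fmean(xs) - mu) / statistics.stdev(xs)

ts = [t_stat() for _ in range(2000)]

# Under N(0, 1), P(|Z| <= 1.96) is about 0.95; the t-statistic should match.
frac = sum(abs(t) <= 1.96 for t in ts) / len(ts)
print(frac)
```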

  27. Important Theorems: The Delta Method. Theorem: given √n(Y_n − θ) →d N(0, σ²), let g(⋅) be differentiable with g′(θ) ≠ 0. Then √n(g(Y_n) − g(θ)) →d N(0, [g′(θ)]² σ²). Proof (sketch): by the first-order Taylor approximation g(Y_n) ≈ g(θ) + g′(θ)(Y_n − θ), so √n(g(Y_n) − g(θ)) ≈ g′(θ) √n(Y_n − θ) →d N(0, [g′(θ)]² σ²).
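A numerical check of the Delta Method (illustrative, with g(y) = y², θ = 2, σ = 1, so the predicted asymptotic standard deviation is |g′(θ)|σ = 2θσ = 4):

```python
import math
import random
import statistics

random.seed(9)

# Delta method with g(y) = y^2:
# sqrt(n) * (Ybar^2 - theta^2) ->d N(0, [2*theta]^2 * sigma^2).
theta, sigma, n = 2.0, 1.0, 500

def transformed_stat():
    ybar = statistics.fmean(random.gauss(theta, sigma) for _ in range(n))
    return math.sqrt(n) * (ybar ** 2 - theta ** 2)

vals = [transformed_stat() for _ in range(2000)]
sd = statistics.pstdev(vals)

# Predicted asymptotic sd: |g'(theta)| * sigma = 2 * theta * sigma = 4.
print(sd)
```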

