

SLIDE 1

Statistics: Asymptotic Theory

Shiu-Sheng Chen
Department of Economics, National Taiwan University
Fall 2019

Shiu-Sheng Chen (NTU Econ) Statistics Fall 2019 1 / 28

SLIDE 2

Asymptotic Theory: Motivation

Asymptotic theory (or large sample theory) aims to answer the question: what happens as we gather more and more data? In particular, given a random sample $\{X_1, X_2, \ldots, X_n\}$ and a statistic $T_n = t(X_1, X_2, \ldots, X_n)$, what is the limiting behavior of $T_n$ as $n \to \infty$?

SLIDE 3

Asymptotic Theory: Motivation

Why ask such a question? For instance, given a random sample $\{X_i\}_{i=1}^{n} \sim_{i.i.d.} N(\mu, \sigma^2)$, we know that
$$\bar{X}_n \sim N\left(\mu, \frac{\sigma^2}{n}\right).$$
However, if $\{X_i\}_{i=1}^{n} \sim_{i.i.d.} (\mu, \sigma^2)$ without the normality assumption, what is the distribution of $\bar{X}_n$?

We simply do not know.

Is it possible to find a good approximation of the distribution of $\bar{X}_n$ as $n \to \infty$?

Yes! This is where asymptotic theory kicks in.

SLIDE 4

Section 1: Preliminary Knowledge

SLIDE 5

Preliminary Knowledge: Outline
- Limit
- Markov Inequality
- Chebyshev Inequality

SLIDE 6

Limit of a Real Sequence

Definition (Limit). If for every $\varepsilon > 0$ there exists an integer $N(\varepsilon)$ such that
$$|b_n - b| < \varepsilon, \quad \forall\, n > N(\varepsilon),$$
then we say that the sequence of real numbers $\{b_n\}$ converges to the limit $b$, denoted by
$$\lim_{n\to\infty} b_n = b.$$

SLIDE 7

Markov Inequality

Theorem (Markov Inequality). Suppose that $X$ is a random variable such that $P(X \ge 0) = 1$. Then for every real number $m > 0$,
$$P(X \ge m) \le \frac{E(X)}{m}.$$
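The inequality is easy to check numerically. A minimal Monte Carlo sketch (not from the slides); the choice of $X \sim \text{Exponential}(1)$, the sample size, and the threshold $m = 3$ are illustrative assumptions:

```python
import random

random.seed(0)
n = 100_000

# Nonnegative random variable: X ~ Exponential(1), so E(X) = 1 and P(X >= 0) = 1.
xs = [random.expovariate(1.0) for _ in range(n)]

m = 3.0
tail_prob = sum(x >= m for x in xs) / n   # Monte Carlo estimate of P(X >= m)
bound = (sum(xs) / n) / m                 # Monte Carlo estimate of E(X) / m

# Markov: P(X >= m) <= E(X) / m.  (Exact values here: e^{-3} vs 1/3.)
assert tail_prob <= bound
```

The bound is loose by design: it uses only the mean, not the shape of the distribution.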

SLIDE 8

Chebyshev Inequality

Theorem (Chebyshev Inequality). Let $Y$ be a random variable with mean $E(Y)$ and variance $Var(Y)$. Then for every number $\varepsilon > 0$,
$$P(|Y - E(Y)| \ge \varepsilon) \le \frac{Var(Y)}{\varepsilon^2}.$$

Proof: Let $X = [Y - E(Y)]^2$; then $P(X \ge 0) = 1$ and $E(X) = Var(Y)$. Since $|Y - E(Y)| \ge \varepsilon$ if and only if $X \ge \varepsilon^2$, the result follows by applying the Markov Inequality with $m = \varepsilon^2$.
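As with Markov, a quick simulation sketch (not from the slides) confirms the bound; $Y \sim \text{Uniform}(0, 1)$ and $\varepsilon = 0.4$ are illustrative assumptions:

```python
import random

random.seed(1)
n = 100_000

# Y ~ Uniform(0, 1): E(Y) = 0.5, Var(Y) = 1/12.
ys = [random.random() for _ in range(n)]
mean, var = 0.5, 1.0 / 12.0

eps = 0.4
tail_prob = sum(abs(y - mean) >= eps for y in ys) / n  # P(|Y - E(Y)| >= eps); exactly 0.2 here
bound = var / eps ** 2                                  # Var(Y) / eps^2

assert tail_prob <= bound
```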

SLIDE 9

Section 2: Modes of Convergence

SLIDE 10

Types of Convergence

For a sequence of random variables, we consider three modes of convergence:
- Convergence in Probability
- Convergence in Distribution
- Convergence in Mean Square

SLIDE 11

Convergence in Probability

Definition (Convergence in Probability). Let $\{Y_n\}$ be a sequence of random variables and let $Y$ be another random variable. If for any $\varepsilon > 0$,
$$P(|Y_n - Y| < \varepsilon) \to 1, \quad \text{as } n \to \infty,$$
then we say that $Y_n$ converges in probability to $Y$, denoted by $Y_n \xrightarrow{p} Y$.

Equivalently, $P(|Y_n - Y| \ge \varepsilon) \to 0$ as $n \to \infty$.

SLIDE 12

Convergence in Probability: Example

Draw $\{X_i\}_{i=1}^{n} \sim_{i.i.d.} \text{Bernoulli}(0.5)$ and compute
$$Y_n = \bar{X}_n = \frac{\sum_{i=1}^{n} X_i}{n}.$$
In this case, $Y_n \xrightarrow{p} 0.5$.

[Figure: sample path of the running mean over 1,000 coin tosses, settling toward 0.5.]
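The figure above can be reproduced with a short simulation; a minimal sketch (the seed and the number of tosses are illustrative assumptions):

```python
import random

random.seed(2)
n = 1000

# Running sample mean of fair coin tosses: Y_k = (X_1 + ... + X_k) / k.
total = 0
running_means = []
for k in range(1, n + 1):
    total += random.randint(0, 1)     # X_k ~ Bernoulli(0.5)
    running_means.append(total / k)

# The path is noisy for small k but settles near 0.5 as k grows.
assert abs(running_means[-1] - 0.5) < 0.1
```

Plotting `running_means` against the toss index reproduces the qualitative shape of the slide's figure.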

SLIDE 13

Convergence in Distribution

Definition (Convergence in Distribution). Let $\{Y_n\}$ be a sequence of random variables with distribution functions $F_{Y_n}(y)$ (denoted $F_n(y)$ for simplicity), and let $Y$ be another random variable with distribution function $F_Y(y)$. If
$$\lim_{n\to\infty} F_n(y) = F_Y(y) \quad \text{at all } y \text{ for which } F_Y(y) \text{ is continuous},$$
then we say that $Y_n$ converges in distribution to $Y$, denoted by $Y_n \xrightarrow{d} Y$.

$F_Y(y)$ is called the limiting distribution of $Y_n$.

SLIDE 14

Convergence in Mean Square

Definition (Convergence in Mean Square). Let $\{Y_n\}$ be a sequence of random variables and let $Y$ be another random variable. If
$$E(Y_n - Y)^2 \to 0, \quad \text{as } n \to \infty,$$
then we say that $Y_n$ converges in mean square to $Y$, denoted by $Y_n \xrightarrow{ms} Y$.

It is also called convergence in quadratic mean.

SLIDE 15

Section 3: Important Theorems

SLIDE 16

Theorems

Theorem. $Y_n \xrightarrow{ms} c$ if and only if
$$\lim_{n\to\infty} E(Y_n) = c \quad \text{and} \quad \lim_{n\to\infty} Var(Y_n) = 0.$$

Proof: It can be shown that
$$E(Y_n - c)^2 = E\left([Y_n - E(Y_n)]^2\right) + [E(Y_n) - c]^2 = Var(Y_n) + [E(Y_n) - c]^2.$$
Both terms on the right are nonnegative, so the left side tends to zero if and only if each term does.

SLIDE 17

Theorem. If $Y_n \xrightarrow{ms} Y$, then $Y_n \xrightarrow{p} Y$.

Proof: Note that $P(|Y_n - Y|^2 \ge 0) = 1$, so by the Markov Inequality, for any $k > 0$,
$$P(|Y_n - Y| \ge k) = P(|Y_n - Y|^2 \ge k^2) \le \frac{E(|Y_n - Y|^2)}{k^2} \to 0,$$
where the convergence to zero follows from $Y_n \xrightarrow{ms} Y$.

SLIDE 18

Weak Law of Large Numbers (WLLN)

Theorem (WLLN). Given a random sample $\{X_i\}_{i=1}^{n}$ with $\sigma^2 = Var(X_1) < \infty$, let $\bar{X}_n$ denote the sample mean, and note that $E(\bar{X}_n) = E(X_1) = \mu$. Then
$$\bar{X}_n \xrightarrow{p} \mu.$$

Proof: (1) by the Chebyshev Inequality, or (2) by convergence in mean square.

The sample mean $\bar{X}_n$ gets closer (in the probability sense) to the population mean $\mu$ as the sample size increases. That is, if we use $\bar{X}_n$ as a guess of the unknown $\mu$, the sample mean makes a good guess.
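The "good guess" claim can be seen in simulation. A minimal sketch (not part of the slides); the Uniform(0, 2) population, sample sizes, and replication count are illustrative assumptions:

```python
import random

random.seed(3)

def sample_mean(n):
    # Mean of n i.i.d. draws from Uniform(0, 2), so mu = 1.
    return sum(random.uniform(0.0, 2.0) for _ in range(n)) / n

# Average absolute deviation |Xbar_n - mu| over repeated samples shrinks with n.
reps = 200
dev_small = sum(abs(sample_mean(10) - 1.0) for _ in range(reps)) / reps
dev_large = sum(abs(sample_mean(1000) - 1.0) for _ in range(reps)) / reps

assert dev_large < dev_small
```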

SLIDE 19

WLLN for Other Moments

Note that the WLLN can be thought of as
$$\frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n} \xrightarrow{p} E(X_1).$$
Let $Y_i = X_i^2$; by the WLLN,
$$\frac{\sum_{i=1}^{n} Y_i}{n} = \frac{Y_1 + Y_2 + \cdots + Y_n}{n} \xrightarrow{p} E(Y_1).$$
Hence,
$$\frac{\sum_{i=1}^{n} X_i^2}{n} = \frac{X_1^2 + X_2^2 + \cdots + X_n^2}{n} \xrightarrow{p} E(X_1^2).$$

SLIDE 20

Example: An Application of the WLLN

Assume $W_n \sim \text{Binomial}(n, \mu)$, and let $Y_n = W_n / n$. Then
$$Y_n \xrightarrow{p} \mu.$$
Why? Since $W_n = \sum_{i} X_i$ with $X_i \sim_{i.i.d.} \text{Bernoulli}(\mu)$, $E(X_1) = \mu$, and $Var(X_1) = \mu(1 - \mu) < \infty$, the result follows by the WLLN.

SLIDE 21

Central Limit Theorem (CLT)

Theorem (CLT). Let $\{X_i\}_{i=1}^{n}$ be a random sample with $E(X_1) = \mu < \infty$ and $Var(X_1) = \sigma^2 < \infty$. Then
$$Z_n = \frac{\bar{X}_n - E(\bar{X}_n)}{\sqrt{Var(\bar{X}_n)}} = \frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma} \xrightarrow{d} N(0, 1).$$

If a random sample is taken from any distribution with mean $\mu$ and variance $\sigma^2$, regardless of whether the distribution is discrete or continuous, the distribution of $Z_n$ will be approximately standard normal in large samples.
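The "any distribution" claim can be checked with a clearly non-normal population. A minimal simulation sketch (not from the slides); Exponential(1), the sample size, and the replication count are illustrative assumptions:

```python
import math
import random

random.seed(4)
n, reps = 500, 2000
mu, sigma = 1.0, 1.0   # X ~ Exponential(1): mean 1, variance 1 (heavily skewed)

# Z_n = sqrt(n) * (Xbar_n - mu) / sigma for each replication.
zs = []
for _ in range(reps):
    xbar = sum(random.expovariate(1.0) for _ in range(n)) / n
    zs.append(math.sqrt(n) * (xbar - mu) / sigma)

# If Z_n is approximately N(0, 1), about 95% of draws fall in [-1.96, 1.96].
coverage = sum(abs(z) <= 1.96 for z in zs) / reps
assert abs(coverage - 0.95) < 0.03
```

A histogram of `zs` would look close to the standard normal bell curve despite the skewed population.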

SLIDE 22

CLT: Asymptotic Distribution Notation

Using the notation of asymptotic distributions,
$$\frac{\bar{X}_n - \mu}{\sqrt{\sigma^2 / n}} \sim_A N(0, 1), \quad \text{or} \quad \bar{X}_n \sim_A N\left(\mu, \frac{\sigma^2}{n}\right),$$
where $\sim_A$ denotes "is asymptotically distributed as."

SLIDE 23

An Application of the CLT

Example: Assume $\{X_i\} \sim_{i.i.d.} \text{Bernoulli}(\mu)$. Then
$$\frac{\bar{X}_n - \mu}{\sqrt{\mu(1-\mu)/n}} \xrightarrow{d} N(0, 1).$$
Why? Since $E(\bar{X}_n) = \mu$ and $Var(\bar{X}_n) = \sigma^2/n = \mu(1-\mu)/n$.

SLIDE 24

Continuous Mapping Theorem

Theorem (CMT). Given $Y_n \xrightarrow{p} Y$ and $g(\cdot)$ continuous, then
$$g(Y_n) \xrightarrow{p} g(Y).$$

Proof: omitted here.

Examples: if $Y_n \xrightarrow{p} Y$, then
$$\frac{1}{Y_n} \xrightarrow{p} \frac{1}{Y}, \qquad Y_n^2 \xrightarrow{p} Y^2, \qquad \sqrt{Y_n} \xrightarrow{p} \sqrt{Y}$$
(for the first and third, $g$ must be continuous where it is applied, e.g. $Y \ne 0$ for $1/Y$).

SLIDE 25

Theorem

Theorem. Given $W_n \xrightarrow{p} W$ and $Y_n \xrightarrow{p} Y$, then
$$W_n + Y_n \xrightarrow{p} W + Y, \qquad W_n Y_n \xrightarrow{p} WY.$$

Proof: omitted here.

SLIDE 26

Slutsky Theorem

Theorem. Given $W_n \xrightarrow{d} W$ and $Y_n \xrightarrow{p} c$, where $c$ is a constant, then
$$W_n + Y_n \xrightarrow{d} W + c, \qquad W_n Y_n \xrightarrow{d} cW, \qquad \frac{W_n}{Y_n} \xrightarrow{d} \frac{W}{c} \ \text{ for } c \ne 0.$$

Proof: omitted here.
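A classic use of the Slutsky Theorem is the studentized mean: replacing the unknown $\sigma$ with the sample standard deviation $s_n \xrightarrow{p} \sigma$ leaves the limiting $N(0,1)$ distribution unchanged. A minimal simulation sketch (not from the slides); the Uniform(0, 1) population and sample sizes are illustrative assumptions:

```python
import math
import random

random.seed(5)
n, reps = 500, 2000
mu = 0.5   # X ~ Uniform(0, 1), so sigma^2 = 1/12

# By the CLT, sqrt(n)(Xbar - mu)/sigma -> N(0,1); since s_n ->p sigma,
# Slutsky gives sqrt(n)(Xbar - mu)/s_n -> N(0,1) as well.
ts = []
for _ in range(reps):
    xs = [random.random() for _ in range(n)]
    xbar = sum(xs) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
    ts.append(math.sqrt(n) * (xbar - mu) / s)

# About 95% of the studentized means fall in [-1.96, 1.96].
coverage = sum(abs(t) <= 1.96 for t in ts) / reps
assert abs(coverage - 0.95) < 0.03
```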

SLIDE 27

The Delta Method

Theorem. Given $\sqrt{n}(Y_n - \theta) \xrightarrow{d} N(0, \sigma^2)$, let $g(\cdot)$ be differentiable with $g'(\theta) \ne 0$. Then
$$\sqrt{n}\left(g(Y_n) - g(\theta)\right) \xrightarrow{d} N\left(0, [g'(\theta)]^2 \sigma^2\right).$$

Proof (sketch): By a first-order Taylor approximation, $g(Y_n) \approx g(\theta) + g'(\theta)(Y_n - \theta)$, so
$$\frac{\sqrt{n}\left(g(Y_n) - g(\theta)\right)}{g'(\theta)} \approx \sqrt{n}(Y_n - \theta) \xrightarrow{d} N(0, \sigma^2).$$

SLIDE 28

Example

Given $\{X_i\}_{i=1}^{n} \sim_{i.i.d.} (\mu, \sigma^2)$, find the asymptotic distribution of $\frac{\bar{X}_n}{1 - \bar{X}_n}$.

Note that by the CLT,
$$\sqrt{n}(\bar{X}_n - \mu) \xrightarrow{d} N(0, \sigma^2).$$
Hence, by the Delta method, with
$$g(\bar{X}_n) = \frac{\bar{X}_n}{1 - \bar{X}_n}, \quad g(\mu) = \frac{\mu}{1 - \mu}, \quad g'(\mu) = \frac{1}{(1 - \mu)^2},$$
we obtain
$$\sqrt{n}\left(\frac{\bar{X}_n}{1 - \bar{X}_n} - \frac{\mu}{1 - \mu}\right) \xrightarrow{d} N\left(0, \frac{\sigma^2}{(1 - \mu)^4}\right).$$
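The Delta-method variance above can be verified by simulation. A minimal sketch (not from the slides); the Uniform(0, 1) population, sample size, and replication count are illustrative assumptions:

```python
import math
import random

random.seed(6)
n, reps = 2000, 2000
mu, sigma2 = 0.5, 1.0 / 12.0   # X ~ Uniform(0, 1)

def g(x):
    return x / (1.0 - x)

# Delta-method prediction: [g'(mu)]^2 * sigma^2 = (1/(1-mu)^2)^2 * sigma^2 = 16 * (1/12).
theo_var = (1.0 / (1.0 - mu) ** 2) ** 2 * sigma2

stats = []
for _ in range(reps):
    xbar = sum(random.random() for _ in range(n)) / n
    stats.append(math.sqrt(n) * (g(xbar) - g(mu)))

# The Monte Carlo variance of sqrt(n)(g(Xbar) - g(mu)) should be close to theo_var.
m = sum(stats) / reps
emp_var = sum((s - m) ** 2 for s in stats) / (reps - 1)
assert abs(emp_var - theo_var) / theo_var < 0.2
```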
