SLIDE 1

Convergence of Random Variables

Saravanan Vijayakumaran sarva@ee.iitb.ac.in

Department of Electrical Engineering Indian Institute of Technology Bombay

March 19, 2014

SLIDE 2

Motivation

Theorem (Weak Law of Large Numbers)

Let X1, X2, . . . be a sequence of independent identically distributed random variables with finite means µ. Their partial sums Sn = X1 + X2 + · · · + Xn satisfy

$$\frac{S_n}{n} \xrightarrow{P} \mu \quad \text{as } n \to \infty.$$

Theorem (Central Limit Theorem)

Let X1, X2, . . . be a sequence of independent identically distributed random variables with finite means µ and finite non-zero variance σ². Their partial sums Sn = X1 + X2 + · · · + Xn satisfy

$$\sqrt{n}\left(\frac{S_n}{n} - \mu\right) \xrightarrow{D} N(0, \sigma^2) \quad \text{as } n \to \infty.$$
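Both theorems can be observed numerically. The following is a minimal sketch, assuming Uniform(0, 1) draws (so µ = 1/2 and σ² = 1/12); the function names `sample_mean` and `normalized_deviations` are illustrative and not from the slides.

```python
import random
import statistics


def sample_mean(n: int, rng: random.Random) -> float:
    """Return S_n / n for n i.i.d. Uniform(0, 1) draws (mu = 1/2, sigma^2 = 1/12)."""
    return sum(rng.random() for _ in range(n)) / n


def normalized_deviations(n: int, trials: int, rng: random.Random) -> list[float]:
    """Return `trials` independent samples of sqrt(n) * (S_n / n - mu)."""
    mu = 0.5  # mean of the assumed Uniform(0, 1) distribution
    return [n ** 0.5 * (sample_mean(n, rng) - mu) for _ in range(trials)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only for reproducibility
    for n in (10, 1_000, 100_000):
        print(n, sample_mean(n, rng))  # weak law: values settle near 0.5
    devs = normalized_deviations(200, 2_000, rng)
    # CLT: the empirical variance should be close to sigma^2 = 1/12
    print(statistics.pvariance(devs), 1 / 12)
```

The first loop illustrates the weak law (sample means concentrate around µ), while the second part checks that the rescaled deviations have variance close to σ², as the central limit theorem suggests.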

SLIDE 3

Modes of Convergence

A sequence of real numbers {xn : n = 1, 2, . . .} is said to converge to a limit x if for all ε > 0 there exists an mε ∈ N such that |xn − x| < ε for all n ≥ mε.

We want to define convergence of random variables but they are functions from Ω to R

The solution
Derive real number sequences from sequences of random variables
Define convergence of the latter in terms of the former
Four ways of defining convergence for random variables
Convergence almost surely
Convergence in rth mean
Convergence in probability
Convergence in distribution

SLIDE 4

Convergence Almost Surely

Let X, X1, X2, . . . be random variables on a probability space (Ω, F, P)
For each ω ∈ Ω, X(ω) and Xn(ω) are reals
Xn → X almost surely if {ω ∈ Ω : Xn(ω) → X(ω) as n → ∞} is an event whose probability is 1
“Xn → X almost surely” is abbreviated as $X_n \xrightarrow{\text{a.s.}} X$

Example

Let Ω = [0, 1] and P be the uniform distribution on Ω
P (ω ∈ [a, b]) = b − a for 0 ≤ a ≤ b ≤ 1
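Let Xn be defined as

$$X_n(\omega) = \begin{cases} n, & \omega \in \left[0, \tfrac{1}{n}\right) \\ 0, & \omega \in \left[\tfrac{1}{n}, 1\right] \end{cases}$$

Let X(ω) = 0 for all ω ∈ [0, 1]
$X_n \xrightarrow{\text{a.s.}} X$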
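To see why the example converges almost surely, it may help to follow one outcome ω through the sequence. The sketch below assumes the split Xn(ω) = n on [0, 1/n) and 0 on [1/n, 1] (the extracted slide does not show which endpoint is open); `x_n` and `sample_path` are illustrative names.

```python
def x_n(n: int, omega: float) -> float:
    """X_n(omega) = n on [0, 1/n) and 0 on [1/n, 1] (half-open split assumed)."""
    return float(n) if 0.0 <= omega < 1.0 / n else 0.0


def sample_path(omega: float, n_max: int) -> list[float]:
    """Return X_1(omega), ..., X_{n_max}(omega) for one fixed outcome omega."""
    return [x_n(n, omega) for n in range(1, n_max + 1)]


if __name__ == "__main__":
    print(sample_path(0.3, 8))  # nonzero only while 1/n > 0.3, then 0 forever
    print(sample_path(0.0, 8))  # omega = 0 is the one outcome whose path diverges
```

For every ω > 0 the path is eventually 0 = X(ω), and the exceptional set {0} has probability 0 under the uniform distribution, which is consistent with the almost sure convergence claimed above.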

SLIDE 5

Convergence in rth Mean

Let X, X1, X2, . . . be random variables on a probability space (Ω, F, P)
Suppose $E[|X^r|] < \infty$ and $E[|X_n^r|] < \infty$ for all n
Xn → X in rth mean if

$$E\left[|X_n - X|^r\right] \to 0 \quad \text{as } n \to \infty,$$

where r ≥ 1
“Xn → X in rth mean” is abbreviated as $X_n \xrightarrow{r} X$
For r = 1, $X_n \xrightarrow{1} X$ is written as “Xn → X in mean”
For r = 2, $X_n \xrightarrow{2} X$ is written as “Xn → X in mean square” or $X_n \xrightarrow{\text{m.s.}} X$

Example

Let Ω = [0, 1] and P be the uniform distribution on Ω
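Let Xn be defined as

$$X_n(\omega) = \begin{cases} n, & \omega \in \left[0, \tfrac{1}{n}\right) \\ 0, & \omega \in \left[\tfrac{1}{n}, 1\right] \end{cases}$$

Let X(ω) = 0 for all ω ∈ [0, 1]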
E[|Xn − X|] = E[|Xn|] = n · (1/n) = 1 for every n, and so Xn does not converge in mean to X
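The moment in this example can be computed exactly, since Xn takes the value n with probability 1/n and 0 otherwise. A small sketch with exact rational arithmetic (`rth_moment` is an illustrative name, and integer r is assumed for simplicity):

```python
from fractions import Fraction


def rth_moment(n: int, r: int) -> Fraction:
    """E[|X_n - 0|^r] when X_n = n with probability 1/n and 0 otherwise."""
    return Fraction(n) ** r * Fraction(1, n)


if __name__ == "__main__":
    for n in (1, 10, 100):
        # r = 1 stays at 1 for every n; r = 2 grows like n
        print(n, rth_moment(n, 1), rth_moment(n, 2))
```

The first moment is constant at 1 and higher moments grow as n^(r−1), so the sequence does not converge to 0 in rth mean for any r ≥ 1.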

SLIDE 6

Convergence in Probability

Let X, X1, X2, . . . be random variables on a probability space (Ω, F, P)
Xn → X in probability if

P (|Xn − X| > ε) → 0 as n → ∞ for all ε > 0

“Xn → X in probability” is abbreviated as $X_n \xrightarrow{P} X$

Example

Let Ω = [0, 1] and P be the uniform distribution on Ω
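Let Xn be defined as

$$X_n(\omega) = \begin{cases} n, & \omega \in \left[0, \tfrac{1}{n}\right) \\ 0, & \omega \in \left[\tfrac{1}{n}, 1\right] \end{cases}$$

Let X(ω) = 0 for all ω ∈ [0, 1]
For ε > 0,

$$P[|X_n - X| > \varepsilon] = P[|X_n| > \varepsilon] \le P[X_n = n] = \frac{1}{n} \to 0$$

$X_n \xrightarrow{P} X$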
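The bound in this example can be turned into a concrete rate. The sketch below assumes Xn equals n on a set of probability 1/n and 0 elsewhere, with X = 0; `prob_deviation` and `first_n_below` are illustrative helpers, not part of the slides.

```python
from fractions import Fraction


def prob_deviation(n: int, eps: float) -> Fraction:
    """P(|X_n - X| > eps) with X = 0 and X_n = n on an interval of length 1/n."""
    # |X_n| takes only the values 0 and n, so the event is {X_n = n} when eps < n
    # and is empty otherwise.
    return Fraction(1, n) if eps < n else Fraction(0)


def first_n_below(eps: float, delta: Fraction) -> int:
    """Smallest n with P(|X_n - X| > eps) < delta."""
    n = 1
    while prob_deviation(n, eps) >= delta:
        n += 1
    return n


if __name__ == "__main__":
    print(prob_deviation(4, 0.5))                 # 1/4
    print(first_n_below(0.5, Fraction(1, 100)))   # 101
```

The same sequence therefore converges in probability even though its first moment stays at 1, which is why convergence in probability does not imply convergence in mean.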

SLIDE 7

Convergence in Distribution

Let X, X1, X2, . . . be random variables on a probability space (Ω, F, P)
Xn → X in distribution if

P (Xn ≤ x) → P (X ≤ x) as n → ∞ for all points x where FX(x) = P(X ≤ x) is continuous

“Xn → X in distribution” is abbreviated as $X_n \xrightarrow{D} X$

Convergence in distribution is also termed weak convergence

Example

Let X be a Bernoulli RV taking values 0 and 1 with equal probability 1/2.
Let X1, X2, X3, . . . be identical random variables given by Xn = X for all n. The Xn’s are not independent but $X_n \xrightarrow{D} X$. Let Y = 1 − X. Then $X_n \xrightarrow{D} Y$. But |Xn − Y| = 1 and the Xn’s do not converge to Y in any other mode.
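The example can be checked by enumerating the two outcomes. This is a minimal sketch; the list `outcomes` and the helper `cdf` are illustrative names, and the handful of test points stands in for "all x".

```python
from fractions import Fraction

# Joint law of (X, Y) with X ~ Bernoulli(1/2) and Y = 1 - X: (x, y, probability)
outcomes = [(0, 1, Fraction(1, 2)), (1, 0, Fraction(1, 2))]


def cdf(law, t):
    """P(V <= t) for a finite list of (value, probability) pairs."""
    return sum(p for v, p in law if v <= t)


law_x = [(x, p) for x, _, p in outcomes]
law_y = [(y, p) for _, y, p in outcomes]

# X and Y share one distribution function, so X_n = X converges to Y in distribution.
same_cdf = all(cdf(law_x, t) == cdf(law_y, t) for t in (-1, 0, 0.5, 1, 2))

# Yet X and Y differ by exactly 1 on every outcome.
gaps = {abs(x - y) for x, y, _ in outcomes}

if __name__ == "__main__":
    print(same_cdf, gaps)
```

Convergence in distribution only compares distribution functions, so it cannot distinguish X from 1 − X, while every other mode looks at |Xn − Y| on the common probability space.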

SLIDE 8

Relations between Modes of Convergence

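Theorem

$$(X_n \xrightarrow{\text{a.s.}} X) \;\Rightarrow\; (X_n \xrightarrow{P} X) \;\Rightarrow\; (X_n \xrightarrow{D} X)$$

$$(X_n \xrightarrow{r} X) \;\Rightarrow\; (X_n \xrightarrow{P} X)$$

for any r ≥ 1.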

SLIDE 9

Convergence in Probability Implies Convergence in Distribution

Suppose $X_n \xrightarrow{P} X$
Let Fn(x) = P(Xn ≤ x) and F(x) = P(X ≤ x)
If ε > 0,

$$\begin{aligned}
F_n(x) &= P(X_n \le x) = P(X_n \le x,\, X \le x + \varepsilon) + P(X_n \le x,\, X > x + \varepsilon) \\
&\le F(x + \varepsilon) + P(|X_n - X| > \varepsilon) \\
F(x - \varepsilon) &= P(X \le x - \varepsilon) = P(X \le x - \varepsilon,\, X_n \le x) + P(X \le x - \varepsilon,\, X_n > x) \\
&\le F_n(x) + P(|X_n - X| > \varepsilon)
\end{aligned}$$

Combining the above inequalities we have

$$F(x - \varepsilon) - P(|X_n - X| > \varepsilon) \le F_n(x) \le F(x + \varepsilon) + P(|X_n - X| > \varepsilon)$$

If F is continuous at x, F(x − ε) → F(x) and F(x + ε) → F(x) as ε ↓ 0
Since $X_n \xrightarrow{P} X$, P (|Xn − X| > ε) → 0 as n → ∞
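The slide stops just before the conclusion. One standard way to finish, sketched here with the same symbols F, Fn and ε, is to let n → ∞ first and then ε ↓ 0:

```latex
% Step 1: let n -> infinity in the combined inequality, using P(|X_n - X| > eps) -> 0.
F(x - \varepsilon) \;\le\; \liminf_{n \to \infty} F_n(x)
  \;\le\; \limsup_{n \to \infty} F_n(x) \;\le\; F(x + \varepsilon)
  \qquad \text{for every } \varepsilon > 0.

% Step 2: let eps decrease to 0 at a continuity point x of F.
F(x) \;\le\; \liminf_{n \to \infty} F_n(x)
  \;\le\; \limsup_{n \to \infty} F_n(x) \;\le\; F(x)
  \quad\Longrightarrow\quad \lim_{n \to \infty} F_n(x) = F(x).
```

This is exactly the definition of convergence in distribution, since it holds at every point x where F is continuous.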

SLIDE 10

Convergence in rth Mean Implies Convergence in Probability

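If r > s ≥ 1 and $X_n \xrightarrow{r} X$ then $X_n \xrightarrow{s} X$
Lyapunov’s inequality: If r > s > 0, then

$$\left(E[|Y|^s]\right)^{1/s} \le \left(E[|Y|^r]\right)^{1/r}$$

If $X_n \xrightarrow{r} X$, then $E[|X_n - X|^r] \to 0$ and

$$\left(E[|X_n - X|^s]\right)^{1/s} \le \left(E[|X_n - X|^r]\right)^{1/r}$$

If $X_n \xrightarrow{1} X$ then $X_n \xrightarrow{P} X$
By Markov’s inequality, we have

$$P(|X_n - X| > \varepsilon) \le \frac{E(|X_n - X|)}{\varepsilon} \quad \text{for all } \varepsilon > 0$$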
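Both inequalities used on this slide can be sanity-checked on data, because an empirical distribution is itself a probability distribution. The sketch below assumes a standard normal sample purely for illustration; the helper names and the small tolerance `tol` (for floating-point rounding) are not from the slides.

```python
import random


def moment(values: list[float], p: float) -> float:
    """Empirical E[|Y|^p] for a finite sample of Y."""
    return sum(abs(v) ** p for v in values) / len(values)


def tail_probability(values: list[float], eps: float) -> float:
    """Empirical P(|Y| > eps)."""
    return sum(1 for v in values if abs(v) > eps) / len(values)


def lyapunov_holds(values: list[float], s: float, r: float, tol: float = 1e-12) -> bool:
    """Check (E|Y|^s)^(1/s) <= (E|Y|^r)^(1/r) for r > s > 0."""
    return moment(values, s) ** (1 / s) <= moment(values, r) ** (1 / r) + tol


def markov_holds(values: list[float], eps: float, tol: float = 1e-12) -> bool:
    """Check P(|Y| > eps) <= E|Y| / eps."""
    return tail_probability(values, eps) <= moment(values, 1) / eps + tol


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only for reproducibility
    sample = [rng.gauss(0.0, 1.0) for _ in range(10_000)]  # assumed test data
    print(lyapunov_holds(sample, 1, 2), markov_holds(sample, 1.0))
```

With Y = Xn − X, the first check is the step from rth mean to sth mean and the second is the step from convergence in mean to convergence in probability.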

SLIDE 11

Convergence Almost Surely Implies Convergence in Probability

Let An(ε) = {|Xn − X| > ε} and $B_m(\varepsilon) = \bigcup_{n \ge m} A_n(\varepsilon)$
$X_n \xrightarrow{\text{a.s.}} X$ if and only if P (Bm(ε)) → 0 as m → ∞, for all ε > 0
Let

$$C = \{\omega \in \Omega : X_n(\omega) \to X(\omega) \text{ as } n \to \infty\}$$

$$A(\varepsilon) = \{\omega \in \Omega : \omega \in A_n(\varepsilon) \text{ for infinitely many values of } n\} = \bigcap_{m} \bigcup_{n=m}^{\infty} A_n(\varepsilon)$$

Xn(ω) → X(ω) if and only if ω ∉ A(ε) for all ε > 0
P(C) = 1 if and only if P (A(ε)) = 0 for all ε > 0
Bm(ε) is a decreasing sequence of events with limit A(ε)
P(A(ε)) = 0 if and only if P (Bm(ε)) → 0 as m → ∞
Since An(ε) ⊆ Bn(ε), we have P (|Xn − X| > ε) = P (An(ε)) → 0 whenever P (Bn(ε)) → 0
Thus $X_n \xrightarrow{\text{a.s.}} X \implies X_n \xrightarrow{P} X$
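The events An(ε) and Bm(ε) can be made concrete with the running example on Ω = [0, 1]. The sketch below assumes 0 < ε < 1 and Xn(ω) = n on [0, 1/n), so that An(ε) = [0, 1/n); it approximates the uniform probability on a finite grid and truncates the union at a hypothetical `n_max`.

```python
from fractions import Fraction


def in_A(n: int, omega: Fraction) -> bool:
    """omega in A_n(eps) for 0 < eps < 1: |X_n(omega) - 0| > eps iff omega in [0, 1/n)."""
    return omega < Fraction(1, n)


def in_B(m: int, n_max: int, omega: Fraction) -> bool:
    """omega in the truncated union of A_n(eps) over m <= n <= n_max."""
    return any(in_A(n, omega) for n in range(m, n_max + 1))


def grid_probability(event, grid_size: int = 1000) -> Fraction:
    """Fraction of grid points k / grid_size (0 <= k < grid_size) where `event` holds."""
    hits = sum(1 for k in range(grid_size) if event(Fraction(k, grid_size)))
    return Fraction(hits, grid_size)


if __name__ == "__main__":
    for m in (2, 5, 10):
        # The sets A_n are nested, so the union B_m equals A_m and has probability 1/m.
        print(m, grid_probability(lambda w: in_B(m, 50, w)))
```

Here P(Bm(ε)) = 1/m → 0, which matches the almost sure convergence of the example, and P(An(ε)) ≤ P(Bn(ε)) holds with equality because the sets are nested.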

SLIDE 12

Some Converses

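If $X_n \xrightarrow{D} c$, where c is a constant, then $X_n \xrightarrow{P} c$

$$P(|X_n - c| > \varepsilon) = P(X_n < c - \varepsilon) + P(X_n > c + \varepsilon) \to 0 \quad \text{if } X_n \xrightarrow{D} c$$

If Pn(ε) = P (|Xn − X| > ε) satisfies $\sum_n P_n(\varepsilon) < \infty$ for all ε > 0, then $X_n \xrightarrow{\text{a.s.}} X$
Let An(ε) = {|Xn − X| > ε} and $B_m(\varepsilon) = \bigcup_{n \ge m} A_n(\varepsilon)$

$$P(B_m(\varepsilon)) \le \sum_{n=m}^{\infty} P(A_n(\varepsilon)) = \sum_{n=m}^{\infty} P_n(\varepsilon) \to 0 \quad \text{as } m \to \infty$$

$X_n \xrightarrow{\text{a.s.}} X$ if and only if P (Bm(ε)) → 0 as m → ∞, for all ε > 0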
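The summability condition works because the tail of a convergent series vanishes. A minimal numerical sketch follows; the choice Pn(ε) = 1/n² is a hypothetical example, `n_max` is a finite truncation of the infinite sum, and the harmonic case corresponds to the running example on Ω = [0, 1] with ε < 1.

```python
from typing import Callable


def tail_sum(p: Callable[[int], float], m: int, n_max: int) -> float:
    """Finite truncation of sum_{n >= m} p(n), the union bound on P(B_m(eps))."""
    return sum(p(n) for n in range(m, n_max + 1))


def summable_example(n: int) -> float:
    """Hypothetical P_n(eps) = 1 / n^2, whose series converges."""
    return 1.0 / n ** 2


def harmonic_example(n: int) -> float:
    """P_n(eps) = 1 / n, as in the running example; the series diverges."""
    return 1.0 / n


if __name__ == "__main__":
    for m in (10, 100, 1000):
        print(m, tail_sum(summable_example, m, 100_000), tail_sum(harmonic_example, m, 100_000))
```

The tails of 1/n² shrink below 1/(m − 1), so the bound forces P(Bm(ε)) → 0. The tails of 1/n do not shrink, so the bound says nothing there, yet the running example still converges almost surely: the condition is sufficient but not necessary.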

SLIDE 13

Borel-Cantelli Lemmas

Let A1, A2, . . . be an infinite sequence of events from (Ω, F, P)
Consider the event that infinitely many of the An occur

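$$A = \{A_n \text{ i.o.}\} = \bigcap_{n} \bigcup_{m=n}^{\infty} A_m$$

Theorem

Let A be the event that infinitely many of the An occur. Then
P(A) = 0 if $\sum_n P(A_n) < \infty$,
P(A) = 1 if $\sum_n P(A_n) = \infty$ and A1, A2, A3, . . . are independent events

Proof of first lemma.

We have $A \subseteq \bigcup_{m=n}^{\infty} A_m$ for all n

$$P(A) \le \sum_{m=n}^{\infty} P(A_m) \to 0 \quad \text{as } n \to \infty$$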

SLIDE 14

Proof of Second Borel-Cantelli Lemma

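$$A^c = \bigcup_{n} \bigcap_{m=n}^{\infty} A_m^c$$

$$\begin{aligned}
P\left(\bigcap_{m=n}^{\infty} A_m^c\right) &= \lim_{r \to \infty} P\left(\bigcap_{m=n}^{r} A_m^c\right) = \lim_{r \to \infty} \prod_{m=n}^{r} [1 - P(A_m)] = \prod_{m=n}^{\infty} [1 - P(A_m)] \\
&\le \prod_{m=n}^{\infty} \exp[-P(A_m)] = \exp\left(-\sum_{m=n}^{\infty} P(A_m)\right) = 0
\end{aligned}$$

The product form uses the independence of the Am, the inequality uses 1 − x ≤ exp(−x), and the final equality uses the divergence of the sum of the P(Am).

Thus

$$P(A^c) = \lim_{n \to \infty} P\left(\bigcap_{m=n}^{\infty} A_m^c\right) = 0$$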
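The product in this proof can be evaluated exactly in a simple case. The sketch below assumes independent events with P(Am) = 1/m for m ≥ 2 (a hypothetical choice with a divergent sum); the function names are illustrative.

```python
import math
from fractions import Fraction


def prob_none_occur(n: int, r: int) -> Fraction:
    """P(no A_m occurs for n <= m <= r), independent A_m with P(A_m) = 1/m, n >= 2."""
    prod = Fraction(1)
    for m in range(n, r + 1):
        prod *= 1 - Fraction(1, m)  # independence turns the intersection into a product
    return prod


def exp_bound(n: int, r: int) -> float:
    """exp(-sum_{m=n}^{r} P(A_m)), the upper bound obtained from 1 - x <= exp(-x)."""
    return math.exp(-sum(1.0 / m for m in range(n, r + 1)))


if __name__ == "__main__":
    for r in (10, 100, 1000):
        # The product telescopes to (n - 1) / r, which tends to 0 as r grows.
        print(r, prob_none_occur(5, r), exp_bound(5, r))
```

For this choice the product telescopes to (n − 1)/r, so both the exact value and the exponential bound tend to 0 as r → ∞, in line with the conclusion that infinitely many of the Am occur with probability 1.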

SLIDE 15

Reference

Chapter 7, Probability and Random Processes, Grimmett and Stirzaker, Third Edition, 2001.
