18.175: Lecture 11 Independent sums and large deviations



SLIDE 1

18.175: Lecture 11 Independent sums and large deviations

Scott Sheffield

MIT

18.175 Lecture 11

SLIDE 2

Outline

Recollections Large deviations

SLIDE 3

Outline

Recollections Large deviations

SLIDE 4

Recall Borel-Cantelli lemmas

  • First Borel-Cantelli lemma: If $\sum_{n=1}^{\infty} P(A_n) < \infty$ then $P(A_n \text{ i.o.}) = 0$.

  • Second Borel-Cantelli lemma: If $A_n$ are independent, then $\sum_{n=1}^{\infty} P(A_n) = \infty$ implies $P(A_n \text{ i.o.}) = 1$.
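The two lemmas can be illustrated numerically. The following is a minimal sketch, not part of the lecture: it assumes independent events with the hypothetical choices P(A_n) = 1/n (divergent sum) and P(A_n) = 1/n^2 (convergent sum), and the function names are made up for illustration. For independent events, the probability that none of A_m, ..., A_N occurs is the product of the (1 - P(A_n)), while the union bound controls the probability that at least one occurs.

```python
import math

def prob_none_occur(p, m, N):
    """P(no A_n occurs for m <= n <= N), for independent A_n with P(A_n) = p(n) < 1."""
    # Independence gives the product of (1 - p(n)); summing logs avoids underflow.
    return math.exp(sum(math.log1p(-p(n)) for n in range(m, N + 1)))

def tail_sum(p, m, N):
    """p(m) + ... + p(N); by the union bound this bounds P(some A_n occurs, m <= n <= N)."""
    return sum(p(n) for n in range(m, N + 1))

# Hypothetical probability sequences chosen for illustration.
harmonic = lambda n: 1.0 / n      # sum diverges: second lemma applies
squares = lambda n: 1.0 / n**2    # sum converges: first lemma applies

for N in (10**2, 10**4, 10**6):
    print(N, prob_none_occur(harmonic, 10, N), tail_sum(squares, 10, N))
```

In the harmonic case the product telescopes to (m - 1)/N, which tends to 0 as N grows, so for every m some A_n with n ≥ m occurs almost surely. In the 1/n^2 case the tail sum stays below 1/(m - 1), so the chance of seeing any A_n beyond index m is small when m is large.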

SLIDE 5
Kolmogorov zero-one law

  • Consider sequence of random variables $X_n$ on some probability space. Write $\mathcal{F}_n = \sigma(X_n, X_{n+1}, \ldots)$ and $\mathcal{T} = \cap_n \mathcal{F}_n$.

  • $\mathcal{T}$ is called the tail σ-algebra. It contains the information you can observe by looking only at stuff arbitrarily far into the future. Intuitively, membership in tail event doesn’t change when finitely many $X_n$ are changed.

  • Event that $X_n$ converge to a limit is example of a tail event. Other examples?

  • Theorem: If $X_1, X_2, \ldots$ are independent and $A \in \mathcal{T}$ then $P(A) \in \{0, 1\}$.

SLIDE 6
Kolmogorov maximal inequality

  • Theorem: Suppose $X_i$ are independent with mean zero and finite variances, and $S_n = \sum_{i=1}^{n} X_i$. Then
    $P(\max_{1 \le k \le n} |S_k| \ge x) \le x^{-2}\,\mathrm{Var}(S_n) = x^{-2} E|S_n|^2$.

  • Main idea of proof: Consider first time maximum is exceeded. Bound below the expected square sum on that event.
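As a sanity check on the inequality, here is a small sketch, assuming the illustrative case where the X_i are independent ±1 coin flips (so Var(S_n) = n). This choice and the function name are not from the lecture. The left-hand side is computed exactly by tracking the walk until the first time |S_k| reaches x, which mirrors the "first time" idea in the proof.

```python
def prob_max_exceeds(n, x):
    """Exact P(max_{1<=k<=n} |S_k| >= x) for a simple +/-1 random walk, integer x >= 1."""
    # alive[s] = probability the walk sits at s and has not yet reached |S_k| >= x.
    alive = {0: 1.0}
    for _ in range(n):
        nxt = {}
        for s, pr in alive.items():
            for step in (-1, 1):
                t = s + step
                if abs(t) < x:  # paths that reach the level x are dropped here
                    nxt[t] = nxt.get(t, 0.0) + 0.5 * pr
        alive = nxt
    return 1.0 - sum(alive.values())

for n, x in [(2, 2), (100, 20), (400, 30)]:
    bound = n / x**2  # x^{-2} Var(S_n), since each step has variance 1
    print(n, x, prob_max_exceeds(n, x), bound)
```

For n = 2 and x = 2 the two sides agree, so the inequality cannot be improved in general without further assumptions.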

SLIDE 7
Kolmogorov three-series theorem

  • Theorem: Let $X_1, X_2, \ldots$ be independent and fix $A > 0$. Write $Y_i = X_i 1_{(|X_i| \le A)}$. Then $\sum X_i$ converges a.s. if and only if the following are all true:

      • $\sum_{n=1}^{\infty} P(|X_n| > A) < \infty$

      • $\sum_{n=1}^{\infty} EY_n$ converges

      • $\sum_{n=1}^{\infty} \mathrm{Var}(Y_n) < \infty$

  • Main ideas behind the proof: Kolmogorov zero-one law implies that $\sum X_i$ converges with probability $p \in \{0, 1\}$. We just have to show that $p = 1$ when all hypotheses are satisfied (sufficiency of conditions) and $p = 0$ if any one of them fails (necessity).

  • To prove sufficiency, apply Borel-Cantelli to see that probability that $X_n \neq Y_n$ i.o. is zero. Subtract means from $Y_n$, reduce to case that each $Y_n$ has mean zero. Apply Kolmogorov maximal inequality.
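A short worked example may help make the three conditions concrete. The random-sign series below is an illustrative choice, not taken from the lecture; the cutoff A = 1 is likewise an assumption made for convenience.

```latex
% Illustrative example (not from the lecture): random signs divided by n.
% Let \varepsilon_n be i.i.d. with P(\varepsilon_n = 1) = P(\varepsilon_n = -1) = 1/2,
% set X_n = \varepsilon_n / n, and take A = 1, so that Y_n = X_n for every n.
\begin{align*}
\sum_{n=1}^{\infty} P(|X_n| > 1) &= 0 < \infty, \\
\sum_{n=1}^{\infty} E Y_n &= 0 \quad \text{(converges)}, \\
\sum_{n=1}^{\infty} \mathrm{Var}(Y_n) &= \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6} < \infty.
\end{align*}
% Hence \sum_n \varepsilon_n / n converges a.s.
% With X_n = \varepsilon_n / \sqrt{n} instead, the third series is \sum_n 1/n = \infty,
% so \sum_n \varepsilon_n / \sqrt{n} does not converge a.s.
```

By the zero-one law, the second series in the example in fact diverges with probability one.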

SLIDE 8

Outline

Recollections Large deviations

SLIDE 9

Outline

Recollections Large deviations

SLIDE 10
Recall: moment generating functions

  • Let X be a random variable. The moment generating function of X is defined by $M(t) = M_X(t) := E[e^{tX}]$.

  • When X is discrete, can write $M(t) = \sum_x e^{tx} p_X(x)$. So M(t) is a weighted average of countably many exponential functions.

  • When X is continuous, can write $M(t) = \int_{-\infty}^{\infty} e^{tx} f(x)\,dx$. So M(t) is a weighted average of a continuum of exponential functions.

  • We always have M(0) = 1.

  • If b > 0 and t > 0 then $E[e^{tX}] \ge E[e^{t \min\{X, b\}}] \ge P\{X \ge b\}\, e^{tb}$.

  • If X takes both positive and negative values with positive probability then M(t) grows at least exponentially fast in |t| as |t| → ∞.
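The discrete formula and the lower bound can be checked directly. This is a minimal sketch assuming a fair six-sided die as the distribution of X; the die and the helper name `mgf` are illustrative choices, not from the lecture.

```python
import math

def mgf(pmf, t):
    """M(t) = sum over x of e^{t x} p_X(x), for a finite pmf given as {value: probability}."""
    return sum(p * math.exp(t * x) for x, p in pmf.items())

# Hypothetical example distribution: a fair six-sided die.
die = {k: 1 / 6 for k in range(1, 7)}

t, b = 0.5, 5
tail_prob = sum(p for x, p in die.items() if x >= b)  # P{X >= b}

print("M(0) =", mgf(die, 0.0))
print("M(t) =", mgf(die, t))
print("P{X >= b} e^{tb} =", tail_prob * math.exp(t * b))
```

The last two printed values illustrate the bound $E[e^{tX}] \ge P\{X \ge b\} e^{tb}$ for one choice of t and b. The bound holds for every b > 0 and t > 0.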

SLIDE 11
Recall: moment generating functions for i.i.d. sums

  • We showed that if Z = X + Y and X and Y are independent, then $M_Z(t) = M_X(t) M_Y(t)$.

  • If $X_1, \ldots, X_n$ are i.i.d. copies of X and $Z = X_1 + \ldots + X_n$ then what is $M_Z$?

  • Answer: $M_X^n$. Follows by repeatedly applying formula above.

  • This is a big reason for studying moment generating functions. It helps us understand what happens when we sum up a lot of independent copies of the same random variable.
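The product rule can be verified numerically by building the distribution of a sum explicitly. The sketch below assumes fair dice as the summands, and `convolve` is a hypothetical helper written for this example: it forms the pmf of X + Y for independent X and Y, then compares the moment generating function of the sum with the power of the single-die one.

```python
import math
from collections import defaultdict

def mgf(pmf, t):
    """M(t) for a finite pmf given as {value: probability}."""
    return sum(p * math.exp(t * x) for x, p in pmf.items())

def convolve(pmf_x, pmf_y):
    """pmf of X + Y when X and Y are independent."""
    out = defaultdict(float)
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            out[x + y] += px * py
    return dict(out)

die = {k: 1 / 6 for k in range(1, 7)}  # illustrative distribution

# Sum of three independent dice, built by repeated convolution.
total = die
for _ in range(2):
    total = convolve(total, die)

t = 0.3
print(mgf(total, t), mgf(die, t) ** 3)
```

The two printed numbers agree up to floating-point rounding, as $M_Z = M_X^n$ predicts with n = 3.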

SLIDE 12
Large deviations

  • Consider i.i.d. random variables $X_i$. Want to show that if $\phi(\theta) := M_{X_i}(\theta) = E \exp(\theta X_i)$ is less than infinity for some θ > 0, then $P(S_n \ge na) \to 0$ exponentially fast when $a > E[X_i]$.

  • Kind of a quantitative form of the weak law of large numbers. The empirical average $A_n$ is very unlikely to be $\epsilon$ away from its expected value (where “very” means with probability less than some exponentially decaying function of n).

  • Write $\gamma(a) = \lim_{n \to \infty} \frac{1}{n} \log P(S_n \ge na)$. It gives the “rate” of exponential decay as a function of a.
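To see the exponential decay concretely, combine the earlier inequality $E[e^{tX}] \ge P\{X \ge b\} e^{tb}$ (applied to $S_n$ with b = na) with the product rule for i.i.d. sums: for any θ > 0 this gives $P(S_n \ge na) \le e^{-\theta n a} \phi(\theta)^n$. The sketch below assumes fair coin flips, $X_i \in \{0, 1\}$ with probability 1/2 each, so that $\phi(\theta) = (1 + e^\theta)/2$. This distribution, the value a = 0.75, and the function names are illustrative choices, not from the lecture.

```python
import math

def log_tail_exact(n, a):
    """log P(S_n >= n*a) for S_n ~ Binomial(n, 1/2), using exact integer arithmetic."""
    k_min = math.ceil(n * a)
    count = sum(math.comb(n, k) for k in range(k_min, n + 1))
    return math.log(count) - n * math.log(2)

def log_chernoff_bound(n, a, theta):
    """log of e^{-theta*n*a} * phi(theta)^n, with phi(theta) = (1 + e^theta) / 2."""
    log_phi = math.log((1 + math.exp(theta)) / 2)
    return n * (log_phi - theta * a)

a = 0.75
# For this coin-flip example, setting the derivative of log(phi(theta)) - theta*a
# to zero gives e^theta = a / (1 - a), which makes the bound as small as possible.
theta_star = math.log(a / (1 - a))

for n in (10, 100, 1000):
    print(n, log_tail_exact(n, a) / n, log_chernoff_bound(n, a, theta_star) / n)
```

The second column is (1/n) log P(S_n ≥ na), the quantity whose limit defines γ(a), and the third column is the per-sample exponent of the upper bound, which does not depend on n. Working with logarithms avoids floating-point underflow for large n.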

SLIDE 13

MIT OpenCourseWare http://ocw.mit.edu

18.175 Theory of Probability

Spring 2014 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.