Convergence of Random Processes, DS GA 1002 Probability and Statistics for Data Science



slide-1
SLIDE 1

Convergence of Random Processes

DS GA 1002 Probability and Statistics for Data Science

http://www.cims.nyu.edu/~cfgranda/pages/DSGA1002_fall17 Carlos Fernandez-Granda, Brett Bernstein

slide-2
SLIDE 2

Review Question

Let X(1), . . . , X(n) be iid random variables, each with mean µ and variance σ².

  • 1. What is E[X(1) + · · · + X(n)]?
  • 2. What is Std[X(1) + · · · + X(n)] := √(Var[X(1) + · · · + X(n)])?

slide-3
SLIDE 3

Review Question

Let X(1), . . . , X(n) be iid random variables, each with mean µ and variance σ².

  • 1. What is E[X(1) + · · · + X(n)]?
  • Solution. nµ
  • 2. What is Std[X(1) + · · · + X(n)] := √(Var[X(1) + · · · + X(n)])?

slide-4
SLIDE 4

Review Question

Let X(1), . . . , X(n) be iid random variables, each with mean µ and variance σ².

  • 1. What is E[X(1) + · · · + X(n)]?
  • Solution. nµ
  • 2. What is Std[X(1) + · · · + X(n)] := √(Var[X(1) + · · · + X(n)])?
  • Solution. σ√n
slide-5
SLIDE 5

An Experiment In Coin Flipping

We will repeatedly flip a fair coin, and count how many heads we get. More formally, let S(i) = X(1) + · · · + X(i) where X(1), . . . are iid Bernoulli random variables with p = 1/2.
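The running head count S(i) above is easy to simulate; here is a minimal Python sketch (the helper name `running_head_counts` is ours, not from the slides):

```python
import random

def running_head_counts(num_flips, seed=0):
    """Running sums S(i) = X(1) + ... + X(i) of fair-coin flips X(j)."""
    rng = random.Random(seed)
    counts, total = [], 0
    for _ in range(num_flips):
        total += rng.random() < 0.5  # one Bernoulli(1/2) flip
        counts.append(total)
    return counts

S = running_head_counts(100)  # S[9] is S(10), S[99] is S(100)
```

Plotting many such runs reproduces the histograms on the next slides.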

slide-6
SLIDE 6

An Experiment In Coin Flipping

[Figure: coin-flip sums S(i) at i = 10, 20, 30]

slide-7
SLIDE 7

An Experiment In Coin Flipping

[Figure: coin-flip sums S(i) at i = 50, 100]

slide-8
SLIDE 8

An Experiment In Coin Flipping: 5000 flip sequences

[Figure: 5000 flip sequences, S(i) at i = 50, 100]

slide-9
SLIDE 9

An Experiment In Coin Flipping: 5000 flip sequences

Brighter color means more occurrences

[Figure: 5000 flip sequences, S(i) at i = 50, 100]

slide-10
SLIDE 10

An Experiment In Coin Flipping: 5000 flip sequences

[Figure: S(i) at i = 50, 100, with reference lines at µi and µi ± kσ√i for k = 1, 2, 3, 4]

What if we take averages instead of sums?

slide-11
SLIDE 11

An Experiment In Coin Flipping: 5000 flip sequences

Averages (i.e., divide by i)

[Figure: averages at i = 100, 200, with reference lines at µ and µ ± kσ/√i for k = 1, 2, 3, 4]

slide-12
SLIDE 12

An Experiment In Coin Flipping: 5000 flip sequences

Averages (i.e., divide by i)

[Figure: averages at i = 1000, 2000, with reference line at µ]

slide-13
SLIDE 13

An Experiment In Coin Flipping: 5000 flip sequences

[Figure: S(i) at i = 50, 100, with reference lines at µi and µi ± kσ√i for k = 1, 2, 3, 4]

How do we isolate the fluctuations about the mean?

slide-14
SLIDE 14

An Experiment In Coin Flipping: 5000 flip sequences

Subtract µi

[Figure: S(i) − µi at i = 100, 200, with reference lines at ±kσ√i for k = 1, 2, 3, 4]

How do we normalize the scale?

slide-15
SLIDE 15

An Experiment In Coin Flipping: 5000 flip sequences

Subtract µi and then divide by √ i

[Figure: (S(i) − µi)/√i at i = 100, 200, with reference lines at ±kσ for k = 1, 2, 3, 4]

slide-16
SLIDE 16

An Experiment In Coin Flipping: 5000 flip sequences

Subtract µi and then divide by √ i

[Figure: (S(i) − µi)/√i at i = 250, 500, with reference lines at ±kσ for k = 1, 2, 3, 4]

slide-17
SLIDE 17

Aim

How do we rigorously describe the experiments we just conducted?

  • 1. Define convergence for random processes
  • 2. Illustrate some of the subtleties associated with different forms of convergence
  • 3. Describe two convergence phenomena: the law of large numbers and the central limit theorem

slide-18
SLIDE 18

Types of convergence
Law of Large Numbers
Central Limit Theorem
Monte Carlo simulation

slide-19
SLIDE 19

Convergence of deterministic sequences

A deterministic sequence of real numbers x1, x2, . . . converges to x ∈ R,

lim_{i→∞} xi = x,

if xi is arbitrarily close to x as i grows: for any ǫ > 0 there is an i0 such that |xi − x| < ǫ for all i > i0.

Problem: Random sequences do not have fixed values.

slide-20
SLIDE 20

Convergence with probability one

Consider a discrete random process X̃ and a random variable X defined on the same probability space. If we fix the outcome ω, X̃(i, ω) is a deterministic sequence and X(ω) is a constant, so we can determine whether

lim_{i→∞} X̃(i, ω) = X(ω)

for that particular ω.

slide-21
SLIDE 21

Convergence with probability one

[Figure: sample space Ω with outcomes ω1, ω2 and the corresponding sequences X̃(0, ω), X̃(1, ω), . . . , X̃(5, ω)]

slide-22
SLIDE 22

Convergence with probability one

X̃ converges with probability one to X if

P({ω | ω ∈ Ω, lim_{i→∞} X̃(ω, i) = X(ω)}) = 1

Deterministic convergence occurs with probability one. Also called almost sure convergence.

slide-23
SLIDE 23

Puddle

Initial amount of water is uniform between 0 and 1 gallon. After a time interval i there is i times less water:

D̃(ω, i) := ω/i, i = 1, 2, . . .

slide-24
SLIDE 24

Puddle

[Figure: D̃(ω, i) for i = 1, . . . , 10 and ω = 0.31, 0.52, 0.89]

slide-25
SLIDE 25

Puddle

If we fix ω ∈ (0, 1),

lim_{i→∞} D̃(ω, i) = lim_{i→∞} ω/i = 0

so D̃ converges to zero with probability one.
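A one-realization sketch of the puddle process in Python (variable names are ours, not from the slides):

```python
import random

# Draw the initial water level ω uniformly from (0, 1); for that fixed ω,
# D(ω, i) = ω / i is a deterministic sequence tending to 0.
rng = random.Random(1)
omega = rng.random()
puddle = [omega / i for i in range(1, 10_001)]  # D(ω, 1), ..., D(ω, 10000)
```

Every realization ω ∈ (0, 1) gives a decreasing sequence with limit 0, which is the almost-sure convergence the slide states.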
slide-26
SLIDE 26

Puddle

[Figure: realizations of D̃(ω, i) for i = 1, . . . , 50]
slide-27
SLIDE 27

Alternative idea

Idea: Instead of fixing ω and checking deterministic convergence:

  • 1. Measure how close X̃(i) and X are for a fixed i using a deterministic quantity
  • 2. Check whether that quantity tends to zero
slide-28
SLIDE 28

Convergence in mean square

The mean square of Y − X is a measure of how close X and Y are. If E[(X − Y)²] = 0 then X = Y with probability one.

Proof: By Markov's inequality, for any ǫ > 0,

P((Y − X)² > ǫ) ≤ E[(X − Y)²]/ǫ = 0

slide-29
SLIDE 29

Convergence in mean square

X̃ converges to X in mean square if

lim_{i→∞} E[(X − X̃(i))²] = 0

slide-30
SLIDE 30

Convergence in probability

Alternative measure: the probability that |Y − X| > ǫ for small ǫ.

X̃ converges to X in probability if for any ǫ > 0

lim_{i→∞} P(|X − X̃(i)| > ǫ) = 0
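Both deterministic quantities can be estimated by simulation. Here is a Python sketch for the average of fair-coin flips (so µ = 1/2, σ = 1/2; the function name `deviation_stats` is ours, not from the slides):

```python
import random

def deviation_stats(i, eps=0.05, trials=2000, seed=0):
    """Monte Carlo estimates of E[(A(i) - mu)^2] and P(|A(i) - mu| > eps)
    for the average A(i) of i fair-coin flips (mu = 1/2)."""
    rng = random.Random(seed)
    sq_err, exceed = 0.0, 0
    for _ in range(trials):
        a = sum(rng.random() < 0.5 for _ in range(i)) / i
        sq_err += (a - 0.5) ** 2
        exceed += abs(a - 0.5) > eps
    return sq_err / trials, exceed / trials

ms_100, pr_100 = deviation_stats(100)
ms_1600, pr_1600 = deviation_stats(1600)  # both measures shrink as i grows
```

Both estimates tend to zero as i grows, illustrating convergence in mean square and in probability for this process.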
slide-31
SLIDE 31
Conv. in mean square implies conv. in probability

lim_{i→∞} P(|X − X̃(i)| > ǫ)
slide-32
SLIDE 32
Conv. in mean square implies conv. in probability

lim_{i→∞} P(|X − X̃(i)| > ǫ) = lim_{i→∞} P((X − X̃(i))² > ǫ²)

slide-33
SLIDE 33
Conv. in mean square implies conv. in probability

lim_{i→∞} P(|X − X̃(i)| > ǫ) = lim_{i→∞} P((X − X̃(i))² > ǫ²)
                             ≤ lim_{i→∞} E[(X − X̃(i))²] / ǫ²

slide-34
SLIDE 34
Conv. in mean square implies conv. in probability

lim_{i→∞} P(|X − X̃(i)| > ǫ) = lim_{i→∞} P((X − X̃(i))² > ǫ²)
                             ≤ lim_{i→∞} E[(X − X̃(i))²] / ǫ² = 0

slide-35
SLIDE 35
Conv. in mean square implies conv. in probability

lim_{i→∞} P(|X − X̃(i)| > ǫ) = lim_{i→∞} P((X − X̃(i))² > ǫ²)
                             ≤ lim_{i→∞} E[(X − X̃(i))²] / ǫ² = 0

Convergence with probability one also implies convergence in probability.

slide-36
SLIDE 36

Convergence in distribution

The distribution of X̃(i) converges to the distribution of X.

X̃ converges in distribution to X if

lim_{i→∞} F_{X̃(i)}(x) = F_X(x)

for all x at which F_X is continuous.

slide-37
SLIDE 37

Convergence in distribution

Convergence in distribution does not imply that X (i) and X are close as i → ∞! Convergence in probability does imply convergence in distribution

slide-38
SLIDE 38

Binomial tends to Poisson (Review)

X̃(i) is binomial with parameters i and p := λ/i (for i > λ)

◮ X is a Poisson random variable with parameter λ
◮ X̃(i) converges to X in distribution:

lim_{i→∞} p_{X̃(i)}(x) = lim_{i→∞} (i choose x) p^x (1 − p)^(i−x) = λ^x e^(−λ) / x! = p_X(x)
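This pointwise convergence of pmfs can be checked numerically; a Python sketch using only the standard library (function names are ours, not from the slides):

```python
from math import comb, exp, factorial

def binom_pmf(i, lam, x):
    """Binomial pmf with parameters i and p = lam / i, evaluated at x."""
    p = lam / i
    return comb(i, x) * p**x * (1 - p) ** (i - x)

def poisson_pmf(lam, x):
    """Poisson pmf with parameter lam, evaluated at x."""
    return lam**x * exp(-lam) / factorial(x)

# Gap between the two pmfs at x = 20 for λ = 20 shrinks as i grows
gap_40 = abs(binom_pmf(40, 20, 20) - poisson_pmf(20, 20))
gap_400 = abs(binom_pmf(400, 20, 20) - poisson_pmf(20, 20))
```

The gap at i = 400 is already far smaller than at i = 40, matching the pmf plots on the next slides.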

slide-39
SLIDE 39

Probability mass function of X (40) with λ = 20

Binomial with n = 40 and p = 20/40

[Figure: pmf over k = 0, . . . , 40]

slide-40
SLIDE 40

Probability mass function of X (80) with λ = 20

Binomial with n = 80 and p = 20/80

[Figure: pmf over k = 0, . . . , 40]

slide-41
SLIDE 41

Probability mass function of X (400) with λ = 20

Binomial with n = 400 and p = 20/400

[Figure: pmf over k = 0, . . . , 40]

slide-42
SLIDE 42

Probability mass function of X with λ = 20

[Figure: pmf over k = 0, . . . , 40]

slide-43
SLIDE 43

Types of convergence
Law of Large Numbers
Central Limit Theorem
Monte Carlo simulation

slide-44
SLIDE 44

Moving average

The moving average Ã of a discrete random process X̃ is

Ã(i) := (1/i) Σ_{j=1}^{i} X̃(j)
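A direct Python sketch of this definition (function name is ours, not from the slides):

```python
def moving_average(xs):
    """Return A(i) = (1/i) * sum_{j=1}^{i} X(j) for i = 1, ..., len(xs)."""
    averages, total = [], 0.0
    for i, x in enumerate(xs, start=1):
        total += x
        averages.append(total / i)
    return averages

A = moving_average([1, 0, 1, 1])  # [1.0, 0.5, 2/3, 0.75]
```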
slide-45
SLIDE 45

Weak law of large numbers

Let X̃ be an iid discrete random process with mean µ_X̃ := µ and finite variance σ². The average Ã of X̃ converges in mean square to µ.

slide-46
SLIDE 46

Proof

E[Ã(i)]
slide-47
SLIDE 47

Proof

E[Ã(i)] = E[(1/i) Σ_{j=1}^{i} X̃(j)]

slide-48
SLIDE 48

Proof

E[Ã(i)] = E[(1/i) Σ_{j=1}^{i} X̃(j)] = (1/i) Σ_{j=1}^{i} E[X̃(j)]
slide-49
SLIDE 49

Proof

E[Ã(i)] = E[(1/i) Σ_{j=1}^{i} X̃(j)] = (1/i) Σ_{j=1}^{i} E[X̃(j)] = µ
slide-50
SLIDE 50

Proof

Var[Ã(i)]
slide-51
SLIDE 51

Proof

Var[Ã(i)] = Var[(1/i) Σ_{j=1}^{i} X̃(j)]

slide-52
SLIDE 52

Proof

Var[Ã(i)] = Var[(1/i) Σ_{j=1}^{i} X̃(j)] = (1/i²) Σ_{j=1}^{i} Var[X̃(j)]
slide-53
SLIDE 53

Proof

Var[Ã(i)] = Var[(1/i) Σ_{j=1}^{i} X̃(j)] = (1/i²) Σ_{j=1}^{i} Var[X̃(j)] = σ²/i

slide-54
SLIDE 54

Proof

lim_{i→∞} E[(Ã(i) − µ)²]

slide-55
SLIDE 55

Proof

lim_{i→∞} E[(Ã(i) − µ)²] = lim_{i→∞} E[(Ã(i) − E[Ã(i)])²]

slide-56
SLIDE 56

Proof

lim_{i→∞} E[(Ã(i) − µ)²] = lim_{i→∞} E[(Ã(i) − E[Ã(i)])²] = lim_{i→∞} Var[Ã(i)]
slide-57
SLIDE 57

Proof

lim_{i→∞} E[(Ã(i) − µ)²] = lim_{i→∞} E[(Ã(i) − E[Ã(i)])²] = lim_{i→∞} Var[Ã(i)] = lim_{i→∞} σ²/i

slide-58
SLIDE 58

Proof

lim_{i→∞} E[(Ã(i) − µ)²] = lim_{i→∞} E[(Ã(i) − E[Ã(i)])²] = lim_{i→∞} Var[Ã(i)] = lim_{i→∞} σ²/i = 0

slide-59
SLIDE 59

Strong law of large numbers

Let X̃ be an iid discrete random process with mean µ_X̃ := µ. The average Ã of X̃ converges with probability one to µ.

slide-60
SLIDE 60

Our Bernoulli Experiment: Look at averages

[Figure: averages at i = 1000, 2000, with reference line at µ]

slide-61
SLIDE 61

iid standard Gaussian

[Figure: moving average vs. mean of the iid sequence, i = 1, . . . , 50]

slide-62
SLIDE 62

iid standard Gaussian

[Figure: moving average vs. mean of the iid sequence, i = 1, . . . , 500]

slide-63
SLIDE 63

iid standard Gaussian

[Figure: moving average vs. mean of the iid sequence, i = 1, . . . , 5000]

slide-64
SLIDE 64

iid geometric with p = 0.4

[Figure: moving average vs. mean of the iid sequence, i = 1, . . . , 50]

slide-65
SLIDE 65

iid geometric with p = 0.4

[Figure: moving average vs. mean of the iid sequence, i = 1, . . . , 500]

slide-66
SLIDE 66

iid geometric with p = 0.4

[Figure: moving average vs. mean of the iid sequence, i = 1, . . . , 5000]

slide-67
SLIDE 67

iid Cauchy

[Figure: moving average vs. median of the iid sequence, i = 1, . . . , 50]

slide-68
SLIDE 68

iid Cauchy

[Figure: moving average vs. median of the iid sequence, i = 1, . . . , 500]

slide-69
SLIDE 69

iid Cauchy

[Figure: moving average vs. median of the iid sequence, i = 1, . . . , 5000]

slide-70
SLIDE 70

Strong law of large numbers

Why do we care about the convergence of averages?

slide-71
SLIDE 71

Strong law of large numbers

Why do we care about the convergence of averages? It is one of the most fundamental tools a statistician/data scientist has access to. The SLLN says that as we acquire more data, the average will always converge to the true mean. It justifies the convergence of pointwise estimators (coming soon).

slide-72
SLIDE 72

Question to think about during break

  • 1. Suppose X(1), . . . are iid with E[X(i)] = µ and E[X(i)²] = η. Construct two sequences of random variables from the X(i) that converge to η and µ², respectively, with probability one.

slide-73
SLIDE 73

Question to think about during break

  • 1. Suppose X(1), . . . are iid with E[X(i)] = µ and E[X(i)²] = η. Construct two sequences of random variables from the X(i) that converge to η and µ², respectively, with probability one.
  • Solution. (1/n) Σ_{i=1}^{n} X(i)² → η
slide-74
SLIDE 74

Question to think about during break

  • 1. Suppose X(1), . . . are iid with E[X(i)] = µ and E[X(i)²] = η. Construct two sequences of random variables from the X(i) that converge to η and µ², respectively, with probability one.
  • Solution. (1/n) Σ_{i=1}^{n} X(i)² → η and ((1/n) Σ_{i=1}^{n} X(i))² → µ².
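A quick simulation check of both limits, using Uniform(0, 1) variables as a concrete choice (so µ = 1/2 and η = E[X²] = 1/3; this setup is ours, not from the slides):

```python
import random

rng = random.Random(0)
n = 200_000
xs = [rng.random() for _ in range(n)]  # iid Uniform(0, 1)

eta_hat = sum(x * x for x in xs) / n      # (1/n) Σ X(i)^2, should be near 1/3
mu_sq_hat = (sum(xs) / n) ** 2            # ((1/n) Σ X(i))^2, should be near 1/4
```

Both sequences are averages of iid quantities (directly, or inside a continuous function), so the strong law gives almost-sure convergence.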

slide-75
SLIDE 75

Types of convergence
Law of Large Numbers
Central Limit Theorem
Monte Carlo simulation

slide-76
SLIDE 76

Central Limit Theorem

Let X̃ be an iid discrete random process with mean µ_X̃ := µ and finite variance σ². Then √i (Ã(i) − µ) converges in distribution to a Gaussian random variable with mean 0 and variance σ². The average Ã(i) is approximately Gaussian with mean µ and variance σ²/i.
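A Python sketch of this statement for coin-flip averages: the empirical CDF of √i (Ã(i) − µ)/σ should approach the standard Gaussian CDF Φ (the helper name is ours; `statistics.NormalDist` supplies Φ):

```python
import random
from statistics import NormalDist

def standardized_average(i, rng):
    """One draw of sqrt(i) * (A(i) - mu) / sigma for the average of i
    fair-coin flips (mu = 0.5, sigma = 0.5)."""
    a = sum(rng.random() < 0.5 for _ in range(i)) / i
    return i ** 0.5 * (a - 0.5) / 0.5

rng = random.Random(0)
samples = [standardized_average(400, rng) for _ in range(5000)]
frac = sum(s <= 1.0 for s in samples) / len(samples)  # empirical CDF at 1
phi_1 = NormalDist().cdf(1.0)                         # Φ(1) ≈ 0.8413
```

The empirical fraction below 1 lands close to Φ(1), even though each individual flip is far from Gaussian.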

slide-77
SLIDE 77

Height data

◮ Example: Data from a population of 25 000 people
◮ We compare the histogram of the heights and the pdf of a Gaussian random variable fitted to the data

slide-78
SLIDE 78

Height data

[Figure: histogram of heights (inches) from the real data, with the fitted Gaussian pdf]

slide-79
SLIDE 79

Sketch of proof

The pdf of the sum of two independent random variables is the convolution of their pdfs:

f_{X+Y}(z) = ∫_{−∞}^{∞} f_X(z − y) f_Y(y) dy

Repeated convolution of any pdf with finite variance results in a Gaussian!
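The repeated-convolution picture can be reproduced for a discrete pmf; a Python sketch with a fair die (function name is ours, not from the slides):

```python
def convolve(p, q):
    """Discrete convolution of two pmfs given as lists of probabilities."""
    out = [0.0] * (len(p) + len(q) - 1)
    for a, pa in enumerate(p):
        for b, qb in enumerate(q):
            out[a + b] += pa * qb
    return out

die = [1 / 6] * 6      # pmf of one fair die (flat, nothing Gaussian about it)
pmf = die
for _ in range(4):     # after 4 convolutions: pmf of the sum of 5 dice
    pmf = convolve(pmf, die)
```

After only a few convolutions the flat pmf becomes a symmetric bell shape peaked at the mean, which is the mechanism behind the CLT.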

slide-80
SLIDE 80

Repeated convolutions

[Figure: repeated convolutions of a pdf, i = 1, . . . , 5]

slide-81
SLIDE 81

Repeated convolutions

[Figure: repeated convolutions of a pdf, i = 1, . . . , 5]

slide-82
SLIDE 82

iid exponential λ = 2, i = 10²

[Figure: histogram of the average of i iid samples]

slide-83
SLIDE 83

iid exponential λ = 2, i = 10³

[Figure: histogram of the average of i iid samples]

slide-84
SLIDE 84

iid exponential λ = 2, i = 10⁴

[Figure: histogram of the average of i iid samples]

slide-85
SLIDE 85

iid geometric p = 0.4, i = 10²

[Figure: histogram of the average of i iid samples]

slide-86
SLIDE 86

iid geometric p = 0.4, i = 10³

[Figure: histogram of the average of i iid samples]

slide-87
SLIDE 87

iid geometric p = 0.4, i = 10⁴

[Figure: histogram of the average of i iid samples]

slide-88
SLIDE 88

iid Cauchy, i = 10²

[Figure: histogram of the average of i iid samples]

slide-89
SLIDE 89

iid Cauchy, i = 10³

[Figure: histogram of the average of i iid samples]

slide-90
SLIDE 90

iid Cauchy, i = 10⁴

[Figure: histogram of the average of i iid samples]

slide-91
SLIDE 91

Gaussian approximation to the binomial

X is binomial with parameters n and p. Computing the probability that X is in a certain interval requires summing its pmf over the interval. The central limit theorem provides a quick approximation:

X = Σ_{i=1}^{n} B_i, E(B_i) = p, Var(B_i) = p(1 − p)

(1/n) X is approximately Gaussian with mean p and variance p(1 − p)/n.

X is approximately Gaussian with mean np and variance np(1 − p).

slide-92
SLIDE 92

Gaussian approximation to the binomial

Basketball player makes each shot with probability p = 0.4 (shots are iid). Probability that she makes at least 420 shots out of 1000?

Exact answer:

P(X ≥ 420) = Σ_{x=420}^{1000} p_X(x) = Σ_{x=420}^{1000} (1000 choose x) 0.4^x 0.6^(1000−x) = 10.4 · 10⁻²

Approximation: P(X ≥ 420)

slide-93
SLIDE 93

Gaussian approximation to the binomial

Basketball player makes each shot with probability p = 0.4 (shots are iid). Probability that she makes at least 420 shots out of 1000?

Exact answer:

P(X ≥ 420) = Σ_{x=420}^{1000} p_X(x) = Σ_{x=420}^{1000} (1000 choose x) 0.4^x 0.6^(1000−x) = 10.4 · 10⁻²

Approximation (U is standard Gaussian):

P(X ≥ 420) ≈ P(√(np(1 − p)) U + np ≥ 420)
slide-94
SLIDE 94

Gaussian approximation to the binomial

Basketball player makes each shot with probability p = 0.4 (shots are iid). Probability that she makes at least 420 shots out of 1000?

Exact answer:

P(X ≥ 420) = Σ_{x=420}^{1000} p_X(x) = Σ_{x=420}^{1000} (1000 choose x) 0.4^x 0.6^(1000−x) = 10.4 · 10⁻²

Approximation (U is standard Gaussian):

P(X ≥ 420) ≈ P(√(np(1 − p)) U + np ≥ 420) = P(U ≥ 1.29)
slide-95
SLIDE 95

Gaussian approximation to the binomial

Basketball player makes each shot with probability p = 0.4 (shots are iid). Probability that she makes at least 420 shots out of 1000?

Exact answer:

P(X ≥ 420) = Σ_{x=420}^{1000} p_X(x) = Σ_{x=420}^{1000} (1000 choose x) 0.4^x 0.6^(1000−x) = 10.4 · 10⁻²

Approximation (U is standard Gaussian):

P(X ≥ 420) ≈ P(√(np(1 − p)) U + np ≥ 420) = P(U ≥ 1.29) = 1 − Φ(1.29) = 9.85 · 10⁻²
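Both numbers can be reproduced in Python with the standard library (`math.comb` for the exact sum, `statistics.NormalDist` for Φ):

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 1000, 0.4

# Exact: sum the binomial pmf from 420 to 1000
exact = sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(420, n + 1))

# CLT approximation: z = (420 - np) / sqrt(np(1 - p)) = 20 / sqrt(240) ≈ 1.29
z = (420 - n * p) / sqrt(n * p * (1 - p))
approx = 1 - NormalDist().cdf(z)
```

The exact value is about 0.104 and the approximation about 0.0985, matching the slide.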

slide-96
SLIDE 96

CLT: Things to think about

  • 1. The CLT allows us to model many phenomena using Gaussian distributions
  • 2. General intuition: an average of random variables concentrates tightly around the mean, since the Gaussian distribution has very thin tails (i.e., its pdf decays very quickly)

slide-97
SLIDE 97

CLT vs Chebyshev: 5000 Flip Sequences

[Figure: S(i) at i = 50, 100, with reference lines at µi and µi ± kσ√i for k = 1, 2, 3, 4]

Chebyshev says Pr(|X − µ| > 3σ) ≤ 1/9.

The CLT approximation says Pr(|X − µ| > 3σ) ≈ 3/1000.
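The two numbers can be checked directly in Python (`statistics.NormalDist` supplies Φ):

```python
from statistics import NormalDist

chebyshev_bound = 1 / 9                  # Chebyshev: Pr(|X − µ| > 3σ) ≤ 1/3²
clt_estimate = 2 * NormalDist().cdf(-3)  # CLT: Pr(|X − µ| > 3σ) ≈ 2Φ(−3)
```

The CLT estimate (about 2.7 · 10⁻³) is roughly forty times smaller than the Chebyshev bound, which holds for any distribution with finite variance.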

slide-98
SLIDE 98

Types of convergence
Law of Large Numbers
Central Limit Theorem
Monte Carlo simulation

slide-99
SLIDE 99

Monte Carlo simulation

Simulation is a powerful tool in probability and statistics. Models are often too complex to derive closed-form solutions (life is not a homework problem!). Example: the game of solitaire.

slide-100
SLIDE 100

Game of solitaire

Aim: Compute the probability that you win at solitaire. If every permutation of the cards has the same probability,

P(Win) = (number of permutations that lead to a win) / (total number of permutations)

Problem: Characterizing the permutations that lead to a win is very difficult without playing out the game, and we can't just check them all because there are 52! ≈ 8 · 10⁶⁷ permutations! Solution: Sample many permutations and compute the fraction of wins.

slide-101
SLIDE 101

In the words of Stanislaw Ulam

The first thoughts and attempts I made to practice (the Monte Carlo Method) were suggested by a question which occurred to me in 1946 as I was convalescing from an illness and playing solitaires. The question was what are the chances that a Canfield solitaire laid out with 52 cards will come out successfully? After spending a lot of time trying to estimate them by pure combinatorial calculations, I wondered whether a more practical method than "abstract thinking" might not be to lay it out say one hundred times and simply observe and count the number of successful plays. This was already possible to envisage with the beginning of the new era of fast computers.

slide-102
SLIDE 102

Monte Carlo approximation

Main principle: Use simulation to approximate quantities that are challenging to compute exactly. To approximate the probability of an event E:

  • 1. Generate n independent samples of the indicator 1_E: I1, I2, . . . , In
  • 2. Compute the average of the n samples: A(n) := (1/n) Σ_{i=1}^{n} I_i

By the law of large numbers, A(n) converges to P(E) as n → ∞, since E(1_E) = P(E).
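A minimal Python sketch of this recipe, applied to a toy event whose probability is known exactly (the helper name and the dice example are ours, not from the slides):

```python
import random

def monte_carlo_probability(event, sample, n, seed=0):
    """Estimate P(E) as the average of n iid indicator samples 1_E."""
    rng = random.Random(seed)
    return sum(event(sample(rng)) for _ in range(n)) / n

# Hypothetical example: probability that two fair dice sum to 7 (exactly 1/6)
est = monte_carlo_probability(
    event=lambda dice: sum(dice) == 7,
    sample=lambda rng: (rng.randint(1, 6), rng.randint(1, 6)),
    n=100_000,
)
```

With 100,000 samples the estimate is within about a hundredth of the true value 1/6.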

slide-103
SLIDE 103

Basketball league

Basketball league with m teams. In a season, every pair of teams plays once. Teams are ordered: team 1 is best, team m is worst.

Model: For 1 ≤ i < j ≤ m,

P(team j beats team i) := 1/(j − i + 1)

Games are independent.

slide-104
SLIDE 104

Basketball league

Aim: Compute the probability of each team's rank at the end of the season. The rank of team i is modeled as a random variable Ri; Ri = j means team i finished in jth place. What is the pmf of R1, R2, . . . , Rm?

slide-105
SLIDE 105

m = 3

Game outcomes        Rank           Probability
1-2  1-3  2-3        R1  R2  R3
 1    1    2          1   2   3     1/6
 1    1    3          1   3   2     1/6
 1    3    2          1   1   1     1/12
 1    3    3          2   3   1     1/12
 2    1    2          2   1   3     1/6
 2    1    3          1   1   1     1/6
 2    3    2          3   1   2     1/12
 2    3    3          3   2   1     1/12

(Each game-outcome column gives the winner of that game.)

slide-106
SLIDE 106

m = 3

Probability mass function

      R1     R2    R3
1    7/12   1/2   5/12
2    1/4    1/4   1/4
3    1/6    1/4   1/3
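This pmf can be approximated by simulating seasons; a Python sketch of the model above (function name is ours; teams are 0-indexed in code, so `wins[0]` is team 1):

```python
import random

def simulate_season(m, rng):
    """One season: every pair (i < j) plays once; the weaker team j beats
    team i with probability 1/(j - i + 1). Returns each team's rank, where
    rank = 1 + number of teams with strictly more wins (ties share a rank)."""
    wins = [0] * m
    for i in range(m):
        for j in range(i + 1, m):
            if rng.random() < 1 / (j - i + 1):  # upset: team j wins
                wins[j] += 1
            else:
                wins[i] += 1
    return [1 + sum(w > wins[t] for w in wins) for t in range(m)]

rng = random.Random(0)
n = 50_000
first_place = sum(simulate_season(3, rng)[0] == 1 for _ in range(n)) / n
# estimate of P(R1 = 1); the exact value from the table is 7/12 ≈ 0.583
```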

slide-107
SLIDE 107

Basketball league: How do we compute the PMF table?

Problem: The number of possible outcomes is 2^(m(m−1)/2). For m = 10 this is larger than 10¹³! Solution: Apply the Monte Carlo approximation.

slide-108
SLIDE 108

m = 3

Game outcomes        Rank
1-2  1-3  2-3        R1  R2  R3
 1    3    2          1   1   1
 1    1    3          1   3   2
 2    1    2          2   1   3
 2    3    2          3   1   2
 2    1    3          1   1   1
 1    1    2          1   2   3
 2    1    3          1   1   1
 2    3    2          3   1   2
 1    1    2          1   2   3
 2    3    2          3   1   2

slide-109
SLIDE 109

m = 3

Estimated pmf (n = 10), exact values in parentheses

      R1            R2           R3
1    0.6 (0.583)   0.7 (0.5)    0.3 (0.417)
2    0.1 (0.25)    0.2 (0.25)   0.4 (0.25)
3    0.3 (0.167)   0.1 (0.25)   0.3 (0.333)

slide-110
SLIDE 110

m = 3

Estimated pmf (n = 2,000), exact values in parentheses

      R1              R2             R3
1    0.582 (0.583)   0.496 (0.5)    0.417 (0.417)
2    0.248 (0.25)    0.261 (0.25)   0.244 (0.25)
3    0.171 (0.167)   0.245 (0.25)   0.339 (0.333)

slide-111
SLIDE 111

Running times

[Figure: running time in seconds (log scale, 10⁻³ to 10³) vs. number of teams m = 2, . . . , 20, for exact computation and the Monte Carlo approximation]

slide-112
SLIDE 112

Error

m    Average error
3    9.28 · 10⁻³
4    12.7 · 10⁻³
5    7.95 · 10⁻³
6    7.12 · 10⁻³
7    7.12 · 10⁻³

slide-113
SLIDE 113

m = 5: PMF as Heat Map

slide-114
SLIDE 114

m = 20: PMF as Heat Map

slide-115
SLIDE 115

m = 100: PMF as Heat Map

slide-116
SLIDE 116

Monte Carlo Question to Think About

  • 1. Suppose we want to use Monte Carlo to approximate the probability p of an event E. How many steps should we use? More precisely, suppose we want an n-step Monte Carlo approximation p̂ such that Pr(|p − p̂| > ǫ) is small. How can we bound this probability?

slide-117
SLIDE 117

Monte Carlo Question to Think About

  • 1. Suppose we want to use Monte Carlo to approximate the probability p of an event E. How many steps should we use? More precisely, suppose we want an n-step Monte Carlo approximation p̂ such that Pr(|p − p̂| > ǫ) is small. How can we bound this probability?
  • Solution. Note that 1_E is Bernoulli, so Var(1_E) ≤ 1/4. Thus Var(p̂) ≤ 1/(4n), where n is the number of iterations. We can use the CLT to compute an approximate bound of 2Φ(−ǫ√(4n)).
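A Python sketch of this CLT bound (function name is ours; `statistics.NormalDist` supplies Φ):

```python
from math import sqrt
from statistics import NormalDist

def clt_error_bound(eps, n):
    """Approximate bound on Pr(|p - p_hat| > eps) after n Monte Carlo steps,
    using Var(p_hat) <= 1/(4n): returns 2 * Phi(-eps * sqrt(4n))."""
    return 2 * NormalDist().cdf(-eps * sqrt(4 * n))

bound = clt_error_bound(eps=0.01, n=10_000)  # ǫ√(4n) = 2, so 2Φ(−2) ≈ 0.0455
```

For example, with ǫ = 0.01 and n = 10,000 steps the approximate bound is about 4.6%; quadrupling n to 40,000 drives it below 10⁻⁴.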