CS70: Jean Walrand: Lecture 36. Gaussian and CLT

SLIDE 4

CS70: Jean Walrand: Lecture 36.

Gaussian and CLT

Warning: This lecture is also rated R.

  • 1. Review of continuous probability
  • 2. Motivation for Gaussian
  • 3. Gaussian
  • 4. CLT

SLIDE 10

Review of Continuous Probability

Ω is a continuous space. The probability of any single outcome is 0, so we work with events. Example: James Bond lands at a position uniform on [0,1000]. The probability that he lands in an interval [a,b] ⊆ [0,1000] is (b−a)/1000.
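
A quick numerical check (a minimal Python sketch; the endpoints a = 200 and b = 450 are arbitrary choices for illustration):

```python
import random

# Monte Carlo check of Pr[a <= X <= b] = (b - a)/1000 for X uniform
# on [0, 1000]; a and b are arbitrary illustrative endpoints.
a, b, n = 200.0, 450.0, 100_000
hits = sum(a <= random.uniform(0, 1000) <= b for _ in range(n))
print(hits / n)        # ~ 0.25
print((b - a) / 1000)  # exact: 0.25
```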

SLIDE 23

Random Variables

A continuous random variable X is specified by:

  • 1. FX(x) = Pr[X ≤ x] for all x: the Cumulative Distribution Function (cdf). Pr[a < X ≤ b] = FX(b) − FX(a).

    1.1 0 ≤ FX(x) ≤ 1 for all x ∈ ℜ.
    1.2 FX(x) ≤ FX(y) if x ≤ y.

  • 2. Or fX(x), where FX(x) = ∫_{−∞}^{x} fX(y) dy, i.e., fX(x) = dFX(x)/dx: the Probability Density Function (pdf). Pr[a < X ≤ b] = ∫_{a}^{b} fX(x) dx = FX(b) − FX(a).

    2.1 fX(x) ≥ 0 for all x ∈ ℜ.
    2.2 ∫_{−∞}^{∞} fX(x) dx = 1.

Recall that Pr[X ∈ (x, x+δ)] ≈ fX(x)δ. Think of X as taking the discrete values nδ for n = ..., −2, −1, 0, 1, 2, ... with Pr[X = nδ] = fX(nδ)δ.
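
This discretization is easy to check numerically (a minimal sketch; the Expo(1) density from a later slide, f(x) = e^{−x} for x ≥ 0, is used as the test case):

```python
import math

# Riemann-sum check of the discrete view: summing f(n*delta)*delta over a
# grid approximates Pr[a < X <= b] = F(b) - F(a).
f = lambda x: math.exp(-x)
a, b, delta = 0.5, 2.0, 1e-4
approx = sum(f(a + n * delta) * delta for n in range(int((b - a) / delta)))
exact = math.exp(-a) - math.exp(-b)
print(approx, exact)  # both ~ 0.4712
```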

SLIDE 28

A Picture

The pdf fX(x) is a nonnegative function that integrates to 1. The cdf FX(x) is the integral of fX:

Pr[x < X < x+δ] ≈ fX(x)δ,   Pr[X ≤ x] = FX(x) = ∫_{−∞}^{x} fX(y) dy.

SLIDE 29

Example: U[a,b]
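
The slide itself is a figure, presumably the pdf and cdf of U[a,b], which did not survive extraction. For reference, a minimal sketch of those two standard functions:

```python
# pdf and cdf of U[a, b]: density 1/(b - a) on [a, b], cdf rising
# linearly from 0 at a to 1 at b. (Reconstructed; the slide's own
# figure is not available in this transcript.)
def uniform_pdf(x: float, a: float, b: float) -> float:
    return 1.0 / (b - a) if a <= x <= b else 0.0

def uniform_cdf(x: float, a: float, b: float) -> float:
    return min(max((x - a) / (b - a), 0.0), 1.0)

print(uniform_pdf(0.5, 0.0, 2.0))  # 0.5
print(uniform_cdf(1.0, 0.0, 2.0))  # 0.5
```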

SLIDE 32

Expo(λ)

The exponential distribution with parameter λ > 0 is defined by

fX(x) = λ e^{−λx} 1{x ≥ 0}

FX(x) = 0 if x < 0;  1 − e^{−λx} if x ≥ 0.
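
A minimal simulation check of this cdf, sampling by inversion (X = −ln(U)/λ for U uniform on (0,1); λ = 2 and x = 0.7 are arbitrary):

```python
import math, random

# Sample Expo(lam) by inversion and check F(x) = 1 - e^{-lam*x} empirically.
lam, x, n = 2.0, 0.7, 200_000
below = sum(-math.log(1.0 - random.random()) / lam <= x for _ in range(n))
print(below / n)               # empirical Pr[X <= x]
print(1 - math.exp(-lam * x))  # exact: ~0.7534
```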

SLIDE 33

Shooting in a circle
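
The slide's figure and derivation are not recoverable from this transcript. In the usual version of this example, a dart lands uniformly in the unit disc, and the distance X to the center has cdf Pr[X ≤ x] = x² (a ratio of areas), hence pdf 2x on [0,1]. A minimal sketch under that assumption:

```python
import math, random

# Rejection-sample a uniform point in the unit disc and record its distance
# to the center; empirically, Pr[X <= x] should be ~ x^2. (Reconstructed
# standard example; the original slide content did not survive extraction.)
def sample_distance() -> float:
    while True:
        u, v = random.uniform(-1, 1), random.uniform(-1, 1)
        if u * u + v * v <= 1:
            return math.hypot(u, v)

n, x = 100_000, 0.5
print(sum(sample_distance() <= x for _ in range(n)) / n)  # ~ 0.25 = x^2
```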

SLIDE 41

Expectation

Definition: The expectation of a random variable X with pdf fX(x) is defined as

E[X] = ∫_{−∞}^{∞} x fX(x) dx.

Justification: Say X = nδ w.p. fX(nδ)δ. Then,

E[X] = ∑_n (nδ) Pr[X = nδ] = ∑_n (nδ) fX(nδ)δ = ∫_{−∞}^{∞} x fX(x) dx.

Indeed, ∫ g(x) dx ≈ ∑_n g(nδ)δ with g(x) = x fX(x).
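
The same approximation in code (a sketch; Expo(1), for which E[X] = 1, is the test case, and T truncates the negligible tail of the integral):

```python
import math

# E[X] as a Riemann sum: sum (n*delta) * f(n*delta) * delta.
# For f(x) = e^{-x} on x >= 0 (Expo(1)), the exact answer is E[X] = 1.
f = lambda x: math.exp(-x)
delta, T = 1e-3, 40.0
print(sum((n * delta) * f(n * delta) * delta for n in range(int(T / delta))))
# ~ 1.0
```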

SLIDE 48

Expectation of function of RV

Definition: The expectation of a function of a random variable is defined as

E[h(X)] = ∫_{−∞}^{∞} h(x) fX(x) dx.

Justification: Say X = nδ w.p. fX(nδ)δ. Then,

E[h(X)] = ∑_n h(nδ) Pr[X = nδ] = ∑_n h(nδ) fX(nδ)δ = ∫_{−∞}^{∞} h(x) fX(x) dx.

Indeed, ∫ g(x) dx ≈ ∑_n g(nδ)δ with g(x) = h(x) fX(x).

Fact: Expectation is linear.
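
A quick simulation check of both the formula and linearity (a sketch; h(x) = x² and X = Expo(1), for which E[X²] = 2, are arbitrary choices):

```python
import math, random

# Check E[h(X)] by simulation for h(x) = x^2, X = Expo(1) (E[X^2] = 2),
# and linearity: E[2X + 3] = 2 E[X] + 3 = 5.
n = 200_000
xs = [-math.log(1.0 - random.random()) for _ in range(n)]
print(sum(x * x for x in xs) / n)      # ~ 2.0
print(sum(2 * x + 3 for x in xs) / n)  # ~ 5.0
```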

SLIDE 52

Variance

Definition: The variance of a continuous random variable X is defined as

var[X] = E((X − E(X))²) = E(X²) − (E(X))²
       = ∫_{−∞}^{∞} x² f(x) dx − ( ∫_{−∞}^{∞} x f(x) dx )².
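
Checked numerically (a sketch; U[0,1], for which var[X] = 1/12, is an arbitrary test case):

```python
import random

# var[X] = E[X^2] - (E[X])^2, estimated for X = U[0,1],
# where E[X] = 1/2 and var[X] = 1/12 ~ 0.0833.
n = 200_000
xs = [random.random() for _ in range(n)]
m = sum(xs) / n
print(sum(x * x for x in xs) / n - m * m)  # ~ 0.0833
```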

SLIDE 60

Motivation for Gaussian Distribution

Key fact: The sum of many small independent RVs has approximately a Gaussian distribution. This is the Central Limit Theorem. (See later.) Examples: the Binomial and the Poisson, suitably scaled. This explains why the Gaussian distribution (the bell curve) shows up everywhere.

SLIDE 67

Normal Distribution

For any µ and σ, a normal (aka Gaussian) random variable Y, which we write as Y = N(µ,σ²), has pdf

fY(y) = (1/√(2πσ²)) e^{−(y−µ)²/(2σ²)}.

The standard normal has µ = 0 and σ = 1.

Note: Pr[|Y − µ| > 1.65σ] ≈ 10%; Pr[|Y − µ| > 2σ] ≈ 5%.
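
These two tail figures are easy to confirm from the standard normal cdf, Φ(z) = (1 + erf(z/√2))/2 (a minimal sketch):

```python
import math

# Pr[|Y - mu| > z*sigma] = 2 * (1 - Phi(z)) for a normal Y.
phi = lambda z: (1 + math.erf(z / math.sqrt(2))) / 2
for z in (1.65, 2.0):
    print(z, 2 * (1 - phi(z)))
# 1.65 -> ~0.099 (the "10%"), 2.0 -> ~0.0455 (the "5%")
```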

SLIDE 77

Scaling and Shifting

Theorem: Let X = N(0,1) and Y = µ + σX. Then Y = N(µ,σ²).

Proof: fX(x) = (1/√(2π)) exp{−x²/2}. Now,

fY(y) dy = Pr[Y ∈ [y, y+dy]]
         = Pr[µ + σX ∈ [y, y+dy]]
         = Pr[σX ∈ [y−µ, y−µ+dy]]
         = Pr[X ∈ [(y−µ)/σ, (y−µ)/σ + dy/σ]]
         = fX((y−µ)/σ) (dy/σ)
         = (1/σ) fX((y−µ)/σ) dy
         = (1/√(2πσ²)) exp{−(y−µ)²/(2σ²)} dy.
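
A simulation sanity check of the theorem (a sketch; µ = 3 and σ = 2 are arbitrary):

```python
import random

# Y = mu + sigma * X with X = N(0,1) should have mean mu and variance sigma^2.
mu, sigma, n = 3.0, 2.0, 200_000
ys = [mu + sigma * random.gauss(0, 1) for _ in range(n)]
mean = sum(ys) / n
print(mean)                                  # ~ 3.0
print(sum((y - mean) ** 2 for y in ys) / n)  # ~ 4.0
```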

SLIDE 88

Expectation, Variance

Theorem: If Y = N(µ,σ²), then E[Y] = µ and var[Y] = σ².

Proof: It suffices to show the result for X = N(0,1), since Y = µ + σX. Thus, fX(x) = (1/√(2π)) exp{−x²/2}.

First note that E[X] = 0, by symmetry. Then

var[X] = E[X²] = ∫ x² (1/√(2π)) exp{−x²/2} dx
       = −(1/√(2π)) ∫ x d exp{−x²/2}
       = (1/√(2π)) ∫ exp{−x²/2} dx   (by integration by parts)
       = ∫ fX(x) dx = 1.
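
The key identity E[X²] = 1 can also be confirmed by a direct Riemann sum (a sketch; the range [−10, 10] captures essentially all of the mass):

```python
import math

# Numerical check that E[X^2] = 1 for X = N(0,1).
phi = lambda x: math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
delta = 1e-3
grid = (-10 + n * delta for n in range(int(20 / delta)))
print(sum(x * x * phi(x) * delta for x in grid))  # ~ 1.0
```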
slide-89
SLIDE 89

Central limit theorem.

Law of Large Numbers: For any set of independent identically distributed random variables, Xi, An = 1

n ∑Xi “tends to the

mean.”

slide-90
SLIDE 90

Central limit theorem.

Law of Large Numbers: For any set of independent identically distributed random variables, Xi, An = 1

n ∑Xi “tends to the

mean.” Say Xi have expecation µ = E(Xi) and variance σ2.

slide-91
SLIDE 91

Central limit theorem.

Law of Large Numbers: For any set of independent identically distributed random variables, Xi, An = 1

n ∑Xi “tends to the

mean.” Say Xi have expecation µ = E(Xi) and variance σ2. Mean of An is µ, and variance is σ2

n .

slide-92
SLIDE 92

Central limit theorem.

Law of Large Numbers: For any set of independent identically distributed random variables, Xi, An = 1

n ∑Xi “tends to the

mean.” Say Xi have expecation µ = E(Xi) and variance σ2. Mean of An is µ, and variance is σ2

n .

Let A′

n = An−µ σ/√n.

slide-93
SLIDE 93

Central limit theorem.

Law of Large Numbers: For any set of independent identically distributed random variables, Xi, An = 1

n ∑Xi “tends to the

mean.” Say Xi have expecation µ = E(Xi) and variance σ2. Mean of An is µ, and variance is σ2

n .

Let A′

n = An−µ σ/√n.

E(A′

n)

slide-94
SLIDE 94

Central limit theorem.

Law of Large Numbers: For any set of independent identically distributed random variables, Xi, An = 1

n ∑Xi “tends to the

mean.” Say Xi have expecation µ = E(Xi) and variance σ2. Mean of An is µ, and variance is σ2

n .

Let A′

n = An−µ σ/√n.

E(A′

n) = 1 σ/√n(E(An)− µ)

slide-95
SLIDE 95

Central limit theorem.

Law of Large Numbers: For any set of independent identically distributed random variables, Xi, An = 1

n ∑Xi “tends to the

mean.” Say Xi have expecation µ = E(Xi) and variance σ2. Mean of An is µ, and variance is σ2

n .

Let A′

n = An−µ σ/√n.

E(A′

n) = 1 σ/√n(E(An)− µ) = 0.

slide-96
SLIDE 96

Central limit theorem.

Law of Large Numbers: For any set of independent identically distributed random variables, Xi, An = 1

n ∑Xi “tends to the

mean.” Say Xi have expecation µ = E(Xi) and variance σ2. Mean of An is µ, and variance is σ2

n .

Let A′

n = An−µ σ/√n.

E(A′

n) = 1 σ/√n(E(An)− µ) = 0.

Var(A′

n)

slide-97
SLIDE 97

Central limit theorem.

Law of Large Numbers: For any set of independent identically distributed random variables, Xi, An = 1

n ∑Xi “tends to the

mean.” Say Xi have expecation µ = E(Xi) and variance σ2. Mean of An is µ, and variance is σ2

n .

Let A′

n = An−µ σ/√n.

E(A′

n) = 1 σ/√n(E(An)− µ) = 0.

Var(A′

n) = 1 σ2/nVar(An)

slide-98
SLIDE 98

Central limit theorem.

Law of Large Numbers: For any set of independent identically distributed random variables, Xi, An = 1

n ∑Xi “tends to the

mean.” Say Xi have expecation µ = E(Xi) and variance σ2. Mean of An is µ, and variance is σ2

n .

Let A′

n = An−µ σ/√n.

E(A′

n) = 1 σ/√n(E(An)− µ) = 0.

Var(A′

n) = 1 σ2/nVar(An) = 1.

slide-99
SLIDE 99

Central limit theorem.

Law of Large Numbers: For any set of independent identically distributed random variables, Xi, An = 1

n ∑Xi “tends to the

mean.” Say Xi have expecation µ = E(Xi) and variance σ2. Mean of An is µ, and variance is σ2

n .

Let A′

n = An−µ σ/√n.

E(A′

n) = 1 σ/√n(E(An)− µ) = 0.

Var(A′

n) = 1 σ2/nVar(An) = 1.

Central limit theorem: As n goes to infinity the distribution of A′

n approaches the standard normal distribution.

Pr[A′

n ≤ α] →

1 √ 2π

α

∞ e−x2/2dx.
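
A minimal demonstration with coin flips (Bernoulli(1/2) is an arbitrary choice; any i.i.d. distribution with finite variance works):

```python
import random

# Normalize averages of n Bernoulli(1/2) flips; the upper tail beyond 2
# should approach 1 - Phi(2) ~ 0.023 as predicted by the CLT.
def normalized_average(n: int, p: float = 0.5) -> float:
    s = sum(random.random() < p for _ in range(n))
    sigma = (p * (1 - p)) ** 0.5
    return (s / n - p) / (sigma / n ** 0.5)

trials = [normalized_average(500) for _ in range(20_000)]
print(sum(t > 2 for t in trials) / len(trials))  # ~ 0.023
```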

SLIDE 113

Coins and Normal

Let X1, X2, ... be i.i.d. B(p). Thus, X1 + ··· + Xn = B(n,p). The CLT states that

(X1 + ··· + Xn − np) / √(p(1−p)n) → N(0,1).

Thus, Pr[ |X1 + ··· + Xn − np| / √(p(1−p)n) ≥ 2 ] ≈ 5%.

Since p(1−p) ≤ 1/4, we have √(p(1−p)n) ≤ (1/2)√n. Hence,

Pr[ |X1 + ··· + Xn − np| / ((1/2)√n) ≥ 2 ] ≤ 5%.

This implies that

Pr[ p ∈ [ (X1 + ··· + Xn)/n − 1/√n, (X1 + ··· + Xn)/n + 1/√n ] ] ≥ 95%.

Hence, [ (X1 + ··· + Xn)/n − 1/√n, (X1 + ··· + Xn)/n + 1/√n ] is a 95%-CI for p.
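
A simulation of this confidence interval (a sketch; p = 0.3 and n = 400 are arbitrary):

```python
import random

# Coverage of the CI [A_n - 1/sqrt(n), A_n + 1/sqrt(n)] for p: it should
# contain the true p at least ~95% of the time (conservative, since
# p(1-p) <= 1/4 for every p).
p, n, trials = 0.3, 400, 20_000
half = n ** -0.5
cover = 0
for _ in range(trials):
    a = sum(random.random() < p for _ in range(n)) / n
    cover += (a - half) <= p <= (a + half)
print(cover / trials)  # >= ~0.95
```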

SLIDE 119

CI for Mean

Let X1, X2, ... be i.i.d. with mean µ and variance σ². Let An = (X1 + ··· + Xn)/n. Recall that E[An] = µ and var[An] = σ²/n.

The CLT states that

(An − µ)/(σ/√n) → N(0,1) as n → ∞.

Thus, for n ≫ 1, one has

Pr[ −2 ≤ (An − µ)/(σ/√n) ≤ 2 ] ≈ 95%.

Equivalently, Pr[ µ ∈ [An − 2σ/√n, An + 2σ/√n] ] ≈ 95%.

That is, [An − 2σ/√n, An + 2σ/√n] is a 95%-CI for µ.

SLIDE 120

Summary

Gaussian and CLT

  • 1. Gaussian: N(µ,σ²): fX(x) = (1/√(2πσ²)) e^{−(x−µ)²/(2σ²)}, the "bell curve".
  • 2. CLT: Xn i.i.d. ⇒ (An − µ)/(σ/√n) → N(0,1).
  • 3. CI: [An − 2σ/√n, An + 2σ/√n] is a 95%-CI for µ.