SLIDE 1
CS70: Jean Walrand: Lecture 36. Gaussian and CLT
Warning: This lecture is also rated R.
SLIDE 4
CS70: Jean Walrand: Lecture 36.
Gaussian and CLT. Warning: This lecture is also rated R.
- 1. Review of continuous probability
- 2. Motivation for Gaussian
- 3. Gaussian
- 4. CLT
SLIDE 10
Review of Continuous Probability
Ω is a continuous space, so the probability of any single outcome is 0; we work with events instead. Example: James Bond lands at a position uniformly distributed on [0,1000]. The probability that he lands in an interval [a,b] ⊆ [0,1000] is (b − a)/1000.
SLIDE 23
Random Variables
A continuous random variable X is specified by:
- 1. FX(x) = Pr[X ≤ x] for all x: the Cumulative Distribution Function (cdf).
Pr[a < X ≤ b] = FX(b) − FX(a)
1.1 0 ≤ FX(x) ≤ 1 for all x ∈ ℜ.
1.2 FX(x) ≤ FX(y) if x ≤ y.
- 2. Or fX(x), where FX(x) = ∫_{−∞}^{x} fX(y) dy, i.e., fX(x) = dFX(x)/dx: the Probability Density Function (pdf).
Pr[a < X ≤ b] = ∫_{a}^{b} fX(x) dx = FX(b) − FX(a)
2.1 fX(x) ≥ 0 for all x ∈ ℜ.
2.2 ∫_{−∞}^{∞} fX(x) dx = 1.
Recall that Pr[X ∈ (x, x+δ)] ≈ fX(x)δ. Think of X as taking the discrete values nδ for n = ..., −2, −1, 0, 1, 2, ... with Pr[X = nδ] = fX(nδ)δ.
SLIDE 28
A Picture
The pdf fX(x) is a nonnegative function that integrates to 1. The cdf FX(x) is the integral of fX.
Pr[x < X < x+δ] ≈ fX(x)δ
Pr[X ≤ x] = FX(x) = ∫_{−∞}^{x} fX(y) dy
SLIDE 29
Example: U[a,b]
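The slide itself only carried a figure for the uniform distribution on [a,b]; as a minimal sketch (the function names `uniform_pdf` and `uniform_cdf` are my own, not from the lecture), its pdf and cdf can be written directly:

```python
# Uniform[a, b]: the pdf is constant 1/(b - a) on [a, b], and the cdf
# rises linearly from 0 at a to 1 at b.
def uniform_pdf(x, a, b):
    return 1.0 / (b - a) if a <= x <= b else 0.0

def uniform_cdf(x, a, b):
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

# James Bond example: position uniform on [0, 1000].
# Pr[100 < X <= 300] = (300 - 100) / 1000 = 0.2.
p = uniform_cdf(300, 0, 1000) - uniform_cdf(100, 0, 1000)
```

This matches the earlier James Bond computation: the probability of an interval [a,b] ⊆ [0,1000] is its length divided by 1000.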
SLIDE 32
Expo(λ)
The exponential distribution with parameter λ > 0 is defined by
fX(x) = λe^{−λx} 1{x ≥ 0};  FX(x) = 0 if x < 0, and FX(x) = 1 − e^{−λx} if x ≥ 0.
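These formulas can be exercised with inverse-transform sampling; this sketch is my own, not part of the slides (λ = 2 is an arbitrary choice). It draws Expo(λ) samples and compares the empirical cdf at x = 1 with 1 − e^{−λ}:

```python
import math
import random

lam = 2.0
random.seed(0)

# Inverse-transform sampling: if U ~ Uniform(0,1), then -ln(1 - U)/lam ~ Expo(lam),
# since Pr[-ln(1 - U)/lam <= x] = Pr[1 - U >= e^{-lam*x}] = 1 - e^{-lam*x}.
samples = [-math.log(1.0 - random.random()) / lam for _ in range(100_000)]

empirical = sum(x <= 1.0 for x in samples) / len(samples)  # estimate of F_X(1)
exact = 1 - math.exp(-lam * 1.0)
```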
SLIDE 33
Shooting in a circle
SLIDE 41
Expectation
Definition: The expectation of a random variable X with pdf fX(x) is defined as
E[X] = ∫_{−∞}^{∞} x fX(x) dx.
Justification: Say X = nδ w.p. fX(nδ)δ. Then,
E[X] = ∑_n (nδ) Pr[X = nδ] = ∑_n (nδ) fX(nδ)δ = ∫_{−∞}^{∞} x fX(x) dx.
Indeed, ∫ g(x) dx ≈ ∑_n g(nδ)δ with g(x) = x fX(x).
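The Riemann-sum justification above can be checked numerically. This sketch is mine, not from the lecture: it uses X ~ Expo(1), whose mean is exactly 1, with an arbitrary step δ = 10⁻⁴ and a cutoff at x = 30 where the tail is negligible:

```python
import math

# Treat X as taking the discrete values n*delta with probability f_X(n*delta)*delta,
# and compute E[X] as sum_n (n*delta) * f_X(n*delta) * delta.
delta = 1e-4
f = lambda x: math.exp(-x)  # pdf of Expo(1) for x >= 0
approx_mean = sum(n * delta * f(n * delta) * delta for n in range(int(30 / delta)))
```

As δ → 0 the sum converges to ∫ x fX(x) dx = 1.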
SLIDE 48
Expectation of function of RV
Definition: The expectation of a function of a random variable is defined as
E[h(X)] = ∫_{−∞}^{∞} h(x) fX(x) dx.
Justification: Say X = nδ w.p. fX(nδ)δ. Then,
E[h(X)] = ∑_n h(nδ) Pr[X = nδ] = ∑_n h(nδ) fX(nδ)δ = ∫_{−∞}^{∞} h(x) fX(x) dx.
Indeed, ∫ g(x) dx ≈ ∑_n g(nδ)δ with g(x) = h(x) fX(x).
Fact: Expectation is linear.
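The same discretization works for E[h(X)]. A sketch (mine, not from the slides) with h(x) = x² and X ~ Expo(1), for which E[X²] = 2 exactly; the step and cutoff are again arbitrary choices:

```python
import math

# E[h(X)] ≈ sum_n h(n*delta) * f_X(n*delta) * delta, here for h(x) = x^2.
delta = 1e-4
f = lambda x: math.exp(-x)  # pdf of Expo(1) for x >= 0
h = lambda x: x * x
approx = sum(h(n * delta) * f(n * delta) * delta for n in range(int(30 / delta)))
```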
SLIDE 52
Variance
Definition: The variance of a continuous random variable X is defined as
var[X] = E((X − E(X))²) = E(X²) − (E(X))²
= ∫_{−∞}^{∞} x² f(x) dx − (∫_{−∞}^{∞} x f(x) dx)².
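Both integrals in the variance formula can be evaluated numerically. A sketch (my own, not from the slides) for X ~ Uniform[0,1], where E[X] = 1/2, E[X²] = 1/3, and so var[X] = 1/12:

```python
# f(x) = 1 on [0, 1], so the integrals reduce to sums over a fine grid.
delta = 1e-5
xs = [n * delta for n in range(int(1 / delta))]
mean = sum(x * delta for x in xs)        # ∫ x f(x) dx
second = sum(x * x * delta for x in xs)  # ∫ x^2 f(x) dx
variance = second - mean ** 2
```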
SLIDE 60
Motivation for Gaussian Distribution
Key fact: The sum of many small independent RVs has approximately a Gaussian distribution. This is the Central Limit Theorem. (See later.) Examples: the Binomial and the Poisson, suitably scaled. This explains why the Gaussian distribution (the bell curve) shows up everywhere.
SLIDE 67
Normal Distribution.
For any µ and σ, a normal (aka Gaussian) random variable Y, which we write as Y = N(µ, σ²), has pdf
fY(y) = (1/√(2πσ²)) e^{−(y−µ)²/(2σ²)}.
The standard normal has µ = 0 and σ = 1.
Note: Pr[|Y − µ| > 1.65σ] ≈ 10%; Pr[|Y − µ| > 2σ] ≈ 5%.
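These two tail probabilities can be verified via the complementary error function: for a normal, Pr[|Y − µ| > cσ] = erfc(c/√2). A quick check (my own, standard library only):

```python
import math

# Two-sided tail of a normal: Pr[|Y - mu| > c*sigma] = erfc(c / sqrt(2)).
def two_sided_tail(c):
    return math.erfc(c / math.sqrt(2))

p_165 = two_sided_tail(1.65)  # close to 10%
p_200 = two_sided_tail(2.0)   # close to 5% (more precisely, about 4.55%)
```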
SLIDE 77
Scaling and Shifting
Theorem: Let X = N(0,1) and Y = µ + σX. Then Y = N(µ, σ²).
Proof: fX(x) = (1/√(2π)) exp{−x²/2}. Now,
fY(y) dy = Pr[Y ∈ [y, y+dy]] = Pr[µ + σX ∈ [y, y+dy]]
= Pr[σX ∈ [y − µ, y − µ + dy]]
= Pr[X ∈ [(y − µ)/σ, (y − µ)/σ + dy/σ]]
= fX((y − µ)/σ) (dy/σ)
= (1/σ) fX((y − µ)/σ) dy
= (1/√(2πσ²)) exp{−(y − µ)²/(2σ²)} dy.
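The theorem can also be sanity-checked by simulation; this sketch is mine (µ = 3 and σ = 2 are arbitrary). It transforms standard normal draws and checks the sample mean and standard deviation:

```python
import random
import statistics

random.seed(1)
mu, sigma = 3.0, 2.0

# If X ~ N(0,1), then Y = mu + sigma*X should be N(mu, sigma^2).
xs = [random.gauss(0.0, 1.0) for _ in range(200_000)]
ys = [mu + sigma * x for x in xs]

m = statistics.fmean(ys)   # should be close to mu
s = statistics.pstdev(ys)  # should be close to sigma
```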
SLIDE 88
Expectation, Variance.
Theorem: If Y = N(µ, σ²), then E[Y] = µ and var[Y] = σ².
Proof: It suffices to show the result for X = N(0,1), since Y = µ + σX. Thus, fX(x) = (1/√(2π)) exp{−x²/2}.
First note that E[X] = 0, by symmetry. Then
var[X] = E[X²] = ∫ x² (1/√(2π)) exp{−x²/2} dx
= −(1/√(2π)) ∫ x d exp{−x²/2}
= (1/√(2π)) ∫ exp{−x²/2} dx  (by integration by parts)
= ∫ fX(x) dx = 1.
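The conclusion var[X] = 1 can be checked by numerically integrating x² fX(x); the grid ([−10, 10] with step 10⁻³) is my own choice, not from the slides:

```python
import math

# Second moment of the standard normal: ∫ x^2 f_X(x) dx should equal 1.
f = lambda x: math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
delta = 1e-3
second_moment = sum((n * delta) ** 2 * f(n * delta) * delta
                    for n in range(-10_000, 10_000))
```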
SLIDE 99
Central limit theorem.
Law of Large Numbers: For any set of independent identically distributed random variables Xi, An = (1/n) ∑ Xi "tends to the mean."
Say the Xi have expectation µ = E(Xi) and variance σ². The mean of An is µ, and its variance is σ²/n.
Let A′n = (An − µ)/(σ/√n). Then
E(A′n) = (1/(σ/√n))(E(An) − µ) = 0.
Var(A′n) = (1/(σ²/n)) Var(An) = 1.
Central Limit Theorem: As n goes to infinity, the distribution of A′n approaches the standard normal distribution:
Pr[A′n ≤ α] → (1/√(2π)) ∫_{−∞}^{α} e^{−x²/2} dx.
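A simulation illustrates the theorem; this sketch is mine (Uniform[0,1] summands, n = 100, and the test point α = 1 are arbitrary choices). It compares the empirical Pr[A′n ≤ 1] with Φ(1) ≈ 0.8413:

```python
import math
import random

random.seed(2)
n, trials = 100, 20_000
mu, sigma = 0.5, math.sqrt(1 / 12)  # mean and std of Uniform[0,1]

# Standardize the sample mean of n Uniform[0,1] draws.
def standardized_mean():
    a_n = sum(random.random() for _ in range(n)) / n
    return (a_n - mu) / (sigma / math.sqrt(n))

empirical = sum(standardized_mean() <= 1.0 for _ in range(trials)) / trials
phi_1 = 0.5 * (1 + math.erf(1 / math.sqrt(2)))  # standard normal cdf at 1
```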
SLIDE 110
Coins and Normal.
Let X1, X2, ... be i.i.d. B(p). Thus, X1 + ··· + Xn = B(n,p). The CLT states that
(X1 + ··· + Xn − np)/√(p(1−p)n) → N(0,1).
Thus, Pr[|(X1 + ··· + Xn − np)/√(p(1−p)n)| ≥ 2] ≈ 5%.
Since p(1−p) ≤ 1/4, we have √(p(1−p)n) ≤ (1/2)√n, so Pr[|(X1 + ··· + Xn − np)/((1/2)√n)| ≥ 2] ≤ 5% (approximately).
This implies that Pr[p ∈ [(X1 + ··· + Xn)/n − 1/√n, (X1 + ··· + Xn)/n + 1/√n]] ≥ 95%.
SLIDE 113
Coins and Normal.
Let X1, X2, ... be i.i.d. B(p). We just saw that Pr[p ∈ [(X1 + ··· + Xn)/n − 1/√n, (X1 + ··· + Xn)/n + 1/√n]] ≥ 95%. Hence, [(X1 + ··· + Xn)/n − 1/√n, (X1 + ··· + Xn)/n + 1/√n] is a 95% confidence interval (CI) for p.
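The coverage claim can be tested by simulation; in this sketch (mine; p = 0.3 and n = 400 are arbitrary choices) the interval [p̂ − 1/√n, p̂ + 1/√n] should contain the true p at least about 95% of the time:

```python
import math
import random

random.seed(3)
p, n, trials = 0.3, 400, 5_000

def covers():
    heads = sum(random.random() < p for _ in range(n))
    phat = heads / n
    half = 1 / math.sqrt(n)  # half-width 1/sqrt(n) = 0.05 for n = 400
    return phat - half <= p <= phat + half

coverage = sum(covers() for _ in range(trials)) / trials
```

Because 1/√n is conservative (it uses p(1−p) ≤ 1/4), the observed coverage is typically well above 95%.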
SLIDE 119
CI for Mean
Let X1, X2, ... be i.i.d. with mean µ and variance σ². Let An = (X1 + ··· + Xn)/n. Recall that E[An] = µ and var[An] = σ²/n. The CLT states that
(An − µ)/(σ/√n) → N(0,1) as n → ∞.
Thus, for n ≫ 1, one has Pr[−2 ≤ (An − µ)/(σ/√n) ≤ 2] ≈ 95%.
Equivalently, Pr[µ ∈ [An − 2σ/√n, An + 2σ/√n]] ≈ 95%.
That is, [An − 2σ/√n, An + 2σ/√n] is a 95% CI for µ.
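The same check works for the mean of any i.i.d. sequence with known σ; this sketch (mine, not from the slides) uses Expo(1) samples, for which µ = σ = 1, and estimates how often [An − 2σ/√n, An + 2σ/√n] covers µ:

```python
import math
import random

random.seed(4)
mu, sigma, n = 1.0, 1.0, 500  # Expo(1): mean 1, variance 1

def ci_covers():
    # Expo(1) draws via inverse-transform sampling.
    a_n = sum(-math.log(1.0 - random.random()) for _ in range(n)) / n
    half = 2 * sigma / math.sqrt(n)
    return a_n - half <= mu <= a_n + half

coverage = sum(ci_covers() for _ in range(4_000)) / 4_000
```

The "2σ/√n" interval corresponds to the ≈95% two-sided normal probability, so the observed coverage should hover near 0.95.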
SLIDE 120
Summary
Gaussian and CLT
- 1. Gaussian: N(µ, σ²): fX(x) = (1/√(2πσ²)) e^{−(x−µ)²/(2σ²)}, the "bell curve."
- 2. CLT: Xn i.i.d. ⇒ (An − µ)/(σ/√n) → N(0,1).
- 3. CI: [An − 2σ/√n, An + 2σ/√n] is a 95% CI for µ.