SLIDE 1
Continuous Probability, RVs, Distributions EECS 126 Fall 2019
September 17, 2019
Agenda: Announcements, Review, Continuous Probability, Definitions, Cumulative Distribution Functions, Distributions, Uniform, Exponential, Gaussian, Analogs to
SLIDE 2
SLIDE 3
Announcements
◮ HW3 and Lab2 are due Friday (9/20).
◮ Feel free to come to Lab Party with HW questions on Thursday!
◮ HW4 will be optional to give you more time to study. We still
recommend reading and attempting the problems.
◮ Midterm 1 is coming up quick on 9/26! You can find past
exams on the Exams page of the website.
SLIDE 4
Probability Densities
In a continuous space, we describe distributions with probability density functions (PDFs) rather than assigned probability values. A valid probability density $f_X(x)$ of a continuous random variable $X$ taking values in $\mathbb{R}$ requires
◮ Non-negativity: $\forall x \in \mathbb{R},\ f_X(x) \ge 0$
◮ Normalization: $\int_{\mathbb{R}} f_X(x)\,dx = 1$
SLIDE 5
Continuous Probability Definitions
Getting probabilities from densities:
◮ $P(X \in B) = \int_B f_X(x)\,dx$
◮ $P(X \in [a, b]) = P(a \le X \le b) = \int_a^b f_X(x)\,dx$
(Note: P(X = a) = 0, so open and closed intervals do not matter here)
Figure: Geometric interpretation of the PDF
SLIDE 6
Questions
Suppose we uniformly sample a point in a ball of radius 1. What is the
◮ Probability of picking the origin?
◮ Probability density of picking the origin?
◮ Probability of picking a point on the surface?
◮ Probability of picking a point within a radius of $\frac{1}{2}$?
SLIDE 7
Answers
◮ Probability of picking the origin?
0.
◮ Probability density of picking the origin?
The volume of the ball is $\frac{4}{3}\pi r^3 = \frac{4}{3}\pi$, so the density is $\frac{3}{4\pi}$.
◮ Probability of picking a point on the surface?
0. A 2D surface has 0 volume in a 3D object.
◮ Probability of picking a point within a radius of $\frac{1}{2}$?
Since we're uniformly picking a point in the ball, we can just look at the ratio of the volumes:
$\frac{\frac{4\pi}{3}\left(\frac{1}{2}\right)^3}{\frac{4\pi}{3}} = \frac{1}{8}$.
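A quick Monte Carlo sanity check of the last answer, as a minimal sketch. The helper names (`sample_unit_ball`, `estimate_prob_within`), the sample count, and the seed are illustrative choices, and rejection sampling from the enclosing cube is just one convenient way to draw a uniform point in the ball.

```python
import random


def sample_unit_ball(rng):
    """Draw one point uniformly from the unit ball.

    Rejection sampling: draw from the cube [-1, 1]^3 and keep the
    first point that lands inside the ball.
    """
    while True:
        x, y, z = (rng.uniform(-1.0, 1.0) for _ in range(3))
        if x * x + y * y + z * z <= 1.0:
            return x, y, z


def estimate_prob_within(radius, n_samples=100_000, seed=0):
    """Estimate P(distance from origin <= radius) by simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        x, y, z = sample_unit_ball(rng)
        if x * x + y * y + z * z <= radius * radius:
            hits += 1
    return hits / n_samples


estimate = estimate_prob_within(0.5)
print(estimate)  # should land close to 1/8 = 0.125
```

The estimate only approximates 1/8; the volume-ratio argument above is what gives the exact value.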
SLIDE 8
Cumulative Distribution Functions (CDFs)
In both discrete and continuous distributions, the cumulative distribution function is defined as $F_X(x) := P(X \le x)$. However, it is computed slightly differently in each case. For a continuous random variable,
$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$
Consequently (by the Fundamental Theorem of Calculus),
$f_X(x) = \frac{d}{dx} F_X(x)$
SLIDE 9
More familiar definitions
Expectation:
◮ $E[X] := \int_{\mathbb{R}} x f_X(x)\,dx$
◮ $E[g(X)] := \int_{\mathbb{R}} g(x) f_X(x)\,dx$
◮ Linearity of expectation holds due to the linearity of integrals:
$E[X + Y] = E[X] + E[Y]$
Variance is defined the same way as in the discrete case:
$\mathrm{Var}(X) = E[(X - E[X])^2] = E[X^2] - E[X]^2$
SLIDE 10
Questions
Let R be the distance from the origin of a point sampled uniformly at random from a unit ball. What is the
◮ CDF of R?
◮ PDF of R?
◮ Expectation of R?
SLIDE 11
Answers
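Let R be the distance from the origin of a point sampled uniformly at random from a unit ball. What is the
◮ CDF of R?
$F_R(r) = \frac{3}{4\pi} \cdot \frac{4}{3}\pi r^3 = r^3$ for $0 \le r \le 1$.
◮ PDF of R?
$f_R(r) = \frac{d}{dr} r^3 = 3r^2$ for $0 \le r \le 1$.
◮ Expectation of R?
$E[R] = \int_0^1 r \cdot 3r^2\,dr = \frac{3}{4}$.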
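The answers above can be checked numerically. This is a small sketch that integrates the PDF $3r^2$ with a midpoint rule; `midpoint_integral` is a hypothetical helper written for this example, not a library function.

```python
def midpoint_integral(f, lo, hi, n=10_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * width) for i in range(n)) * width


def pdf_r(r):
    """PDF of R on [0, 1], obtained by differentiating the CDF r^3."""
    return 3.0 * r * r


# A valid density must integrate to 1 over its support.
total_mass = midpoint_integral(pdf_r, 0.0, 1.0)

# E[R] is the integral of r * f_R(r) over [0, 1].
mean_r = midpoint_integral(lambda r: r * pdf_r(r), 0.0, 1.0)

print(total_mass)  # approximately 1
print(mean_r)      # approximately 3/4
```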
SLIDE 12
Uniform Distribution
The density is uniform across a bounded interval $(a, b)$. For $X \sim \mathrm{Unif}(a, b)$,
$f_X(x) = \frac{1}{b - a}, \quad a < x < b$
$E[X] = \frac{a + b}{2}, \quad \mathrm{Var}(X) = \frac{(b - a)^2}{12}$
An easy distribution to work with. Many problems can be reduced to a uniform distribution!
SLIDE 13
Uniform Variance Proof
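$\mathrm{Var}(X) = E[X^2] - E[X]^2$
$E[X] = \int_a^b x \frac{1}{b - a}\,dx = \left.\frac{x^2}{2(b - a)}\right|_a^b = \frac{a + b}{2}$
$E[X^2] = \int_a^b x^2 \frac{1}{b - a}\,dx = \left.\frac{x^3}{3(b - a)}\right|_a^b = \frac{b^3 - a^3}{3(b - a)}$
$\mathrm{Var}(X) = \frac{b^3 - a^3}{3(b - a)} - \frac{(a + b)^2}{4} = \frac{(b - a)^2}{12}$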
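The last algebra step is easy to get wrong by hand, so here is a sketch that checks it with exact rational arithmetic for a few endpoint pairs. The function name `uniform_moments` and the chosen endpoints are illustrative only.

```python
from fractions import Fraction


def uniform_moments(a, b):
    """Return (E[X], E[X^2], Var(X)) for X ~ Unif(a, b), using the
    closed forms from the proof above with exact fractions."""
    a, b = Fraction(a), Fraction(b)
    mean = (a + b) / 2
    second_moment = (b**3 - a**3) / (3 * (b - a))
    variance = second_moment - mean**2
    return mean, second_moment, variance


# Compare E[X^2] - E[X]^2 against (b - a)^2 / 12 for a few intervals.
for a, b in [(0, 1), (2, 5), (-3, 7)]:
    _, _, var = uniform_moments(a, b)
    assert var == Fraction((b - a) ** 2, 12)
    print(a, b, var)
```

Exact fractions avoid any floating-point doubt about whether the two expressions really agree.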
SLIDE 14
Exponential Distribution
The exponential distribution PDF:
$f_X(x) = \lambda e^{-\lambda x}, \quad x > 0$
The exponential distribution CDF:
$F_X(x) = 1 - e^{-\lambda x}, \quad x > 0$
$E[X] = \frac{1}{\lambda}, \quad \mathrm{Var}(X) = \frac{1}{\lambda^2}$
Figure: Exponential distribution for varying λ
SLIDE 15
Memoryless Property
The defining characteristic of the exponential is the memoryless property. Recall the memoryless property is:
$P(X > x + a \mid X > x) = P(X > a)$
Think about banging your head on the wall. What distribution does this remind you of?
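The memoryless property can be verified directly from the exponential CDF, since $P(X > x) = 1 - F_X(x) = e^{-\lambda x}$. A minimal sketch follows; the function names and the particular values of `x`, `a`, and `lam` are arbitrary choices for illustration.

```python
import math


def expo_survival(x, lam):
    """P(X > x) = 1 - F_X(x) = exp(-lam * x) for X ~ Expo(lam), x >= 0."""
    return math.exp(-lam * x)


def conditional_survival(x, a, lam):
    """P(X > x + a | X > x) = P(X > x + a) / P(X > x)."""
    return expo_survival(x + a, lam) / expo_survival(x, lam)


lam, x, a = 1.5, 2.0, 0.7
lhs = conditional_survival(x, a, lam)  # having already waited x
rhs = expo_survival(a, lam)            # starting fresh
print(lhs, rhs)                        # the two agree up to rounding
```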
SLIDE 16
Connection to Geometric
One can think of the exponential distribution as the continuous analog to the geometric distribution. Remark: These are the only distributions in discrete and continuous spaces respectively with the memoryless property!
Figure: Relating the Exponential dist. to the Geometric dist.
SLIDE 17
Connection to Geometric cont.
Intuition: the geometric distribution approaches the exponential distribution as the number of trials per second approaches infinity. Let $X \sim \mathrm{Geo}(p)$, $Y \sim \mathrm{Expo}(\lambda)$. Recall the CDF of the geometric distribution:
$F_X(n) = 1 - (1 - p)^n$
If we let $\delta = \frac{-\ln(1 - p)}{\lambda}$, we have $e^{-\lambda\delta} = 1 - p$. Thus, $F_X(n) = 1 - e^{-\lambda n \delta} = F_Y(n\delta)$. If we interpret $\delta$ as the time between consecutive trials, then driving $\delta$ down corresponds to a geometric r.v. holding more and more trials per second (each with a smaller success probability $p$) while $\lambda$ stays the same. As $\delta \to 0$, we approach a continuous exponential distribution.
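The identity $F_X(n) = F_Y(n\delta)$ can be checked numerically. This is a sketch with hypothetical helper names and an arbitrary choice of $\lambda$; it only confirms that the two CDFs agree at the trial times $n\delta$.

```python
import math


def geometric_cdf(n, p):
    """P(X <= n) for X ~ Geo(p): first success within n trials."""
    return 1.0 - (1.0 - p) ** n


def exponential_cdf(x, lam):
    """P(Y <= x) for Y ~ Expo(lam)."""
    return 1.0 - math.exp(-lam * x) if x > 0 else 0.0


def step_size(p, lam):
    """delta = -ln(1 - p) / lam, so that exp(-lam * delta) = 1 - p."""
    return -math.log(1.0 - p) / lam


lam = 2.0
for p in (0.5, 0.1, 0.001):
    delta = step_size(p, lam)
    for n in (1, 5, 50):
        # The geometric CDF at trial n matches the exponential CDF at time n * delta.
        assert math.isclose(geometric_cdf(n, p), exponential_cdf(n * delta, lam), rel_tol=1e-9)
    print(p, delta)  # delta shrinks as p shrinks
```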
SLIDE 18
Normal / Gaussian Distribution
The Gaussian is seen abundantly in nature (e.g. exam scores). This can be explained by the Central Limit Theorem (CLT), which we will go over later in the course. Gaussian PDF and CDF for mean $\mu$ and variance $\sigma^2$:
$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x - \mu)^2 / (2\sigma^2)}$
$F_X(x) = \Phi\left(\frac{x - \mu}{\sigma}\right)$, where $\Phi$ is the standard Gaussian CDF (cannot be expressed in elementary functions)
SLIDE 19
Properties of the Gaussian
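◮ The sum of two independent Gaussians is Gaussian. If $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$ are independent, and $Z = X + Y$, then $Z \sim N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$.
◮ The sum of two dependent Gaussians isn't always Gaussian. Consider the following example: $X \sim N(0, 1)$ and
$Y = \begin{cases} X & \text{w.p. } \frac{1}{2} \\ -X & \text{w.p. } \frac{1}{2} \end{cases}$
They are both Gaussian, but $X + Y$ is not Gaussian: it equals exactly 0 with probability $\frac{1}{2}$.
◮ A Gaussian multiplied by a constant is Gaussian. If $X \sim N(\mu, \sigma^2)$ and $Y = aX$, then $Y \sim N(a\mu, a^2\sigma^2)$.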
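A short simulation of the dependent example, as a sketch. It assumes the sign flip is decided by a fair coin independent of X (the slide does not spell this out), and the sample size and seed are arbitrary. A Gaussian puts zero probability on any single point, so a large fraction of exact zeros shows that X + Y cannot be Gaussian.

```python
import random


def sample_sum(rng):
    """Draw X ~ N(0, 1), set Y = X or -X by a fair coin, and return X + Y."""
    x = rng.gauss(0.0, 1.0)
    y = x if rng.random() < 0.5 else -x  # assumption: coin is independent of X
    return x + y


rng = random.Random(0)
n = 100_000
sums = [sample_sum(rng) for _ in range(n)]

# Whenever Y = -X the sum is exactly 0.0 in floating point.
frac_zero = sum(1 for s in sums if s == 0.0) / n
print(frac_zero)  # roughly one half
```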
SLIDE 20
Scaling to the Standard Gaussian
◮ The properties on the previous slide allow us to convert any
Gaussian into the standard Gaussian.
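◮ If $X \sim N(\mu, \sigma^2)$, then $Z = \frac{X - \mu}{\sigma}$ is distributed as $Z \sim N(0, 1)$.
◮ Intuition: $Z$ measures how many standard deviations $X$ is from the mean, as in "I scored 1 SD above the mean on Midterm 1."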
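Standardizing is what lets one table (or one function) for $\Phi$ serve every Gaussian. The sketch below computes $P(X \le x)$ by converting to a z-score and using the identity $\Phi(z) = \frac{1}{2}\left(1 + \mathrm{erf}(z / \sqrt{2})\right)$. The exam mean of 70 and SD of 10 are made-up numbers for illustration.

```python
import math


def standard_normal_cdf(z):
    """Phi(z) for the standard Gaussian, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), computed by standardizing first."""
    z = (x - mu) / sigma  # number of standard deviations from the mean
    return standard_normal_cdf(z)


# Hypothetical exam: mean 70, standard deviation 10.
# A score of 80 is 1 SD above the mean.
p_below = normal_cdf(80.0, 70.0, 10.0)
print(p_below)  # fraction of scores at or below 80 under this model
```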
SLIDE 21
Joint PDFs
Just as multiple discrete RVs have a joint PMF, multiple continuous RVs have a joint PDF.
◮ Discrete
$p_{X,Y}(x, y)$
◮ Continuous
$f_{X,Y}(x, y)$
◮ Still needs to be non-negative.
◮ Still needs to integrate to 1.
SLIDE 22
Joint CDFs
◮ Single RV
$F_X(x) = P(X \le x)$
◮ Multiple RVs
$F_{X,Y}(x, y) = P(X \le x, Y \le y)$
◮ Single RV
$\frac{d}{dx} F_X(x) = f_X(x)$
◮ Multiple RVs
$\frac{\partial^2}{\partial x\,\partial y} F_{X,Y}(x, y) = f_{X,Y}(x, y)$
SLIDE 23
Marginal Probability Density
◮ Discrete
$p_X(x) = \sum_{y \in \mathcal{Y}} p_{X,Y}(x, y)$
◮ Continuous
$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy$
◮ $f_X(x)$ is still a density, not a probability.
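A numerical sketch of marginalization. The joint density $f_{X,Y}(x, y) = x + y$ on the unit square is a hypothetical example chosen for this illustration (it is not from the slides), and `midpoint_integral` is a helper written here. Integrating out $y$ should give $f_X(x) = x + \frac{1}{2}$.

```python
def joint_pdf(x, y):
    """Hypothetical joint density: x + y on the unit square, 0 elsewhere."""
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return x + y
    return 0.0


def midpoint_integral(f, lo, hi, n=400):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * width) for i in range(n)) * width


def marginal_x(x):
    """f_X(x): integrate y out. The density vanishes outside [0, 1],
    so integrating over [0, 1] equals integrating over the whole real line."""
    return midpoint_integral(lambda y: joint_pdf(x, y), 0.0, 1.0)


print(marginal_x(0.3))  # approximately 0.3 + 0.5

# The marginal is itself a density, so it should integrate to 1.
total_mass = midpoint_integral(marginal_x, 0.0, 1.0)
print(total_mass)
```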
SLIDE 24
Conditional Probability Density
◮ Discrete
$p_{X|Y}(x \mid y) = \frac{p_{X,Y}(x, y)}{p_Y(y)}$
◮ Continuous
$f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)}$
◮ By definition, the multiplication rule still holds: $f_{X,Y}(x, y) = f_{X|Y}(x \mid y)\, f_Y(y)$.
SLIDE 25
Independence
Similar to discrete, 3 equivalent definitions.
◮ For all x and y,
$f_{X,Y}(x, y) = f_X(x) f_Y(y)$
◮ For all x and y,
$f_{X|Y}(x \mid y) = f_X(x)$
◮ For all x and y,
$f_{Y|X}(y \mid x) = f_Y(y)$
SLIDE 26
Bayes Rule
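◮ Discrete (simple form)
$p_{X|Y}(x \mid y) = \frac{p_{Y|X}(y \mid x)\, p_X(x)}{p_Y(y)}$
◮ Discrete (extended form)
$p_{X|Y}(x \mid y) = \frac{p_{Y|X}(y \mid x)\, p_X(x)}{\sum_{x' \in \mathcal{X}} p_{Y|X}(y \mid x')\, p_X(x')}$
◮ Continuous (simple form)
$f_{X|Y}(x \mid y) = \frac{f_{Y|X}(y \mid x)\, f_X(x)}{f_Y(y)}$
◮ Continuous (extended form)
$f_{X|Y}(x \mid y) = \frac{f_{Y|X}(y \mid x)\, f_X(x)}{\int_{-\infty}^{\infty} f_{Y|X}(y \mid t)\, f_X(t)\,dt}$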
SLIDE 27
Conditional Expectation
◮ Discrete
$E[Y \mid X = x] = \sum_{y \in \mathcal{Y}} y \cdot p_{Y|X}(y \mid x)$
◮ Continuous
$E[Y \mid X = x] = \int_{-\infty}^{\infty} y \cdot f_{Y|X}(y \mid x)\,dy$
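A numerical sketch of the continuous formula. The joint density $f_{X,Y}(x, y) = x + y$ on the unit square is a hypothetical example (not from the slides), and the helper names are illustrative. For this density, working the integrals by hand gives $E[Y \mid X = x] = \frac{x/2 + 1/3}{x + 1/2}$.

```python
def joint_pdf(x, y):
    """Hypothetical joint density: x + y on the unit square, 0 elsewhere."""
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return x + y
    return 0.0


def midpoint_integral(f, lo, hi, n=2_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * width) for i in range(n)) * width


def conditional_expectation_y_given_x(x):
    """E[Y | X = x] = integral of y * f_{Y|X}(y | x) dy,
    with f_{Y|X}(y | x) = f_{X,Y}(x, y) / f_X(x)."""
    f_x = midpoint_integral(lambda y: joint_pdf(x, y), 0.0, 1.0)  # marginal f_X(x)
    numerator = midpoint_integral(lambda y: y * joint_pdf(x, y), 0.0, 1.0)
    return numerator / f_x


print(conditional_expectation_y_given_x(0.5))  # approximately 7/12
```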
SLIDE 28
Combining Discrete and Continuous RVs
◮ You can also have discrete and continuous RVs defined jointly.
◮ Ex. let $X$ be the outcome of a die roll and $Y \sim \mathrm{Expo}(X)$.
$p_X(x) = \frac{1}{6}, \quad x \in \{1, \dots, 6\}$
$f_{Y|X}(y \mid x) = x e^{-xy}, \quad y > 0$
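One thing this setup allows is inferring the die roll from an observed value of Y. The sketch below assumes a mixed form of Bayes' rule, $p_{X|Y}(x \mid y) = \frac{f_{Y|X}(y \mid x)\, p_X(x)}{\sum_{x'} f_{Y|X}(y \mid x')\, p_X(x')}$, which combines the discrete and continuous forms from the Bayes Rule slide but is not written out in the slides themselves. All function names are illustrative.

```python
import math

FACES = range(1, 7)  # outcomes of a fair six-sided die


def p_x(x):
    """PMF of the die roll."""
    return 1.0 / 6.0


def f_y_given_x(y, x):
    """Conditional density of Y given X = x: exponential with rate x."""
    return x * math.exp(-x * y) if y > 0 else 0.0


def f_y(y):
    """Marginal density of Y: sum the joint over the discrete variable."""
    return sum(f_y_given_x(y, x) * p_x(x) for x in FACES)


def posterior_x_given_y(y):
    """P(X = x | Y = y) for each face, via the assumed mixed Bayes' rule."""
    evidence = f_y(y)
    return {x: f_y_given_x(y, x) * p_x(x) / evidence for x in FACES}


# A large observed y favors small rates; a tiny y favors large rates.
post_large = posterior_x_given_y(2.0)
post_small = posterior_x_given_y(0.01)
print(max(post_large, key=post_large.get))
print(max(post_small, key=post_small.get))
```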
SLIDE 29
Change of Variables / Derived Distributions
◮ Let $X \sim U[0, 1]$ and $Y = 2X$. Is it true that
$f_Y(y) = P(Y = y) = P(2X = y) = P(X = \frac{y}{2}) = f_X(\frac{y}{2})$?
◮ No, this won't integrate to 1.
◮ You have to use the CDF.
$F_Y(y) = P(Y \le y) = P(2X \le y) = P(X \le \frac{y}{2}) = F_X(\frac{y}{2})$
◮ $f_Y(y) = \frac{d}{dy} F_X(\frac{y}{2}) = f_X(\frac{y}{2}) \cdot \frac{1}{2}$
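The CDF route can be sketched in code: build $F_Y$ from $F_X$, then differentiate numerically. The central-difference helper `numerical_pdf` and its step size are illustrative choices. The result is the factor of $\frac{1}{2}$ that the naive substitution misses.

```python
def cdf_x(x):
    """CDF of X ~ U[0, 1]: 0 below 0, x on [0, 1], 1 above 1."""
    return min(max(x, 0.0), 1.0)


def cdf_y(y):
    """CDF of Y = 2X, from F_Y(y) = P(X <= y / 2) = F_X(y / 2)."""
    return cdf_x(y / 2.0)


def numerical_pdf(cdf, t, h=1e-6):
    """Central-difference approximation of the derivative of cdf at t."""
    return (cdf(t + h) - cdf(t - h)) / (2.0 * h)


print(numerical_pdf(cdf_x, 0.35))  # density of X inside [0, 1]: about 1
print(numerical_pdf(cdf_y, 0.7))   # density of Y inside [0, 2]: about 1/2
```

A density of 1/2 over an interval of length 2 integrates to 1, whereas the naive answer of 1 over the same interval would integrate to 2.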
SLIDE 30