Continuous Probability, RVs, Distributions EECS 126 Fall 2019 - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Continuous Probability, RVs, Distributions

EECS 126 Fall 2019 September 17, 2019

slide-2
SLIDE 2

Agenda

Announcements Review Continuous Probability Definitions Cumulative Distribution Functions Distributions Uniform Exponential Gaussian Analogs to Discrete Probability / RVs Derived Distributions

slide-3
SLIDE 3

Announcements

◮ HW3 and Lab2 are due Friday (9/20).
◮ Feel free to come to Lab Party with HW questions on Thursday!
◮ HW4 will be optional to give you more time to study. We still recommend reading and attempting the problems.
◮ Midterm 1 is coming up quick on 9/26! You can find past exams on the Exams page of the website.

slide-4
SLIDE 4

Probability Densities

In a continuous space, we describe distributions with probability density functions (PDFs) rather than assigned probability values. A valid probability density $f_X(x)$ of a continuous random variable $X$ in $\mathbb{R}$ requires

◮ Non-negativity: $\forall x \in \mathbb{R},\ f_X(x) \ge 0$
◮ Normalization: $\int_{\mathbb{R}} f_X(x)\,dx = 1$

slide-5
SLIDE 5

Continuous Probability Definitions

Getting probabilities from densities:

◮ $P(X \in B) = \int_B f_X(x)\,dx$
◮ $P(X \in [a, b]) = P(a \le X \le b) = \int_a^b f_X(x)\,dx$

(Note: $P(X = a) = 0$, so it does not matter here whether the interval is open or closed.)

Figure: Geometric interpretation of the PDF

slide-6
SLIDE 6

Questions

Suppose we uniformly sample a point in a ball of radius 1. What is the

◮ Probability of picking the origin?
◮ Probability density of picking the origin?
◮ Probability of picking a point on the surface?
◮ Probability of picking a point within a radius of 1/2?

slide-7
SLIDE 7

Answers

◮ Probability of picking the origin?
0.

◮ Probability density of picking the origin?
The volume of the ball is $\frac{4}{3}\pi r^3 = \frac{4}{3}\pi$, so the density is $\frac{3}{4\pi}$.

◮ Probability of picking a point on the surface?
0. A 2D surface has 0 volume in a 3D object.

◮ Probability of picking a point within a radius of 1/2?
Since we're uniformly picking a point in the ball, we can just look at the ratio of the volumes:
$\dfrac{\frac{4\pi}{3}\left(\frac{1}{2}\right)^3}{\frac{4\pi}{3}} = \dfrac{1}{8}$.
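The volume-ratio answer can be sanity-checked by simulation. The sketch below is not from the slides: it assumes rejection sampling from the enclosing cube as one way to draw uniform points in the unit ball, and the names `sample_unit_ball` and `fraction_within` are made up for illustration.

```python
import random


def sample_unit_ball(rng):
    """Draw one point uniformly from the 3D unit ball by rejection sampling.

    Assumption: we sample from the cube [-1, 1]^3 and keep only points
    that land inside the ball, which leaves a uniform sample in the ball.
    """
    while True:
        x, y, z = (rng.uniform(-1.0, 1.0) for _ in range(3))
        if x * x + y * y + z * z <= 1.0:
            return x, y, z


def fraction_within(radius, n_samples=100_000, seed=0):
    """Estimate P(distance from origin <= radius) for a uniform point."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        x, y, z = sample_unit_ball(rng)
        if x * x + y * y + z * z <= radius * radius:
            hits += 1
    return hits / n_samples


# The estimate should land close to the exact answer 1/8 = 0.125.
print(fraction_within(0.5))
```

The estimate fluctuates from seed to seed, so it only approximates 1/8; the exact value comes from the ratio of volumes.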

slide-8
SLIDE 8

Cumulative Distribution Functions (CDFs)

In both discrete and continuous distributions, the cumulative distribution function is defined as $F_X(x) := P(X \le x)$. However, it is computed slightly differently: a discrete CDF sums the PMF, while a continuous CDF integrates the PDF.

$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$

Consequently (by the Fundamental Theorem of Calculus),

$f_X(x) = \dfrac{d}{dx} F_X(x)$

slide-9
SLIDE 9

More familiar definitions

Expectation:

◮ $E[X] := \int_{\mathbb{R}} x f_X(x)\,dx$
◮ $E[g(X)] := \int_{\mathbb{R}} g(x) f_X(x)\,dx$
◮ Linearity of expectation holds due to the linearity of integrals: $E[X + Y] = E[X] + E[Y]$

The definition of variance stays the same:

$\mathrm{Var}(X) = E[(X - E[X])^2] = E[X^2] - E[X]^2$

slide-10
SLIDE 10

Questions

Let R be the distance from the origin of a point uniformly sampled in a unit ball. What is the

◮ CDF of R?
◮ PDF of R?
◮ Expectation of R?

slide-11
SLIDE 11

Answers

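Let R be the distance from the origin of a point uniformly sampled in a unit ball. What is the

◮ CDF of R?
$F_R(r) = \frac{3}{4\pi} \cdot \frac{4}{3}\pi r^3 = r^3$ for $0 \le r \le 1$.

◮ PDF of R?
$f_R(r) = \frac{d}{dr} r^3 = 3r^2$ for $0 \le r \le 1$.

◮ Expectation of R?
$E[R] = \int_0^1 r \cdot 3r^2\,dr = \frac{3}{4}$.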
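As a numerical cross-check of the answers above, the following sketch integrates the PDF $3r^2$ on $[0, 1]$. The midpoint rule and the helper name `midpoint_integral` are choices made here for illustration, not something the slides prescribe.

```python
def midpoint_integral(f, lo, hi, n=10_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


def pdf_r(r):
    """PDF of the distance R from the origin, valid for 0 <= r <= 1."""
    return 3.0 * r * r


# A valid density integrates to 1 over its support.
total_mass = midpoint_integral(pdf_r, 0.0, 1.0)

# E[R] is the integral of r * f_R(r); the exact value is 3/4.
mean_r = midpoint_integral(lambda r: r * pdf_r(r), 0.0, 1.0)

print(total_mass, mean_r)
```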

slide-12
SLIDE 12

Uniform Distribution

The density is uniform across a bounded interval $(a, b)$. For $X \sim \mathrm{Unif}(a, b)$,

$f_X(x) = \dfrac{1}{b - a}, \quad a < x < b$

$E[X] = \dfrac{a + b}{2}, \quad \mathrm{Var}(X) = \dfrac{(b - a)^2}{12}$

An easy distribution to work with. Many problems can be reduced to a uniform distribution!

slide-13
SLIDE 13

Uniform Variance Proof

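$\mathrm{Var}(X) = E[X^2] - E[X]^2$

$E[X] = \int_a^b x \,\dfrac{1}{b - a}\,dx = \left.\dfrac{x^2}{2(b - a)}\right|_a^b = \dfrac{a + b}{2}$

$E[X^2] = \int_a^b x^2 \,\dfrac{1}{b - a}\,dx = \left.\dfrac{x^3}{3(b - a)}\right|_a^b = \dfrac{b^3 - a^3}{3(b - a)} = \dfrac{a^2 + ab + b^2}{3}$

$\mathrm{Var}(X) = \dfrac{b^3 - a^3}{3(b - a)} - \dfrac{(a + b)^2}{4} = \dfrac{4(a^2 + ab + b^2) - 3(a + b)^2}{12} = \dfrac{(b - a)^2}{12}$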
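The algebra in this proof can be checked exactly for specific endpoints. The sketch below evaluates the two antiderivative expressions from the proof using rational arithmetic; the function names and the endpoints 2 and 5 are chosen here only as an example.

```python
from fractions import Fraction


def uniform_moments(a, b):
    """Return (E[X], E[X^2]) for X ~ Unif(a, b), using the formulas
    obtained from the antiderivatives x^2 / (2(b-a)) and x^3 / (3(b-a))."""
    a, b = Fraction(a), Fraction(b)
    mean = (b**2 - a**2) / (2 * (b - a))
    second_moment = (b**3 - a**3) / (3 * (b - a))
    return mean, second_moment


def uniform_variance(a, b):
    """Var(X) = E[X^2] - E[X]^2, computed exactly."""
    mean, second_moment = uniform_moments(a, b)
    return second_moment - mean**2


# For a = 2, b = 5 this agrees with (b - a)^2 / 12 = 9/12 = 3/4.
print(uniform_variance(2, 5))
```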

slide-14
SLIDE 14

Exponential Distribution

The exponential distribution PDF:

$f_X(x) = \lambda e^{-\lambda x}, \quad x > 0$

The exponential distribution CDF:

$F_X(x) = 1 - e^{-\lambda x}, \quad x > 0$

$E[X] = \dfrac{1}{\lambda}, \quad \mathrm{Var}(X) = \dfrac{1}{\lambda^2}$

Figure: Exponential distribution for varying λ

slide-15
SLIDE 15

Memoryless Property

The defining characteristic of the exponential is the memoryless property. Recall the memoryless property is:

$P(X > x + a \mid X > x) = P(X > a)$

Think about banging your head on the wall. What distribution does this remind you of?
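The memoryless property can be verified directly from the exponential tail probability $P(X > x) = e^{-\lambda x}$. A minimal sketch follows; the function names and the numbers plugged in are illustrative only.

```python
import math


def survival(x, lam):
    """P(X > x) for X ~ Expo(lam); equals 1 for x <= 0."""
    return math.exp(-lam * x) if x > 0 else 1.0


def conditional_survival(x, a, lam):
    """P(X > x + a | X > x), computed as P(X > x + a) / P(X > x)."""
    return survival(x + a, lam) / survival(x, lam)


lam, x, a = 0.7, 2.0, 1.5
# Both values agree up to floating-point rounding: having already
# waited x does not change the distribution of the remaining wait.
print(conditional_survival(x, a, lam), survival(a, lam))
```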

slide-16
SLIDE 16

Connection to Geometric

One can think of the exponential distribution as the continuous analog to the geometric distribution. Remark: These are the only distributions in discrete and continuous spaces respectively with the memoryless property!

Figure: Relating the Exponential dist. to the Geometric dist.

slide-17
SLIDE 17

Connection to Geometric cont.

Intuition: the geometric distribution approaches the exponential distribution as the number of trials per second approaches infinity.

Let $X \sim \mathrm{Geo}(p)$, $Y \sim \mathrm{Expo}(\lambda)$. Recall the CDF of the geometric distribution:

$F_X(n) = 1 - (1 - p)^n$

If we let $\delta = \dfrac{-\ln(1 - p)}{\lambda}$, we have $e^{-\lambda\delta} = 1 - p$. Thus, $F_X(n) = F_Y(n\delta)$.

Treat δ as the time between trials. If we drive δ down (so that p shrinks with it), we can interpret this as a geometric r.v. holding more and more trials per second while the expected time until the first success stays close to $1/\lambda$. As δ → 0, we approach a continuous exponential distribution.
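The identity $F_X(n) = F_Y(n\delta)$ can be checked numerically for a few values of p. This is a sketch under the slide's definitions; the helper names and the values of λ, n, and p are picked here for illustration.

```python
import math


def geometric_cdf(n, p):
    """P(X <= n) for X ~ Geo(p), counting trials up to the first success."""
    return 1.0 - (1.0 - p) ** n


def exponential_cdf(t, lam):
    """P(Y <= t) for Y ~ Expo(lam)."""
    return 1.0 - math.exp(-lam * t) if t > 0 else 0.0


def step_size(p, lam):
    """delta = -ln(1 - p) / lam, chosen so that exp(-lam * delta) = 1 - p."""
    return -math.log(1.0 - p) / lam


lam, n = 2.0, 3
for p in (0.5, 0.1, 0.01):
    delta = step_size(p, lam)
    # The two CDF values match (up to rounding) for every p; delta
    # shrinks as p shrinks, i.e. trials become more frequent.
    print(p, delta, geometric_cdf(n, p), exponential_cdf(n * delta, lam))
```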

slide-18
SLIDE 18

Normal / Gaussian Distribution

The Gaussian is seen abundantly in nature (e.g. exam scores). This can be explained by the Central Limit Theorem (CLT), which we will go over later in the course.

Gaussian PDF and CDF for mean $\mu$ and variance $\sigma^2$:

$f_X(x) = \dfrac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x - \mu)^2 / (2\sigma^2)}$

$F_X(x) = \Phi\!\left(\dfrac{x - \mu}{\sigma}\right)$, where $\Phi$ is the CDF of the standard Gaussian (it cannot be expressed in elementary functions)
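Although the Gaussian CDF has no elementary closed form, it can be evaluated numerically. The sketch below assumes $\Phi$ denotes the standard Gaussian CDF and uses the identity $\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$ with Python's `math.erf`; the function names are made up for this example.

```python
import math


def standard_normal_cdf(z):
    """Phi(z), the CDF of N(0, 1), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def gaussian_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2), evaluated at the standardized value."""
    return standard_normal_cdf((x - mu) / sigma)


# Half of the mass lies below the mean, whatever mu and sigma are.
print(gaussian_cdf(3.0, 3.0, 2.0))
# Probability of falling at most one standard deviation above the mean.
print(gaussian_cdf(5.0, 3.0, 2.0))
```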

slide-19
SLIDE 19

Properties of the Gaussian

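◮ The sum of two independent Gaussians is Gaussian. If $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$ are independent, and $Z = X + Y$, then $Z \sim N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$.

◮ The sum of two dependent Gaussians isn't always Gaussian. Consider the following example, where the sign is chosen independently of $X$:

$X \sim N(0, 1), \quad Y = \begin{cases} X & \text{w.p. } \frac{1}{2} \\ -X & \text{w.p. } \frac{1}{2} \end{cases}$

They are both Gaussian, but $X + Y$ is not Gaussian: it equals 0 with probability $\frac{1}{2}$, and no Gaussian with positive variance puts positive probability on a single point.

◮ A Gaussian multiplied by a nonzero constant is Gaussian. If $X \sim N(\mu, \sigma^2)$ and $Y = aX$ with $a \ne 0$, then $Y \sim N(a\mu, a^2\sigma^2)$.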
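The dependent-sum counterexample above can be illustrated by simulation. The sketch assumes the sign of Y is chosen by a fair coin flip independent of X; the function name `fraction_sum_is_zero` is made up here. It estimates how often X + Y is exactly 0, which would have probability 0 if X + Y were a Gaussian with positive variance.

```python
import random


def fraction_sum_is_zero(n_samples=50_000, seed=0):
    """Estimate P(X + Y = 0) where X ~ N(0, 1) and Y = X or -X w.p. 1/2 each.

    Assumption: the sign is picked independently of X.
    """
    rng = random.Random(seed)
    zeros = 0
    for _ in range(n_samples):
        x = rng.gauss(0.0, 1.0)
        y = x if rng.random() < 0.5 else -x
        if x + y == 0.0:  # exact: x + (-x) is 0.0 in floating point
            zeros += 1
    return zeros / n_samples


# Roughly half of the samples sit exactly at 0, a point mass that a
# Gaussian density cannot produce.
print(fraction_sum_is_zero())
```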

slide-20
SLIDE 20

Scaling to the Standard Gaussian

◮ The properties on the previous slide allow us to convert any Gaussian into the standard Gaussian.
◮ If $X \sim N(\mu, \sigma^2)$, then $Z = \dfrac{X - \mu}{\sigma}$ is distributed as $Z \sim N(0, 1)$.
◮ Intuition: "I got 1 SD on Midterm 1."

slide-21
SLIDE 21

Joint PDFs

Just as multiple discrete RVs have a joint PMF, multiple continuous RVs have a joint PDF.

◮ Discrete: $p_{X,Y}(x, y)$
◮ Continuous: $f_{X,Y}(x, y)$
◮ Still needs to be non-negative.
◮ Still needs to integrate to 1.

slide-22
SLIDE 22

Joint CDFs

◮ Single RV: $F_X(x) = P(X \le x)$
◮ Multiple RVs: $F_{X,Y}(x, y) = P(X \le x, Y \le y)$
◮ Single RV: $\dfrac{d}{dx} F_X(x) = f_X(x)$
◮ Multiple RVs: $\dfrac{\partial^2}{\partial x\,\partial y} F_{X,Y}(x, y) = f_{X,Y}(x, y)$

slide-23
SLIDE 23

Marginal Probability Density

◮ Discrete: $p_X(x) = \sum_{y \in \mathcal{Y}} p_{X,Y}(x, y)$
◮ Continuous: $f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy$
◮ $f_X(x)$ is still a density, not a probability.

slide-24
SLIDE 24

Conditional Probability Density

◮ Discrete: $p_{X|Y}(x \mid y) = \dfrac{p_{X,Y}(x, y)}{p_Y(y)}$
◮ Continuous: $f_{X|Y}(x \mid y) = \dfrac{f_{X,Y}(x, y)}{f_Y(y)}$
◮ By definition, the Multiplication Rule still holds: $f_{X,Y}(x, y) = f_{X|Y}(x \mid y)\, f_Y(y)$.

slide-25
SLIDE 25

Independence

Similar to the discrete case, there are 3 equivalent definitions.

◮ For all x and y, $f_{X,Y}(x, y) = f_X(x)\, f_Y(y)$
◮ For all x and y, $f_{X|Y}(x \mid y) = f_X(x)$
◮ For all x and y, $f_{Y|X}(y \mid x) = f_Y(y)$

slide-26
SLIDE 26

Bayes Rule

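◮ Discrete (simple form)

$p_{X|Y}(x \mid y) = \dfrac{p_{Y|X}(y \mid x)\, p_X(x)}{p_Y(y)}$

◮ Discrete (extended form)

$p_{X|Y}(x \mid y) = \dfrac{p_{Y|X}(y \mid x)\, p_X(x)}{\sum_{x' \in \mathcal{X}} p_{Y|X}(y \mid x')\, p_X(x')}$

◮ Continuous (simple form)

$f_{X|Y}(x \mid y) = \dfrac{f_{Y|X}(y \mid x)\, f_X(x)}{f_Y(y)}$

◮ Continuous (extended form)

$f_{X|Y}(x \mid y) = \dfrac{f_{Y|X}(y \mid x)\, f_X(x)}{\int_{-\infty}^{\infty} f_{Y|X}(y \mid t)\, f_X(t)\,dt}$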

slide-27
SLIDE 27

Conditional Expectation

◮ Discrete: $E[Y \mid X = x] = \sum_{y \in \mathcal{Y}} y \cdot p_{Y|X}(y \mid x)$
◮ Continuous: $E[Y \mid X = x] = \int_{-\infty}^{\infty} y \cdot f_{Y|X}(y \mid x)\,dy$

slide-28
SLIDE 28

Combining Discrete and Continuous RVs

◮ You can also have discrete and continuous RVs defined jointly.
◮ Ex. let X be the outcome of a die roll and Y be Exp(X).

$p_X(x) = \dfrac{1}{6}, \quad x \in \{1, \dots, 6\}$

$f_{Y|X}(y \mid x) = x e^{-xy}, \quad y > 0$
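To make the mixed example concrete, the sketch below computes the marginal density of Y by summing over the die outcomes, and the mean of Y by averaging the conditional means $E[Y \mid X = x] = 1/x$ over X. It assumes a fair six-sided die and that Exp(X) means rate X, as the conditional density on the slide suggests; the names are chosen here for illustration.

```python
import math
from fractions import Fraction

FACES = range(1, 7)      # assumption: a fair six-sided die
P_X = Fraction(1, 6)     # p_X(x) for each face


def marginal_pdf_y(y):
    """f_Y(y) = sum over x of p_X(x) * f_{Y|X}(y | x), with rate x."""
    if y <= 0:
        return 0.0
    return sum(float(P_X) * x * math.exp(-x * y) for x in FACES)


def expected_y():
    """E[Y] = sum over x of p_X(x) * E[Y | X = x], where E[Y | X = x] = 1/x."""
    return sum(P_X * Fraction(1, x) for x in FACES)


print(marginal_pdf_y(0.5))
print(expected_y())
```

Here Y on its own is continuous, but its density is a mixture of six exponential densities weighted by the PMF of X.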

slide-29
SLIDE 29

Change of Variables / Derived Distributions

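◮ Let $X \sim U[0, 1]$ and $Y = 2X$. Then is it true that

$f_Y(y) = P(Y = y) = P(2X = y) = P(X = \tfrac{y}{2}) = f_X(\tfrac{y}{2})$?

◮ No, this won't integrate to 1. (A density is not a probability: $P(Y = y) = 0$ for every y.)
◮ You have to use the CDF.

$F_Y(y) = P(Y \le y) = P(2X \le y) = P(X \le \tfrac{y}{2}) = F_X(\tfrac{y}{2})$

$f_Y(y) = \dfrac{d}{dy} F_X(\tfrac{y}{2}) = f_X(\tfrac{y}{2}) \cdot \dfrac{1}{2}$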
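A quick numerical check shows why the naive substitution fails for Y = 2X. The sketch below integrates both candidate densities over the support of Y, which is [0, 2]; the midpoint rule and the function names are assumptions made for this example.

```python
def pdf_x(x):
    """Density of X ~ U[0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def naive_pdf_y(y):
    """Incorrect candidate: plugs y/2 into f_X and drops the 1/2 factor."""
    return pdf_x(y / 2.0)


def correct_pdf_y(y):
    """Density obtained by differentiating the CDF F_X(y/2)."""
    return 0.5 * pdf_x(y / 2.0)


def midpoint_integral(f, lo, hi, n=10_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


naive_mass = midpoint_integral(naive_pdf_y, 0.0, 2.0)      # about 2, not a valid density
correct_mass = midpoint_integral(correct_pdf_y, 0.0, 2.0)  # about 1
print(naive_mass, correct_mass)
```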

slide-30
SLIDE 30

References

Introduction to probability. DP Bertsekas, JN Tsitsiklis - 2002