


Continuous Probability

CS70 Summer 2016 - Lecture 6A

David Dinh 25 July 2016

UC Berkeley


Logistics

Tutoring Sections - M/W 5-8PM in 540 Cory.

  • Conceptual discussions of material
  • No homework discussion (take that to OH/HW party, please)

Midterm is this Friday - 11:30-1:30, same rooms as last time.

  • Covers material from MT1 to this Wednesday...
  • ...but we will expect you to know everything we’ve covered from the start of class.
  • One double-sided sheet of notes allowed (our advice: reuse the sheet from MT1 and add MT2 topics to the other side).
  • Students with time conflicts and DSP students should have been contacted by us - if you are one and you haven’t heard from us, get in touch ASAP.


Today

  • What is continuous probability?
  • Expectation and variance in the continuous setting.
  • Some common distributions.


Continuous Probability


Motivation I

Sometimes you can’t model things discretely: random real numbers, points on a map, times. The probability space is continuous. What is probability? A function mapping events to [0, 1]. But what is an event in continuous probability?


Motivation II

Class starts at 14:10. You take your seat at a “uniform” random time between 14:00 and 14:10. What’s an event here? What is the probability of coming in at exactly 14:03:47.32? The sample space is all times between 14:00 and 14:10. How big is it? There are infinitely many numbers between 0 and 10, and the chance of any single outcome in an infinite uniform sample space is 0. It is not so simple to define events in continuous probability!


Motivation III

Look at intervals instead of specific times. Probability that you come in between 14:00 and 14:10? 1. Between 14:00 and 14:05? 1/2. Between 14:03 and 14:04? 1/10. In general, in some time interval of 10/k minutes? 1/k.


PDF (no, not the file format)

What happens when you take k → ∞? The probability goes to 0. What do we do so that it doesn’t disappear? If we split our sample space into k pieces, multiply each piece’s probability by k. The resulting curve as k → ∞ is the probability density function (PDF).


Formally speaking...

The PDF fX(t) of a random variable X is defined so that the probability of X taking a value in [t, t + δ] is δ · fX(t) for infinitesimally small δ:

fX(t) = lim_{δ→0} Pr[X ∈ [t, t + δ]] / δ

Another way of looking at it:

Pr[X ∈ [a, b]] = ∫_a^b fX(t) dt

fX is nonnegative (negative probability doesn’t make much sense), and the total probability is 1:

∫_{−∞}^{∞} fX(t) dt = 1
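As a sanity check (not from the slides), here is a short numerical verification of these two properties for the density fX(t) = 2t on [0, 1], which reappears in the target-shooting example later. A midpoint Riemann sum stands in for exact integration.

```python
def f_X(t):
    # Example density: f_X(t) = 2t on [0, 1], zero elsewhere.
    return 2 * t if 0 <= t <= 1 else 0.0

def integrate(f, a, b, steps=100_000):
    """Midpoint Riemann sum of f over [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

total = integrate(f_X, 0.0, 1.0)    # total probability: should be 1
p_half = integrate(f_X, 0.0, 0.5)   # Pr[X in [0, 0.5]] = 0.5^2 = 0.25
print(total, p_half)
```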


CDF

Cumulative distribution function (CDF): FX(t) = Pr[X ≤ t]. Or, in terms of the PDF:

FX(t) = ∫_{−∞}^t fX(z) dz

Pr[X ∈ (a, b]] = Pr[X ≤ b] − Pr[X ≤ a] = FX(b) − FX(a)

FX(t) ∈ [0, 1], with lim_{t→−∞} FX(t) = 0 and lim_{t→∞} FX(t) = 1.
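A minimal numerical sketch of these CDF properties, again using the assumed example density fX(t) = 2t on [0, 1]: build FX by integrating the PDF, then check the difference formula and the two limits.

```python
def f_X(t):
    # Example density: f_X(t) = 2t on [0, 1], zero elsewhere.
    return 2 * t if 0 <= t <= 1 else 0.0

def F_X(t, steps=100_000):
    """CDF via a midpoint Riemann sum; the PDF is 0 below 0, so start there."""
    if t <= 0:
        return 0.0
    h = t / steps
    return sum(f_X((i + 0.5) * h) for i in range(steps)) * h

p = F_X(0.8) - F_X(0.4)   # Pr[X in (0.4, 0.8]] = 0.64 - 0.16 = 0.48
lo = F_X(-5.0)            # limit toward -infinity: 0
hi = F_X(5.0)             # limit toward +infinity: 1
print(p, lo, hi)
```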

In Pictures


Expectation

Discrete case: E[X] = Σ_{t=−∞}^{∞} t · Pr[X = t]

Continuous case? The sum becomes an integral:

E[X] = ∫_{−∞}^{∞} t fX(t) dt

Expectation of a function:

E[g(X)] = ∫_{−∞}^{∞} g(t) fX(t) dt

Linearity of expectation: E[aX + bY] = aE[X] + bE[Y]. Proof: similar to the discrete case. If X, Y, Z are mutually independent, then E[XYZ] = E[X]E[Y]E[Z]. Proof: also similar to the discrete case. Exercise: try proving these yourself.
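A quick Monte Carlo illustration of linearity; the distributions and constants below are arbitrary choices for illustration, not from the slides. (Linearity does not require independence, but independent samples are easiest to draw.)

```python
import random

random.seed(0)  # fixed seed so the check is reproducible
a, b = 3.0, -2.0
n = 200_000
xs = [random.uniform(0, 1) for _ in range(n)]  # X ~ Uniform[0, 1], E[X] = 0.5
ys = [random.uniform(0, 2) for _ in range(n)]  # Y ~ Uniform[0, 2], E[Y] = 1.0

e_lhs = sum(a * x + b * y for x, y in zip(xs, ys)) / n  # sample E[aX + bY]
e_rhs = a * (sum(xs) / n) + b * (sum(ys) / n)           # a*E[X] + b*E[Y]
print(e_lhs, e_rhs)  # both should be near 3*0.5 - 2*1.0 = -0.5
```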


Variance

Variance is defined exactly as in the discrete case:

Var(X) = E[(X − E[X])²] = E[X²] − E[X]²

The standard properties of variance hold in the continuous case as well: Var(aX) = a²Var(X), and for independent r.v.s X, Y: Var(X + Y) = Var(X) + Var(Y).
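Both properties can be spot-checked by simulation; the uniform distributions and the constant a = 3 below are arbitrary choices for illustration. A Uniform[0, 1] variable has variance 1/12.

```python
import random

random.seed(1)  # fixed seed for reproducibility
n = 200_000
xs = [random.uniform(0, 1) for _ in range(n)]
ys = [random.uniform(0, 1) for _ in range(n)]  # independent of xs

def variance(samples):
    m = sum(samples) / len(samples)
    return sum((s - m) ** 2 for s in samples) / len(samples)

a = 3.0
v_scaled = variance([a * x for x in xs])              # expect a^2/12 = 0.75
v_sum = variance([x + y for x, y in zip(xs, ys)])     # expect 2/12 = 1/6
print(v_scaled, v_sum)
```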


Target shooting

Suppose an archer always hits a circular target with a 1 meter radius, and the exact point that he hits is distributed uniformly across the target. What is the distribution of the distance between his arrow and the center (call this r.v. X)? What is the probability that the arrow lands closer than t to the center?

Pr[X ≤ t] = (area of small circle) / (area of target) = πt² / π = t².


Target shooting II

CDF:

FX(t) = Pr[X ≤ t] = 0 for t < 0, t² for 0 ≤ t ≤ 1, 1 for t > 1

PDF?

fX(t) = F′X(t) = 2t for 0 ≤ t ≤ 1, 0 otherwise
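The derivation can be checked by simulation: throw uniform random points at the unit disk (rejection sampling from the enclosing square) and compare the empirical Pr[X ≤ t] against t². The sample size and test points below are arbitrary.

```python
import math
import random

random.seed(2)  # fixed seed for reproducibility

def random_point_in_disk():
    # Rejection sampling: draw uniformly from [-1, 1]^2, keep if inside disk.
    while True:
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)
        if x * x + y * y <= 1:
            return x, y

n = 100_000
dists = [math.hypot(*random_point_in_disk()) for _ in range(n)]

def empirical_cdf(t):
    return sum(d <= t for d in dists) / n

# Largest deviation from the predicted CDF t^2 at a few test points.
err = max(abs(empirical_cdf(t) - t * t) for t in (0.25, 0.5, 0.75))
print(err)
```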


Target shooting III

Another way of attacking the same problem: what’s the probability of hitting a ring with inner radius t and outer radius t + δ, for small δ?

Area of target: π. Area of ring: π((t + δ)² − t²) = π(t² + 2tδ + δ² − t²) = π(2tδ + δ²) ≈ 2πtδ.

Probability of hitting the ring: ≈ 2tδ. So the PDF for t ≤ 1 is 2t.


Shifting & Scaling

Let fX(x) be the PDF of X and Y = a + bX where b > 0. Then

Pr[Y ∈ (y, y + δ)] = Pr[a + bX ∈ (y, y + δ)]
= Pr[X ∈ ((y − a)/b, (y + δ − a)/b)]
= Pr[X ∈ ((y − a)/b, (y − a)/b + δ/b)]
= fX((y − a)/b) · δ/b.

The left-hand side is fY(y) · δ. Hence fY(y) = (1/b) fX((y − a)/b).
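A rough empirical check of this formula, with X ~ Uniform[0, 1] and arbitrary constants a = 2, b = 4: then Y is uniform on [2, 6], so fY should equal 1/b = 0.25 there, which we estimate from a single histogram bin.

```python
import random

random.seed(3)  # fixed seed for reproducibility
a, b = 2.0, 4.0
n = 200_000
ys = [a + b * random.uniform(0, 1) for _ in range(n)]  # Y = a + bX

# Density estimate over the bin (3.0, 3.5): fraction of mass / bin width.
lo, hi = 3.0, 3.5
density = sum(lo < y <= hi for y in ys) / n / (hi - lo)

# Predicted: (1/b) * f_X((y - a)/b), and f_X = 1 on [0, 1].
predicted = (1 / b) * 1.0
print(density, predicted)
```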


Continuous Distributions


Uniform Distribution: CDF and PDF

The PDF is some constant k over an interval [a, b] and zero outside it. What is the value of k?

∫_{−∞}^{∞} fX(t) dt = ∫_a^b k dt = k(b − a) = 1

so the PDF is 1/(b − a) in [a, b] and 0 otherwise. CDF?

FX(t) = ∫_{−∞}^t fX(z) dz = 0 for t < a, (t − a)/(b − a) for a ≤ t ≤ b, and 1 for t > b.

Uniform Distribution: CDF and PDF, Graphically

fX(t) = 1/(b − a) for a < t < b, 0 otherwise

FX(t) = 0 for t < a, (t − a)/(b − a) for a < t < b, 1 for t > b


Uniform Distribution: Expectation and Variance

Expectation?

E[X] = ∫_a^b t/(b − a) dt = (1/2) · (b² − a²)/(b − a) = (a + b)/2

Variance?

Var[X] = E[X²] − E[X]²
= ∫_a^b t²/(b − a) dt − ((a + b)/2)²
= [t³/(3(b − a))]_a^b − ((a + b)/2)²
= (b³ − a³)/(3(b − a)) − ((a + b)/2)²
= (b − a)²/12
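A Monte Carlo spot-check of both formulas for the arbitrary choice a = 2, b = 5, so the mean should be (a + b)/2 = 3.5 and the variance (b − a)²/12 = 0.75:

```python
import random

random.seed(4)  # fixed seed for reproducibility
a, b = 2.0, 5.0
n = 200_000
xs = [random.uniform(a, b) for _ in range(n)]  # X ~ Uniform[a, b]

mean = sum(xs) / n
var = sum((x - mean) ** 2 for x in xs) / n
print(mean, var)  # should be near 3.5 and 0.75
```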


slide-103
SLIDE 103

Exponential Distribution: Motivation

Continuous-time analogue of the geometric distribution. How long until a server fails? How long does it take you to run into pokemon? Can’t “continuously flip a coin”. What do we do? Look at geometric distributions representing processes with higher and higher granularity.

19

slide-109
SLIDE 109

Exponential Distribution: Motivation II

Suppose a server fails with probability λ every day. Probability that the server fails on the same day as time t: (1 − λ)⌈t⌉−1 λ. More precision! What's the probability that it fails in a given 12-hour period? λ/2, if we assume there is no time at which it's more likely to fail than another. Generally: the server fails with probability λ/n during any 1/n-day period. Probability that the server fails in the same 1/n-day period as t: (1 − λ/n)⌈tn⌉−1 (λ/n).

20

slide-112
SLIDE 112

Exponential Distribution: Motivation III

(1 − λ/n)⌈tn⌉−1 (λ/n). What happens when we try to take n to ∞? The probability goes to zero... but we can make a PDF out of this! Remove the width of the interval (1/n) and take the limit as n → ∞ to get:

limₙ→∞ (1 − λ/n)⌈tn⌉−1 λ = λ limₙ→∞ (1 − λ/n)tn−1 = λe−λt.

This is the PDF of the exponential distribution!

21
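A numeric check of the limit above (my sketch, not from the slides): divide the discrete failure probability by the slot width 1/n and watch it approach λe−λt as n grows:

```python
import math

lam, t = 0.5, 2.0
target = lam * math.exp(-lam * t)  # the claimed limiting density λe^(−λt)

for n in (10, 100, 10_000):
    # P(fail in the 1/n-day slot containing t), then divide by the slot width 1/n
    p = (1 - lam / n) ** (math.ceil(t * n) - 1) * (lam / n)
    density = p * n
    print(n, density)
```

With n = 10,000 the discretized density already agrees with λe−λt to several decimal places.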

slide-114
SLIDE 114

Exponential Distribution: PDF and CDF

The exponential distribution with parameter λ > 0 is defined by

fX(t) = 0 if t < 0, and λe−λt if t ≥ 0.

FX(t) = 0 if t < 0, and 1 − e−λt if t ≥ 0.

Note that Pr[X > t] = e−λt for t > 0.

22
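A quick simulation check of the tail formula Pr[X > t] = e−λt (a sketch of mine, using the standard library's exponential sampler):

```python
import math
import random

random.seed(0)
lam, t, n = 1.5, 1.0, 200_000

# draw n samples from Expo(λ)
samples = [random.expovariate(lam) for _ in range(n)]

# empirical tail probability vs. the closed form Pr[X > t] = e^(−λt)
empirical = sum(s > t for s in samples) / n
exact = math.exp(-lam * t)
print(empirical, exact)
```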

slide-124
SLIDE 124

Expectation & Variance of the Exponential Distribution

X = Expo(λ). Then fX(x) = λe−λx for x ≥ 0. Thus,

E[X] = ∫₀∞ x λe−λx dx = − ∫₀∞ x d(e−λx).

Integration by parts:

∫₀∞ x d(e−λx) = [x e−λx]₀∞ − ∫₀∞ e−λx dx = 0 − 0 + (1/λ) ∫₀∞ d(e−λx) = −1/λ.

So the expectation is E[X] = 1/λ.

Variance: 1/λ².

23

slide-133
SLIDE 133

Properties of the Exponential Distribution: Memorylessness

Similar to memorylessness for geometric distributions. “If your server doesn’t fail today, it’s in the same state as it was before today.” Let X = Expo(λ). Then, for s, t > 0, Pr[X > t + s | X > s] = Pr[X > t + s] Pr[X > s] = e−λ(t+s) e−λs = e−λt = Pr[X > t].

24
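Memorylessness is easy to see empirically (my sketch, not from the slides): among the samples that survive past s, the fraction surviving another t units should match the unconditional tail Pr[X > t]:

```python
import math
import random

random.seed(1)
lam, s, t, n = 1.0, 0.7, 1.2, 300_000
samples = [random.expovariate(lam) for _ in range(n)]

# condition on X > s, then ask how many survive another t units
survivors = [x for x in samples if x > s]
cond = sum(x > s + t for x in survivors) / len(survivors)  # Pr[X > t+s | X > s]
uncond = math.exp(-lam * t)                                # Pr[X > t] = e^(−λt)
print(cond, uncond)
```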

slide-142
SLIDE 142

Properties of the Exponential Distribution: Scaling

Let X = Expo(λ) and Y = aX for some a > 0. Then Pr[Y > t] = Pr[aX > t] = Pr[X > t/a] = e−λ(t/a) = e−(λ/a)t = Pr[Z > t] for Z = Expo(λ/a). Thus, a × Expo(λ) = Expo(λ/a). Also, Expo(λ) = (1/λ) Expo(1).

25
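The scaling identity a × Expo(λ) = Expo(λ/a) can be checked by simulation (a sketch of mine, not from the slides):

```python
import math
import random

random.seed(2)
lam, a, t, n = 3.0, 2.0, 1.0, 200_000

# Y = aX with X ~ Expo(λ); the claim is Y ~ Expo(λ/a)
ys = [a * random.expovariate(lam) for _ in range(n)]

empirical = sum(y > t for y in ys) / n
exact = math.exp(-(lam / a) * t)   # tail of Expo(λ/a)
print(empirical, exact)
```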

slide-146
SLIDE 146

Normal Distribution

Continuous counterpart to the binomial distribution (more on this later). The normal (or Gaussian) distribution with parameters µ and σ², denoted N(µ, σ²):

fX(t) = 1/√(2πσ²) · e−(t−µ)²/(2σ²)

[Figure: plot of the standard normal PDF, peaking at about 0.4 near t = 0.]

Sometimes called a "bell curve". Above: N(0, 1), the "standard normal".

26

slide-153
SLIDE 153

Normal Distribution: Properties

PDF: fX(t) = 1/√(2πσ²) · e−(t−µ)²/(2σ²)

CDF: involves an integral with no nice closed form (often expressed in terms of "erf", the error function). Won't discuss it here. Expectation: µ (notice that the PDF is symmetric around µ). Variance: σ² (fairly straightforward integration). Scaling/shifting: if X ∼ N(0, 1) and Y = µ + σX, then Y ∼ N(µ, σ²). "68-95-99.7 rule": for a normal distribution, roughly 68% of the probability mass lies within one standard deviation of the mean, roughly 95% within two standard deviations, and roughly 99.7% within three. "n-sigma event": shorthand for an event whose probability matches that of landing more than n standard deviations from the mean of a normal distribution.

27
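The 68-95-99.7 numbers follow from the standard normal CDF: Pr[|X − µ| ≤ kσ] = erf(k/√2), independent of µ and σ. A one-liner check using the standard library (my addition, not from the slides):

```python
import math

# Pr[|X − µ| ≤ kσ] for X ~ N(µ, σ²) equals erf(k/√2)
masses = {k: math.erf(k / math.sqrt(2)) for k in (1, 2, 3)}
for k, m in masses.items():
    print(f"within {k} sigma: {m:.4f}")
```

This prints approximately 0.6827, 0.9545, and 0.9973 — the rule's three numbers.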

slide-154
SLIDE 154

How Many Sigmas, Exactly?

28

slide-159
SLIDE 159

Central Limit Theorem

Basically: if you take many i.i.d. random variables from any∗ distribution, sum them up, and center and rescale the sum, the result converges to a Gaussian. Suppose X1, X2, ... are i.i.d. random variables with expectation µ and variance σ². Let An = ∑i Xi and

Sn := (An − nµ)/(σ√n) = ((∑i Xi) − nµ)/(σ√n).

Then Sn tends towards N(0, 1) as n → ∞. Or:

Pr[Sn ≤ a] → (1/√(2π)) ∫₋∞ᵃ e−x²/2 dx.

Proof: EE126. Sum of Bernoullis (binomial) tends towards normal!

29
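A small simulation of the CLT (my sketch, not from the slides; I use Uniform(0, 1) summands rather than Bernoullis to avoid lattice effects): standardize sums of 48 uniforms and compare Pr[Sn ≤ 1] with the standard normal CDF at 1, Φ(1) = (1 + erf(1/√2))/2 ≈ 0.8413:

```python
import math
import random

random.seed(3)
n, m = 48, 20_000
mu, sigma = 0.5, math.sqrt(1 / 12)   # mean and std of Uniform(0, 1)

count = 0
for _ in range(m):
    s = sum(random.random() for _ in range(n))
    sn = (s - n * mu) / (sigma * math.sqrt(n))   # standardized sum
    if sn <= 1.0:
        count += 1

empirical = count / m
phi1 = 0.5 * (1 + math.erf(1 / math.sqrt(2)))    # Φ(1) ≈ 0.8413
print(empirical, phi1)
```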

slide-164
SLIDE 164

Summary

Continuous probability: translation of discrete probability to a continuous sample space with an infinite number of events. Concepts of variance, expectation, etc. translate to continuous too. Geometric distribution → exponential distribution. Binomial distribution → normal distribution. Central limit theorem: everything converges to normal if we take enough samples

30

slide-165
SLIDE 165

Today’s Gig: Cauchy Distribution

slide-166
SLIDE 166

Cauchy

Augustin-Louis Cauchy (1789-1857). Practically invented complex analysis. Made fundamental contributions to calculus and group theory. "More concepts and theorems have been named for Cauchy than for any other mathematician." Was also a baron because he tutored a duke... who ended up hating math.

31

slide-177
SLIDE 177

Definition

Actually first written about by Poisson in 1824. Cauchy became associated with it in 1853! Suppose I have a wall on the x-axis. Stand at (0, 1) and point a laser at a uniformly random angle such that the laser hits the wall. What is the distribution of the point on the wall?

tan θ = t, so θ = tan−1 t, and dθ = dt/(1 + t²), so dθ/π = dt/(π(1 + t²)).

32
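The laser construction doubles as a sampler (my sketch, not from the slides): draw a uniform angle, take its tangent, and check the empirical CDF against the Cauchy CDF F(t) = 1/2 + tan−1(t)/π:

```python
import math
import random

random.seed(4)
n = 200_000

# uniform angle θ ∈ (−π/2, π/2); the laser hits the wall at t = tan θ
ts = [math.tan(random.uniform(-math.pi / 2, math.pi / 2)) for _ in range(n)]

# Cauchy CDF: F(t) = 1/2 + arctan(t)/π; at t = 1 this is 0.75
empirical = sum(t <= 1.0 for t in ts) / n
exact = 0.5 + math.atan(1.0) / math.pi
print(empirical, exact)
```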

slide-186
SLIDE 186

Properties

PDF: fX(t) = 1/(π(1 + t²)). Expectation? The improper integral ∫ t/(π(1 + t²)) dt over all of R doesn't converge: the symmetric limit lima→∞ ∫[−a, a] t/(π(1 + t²)) dt = 0, but lima→∞ ∫[−a, 2a] t/(π(1 + t²)) dt ≠ 0. Expectation doesn't exist! If you try to estimate the expectation by sampling points and averaging, you'll get crazy results. Variance doesn't exist either. Main takeaway: there are some really badly-behaved distributions out there.

33
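You can see the bad behavior directly (my sketch, not from the slides): the sample median of Cauchy draws is stable (the distribution's median is 0), while sample means of independent batches jump around instead of settling down:

```python
import math
import random

random.seed(5)

def cauchy():
    # standard Cauchy sample via the laser construction: tan of a uniform angle
    return math.tan(random.uniform(-math.pi / 2, math.pi / 2))

n = 100_001
samples = sorted(cauchy() for _ in range(n))
median = samples[n // 2]   # the median IS well-defined: it converges to 0
print("median:", median)

# ...but batch means do not converge to anything (no expectation exists)
means = [sum(cauchy() for _ in range(10_000)) / 10_000 for _ in range(5)]
print("batch means:", means)
```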

slide-187
SLIDE 187

Questions?

33