


Continuous Probability

CS70 Summer 2016 - Lecture 6A

David Dinh 25 July 2016

UC Berkeley


Logistics

Tutoring Sections - M/W 5-8PM in 540 Cory.

  • Conceptual discussions of material
  • No homework discussion (take that to OH/HW party, please)

Midterm is this Friday - 11:30-1:30, same rooms as last time.

  • Covers material from MT1 to this Wednesday...
  • ...but we will expect you to know everything we’ve covered from the start of class.
  • One double-sided sheet of notes allowed (our advice: reuse the sheet from MT1 and add MT2 topics to the other side).
  • Students with time conflicts and DSP students should have been contacted by us - if you are one and you haven’t heard from us, get in touch ASAP.


Today

  • What is continuous probability?
  • Expectation and variance in the continuous setting.
  • Some common distributions.


Continuous Probability


Motivation I

Sometimes you can’t model things discretely: random real numbers, points on a map, times. The probability space is continuous. What is probability? A function mapping events to [0, 1]. But what is an event in continuous probability?


Motivation II

Class starts at 14:10. You take your seat at a “uniform” random time between 14:00 and 14:10. What’s an event here? What is the probability of coming in at exactly 14:03:47.32? The sample space is all times between 14:00 and 14:10. How big is it? There are infinitely many numbers between 0 and 10, and the chance of any single outcome in an infinite uniform sample space is 0. It is not so simple to define events in continuous probability!


Motivation III

Look at intervals instead of specific times. Probability that you come in between 14:00 and 14:10? 1. Between 14:00 and 14:05? 1/2. Between 14:03 and 14:04? 1/10. In general, in some time interval of 10/k minutes? 1/k.


PDF (no, not the file format)

What happens when you take k → ∞? The probability goes to 0. What do we do so that it doesn’t disappear? If we split our sample space into k pieces, multiply each piece’s probability by k. The resulting curve as k → ∞ is the probability density function (PDF).


Formally speaking...

The PDF fX(t) of a random variable X is defined so that the probability of X taking a value in [t, t + δ] is δ · fX(t) for infinitesimally small δ:

fX(t) = lim_{δ→0} Pr[X ∈ [t, t + δ]] / δ

Another way of looking at it:

Pr[X ∈ [a, b]] = ∫_a^b fX(t) dt

fX is nonnegative (negative probability doesn’t make much sense), and the total probability is 1:

∫_{−∞}^{∞} fX(t) dt = 1
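As a sanity check (not from the slides), here is a short numerical verification of these two properties for the density fX(t) = 2t on [0, 1], which reappears in the target-shooting example later. A midpoint Riemann sum stands in for exact integration.

```python
def f_X(t):
    # Example density: f_X(t) = 2t on [0, 1], zero elsewhere.
    return 2 * t if 0 <= t <= 1 else 0.0

def integrate(f, a, b, steps=100_000):
    """Midpoint Riemann sum of f over [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

total = integrate(f_X, 0.0, 1.0)    # total probability: should be 1
p_half = integrate(f_X, 0.0, 0.5)   # Pr[X in [0, 0.5]] = 0.5^2 = 0.25
print(total, p_half)
```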


CDF

Cumulative distribution function (CDF): FX(t) = Pr[X ≤ t]. Or, in terms of the PDF:

FX(t) = ∫_{−∞}^t fX(z) dz

Pr[X ∈ (a, b]] = Pr[X ≤ b] − Pr[X ≤ a] = FX(b) − FX(a)

FX(t) ∈ [0, 1], with lim_{t→−∞} FX(t) = 0 and lim_{t→∞} FX(t) = 1.
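A minimal numerical sketch of these CDF properties, again using the assumed example density fX(t) = 2t on [0, 1]: build FX by integrating the PDF, then check the difference formula and the two limits.

```python
def f_X(t):
    # Example density: f_X(t) = 2t on [0, 1], zero elsewhere.
    return 2 * t if 0 <= t <= 1 else 0.0

def F_X(t, steps=100_000):
    """CDF via a midpoint Riemann sum; the PDF is 0 below 0, so start there."""
    if t <= 0:
        return 0.0
    h = t / steps
    return sum(f_X((i + 0.5) * h) for i in range(steps)) * h

p = F_X(0.8) - F_X(0.4)   # Pr[X in (0.4, 0.8]] = 0.64 - 0.16 = 0.48
lo = F_X(-5.0)            # limit toward -infinity: 0
hi = F_X(5.0)             # limit toward +infinity: 1
print(p, lo, hi)
```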

In Pictures


Expectation

Discrete case: E[X] = Σ_{t=−∞}^{∞} t · Pr[X = t]

Continuous case? The sum becomes an integral:

E[X] = ∫_{−∞}^{∞} t fX(t) dt

Expectation of a function:

E[g(X)] = ∫_{−∞}^{∞} g(t) fX(t) dt

Linearity of expectation: E[aX + bY] = aE[X] + bE[Y]. Proof: similar to the discrete case. If X, Y, Z are mutually independent, then E[XYZ] = E[X]E[Y]E[Z]. Proof: also similar to the discrete case. Exercise: try proving these yourself.
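A quick Monte Carlo illustration of linearity; the distributions and constants below are arbitrary choices for illustration, not from the slides. (Linearity does not require independence, but independent samples are easiest to draw.)

```python
import random

random.seed(0)  # fixed seed so the check is reproducible
a, b = 3.0, -2.0
n = 200_000
xs = [random.uniform(0, 1) for _ in range(n)]  # X ~ Uniform[0, 1], E[X] = 0.5
ys = [random.uniform(0, 2) for _ in range(n)]  # Y ~ Uniform[0, 2], E[Y] = 1.0

e_lhs = sum(a * x + b * y for x, y in zip(xs, ys)) / n  # sample E[aX + bY]
e_rhs = a * (sum(xs) / n) + b * (sum(ys) / n)           # a*E[X] + b*E[Y]
print(e_lhs, e_rhs)  # both should be near 3*0.5 - 2*1.0 = -0.5
```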


Variance

Variance is defined exactly as in the discrete case:

Var(X) = E[(X − E[X])²] = E[X²] − E[X]²

The standard properties of variance hold in the continuous case as well: Var(aX) = a²Var(X), and for independent r.v.s X, Y: Var(X + Y) = Var(X) + Var(Y).
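Both properties can be spot-checked by simulation; the uniform distributions and the constant a = 3 below are arbitrary choices for illustration. A Uniform[0, 1] variable has variance 1/12.

```python
import random

random.seed(1)  # fixed seed for reproducibility
n = 200_000
xs = [random.uniform(0, 1) for _ in range(n)]
ys = [random.uniform(0, 1) for _ in range(n)]  # independent of xs

def variance(samples):
    m = sum(samples) / len(samples)
    return sum((s - m) ** 2 for s in samples) / len(samples)

a = 3.0
v_scaled = variance([a * x for x in xs])              # expect a^2/12 = 0.75
v_sum = variance([x + y for x, y in zip(xs, ys)])     # expect 2/12 = 1/6
print(v_scaled, v_sum)
```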


Target shooting

Suppose an archer always hits a circular target with a 1 meter radius, and the exact point that he hits is distributed uniformly across the target. What is the distribution of the distance between his arrow and the center (call this r.v. X)? What is the probability that the arrow lands closer than t to the center?

Pr[X ≤ t] = (area of small circle) / (area of target) = πt² / π = t².


Target shooting II

CDF:

FX(t) = Pr[X ≤ t] = 0 for t < 0, t² for 0 ≤ t ≤ 1, 1 for t > 1

PDF?

fX(t) = F′X(t) = 2t for 0 ≤ t ≤ 1, 0 otherwise
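The derivation can be checked by simulation: throw uniform random points at the unit disk (rejection sampling from the enclosing square) and compare the empirical Pr[X ≤ t] against t². The sample size and test points below are arbitrary.

```python
import math
import random

random.seed(2)  # fixed seed for reproducibility

def random_point_in_disk():
    # Rejection sampling: draw uniformly from [-1, 1]^2, keep if inside disk.
    while True:
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)
        if x * x + y * y <= 1:
            return x, y

n = 100_000
dists = [math.hypot(*random_point_in_disk()) for _ in range(n)]

def empirical_cdf(t):
    return sum(d <= t for d in dists) / n

# Largest deviation from the predicted CDF t^2 at a few test points.
err = max(abs(empirical_cdf(t) - t * t) for t in (0.25, 0.5, 0.75))
print(err)
```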


Target shooting III

Another way of attacking the same problem: what’s the probability of hitting a ring with inner radius t and outer radius t + δ, for small δ?

Area of target: π. Area of ring: π((t + δ)² − t²) = π(t² + 2tδ + δ² − t²) = π(2tδ + δ²) ≈ 2πtδ.

Probability of hitting the ring: ≈ 2tδ. So the PDF for t ≤ 1 is 2t.


Shifting & Scaling

Let fX(x) be the PDF of X and Y = a + bX where b > 0. Then

Pr[Y ∈ (y, y + δ)] = Pr[a + bX ∈ (y, y + δ)]
= Pr[X ∈ ((y − a)/b, (y + δ − a)/b)]
= Pr[X ∈ ((y − a)/b, (y − a)/b + δ/b)]
= fX((y − a)/b) · δ/b.

The left-hand side is fY(y) · δ. Hence fY(y) = (1/b) fX((y − a)/b).
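A rough empirical check of this formula, with X ~ Uniform[0, 1] and arbitrary constants a = 2, b = 4: then Y is uniform on [2, 6], so fY should equal 1/b = 0.25 there, which we estimate from a single histogram bin.

```python
import random

random.seed(3)  # fixed seed for reproducibility
a, b = 2.0, 4.0
n = 200_000
ys = [a + b * random.uniform(0, 1) for _ in range(n)]  # Y = a + bX

# Density estimate over the bin (3.0, 3.5): fraction of mass / bin width.
lo, hi = 3.0, 3.5
density = sum(lo < y <= hi for y in ys) / n / (hi - lo)

# Predicted: (1/b) * f_X((y - a)/b), and f_X = 1 on [0, 1].
predicted = (1 / b) * 1.0
print(density, predicted)
```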


Continuous Distributions


Uniform Distribution: CDF and PDF

The PDF is some constant k over an interval [a, b] and zero outside it. What is the value of k?

∫_{−∞}^{∞} fX(t) dt = ∫_a^b k dt = k(b − a) = 1

so the PDF is 1/(b − a) in [a, b] and 0 otherwise. CDF?

FX(t) = ∫_{−∞}^t fX(z) dz = 0 for t < a, (t − a)/(b − a) for a ≤ t ≤ b, and 1 for t > b.

Uniform Distribution: CDF and PDF, Graphically

fX(t) = 1/(b − a) for a < t < b, 0 otherwise

FX(t) = 0 for t < a, (t − a)/(b − a) for a < t < b, 1 for t > b


Uniform Distribution: Expectation and Variance

Expectation?

E[X] = ∫_a^b t/(b − a) dt = (1/2) · (b² − a²)/(b − a) = (a + b)/2

Variance?

Var[X] = E[X²] − E[X]²
= ∫_a^b t²/(b − a) dt − ((a + b)/2)²
= [t³/(3(b − a))]_a^b − ((a + b)/2)²
= (b³ − a³)/(3(b − a)) − ((a + b)/2)²
= (b − a)²/12
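A Monte Carlo spot-check of both formulas for the arbitrary choice a = 2, b = 5, so the mean should be (a + b)/2 = 3.5 and the variance (b − a)²/12 = 0.75:

```python
import random

random.seed(4)  # fixed seed for reproducibility
a, b = 2.0, 5.0
n = 200_000
xs = [random.uniform(a, b) for _ in range(n)]  # X ~ Uniform[a, b]

mean = sum(xs) / n
var = sum((x - mean) ** 2 for x in xs) / n
print(mean, var)  # should be near 3.5 and 0.75
```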


slide-103
SLIDE 103

Exponential Distribution: Motivation

Continuous-time analogue of the geometric distribution. How long until a server fails? How long does it take you to run into pokemon? Can’t “continuously flip a coin”. What do we do? Look at geometric distributions representing processes with higher and higher granularity.

19

slide-109
SLIDE 109

Exponential Distribution: Motivation II

Suppose a server fails with probability λ every day. Probability that the server fails on the same day as time t: (1 − λ)⌈t⌉−1 λ. More precision! What's the probability that it fails in a given 12-hour period? λ/2, if we assume there is no time at which it's more likely to fail than another. Generally: the server fails with probability λ/n during any 1/n-day period. Probability that the server fails in the same 1/n-day period as t: (1 − λ/n)⌈tn⌉−1 (λ/n).

20

slide-112
SLIDE 112

Exponential Distribution: Motivation III

(1 − λ/n)⌈tn⌉−1 (λ/n). What happens when we try to take n to ∞? The probability goes to zero... but we can make a PDF out of this! Remove the width of the interval (1/n) and take the limit as n → ∞ to get:

limₙ→∞ (1 − λ/n)⌈tn⌉−1 λ = λ limₙ→∞ (1 − λ/n)tn−1 = λe−λt.

This is the PDF of the exponential distribution!

21
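A numeric check of the limit above (my sketch, not from the slides): divide the discrete failure probability by the slot width 1/n and watch it approach λe−λt as n grows:

```python
import math

lam, t = 0.5, 2.0
target = lam * math.exp(-lam * t)  # the claimed limiting density λe^(−λt)

for n in (10, 100, 10_000):
    # P(fail in the 1/n-day slot containing t), then divide by the slot width 1/n
    p = (1 - lam / n) ** (math.ceil(t * n) - 1) * (lam / n)
    density = p * n
    print(n, density)
```

With n = 10,000 the discretized density already agrees with λe−λt to several decimal places.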

slide-114
SLIDE 114

Exponential Distribution: PDF and CDF

The exponential distribution with parameter λ > 0 is defined by

fX(t) = 0 if t < 0, and λe−λt if t ≥ 0.

FX(t) = 0 if t < 0, and 1 − e−λt if t ≥ 0.

Note that Pr[X > t] = e−λt for t > 0.

22
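A quick simulation check of the tail formula Pr[X > t] = e−λt (a sketch of mine, using the standard library's exponential sampler):

```python
import math
import random

random.seed(0)
lam, t, n = 1.5, 1.0, 200_000

# draw n samples from Expo(λ)
samples = [random.expovariate(lam) for _ in range(n)]

# empirical tail probability vs. the closed form Pr[X > t] = e^(−λt)
empirical = sum(s > t for s in samples) / n
exact = math.exp(-lam * t)
print(empirical, exact)
```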

slide-124
SLIDE 124

Expectation & Variance of the Exponential Distribution

X = Expo(λ). Then fX(x) = λe−λx for x ≥ 0. Thus,

E[X] = ∫₀∞ x λe−λx dx = − ∫₀∞ x d(e−λx).

Integration by parts:

∫₀∞ x d(e−λx) = [x e−λx]₀∞ − ∫₀∞ e−λx dx = 0 − 0 + (1/λ) ∫₀∞ d(e−λx) = −1/λ.

So the expectation is E[X] = 1/λ.

Variance: 1/λ².

23

slide-133
SLIDE 133

Properties of the Exponential Distribution: Memorylessness

Similar to memorylessness for geometric distributions. “If your server doesn’t fail today, it’s in the same state as it was before today.” Let X = Expo(λ). Then, for s, t > 0, Pr[X > t + s | X > s] = Pr[X > t + s] Pr[X > s] = e−λ(t+s) e−λs = e−λt = Pr[X > t].

24
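Memorylessness is easy to see empirically (my sketch, not from the slides): among the samples that survive past s, the fraction surviving another t units should match the unconditional tail Pr[X > t]:

```python
import math
import random

random.seed(1)
lam, s, t, n = 1.0, 0.7, 1.2, 300_000
samples = [random.expovariate(lam) for _ in range(n)]

# condition on X > s, then ask how many survive another t units
survivors = [x for x in samples if x > s]
cond = sum(x > s + t for x in survivors) / len(survivors)  # Pr[X > t+s | X > s]
uncond = math.exp(-lam * t)                                # Pr[X > t] = e^(−λt)
print(cond, uncond)
```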

slide-142
SLIDE 142

Properties of the Exponential Distribution: Scaling

Let X = Expo(λ) and Y = aX for some a > 0. Then Pr[Y > t] = Pr[aX > t] = Pr[X > t/a] = e−λ(t/a) = e−(λ/a)t = Pr[Z > t] for Z = Expo(λ/a). Thus, a × Expo(λ) = Expo(λ/a). Also, Expo(λ) = (1/λ) Expo(1).

25
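The scaling identity a × Expo(λ) = Expo(λ/a) can be checked by simulation (a sketch of mine, not from the slides):

```python
import math
import random

random.seed(2)
lam, a, t, n = 3.0, 2.0, 1.0, 200_000

# Y = aX with X ~ Expo(λ); the claim is Y ~ Expo(λ/a)
ys = [a * random.expovariate(lam) for _ in range(n)]

empirical = sum(y > t for y in ys) / n
exact = math.exp(-(lam / a) * t)   # tail of Expo(λ/a)
print(empirical, exact)
```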

slide-146
SLIDE 146

Normal Distribution

Continuous counterpart to the binomial distribution (more on this later). The normal (or Gaussian) distribution with parameters µ and σ², denoted N(µ, σ²):

fX(t) = 1/√(2πσ²) · e−(t−µ)²/(2σ²)

[Figure: plot of the standard normal PDF, peaking at about 0.4 near t = 0.]

Sometimes called a "bell curve". Above: N(0, 1), the "standard normal".

26

slide-153
SLIDE 153

Normal Distribution: Properties

PDF: fX(t) = 1/√(2πσ²) · e−(t−µ)²/(2σ²)

CDF: involves an integral with no nice closed form (often expressed in terms of "erf", the error function). Won't discuss it here. Expectation: µ (notice that the PDF is symmetric around µ). Variance: σ² (fairly straightforward integration). Scaling/shifting: if X ∼ N(0, 1) and Y = µ + σX, then Y ∼ N(µ, σ²). "68-95-99.7 rule": for a normal distribution, roughly 68% of the probability mass lies within one standard deviation of the mean, roughly 95% within two standard deviations, and roughly 99.7% within three. "n-sigma event": shorthand for an event whose probability matches that of landing more than n standard deviations from the mean of a normal distribution.

27
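The 68-95-99.7 numbers follow from the standard normal CDF: Pr[|X − µ| ≤ kσ] = erf(k/√2), independent of µ and σ. A one-liner check using the standard library (my addition, not from the slides):

```python
import math

# Pr[|X − µ| ≤ kσ] for X ~ N(µ, σ²) equals erf(k/√2)
masses = {k: math.erf(k / math.sqrt(2)) for k in (1, 2, 3)}
for k, m in masses.items():
    print(f"within {k} sigma: {m:.4f}")
```

This prints approximately 0.6827, 0.9545, and 0.9973 — the rule's three numbers.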

slide-154
SLIDE 154

How Many Sigmas, Exactly?

28

slide-159
SLIDE 159

Central Limit Theorem

Basically: if you take many i.i.d. random variables from any∗ distribution, sum them up, and center and rescale the sum, the result converges to a Gaussian. Suppose X1, X2, ... are i.i.d. random variables with expectation µ and variance σ². Let An = ∑i Xi and

Sn := (An − nµ)/(σ√n) = ((∑i Xi) − nµ)/(σ√n).

Then Sn tends towards N(0, 1) as n → ∞. Or:

Pr[Sn ≤ a] → (1/√(2π)) ∫₋∞ᵃ e−x²/2 dx.

Proof: EE126. Sum of Bernoullis (binomial) tends towards normal!

29
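A small simulation of the CLT (my sketch, not from the slides; I use Uniform(0, 1) summands rather than Bernoullis to avoid lattice effects): standardize sums of 48 uniforms and compare Pr[Sn ≤ 1] with the standard normal CDF at 1, Φ(1) = (1 + erf(1/√2))/2 ≈ 0.8413:

```python
import math
import random

random.seed(3)
n, m = 48, 20_000
mu, sigma = 0.5, math.sqrt(1 / 12)   # mean and std of Uniform(0, 1)

count = 0
for _ in range(m):
    s = sum(random.random() for _ in range(n))
    sn = (s - n * mu) / (sigma * math.sqrt(n))   # standardized sum
    if sn <= 1.0:
        count += 1

empirical = count / m
phi1 = 0.5 * (1 + math.erf(1 / math.sqrt(2)))    # Φ(1) ≈ 0.8413
print(empirical, phi1)
```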

slide-164
SLIDE 164

Summary

Continuous probability: translation of discrete probability to a continuous sample space with an infinite number of events. Concepts of variance, expectation, etc. translate to continuous too. Geometric distribution → exponential distribution. Binomial distribution → normal distribution. Central limit theorem: everything converges to normal if we take enough samples

30

slide-165
SLIDE 165

Today’s Gig: Cauchy Distribution

slide-166
SLIDE 166

Cauchy

Augustin-Louis Cauchy (1789-1857). Practically invented complex analysis. Made fundamental contributions to calculus and group theory. "More concepts and theorems have been named for Cauchy than for any other mathematician." Was also a baron because he tutored a duke... who ended up hating math.

31

slide-177
SLIDE 177

Definition

Actually first written about by Poisson in 1824. Cauchy became associated with it in 1853! Suppose I have a wall on the x-axis. Stand at (0, 1) and point a laser at a uniformly random angle such that the laser hits the wall. What is the distribution of the point on the wall?

tan θ = t, so θ = tan−1 t, and dθ = dt/(1 + t²), so dθ/π = dt/(π(1 + t²)).

32
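The laser construction doubles as a sampler (my sketch, not from the slides): draw a uniform angle, take its tangent, and check the empirical CDF against the Cauchy CDF F(t) = 1/2 + tan−1(t)/π:

```python
import math
import random

random.seed(4)
n = 200_000

# uniform angle θ ∈ (−π/2, π/2); the laser hits the wall at t = tan θ
ts = [math.tan(random.uniform(-math.pi / 2, math.pi / 2)) for _ in range(n)]

# Cauchy CDF: F(t) = 1/2 + arctan(t)/π; at t = 1 this is 0.75
empirical = sum(t <= 1.0 for t in ts) / n
exact = 0.5 + math.atan(1.0) / math.pi
print(empirical, exact)
```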

slide-186
SLIDE 186

Properties

PDF: fX(t) = 1/(π(1 + t²)). Expectation? The improper integral ∫ t/(π(1 + t²)) dt over all of R doesn't converge: the symmetric limit lima→∞ ∫[−a, a] t/(π(1 + t²)) dt = 0, but lima→∞ ∫[−a, 2a] t/(π(1 + t²)) dt ≠ 0. Expectation doesn't exist! If you try to estimate the expectation by sampling points and averaging, you'll get crazy results. Variance doesn't exist either. Main takeaway: there are some really badly-behaved distributions out there.

33
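You can see the bad behavior directly (my sketch, not from the slides): the sample median of Cauchy draws is stable (the distribution's median is 0), while sample means of independent batches jump around instead of settling down:

```python
import math
import random

random.seed(5)

def cauchy():
    # standard Cauchy sample via the laser construction: tan of a uniform angle
    return math.tan(random.uniform(-math.pi / 2, math.pi / 2))

n = 100_001
samples = sorted(cauchy() for _ in range(n))
median = samples[n // 2]   # the median IS well-defined: it converges to 0
print("median:", median)

# ...but batch means do not converge to anything (no expectation exists)
means = [sum(cauchy() for _ in range(10_000)) / 10_000 for _ in range(5)]
print("batch means:", means)
```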

slide-187
SLIDE 187

Questions?

33