Advanced Algorithms COMS31900 Probability recap. Raphaël Clifford - PowerPoint PPT Presentation

SLIDE 1

Advanced Algorithms – COMS31900

Probability recap. Raphaël Clifford. Slides by Markus Jalsenius.

SLIDE 2

Randomness and probability


SLIDE 8

Probability

The sample space S is the set of outcomes of an experiment.

EXAMPLES

Roll a die: S = {1, 2, 3, 4, 5, 6}.

Flip a coin: S = {H, T}.

Amount of money you can win when playing some lottery:
S = {£0, £10, £100, £1000, £10,000, £100,000}.

For each x ∈ S, the probability of x, written Pr(x), is a real number between 0 and 1, such that ∑_{x∈S} Pr(x) = 1.

Pr is ‘just’ a function which maps each x ∈ S to Pr(x) ∈ [0, 1].

SLIDES 9–14

Probability

The sample space S is the set of outcomes of an experiment. For each x ∈ S, the probability of x, written Pr(x), is a real number between 0 and 1, such that ∑_{x∈S} Pr(x) = 1.

Pr is ‘just’ a function which maps each x ∈ S to Pr(x) ∈ [0, 1].

EXAMPLES

Roll a die: S = {1, 2, 3, 4, 5, 6}. Pr(1) = Pr(2) = Pr(3) = Pr(4) = Pr(5) = Pr(6) = 1/6.

Flip a coin: S = {H, T}. Pr(H) = Pr(T) = 1/2.

Amount of money you can win when playing some lottery: S = {£0, £10, £100, £1000, £10,000, £100,000}.
Pr(£0) = 0.9, Pr(£10) = 0.08, . . . , Pr(£100,000) = 0.0001.
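A finite distribution like the ones above is just a map from outcomes to probabilities that sums to 1; a minimal sketch in Python (not part of the slides), using exact rationals:

```python
from fractions import Fraction

# Pr as a function/dict: each outcome x in S maps to Pr(x) in [0, 1].
die = {x: Fraction(1, 6) for x in range(1, 7)}
coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}

# A valid distribution must sum to exactly 1 over the sample space.
for dist in (die, coin):
    assert sum(dist.values()) == 1
```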


SLIDE 21

Probability

The sample space is not necessarily finite.

EXAMPLE

Flip a coin until the first tail shows up:

S = {T, HT, HHT, HHHT, HHHHT, HHHHHT, . . . }.

Pr(“It takes n coin flips”) = (1/2)^n, and

∑_{n=1}^{∞} (1/2)^n = 1/2 + 1/4 + 1/8 + 1/16 + · · · = 1.
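The geometric sum above can be checked numerically: partial sums of (1/2)^n approach 1. A small sketch with exact rational arithmetic:

```python
from fractions import Fraction

# Pr("it takes n flips") = (1/2)**n; the probabilities over the infinite
# sample space sum to 1.  The partial sum 1/2 + 1/4 + ... + (1/2)**N
# equals 1 - (1/2)**N, which tends to 1 as N grows.
def partial_sum(N):
    return sum(Fraction(1, 2) ** n for n in range(1, N + 1))

assert partial_sum(10) == 1 - Fraction(1, 2) ** 10
assert float(partial_sum(50)) > 0.999999999
```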


SLIDE 28

Event

An event is a subset V of the sample space S.

The probability of event V happening, denoted Pr(V), is

Pr(V) = ∑_{x∈V} Pr(x).

EXAMPLE

Flip a coin 3 times: S = {TTT, TTH, THT, HTT, HHT, HTH, THH, HHH}. For each x ∈ S, Pr(x) = 1/8.

Define V to be the event “the first and last coin flips are the same”, in other words, V = {HHH, HTH, THT, TTT}.

What is Pr(V)?

Pr(V) = Pr(HHH) + Pr(HTH) + Pr(THT) + Pr(TTT) = 4 × 1/8 = 1/2.
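The event calculation above can be reproduced by enumerating all eight outcomes; a sketch:

```python
from fractions import Fraction
from itertools import product

# Sample space for three coin flips; each outcome has probability 1/8.
S = ["".join(flips) for flips in product("HT", repeat=3)]
Pr = {x: Fraction(1, 8) for x in S}

# Event V: first and last flips are the same.
V = {x for x in S if x[0] == x[-1]}
assert V == {"HHH", "HTH", "THT", "TTT"}

# Pr(V) is the sum of Pr(x) over x in V.
assert sum(Pr[x] for x in V) == Fraction(1, 2)
```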


SLIDE 40

Random variable

A random variable (r.v.) Y over sample space S is a function S → R, i.e. it maps each outcome x ∈ S to some real number Y(x).

The probability of Y taking value y is

Pr(Y = y) = ∑_{x ∈ S s.t. Y(x) = y} Pr(x)   (the sum over all x such that Y(x) = y).

EXAMPLE

Two coin flips.

S:  HH  HT  TH  TT
Y:   2   1   5   2

What is Pr(Y = 2)?

Pr(Y = 2) = ∑_{x ∈ {HH, TT}} Pr(x) = 1/4 + 1/4 = 1/2.

The expected value (the mean) of a r.v. Y, denoted E(Y), is E(Y) = ∑_y y · Pr(Y = y). Here,

E(Y) = 2 · 1/2 + 1 · 1/4 + 5 · 1/4 = 5/2.
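The two-coin example can be checked by brute force; the mapping Y below copies the slide's table:

```python
from fractions import Fraction

# Two fair coin flips; Y maps each outcome to a real number (per the slide).
Pr = {x: Fraction(1, 4) for x in ("HH", "HT", "TH", "TT")}
Y = {"HH": 2, "HT": 1, "TH": 5, "TT": 2}

# Pr(Y = y) sums Pr(x) over all x with Y(x) = y.
def pr_Y(y):
    return sum(Pr[x] for x in Pr if Y[x] == y)

assert pr_Y(2) == Fraction(1, 2)

# E(Y) = sum over y of y * Pr(Y = y), equivalently sum over x of Y(x)*Pr(x).
E_Y = sum(Y[x] * Pr[x] for x in Pr)
assert E_Y == Fraction(5, 2)
```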


SLIDE 42

Linearity of expectation

THEOREM (Linearity of expectation)

Let Y1, Y2, . . . , Yk be k random variables. Then

E(∑_{i=1}^{k} Yi) = ∑_{i=1}^{k} E(Yi).

Linearity of expectation always holds (regardless of whether the random variables are independent or not).


SLIDE 52

Linearity of expectation

EXAMPLE

Roll two dice. Let the r.v. Y be the sum of the values. What is E(Y)?

Approach 1: (without the theorem)

The sample space S = {(1, 1), (1, 2), (1, 3), . . . , (6, 6)} (36 outcomes).

E(Y) = ∑_{x∈S} Y(x) · Pr(x) = (1/36) ∑_{x∈S} Y(x) = (1/36)(1 · 2 + 2 · 3 + 3 · 4 + · · · + 1 · 12) = 7,

where each term weights a sum value by the number of outcomes giving it: the value 2 arises in 1 way, 3 in 2 ways, . . . , 12 in 1 way.


SLIDE 57

Linearity of expectation

EXAMPLE

Roll two dice. Let the r.v. Y be the sum of the values. What is E(Y)?

Approach 2: (with the theorem)

Let the r.v. Y1 be the value of the first die and Y2 the value of the second.

E(Y1) = E(Y2) = 3.5,

so E(Y) = E(Y1 + Y2) = E(Y1) + E(Y2) = 7.
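Both approaches can be verified by enumeration; a sketch:

```python
from fractions import Fraction
from itertools import product

# Approach 1: enumerate all 36 equally likely outcomes directly.
S = list(product(range(1, 7), repeat=2))
E_direct = sum(Fraction(a + b, 36) for a, b in S)

# Approach 2: linearity of expectation, with E(Y1) = E(Y2) = 7/2.
E_one_die = sum(Fraction(v, 6) for v in range(1, 7))
assert E_one_die == Fraction(7, 2)

# Both give the same answer, 7.
assert E_direct == E_one_die + E_one_die == 7
```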


SLIDE 67

Indicator random variables

An indicator random variable is a r.v. that can only be 0 or 1 (usually referred to by the letter I).

Fact: E(I) = 0 · Pr(I = 0) + 1 · Pr(I = 1) = Pr(I = 1).

Often an indicator r.v. I is associated with an event such that I = 1 if the event happens (and I = 0 otherwise).

Indicator random variables and linearity of expectation work great together!

EXAMPLE

Roll a die n times. What is the expected number of rolls that show a value that is at least the value of the previous roll?

For j ∈ {2, . . . , n}, let indicator r.v. Ij = 1 if the value of the jth roll is at least the value of the previous roll (and Ij = 0 otherwise).

Pr(Ij = 1) = 21/36 = 7/12 (by counting the outcomes).

E(∑_{j=2}^{n} Ij) = ∑_{j=2}^{n} E(Ij) = ∑_{j=2}^{n} Pr(Ij = 1) = (n − 1) · 7/12.

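The count of 21 favourable pairs, and the resulting (n − 1) · 7/12, can be confirmed by enumerating all 36 ordered pairs; a sketch (n = 10 is an arbitrary choice):

```python
from fractions import Fraction
from itertools import product

# Pr(I_j = 1): the j-th roll is at least the previous roll.
# Count pairs (prev, cur) with cur >= prev among the 36 equally likely pairs.
favourable = sum(1 for prev, cur in product(range(1, 7), repeat=2)
                 if cur >= prev)
assert favourable == 21
p = Fraction(favourable, 36)
assert p == Fraction(7, 12)

# By linearity, the expected count over n rolls is (n - 1) * 7/12.
n = 10
expected = (n - 1) * p
assert expected == Fraction(21, 4)
```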


SLIDE 78

Markov’s inequality

EXAMPLE

Suppose that the average (mean) speed on the motorway is 60 mph. It then follows that at most 1/2 of all cars drive at least 120 mph, and at most 2/3 of all cars drive at least 90 mph . . . otherwise the mean must be higher than 60 mph (a contradiction).

THEOREM (Markov’s inequality)

If X is a non-negative r.v., then for all a > 0,

Pr(X ≥ a) ≤ E(X)/a.

From the example above:

Pr(speed of a random car ≥ 120 mph) ≤ 60/120 = 1/2,

Pr(speed of a random car ≥ 90 mph) ≤ 60/90 = 2/3.
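Markov's inequality can be sanity-checked exhaustively on a small distribution; the fair die below is a stand-in example, not from the slides:

```python
from fractions import Fraction

# A fair die roll X is a non-negative r.v. with E(X) = 7/2.
Pr = {x: Fraction(1, 6) for x in range(1, 7)}
E_X = sum(x * p for x, p in Pr.items())
assert E_X == Fraction(7, 2)

# Markov: Pr(X >= a) <= E(X)/a for every a > 0.
for a in range(1, 13):
    tail = sum(p for x, p in Pr.items() if x >= a)
    assert tail <= E_X / a
```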


slide-89
SLIDE 89

Markov’s inequality

For j ∈ {1, . . . , n}, let indicator r.v. Ij = 1 if the jth person gets their own hat, By Markov’s inequality (recall: Pr(X ≥ a) ≤ E(X) a ), (sometimes Markov’s inequality is not particularly informative)

EXAMPLE

In fact, here it can be shown that as n → ∞, the probability that at least

  • ne person leaves with their own hat is 1 − 1

e ≈ 0.632.

n people go to a party, leaving their hats at the door.

Each person leaves with a random hat. How many people leave with their own hat?

E

  • therwise Ij = 0.

By linearity of expectation. . .

Pr(5 or more people leaving with their own hats) ≤ 1

5 ,

Pr(at least 1 person leaving with their own hat) ≤ 1

1 = 1.
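The slides contain no code, but the hat-check claims above are easy to verify numerically. A minimal Monte Carlo sketch in Python (the function name and parameters are my own, not from the deck):

```python
import random

def hat_experiment(n, trials=20000, seed=1):
    """Simulate the hat-check problem: n people each leave with a uniformly
    random hat (a random permutation). Returns the average number of people
    who got their own hat, and the fraction of trials where at least one did."""
    rng = random.Random(seed)
    total = at_least_one = 0
    for _ in range(trials):
        hats = list(range(n))
        rng.shuffle(hats)                 # hats[j] = hat taken by person j
        matches = sum(hats[j] == j for j in range(n))
        total += matches
        at_least_one += matches >= 1
    return total / trials, at_least_one / trials

mean, p_one = hat_experiment(10)
print(mean, p_one)  # mean ≈ 1 (= E(X)); p_one ≈ 1 − 1/e ≈ 0.632
```

The simulated mean sits near 1, matching the linearity-of-expectation argument, while the empirical Pr(at least one match) sits near 0.632 rather than the vacuous Markov bound of 1.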

slide-90
SLIDE 90

Markov’s inequality

If X is a non-negative r.v. that only takes integer values, then

Pr(X > 0) = Pr(X ≥ 1) ≤ E(X).

COROLLARY

For an indicator r.v. I, the bound is tight (=), as Pr(I > 0) = E(I).
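The corollary can be checked exactly on a small distribution (the distribution below is my own illustration, not from the slides):

```python
# Exact check of the corollary Pr(X > 0) <= E(X) for a small
# non-negative integer-valued r.v.
dist = {0: 0.5, 1: 0.2, 2: 0.2, 3: 0.1}          # Pr(X = value)

e_x = sum(v * p for v, p in dist.items())        # E(X) = 0.9
p_pos = sum(p for v, p in dist.items() if v > 0) # Pr(X > 0) = 0.5
assert p_pos <= e_x                              # Pr(X > 0) <= E(X)

# For an indicator I, the bound is tight: Pr(I > 0) = Pr(I = 1) = E(I).
ind = {0: 0.7, 1: 0.3}
assert sum(v * p for v, p in ind.items()) == sum(p for v, p in ind.items() if v > 0)
```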

slide-91
SLIDE 91

Union bound

Let V1, . . . , Vk be k events. Then

Pr(V1 ∪ · · · ∪ Vk) ≤ Pr(V1) + · · · + Pr(Vk).

THEOREM (union bound)

This is the probability that at least one of the events happens.
slide-94
SLIDE 94

Union bound

Let V1, . . . , Vk be k events. Then

Pr(V1 ∪ · · · ∪ Vk) ≤ Pr(V1) + · · · + Pr(Vk).

THEOREM (union bound)

This bound is tight (=) when the events are all disjoint.
(Vi and Vj are disjoint iff Vi ∩ Vj is empty.)

PROOF

Define indicator r.v. Ij to be 1 if event Vj happens, otherwise Ij = 0.
Let the r.v. X = I1 + · · · + Ik be the number of events that happen.

Pr(V1 ∪ · · · ∪ Vk) = Pr(X > 0)
                   ≤ E(X)                       (Markov corollary)
                   = E(I1 + · · · + Ik)
                   = E(I1) + · · · + E(Ik)      (linearity of expectation)
                   = Pr(V1) + · · · + Pr(Vk)    (E(I) = Pr(I = 1))
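The bound is also easy to sanity-check numerically; a minimal Python sketch (the sample space and events below are my own illustration, not from the deck):

```python
import random

# Numeric check of the union bound: on a uniform sample space of 20
# outcomes, the probability of a union of randomly chosen events never
# exceeds the sum of the individual event probabilities.
rng = random.Random(0)
n = 20
events = [set(rng.sample(range(n), rng.randint(1, 6))) for _ in range(5)]

p_union = len(set().union(*events)) / n
bound = sum(len(v) / n for v in events)
print(p_union, bound)
assert p_union <= bound  # the union bound holds
```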

slide-103
SLIDE 103

Union bound

Let V1, . . . , Vk be k events. Then

Pr(V1 ∪ · · · ∪ Vk) ≤ Pr(V1) + · · · + Pr(Vk).

THEOREM (union bound)

This bound is tight (=) when the events are all disjoint.
(Vi and Vj are disjoint iff Vi ∩ Vj is empty.)

EXAMPLE

S = {1, . . . , 6} is the set of outcomes of a die roll.

We define two events: V1 = {3, 4} and V2 = {1, 2, 3}.

[Venn diagram: outcomes 1–6 in S, with V1 and V2 overlapping at 3.]

Pr(V1 ∪ V2) ≤ Pr(V1) + Pr(V2) = 1/3 + 1/2 = 5/6.

In fact, Pr(V1 ∪ V2) = 2/3 (the outcome 3 was ‘double counted’ by the bound).

Typically the union bound is used when each Pr(Vi) is much smaller than 1/k.
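The die-roll example can be reproduced exactly with rational arithmetic:

```python
from fractions import Fraction

# The die-roll example above, computed exactly with fractions.
S = {1, 2, 3, 4, 5, 6}
V1, V2 = {3, 4}, {1, 2, 3}
pr = lambda event: Fraction(len(event), len(S))  # uniform die

bound = pr(V1) + pr(V2)   # 1/3 + 1/2 = 5/6
exact = pr(V1 | V2)       # Pr({1, 2, 3, 4}) = 2/3
print(bound, exact)       # 5/6 2/3
```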

slide-110
SLIDE 110

Summary

The sample space S is the set of outcomes of an experiment.

For x ∈ S, the probability of x, written Pr(x), is a real number between 0 and 1, such that ∑x∈S Pr(x) = 1.

An event is a subset V of the sample space S, with Pr(V) = ∑x∈V Pr(x).

A random variable (r.v.) Y is a function which maps each x ∈ S to Y(x) ∈ R. The probability of Y taking value y is Pr(Y = y) = ∑ Pr(x) over {x ∈ S s.t. Y(x) = y}.

The expected value (the mean) of Y is E(Y) = ∑y y · Pr(Y = y).

An indicator random variable is an r.v. that can only be 0 or 1. Fact: E(I) = Pr(I = 1).

THEOREM (union bound): Let V1, . . . , Vk be k events. Then Pr(V1 ∪ · · · ∪ Vk) ≤ Pr(V1) + · · · + Pr(Vk).

THEOREM (Markov’s inequality): If X is a non-negative r.v., then for all a > 0, Pr(X ≥ a) ≤ E(X)/a.

THEOREM (linearity of expectation): Let Y1, . . . , Yk be k random variables. Then E(Y1 + · · · + Yk) = E(Y1) + · · · + E(Yk).
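Linearity of expectation can be verified exactly on a concrete case; a short sketch using two fair dice (my own illustration, not from the deck):

```python
from fractions import Fraction

# Linearity of expectation on two fair dice: E(Y1 + Y2), computed
# directly over all 36 outcomes, equals E(Y1) + E(Y2). Note that
# linearity needs no independence assumption at all.
faces = range(1, 7)
e_one = sum(Fraction(v, 6) for v in faces)                      # E(Y1) = 7/2
e_sum = sum(Fraction(a + b, 36) for a in faces for b in faces)  # E(Y1 + Y2)
print(e_one, e_sum)  # 7/2 7
assert e_sum == e_one + e_one
```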