SLIDE 1
Advanced Algorithms COMS31900
Probability recap
Raphaël Clifford
Slides by Markus Jalsenius
Randomness and probability
SLIDE 2
SLIDE 8
Probability
The sample space S is the set of outcomes of an experiment.
EXAMPLES
Roll a die: S = {1, 2, 3, 4, 5, 6}.
Flip a coin: S = {H, T}.
Amount of money you can win when playing some lottery: S = {£0, £10, £100, £1000, £10,000, £100,000}.
For x ∈ S, the probability of x, written Pr(x), is a real number between 0 and 1, such that $\sum_{x \in S} \Pr(x) = 1$.
Pr is ‘just’ a function which maps each x ∈ S to Pr(x) ∈ [0, 1].
SLIDE 10
Probability
EXAMPLE
Roll a die: S = {1, 2, 3, 4, 5, 6}.
Pr(1) = Pr(2) = Pr(3) = Pr(4) = Pr(5) = Pr(6) = 1/6.
SLIDE 12
Probability
EXAMPLE
Flip a coin: S = {H, T}.
Pr(H) = Pr(T) = 1/2.
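The idea that Pr is 'just' a function from outcomes to [0, 1] can be sketched in Python as a dictionary. This is a minimal illustration only; the names `die`, `coin` and `is_distribution` are assumptions made for this sketch and do not come from the slides.

```python
from fractions import Fraction

# Each distribution maps an outcome x in S to Pr(x).
# Exact fractions are used so that the sum can be compared to 1 without rounding.
die = {x: Fraction(1, 6) for x in range(1, 7)}
coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}


def is_distribution(pr):
    """Check that every Pr(x) lies in [0, 1] and that the values sum to 1."""
    return all(0 <= p <= 1 for p in pr.values()) and sum(pr.values()) == 1


print(is_distribution(die), is_distribution(coin))
```

A dictionary that omits an outcome (for example only "H" with probability 1/2) fails the check, because its values no longer sum to 1.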
SLIDE 14
Probability
EXAMPLE
Amount of money you can win when playing some lottery: S = {£0, £10, £100, £1000, £10,000, £100,000}.
Pr(£0) = 0.9, Pr(£10) = 0.08, . . . , Pr(£100,000) = 0.0001.
SLIDE 21
Probability
The sample space is not necessarily finite.
EXAMPLE
Flip a coin until the first tail shows up:
S = {T, HT, HHT, HHHT, HHHHT, HHHHHT, . . . }.
$\Pr(\text{“It takes } n \text{ coin flips”}) = \left(\frac{1}{2}\right)^n$, and
$\sum_{n=1}^{\infty} \left(\frac{1}{2}\right)^n = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \cdots = 1$.
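To see why the probabilities over this infinite sample space sum to 1, one can look at exact partial sums. The sketch below is illustrative only; `partial_sum` is a hypothetical helper name.

```python
from fractions import Fraction


def partial_sum(N):
    """Exact value of sum_{n=1}^{N} (1/2)^n."""
    return sum(Fraction(1, 2) ** n for n in range(1, N + 1))


# The gap to 1 after N terms is exactly (1/2)^N, which shrinks to 0 as N grows.
for N in (1, 2, 4, 10):
    print(N, partial_sum(N), 1 - partial_sum(N))
```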
SLIDE 28
Event
An event is a subset V of the sample space S.
The probability of event V happening, denoted Pr(V), is
$\Pr(V) = \sum_{x \in V} \Pr(x)$.
EXAMPLE
Flip a coin 3 times: S = {TTT, TTH, THT, HTT, HHT, HTH, THH, HHH}.
For each x ∈ S, Pr(x) = 1/8.
Define V to be the event “the first and last coin flips are the same”, in other words, V = {HHH, HTH, THT, TTT}.
What is Pr(V)?
$\Pr(V) = \Pr(HHH) + \Pr(HTH) + \Pr(THT) + \Pr(TTT) = 4 \times \frac{1}{8} = \frac{1}{2}$.
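The event calculation can be checked by enumerating the sample space. This is a small sketch under the slide's assumption that all eight outcomes are equally likely; the helper name `pr_event` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# Sample space for three coin flips; each outcome has probability 1/8.
S = ["".join(flips) for flips in product("HT", repeat=3)]
pr = {x: Fraction(1, len(S)) for x in S}


def pr_event(V):
    """Pr(V) as the sum of Pr(x) over the outcomes x in the event V."""
    return sum(pr[x] for x in V)


# V: the first and last coin flips are the same.
V = {x for x in S if x[0] == x[-1]}
print(sorted(V), pr_event(V))
```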
SLIDE 37
Random variable
A random variable (r.v.) Y over sample space S is a function S → ℝ, i.e. it maps each outcome x ∈ S to some real number Y(x).
The probability of Y taking value y is
$\Pr(Y = y) = \sum_{\{x \in S \text{ s.t. } Y(x) = y\}} \Pr(x)$
(sum over all values of x such that Y(x) = y).
EXAMPLE
Two coin flips, with Y defined by:
S    Y
HH   2
HT   1
TH   5
TT   2
What is Pr(Y = 2)?
$\Pr(Y = 2) = \sum_{x \in \{HH, TT\}} \Pr(x) = \frac{1}{4} + \frac{1}{4} = \frac{1}{2}$.
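The definition of Pr(Y = y) amounts to grouping outcomes by their Y value and adding up their probabilities. A minimal sketch, assuming each of the four outcomes has probability 1/4; `distribution` is a hypothetical helper name.

```python
from collections import defaultdict
from fractions import Fraction

# Two fair coin flips: each outcome has probability 1/4.
pr = {x: Fraction(1, 4) for x in ("HH", "HT", "TH", "TT")}
# The random variable Y from the example, as a mapping S -> R.
Y = {"HH": 2, "HT": 1, "TH": 5, "TT": 2}


def distribution(Y, pr):
    """Return {y: Pr(Y = y)} by summing Pr(x) over all x with Y(x) = y."""
    dist = defaultdict(Fraction)
    for x, p in pr.items():
        dist[Y[x]] += p
    return dict(dist)


print(distribution(Y, pr))
```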
SLIDE 40
Random variable
The expected value (the mean) of a r.v. Y, denoted E(Y), is
$E(Y) = \sum_{y} y \cdot \Pr(Y = y)$.
EXAMPLE
Two coin flips, with Y(HH) = 2, Y(HT) = 1, Y(TH) = 5, Y(TT) = 2, so Pr(Y = 2) = 1/2 and Pr(Y = 1) = Pr(Y = 5) = 1/4.
$E(Y) = \left(2 \cdot \frac{1}{2}\right) + \left(1 \cdot \frac{1}{4}\right) + \left(5 \cdot \frac{1}{4}\right) = \frac{5}{2}$.
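The expected value in this example can be recomputed directly. The sketch below sums Y(x) · Pr(x) over outcomes, which gives the same result as summing y · Pr(Y = y) over values; the function name `expectation` is an assumption for illustration.

```python
from fractions import Fraction

# Two fair coin flips: each outcome has probability 1/4.
pr = {x: Fraction(1, 4) for x in ("HH", "HT", "TH", "TT")}
Y = {"HH": 2, "HT": 1, "TH": 5, "TT": 2}


def expectation(Y, pr):
    """E(Y) computed as the sum over outcomes x of Y(x) * Pr(x)."""
    return sum(Y[x] * p for x, p in pr.items())


print(expectation(Y, pr))
```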
SLIDE 42
Linearity of expectation
THEOREM (Linearity of expectation)
Let $Y_1, Y_2, \ldots, Y_k$ be k random variables. Then
$E\left(\sum_{i=1}^{k} Y_i\right) = \sum_{i=1}^{k} E(Y_i)$.
Linearity of expectation always holds (regardless of whether the random variables are independent or not).
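That independence is not needed can be illustrated with two clearly dependent random variables on a single die roll. The choice of Y1 (the face value) and Y2 (its square) is a hypothetical example, not taken from the slides.

```python
from fractions import Fraction

# One fair die roll; Y2 is completely determined by Y1, so they are dependent.
pr = {x: Fraction(1, 6) for x in range(1, 7)}


def expectation(Y):
    """E(Y) for a function Y on the sample space of a single die roll."""
    return sum(Y(x) * p for x, p in pr.items())


def Y1(x):
    return x


def Y2(x):
    return x * x


lhs = expectation(lambda x: Y1(x) + Y2(x))   # E(Y1 + Y2)
rhs = expectation(Y1) + expectation(Y2)      # E(Y1) + E(Y2)
print(lhs, rhs)
```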
SLIDE 51
Linearity of expectation
EXAMPLE
Roll two dice. Let the r.v. Y be the sum of the values. What is E(Y)?
Approach 1: (without the theorem)
The sample space S = {(1, 1), (1, 2), (1, 3), . . . , (6, 6)} (36 outcomes).
$E(Y) = \sum_{x \in S} Y(x) \cdot \Pr(x) = \frac{1}{36} \sum_{x \in S} Y(x) = \frac{1}{36}(1 \cdot 2 + 2 \cdot 3 + 3 \cdot 4 + \cdots + 1 \cdot 12) = 7$.
SLIDE 57
Linearity of expectation
EXAMPLE
Roll two dice. Let the r.v. Y be the sum of the values. What is E(Y)?
Approach 2: (with the theorem)
Let the r.v. Y1 be the value of the first die and Y2 the value of the second.
E(Y1) = E(Y2) = 3.5,
so E(Y) = E(Y1 + Y2) = E(Y1) + E(Y2) = 7.
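Both approaches to the two-dice example can be compared numerically. This is a sketch assuming fair dice; the variable names `approach_1`, `e_single` and `approach_2` are chosen here for illustration.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Approach 1: sum Y(x) * Pr(x) over all 36 outcomes (a, b), where Y(x) = a + b.
approach_1 = sum(Fraction(a + b, 36) for a, b in product(faces, repeat=2))

# Approach 2: compute the expectation of one die, then add it to itself.
e_single = sum(Fraction(v, 6) for v in faces)
approach_2 = e_single + e_single

print(approach_1, e_single, approach_2)
```

The second approach needs 6 terms instead of 36, and the saving grows quickly with the number of dice.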
SLIDE 59
Indicator random variables
An indicator random variable is a r.v. that can only be 0 or 1 (usually referred to by the letter I).
Fact: E(I) = 0 · Pr(I = 0) + 1 · Pr(I = 1) = Pr(I = 1).
SLIDE 62
Indicator random variables
Often an indicator r.v. I is associated with an event such that I = 1 if the event happens (and I = 0 otherwise).
Indicator random variables and linearity of expectation work great together!
SLIDE 67
Indicator random variables
EXAMPLE
Roll a die n times. What is the expected number of rolls that show a value that is at least the value of the previous roll?
For j ∈ {2, . . . , n}, let indicator r.v. Ij = 1 if the value of the jth roll is at least the value of the previous roll (and Ij = 0 otherwise).
Pr(Ij = 1) = 21/36 = 7/12 (by counting the outcomes).
$E\left(\sum_{j=2}^{n} I_j\right) = \sum_{j=2}^{n} E(I_j) = \sum_{j=2}^{n} \Pr(I_j = 1) = (n - 1) \cdot \frac{7}{12}$.
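For small n the formula (n − 1) · 7/12 can be checked by brute force over all 6^n outcomes. The sketch below is only feasible for small n and assumes fair, independent rolls; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def expected_non_decreasing_steps(n):
    """Exact expected number of rolls j >= 2 whose value is at least that of
    roll j - 1, obtained by enumerating all 6^n equally likely outcomes."""
    total = 0
    for rolls in product(range(1, 7), repeat=n):
        total += sum(1 for prev, cur in zip(rolls, rolls[1:]) if cur >= prev)
    return Fraction(total, 6 ** n)


for n in (2, 3, 4):
    print(n, expected_non_decreasing_steps(n), (n - 1) * Fraction(7, 12))
```

Consecutive indicators share a roll and so are not independent, yet the enumeration agrees with the value given by linearity of expectation.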
SLIDE 78
Markov’s inequality
EXAMPLE
Suppose that the average (mean) speed on the motorway is 60 mph. It then follows that at most 1/2 of all cars drive at least 120 mph, and at most 2/3 of all cars drive at least 90 mph, . . . otherwise the mean must be higher than 60 mph (a contradiction).
THEOREM (Markov’s inequality)
If X is a non-negative r.v., then for all a > 0,
$\Pr(X \ge a) \le \frac{E(X)}{a}$.
EXAMPLE
From the example above:
$\Pr(\text{speed of a random car} \ge 120\text{ mph}) \le \frac{60}{120} = \frac{1}{2}$,
$\Pr(\text{speed of a random car} \ge 90\text{ mph}) \le \frac{60}{90} = \frac{2}{3}$.
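Markov's inequality can be checked against a concrete distribution. The speed distribution below is entirely hypothetical (it was picked only so that the mean is 60 mph), and the helper names are assumptions for this sketch.

```python
from fractions import Fraction

# Hypothetical distribution of speeds in mph (speed -> probability), mean 60.
speeds = {0: Fraction(1, 4), 60: Fraction(1, 4), 90: Fraction(1, 2)}


def mean(dist):
    """E(X) for a finite distribution given as {value: probability}."""
    return sum(v * p for v, p in dist.items())


def tail(dist, a):
    """Pr(X >= a)."""
    return sum(p for v, p in dist.items() if v >= a)


def markov_bound(dist, a):
    """The upper bound E(X) / a from Markov's inequality (requires a > 0)."""
    return mean(dist) / a


for a in (90, 120):
    print(a, tail(dist=speeds, a=a), markov_bound(speeds, a))
```

The true tail probability is never above the bound, but it can be well below it, which is why the bound is described as an upper limit rather than an estimate.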
SLIDE 89
Markov’s inequality
EXAMPLE
n people go to a party, leaving their hats at the door. Each person leaves with a random hat. How many people leave with their own hat?
For j ∈ {1, . . . , n}, let indicator r.v. Ij = 1 if the jth person gets their own hat, otherwise Ij = 0.
By linearity of expectation and the fact E(I) = Pr(I = 1),
$E\left(\sum_{j=1}^{n} I_j\right) = \sum_{j=1}^{n} E(I_j) = \sum_{j=1}^{n} \Pr(I_j = 1) = n \cdot \frac{1}{n} = 1$.
By Markov’s inequality (recall: Pr(X ≥ a) ≤ E(X)/a),
Pr(5 or more people leaving with their own hats) ≤ 1/5,
Pr(at least 1 person leaving with their own hat) ≤ 1/1 = 1.
(Sometimes Markov’s inequality is not particularly informative.)
In fact, here it can be shown that as n → ∞, the probability that at least one person leaves with their own hat tends to 1 − 1/e ≈ 0.632.
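The hat example can be explored exactly for a small party by enumerating every assignment of hats. The sketch assumes that "a random hat" means a uniformly random permutation of the n hats, and it fixes n = 6 purely for illustration.

```python
from fractions import Fraction
from itertools import permutations
from math import e


def own_hat_counts(n):
    """For each of the n! equally likely hat assignments, count how many
    people end up with their own hat (person i owns hat i)."""
    return [sum(1 for i, h in enumerate(perm) if i == h)
            for perm in permutations(range(n))]


n = 6
counts = own_hat_counts(n)
expected = Fraction(sum(counts), len(counts))
pr_at_least_1 = Fraction(sum(1 for c in counts if c >= 1), len(counts))
pr_at_least_5 = Fraction(sum(1 for c in counts if c >= 5), len(counts))

print(expected, pr_at_least_5, float(pr_at_least_1), 1 - 1 / e)
```

Already for n = 6 the probability of at least one match is close to 1 − 1/e, far below the bound of 1 that Markov's inequality gives.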
SLIDE 90
Markov’s inequality
COROLLARY
If X is a non-negative r.v. that only takes integer values, then
$\Pr(X > 0) = \Pr(X \ge 1) \le E(X)$.
For an indicator r.v. I, the bound is tight (=), as Pr(I > 0) = E(I).
SLIDE 92
Union bound
THEOREM (union bound)
Let $V_1, \ldots, V_k$ be k events. Then
$\Pr\left(\bigcup_{i=1}^{k} V_i\right) \le \sum_{i=1}^{k} \Pr(V_i)$.
The left-hand side is the probability that at least one of the events happens.
SLIDE 93
Union bound
Let V1, . . . , Vk be k events. Then
Pr
- k
- i=1
Vi
- ≤
k
- i=1
Pr(Vi).
THEOREM (union bound)
SLIDE 94
Union bound
Let V1, . . . , Vk be k events. Then
Pr
- k
- i=1
Vi
- ≤
k
- i=1
Pr(Vi).
This bound is tight (=) when the events are all disjoint.
THEOREM (union bound)
(Vi and Vj are disjoint iff Vi ∩ Vj is empty)
SLIDE 95
Union bound
Let V1, . . . , Vk be k events. Then
Pr
- k
- i=1
Vi
- ≤
k
- i=1
Pr(Vi).
This bound is tight (=) when the events are all disjoint.
THEOREM (union bound) PROOF
(Vi and Vj are disjoint iff Vi ∩ Vj is empty)
SLIDE 96
Union bound
Let V1, . . . , Vk be k events. Then
Pr
- k
- i=1
Vi
- ≤
k
- i=1
Pr(Vi).
This bound is tight (=) when the events are all disjoint. Define indicator r.v. Ij to be 1 if event Vj happens, otherwise Ij = 0.
THEOREM (union bound) PROOF
(Vi and Vj are disjoint iff Vi ∩ Vj is empty)
SLIDE 97
Union bound
Let V1, . . . , Vk be k events. Then
Pr
- k
- i=1
Vi
- ≤
k
- i=1
Pr(Vi).
This bound is tight (=) when the events are all disjoint. Define indicator r.v. Ij to be 1 if event Vj happens, otherwise Ij = 0.
THEOREM (union bound) PROOF
(Vi and Vj are disjoint iff Vi ∩ Vj is empty) Let the r.v. X = k j=1 Ij be the number of events that happen.
SLIDE 98
Union bound
THEOREM (union bound)
Let $V_1, \dots, V_k$ be $k$ events. Then
\[ \Pr\left(\bigcup_{i=1}^{k} V_i\right) \le \sum_{i=1}^{k} \Pr(V_i). \]
This bound is tight (=) when the events are all disjoint.
($V_i$ and $V_j$ are disjoint iff $V_i \cap V_j$ is empty)
PROOF
Define indicator r.v. $I_j$ to be 1 if event $V_j$ happens, otherwise $I_j = 0$.
Let the r.v. $X = \sum_{j=1}^{k} I_j$ be the number of events that happen.
\begin{align*}
\Pr\left(\bigcup_{j=1}^{k} V_j\right) &= \Pr(X > 0) \\
&\le E(X) \\
&= E\left(\sum_{j=1}^{k} I_j\right) = \sum_{j=1}^{k} E(I_j) \\
&= \sum_{j=1}^{k} \Pr(V_j)
\end{align*}
SLIDE 99
Union bound
THEOREM (union bound)
Let $V_1, \dots, V_k$ be $k$ events. Then
\[ \Pr\left(\bigcup_{i=1}^{k} V_i\right) \le \sum_{i=1}^{k} \Pr(V_i). \]
This bound is tight (=) when the events are all disjoint.
($V_i$ and $V_j$ are disjoint iff $V_i \cap V_j$ is empty)
PROOF
Define indicator r.v. $I_j$ to be 1 if event $V_j$ happens, otherwise $I_j = 0$.
Let the r.v. $X = \sum_{j=1}^{k} I_j$ be the number of events that happen.
\begin{align*}
\Pr\left(\bigcup_{j=1}^{k} V_j\right) &= \Pr(X > 0) \\
&\le E(X) && \text{(by previous Markov corollary)} \\
&= E\left(\sum_{j=1}^{k} I_j\right) = \sum_{j=1}^{k} E(I_j) \\
&= \sum_{j=1}^{k} \Pr(V_j)
\end{align*}
SLIDE 100
Union bound
THEOREM (union bound)
Let $V_1, \dots, V_k$ be $k$ events. Then
\[ \Pr\left(\bigcup_{i=1}^{k} V_i\right) \le \sum_{i=1}^{k} \Pr(V_i). \]
This bound is tight (=) when the events are all disjoint.
($V_i$ and $V_j$ are disjoint iff $V_i \cap V_j$ is empty)
PROOF
Define indicator r.v. $I_j$ to be 1 if event $V_j$ happens, otherwise $I_j = 0$.
Let the r.v. $X = \sum_{j=1}^{k} I_j$ be the number of events that happen.
\begin{align*}
\Pr\left(\bigcup_{j=1}^{k} V_j\right) &= \Pr(X > 0) \\
&\le E(X) && \text{(by previous Markov corollary)} \\
&= E\left(\sum_{j=1}^{k} I_j\right) = \sum_{j=1}^{k} E(I_j) && \text{(linearity of expectation)} \\
&= \sum_{j=1}^{k} \Pr(V_j)
\end{align*}
SLIDE 101
Union bound
THEOREM (union bound)
Let $V_1, \dots, V_k$ be $k$ events. Then
\[ \Pr\left(\bigcup_{i=1}^{k} V_i\right) \le \sum_{i=1}^{k} \Pr(V_i). \]
This bound is tight (=) when the events are all disjoint.
($V_i$ and $V_j$ are disjoint iff $V_i \cap V_j$ is empty)
PROOF
Define indicator r.v. $I_j$ to be 1 if event $V_j$ happens, otherwise $I_j = 0$.
Let the r.v. $X = \sum_{j=1}^{k} I_j$ be the number of events that happen.
\begin{align*}
\Pr\left(\bigcup_{j=1}^{k} V_j\right) &= \Pr(X > 0) \\
&\le E(X) && \text{(by previous Markov corollary)} \\
&= E\left(\sum_{j=1}^{k} I_j\right) = \sum_{j=1}^{k} E(I_j) && \text{(linearity of expectation)} \\
&= \sum_{j=1}^{k} \Pr(V_j) && \text{($E(I) = \Pr(I = 1)$)}
\end{align*}
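Each step of the proof can be traced on a concrete sample space. The sketch below assumes a fair six-sided die and three hypothetical events (`events` is made up for illustration), and computes every quantity in the chain exactly.

```python
from fractions import Fraction

S = range(1, 7)                      # outcomes of a fair die (assumption)
pr = {x: Fraction(1, 6) for x in S}  # uniform probability on S
events = [{2, 4, 6}, {1, 2}, {6}]    # hypothetical events V_1, V_2, V_3


def indicator(event, x):
    """I_j(x): 1 if the outcome x lies in the event, otherwise 0."""
    return 1 if x in event else 0


def X(x):
    """Number of events that happen when the outcome is x."""
    return sum(indicator(V, x) for V in events)


pr_union = sum(pr[x] for x in S if any(x in V for V in events))
pr_x_positive = sum(pr[x] for x in S if X(x) > 0)
e_x = sum(X(x) * pr[x] for x in S)
sum_pr = sum(sum(pr[x] for x in V) for V in events)

print(pr_union, "=", pr_x_positive, "<=", e_x, "=", sum_pr)
```

The first equality holds because at least one event happens exactly when X > 0, and the last because the expectation of each indicator is the probability of its event.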
SLIDE 103
Union bound
THEOREM (union bound)
Let $V_1, \dots, V_k$ be $k$ events. Then
\[ \Pr\left(\bigcup_{i=1}^{k} V_i\right) \le \sum_{i=1}^{k} \Pr(V_i). \]
This bound is tight (=) when the events are all disjoint.
($V_i$ and $V_j$ are disjoint iff $V_i \cap V_j$ is empty)
EXAMPLE
$S = \{1, \dots, 6\}$ is the set of outcomes of a die roll.
SLIDE 104
Union bound
THEOREM (union bound)
Let $V_1, \dots, V_k$ be $k$ events. Then
\[ \Pr\left(\bigcup_{i=1}^{k} V_i\right) \le \sum_{i=1}^{k} \Pr(V_i). \]
This bound is tight (=) when the events are all disjoint.
($V_i$ and $V_j$ are disjoint iff $V_i \cap V_j$ is empty)
EXAMPLE
$S = \{1, \dots, 6\}$ is the set of outcomes of a die roll.
We define two events: $V_1 = \{3, 4\}$ and $V_2 = \{1, 2, 3\}$.
SLIDE 105
Union bound
THEOREM (union bound)
Let $V_1, \dots, V_k$ be $k$ events. Then
\[ \Pr\left(\bigcup_{i=1}^{k} V_i\right) \le \sum_{i=1}^{k} \Pr(V_i). \]
This bound is tight (=) when the events are all disjoint.
($V_i$ and $V_j$ are disjoint iff $V_i \cap V_j$ is empty)
EXAMPLE
$S = \{1, \dots, 6\}$ is the set of outcomes of a die roll.
We define two events: $V_1 = \{3, 4\}$ and $V_2 = \{1, 2, 3\}$.
\[ \Pr(V_1 \cup V_2) \le \Pr(V_1) + \Pr(V_2) = \frac{1}{3} + \frac{1}{2} = \frac{5}{6} \]
SLIDE 106
Union bound
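THEOREM (union bound)
Let $V_1, \dots, V_k$ be $k$ events. Then
\[ \Pr\left(\bigcup_{i=1}^{k} V_i\right) \le \sum_{i=1}^{k} \Pr(V_i). \]
This bound is tight (=) when the events are all disjoint.
($V_i$ and $V_j$ are disjoint iff $V_i \cap V_j$ is empty)
EXAMPLE
$S = \{1, \dots, 6\}$ is the set of outcomes of a die roll.
We define two events: $V_1 = \{3, 4\}$ and $V_2 = \{1, 2, 3\}$.
[Figure: Venn diagram of $S$ showing the events $V_1$ and $V_2$, which overlap in the outcome 3.]
\[ \Pr(V_1 \cup V_2) \le \Pr(V_1) + \Pr(V_2) = \frac{1}{3} + \frac{1}{2} = \frac{5}{6} \]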
SLIDE 107
Union bound
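THEOREM (union bound)
Let $V_1, \dots, V_k$ be $k$ events. Then
\[ \Pr\left(\bigcup_{i=1}^{k} V_i\right) \le \sum_{i=1}^{k} \Pr(V_i). \]
This bound is tight (=) when the events are all disjoint.
($V_i$ and $V_j$ are disjoint iff $V_i \cap V_j$ is empty)
EXAMPLE
$S = \{1, \dots, 6\}$ is the set of outcomes of a die roll.
We define two events: $V_1 = \{3, 4\}$ and $V_2 = \{1, 2, 3\}$.
[Figure: Venn diagram of $S$ showing the events $V_1$ and $V_2$, which overlap in the outcome 3.]
\[ \Pr(V_1 \cup V_2) \le \Pr(V_1) + \Pr(V_2) = \frac{1}{3} + \frac{1}{2} = \frac{5}{6} \]
In fact, $\Pr(V_1 \cup V_2) = \frac{2}{3}$.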
SLIDE 108
Union bound
THEOREM (union bound)
Let $V_1, \dots, V_k$ be $k$ events. Then
\[ \Pr\left(\bigcup_{i=1}^{k} V_i\right) \le \sum_{i=1}^{k} \Pr(V_i). \]
This bound is tight (=) when the events are all disjoint.
($V_i$ and $V_j$ are disjoint iff $V_i \cap V_j$ is empty)
EXAMPLE
$S = \{1, \dots, 6\}$ is the set of outcomes of a die roll.
We define two events: $V_1 = \{3, 4\}$ and $V_2 = \{1, 2, 3\}$.
[Figure: Venn diagram of $S$ showing the events $V_1$ and $V_2$, which overlap in the outcome 3.]
\[ \Pr(V_1 \cup V_2) \le \Pr(V_1) + \Pr(V_2) = \frac{1}{3} + \frac{1}{2} = \frac{5}{6} \]
In fact, $\Pr(V_1 \cup V_2) = \frac{2}{3}$ (3 was ‘double counted’).
SLIDE 109
Union bound
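THEOREM (union bound)
Let $V_1, \dots, V_k$ be $k$ events. Then
\[ \Pr\left(\bigcup_{i=1}^{k} V_i\right) \le \sum_{i=1}^{k} \Pr(V_i). \]
This bound is tight (=) when the events are all disjoint.
($V_i$ and $V_j$ are disjoint iff $V_i \cap V_j$ is empty)
EXAMPLE
$S = \{1, \dots, 6\}$ is the set of outcomes of a die roll.
We define two events: $V_1 = \{3, 4\}$ and $V_2 = \{1, 2, 3\}$.
[Figure: Venn diagram of $S$ showing the events $V_1$ and $V_2$, which overlap in the outcome 3.]
\[ \Pr(V_1 \cup V_2) \le \Pr(V_1) + \Pr(V_2) = \frac{1}{3} + \frac{1}{2} = \frac{5}{6} \]
In fact, $\Pr(V_1 \cup V_2) = \frac{2}{3}$ (3 was ‘double counted’).
Typically the union bound is used when each $\Pr(V_i)$ is much smaller than $1/k$, so that the sum of the $k$ probabilities is still well below 1.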
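The die example, and the claim that the bound is tight for disjoint events, can be reproduced with exact fractions. This is a minimal sketch assuming a fair die; the helper names `pr` and `union_bound` and the disjoint pair of events are illustrative additions, not taken from the slides.

```python
from fractions import Fraction


def pr(event, sample_space):
    """Probability of an event when all outcomes are equally likely (assumption)."""
    return Fraction(len(event & sample_space), len(sample_space))


def union_bound(events, sample_space):
    """Right-hand side of the union bound: the sum of the individual probabilities."""
    return sum(pr(V, sample_space) for V in events)


S = set(range(1, 7))
overlapping = [{3, 4}, {1, 2, 3}]  # V_1 and V_2 from the example
disjoint = [{3, 4}, {1, 2}]        # hypothetical disjoint pair for comparison

for events in (overlapping, disjoint):
    exact = pr(set().union(*events), S)
    print(exact, "<=", union_bound(events, S))
```

For the overlapping pair the bound 5/6 exceeds the exact value 2/3 by Pr({3}) = 1/6, which is the double-counted outcome. For the disjoint pair both sides are equal.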
SLIDE 110
Summary
The sample space $S$ is the set of outcomes of an experiment.
For $x \in S$, the probability of $x$, written $\Pr(x)$, is a real number between 0 and 1, such that $\sum_{x \in S} \Pr(x) = 1$.
An event is a subset $V$ of the sample space $S$, with $\Pr(V) = \sum_{x \in V} \Pr(x)$.
A random variable (r.v.) $Y$ is a function which maps $x \in S$ to $Y(x) \in \mathbb{R}$.
The probability of $Y$ taking value $y$ is
\[ \Pr(Y = y) = \sum_{\{x \in S \text{ st. } Y(x) = y\}} \Pr(x). \]
The expected value (the mean) of $Y$ is
\[ E(Y) = \sum_{x \in S} Y(x) \cdot \Pr(x). \]
An indicator random variable is a r.v. that can only be 0 or 1. Fact: $E(I) = \Pr(I = 1)$.
THEOREM (union bound)
Let $V_1, \dots, V_k$ be $k$ events. Then
\[ \Pr\left(\bigcup_{i=1}^{k} V_i\right) \le \sum_{i=1}^{k} \Pr(V_i). \]
THEOREM (Markov’s inequality)
If $X$ is a non-negative r.v., then for all $a > 0$,
\[ \Pr(X \ge a) \le \frac{E(X)}{a}. \]
THEOREM (Linearity of expectation)
Let $Y_1, Y_2, \dots, Y_k$ be $k$ random variables. Then
\[ E\left(\sum_{i=1}^{k} Y_i\right) = \sum_{i=1}^{k} E(Y_i). \]
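The summary definitions map directly onto a few lines of code. The sketch below assumes a fair die as the sample space; the random variables `face`, `is_odd` and `total` are hypothetical examples chosen to exercise linearity of expectation, the indicator fact and Markov's inequality.

```python
from fractions import Fraction

S = range(1, 7)                      # sample space: outcomes of a fair die (assumption)
pr = {x: Fraction(1, 6) for x in S}  # Pr(x) for each outcome, summing to 1


def expected(Y):
    """E(Y) = sum over x in S of Y(x) * Pr(x)."""
    return sum(Y(x) * pr[x] for x in S)


def prob(predicate):
    """Probability of the event {x in S : predicate(x)}."""
    return sum(pr[x] for x in S if predicate(x))


def face(x):
    """The r.v. giving the number shown on the die."""
    return x


def is_odd(x):
    """Indicator r.v.: 1 if the outcome is odd, otherwise 0."""
    return x % 2


def total(x):
    """Sum of the two r.v.s above, evaluated on the same outcome."""
    return face(x) + is_odd(x)


print("E(face) + E(is_odd) =", expected(face) + expected(is_odd))
print("E(face + is_odd)    =", expected(total))
print("Pr(face >= 5) =", prob(lambda x: face(x) >= 5), "<=", expected(face) / 5)
```

Note that `face` and `is_odd` are far from independent, yet the two expectations still add, which is the point of linearity of expectation.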