Probability Review
Gonzalo Mateos
Dept. of ECE and Goergen Institute for Data Science
University of Rochester
gmateosb@ece.rochester.edu
http://www.ece.rochester.edu/~gmateosb/
September 16, 2020
◮ RV X with E[|X|] < ∞, constant a > 0
◮ Markov's inequality states ⇒ P(|X| ≥ a) ≤ E[|X|]/a (see the numerical check below)
◮ Proof: the indicator I{|X| ≥ a} = 1 when |X| ≥ a and 0 otherwise, hence a·I{|X| ≥ a} ≤ |X|
◮ Use linearity of expected value ⇒ a·E[I{|X| ≥ a}] ≤ E[|X|]
◮ Indicator function's expectation = Probability of indicated event ⇒ P(|X| ≥ a) ≤ E[|X|]/a
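A minimal numerical sanity check of Markov's inequality. The exponential distribution (mean 1), sample size, and seed are assumptions made only for illustration; any RV with E[|X|] < ∞ would do.

```python
import numpy as np

# Monte Carlo check of Markov's inequality P(|X| >= a) <= E[|X|]/a.
# Illustrative assumption: X ~ Exponential(mean = 1).
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=1_000_000)

for a in (0.5, 1.0, 2.0, 4.0):
    empirical = np.mean(np.abs(x) >= a)      # estimate of P(|X| >= a)
    bound = np.mean(np.abs(x)) / a           # Markov bound E[|X|]/a
    print(f"a={a:3.1f}  P(|X|>=a) ~ {empirical:.4f}  bound = {bound:.4f}")
```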
◮ RV X with E[X] = µ and E[(X − µ)²] = σ² < ∞, constant k > 0
◮ Chebyshev's inequality states ⇒ P(|X − µ| ≥ k) ≤ σ²/k² (numerical check below)
◮ Proof: apply Markov's inequality to the RV Z = (X − µ)² with constant a = k²
◮ Notice that (X − µ)² ≥ k² if and only if |X − µ| ≥ k, thus P(|X − µ| ≥ k) = P((X − µ)² ≥ k²) ≤ E[(X − µ)²]/k²
◮ Chebyshev's inequality follows from the definition of variance E[(X − µ)²] = σ²
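A quick empirical check of the Chebyshev bound; the standard normal distribution and sample size are arbitrary illustrative choices.

```python
import numpy as np

# Monte Carlo check of Chebyshev's inequality P(|X - mu| >= k) <= sigma^2/k^2.
# Illustrative assumption: X ~ N(0, 1), so mu = 0 and sigma^2 = 1.
rng = np.random.default_rng(1)
x = rng.normal(loc=0.0, scale=1.0, size=1_000_000)
mu, var = x.mean(), x.var()

for k in (1.0, 2.0, 3.0):
    empirical = np.mean(np.abs(x - mu) >= k)
    print(f"k={k:.0f}  P(|X-mu|>=k) ~ {empirical:.4f}  Chebyshev bound = {var / k**2:.4f}")
```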
◮ If the absolute expected value is finite, i.e., E[|X|] < ∞, Markov's inequality applies
◮ If the mean E[X] = µ and variance E[(X − µ)²] = σ² are finite, Chebyshev's inequality applies
◮ Most cdf tails decrease exponentially (e.g., e^{−x²} for the normal), so the bounds are often loose
◮ Markov's inequality is often derived for nonnegative RVs X ≥ 0, in which case P(X ≥ a) ≤ E[X]/a
◮ Sequence of RVs XN = X1, X2, . . . , Xn, . . . ⇒ what is meant by lim n→∞ Xn?
◮ Translate what we know about regular limits to definitions for RVs
◮ Can start from convergence of sequences of numbers: lim n→∞ xn
◮ Or from convergence of probabilities: lim n→∞ P(Xn ∈ X)
◮ Denote a sequence of numbers xN = x1, x2, . . . , xn, . . .
◮ Def: Sequence xN converges to the value x if, given any ε > 0, there exists n0 such that |xn − x| < ε for all n ≥ n0
◮ Sequence xn comes arbitrarily close to its limit ⇒ |xn − x| < ε
◮ Random process (sequence of RVs) XN = X1, X2, . . . , Xn, . . .
◮ Def: We say XN converges surely to RV X if lim n→∞ xn = x for all realizations xN of XN
◮ Said differently, lim n→∞ Xn(s) = X(s) for all s ∈ S
◮ Not really adequate. Even a single (practically unimportant) outcome that fails to converge prevents sure convergence
◮ RV X and random process XN = X1, X2, . . . , Xn, . . .
◮ Def: We say XN converges almost surely to RV X if P( lim n→∞ Xn = X ) = 1
◮ Almost sure convergence denoted as ⇒ lim n→∞ Xn = X a.s.
◮ X0 ∼ N(0, 1) (normal, mean 0, variance 1), Zn a sequence of Bernoulli RVs with parameter p
◮ Define ⇒ Xn = X0 − Zn/n
◮ The disturbance Zn/n ≤ 1/n vanishes for every realization, hence lim n→∞ Xn = X0 a.s. (also surely)
[Figure: sample path of Xn for n = 1, . . . , 100]
◮ Consider S = [0, 1] and let P (·) be the uniform probability distribution
◮ Define the RVs Xn(s) = s + s^n and X(s) = s
◮ For all s ∈ [0, 1) ⇒ s^n → 0 as n → ∞, hence Xn(s) → s = X(s)
◮ For s = 1 ⇒ Xn(1) = 2 for all n, while X(1) = 1
◮ Convergence only occurs on the set [0, 1), and P([0, 1)) = 1
◮ Therefore ⇒ lim n→∞ Xn = X a.s. (but not surely; see the numerical sketch below)
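A small sketch that simply tabulates Xn(s) = s + s^n for a few outcomes s and growing n, showing convergence to X(s) = s on [0, 1) and the failure at s = 1; the chosen grid of outcomes is an illustrative assumption.

```python
import numpy as np

# Evaluate Xn(s) = s + s**n for several outcomes s and increasing n.
# Convergence to X(s) = s holds for s in [0, 1); it fails only at s = 1,
# an outcome of probability zero under the uniform distribution on [0, 1].
outcomes = np.array([0.0, 0.5, 0.9, 0.99, 1.0])

for n in (1, 10, 100, 1000):
    xn = outcomes + outcomes**n
    print(f"n={n:5d}  Xn(s) = {np.array2string(xn, precision=4)}")

# |Xn(s) - X(s)| = s**n -> 0 for every s < 1, while Xn(1) = 2 != 1 = X(1).
```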
◮ Def: We say XN converges in probability to RV X if, for any ε > 0, lim n→∞ P(|Xn − X| < ε) = 1
◮ Statement is about probabilities, not about realizations (sequences)
◮ If lim n→∞ Xn = X almost surely, then for any ε > 0 there is n0 such that |Xn − X| < ε for all n ≥ n0
◮ True for almost all sequences, so P(|Xn − X| < ε) → 1 ⇒ almost sure convergence implies convergence in probability
◮ X0 ∼ N(0, 1) (normal, mean 0, variance 1), Zn a sequence of Bernoulli RVs with parameter 1/n
◮ Define ⇒ Xn = X0 − Zn
◮ Xn converges in probability to X0 because, for any ε < 1, P(|Xn − X0| < ε) = P(Zn = 0) = 1 − 1/n → 1 (see the simulation below)
◮ Plot of path xn up to n = 10², n = 10³, n = 10⁴
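A minimal Monte Carlo sketch of this example: it estimates P(|Xn − X0| < ε) for a few values of n and compares it to 1 − 1/n; the sample size and ε = 0.5 are illustrative assumptions.

```python
import numpy as np

# Estimate P(|Xn - X0| < eps) for Xn = X0 - Zn, Zn ~ Bernoulli(1/n), X0 ~ N(0,1).
# Since |Xn - X0| = Zn, for eps < 1 this probability equals P(Zn = 0) = 1 - 1/n.
rng = np.random.default_rng(2)
trials, eps = 200_000, 0.5

for n in (10, 100, 1000):
    x0 = rng.normal(size=trials)
    zn = rng.binomial(n=1, p=1.0 / n, size=trials)
    xn = x0 - zn
    est = np.mean(np.abs(xn - x0) < eps)
    print(f"n={n:4d}  P(|Xn-X0|<{eps}) ~ {est:.4f}   (1 - 1/n = {1 - 1/n:.4f})")
```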
◮ Almost sure convergence implies that almost all sequences converge
◮ Convergence in probability does not imply convergence of sequences
◮ Latter example: Xn = X0 − Zn, Zn is Bernoulli with parameter 1/n ⇒ for almost all realizations, lim n→∞ xn does not exist
◮ Almost sure convergence ⇒ disturbances stop happening
◮ Convergence in prob. ⇒ disturbances happen with vanishing frequency
◮ Difference is not irrelevant
◮ Interpret Zn as rate of change in savings
◮ With a.s. convergence risk is eliminated
◮ With convergence in prob. risk decreases but does not disappear
◮ Def: We say XN converges in mean square to RV X if lim n→∞ E[(Xn − X)²] = 0
◮ Mean square convergence implies convergence in probability
◮ From Markov's inequality ⇒ P(|Xn − X| ≥ ε) = P((Xn − X)² ≥ ε²) ≤ E[(Xn − X)²]/ε²
◮ If Xn → X in mean-square sense, E[(Xn − X)²] → 0 and so does P(|Xn − X| ≥ ε)
◮ Almost sure and mean square ⇒ neither one implies the other
◮ Consider a random process XN. The cdf of Xn is Fn(x)
◮ Def: We say XN converges in distribution to RV X with cdf FX(x) if lim n→∞ Fn(x) = FX(x) for all x at which FX(x) is continuous
◮ No claim about individual sequences, just the cdf of Xn
◮ Implied by almost sure, in probability, and mean square convergence
◮ Yn ∼ N(0, 1), Zn Bernoulli with parameter p
◮ Define ⇒ Xn = Yn − 10Zn/n
◮ Since the effect of 10Zn/n vanishes ⇒ lim n→∞ Fn(x) “=” N(0, 1)
[Figure: sample path of Xn = Yn − 10Zn/n for n = 1, . . . , 100]
◮ Individual sequences xn do not converge in any sense
[Figure: pdf of Xn for increasing n]
◮ As the effect of 10Zn/n vanishes, the pdf of Xn converges to the pdf of Yn (empirical-cdf check below)
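A short simulation sketch for this example: it compares the empirical cdf of Xn = Yn − 10Zn/n against the N(0, 1) cdf at a few points. The Bernoulli parameter p = 0.5, sample size, and evaluation points are illustrative assumptions, and std_normal_cdf is a small helper defined here.

```python
import numpy as np
from math import erf, sqrt

def std_normal_cdf(x: float) -> float:
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# Empirical cdf of Xn = Yn - 10*Zn/n versus the N(0,1) cdf it converges to.
rng = np.random.default_rng(3)
trials, p = 200_000, 0.5

for n in (1, 10, 1000):
    yn = rng.normal(size=trials)
    zn = rng.binomial(n=1, p=p, size=trials)
    xn = yn - 10.0 * zn / n
    for x in (-2.0, 0.0, 2.0):
        print(f"n={n:5d} x={x:+.1f}  Fn(x) ~ {np.mean(xn <= x):.4f}  Phi(x) = {std_normal_cdf(x):.4f}")
```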
◮ Sure ⇒ almost sure ⇒ in probability ⇒ in distribution
◮ Mean square ⇒ in probability ⇒ in distribution
◮ In probability ⇒ in distribution
◮ Independent identically distributed (i.i.d.) RVs X1, X2, . . . , Xn, . . .
◮ Mean E[Xn] = µ and variance E[(Xn − µ)²] = σ²
◮ Q: What happens with the sum SN := Σ_{n=1}^{N} Xn as N grows?
◮ Expected value of the sum is E[SN] = Nµ ⇒ diverges if µ ≠ 0
◮ Variance is E[(SN − Nµ)²] = Nσ² ⇒ also grows without bound
◮ One interesting normalization ⇒ sample average X̄N := (1/N) Σ_{n=1}^{N} Xn
◮ Now E[X̄N] = µ and var[X̄N] = σ²/N for all N
◮ Another interesting normalization ⇒ ZN := (Σ_{n=1}^{N} Xn − Nµ) / (σ√N)
◮ Now E[ZN] = 0 and var[ZN] = 1 for all values of N
◮ Sequence of i.i.d. RVs X1, X2, . . . , Xn, . . . with mean µ
◮ Define the sample average X̄N := (1/N) Σ_{n=1}^{N} Xn
◮ Weak law of large numbers ⇒ lim N→∞ P(|X̄N − µ| > ε) = 0 for any ε > 0
◮ Strong law of large numbers ⇒ P( lim N→∞ X̄N = µ ) = 1 (simulation sketch below)
◮ Strong law implies weak law. Can forget weak law if so wished
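A compact simulation sketch of the law of large numbers: running sample averages of i.i.d. draws settle around µ. The exponential distribution (µ = 1) and the sample sizes are assumptions made only for the demo.

```python
import numpy as np

# Sample averages of i.i.d. Exponential(1) RVs (mu = 1) for growing N.
rng = np.random.default_rng(4)
x = rng.exponential(scale=1.0, size=100_000)          # one long i.i.d. sequence
running_avg = np.cumsum(x) / np.arange(1, x.size + 1)  # sample average for every N

for n in (10, 100, 1000, 10_000, 100_000):
    print(f"N={n:6d}  sample average = {running_avg[n - 1]:.4f}   (mu = 1)")
```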
◮ Weak law of large numbers is very simple to prove
◮ Apply Chebyshev's inequality to X̄N ⇒ P(|X̄N − µ| ≥ ε) ≤ var[X̄N]/ε²
◮ But, what is the variance of X̄N? ⇒ var[X̄N] = σ²/N for i.i.d. RVs
◮ Then, P(|X̄N − µ| ≥ ε) ≤ σ²/(Nε²) → 0 as N → ∞ ⇒ X̄N → µ in probability
◮ Strong law is a little more challenging. Will not prove it here
◮ Repeated experiment ⇒ sequence of i.i.d. RVs X1, X2, . . . , Xn, . . .
◮ Fraction of times the event X ∈ E happens in N experiments is (1/N) Σ_{n=1}^{N} I{Xn ∈ E}
◮ Since the indicators are also i.i.d., the strong law asserts that
  lim N→∞ (1/N) Σ_{n=1}^{N} I{Xn ∈ E} = E[ I{Xn ∈ E} ] = P(X ∈ E) a.s.
◮ Strong law consistent with our intuitive notion of probability
◮ Central limit theorem (CLT) ⇒ lim N→∞ P( (Σ_{n=1}^{N} Xn − Nµ)/(σ√N) ≤ x ) = ∫_{−∞}^{x} (1/√(2π)) e^{−t²/2} dt
◮ Former statement implies that for N sufficiently large
  (Σ_{n=1}^{N} Xn − Nµ)/(σ√N) is approximately distributed as a standard normal N(0, 1)
◮ Equivalently can say ⇒ Σ_{n=1}^{N} Xn is approximately normal with mean Nµ and variance Nσ²
◮ Sum of a large number of i.i.d. RVs has a normal distribution
◮ Binomial RV X with parameters (n, p)
◮ Write as X = Σ_{i=1}^{n} Xi with Xi i.i.d. Bernoulli with parameter p
◮ Mean E[Xi] = p and variance var[Xi] = p(1 − p)
◮ For large n, the CLT asserts X is approximately normal with mean np and variance np(1 − p) (numerical sketch below)
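A small numerical sketch of the normal approximation to the binomial: it compares a Monte Carlo estimate of P(X ≤ k) for X ~ Binomial(n, p) with the CLT-based normal approximation. The values n = 100, p = 0.3, the thresholds, and the helper std_normal_cdf are assumptions made only for the demo.

```python
import numpy as np
from math import erf, sqrt

def std_normal_cdf(x: float) -> float:
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# Normal (CLT) approximation to a Binomial(n, p) RV X.
rng = np.random.default_rng(5)
n, p, trials = 100, 0.3, 500_000
x = rng.binomial(n=n, p=p, size=trials)
mean, std = n * p, sqrt(n * p * (1 - p))

for k in (25, 30, 35, 40):
    mc = np.mean(x <= k)                            # Monte Carlo estimate of P(X <= k)
    clt = std_normal_cdf((k + 0.5 - mean) / std)    # normal approx. (continuity correction)
    print(f"P(X<={k}): Monte Carlo ~ {mc:.4f}   CLT approx ~ {clt:.4f}")
```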
◮ Recall the definition of conditional probability for events E and F ⇒ P(E | F) = P(E ∩ F)/P(F)
◮ Def: Conditional pmf of RV X given Y is (both RVs discrete)
  pX|Y(x|y) := P(X = x | Y = y) = pXY(x, y)/pY(y)
◮ Which we can rewrite as ⇒ pXY(x, y) = pX|Y(x|y) pY(y)
◮ Def: Conditional cdf is (a range of X conditioned on a value of Y)
  FX|Y(x|y) := P(X ≤ x | Y = y) = Σ_{u ≤ x} pX|Y(u|y)
◮ Consider independent Bernoulli RVs Y and Z, define X = Y + Z
◮ Q: Conditional pmf of X given Y? For X = 0, Y = 0
  pX|Y(0|0) = P(X = 0 | Y = 0) = P(Z = 0), using independence of Y and Z
◮ Or, from joint and marginal pmfs (just a matter of definition)
  pX|Y(0|0) = pXY(0, 0)/pY(0)
◮ Can compute the rest analogously
◮ Consider independent Poisson RVs Y and Z, parameters λ1 and λ2
◮ Define X = Y + Z. Q: Conditional pmf of Y given X?
  pY|X(y|x) = P(Y = y, X = x)/P(X = x) = P(Y = y) P(Z = x − y)/P(X = x)
◮ Used Y and Z independent. Now recall X is Poisson, λ = λ1 + λ2
  pY|X(y|x) = [e^{−λ1} λ1^y / y!] [e^{−λ2} λ2^{x−y} / (x − y)!] / [e^{−λ} λ^x / x!] = (x choose y) (λ1/λ)^y (λ2/λ)^{x−y}
◮ Conditional pmf of Y given X = x is binomial with parameters (x, λ1/(λ1 + λ2)) (simulation check below)
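A brief simulation sketch to corroborate this: conditioning Poisson draws of Y on the value of X = Y + Z should reproduce a Binomial(x, λ1/(λ1+λ2)) pmf. The parameters λ1 = 2, λ2 = 3 and the conditioning value x = 4 are assumed purely for illustration.

```python
import numpy as np
from math import comb

# Check that (Y | X = x) is Binomial(x, lam1/(lam1 + lam2)) when Y, Z are
# independent Poisson(lam1), Poisson(lam2) and X = Y + Z.
rng = np.random.default_rng(6)
lam1, lam2, x_cond, trials = 2.0, 3.0, 4, 2_000_000

y = rng.poisson(lam=lam1, size=trials)
z = rng.poisson(lam=lam2, size=trials)
y_given_x = y[y + z == x_cond]                 # keep only trials where X = x_cond

q = lam1 / (lam1 + lam2)
for k in range(x_cond + 1):
    emp = np.mean(y_given_x == k)
    theo = comb(x_cond, k) * q**k * (1 - q) ** (x_cond - k)
    print(f"P(Y={k} | X={x_cond}): empirical ~ {emp:.4f}   binomial = {theo:.4f}")
```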
◮ Def: Conditional pdf of RV X given Y is (both RVs continuous)
  fX|Y(x|y) := fXY(x, y)/fY(y)
◮ For motivation, define intervals ∆x = [x, x + dx] and ∆y = [y, y + dy]. Then
  P(X ∈ ∆x | Y ∈ ∆y) ≈ fXY(x, y) dx dy / (fY(y) dy) = fX|Y(x|y) dx
◮ From the definition of the conditional pdf it follows
  fXY(x, y) = fX|Y(x|y) fY(y)
◮ Def: Conditional cdf is ⇒ FX|Y(x|y) = ∫_{−∞}^{x} fX|Y(u|y) du
◮ Random message (RV) Y, transmit signal y (realization of Y)
◮ Received signal is x = y + z (z realization of random noise Z)
◮ Q: Conditional pdf of X given Y? Try the definition ⇒ fX|Y(x|y) = fXY(x, y)/fY(y), but the joint pdf is not given
◮ Computing conditional probs. typically easier than computing joints
◮ If Y = y is given, then “Y is not random anymore”
◮ If Y were not random, say Y = y with y given, then X = y + Z
◮ But since Z is normal with zero mean and variance σ², X given Y = y is normal with mean y and variance σ²
  fX|Y(x|y) = (1/√(2πσ²)) e^{−(x−y)²/(2σ²)}
◮ Conditioning is a common tool to compute probabilities
◮ Message 1 (w.p. p) ⇒ transmit Y = 1; Message 2 (w.p. q) ⇒ transmit Y = −1
◮ Received signal ⇒ X = Y + Z
◮ Decoding rule ⇒ Ŷ = 1 if X ≥ 0 and Ŷ = −1 if X < 0
◮ Q: What is the probability of error, Pe := P(Ŷ ≠ Y)?
◮ From the communications channel example we know X given Y = y is normal with mean y and variance σ²
  fX|Y(x|y) = (1/√(2πσ²)) e^{−(x−y)²/(2σ²)}
◮ Write the probability of error by conditioning on Y = ±1 (total probability)
  Pe = P(Ŷ ≠ Y | Y = 1) P(Y = 1) + P(Ŷ ≠ Y | Y = −1) P(Y = −1)
◮ According to the decision rule, an error occurs when X < 0 given Y = 1, or when X ≥ 0 given Y = −1
  Pe = p P(X < 0 | Y = 1) + q P(X ≥ 0 | Y = −1)
◮ But X given Y is normally distributed, then
  P(X < 0 | Y = 1) = ∫_{−∞}^{0} (1/√(2πσ²)) e^{−(x−1)²/(2σ²)} dx = Φ(−1/σ)
◮ By symmetry P(X ≥ 0 | Y = −1) = Φ(−1/σ) as well, so Pe = (p + q) Φ(−1/σ) = Φ(−1/σ), with Φ the standard normal cdf (Monte Carlo check below)
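A minimal Monte Carlo sketch of this error-probability calculation: simulate the channel, apply the sign decoding rule, and compare the empirical error rate to Φ(−1/σ). The values p = 0.5 and σ = 0.8 are assumptions chosen only for the demo.

```python
import numpy as np
from math import erf, sqrt

def std_normal_cdf(x: float) -> float:
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# Binary channel: Y = +1 w.p. p, Y = -1 w.p. 1-p; X = Y + Z with Z ~ N(0, sigma^2).
# Decode Yhat = +1 if X >= 0, else -1. Compare empirical Pe to Phi(-1/sigma).
rng = np.random.default_rng(7)
p, sigma, trials = 0.5, 0.8, 1_000_000

y = np.where(rng.random(trials) < p, 1.0, -1.0)
x = y + rng.normal(scale=sigma, size=trials)
y_hat = np.where(x >= 0.0, 1.0, -1.0)

print(f"empirical Pe ~ {np.mean(y_hat != y):.4f}")
print(f"Phi(-1/sigma) = {std_normal_cdf(-1.0 / sigma):.4f}")
```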
◮ Def: For continuous RVs X, Y, the conditional expectation of X given Y = y is
  E[X | Y = y] = ∫_{−∞}^{∞} x fX|Y(x|y) dx
◮ Def: For discrete RVs X, Y, the conditional expectation of X given Y = y is
  E[X | Y = y] = Σ_x x pX|Y(x|y)
◮ Defined for a given value y ⇒ E[X | Y = y] is a function of y
◮ If X and Y are independent, then E[X | Y = y] = E[X]
◮ Consider independent Bernoulli RVs Y and Z, define X = Y + Z
◮ Q: What is E[X | Y = y]?
◮ Use the definition of conditional expectation for discrete RVs
  E[X | Y = y] = Σ_x x pX|Y(x|y), which by independence of Y and Z yields E[X | Y = y] = y + E[Z]
◮ If E[X | Y = y] is a function of y, then E[X | Y] is a RV (a function of Y)
◮ Q: What is EY[ E[X | Y] ]? Use the definitions (discrete case shown; numerical check below)
  EY[ E[X | Y] ] = Σ_y E[X | Y = y] pY(y) = Σ_y Σ_x x pX|Y(x|y) pY(y)
                 = Σ_x x Σ_y pXY(x, y) = Σ_x x pX(x) = E[X]
◮ Law of iterated expectations ⇒ E[X] = EY[ E[X | Y] ]
◮ Offers a useful method to compute expected values
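A tiny numerical sketch of iterated expectations on an assumed toy model (Y uniform on {0, 1, 2} and, given Y = y, X Poisson with mean y + 1); the model is an illustration only, chosen so that E[X | Y = y] = y + 1 is known in closed form.

```python
import numpy as np

# Check E[X] = E_Y[ E[X | Y] ] on a toy model:
# Y uniform on {0, 1, 2}; given Y = y, X ~ Poisson(y + 1), so E[X | Y = y] = y + 1.
rng = np.random.default_rng(8)
trials = 1_000_000

y = rng.integers(low=0, high=3, size=trials)
x = rng.poisson(lam=y + 1)

lhs = x.mean()            # direct estimate of E[X]
rhs = np.mean(y + 1)      # estimate of E_Y[ E[X | Y] ]
print(f"E[X] ~ {lhs:.4f}   E_Y[E[X|Y]] ~ {rhs:.4f}   exact value = 2.0")
```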
◮ Consider a probability class in some university
◮ Q: Expectation of X = exchange student's grade?
◮ Start by conditioning on the student's standing ⇒ compute E[X | standing]
◮ Now sum over the standing's probability ⇒ E[X] = Σ_standings E[X | standing] P(standing)
◮ Consider independent Poisson RVs Y and Z, parameters λ1 and λ2
◮ Define X = Y + Z. What is E[Y | X = x]?
◮ Recall Y given X = x is binomial with parameters (x, λ1/(λ1 + λ2)), hence E[Y | X = x] = x λ1/(λ1 + λ2)
◮ Now use iterated expectations to obtain E[Y]
  E[Y] = Σ_{x=0}^{∞} E[Y | X = x] P(X = x) = (λ1/(λ1 + λ2)) Σ_{x=0}^{∞} x P(X = x) = (λ1/(λ1 + λ2)) E[X] = λ1
◮ Of course, since Y is Poisson with parameter λ1, E[Y] = λ1 ⇒ both answers coincide
◮ As with probabilities conditioning is useful to compute expectations
◮ A baseball player scores Xi runs per game
◮ Player plays N games in the season. N is random (playoffs, injuries?)
◮ What is the expected number of runs in the season? ⇒ E[S], where S := Σ_{i=1}^{N} Xi
◮ The random sum S := Σ_{i=1}^{N} Xi is known as a compound RV
◮ Condition on N = n: E[S | N = n] = E[ Σ_{i=1}^{n} Xi ] = nµ, since N is independent of the Xi ⇒ E[S | N] = Nµ
◮ By iterated expectations ⇒ E[S] = EN[ E[S | N] ] = E[N]µ
◮ Let X = number of independent Bernoulli(p) trials needed to observe the first success (geometric RV)
◮ Calculate E[X] by conditioning on Y = I{“first trial is a success”}
  E[X | Y = 1] = 1 and E[X | Y = 0] = 1 + E[X]
◮ Use iterated expectations
  E[X] = E[X | Y = 1] P(Y = 1) + E[X | Y = 0] P(Y = 0) = p + (1 + E[X])(1 − p)
◮ Solving for E[X] yields ⇒ E[X] = 1/p (simulation check below)
◮ Here, direct approach is straightforward (geometric series, derivative)
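A short simulation sketch to corroborate E[X] = 1/p for the number of Bernoulli(p) trials until the first success; p = 0.25 and the sample size are illustrative assumptions.

```python
import numpy as np

# Number of Bernoulli(p) trials until the first success has mean 1/p.
rng = np.random.default_rng(9)
p, trials = 0.25, 1_000_000

x = rng.geometric(p=p, size=trials)   # geometric RV counting trials until first success
print(f"sample mean of X ~ {x.mean():.4f}   1/p = {1 / p:.4f}")
```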
◮ A miner is trapped in a mine containing three doors
◮ At all times n ≥ 1 while still trapped, the miner chooses a door Dn = j, j = 1, 2, 3
◮ Choice of door Dn made independently of prior choices
◮ Equally likely to pick either door, i.e., P(Dn = j) = 1/3
◮ Each door leads to a tunnel, but only one leads to safety
◮ Door 1: the miner reaches safety after two hours of travel ◮ Door 2: the miner returns back after three hours of travel ◮ Door 3: the miner returns back after five hours of travel
◮ Let X denote the total time traveled till the miner reaches safety ◮ Q: What is E [X]?
◮ Calculate E [X] by conditioning on first door choice D1
◮ Use iterated expectations (Monte Carlo check below)
  E[X] = E[X | D1 = 1] (1/3) + E[X | D1 = 2] (1/3) + E[X | D1 = 3] (1/3)
       = 2 (1/3) + (3 + E[X]) (1/3) + (5 + E[X]) (1/3)
◮ Solving for E[X] yields ⇒ E[X] = 10
◮ You will solve it again using compound RVs in the homework
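A small Monte Carlo sketch of the miner example: it simulates door choices until escape and compares the average travel time to the value E[X] = 10 obtained by conditioning; the number of simulated trials is an arbitrary choice.

```python
import numpy as np

# Trapped-miner example: door 1 -> safety after 2 hours, door 2 -> back after
# 3 hours, door 3 -> back after 5 hours, each door chosen w.p. 1/3 every time.
rng = np.random.default_rng(10)
trials = 100_000
times = np.zeros(trials)

for i in range(trials):
    total = 0.0
    while True:
        door = rng.integers(1, 4)    # doors 1, 2, 3 equally likely
        if door == 1:
            total += 2.0
            break
        total += 3.0 if door == 2 else 5.0
    times[i] = total

print(f"average escape time ~ {times.mean():.3f}   (E[X] = 10 by conditioning)")
```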
◮ Def: The conditional variance of X given Y = y is
  var[X | Y = y] := E[ (X − E[X | Y = y])² | Y = y ]
◮ Calculate var[X] by conditioning on Y = y. Quick guesses? EY[ var[X | Y] ]? varY[ E[X | Y] ]?
◮ Neither. The following conditional variance formula is the correct way
  var[X] = EY[ var[X | Y] ] + varY[ E[X | Y] ]
◮ Start from the first summand, use linearity and iterated expectations
  EY[ var[X | Y] ] = EY[ E[X² | Y] − (E[X | Y])² ] = E[X²] − EY[ (E[X | Y])² ]
◮ For the second term use the variance definition and iterated expectations
  varY[ E[X | Y] ] = EY[ (E[X | Y])² ] − (EY[ E[X | Y] ])² = EY[ (E[X | Y])² ] − (E[X])²
◮ Summing up both terms yields (the EY[ (E[X | Y])² ] terms cancel)
  EY[ var[X | Y] ] + varY[ E[X | Y] ] = E[X²] − (E[X])² = var[X]
◮ Let X1, X2, . . . be i.i.d. RVs with E[X1] = µ and var[X1] = σ²
◮ Let N be a nonnegative integer-valued RV independent of the Xi
◮ Consider the compound RV S = Σ_{i=1}^{N} Xi. What is var[S]?
◮ The conditional variance formula is useful here
◮ Earlier, we found E[S | N] = Nµ. What about var[S | N = n]?
  var[S | N = n] = var[ Σ_{i=1}^{n} Xi ] = nσ² by independence, hence var[S | N] = Nσ²
◮ The conditional variance formula then gives (simulation check below)
  var[S] = EN[ var[S | N] ] + varN[ E[S | N] ] = E[N]σ² + var[N]µ²
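A last simulation sketch to check the compound-RV formulas E[S] = E[N]µ and var[S] = E[N]σ² + var[N]µ². The choices N ~ Poisson(5) and Xi ~ Exponential(mean 2) are illustrative assumptions only.

```python
import numpy as np

# Compound RV S = X1 + ... + XN with N independent of the i.i.d. Xi.
# Assumptions for illustration: N ~ Poisson(5), Xi ~ Exponential(mean 2).
rng = np.random.default_rng(11)
trials, lam, mean_x = 100_000, 5.0, 2.0
mu, sigma2 = mean_x, mean_x**2            # exponential(mean 2): mu = 2, sigma^2 = 4

n = rng.poisson(lam=lam, size=trials)
s = np.array([rng.exponential(scale=mean_x, size=k).sum() for k in n])

print(f"E[S]  : simulated ~ {s.mean():.3f}   formula E[N]*mu = {lam * mu:.3f}")
print(f"var[S]: simulated ~ {s.var():.3f}   formula E[N]*sigma^2 + var[N]*mu^2 = "
      f"{lam * sigma2 + lam * mu**2:.3f}")
```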
◮ Markov's inequality
◮ Chebyshev's inequality
◮ Limit of a sequence
◮ Almost sure convergence
◮ Convergence in probability
◮ Mean-square convergence
◮ Convergence in distribution
◮ I.i.d. random variables
◮ Sample average
◮ Centering and scaling
◮ Law of large numbers
◮ Central limit theorem
◮ Conditional distribution
◮ Communication channel
◮ Probability of error
◮ Conditional expectation
◮ Iterated expectations
◮ Expectations by conditioning
◮ Compound random variable
◮ Conditional variance