

  1. Probability Review
  Gonzalo Mateos
  Dept. of ECE and Goergen Institute for Data Science
  University of Rochester
  gmateosb@ece.rochester.edu
  http://www.ece.rochester.edu/~gmateosb/
  September 16, 2020

  2. Markov and Chebyshev’s inequalities
  ◮ Markov and Chebyshev’s inequalities
  ◮ Convergence of random variables
  ◮ Limit theorems
  ◮ Conditional probabilities
  ◮ Conditional expectation

  3. Markov’s inequality
  ◮ RV X with E[|X|] < ∞, constant a > 0
  ◮ Markov’s inequality states ⇒ P(|X| ≥ a) ≤ E[|X|]/a
  Proof.
  ◮ I{|X| ≥ a} = 1 when |X| ≥ a, and 0 otherwise. Then, pointwise (the slide’s figure plots both functions), a·I{|X| ≥ a} ≤ |X|
  ◮ Use linearity of the expected value ⇒ a·E[I{|X| ≥ a}] ≤ E[|X|]
  ◮ An indicator function’s expectation = probability of the indicated event ⇒ a·P(|X| ≥ a) ≤ E[|X|]
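A quick numerical sanity check of the bound (not from the slides; the Exponential(1) distribution, seed, and sample size are arbitrary illustration choices):

```python
# Monte Carlo check of Markov's inequality: P(|X| >= a) <= E[|X|]/a.
# Assumed setup: X ~ Exponential(1), so E[|X|] = 1 and P(X >= a) = e^{-a}.
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=100_000)

for a in [1.0, 2.0, 5.0]:
    empirical = np.mean(np.abs(x) >= a)   # estimate of P(|X| >= a)
    bound = np.mean(np.abs(x)) / a        # Markov bound E[|X|]/a
    print(f"a = {a}: P(|X| >= a) ~ {empirical:.4f} <= {bound:.4f}")
```

The bound always holds but can be loose: at a = 5 the true tail is e⁻⁵ ≈ 0.007 against a bound of 0.2, in line with the comments on slide 5.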

  4. Chebyshev’s inequality
  ◮ RV X with E[X] = µ and E[(X − µ)²] = σ², constant k > 0
  ◮ Chebyshev’s inequality states ⇒ P(|X − µ| ≥ k) ≤ σ²/k²
  Proof.
  ◮ Apply Markov’s inequality to the RV Z = (X − µ)² with constant a = k²:
  P(|Z| ≥ k²) = P((X − µ)² ≥ k²) ≤ E[|Z|]/k² = E[(X − µ)²]/k²
  ◮ Notice that (X − µ)² ≥ k² if and only if |X − µ| ≥ k, thus
  P(|X − µ| ≥ k) ≤ E[(X − µ)²]/k²
  ◮ Chebyshev’s inequality follows from the definition of variance
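The analogous sketch for Chebyshev’s bound, assuming a standard normal so that µ = 0 and σ² = 1 (an arbitrary choice, not from the slides):

```python
# Empirical tail probability vs. the Chebyshev bound sigma^2 / k^2.
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(1_000_000)
mu, var = x.mean(), x.var()           # sample estimates of mu and sigma^2

for k in [1.0, 2.0, 3.0]:
    empirical = np.mean(np.abs(x - mu) >= k)
    print(f"k = {k}: P(|X - mu| >= k) ~ {empirical:.4f} <= {var / k**2:.4f}")
```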

  5. Comments and observations
  ◮ If the absolute expected value is finite, i.e., E[|X|] < ∞
  ⇒ Complementary cdf (ccdf) decreases at least like x⁻¹ (Markov’s)
  ◮ If the mean E[X] and variance E[(X − µ)²] are finite
  ⇒ Ccdf decreases at least like x⁻² (Chebyshev’s)
  ◮ Most ccdfs decrease exponentially (e.g., like e^(−x²) for the normal)
  ⇒ Power-law bounds ∝ x^(−α) are loose but still useful
  ◮ Markov’s inequality is often derived for a nonnegative RV X ≥ 0
  ⇒ Can drop the absolute value to obtain P(X ≥ a) ≤ E[X]/a
  ⇒ The general bound P(X ≥ a) ≤ E[X^r]/a^r holds for any r > 0
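A sketch of the last point, the moment bound P(X ≥ a) ≤ E[X^r]/a^r for a nonnegative RV; the choice X ∼ Exponential(1), the level a = 5, and the grid of r values are assumptions for illustration:

```python
# For Exponential(1): E[X^r] = r! and the true tail is P(X >= a) = e^{-a}.
# Raising r can tighten the power-law bound, up to an a-dependent optimum.
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(scale=1.0, size=1_000_000)
a = 5.0

print(f"true tail P(X >= {a}) ~ {np.mean(x >= a):.5f}")
for r in [1, 2, 3, 4]:
    print(f"r = {r}: bound E[X^r]/a^r = {np.mean(x**r) / a**r:.5f}")
```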

  6. Convergence of random variables
  ◮ Markov and Chebyshev’s inequalities
  ◮ Convergence of random variables
  ◮ Limit theorems
  ◮ Conditional probabilities
  ◮ Conditional expectation

  7. Limits
  ◮ Sequence of RVs X_N = X_1, X_2, …, X_n, …
  ⇒ Distinguish between the random process X_N and its realizations x_N
  Q1) Can we say something about X_n for n large? ⇒ Not clear, X_n is a RV
  Q2) Can we say something about x_n for n large? ⇒ Certainly, look at lim_{n→∞} x_n
  Q3) Can we say something about P(X_n ∈ 𝒳) for n large? ⇒ Yes, lim_{n→∞} P(X_n ∈ 𝒳)
  ◮ Translate what we know about regular limits into definitions for RVs
  ◮ Can start from convergence of sequences: lim_{n→∞} x_n
  ⇒ Sure and almost sure convergence
  ◮ Or from convergence of probabilities: lim_{n→∞} P(X_n ∈ 𝒳)
  ⇒ Convergence in probability, in mean square, and in distribution

  8. Convergence of sequences and sure convergence
  ◮ Denote a sequence of numbers x_N = x_1, x_2, …, x_n, …
  ◮ Def: The sequence x_N converges to the value x if, given any ε > 0, there exists n_0 such that |x_n − x| < ε for all n > n_0
  ◮ The sequence x_n comes arbitrarily close to its limit ⇒ |x_n − x| < ε
  ⇒ And stays close to its limit for all n > n_0
  ◮ Random process (sequence of RVs) X_N = X_1, X_2, …, X_n, …
  ⇒ Realizations of X_N are sequences x_N
  ◮ Def: We say X_N converges surely to the RV X if lim_{n→∞} x_n = x for all realizations x_N of X_N
  ◮ Said differently, lim_{n→∞} X_n(s) = X(s) for all s ∈ S
  ◮ Not really adequate: even a (practically unimportant) outcome that happens with vanishingly small probability prevents sure convergence

  9. Almost sure convergence
  ◮ RV X and random process X_N = X_1, X_2, …, X_n, …
  ◮ Def: We say X_N converges almost surely to the RV X if
  P(lim_{n→∞} X_n = X) = 1
  ⇒ Almost all sequences converge, except for a set of measure 0
  ◮ Almost sure convergence is denoted ⇒ lim_{n→∞} X_n = X a.s.
  ⇒ The limit X is a random variable
  Example
  ◮ X_0 ∼ N(0, 1) (normal, mean 0, variance 1)
  ◮ Z_n a sequence of Bernoulli RVs with parameter p
  ◮ Define ⇒ X_n = X_0 − Z_n/n
  ◮ Z_n/n → 0, so lim_{n→∞} X_n = X_0 a.s. (also surely)
  [Figure: a sample path of X_n for n = 1, …, 100, settling at X_0]
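A minimal simulation of the example above (the parameter p = 0.3, seed, and horizon are arbitrary). Since |X_n − X_0| = Z_n/n ≤ 1/n on every path, the deviations shrink deterministically:

```python
# X_n = X_0 - Z_n / n with Z_n ~ Bernoulli(p): |X_n - X_0| <= 1/n always.
import numpy as np

rng = np.random.default_rng(3)
x0 = rng.standard_normal()             # X_0 ~ N(0, 1)
n = np.arange(1, 101)
z = rng.binomial(1, 0.3, size=n.size)  # Z_n ~ Bernoulli(p), here p = 0.3
x = x0 - z / n

print(np.abs(x - x0).max())            # at most 1 (the n = 1 term)
print(np.abs(x[-10:] - x0).max())      # at most 1/91: the path has settled
```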

  10. Almost sure convergence example
  ◮ Consider S = [0, 1] and let P(·) be the uniform probability distribution
  ⇒ P([a, b]) = b − a for 0 ≤ a ≤ b ≤ 1
  ◮ Define the RVs X_n(s) = s + s^n and X(s) = s
  ◮ For all s ∈ [0, 1) ⇒ s^n → 0 as n → ∞, hence X_n(s) → s = X(s)
  ◮ For s = 1 ⇒ X_n(1) = 2 for all n, while X(1) = 1
  ◮ Convergence only occurs on the set [0, 1), and P([0, 1)) = 1
  ⇒ We say lim_{n→∞} X_n = X a.s.
  ⇒ Once more, note the limit X is a random variable
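This example is deterministic once s is fixed, so a few evaluations (the values of s and n are chosen arbitrarily) already show the dichotomy between s < 1 and s = 1:

```python
# X_n(s) = s + s^n -> s for every s in [0, 1); at s = 1 it sticks at 2.
import numpy as np

s = np.array([0.0, 0.5, 0.9, 0.99, 1.0])
for n in [10, 100, 1000]:
    print(n, s + s**n)   # the last entry stays at 2.0; the rest approach s
```

The exceptional set {1} has uniform probability 0, which is exactly why the convergence is almost sure rather than sure.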

  11. Convergence in probability
  ◮ Def: We say X_N converges in probability to the RV X if, for any ε > 0,
  lim_{n→∞} P(|X_n − X| < ε) = 1
  ⇒ The probability of the distance |X_n − X| being smaller than ε tends to 1
  ◮ The statement is about probabilities, not about realizations (sequences)
  ⇒ Probabilities converge; realizations x_N may or may not converge
  ⇒ Limit and probability are interchanged with respect to a.s. convergence
  Theorem
  Almost sure (a.s.) convergence implies convergence in probability
  Proof.
  ◮ If lim_{n→∞} X_n = X, then for any ε > 0 there is an n_0 such that |X_n − X| < ε for all n ≥ n_0
  ◮ This holds for almost all sequences, so P(|X_n − X| < ε) → 1

  12. Convergence in probability example
  ◮ X_0 ∼ N(0, 1) (normal, mean 0, variance 1)
  ◮ Z_n a sequence of Bernoulli RVs with parameter 1/n
  ◮ Define ⇒ X_n = X_0 − Z_n
  ◮ X_n converges in probability to X_0 because
  P(|X_n − X_0| < ε) = P(|Z_n| < ε) = 1 − P(Z_n = 1) = 1 − 1/n → 1
  ◮ Plots of a path x_n up to n = 10², n = 10³, and n = 10⁴
  ⇒ Z_n = 1 becomes ever rarer but still happens
  [Figure: three sample-path plots of x_n, for n up to 10², 10³, and 10⁴]
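A path simulation of this example (seed and horizon assumed): the probability of a disturbance at step n is 1/n, which vanishes, yet disturbances keep occurring along the path.

```python
# X_n = X_0 - Z_n with independent Z_n ~ Bernoulli(1/n).
import numpy as np

rng = np.random.default_rng(4)
n = np.arange(1, 10_001)
z = rng.binomial(1, 1.0 / n)            # one Bernoulli(1/n) draw per step
print("disturbance times:", n[z == 1])  # gaps grow, but hits keep coming
```

Since the Z_n are independent and Σ 1/n diverges, Z_n = 1 occurs infinitely often on almost every path (Borel–Cantelli), which is why the path itself does not converge.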

  13. Difference between a.s. and in probability
  ◮ Almost sure convergence implies that almost all sequences converge
  ◮ Convergence in probability does not imply convergence of sequences
  ◮ Latter example: X_n = X_0 − Z_n, with Z_n Bernoulli with parameter 1/n
  ⇒ Showed it converges in probability: P(|X_n − X_0| < ε) = 1 − 1/n → 1
  ⇒ But for almost all sequences, lim_{n→∞} x_n does not exist
  ◮ Almost sure convergence ⇒ disturbances stop happening
  ◮ Convergence in probability ⇒ disturbances happen with vanishing frequency
  ◮ The difference is not irrelevant
  ⇒ Interpret Z_n as a rate of change in savings
  ⇒ With a.s. convergence, the risk is eliminated
  ⇒ With convergence in probability, the risk decreases but does not disappear

  14. Mean-square convergence
  ◮ Def: We say X_N converges in mean square to the RV X if
  lim_{n→∞} E[|X_n − X|²] = 0
  ⇒ Sometimes (very) easy to check
  Theorem
  Convergence in mean square implies convergence in probability
  Proof.
  ◮ From Markov’s inequality:
  P(|X_n − X| ≥ ε) = P(|X_n − X|² ≥ ε²) ≤ E[|X_n − X|²]/ε²
  ◮ If X_n → X in the mean-square sense, then E[|X_n − X|²]/ε² → 0 for all ε
  ◮ Almost sure and mean square ⇒ neither one implies the other
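For the running example X_n = X_0 − Z_n with Z_n ∼ Bernoulli(1/n), the criterion is indeed easy to check: E[|X_n − X_0|²] = E[Z_n²] = 1/n → 0, so X_n → X_0 in mean square (and hence in probability). A quick empirical confirmation, averaging over many independent draws (sizes assumed):

```python
# E[|X_n - X_0|^2] = E[Z_n^2] = P(Z_n = 1) = 1/n for Z_n ~ Bernoulli(1/n).
import numpy as np

rng = np.random.default_rng(5)
draws = 100_000
for n in [10, 100, 1000]:
    z = rng.binomial(1, 1.0 / n, size=draws).astype(float)
    print(n, (z**2).mean(), "vs exact", 1.0 / n)
```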

  15. Convergence in distribution
  ◮ Consider a random process X_N. The cdf of X_n is F_n(x)
  ◮ Def: We say X_N converges in distribution to a RV X with cdf F_X(x) if
  ⇒ lim_{n→∞} F_n(x) = F_X(x) for all x at which F_X(x) is continuous
  ◮ No claim about individual sequences, just about the cdf of X_n
  ⇒ Weakest form of convergence covered
  ◮ Implied by almost sure, in probability, and mean-square convergence
  Example
  ◮ Y_n ∼ N(0, 1)
  ◮ Z_n Bernoulli with parameter p
  ◮ Define ⇒ X_n = Y_n − 10·Z_n/n
  ◮ 10·Z_n/n → 0, so lim_{n→∞} F_n(x) “=” N(0, 1)
  [Figure: a sample path of X_n with downward spikes of size 10/n]
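A sketch comparing the empirical cdf of X_n = Y_n − 10·Z_n/n against the standard normal cdf (p = 0.5, the sample size, and the evaluation points are assumptions):

```python
# F_n(x) -> Phi(x): the Bernoulli shift 10 Z_n / n vanishes as n grows.
from math import erf, sqrt
import numpy as np

def phi(t):
    """Standard normal cdf."""
    return 0.5 * (1.0 + erf(t / sqrt(2.0)))

rng = np.random.default_rng(6)
m = 500_000
ts = (-1.0, 0.0, 1.0)
for n in [1, 10, 100]:
    x = rng.standard_normal(m) - 10.0 * rng.binomial(1, 0.5, size=m) / n
    ecdf = [round(float(np.mean(x <= t)), 4) for t in ts]
    print(n, "empirical:", ecdf, "normal:", [round(phi(t), 4) for t in ts])
```

At n = 1 half the samples are shifted down by 10, so the empirical cdf is far from Φ; by n = 100 the shift is only 0.1 and the two are close.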

  16. Convergence in distribution (continued)
  ◮ Individual sequences x_n do not converge in any sense
  ⇒ It is the distribution that converges
  [Figure: histograms of X_n for n = 1, n = 10, and n = 100]
  ◮ As the effect of Z_n/n vanishes, the pdf of X_n converges to the pdf of Y_n
  ⇒ Standard normal N(0, 1)
