
One-Sided Chebyshev's Inequality: Of the Midterm, What Say You, Chebyshev?



Markov's Inequality (Inequality, Probability, and Joviality)

• In many cases, we don't know the true form of a probability distribution.
  o E.g., midterm scores.
• But we may know the mean, and perhaps other measures/properties:
  o Variance
  o Non-negativity
  o Etc.
• Inequalities and bounds still allow us to say something about the probability distribution in such cases.
  o They may be imprecise compared to knowing the true distribution!
• Say X is a non-negative random variable. Then for all a > 0:

  P(X ≥ a) ≤ E[X] / a

• Proof: Let I = 1 if X ≥ a, and 0 otherwise. Since X ≥ 0, we have I ≤ X/a. Taking expectations:

  E[I] = P(X ≥ a) ≤ E[X/a] = E[X] / a

Andrey Andreyevich Markov

• Andrey Andreyevich Markov (1856-1922) was a Russian mathematician.
  o Markov's Inequality is named after him.
  o He also invented Markov Chains, which are the basis for Google's PageRank algorithm.
  o His facial hair inspires fear in Charlie Sheen.

Markov and the Midterm

• Statistics from last quarter's CS109 midterm:
  o X = midterm score
  o Using sample mean X̄ = 78.1 ≈ E[X]
• What is P(X ≥ 91)?

  P(X ≥ 91) ≤ E[X] / 91 = 78.1 / 91 ≈ 0.8582

• Markov bound: at most 85.82% of the class scored 91 or greater. (See the first sketch below.)
  o In fact, 34.44% of the class scored 91 or greater, so the Markov inequality can be a very loose bound.
  o But it made no assumption at all about the form of the distribution!

Chebyshev's Inequality

• X is a random variable with E[X] = μ and Var(X) = σ². Then for all k > 0:

  P(|X − μ| ≥ k) ≤ σ² / k²

• Proof: Since (X − μ)² is a non-negative random variable, apply Markov's Inequality with a = k²:

  P((X − μ)² ≥ k²) ≤ E[(X − μ)²] / k² = σ² / k²

  Note that (X − μ)² ≥ k² if and only if |X − μ| ≥ k, yielding:

  P(|X − μ| ≥ k) ≤ σ² / k²

  (See the second sketch below.)

Pafnuty Chebyshev

• Pafnuty Lvovich Chebyshev (1821-1894) was also a Russian mathematician.
  o Chebyshev's Inequality is named after him, but it was actually formulated by his colleague Irénée-Jules Bienaymé.
  o He was Markov's doctoral advisor, and is sometimes credited with first deriving Markov's Inequality.
  o There is a crater on the moon named in his honor.
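
The slides contain no code, but a minimal Python sketch can make the Markov bound concrete. It reuses the midterm numbers quoted above (mean 78.1, threshold 91) and then sanity-checks the bound empirically on simulated non-negative data; the exponential distribution here is an illustrative stand-in I chose, not the true score distribution.

```python
import random

# Minimal sketch (not from the slides): the Markov bound E[X]/a for the
# quoted midterm numbers, plus an empirical check on simulated data.
mean_score = 78.1          # sample mean, used as E[X]
a = 91.0                   # threshold of interest

print(f"Markov bound on P(X >= {a}): {mean_score / a:.4f}")  # ~0.8582

# Sanity check: for ANY non-negative distribution, the empirical tail
# must stay below mean/a. Exponential is an arbitrary stand-in here.
random.seed(0)
samples = [random.expovariate(1 / mean_score) for _ in range(100_000)]
tail = sum(x >= a for x in samples) / len(samples)
print(f"empirical P(X >= {a}) on Exp(mean={mean_score}): {tail:.4f}")
```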

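Similarly, a short sketch of Chebyshev's bound, using the sample variance quoted on the next page (S² = 600.25) and k = 30. The normal distribution used for the empirical check is again an assumed stand-in, not the real score distribution.

```python
import random

# Sketch: Chebyshev's bound P(|X - mu| >= k) <= sigma^2 / k^2, checked
# against a simulated N(mu, sigma^2) -- a stand-in, not the real scores.
mu, var, k = 78.1, 600.25, 30.0

print(f"Chebyshev bound on P(|X - {mu}| >= {k}): {var / k**2:.4f}")  # ~0.6669

random.seed(1)
samples = [random.gauss(mu, var ** 0.5) for _ in range(100_000)]
tail = sum(abs(x - mu) >= k for x in samples) / len(samples)
print(f"empirical tail for N(mu, sigma^2): {tail:.4f}")  # ~0.22, under bound
```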
One-Sided Chebyshev's Inequality

• X is a random variable with E[X] = 0 and Var(X) = σ². Then for any a > 0:

  P(X ≥ a) ≤ σ² / (σ² + a²)

• Equivalently, when E[Y] = μ and Var(Y) = σ², for any a > 0:

  P(Y ≥ E[Y] + a) ≤ σ² / (σ² + a²)
  P(Y ≤ E[Y] − a) ≤ σ² / (σ² + a²)

  These follow directly by setting X = Y − E[Y] (or X = E[Y] − Y), noting E[X] = 0.

Of the Midterm, What Say You, Chebyshev?

• Statistics from last quarter's CS109 midterm:
  o X = midterm score
  o Using sample mean X̄ = 78.1 ≈ E[X]
  o Using sample variance S² = (24.5)² = 600.25 ≈ σ²
• What is P(|X − 78.1| ≥ 30)?

  P(|X − E[X]| ≥ 30) ≤ σ² / 30² = 600.25 / 900 ≈ 0.6669
  P(|X − E[X]| < 30) = 1 − P(|X − E[X]| ≥ 30) ≥ 1 − 0.6669 = 0.3331

• Chebyshev bound: at most 66.69% scored ≥ 108.1 or ≤ 48.1.
  o In fact, 21.85% of the class scored ≥ 108.1 or ≤ 48.1.
  o Chebyshev's inequality is really a theoretical tool.

Comments on Midterm, One-Sided One?

• Same midterm statistics: X̄ = 78.1 ≈ E[X], S² = (24.5)² = 600.25 ≈ σ².
• What is P(X ≥ 103.1)? Using the one-sided Chebyshev bound:

  P(X − 78.1 ≥ 25) ≤ 600.25 / (600.25 + 25²) ≈ 0.4899

• One-sided Chebyshev bound: at most 48.99% scored ≥ 103.1.
  o In fact, 13.26% of the class scored ≥ 103.1.
• Using Markov's inequality instead:

  P(X ≥ 103.1) ≤ 78.1 / 103.1 ≈ 0.7575

  (The first sketch below compares these bounds.)

Chernoff Bound

• Say we have the MGF, M(t), for a random variable X.
• Chernoff bounds:

  P(X ≥ a) ≤ e^(−ta) M(t)  for all t > 0
  P(X ≤ a) ≤ e^(−ta) M(t)  for all t < 0

  The bounds hold for every valid t, so use the t that minimizes e^(−ta) M(t).
• Proof: X has MGF M(t) = E[e^(tX)]. For t > 0, note P(X ≥ a) = P(e^(tX) ≥ e^(ta)); use Markov's inequality:

  P(X ≥ a) = P(e^(tX) ≥ e^(ta)) ≤ E[e^(tX)] / e^(ta) = e^(−ta) M(t)  for all t > 0

  Similarly for P(X ≤ a) when t < 0.

Herman Chernoff

• Herman Chernoff (1923-) is an American mathematician and statistician.
  o The Chernoff Bound is named after him, and it actually was derived by him!
  o He is Professor Emeritus of Applied Mathematics at MIT and of Statistics at Harvard University.
  o I do not know if he is a fan of Charlie Sheen.

Chernoff's Feeling (Unit) Normal

• Z is a standard normal random variable: Z ~ N(0, 1).
  o Moment generating function: M_Z(t) = e^(t²/2)
• Chernoff bounds for P(Z ≥ a):

  P(Z ≥ a) ≤ e^(−ta) e^(t²/2) = e^(t²/2 − ta)  for all t > 0

  To minimize the bound, minimize t²/2 − ta: differentiate w.r.t. t and set to 0, giving t − a = 0, i.e., t = a. Hence:

  P(Z ≥ a) ≤ e^(−a²/2)  for all a > 0

  Proceeding similarly with t = a < 0 yields:

  P(Z ≤ a) ≤ e^(−a²/2)  for all a < 0

• Compare to the exact tail (see the second sketch below):

  P(Z ≥ z) = 1 − P(Z < z) = 1 − ∫_{−∞}^{z} (1/√(2π)) e^(−x²/2) dx
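
A small sketch comparing the bounds discussed above for the same event (score ≥ 103.1), using the quoted midterm statistics; nothing here goes beyond arithmetic taken from the slides.

```python
# Sketch: three bounds on P(X >= 103.1) from the midterm slides, computed
# side by side (mu = 78.1, sigma^2 = 600.25, a = 25 above the mean).
mu, var, a = 78.1, 600.25, 25.0

one_sided = var / (var + a**2)   # one-sided Chebyshev: ~0.4899
two_sided = var / a**2           # bounds the two-sided event |X - mu| >= 25
markov = mu / (mu + a)           # Markov with threshold 103.1: ~0.7575

print(f"one-sided Chebyshev: {one_sided:.4f}")
print(f"two-sided Chebyshev: {two_sided:.4f}")   # ~0.9604, loosest here
print(f"Markov:              {markov:.4f}")
# Observed in class (per the slides): 13.26% scored >= 103.1.
```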

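To see the Chernoff recipe in action, here is a sketch for the standard normal: minimize e^(−ta) M(t) = e^(t²/2 − ta) numerically over t > 0, confirm it matches the closed form e^(−a²/2), and compare with the exact tail via math.erfc. The grid search is my illustrative choice; the slides derive the optimum analytically.

```python
import math

# Sketch: Chernoff bound for Z ~ N(0, 1). Grid-minimize e^{t^2/2 - t*a}
# over t > 0, compare with the closed form e^{-a^2/2} and the exact tail.

def chernoff_normal(a):
    ts = (i / 1000 for i in range(1, 5001))       # grid over t in (0, 5]
    return min(math.exp(t * t / 2 - t * a) for t in ts)

for a in (1.0, 2.0, 3.0):
    closed_form = math.exp(-a * a / 2)            # optimum at t = a
    exact = 0.5 * math.erfc(a / math.sqrt(2))     # true P(Z >= a)
    print(f"a={a}: grid {chernoff_normal(a):.4f}, "
          f"e^(-a^2/2) {closed_form:.4f}, exact {exact:.4f}")
```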
Chernoff's Poisson Pill

• X is a Poisson random variable: X ~ Poi(λ).
  o Moment generating function: M_X(t) = e^(λ(e^t − 1))
• Chernoff bounds for P(X ≥ i):

  P(X ≥ i) ≤ e^(λ(e^t − 1)) e^(−it) = e^(λ(e^t − 1) − it)  for all t > 0

  To minimize the bound, minimize λ(e^t − 1) − it: differentiate w.r.t. t and set to 0, giving λe^t − i = 0, i.e., e^t = i/λ. For i/λ > 1 (so that t > 0):

  P(X ≥ i) ≤ e^(λ(i/λ − 1)) (λ/i)^i = e^(−λ) (eλ)^i / i^i

• Compare to the exact probability mass P(X = i) = e^(−λ) λ^i / i!  (see the first sketch below).

Jensen's Inequality

• If f(x) is a convex function, then E[f(X)] ≥ f(E[X]).
  o f(x) is convex if f''(x) ≥ 0 for all x.
  o Intuition: convex = "bowl". E.g.: f(x) = x², f(x) = e^x.
  o If g(x) = −f(x) is convex, then f(x) is concave.
• Proof outline: Taylor series of f(x) about μ. Be happy.
• Note: E[f(X)] = f(E[X]) only holds when f(x) is a line.
  o That is, when f''(x) = 0 for all x.

Johan Jensen

• Johan Ludwig William Valdemar Jensen (1859-1925) was a Danish mathematician.
  o He derived Jensen's inequality.
  o He was president of the Danish Mathematical Society from 1892 to 1903.
  o He has more names than Charlie Sheen.

A Brief Digression on Utility Theory

• Utility U(x) is the "value" you derive from x.
  (Figure: two decision trees. Play? yes → $20,000 w.p. 0.5 or $0 w.p. 0.5; no → $10,000 for sure. The same tree in utility terms: U($20,000) w.p. 0.5 or U($0) w.p. 0.5, vs. U($10,000).)
  o Utility can be monetary, but often includes intangibles.
  o E.g., quality of life, life expectancy, personal beliefs, etc.

Jensen's Investment Advice

• Example: a risk-taking investor with two choices:
  o Choice 1: Invest money to get return X, where E[X] = μ.
  o Choice 2: Invest money to get return μ (with probability 1).
• We want to maximize utility u(R), where R is the return.
  o If u is convex, then E[u(X)] ≥ u(μ), so choice 1 is better.
  o If u is concave, then E[u(X)] ≤ u(μ), so choice 2 is better.
  o Convex u means "risk preferring"; concave u means "risk averse". (See the second sketch below.)

Utility Curves

• (Figure: a utility curve, with Dollars on the horizontal axis and Utility on the vertical axis.)
• Your utility curve determines your "risk preference".
  o It can be different in different parts of the curve.
  o We'll talk more about this near the end of the quarter.
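
A sketch of the Poisson case above: the optimized Chernoff bound e^(−λ)(eλ)^i / i^i against the exact tail P(X ≥ i), computed by summing the pmf. The rate λ = 5 and the thresholds are arbitrary illustrative choices, not from the slides.

```python
import math

# Sketch: Poisson Chernoff bound vs. the exact tail, for X ~ Poi(lam).
def chernoff_poisson(lam, i):
    # e^{-lam} (e*lam/i)^i, valid when i/lam > 1 (so the optimal t > 0)
    return math.exp(-lam) * (math.e * lam / i) ** i

def poisson_tail(lam, i):
    # exact P(X >= i) = 1 - sum_{k < i} e^{-lam} lam^k / k!
    return 1.0 - sum(math.exp(-lam) * lam**k / math.factorial(k)
                     for k in range(i))

lam = 5.0   # arbitrary illustrative rate
for i in (8, 12, 16):
    print(f"i={i}: Chernoff {chernoff_poisson(lam, i):.5f}, "
          f"exact {poisson_tail(lam, i):.5f}")
```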

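Finally, a sketch tying Jensen's inequality to the utility discussion: a simulation of E[f(X)] ≥ f(E[X]) for the convex f(x) = x², and the slides' $10,000-for-sure vs. coin-flip-for-$20,000 choice under an assumed concave utility u(x) = √x (the square root is my illustrative pick, not from the slides).

```python
import random

# Sketch: Jensen's inequality for convex f(x) = x^2, via simulation.
random.seed(2)
xs = [random.uniform(0, 10) for _ in range(100_000)]
mean_x = sum(xs) / len(xs)
mean_fx = sum(x * x for x in xs) / len(xs)
print(f"E[X^2] = {mean_fx:.2f} >= (E[X])^2 = {mean_x**2:.2f}")  # ~33.3 >= ~25

# The slides' gamble under an ASSUMED concave utility u(x) = sqrt(x):
def u(x):
    return x ** 0.5

eu_play = 0.5 * u(20_000) + 0.5 * u(0)   # play: E[u(X)] ~ 70.7
u_sure = u(10_000)                       # don't play: u(10000) = 100
print(f"E[u(play)] = {eu_play:.1f} <= u($10,000) = {u_sure:.1f}")
# Concave u reverses Jensen: a risk-averse player prefers the sure $10,000.
```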