
Expectation
DS GA 1002: Probability and Statistics for Data Science
http://www.cims.nyu.edu/~cfgranda/pages/DSGA1002_fall17
Carlos Fernandez-Granda

Aim: describe random variables with a few numbers (mean, variance, covariance).


  3. Chebyshev's inequality

     Define Y := (X − E(X))². By Markov's inequality,

     P(|X − E(X)| ≥ a) = P(Y ≥ a²) ≤ E(Y)/a² = Var(X)/a²
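As a sanity check, Chebyshev's bound can be verified on simulated data; the exponential distribution below is an illustrative choice, not part of the slides.

```python
import numpy as np

# Empirical check of Chebyshev's inequality, P(|X - E(X)| >= a) <= Var(X)/a^2,
# on exponential samples (an illustrative choice of distribution).
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=100_000)   # E(X) = 1, Var(X) = 1

for a in (1.5, 2.0, 3.0):
    empirical = np.mean(np.abs(x - x.mean()) >= a)
    bound = x.var() / a**2
    assert empirical <= bound
```

The assertion holds for any sample, since Chebyshev's inequality applies exactly to the empirical distribution when the empirical mean and variance are used.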

  6. Age of students at NYU

     Mean: 20 years, standard deviation: 3 years. What fraction are younger than 30?

     P(A ≥ 30) ≤ P(|A − 20| ≥ 10) ≤ Var(A)/100 = 9/100

     At least 91% are younger than 30.
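The slide's arithmetic, written out as a minimal sketch (variable names are mine):

```python
# Chebyshev bound for the age example: mean 20, standard deviation 3.
var_A = 3 ** 2                 # Var(A) = 9
a = 30 - 20                    # deviation of interest
bound = var_A / a ** 2         # P(|A - 20| >= 10) <= 9/100
print(1 - bound)               # at least this fraction is younger than 30
```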

  7. Expectation operator / Mean and variance / Covariance / Conditional expectation

  8. Covariance

     The covariance of X and Y is

     Cov(X, Y) := E((X − E(X))(Y − E(Y)))
                = E(XY − Y E(X) − X E(Y) + E(X)E(Y))
                = E(XY) − E(X)E(Y)

     If Cov(X, Y) = 0, X and Y are uncorrelated.
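The identity Cov(X, Y) = E(XY) − E(X)E(Y) can be checked on sample moments; the joint distribution below is an arbitrary illustrative choice.

```python
import numpy as np

# Both forms of the covariance agree when applied to the same samples,
# since the identity is algebraic (here up to floating-point error).
rng = np.random.default_rng(0)
x = rng.standard_normal(200_000)
y = 0.5 * x + rng.standard_normal(200_000)

cov_def = np.mean((x - x.mean()) * (y - y.mean()))   # E((X-E(X))(Y-E(Y)))
cov_alt = np.mean(x * y) - x.mean() * y.mean()       # E(XY) - E(X)E(Y)
assert np.isclose(cov_def, cov_alt)
```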

  9. [Scatter plots of samples with Cov(X, Y) = 0.5, 0.9, 0.99 and Cov(X, Y) = 0, −0.9, −0.99]

  11. Variance of the sum

      Var(X + Y) = E((X + Y − E(X + Y))²)
                 = E((X − E(X))²) + E((Y − E(Y))²) + 2 E((X − E(X))(Y − E(Y)))
                 = Var(X) + Var(Y) + 2 Cov(X, Y)

      If X and Y are uncorrelated, then Var(X + Y) = Var(X) + Var(Y).
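The decomposition of Var(X + Y) also holds exactly for sample moments, which makes it easy to verify numerically (the joint distribution is an arbitrary illustrative choice):

```python
import numpy as np

# Check Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) on simulated samples.
rng = np.random.default_rng(1)
x = rng.standard_normal(100_000)
y = 0.3 * x + rng.standard_normal(100_000)

cov_xy = np.mean(x * y) - x.mean() * y.mean()
lhs = np.var(x + y)
rhs = np.var(x) + np.var(y) + 2 * cov_xy
assert np.isclose(lhs, rhs)   # algebraic identity, exact up to float error
```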

  12. Independence implies uncorrelation

      If X and Y are independent, then E(XY) = E(X)E(Y), so

      Cov(X, Y) = E(XY) − E(X)E(Y) = E(X)E(Y) − E(X)E(Y) = 0

  13. Uncorrelation does not imply independence

      X, Y are independent Bernoulli with parameter 1/2. Let U = X + Y and V = X − Y.
      Are U and V independent? Are they uncorrelated?

  18. Uncorrelation does not imply independence

      p_U(0) = P(X = 0, Y = 0) = 1/4

      p_V(0) = P(X = 1, Y = 1) + P(X = 0, Y = 0) = 1/2

      p_{U,V}(0, 0) = P(X = 0, Y = 0) = 1/4 ≠ p_U(0) p_V(0) = 1/8

      So U and V are not independent.

  20. Uncorrelation does not imply independence

      Cov(U, V) = E(UV) − E(U)E(V)
                = E((X + Y)(X − Y)) − E(X + Y)E(X − Y)
                = E(X²) − E(Y²) − E²(X) + E²(Y)
                = 0

      So U and V are uncorrelated.
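Because X and Y take only the values 0 and 1, both claims can be verified by enumerating the four equally likely outcomes, a small sketch using exact fractions:

```python
from itertools import product
from fractions import Fraction

# X, Y independent Bernoulli(1/2); U = X + Y, V = X - Y.
outcomes = [(x + y, x - y) for x, y in product([0, 1], repeat=2)]  # each w.p. 1/4
p = Fraction(1, 4)

E_U = sum(u for u, v in outcomes) * p
E_V = sum(v for u, v in outcomes) * p
E_UV = sum(u * v for u, v in outcomes) * p
assert E_UV - E_U * E_V == 0            # uncorrelated

p_U0 = sum(p for u, v in outcomes if u == 0)
p_V0 = sum(p for u, v in outcomes if v == 0)
p_UV00 = sum(p for u, v in outcomes if (u, v) == (0, 0))
assert p_UV00 != p_U0 * p_V0            # 1/4 != 1/8, so not independent
```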

  21. Correlation coefficient

      The Pearson correlation coefficient of X and Y is

      ρ_{X,Y} := Cov(X, Y) / (σ_X σ_Y)

      It equals the covariance between X/σ_X and Y/σ_Y.

  22. [Scatter plots: σ_Y = 1, Cov(X, Y) = 0.9, ρ_{X,Y} = 0.9; σ_Y = 3, Cov(X, Y) = 0.9, ρ_{X,Y} = 0.3; σ_Y = 3, Cov(X, Y) = 2.7, ρ_{X,Y} = 0.9]
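The point of the three panels, that rescaling Y changes the covariance but not the correlation coefficient, can be reproduced on simulated samples (distribution chosen for illustration):

```python
import numpy as np

# rho is invariant to rescaling Y, while Cov(X, Y) scales with Y.
rng = np.random.default_rng(2)
x = rng.standard_normal(100_000)
y = 0.9 * x + np.sqrt(0.19) * rng.standard_normal(100_000)  # sigma_Y ~ 1, Cov ~ 0.9

def rho(a, b):
    return np.corrcoef(a, b)[0, 1]

def cov(a, b):
    return np.mean(a * b) - a.mean() * b.mean()

assert np.isclose(rho(x, y), rho(x, 3 * y))      # rho unchanged by scaling
assert np.isclose(cov(x, 3 * y), 3 * cov(x, y))  # covariance scales by 3
```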

  23. Cauchy-Schwarz inequality

      For any X and Y,

      |E(XY)| ≤ √(E(X²) E(Y²))

      with

      E(XY) = √(E(X²) E(Y²))   ⟺   Y = √(E(Y²)/E(X²)) X
      E(XY) = −√(E(X²) E(Y²))  ⟺   Y = −√(E(Y²)/E(X²)) X

  24. Cauchy-Schwarz inequality

      We have |Cov(X, Y)| ≤ σ_X σ_Y, or equivalently |ρ_{X,Y}| ≤ 1. In addition,

      |ρ_{X,Y}| = 1   ⟺   Y = cX + d

      where

      c := σ_Y/σ_X if ρ_{X,Y} = 1,   c := −σ_Y/σ_X if ρ_{X,Y} = −1,   d := E(Y) − c E(X)

  25. Covariance matrix of a random vector

      The covariance matrix of X is defined as

      Σ_X := [ Var(X_1)       Cov(X_1, X_2)  ...  Cov(X_1, X_n) ]
             [ Cov(X_2, X_1)  Var(X_2)       ...  Cov(X_2, X_n) ]
             [ ...            ...            ...  ...           ]
             [ Cov(X_n, X_1)  Cov(X_n, X_2)  ...  Var(X_n)      ]

           = E(X X^T) − E(X) E(X)^T
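The second form of the definition can be checked against NumPy's built-in estimator; the 3-dimensional distribution below is an illustrative choice.

```python
import numpy as np

# Sigma_X = E(X X^T) - E(X) E(X)^T, compared to np.cov on the same samples.
rng = np.random.default_rng(3)
X = rng.multivariate_normal(
    mean=[0.0, 1.0, -1.0],
    cov=[[2.0, 0.5, 0.0], [0.5, 1.0, 0.3], [0.0, 0.3, 1.5]],
    size=50_000,
)                                              # rows are samples of the vector

mu = X.mean(axis=0)
Sigma = X.T @ X / len(X) - np.outer(mu, mu)    # E(XX^T) - E(X)E(X)^T
assert np.allclose(Sigma, np.cov(X, rowvar=False, bias=True))
assert np.allclose(Sigma, Sigma.T)             # symmetric, variances on diagonal
```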

  30. Covariance matrix after a linear transformation

      Σ_{AX+b} = E((AX + b)(AX + b)^T) − E(AX + b) E(AX + b)^T
               = A E(X X^T) A^T + A E(X) b^T + b E(X)^T A^T + b b^T
                 − A E(X) E(X)^T A^T − A E(X) b^T − b E(X)^T A^T − b b^T
               = A (E(X X^T) − E(X) E(X)^T) A^T
               = A Σ_X A^T
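The result Σ_{AX+b} = A Σ_X A^T holds exactly for sample covariances as well, so it can be checked numerically; A, b, and the distribution of X below are arbitrary illustrative choices.

```python
import numpy as np

# Verify Sigma_{AX+b} = A Sigma_X A^T on simulated samples.
rng = np.random.default_rng(4)
X = rng.standard_normal((50_000, 3)) @ np.array(
    [[1.0, 0.5, 0.0], [0.0, 1.0, 0.3], [0.0, 0.0, 1.0]]
)                                                # correlated 3-d samples
A = np.array([[2.0, -1.0, 0.5], [0.0, 1.0, 1.0]])
b = np.array([3.0, -2.0])

Y = X @ A.T + b                                  # each row is A x + b
Sigma_X = np.cov(X, rowvar=False)
Sigma_Y = np.cov(Y, rowvar=False)
assert np.allclose(Sigma_Y, A @ Sigma_X @ A.T)   # exact for sample covariances
```

Note that b drops out entirely: shifting a random vector changes its mean but not its covariance.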

  31. Variance in a fixed direction

      For any unit vector u,

      Var(u^T X) = u^T Σ_X u

  32. Direction of maximum variance

      To find the direction of maximum variance we solve

      arg max_{||u||_2 = 1} u^T Σ_X u

  33. Linear algebra

      Symmetric matrices have orthogonal eigenvectors:

      Σ_X = U Λ U^T,   U = [u_1 u_2 ... u_n],   Λ = diag(λ_1, λ_2, ..., λ_n)

  34. Linear algebra

      λ_1 = max_{||u||_2 = 1} u^T A u,   u_1 = arg max_{||u||_2 = 1} u^T A u

      λ_k = max_{||u||_2 = 1, u ⊥ u_1, ..., u_{k−1}} u^T A u,
      u_k = arg max_{||u||_2 = 1, u ⊥ u_1, ..., u_{k−1}} u^T A u
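In code, `np.linalg.eigh` computes this eigendecomposition for a symmetric matrix; the 2×2 covariance below is an illustrative choice, and the brute-force scan over unit vectors is only there to confirm the variational characterization.

```python
import numpy as np

# The top eigenvector of a symmetric matrix maximizes u^T Sigma u over
# unit vectors; check against a dense scan of directions in the plane.
Sigma = np.array([[2.0, 0.8], [0.8, 1.0]])
lam, U = np.linalg.eigh(Sigma)                 # eigenvalues in ascending order
u1 = U[:, -1]                                  # direction of maximum variance

thetas = np.linspace(0, np.pi, 10_000)
vals = [np.array([np.cos(t), np.sin(t)]) @ Sigma @ np.array([np.cos(t), np.sin(t)])
        for t in thetas]
assert np.isclose(max(vals), lam[-1], atol=1e-4)

t_best = thetas[int(np.argmax(vals))]
u_best = np.array([np.cos(t_best), np.sin(t_best)])
assert abs(u_best @ u1) > 0.999                # same direction up to sign
```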

  35. [Scatter plots with principal directions: √λ_1 = 1.22, √λ_2 = 1; √λ_1 = 1.38, √λ_2 = 0.71; √λ_1 = 1, √λ_2 = 0.32]

  36. Coloring

      Goal: transform uncorrelated samples with unit variance so that they have a
      prescribed covariance matrix Σ.

      1. Compute the eigendecomposition Σ = U Λ U^T.
      2. Set y := U √Λ x, where √Λ := diag(√λ_1, √λ_2, ..., √λ_n).

  39. Coloring

      Σ_Y = U √Λ Σ_X √Λ^T U^T
          = U √Λ I √Λ^T U^T
          = U Λ U^T
          = Σ
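The coloring transform is short to implement and its effect on the covariance is easy to check empirically; the target Σ below is an illustrative choice, and the tolerance accounts for sampling error.

```python
import numpy as np

# Coloring: map samples with identity covariance to a prescribed Sigma
# via y = U sqrt(Lambda) x, where Sigma = U Lambda U^T.
rng = np.random.default_rng(5)
Sigma = np.array([[2.0, 0.8], [0.8, 1.0]])       # target covariance

lam, U = np.linalg.eigh(Sigma)                   # Sigma = U diag(lam) U^T
X = rng.standard_normal((200_000, 2))            # uncorrelated, unit variance
Y = X @ (U * np.sqrt(lam)).T                     # each row: U sqrt(Lambda) x

Sigma_Y = np.cov(Y, rowvar=False)
assert np.allclose(Sigma_Y, Sigma, atol=0.05)    # matches up to sampling error
```

`U * np.sqrt(lam)` multiplies each column of U by the corresponding √λ, i.e. it forms U √Λ without building the diagonal matrix explicitly.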
