multivariate random variables

Multivariate random variables DS GA 1002 Statistical and - PowerPoint PPT Presentation

Multivariate random variables DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/DSGA1002_fall16 Carlos Fernandez-Granda Joint distributions Tool to characterize several uncertain numerical quantities of


  1. Joint probability density function
     For any Borel set $S \subseteq \mathbb{R}^n$,
     $$P(\vec{X} \in S) = \int_{S} f_{\vec{X}}(\vec{x}) \, d\vec{x}.$$
     In particular,
     $$\int_{\mathbb{R}^n} f_{\vec{X}}(\vec{x}) \, d\vec{x} = 1.$$

  2. Example: Triangle lake [Figure: the triangle lake in the plane with labels A to F; both axes run from -0.5 to 1.5.]

  3. Example: Triangle lake
     $$F_{\vec{X}}(\vec{x}) = \begin{cases}
     0 & \text{if } x_1 < 0 \text{ or } x_2 < 0, \\
     2 x_1 x_2 & \text{if } x_1 \ge 0,\ x_2 \ge 0,\ x_1 + x_2 \le 1, \\
     2 x_1 + 2 x_2 - x_2^2 - x_1^2 - 1 & \text{if } x_1 \le 1,\ x_2 \le 1,\ x_1 + x_2 \ge 1, \\
     2 x_2 - x_2^2 & \text{if } x_1 \ge 1,\ 0 \le x_2 \le 1, \\
     2 x_1 - x_1^2 & \text{if } 0 \le x_1 \le 1,\ x_2 \ge 1, \\
     1 & \text{if } x_1 \ge 1,\ x_2 \ge 1
     \end{cases}$$
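
The piecewise cdf can be written down directly as a function. The sketch below is only an illustration: the name `triangle_lake_cdf` is hypothetical, and it assumes (as the cdf itself suggests) a point distributed uniformly over the triangle with vertices (0,0), (1,0) and (0,1).

```python
def triangle_lake_cdf(x1: float, x2: float) -> float:
    """Joint cdf F(x1, x2) for the triangle lake example.

    Assumption: the position is uniform on the triangle
    x1 >= 0, x2 >= 0, x1 + x2 <= 1 (density 2 inside, 0 outside).
    """
    if x1 < 0 or x2 < 0:
        return 0.0
    if x1 >= 1 and x2 >= 1:
        return 1.0
    if x1 >= 1:  # here 0 <= x2 < 1
        return 2 * x2 - x2 ** 2
    if x2 >= 1:  # here 0 <= x1 < 1
        return 2 * x1 - x1 ** 2
    if x1 + x2 <= 1:
        return 2 * x1 * x2
    # Remaining case: x1 <= 1, x2 <= 1, x1 + x2 >= 1
    return 2 * x1 + 2 * x2 - x2 ** 2 - x1 ** 2 - 1


if __name__ == "__main__":
    for point in [(0.5, 0.5), (0.75, 0.75), (2.0, 0.3)]:
        print(point, triangle_lake_cdf(*point))
```

The branches are ordered so that each condition from the slide is checked only after the cases that take priority over it have been ruled out.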

  4. Marginalization
     We can compute the marginal cdf from the joint cdf,
     $$F_X(x) = P(X \le x) = \lim_{y \to \infty} F_{X,Y}(x, y),$$
     or from the joint pdf,
     $$F_X(x) = P(X \le x) = \int_{u=-\infty}^{x} \int_{y=-\infty}^{\infty} f_{X,Y}(u, y) \, dy \, du.$$
     Differentiating, we obtain
     $$f_X(x) = \int_{y=-\infty}^{\infty} f_{X,Y}(x, y) \, dy.$$

  5. Marginalization
     Marginal pdf of a subvector $\vec{X}_{\mathcal{I}}$, $\mathcal{I} := \{i_1, i_2, \ldots, i_m\}$:
     $$f_{\vec{X}_{\mathcal{I}}}(\vec{x}_{\mathcal{I}}) = \int_{x_{j_1}} \int_{x_{j_2}} \cdots \int_{x_{j_{n-m}}} f_{\vec{X}}(\vec{x}) \, dx_{j_1} \, dx_{j_2} \cdots dx_{j_{n-m}},$$
     where $\{j_1, j_2, \ldots, j_{n-m}\} := \{1, 2, \ldots, n\} / \mathcal{I}$.

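  6. Example: Triangle lake (continued)
     Marginal cdf of $x_1$:
     $$F_{X_1}(x_1) = \lim_{x_2 \to \infty} F_{\vec{X}}(\vec{x}) = \begin{cases}
     0 & \text{if } x_1 < 0, \\
     2 x_1 - x_1^2 & \text{if } 0 \le x_1 \le 1, \\
     1 & \text{if } x_1 \ge 1
     \end{cases}$$
     Marginal pdf of $x_1$:
     $$f_{X_1}(x_1) = \frac{d F_{X_1}(x_1)}{d x_1} = \begin{cases}
     2 (1 - x_1) & \text{if } 0 \le x_1 \le 1, \\
     0 & \text{otherwise}
     \end{cases}$$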
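
As a rough numerical cross-check of the marginalization formula, one can integrate the joint pdf over $x_2$ and compare with $2(1 - x_1)$. This is a minimal sketch: the helper names are hypothetical, and the joint pdf (constant 2 on the triangle $x_1, x_2 \ge 0$, $x_1 + x_2 \le 1$) is an assumption inferred from the cdf given earlier.

```python
def joint_pdf(x1: float, x2: float) -> float:
    """Assumed joint pdf: uniform (value 2) on the unit right triangle."""
    return 2.0 if (x1 >= 0 and x2 >= 0 and x1 + x2 <= 1) else 0.0


def marginal_pdf_x1(x1: float, steps: int = 10_000) -> float:
    """Approximate f_{X1}(x1) by integrating the joint pdf over x2 in [0, 1].

    Uses the midpoint rule; the joint pdf vanishes outside [0, 1] in x2.
    """
    h = 1.0 / steps
    return sum(joint_pdf(x1, (k + 0.5) * h) for k in range(steps)) * h


if __name__ == "__main__":
    for x1 in (0.3, 0.75):
        print(x1, marginal_pdf_x1(x1), 2 * (1 - x1))
```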

  7. Joint conditional cdf and pdf given an event
     If we know that $(X, Y) \in \mathcal{S}$, where $\mathcal{S}$ is any Borel set in $\mathbb{R}^2$:
     $$F_{X,Y \mid (X,Y) \in \mathcal{S}}(x, y) := P(X \le x, Y \le y \mid (X, Y) \in \mathcal{S}) = \frac{P(X \le x, Y \le y, (X, Y) \in \mathcal{S})}{P((X, Y) \in \mathcal{S})} = \frac{\int_{u \le x,\, v \le y,\, (u,v) \in \mathcal{S}} f_{X,Y}(u, v) \, du \, dv}{\int_{(u,v) \in \mathcal{S}} f_{X,Y}(u, v) \, du \, dv}$$
     $$f_{X,Y \mid (X,Y) \in \mathcal{S}}(x, y) := \frac{\partial^2 F_{X,Y \mid (X,Y) \in \mathcal{S}}(x, y)}{\partial x \, \partial y}$$

  9. Conditional cdf and pdf
     Distribution of $Y$ given $X = x$? The event has zero probability!
     Define
     $$f_{Y|X}(y \mid x) := \frac{f_{X,Y}(x, y)}{f_X(x)}, \quad \text{if } f_X(x) > 0,$$
     $$F_{Y|X}(y \mid x) := \int_{u=-\infty}^{y} f_{Y|X}(u \mid x) \, du.$$
     Chain rule for continuous random variables:
     $$f_{X,Y}(x, y) = f_X(x) \, f_{Y|X}(y \mid x)$$

  10. Conditional cdf and pdf
     $$f_X(x) = \lim_{\Delta_x \to 0} \frac{P(x \le X \le x + \Delta_x)}{\Delta_x}$$
     $$f_{X,Y}(x, y) = \lim_{\Delta_x \to 0} \frac{1}{\Delta_x} \frac{\partial P(x \le X \le x + \Delta_x, Y \le y)}{\partial y}$$

  11. Conditional cdf and pdf
     $$F_{Y|X}(y \mid x) = \lim_{\Delta_x \to 0} \int_{u=-\infty}^{y} \frac{1}{P(x \le X \le x + \Delta_x)} \frac{\partial P(x \le X \le x + \Delta_x, Y \le u)}{\partial u} \, du$$
     $$= \lim_{\Delta_x \to 0} \frac{P(x \le X \le x + \Delta_x, Y \le y)}{P(x \le X \le x + \Delta_x)}$$
     $$= \lim_{\Delta_x \to 0} P(Y \le y \mid x \le X \le x + \Delta_x)$$

  12. Conditional pdf of a random subvector
     The conditional pdf of a random subvector $\vec{X}_{\mathcal{I}}$, $\mathcal{I} \subseteq \{1, 2, \ldots, n\}$, given another subvector $\vec{X}_{\{1,\ldots,n\}/\mathcal{I}}$ is
     $$f_{\vec{X}_{\mathcal{I}} \mid \vec{X}_{\{1,\ldots,n\}/\mathcal{I}}}\left(\vec{x}_{\mathcal{I}} \mid \vec{x}_{\{1,\ldots,n\}/\mathcal{I}}\right) := \frac{f_{\vec{X}}(\vec{x})}{f_{\vec{X}_{\{1,\ldots,n\}/\mathcal{I}}}\left(\vec{x}_{\{1,\ldots,n\}/\mathcal{I}}\right)}$$
     Chain rule for continuous random vectors:
     $$f_{\vec{X}}(\vec{x}) = f_{X_1}(x_1) \, f_{X_2|X_1}(x_2 \mid x_1) \cdots f_{X_n|X_1,\ldots,X_{n-1}}(x_n \mid x_1, \ldots, x_{n-1}) = \prod_{i=1}^{n} f_{X_i \mid \vec{X}_{\{1,\ldots,i-1\}}}\left(x_i \mid \vec{x}_{\{1,\ldots,i-1\}}\right)$$
     Any order works!

  13. Example: Triangle lake (continued) Conditioned on $\{x_1 = 0.75\}$, what are the pdf and cdf of $x_2$?

  17. Example: Triangle lake (continued)
     $$f_{X_2|X_1}(x_2 \mid x_1) = \frac{f_{\vec{X}}(\vec{x})}{f_{X_1}(x_1)} = \frac{1}{1 - x_1}, \quad 0 \le x_2 \le 1 - x_1$$
     $$F_{X_2|X_1}(x_2 \mid x_1) = \int_{-\infty}^{x_2} f_{X_2|X_1}(u \mid x_1) \, du = \frac{x_2}{1 - x_1}$$
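
To make the answer for $x_1 = 0.75$ concrete, the conditional pdf and cdf can be evaluated numerically. The function names below are hypothetical, and the sketch assumes the cdf expression $x_2 / (1 - x_1)$ applies on $0 \le x_2 \le 1 - x_1$, with the cdf equal to 0 below that range and 1 above it.

```python
def conditional_pdf_x2(x2: float, x1: float) -> float:
    """f_{X2|X1}(x2 | x1) = 1 / (1 - x1) on 0 <= x2 <= 1 - x1, else 0."""
    if not 0 <= x1 < 1:
        raise ValueError("the conditional pdf is only defined where f_X1(x1) > 0")
    return 1.0 / (1.0 - x1) if 0 <= x2 <= 1 - x1 else 0.0


def conditional_cdf_x2(x2: float, x1: float) -> float:
    """F_{X2|X1}(x2 | x1) = x2 / (1 - x1), clipped to the interval [0, 1]."""
    if not 0 <= x1 < 1:
        raise ValueError("the conditional cdf is only defined where f_X1(x1) > 0")
    if x2 < 0:
        return 0.0
    return min(x2 / (1.0 - x1), 1.0)


if __name__ == "__main__":
    # Given x1 = 0.75, x2 is uniform on [0, 0.25].
    print(conditional_pdf_x2(0.1, 0.75))
    print(conditional_cdf_x2(0.1, 0.75))
```

Given $x_1 = 0.75$, the conditional distribution of $x_2$ is uniform on $[0, 0.25]$, so the pdf is 4 on that interval.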

  18. Example: Desert
     - Car traveling through the desert
     - Time until the car breaks down: $T$
     - State of the motor: $M$
     - State of the road: $R$
     - Model:
       - $M$ uniform between 0 (no problem) and 1 (very bad)
       - $R$ uniform between 0 (no problem) and 1 (very bad)
       - $M$ and $R$ independent
       - $T$ exponential with parameter $M + R$

  23. Example: Desert
     Joint pdf?
     $$f_{M,R,T}(m, r, t) = f_M(m) \, f_{R|M}(r \mid m) \, f_{T|M,R}(t \mid m, r)$$
     $$= f_M(m) \, f_R(r) \, f_{T|M,R}(t \mid m, r) \quad \text{by independence}$$
     $$= \begin{cases}
     (m + r) \, e^{-(m+r) t} & \text{for } t \ge 0,\ 0 \le m \le 1,\ 0 \le r \le 1, \\
     0 & \text{otherwise}
     \end{cases}$$

  25. Example: Desert
     - Car breaks down after 15 min (0.25 h), $T = 0.25$
     - Road seems OK, $R = 0.2$
     - What was the state of the motor $M$?
     $$f_{M|R,T}(m \mid r, t) = \frac{f_{M,R,T}(m, r, t)}{f_{R,T}(r, t)}$$

  30. Example: Desert
     $$f_{R,T}(r, t) = \int_{m=0}^{1} f_{M,R,T}(m, r, t) \, dm$$
     $$= e^{-tr} \left( \int_{m=0}^{1} m \, e^{-tm} \, dm + r \int_{m=0}^{1} e^{-tm} \, dm \right)$$
     $$= e^{-tr} \left( \frac{1 - (1 + t) e^{-t}}{t^2} + \frac{r (1 - e^{-t})}{t} \right)$$
     $$= \frac{e^{-tr}}{t^2} \left( 1 + tr - e^{-t} (1 + t + tr) \right) \quad \text{for } t \ge 0,\ 0 \le r \le 1$$

  32. Example: Desert
     $$f_{M|R,T}(m \mid r, t) = \frac{f_{M,R,T}(m, r, t)}{f_{R,T}(r, t)} = \frac{(m + r) \, e^{-(m+r) t}}{\frac{e^{-tr}}{t^2} \left( 1 + tr - e^{-t} (1 + t + tr) \right)} = \frac{(m + r) \, t^2 e^{-tm}}{1 + tr - e^{-t} (1 + t + tr)}$$
     $$f_{M|R,T}(m \mid 0.2, 0.25) = \frac{(m + 0.2) \cdot 0.25^2 \, e^{-0.25 m}}{1 + 0.25 \cdot 0.2 - e^{-0.25} (1 + 0.25 + 0.25 \cdot 0.2)} = 1.66 \, (m + 0.2) \, e^{-0.25 m} \quad \text{for } 0 \le m \le 1$$

  33. State of the car [Figure: plot of $f_{M|R,T}(m \mid 0.2, 0.25)$ against $m$ for $0 \le m \le 1$.]
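
The constant 1.66 and the normalization of the conditional pdf of $M$ can be checked numerically. This is a minimal sketch with hypothetical function names; it assumes $t > 0$, since the closed-form expression divides by a quantity that vanishes at $t = 0$.

```python
import math


def posterior_motor_pdf(m: float, r: float, t: float) -> float:
    """f_{M|R,T}(m | r, t) for the desert example (assumes t > 0, 0 <= r <= 1)."""
    if not 0 <= m <= 1:
        return 0.0
    denom = 1 + t * r - math.exp(-t) * (1 + t + t * r)
    return (m + r) * t ** 2 * math.exp(-t * m) / denom


def posterior_constant(r: float, t: float) -> float:
    """Factor multiplying (m + r) * exp(-t * m) in the conditional pdf."""
    return t ** 2 / (1 + t * r - math.exp(-t) * (1 + t + t * r))


def integrate_posterior(r: float, t: float, steps: int = 10_000) -> float:
    """Midpoint-rule integral of the conditional pdf over m in [0, 1]."""
    h = 1.0 / steps
    return sum(posterior_motor_pdf((k + 0.5) * h, r, t) for k in range(steps)) * h


if __name__ == "__main__":
    print(round(posterior_constant(0.2, 0.25), 2))  # constant quoted on the slide
    print(integrate_posterior(0.2, 0.25))           # should be close to 1
```

Because the conditional pdf increases with $m$ on $[0, 1]$ for these values, the early breakdown makes a bad motor state more plausible than a good one.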

  34. Independent continuous random variables
     Two random variables $X$ and $Y$ are independent if and only if
     $$F_{X,Y}(x, y) = F_X(x) \, F_Y(y) \quad \text{for all } (x, y) \in \mathbb{R}^2.$$
     Equivalently,
     $$F_{X|Y}(x \mid y) = F_X(x), \qquad F_{Y|X}(y \mid x) = F_Y(y) \quad \text{for all } (x, y) \in \mathbb{R}^2.$$

  35. Independent continuous random variables
     Two random variables $X$ and $Y$ with joint pdf $f_{X,Y}$ are independent if and only if
     $$f_{X,Y}(x, y) = f_X(x) \, f_Y(y) \quad \text{for all } (x, y) \in \mathbb{R}^2.$$
     Equivalently,
     $$f_{X|Y}(x \mid y) = f_X(x), \qquad f_{Y|X}(y \mid x) = f_Y(y) \quad \text{for all } (x, y) \in \mathbb{R}^2.$$

  36. Mutually independent continuous random variables
     The components of a random vector $\vec{X}$ are mutually independent if and only if
     $$F_{\vec{X}}(\vec{x}) = \prod_{i=1}^{n} F_{X_i}(x_i).$$
     Equivalently,
     $$f_{\vec{X}}(\vec{x}) = \prod_{i=1}^{n} f_{X_i}(x_i).$$

  37. Mutually conditionally independent random variables
     The components of a subvector $\vec{X}_{\mathcal{I}}$, $\mathcal{I} \subseteq \{1, 2, \ldots, n\}$, are mutually conditionally independent given another subvector $\vec{X}_{\mathcal{J}}$, $\mathcal{J} \subseteq \{1, 2, \ldots, n\}$, if and only if
     $$F_{\vec{X}_{\mathcal{I}} \mid \vec{X}_{\mathcal{J}}}(\vec{x}_{\mathcal{I}} \mid \vec{x}_{\mathcal{J}}) = \prod_{i \in \mathcal{I}} F_{X_i \mid \vec{X}_{\mathcal{J}}}(x_i \mid \vec{x}_{\mathcal{J}}).$$
     Equivalently,
     $$f_{\vec{X}_{\mathcal{I}} \mid \vec{X}_{\mathcal{J}}}(\vec{x}_{\mathcal{I}} \mid \vec{x}_{\mathcal{J}}) = \prod_{i \in \mathcal{I}} f_{X_i \mid \vec{X}_{\mathcal{J}}}(x_i \mid \vec{x}_{\mathcal{J}}).$$
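
The factorization criterion gives a quick way to see that the two coordinates in the triangle lake example are not independent. The sketch below uses hypothetical helper names and two assumptions: the joint pdf is constant (value 2) on the triangle $x_1, x_2 \ge 0$, $x_1 + x_2 \le 1$, and the marginal of $x_2$ equals $2(1 - x_2)$ by symmetry with the marginal of $x_1$.

```python
def joint_pdf(x1: float, x2: float) -> float:
    """Assumed joint pdf of the triangle lake example (2 on the triangle)."""
    return 2.0 if (x1 >= 0 and x2 >= 0 and x1 + x2 <= 1) else 0.0


def marginal_pdf(x: float) -> float:
    """Marginal pdf 2 (1 - x) on [0, 1]; the same for both coordinates by symmetry."""
    return 2.0 * (1.0 - x) if 0 <= x <= 1 else 0.0


def factorizes_at(x1: float, x2: float, tol: float = 1e-12) -> bool:
    """Check f(x1, x2) == f_X1(x1) * f_X2(x2) at a single point."""
    return abs(joint_pdf(x1, x2) - marginal_pdf(x1) * marginal_pdf(x2)) <= tol


if __name__ == "__main__":
    # The point (0.75, 0.75) lies outside the triangle, so the joint pdf is 0,
    # while the product of the marginals is 0.5 * 0.5 = 0.25.
    print(factorizes_at(0.75, 0.75))
```

A single point where the factorization fails on a region of positive area is enough to rule out independence here.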

  38. Functions of random variables
     $U = g(X, Y)$ and $V = h(X, Y)$
     $$F_{U,V}(u, v) = P(U \le u, V \le v) = P(g(X, Y) \le u, h(X, Y) \le v) = \int_{\{(x,y) \,\mid\, g(x,y) \le u,\ h(x,y) \le v\}} f_{X,Y}(x, y) \, dx \, dy$$

  44. Sum of independent random variables
     $X$ and $Y$ are independent random variables. What is the pdf of $Z = X + Y$?
     $$F_Z(z) = P(X + Y \le z) = \int_{y=-\infty}^{\infty} \int_{x=-\infty}^{z-y} f_X(x) \, f_Y(y) \, dx \, dy = \int_{y=-\infty}^{\infty} F_X(z - y) \, f_Y(y) \, dy$$
     $$f_Z(z) = \frac{d}{dz} \lim_{u \to \infty} \int_{y=-u}^{u} F_X(z - y) \, f_Y(y) \, dy = \int_{y=-\infty}^{\infty} f_X(z - y) \, f_Y(y) \, dy$$
     Convolution of the individual pdfs
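
The convolution integral can be approximated numerically when no closed form is at hand. The sketch below is one simple way to do it, using a midpoint rule over a finite interval; `convolve_pdfs` and `uniform01` are hypothetical names, and the caller is assumed to pass an interval `[lo, hi]` that contains the support of $f_Y$.

```python
from typing import Callable


def convolve_pdfs(
    f_x: Callable[[float], float],
    f_y: Callable[[float], float],
    z: float,
    lo: float,
    hi: float,
    steps: int = 20_000,
) -> float:
    """Approximate f_Z(z) = integral of f_X(z - y) f_Y(y) dy over y in [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        y = lo + (k + 0.5) * h  # midpoint of the k-th subinterval
        total += f_x(z - y) * f_y(y)
    return total * h


def uniform01(x: float) -> float:
    """pdf of a uniform random variable on [0, 1]."""
    return 1.0 if 0 <= x <= 1 else 0.0


if __name__ == "__main__":
    # The sum of two independent uniform(0, 1) variables has a triangular pdf.
    for z in (0.5, 1.0, 1.5):
        print(z, convolve_pdfs(uniform01, uniform01, z, 0.0, 1.0))
```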

  45. Example: Coffee beans
     - Company buys coffee beans from two local producers
     - Beans from Colombia: $C$ tons/year
     - Beans from Vietnam: $V$ tons/year
     - Model:
       - $C$ uniform between 0 and 1
       - $V$ uniform between 0 and 2
       - $C$ and $V$ independent
     - What is the distribution of the total amount of beans $B$?

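  49. Example: Coffee beans
     $$f_B(b) = \int_{u=-\infty}^{\infty} f_C(b - u) \, f_V(u) \, du = \frac{1}{2} \int_{u=0}^{2} f_C(b - u) \, du$$
     $$= \begin{cases}
     \frac{1}{2} \int_{u=0}^{b} du = \frac{b}{2} & \text{if } 0 \le b \le 1, \\
     \frac{1}{2} \int_{u=b-1}^{b} du = \frac{1}{2} & \text{if } 1 \le b \le 2, \\
     \frac{1}{2} \int_{u=b-1}^{2} du = \frac{3 - b}{2} & \text{if } 2 \le b \le 3
     \end{cases}$$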

  50. Example: Coffee beans [Figure: plots of the pdfs $f_C$, $f_V$ and $f_B$ on the interval from 0 to 3.]
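
The trapezoidal pdf of $B$ can be sanity-checked by simulation. The following is a rough sketch with hypothetical function names and an arbitrary fixed seed: it draws $C$ and $V$ from their uniform models and compares the empirical value of $P(B \le 1)$ with $\int_0^1 b/2 \, db = 1/4$.

```python
import random


def coffee_total_pdf(b: float) -> float:
    """pdf of B = C + V with C uniform on [0, 1] and V uniform on [0, 2]."""
    if 0 <= b <= 1:
        return b / 2
    if 1 < b <= 2:
        return 0.5
    if 2 < b <= 3:
        return (3 - b) / 2
    return 0.0


def simulate_fraction_below(threshold: float, n: int = 200_000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(C + V <= threshold) with independent C and V."""
    rng = random.Random(seed)  # fixed seed only to make the run repeatable
    hits = 0
    for _ in range(n):
        total = rng.uniform(0, 1) + rng.uniform(0, 2)
        if total <= threshold:
            hits += 1
    return hits / n


if __name__ == "__main__":
    print(simulate_fraction_below(1.0))  # should be close to 1/4
    print(coffee_total_pdf(1.5))
```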

  51. Gaussian random vector
     A Gaussian random vector $\vec{X}$ has a joint pdf of the form
     $$f_{\vec{X}}(\vec{x}) = \frac{1}{\sqrt{(2\pi)^n |\Sigma|}} \exp\left( -\frac{1}{2} (\vec{x} - \vec{\mu})^T \Sigma^{-1} (\vec{x} - \vec{\mu}) \right)$$
     where the mean $\vec{\mu} \in \mathbb{R}^n$ and the covariance matrix $\Sigma$ is a symmetric positive definite matrix

  52. Linear transformation of Gaussian random vectors
     $\vec{X}$ is a Gaussian random vector of dimension $n$ with mean $\vec{\mu}$ and covariance matrix $\Sigma$.
     For any matrix $A \in \mathbb{R}^{m \times n}$ and $\vec{b} \in \mathbb{R}^m$,
     $$\vec{Y} = A \vec{X} + \vec{b}$$
     is Gaussian with mean $A \vec{\mu} + \vec{b}$ and covariance matrix $A \Sigma A^T$
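
As a small worked instance of this rule, the mean and covariance of $A\vec{X} + \vec{b}$ can be computed with plain list arithmetic. The helper names and the numerical values of $\vec{\mu}$, $\Sigma$, $A$ and $\vec{b}$ below are illustrative assumptions, not taken from the slides.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def mat_vec(a: Matrix, v: Vector) -> Vector:
    """Matrix-vector product a @ v."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    """Matrix-matrix product a @ b."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def transform_gaussian(mu: Vector, sigma: Matrix, a: Matrix, b: Vector):
    """Return (A mu + b, A Sigma A^T), the parameters of Y = A X + b."""
    mean_y = [m + b_i for m, b_i in zip(mat_vec(a, mu), b)]
    cov_y = mat_mul(mat_mul(a, sigma), transpose(a))
    return mean_y, cov_y


if __name__ == "__main__":
    mu = [1.0, 2.0]                    # illustrative mean
    sigma = [[2.0, 1.0], [1.0, 3.0]]   # illustrative covariance matrix
    a = [[1.0, 1.0]]                   # Y = X_1 + X_2 + 0.5
    b = [0.5]
    print(transform_gaussian(mu, sigma, a, b))
```

With this choice of $A$, the variance of $Y$ is the sum of all four entries of $\Sigma$, that is, the two variances plus twice the covariance.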

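  53. Marginal distributions are Gaussian
     Gaussian random vector
     $$\vec{Z} := \begin{bmatrix} \vec{X} \\ \vec{Y} \end{bmatrix}, \quad \text{with mean } \vec{\mu} := \begin{bmatrix} \vec{\mu}_{\vec{X}} \\ \vec{\mu}_{\vec{Y}} \end{bmatrix}$$
     and covariance matrix
     $$\Sigma_{\vec{Z}} = \begin{bmatrix} \Sigma_{\vec{X}} & \Sigma_{\vec{X}\vec{Y}} \\ \Sigma_{\vec{X}\vec{Y}}^T & \Sigma_{\vec{Y}} \end{bmatrix}$$
     $\vec{X}$ is a Gaussian random vector with mean $\vec{\mu}_{\vec{X}}$ and covariance matrix $\Sigma_{\vec{X}}$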

  54. Marginal distributions are Gaussian [Figure: surface plot of the joint pdf $f_{X,Y}(x, y)$ together with the marginal pdfs $f_X(x)$ and $f_Y(y)$.]

  55. Discrete random variables; Continuous random variables; Joint distributions of discrete and continuous random variables

  56. Discrete and continuous random variables
     How do we model the relation between a continuous random variable $C$ and a discrete random variable $D$?
     Conditional cdf and pdf of $C$ given $D$:
     $$F_{C|D}(c \mid d) := P(C \le c \mid D = d)$$
     $$f_{C|D}(c \mid d) := \frac{d F_{C|D}(c \mid d)}{dc}$$
     By the Law of Total Probability,
     $$F_C(c) = \sum_{d \in R_D} p_D(d) \, F_{C|D}(c \mid d)$$
     $$f_C(c) = \sum_{d \in R_D} p_D(d) \, f_{C|D}(c \mid d)$$

  57. Mixture models Data are drawn from a continuous distribution whose parameters are chosen from a discrete set. Important example: Gaussian mixture models

  58. Grizzlies in Yellowstone
     Model for the weight of grizzly bears in Yellowstone:
     - Males: Gaussian with $\mu := 240$ kg and $\sigma := 40$ kg
     - Females: Gaussian with $\mu := 140$ kg and $\sigma := 20$ kg
     - There are about the same number of females and males

  62. Grizzlies in Yellowstone
     The distribution of the weight of all bears $W$ can be modeled as a Gaussian mixture with two random variables: $S$ (sex) and $W$ (weight)
     $$f_W(w) = \sum_{s=0}^{1} p_S(s) \, f_{W|S}(w \mid s) = \frac{1}{2 \sqrt{2\pi}} \left( \frac{e^{-\frac{(w - 240)^2}{3200}}}{40} + \frac{e^{-\frac{(w - 140)^2}{800}}}{20} \right)$$
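
The mixture pdf and the two-stage sampling it describes (first draw the sex, then the weight) can be sketched as follows. The function names are hypothetical, and the sketch assumes $p_S(0) = p_S(1) = 1/2$, reading "about the same number" as exactly equal proportions.

```python
import math
import random


def gaussian_pdf(x: float, mu: float, sigma: float) -> float:
    """pdf of a Gaussian with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def bear_weight_pdf(w: float) -> float:
    """Mixture pdf f_W(w): equal weights on the male and female components."""
    return 0.5 * gaussian_pdf(w, 240.0, 40.0) + 0.5 * gaussian_pdf(w, 140.0, 20.0)


def sample_bear_weight(rng: random.Random) -> float:
    """Draw the sex first (probability 1/2 each), then the weight given the sex."""
    if rng.random() < 0.5:
        return rng.gauss(240.0, 40.0)  # male component
    return rng.gauss(140.0, 20.0)      # female component


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the run repeatable
    weights = [sample_bear_weight(rng) for _ in range(50_000)]
    print(sum(weights) / len(weights))  # should be near (240 + 140) / 2 = 190
    print(bear_weight_pdf(140.0), bear_weight_pdf(240.0))
```

The resulting pdf is bimodal, with a taller, narrower bump near 140 kg and a lower, wider one near 240 kg, so a single Gaussian would describe it poorly.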
