multivariate probability distributions

Multivariate probability distributions September 1, 2017 STAT 151 - PowerPoint PPT Presentation



  1. Outline Background Discrete bivariate distribution Continuous bivariate distribution Multivariate probability distributions September 1, 2017 STAT 151 Class 2 Slide 1

  2. Outline of Topics
     1. Background
     2. Discrete bivariate distribution
     3. Continuous bivariate distribution

  3. Multivariate analysis
     - When one measurement is made on each observation in a dataset, univariate analysis is used, e.g., survival time of patients.
     - If more than one measurement is made on each observation, multivariate analysis is used, e.g., survival time, age, cancer subtype, size of cancer, etc.
     - We focus on bivariate analysis, where exactly two measurements are made on each observation.
     - The two measurements will be called $X$ and $Y$. Since $X$ and $Y$ are obtained for each observation, the data for one observation is the pair $(X, Y)$.

  4. Bivariate data
     Bivariate data can be represented as:

         Observation   X       Y
         1             $X_1$   $Y_1$
         2             $X_2$   $Y_2$
         3             $X_3$   $Y_3$
         4             $X_4$   $Y_4$
         ...           ...     ...
         n             $X_n$   $Y_n$

     - Each observation is a pair of values, e.g., $(X_4, Y_4)$ is the 4th observation.
     - $X$ and $Y$ can be both discrete, both continuous, or one discrete and one continuous. We focus on the first two cases.
     - Some examples:
       - $X$ (survived > 1 year) and $Y$ (cancer subtype) of each patient in a sample
       - $X$ (length of job training) and $Y$ (time to find a job) for each unemployed individual in a job training program
       - $X$ (income) and $Y$ (happiness) for each individual in a survey

  5. Bivariate distributions
     - We can study $X$ and $Y$ separately, i.e., we can analyse $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_n$ separately using a probability distribution function, probability density function or cumulative distribution function. These are examples of univariate analyses.
     - When $X$ and $Y$ are studied separately, their distributions and probabilities are called marginal.
     - When $X$ and $Y$ are considered together, many interesting questions can be answered, e.g.,
       - Is subtype I cancer ($X$) associated with a higher chance of survival beyond 1 year ($Y$)?
       - Does longer job training ($X$) result in shorter time to find a job ($Y$)?
       - Do people with higher income ($X$) lead a happier life ($Y$)?
     - The joint behavior of $X$ and $Y$ is summarized in a bivariate probability distribution. A bivariate distribution is an example of a joint distribution.

  6. Review of a discrete distribution: drawing a marble from an urn
     [Figure: an urn holding five marbles of two kinds, with a chart whose Probability axis shows 3/5 for one kind and 2/5 for the other. The marble images did not survive extraction.]
     - The probability distribution tells us that the long-run frequency of drawing the kind with probability 3/5 is higher than that of the kind with probability 2/5.
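
The long-run frequency reading of a probability distribution can be illustrated with a small simulation. This is only a sketch: the labels "A" and "B" are placeholders for the two kinds of marble (the slide's pictures are not available), and the seed and number of draws are arbitrary choices.

```python
import random

# Placeholder labels: "A" is the kind drawn with probability 3/5,
# "B" the kind drawn with probability 2/5.
urn = ["A", "A", "A", "B", "B"]

rng = random.Random(0)   # fixed seed so the run is repeatable
n_draws = 100_000        # arbitrary; larger values track the probabilities more closely

# Each draw picks one of the five marbles with equal chance.
draws = [rng.choice(urn) for _ in range(n_draws)]

freq_a = draws.count("A") / n_draws
freq_b = draws.count("B") / n_draws
print(f"relative frequency of A: {freq_a:.3f} (probability 0.6)")
print(f"relative frequency of B: {freq_b:.3f} (probability 0.4)")
```

With many draws the relative frequencies should settle near 3/5 and 2/5, which is what the distribution describes.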

  7. Discrete bivariate distribution: drawing 2 marbles with replacement
     [The marble images did not survive extraction. Below, A denotes the kind drawn with probability 3/5 and B the kind drawn with probability 2/5.]

                          Draw 2 (Y)
                          A        B
         Draw 1 (X)   A   9/25     6/25
                      B   6/25     4/25

     - $P(X = A \text{ and } Y = A) = P(A, A) = \left(\frac{3}{5}\right)\left(\frac{3}{5}\right) = \frac{9}{25}$
     - $P(X = B \text{ and } Y = A) = P(B, A) = \left(\frac{2}{5}\right)\left(\frac{3}{5}\right) = \frac{6}{25}$
     - $P(A, A) + P(A, B) + P(B, A) + P(B, B) = \frac{9}{25} + \frac{6}{25} + \frac{6}{25} + \frac{4}{25} = 1$
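
The with-replacement table can be rebuilt by multiplying single-draw probabilities. A minimal sketch, assuming placeholder labels "A" (probability 3/5) and "B" (probability 2/5) for the two kinds of marble, whose pictures are not available here:

```python
from fractions import Fraction
from itertools import product

# Placeholder labels for the two kinds of marble.
p_single = {"A": Fraction(3, 5), "B": Fraction(2, 5)}

# With replacement, the second draw is made from the same urn as the first,
# so each joint probability is a product of two single-draw probabilities.
joint = {(x, y): p_single[x] * p_single[y] for x, y in product(p_single, repeat=2)}

for (x, y), p in joint.items():
    print(f"P(X={x}, Y={y}) = {p}")
print("total =", sum(joint.values()))
```

Exact fractions are used instead of floats so the four cells can be compared directly with 9/25, 6/25, 6/25 and 4/25.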

  8. Discrete bivariate distribution: drawing 2 marbles without replacement
     [The marble images did not survive extraction. Below, A denotes the kind with 3 marbles in the urn and B the kind with 2 marbles.]

                          Draw 2 (Y)
                          A        B
         Draw 1 (X)   A   6/20     6/20
                      B   6/20     2/20

     - $P(X = A \text{ and } Y = A) = P(A, A) = \left(\frac{2}{4}\right)\left(\frac{3}{5}\right) = \frac{6}{20}$
     - $P(X = B \text{ and } Y = A) = P(B, A) = \left(\frac{3}{4}\right)\left(\frac{2}{5}\right) = \frac{6}{20}$
     - $P(A, A) + P(A, B) + P(B, A) + P(B, B) = \frac{6}{20} + \frac{6}{20} + \frac{6}{20} + \frac{2}{20} = 1$
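
Without replacement, the table can be checked by listing every equally likely ordered pair of marbles. A sketch under the assumption that the urn holds three marbles of one kind and two of the other; the labels "A" and "B" are placeholders, not the slide's own.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

# Placeholder labels: three marbles of kind "A", two of kind "B".
urn = ["A", "A", "A", "B", "B"]

# permutations works on positions, so it yields all 5 * 4 = 20 ordered
# pairs of distinct marbles, each equally likely.
ordered_draws = list(permutations(urn, 2))
counts = Counter(ordered_draws)

joint = {pair: Fraction(n, len(ordered_draws)) for pair, n in counts.items()}

for pair, n in sorted(counts.items()):
    print(f"P(X={pair[0]}, Y={pair[1]}) = {n}/{len(ordered_draws)}")
print("total =", sum(joint.values()))
```

Counting pairs avoids writing the conditional probabilities by hand, yet gives the same cells as the products such as (3/5)(2/4).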

  9. Discrete bivariate distributions
     - A discrete bivariate distribution is used to model the joint behavior of two variables, $X$ and $Y$, both of which are discrete.
     - $X$ and $Y$ are discrete random variables if there is a countable number of possible values for $X$: $a_1, a_2, \ldots, a_k$ and for $Y$: $b_1, b_2, \ldots, b_l$.
     - $(X, Y)$ is the unknown outcome if we randomly draw an observation from the population.
     - $P(X = a_i, Y = b_j)$ is the joint probability distribution function: the probability of observing $X = a_i$ and $Y = b_j$ together.
     - A valid joint probability distribution function must satisfy the following rules:
       - $0 \le P(X = a_i, Y = b_j) \le 1$ for every pair $(a_i, b_j)$.
       - We are certain that one of the pairs will appear, therefore
         $$P[(X = a_1, Y = b_1) \text{ or } (X = a_2, Y = b_1) \text{ or } \ldots \text{ or } (X = a_k, Y = b_l)] = P(X = a_1, Y = b_1) + P(X = a_2, Y = b_1) + \ldots + P(X = a_k, Y = b_l) = 1$$
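
The two validity rules translate directly into a check on a table of joint probabilities. A minimal sketch: the function name `is_valid_joint_pmf` is made up for illustration, the valid table uses the formula (a + b)/48 from Example 1 of these slides, and the invalid table (denominator 40) is a hypothetical counterexample.

```python
from fractions import Fraction

def is_valid_joint_pmf(joint):
    """Check both rules: every value lies in [0, 1] and the values sum to 1."""
    values = list(joint.values())
    in_range = all(0 <= p <= 1 for p in values)
    sums_to_one = sum(values) == 1
    return in_range and sums_to_one

support = range(4)  # a, b = 0, 1, 2, 3

# Joint probabilities (a + b)/48, as in Example 1.
valid = {(a, b): Fraction(a + b, 48) for a in support for b in support}

# Hypothetical table: each entry is in [0, 1] but the total is 48/40.
invalid = {(a, b): Fraction(a + b, 40) for a in support for b in support}

print(is_valid_joint_pmf(valid))
print(is_valid_joint_pmf(invalid))
```

Exact fractions make the sum-to-one test safe; with floats a small tolerance would be needed instead of `==`.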

  10. Discrete joint distribution: Example 1
      $$P(X = a, Y = b) = \frac{a + b}{48}, \quad \text{if } a, b = 0, 1, 2, 3$$

                      Y = 0    Y = 1    Y = 2    Y = 3    P(X = a)
          X = 0       0        1/48     2/48     3/48     6/48
          X = 1       1/48     2/48     3/48     4/48     10/48
          X = 2       2/48     3/48     4/48     5/48     14/48
          X = 3       3/48     4/48     5/48     6/48     18/48
          P(Y = b)    6/48     10/48    14/48    18/48    1

      [Figure: plot of $P(X, Y)$ over $X = 0, 1, 2, 3$ and $Y = 0, 1, 2, 3$.]
      - $P(X = a, Y = b)$ are the joint probabilities.
      - $P(X = a)$ and $P(Y = b)$ are called marginal probabilities. $P(X = a)$ gives us information about $X$ ignoring $Y$, and $P(Y = b)$ gives us information about $Y$ ignoring $X$. For example, $P(X = 3) = \frac{3}{48} + \frac{4}{48} + \frac{5}{48} + \frac{6}{48} = \frac{18}{48}$.
      - We can always find marginal probabilities from joint probabilities (as in Example 1) but not the other way around unless $X$ and $Y$ are independent (see next slide).
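
The marginal probabilities in Example 1 are row and column sums of the joint table, which can be sketched as follows. The variable names are illustrative choices; only the formula (a + b)/48 comes from the slide.

```python
from fractions import Fraction

values = range(4)  # a, b = 0, 1, 2, 3

# Joint probabilities P(X = a, Y = b) = (a + b)/48 from Example 1.
joint = {(a, b): Fraction(a + b, 48) for a in values for b in values}

# Marginal of X: sum over every value of Y (one table row).
p_x = {a: sum(joint[a, b] for b in values) for a in values}
# Marginal of Y: sum over every value of X (one table column).
p_y = {b: sum(joint[a, b] for a in values) for b in values}

for a in values:
    print(f"P(X={a}) = {p_x[a].numerator * (48 // p_x[a].denominator)}/48")
print("sum of marginals of X =", sum(p_x.values()))
```

The print statement rescales each reduced fraction back to a denominator of 48 so the output lines up with the table.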

  11. Discrete joint distribution: Independence
      - $X$ and $Y$ are independent if, for all $X = a$, $Y = b$:
        $$P(X = a \mid Y = b) = P(X = a) \iff P(Y = b \mid X = a) = P(Y = b) \iff P(X = a, Y = b) = P(X = a)\,P(Y = b)$$
      - $P(Y = b \mid X = a)$ and $P(X = a \mid Y = b)$ are conditional probabilities.
      - If $X$ and $Y$ are independent, then we can easily
        (a) calculate $P(X = a, Y = b)$ by $P(X = a)\,P(Y = b)$
        (b) write $P(X = a \mid Y = b)$ as $P(X = a)$
        (c) write $P(Y = b \mid X = a)$ as $P(Y = b)$
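
The equivalence between the conditional and the product forms of independence follows from the definition of conditional probability, which the slide uses without stating. A short derivation, assuming $P(Y = b) > 0$ so that the conditional probability is defined:

```latex
\begin{align*}
P(X = a \mid Y = b) &= \frac{P(X = a, Y = b)}{P(Y = b)} \\[4pt]
P(X = a \mid Y = b) = P(X = a)
  &\iff \frac{P(X = a, Y = b)}{P(Y = b)} = P(X = a) \\
  &\iff P(X = a, Y = b) = P(X = a)\,P(Y = b)
\end{align*}
```

The same steps with the roles of $X$ and $Y$ swapped (assuming $P(X = a) > 0$) give the condition on $P(Y = b \mid X = a)$.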

  12. Discrete joint distribution: Example 1 (cont'd)
      - Try
        $$P(Y = 1 \mid X = 3) = \frac{P(Y = 1, X = 3)}{P(X = 3)} = \frac{4/48}{18/48} = \frac{4}{18} \ne P(Y = 1) = \frac{10}{48}$$
      - Alternatively, try
        $$P(X = 3, Y = 1) = \frac{4}{48} \ne P(X = 3)\,P(Y = 1) = \frac{18}{48} \times \frac{10}{48}$$
      - Either way is sufficient to show that $X$ and $Y$ are not independent. A single combination that violates the equality is enough; in this example every combination of $X = a$, $Y = b$ violates it, so we can try any of them to disprove independence.
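
Both checks on this slide can be reproduced with exact arithmetic. A sketch, assuming the Example 1 formula P(X = a, Y = b) = (a + b)/48 for a, b = 0, 1, 2, 3; the variable names are illustrative.

```python
from fractions import Fraction

values = range(4)
joint = {(a, b): Fraction(a + b, 48) for a in values for b in values}

p_x = {a: sum(joint[a, b] for b in values) for a in values}
p_y = {b: sum(joint[a, b] for a in values) for b in values}

# First check: conditional probability versus marginal probability.
cond_y1_given_x3 = joint[3, 1] / p_x[3]      # (4/48) / (18/48)
print(cond_y1_given_x3 == p_y[1])            # False means not independent

# Second check: joint probability versus product of marginals.
print(joint[3, 1] == p_x[3] * p_y[1])        # False means not independent

# Cells where the product rule fails.
failing = [(a, b) for a in values for b in values
           if joint[a, b] != p_x[a] * p_y[b]]
print(len(failing), "of", len(joint), "cells violate the product rule")
```

The last lines count how many cells fail, which shows why any combination can serve as the counterexample in this particular table.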

  13. Discrete joint distribution: Example 2
      $$P(X = a, Y = b) = \frac{ab}{18}, \quad \text{if } a = 1, 2, 3;\ b = 1, 2$$

                      Y = 1    Y = 2    P(X = a)
          X = 1       1/18     2/18     3/18
          X = 2       2/18     4/18     6/18
          X = 3       3/18     6/18     9/18
          P(Y = b)    6/18     12/18    1

      [Figure: plot of $P(X, Y)$ over $X = 1, 2, 3$ and $Y = 1, 2$.]
      - For example, $P(X = 3) = \frac{3}{18} + \frac{6}{18} = \frac{9}{18}$.
      - $P(X = 1, Y = 1) = \frac{1}{18}$ and $P(X = 1)\,P(Y = 1) = \frac{3}{18} \times \frac{6}{18} = \frac{18}{18^2} = \frac{1}{18}$
      - ...
      - $P(X = 3, Y = 2) = \frac{6}{18}$ and $P(X = 3)\,P(Y = 2) = \frac{9}{18} \times \frac{12}{18} = \frac{108}{18^2} = \frac{6}{18}$
      - To show independence, we must show $P(X = a, Y = b) = P(X = a)\,P(Y = b)$ for all combinations of $X = a$, $Y = b$.
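
Checking every combination by hand is tedious, so the all-cells test can be sketched as a small function. The name `is_independent` is a made-up helper, and it assumes the joint table lists every combination of the values of X and Y; the two tables are built from the formulas ab/18 (Example 2) and (a + b)/48 (Example 1).

```python
from fractions import Fraction

def is_independent(joint):
    """True if P(X=a, Y=b) == P(X=a) * P(Y=b) for every cell of the table.

    Assumes `joint` has an entry for every (a, b) combination.
    """
    xs = sorted({a for a, _ in joint})
    ys = sorted({b for _, b in joint})
    p_x = {a: sum(joint[a, b] for b in ys) for a in xs}   # row sums
    p_y = {b: sum(joint[a, b] for a in xs) for b in ys}   # column sums
    return all(joint[a, b] == p_x[a] * p_y[b] for a in xs for b in ys)

example_2 = {(a, b): Fraction(a * b, 18) for a in (1, 2, 3) for b in (1, 2)}
example_1 = {(a, b): Fraction(a + b, 48) for a in range(4) for b in range(4)}

print("Example 2 independent:", is_independent(example_2))
print("Example 1 independent:", is_independent(example_1))
```

`all(...)` mirrors the asymmetry in the slides: independence needs every cell to satisfy the product rule, while one failing cell is enough to rule it out.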

  14. Probability under a univariate probability density function (PDF)
      [Figure: plot of the PDF $f(x) = 1.5 e^{-1.5x}$ against $X$, with the area corresponding to $P(X \le 1)$ marked.]
      - $P(X \le 1)$ can be found by integration:
        $$P(X \le 1) = \int_0^1 f(x)\,dx = \int_0^1 1.5 e^{-1.5x}\,dx = \left[-e^{-1.5x}\right]_0^1 = 1 - e^{-1.5} \approx 0.777$$
      - It turns out that, for any $x > 0$, $P(X \le x) = 1 - e^{-1.5x}$.
      - We often write $P(X \le x)$ as $F(x)$ and call $F(x)$ the cumulative distribution function (CDF).
      - $F(1)$ is a probability, but $f(1)$ (a point on $f(x)$) is not a probability.
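
The integral can be cross-checked numerically against the closed form $1 - e^{-1.5x}$. A sketch using a plain midpoint rule; the function names and the number of steps are illustrative choices, not part of the slides.

```python
import math

RATE = 1.5  # the constant in f(x) = 1.5 * exp(-1.5 x)

def pdf(x):
    """Density f(x); zero for negative x."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0

def cdf(x):
    """Closed form F(x) = 1 - exp(-1.5 x) for x >= 0."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0

def midpoint_integral(f, lower, upper, steps=10_000):
    """Approximate the integral of f over [lower, upper] with the midpoint rule."""
    width = (upper - lower) / steps
    return width * sum(f(lower + (i + 0.5) * width) for i in range(steps))

area = midpoint_integral(pdf, 0.0, 1.0)
print(f"numerical P(X <= 1) = {area:.4f}")
print(f"closed form F(1)    = {cdf(1.0):.4f}")
print(f"density at 0: f(0)  = {pdf(0.0)}")
```

The last line shows why a density value is not a probability: $f(0) = 1.5$ is larger than 1, whereas the area under the curve over any interval never is.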

  15. The univariate cumulative distribution function (CDF)
      [Figure: the PDF $f(x)$ with the area $P(X \le x) \equiv F(x)$ marked, and the CDF $F(x)$ plotted against $X$ with a reference line at 1.]
      - $F(x)$ can be used to find $P(X \le x)$ for any $x$, e.g., $F(1) = P(X \le 1)$.
      - A plot of $F(x)$ is a convenient way of finding probabilities. A probability is found by drawing a vertical line from $x$ on the horizontal axis until it meets the CDF, and then drawing a horizontal line until it meets the vertical axis.
      - All CDF plots are bounded above by 1, which they approach as $x$ increases: $F(x) \le 1$ because it is a probability.
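
Reading values off a CDF plot corresponds to evaluating $F(x)$ at chosen points. A sketch that assumes the plotted CDF is $F(x) = 1 - e^{-1.5x}$, the same example used for the PDF slide; this slide does not state which distribution its plot shows, and the grid of x values is arbitrary.

```python
import math

def cdf(x, rate=1.5):
    """Assumed CDF F(x) = 1 - exp(-rate * x) for x >= 0, and 0 otherwise."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0

grid = [0.5, 1, 2, 4, 8]            # arbitrary points on the horizontal axis
heights = [cdf(x) for x in grid]    # what the horizontal line meets on the vertical axis

for x, h in zip(grid, heights):
    print(f"F({x}) = P(X <= {x}) = {h:.4f}")
```

The printed values rise with $x$ and approach 1 without exceeding it, which is the behavior the reference line at 1 marks on the plot.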
