Multivariate probability distributions September 1, 2017 STAT 151 - PowerPoint PPT Presentation

STAT 151 Class 2

Outline of Topics
1. Background
2. Discrete bivariate distribution
3. Continuous bivariate distribution

Multivariate analysis

- When one measurement is made on each observation in a dataset, univariate analysis is used, e.g., survival time of patients.
- If more than one measurement is made on each observation, a multivariate analysis is used, e.g., survival time, age, cancer subtype, size of cancer, etc.
- We focus on bivariate analysis, where exactly two measurements are made on each observation.
- The two measurements will be called X and Y. Since X and Y are obtained for each observation, the data for one observation is the pair (X, Y).

Bivariate data

Bivariate data can be represented as:

Observation   X       Y
1             $X_1$   $Y_1$
2             $X_2$   $Y_2$
3             $X_3$   $Y_3$
4             $X_4$   $Y_4$
...           ...     ...
n             $X_n$   $Y_n$

- Each observation is a pair of values, e.g., $(X_4, Y_4)$ is the 4th observation.
- X and Y can be both discrete, both continuous, or one discrete and one continuous. We focus on the first two cases.
- Some examples:
  - X (survived > 1 year) and Y (cancer subtype) of each patient in a sample
  - X (length of job training) and Y (time to find a job) for each unemployed individual in a job training program
  - X (income) and Y (happiness) for each individual in a survey

Bivariate distributions

- We can study X and Y separately, i.e., we can analyse $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_n$ separately using a probability distribution function, a probability density function or a cumulative distribution function. These are examples of univariate analyses.
- When X and Y are studied separately, their distributions and probabilities are called marginal.
- When X and Y are considered together, many interesting questions can be answered, e.g.,
  - Is subtype I cancer (X) associated with a higher chance of survival beyond 1 year (Y)?
  - Does longer job training (X) result in a shorter time to find a job (Y)?
  - Do people with higher income (X) lead a happier life (Y)?
- The joint behavior of X and Y is summarized in a bivariate probability distribution. A bivariate distribution is an example of a joint distribution.

Review of a discrete distribution: Drawing a marble from an urn

[Figure: an urn of 5 marbles, numbered 1 to 5, of two types. One type is drawn with probability 3/5 and the other with probability 2/5. The marble images are lost in this transcript.]

The probability distribution tells us that the long-run frequency of the type with probability 3/5 is higher than that of the type with probability 2/5.

Discrete bivariate distribution - Drawing 2 marbles with replacement

The marble images are lost in this transcript. Below, A stands for the marble type drawn with probability 3/5 on a single draw and B for the type drawn with probability 2/5.

                Draw 2 (Y)
Draw 1 (X)      A        B
A               9/25     6/25
B               6/25     4/25

$P(X = A \text{ and } Y = A) = P(A, A) = \left(\frac{3}{5}\right)\left(\frac{3}{5}\right) = \frac{9}{25}$

$P(X = B \text{ and } Y = A) = P(B, A) = \left(\frac{2}{5}\right)\left(\frac{3}{5}\right) = \frac{6}{25}$

$P(A, A) + P(A, B) + P(B, A) + P(B, B) = \frac{9}{25} + \frac{6}{25} + \frac{6}{25} + \frac{4}{25} = 1$
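
The with-replacement table can be reproduced in a few lines. This is a minimal sketch, assuming an urn of 5 marbles with 3 of one type and 2 of the other; the labels "A" and "B" and the names `counts` and `joint_with` are hypothetical, since the transcript does not preserve the marble images.

```python
from fractions import Fraction
from itertools import product

# Assumed labels: "A" is the marble type with 3 marbles, "B" the type with 2.
counts = {"A": 3, "B": 2}
total = sum(counts.values())

# With replacement the urn is unchanged between draws, so
# P(X = x, Y = y) = P(X = x) * P(Y = y).
joint_with = {
    (x, y): Fraction(counts[x], total) * Fraction(counts[y], total)
    for x, y in product(counts, repeat=2)
}

for (x, y), p in joint_with.items():
    print(f"P(X={x}, Y={y}) = {p}")
print("sum =", sum(joint_with.values()))
```

Using `Fraction` keeps the arithmetic exact, so the four entries can be compared directly with the fractions in the table.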

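Discrete bivariate distribution - Drawing 2 marbles without replacement

The marble images are lost in this transcript. Below, A stands for the marble type with 3 of the 5 marbles and B for the type with 2 of the 5 marbles.

                Draw 2 (Y)
Draw 1 (X)      A        B
A               6/20     6/20
B               6/20     2/20

$P(X = A \text{ and } Y = A) = P(A, A) = \left(\frac{2}{4}\right)\left(\frac{3}{5}\right) = \frac{6}{20}$

$P(X = B \text{ and } Y = A) = P(B, A) = \left(\frac{3}{4}\right)\left(\frac{2}{5}\right) = \frac{6}{20}$

In each product, the factor with denominator 5 is the probability of the first draw and the factor with denominator 4 is the probability of the second draw given the first, because only 4 marbles remain.

$P(A, A) + P(A, B) + P(B, A) + P(B, B) = \frac{6}{20} + \frac{6}{20} + \frac{6}{20} + \frac{2}{20} = 1$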
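
The without-replacement probabilities follow from multiplying the first-draw probability by the second-draw probability given what was removed. A minimal sketch, assuming 3 marbles of one type and 2 of the other; the labels "A" and "B" and the function name `joint_without_replacement` are hypothetical.

```python
from fractions import Fraction

# Assumed labels: "A" is the marble type with 3 marbles, "B" the type with 2.
counts = {"A": 3, "B": 2}
total = sum(counts.values())


def joint_without_replacement(x, y):
    """P(X = x, Y = y) when the first marble is not returned to the urn."""
    p_first = Fraction(counts[x], total)
    # One marble of type x has been removed before the second draw.
    remaining_of_y = counts[y] - (1 if x == y else 0)
    p_second_given_first = Fraction(remaining_of_y, total - 1)
    return p_first * p_second_given_first


joint_without = {
    (x, y): joint_without_replacement(x, y) for x in counts for y in counts
}

for (x, y), p in joint_without.items():
    # Fraction reduces automatically, so 6/20 is displayed as 3/10.
    print(f"P(X={x}, Y={y}) = {p}")
```

Unlike the with-replacement case, the entry for two B marbles (2/20) is not the square of the single-draw probability, which reflects the dependence between the draws.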

Discrete bivariate distributions

- A discrete bivariate distribution is used to model the joint behavior of two variables, X and Y, both of which are discrete.
- X and Y are discrete random variables if there is a countable number of possible values for X: $a_1, a_2, \ldots, a_k$ and for Y: $b_1, b_2, \ldots, b_l$.
- (X, Y) is the unknown outcome if we randomly draw an observation from the population.
- $P(X = a_i, Y = b_j)$ is the joint probability distribution function of observing $X = a_i, Y = b_j$.
- A valid joint probability distribution function must satisfy the following rules:
  - $P(X = a_i, Y = b_j)$ must be between 0 and 1.
  - We are certain that one of the values will appear, therefore:

    $P[(X = a_1, Y = b_1) \text{ or } (X = a_2, Y = b_1) \text{ or } \ldots \text{ or } (X = a_k, Y = b_l)] = P(X = a_1, Y = b_1) + P(X = a_2, Y = b_1) + \ldots + P(X = a_k, Y = b_l) = 1$
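
The two validity rules can be checked mechanically. The sketch below assumes the joint distribution is stored as a dictionary mapping each pair (a, b) to its probability; the function name `is_valid_joint_pmf` and the tolerance are illustrative choices, not part of the slides.

```python
from fractions import Fraction


def is_valid_joint_pmf(joint, tol=1e-12):
    """Check both rules: every entry lies in [0, 1] and the entries sum to 1."""
    probs = list(joint.values())
    each_in_range = all(0 <= p <= 1 for p in probs)
    # The tolerance only matters for float inputs; Fractions sum exactly.
    sums_to_one = abs(sum(probs) - 1) <= tol
    return each_in_range and sums_to_one


# A 2 x 2 table with entries 9/25, 6/25, 6/25, 4/25 (these sum to 1).
good = {
    (0, 0): Fraction(9, 25), (0, 1): Fraction(6, 25),
    (1, 0): Fraction(6, 25), (1, 1): Fraction(4, 25),
}
# Entries in range, but they sum to 1.1.
bad_sum = {(0, 0): 0.5, (0, 1): 0.6}
# Sums to 1, but one entry is negative.
bad_entry = {(0, 0): 1.2, (0, 1): -0.2}

print(is_valid_joint_pmf(good), is_valid_joint_pmf(bad_sum), is_valid_joint_pmf(bad_entry))
```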

Discrete joint distribution: Example 1

$P(X = a, Y = b) = \frac{a + b}{48}$, if $a, b = 0, 1, 2, 3$

            Y = 0    Y = 1    Y = 2    Y = 3    P(X = a)
X = 0       0        1/48     2/48     3/48     6/48
X = 1       1/48     2/48     3/48     4/48     10/48
X = 2       2/48     3/48     4/48     5/48     14/48
X = 3       3/48     4/48     5/48     6/48     18/48
P(Y = b)    6/48     10/48    14/48    18/48    1

For the row X = 3: $P(X = 3) = \frac{3}{48} + \frac{4}{48} + \frac{5}{48} + \frac{6}{48} = \frac{18}{48}$

[Figure: plot of P(X, Y) against X and Y.]

- $P(X = a, Y = b)$ are the joint probabilities.
- $P(X = a)$, $P(Y = b)$ are called marginal probabilities. $P(X = a)$ gives us information about X ignoring Y and $P(Y = b)$ gives us information about Y ignoring X.
- We can always find marginal probabilities from joint probabilities (as in Example 1) but not the other way around unless X and Y are independent (see next slide).
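
The marginal probabilities in Example 1 are row and column sums of the joint table. A minimal sketch of that computation, with the hypothetical names `joint`, `p_x` and `p_y` for the joint table and the two marginals:

```python
from fractions import Fraction

values = range(4)  # a, b = 0, 1, 2, 3

# Joint probabilities P(X = a, Y = b) = (a + b) / 48 from Example 1.
joint = {(a, b): Fraction(a + b, 48) for a in values for b in values}

# Marginal of X: sum over all values of Y (a row sum in the table).
p_x = {a: sum(joint[a, b] for b in values) for a in values}
# Marginal of Y: sum over all values of X (a column sum in the table).
p_y = {b: sum(joint[a, b] for a in values) for b in values}

for a in values:
    # Fraction reduces automatically, e.g. 18/48 is displayed as 3/8.
    print(f"P(X={a}) = {p_x[a]}")
```

The reverse direction is not possible in general: the four numbers in `p_x` and the four in `p_y` do not determine the sixteen entries of `joint`.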

Discrete joint distribution - Independence

- X and Y are independent if, for all X = a, Y = b:

  $P(X = a \mid Y = b) = P(X = a) \Leftrightarrow P(Y = b \mid X = a) = P(Y = b) \Leftrightarrow P(X = a, Y = b) = P(X = a)P(Y = b)$

- $P(Y = b \mid X = a)$ and $P(X = a \mid Y = b)$ are conditional probabilities.
- If X and Y are independent, then we can easily
  (a) calculate $P(X = a, Y = b)$ by $P(X = a)P(Y = b)$
  (b) write $P(X = a \mid Y = b)$ as $P(X = a)$
  (c) write $P(Y = b \mid X = a)$ as $P(Y = b)$
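
The equivalence of these conditions follows from the definition of conditional probability. A short derivation, assuming $P(X = a) > 0$ and $P(Y = b) > 0$ so that both conditional probabilities are defined (an assumption the slide leaves implicit):

```latex
\begin{align*}
P(X = a \mid Y = b) &= \frac{P(X = a, Y = b)}{P(Y = b)}
  && \text{(definition)} \\
P(X = a \mid Y = b) = P(X = a)
  &\iff \frac{P(X = a, Y = b)}{P(Y = b)} = P(X = a) \\
  &\iff P(X = a, Y = b) = P(X = a)\,P(Y = b) \\
  &\iff \frac{P(X = a, Y = b)}{P(X = a)} = P(Y = b) \\
  &\iff P(Y = b \mid X = a) = P(Y = b)
\end{align*}
```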

Discrete joint distribution - Example 1 (cont'd)

[Figure: plot of P(X, Y) against X and Y.]

Try

$P(Y = 1 \mid X = 3) = \frac{P(Y = 1, X = 3)}{P(X = 3)} = \frac{4/48}{18/48} = \frac{4}{18} \neq P(Y = 1) = \frac{10}{48}$

Alternatively, try

$P(X = 3, Y = 1) = \frac{4}{48} \neq P(X = 3)P(Y = 1) = \frac{18}{48} \times \frac{10}{48}$

Either way is sufficient to show that X and Y are not independent. Furthermore, any combination of X = a, Y = b for which the equality fails can be used to disprove independence.
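
Both checks can be carried out with exact fractions. A minimal sketch using the Example 1 formula $P(X = a, Y = b) = (a + b)/48$; the helper names `marginal_x`, `marginal_y` and `conditional_y_given_x` are hypothetical.

```python
from fractions import Fraction

values = range(4)  # a, b = 0, 1, 2, 3
joint = {(a, b): Fraction(a + b, 48) for a in values for b in values}


def marginal_x(a):
    """P(X = a): sum the joint probabilities over all values of Y."""
    return sum(joint[a, b] for b in values)


def marginal_y(b):
    """P(Y = b): sum the joint probabilities over all values of X."""
    return sum(joint[a, b] for a in values)


def conditional_y_given_x(b, a):
    """P(Y = b | X = a) = P(X = a, Y = b) / P(X = a)."""
    return joint[a, b] / marginal_x(a)


p_cond = conditional_y_given_x(1, 3)         # 4/18 in the slide's notation
p_marg = marginal_y(1)                       # 10/48 in the slide's notation
p_product = marginal_x(3) * marginal_y(1)    # (18/48) * (10/48)

print("conditional differs from marginal:", p_cond != p_marg)
print("joint differs from product:", joint[3, 1] != p_product)
```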

Discrete joint distribution: Example 2

$P(X = a, Y = b) = \frac{ab}{18}$, if $a = 1, 2, 3$; $b = 1, 2$

            Y = 1    Y = 2    P(X = a)
X = 1       1/18     2/18     3/18
X = 2       2/18     4/18     6/18
X = 3       3/18     6/18     9/18
P(Y = b)    6/18     12/18    1

For the row X = 3: $P(X = 3) = \frac{3}{18} + \frac{6}{18} = \frac{9}{18}$

[Figure: plot of P(X, Y) against X and Y.]

$P(X = 1, Y = 1) = \frac{1}{18}$ and $P(X = 1)P(Y = 1) = \frac{3}{18} \times \frac{6}{18} = \frac{18}{18^2} = \frac{1}{18}$

...

$P(X = 3, Y = 2) = \frac{6}{18}$ and $P(X = 3)P(Y = 2) = \frac{9}{18} \times \frac{12}{18} = \frac{108}{18^2} = \frac{6}{18}$

To show independence, we must show $P(X = a, Y = b) = P(X = a)P(Y = b)$ for all combinations of X = a, Y = b.
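
Because independence requires the product rule to hold in every cell, a loop over the whole table is the natural check. This is a sketch under the assumption that the joint table is a dictionary defined on the full grid of (a, b) pairs; the function name `is_independent` is hypothetical.

```python
from fractions import Fraction


def is_independent(joint):
    """True if P(X=a, Y=b) == P(X=a) * P(Y=b) for every cell of the table."""
    xs = sorted({a for a, _ in joint})
    ys = sorted({b for _, b in joint})
    p_x = {a: sum(joint[a, b] for b in ys) for a in xs}
    p_y = {b: sum(joint[a, b] for a in xs) for b in ys}
    return all(joint[a, b] == p_x[a] * p_y[b] for a in xs for b in ys)


# Example 2: P(X = a, Y = b) = ab / 18 for a = 1, 2, 3 and b = 1, 2.
example_2 = {(a, b): Fraction(a * b, 18) for a in (1, 2, 3) for b in (1, 2)}
# Example 1: P(X = a, Y = b) = (a + b) / 48 for a, b = 0, 1, 2, 3.
example_1 = {(a, b): Fraction(a + b, 48) for a in range(4) for b in range(4)}

print("Example 2 independent:", is_independent(example_2))
print("Example 1 independent:", is_independent(example_1))
```

Exact fractions are used so that the equality test is not affected by floating-point rounding.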

Probability under a univariate probability density function (PDF)

[Figure: PDF f(x) plotted against X, showing P(X ≤ 1).]

$P(X \le 1)$ can be found by integration. With $f(x) = 1.5e^{-1.5x}$:

$P(X \le 1) = \int_0^1 f(x)\,dx = \int_0^1 1.5e^{-1.5x}\,dx = \left[-e^{-1.5x}\right]_0^1 = 1 - e^{-1.5} \approx 0.777$

- It turns out that, for any $x > 0$, $P(X \le x) = 1 - e^{-1.5x}$.
- We often write $P(X \le x)$ as $F(x)$ and call $F(x)$ the cumulative distribution function (CDF).
- $F(1)$ is a probability but $f(1)$ (a point on $f(x)$) is not a probability.
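
The integral can be checked numerically against the closed form $1 - e^{-1.5x}$. A minimal sketch using a midpoint rule; the function names `pdf`, `cdf` and `midpoint_integral` and the step count are illustrative choices, not part of the slides.

```python
import math

RATE = 1.5  # the constant in f(x) = 1.5 * exp(-1.5 x)


def pdf(x):
    """Density f(x) = 1.5 e^{-1.5 x} for x >= 0, and 0 otherwise."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0


def cdf(x):
    """Closed form F(x) = P(X <= x) = 1 - e^{-1.5 x} for x > 0."""
    return 1 - math.exp(-RATE * x) if x > 0 else 0.0


def midpoint_integral(f, lo, hi, steps=10_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


numeric = midpoint_integral(pdf, 0.0, 1.0)
print(f"numeric integral = {numeric:.4f}, closed form F(1) = {cdf(1):.4f}")
# The density itself can exceed 1, so f(x) is not a probability.
print(f"f(0) = {pdf(0)}")
```

The two values of P(X ≤ 1) agree to several decimal places, while f(0) = 1.5 illustrates why a point on the density is not a probability.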

The univariate cumulative distribution function (CDF)

[Figures: the PDF f(x) with P(X ≤ x) ≡ F(x) marked, and the CDF F(x), each plotted against X.]

- $F(x)$ can be used to find $P(X \le x)$ for any x, e.g., $F(1) = P(X \le 1)$.
- A plot of $F(x)$ is a convenient way of finding probabilities. The probability is found by drawing a vertical line from x on the horizontal axis until it meets the CDF and then drawing a horizontal line until it meets the vertical axis.
- All CDF plots have an asymptote at 1: $F(x) \le 1$ because it is a probability.
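
Reading a probability off the CDF plot amounts to evaluating F at the chosen x. The sketch below tabulates F for the density $f(x) = 1.5e^{-1.5x}$ used on the previous slide; the grid of x values and the names `cdf` and `table` are arbitrary illustrative choices.

```python
import math


def cdf(x, rate=1.5):
    """F(x) = P(X <= x) = 1 - e^{-rate * x} for x > 0, and 0 otherwise."""
    return 1 - math.exp(-rate * x) if x > 0 else 0.0


xs = [0, 0.5, 1, 2, 4, 8]
table = [(x, cdf(x)) for x in xs]

for x, value in table:
    print(f"F({x}) = {value:.6f}")
```

The values increase with x and approach 1 without exceeding it, which is the asymptote described above.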
