CE 311S: PROBABILITY AND STATISTICS
SPRING 2020
Week 8 – Class 2, 03/11/2020
PRIYADARSHAN PATIL, Teaching Assistant, The University of Texas at Austin
Administrative stuff
⚫ Online assignment 4 is due tomorrow
⚫ Spring break
⚫ Jointly distributed random variables
⚫ Multiple discrete random variables
⚫ Multiple continuous random variables
⚫ Covariance and correlation
⚫ By the end of this class, you should be able to:
⚫ Understand joint PMFs (PDFs) and CDFs
⚫ Calculate marginal PMFs (PDFs)
⚫ Calculate expected values for RVs and for functions of RVs
⚫ Compute covariance and the correlation coefficient
⚫ Random variables are often linked with each other
⚫ Examples: years in college and credits completed, years of work experience and salary, auto and renters insurance
⚫ We are interested in understanding how random variables behave when studied together
⚫ Example: a customer buys auto insurance and homeowner's insurance from the same company.
⚫ Let X and Y be the deductible amounts on the auto and homeowners' policies for a randomly selected customer. X and Y follow the joint PMF shown in the table:

Joint PMF P_XY(x, y) (rows: X, columns: Y):
        Y=0     Y=50    Y=150
X=0     0.25    0.06    0.15
X=100   0.07    0.15    0.04
X=200   0.14    0.05    0.09
⚫ In general, the joint PMF P_XY(x, y) is the probability that X = x and Y = y
⚫ For a valid PMF, P_XY(x, y) ≥ 0 for all (x, y), and Σ_x Σ_y P_XY(x, y) = 1
⚫ The marginal PMF of X provides us the distribution of X when we aren't concerned with Y:

P_X(x) = Σ_{y ∈ S_Y} P_XY(x, y)
⚫ Example: P(X = 0) when not considering Y
⚫ The joint CDF is F_XY(x, y) = P(X ≤ x ∩ Y ≤ y)

        Y=0     Y=50    Y=150   Sum
X=0     0.25    0.06    0.15    0.46
X=100   0.07    0.15    0.04    0.26
X=200   0.14    0.05    0.09    0.28
Sum     0.46    0.26    0.28    1

⚫ Marginal PMF of X: P_X(0) = 0.46, P_X(100) = 0.26, P_X(200) = 0.28
⚫ Marginal PMF of Y: P_Y(0) = 0.46, P_Y(50) = 0.26, P_Y(150) = 0.28
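The marginal computation above is easy to do in code. A minimal Python sketch (Python is my choice, not from the slides), with the joint PMF entered from the table:

```python
# Joint PMF stored as a dict mapping (x, y) -> probability,
# values taken from the insurance-deductible table above.
from collections import defaultdict

joint_pmf = {
    (0, 0): 0.25,   (0, 50): 0.06,   (0, 150): 0.15,
    (100, 0): 0.07, (100, 50): 0.15, (100, 150): 0.04,
    (200, 0): 0.14, (200, 50): 0.05, (200, 150): 0.09,
}

# Validity check: every mass non-negative, and the masses sum to 1.
assert all(p >= 0 for p in joint_pmf.values())
assert abs(sum(joint_pmf.values()) - 1) < 1e-12

# Marginal of X: sum over y.  Marginal of Y: sum over x.
p_x, p_y = defaultdict(float), defaultdict(float)
for (x, y), p in joint_pmf.items():
    p_x[x] += p
    p_y[y] += p
```

Here `p_x[0]` recovers P(X = 0) = 0.46 without ever looking at Y, which is exactly the marginalization on the slide.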
⚫ X and Y are independent if P_XY(x, y) = P_X(x) P_Y(y) for all x, y
⚫ Here X and Y are not independent: for example, P_XY(0, 0) = 0.25, but P_X(0) P_Y(0) = 0.46 × 0.46 = 0.2116
⚫ The expected value of a function h(X, Y) is E[h(X, Y)] = Σ_x Σ_y h(x, y) P_XY(x, y)
⚫ Example: what is the expected total deductible, E[X + Y]?
⚫ Build one table with the P_XY(x, y) values and one with the h(x, y) values, multiply entry by entry, and add:

P_XY(x, y):
        Y=0     Y=50    Y=150
X=0     0.25    0.06    0.15
X=100   0.07    0.15    0.04
X=200   0.14    0.05    0.09

h(x, y) = x + y:
        Y=0     Y=50    Y=150
X=0     0       50      150
X=100   100     150     250
X=200   200     250     350

⚫ This gives E[X + Y] = 137
⚫ Similarly, taking h(x, y) = x gives E[X]; multiply the P_XY(x, y) values with these h values and add:

h(x, y) = x:
        Y=0     Y=50    Y=150
X=0     0       0       0
X=100   100     100     100
X=200   200     200     200

⚫ This gives E[X] = 82 (and taking h(x, y) = y gives E[Y] = 55)
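The multiply-and-add recipe above translates directly into a small Python helper (a sketch of mine, not from the slides; the probabilities are the table values):

```python
# E[h(X, Y)] = sum over all (x, y) of h(x, y) * P_XY(x, y)
joint_pmf = {
    (0, 0): 0.25,   (0, 50): 0.06,   (0, 150): 0.15,
    (100, 0): 0.07, (100, 50): 0.15, (100, 150): 0.04,
    (200, 0): 0.14, (200, 50): 0.05, (200, 150): 0.09,
}

def expectation(h, pmf):
    """E[h(X, Y)] for a joint PMF given as {(x, y): probability}."""
    return sum(h(x, y) * p for (x, y), p in pmf.items())

e_total = expectation(lambda x, y: x + y, joint_pmf)  # expected total deductible
e_x = expectation(lambda x, y: x, joint_pmf)          # E[X] via the joint PMF
e_y = expectation(lambda x, y: y, joint_pmf)          # E[Y] via the joint PMF
```

Note that `e_total` equals `e_x + e_y`, illustrating that expectation is linear even when X and Y are dependent.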
⚫ All the concepts we studied apply to continuous distributions
⚫ Similar changes as applied to single random variables:
⚫ Mass changes to density, summation to integration, etc.
⚫ The joint density function f_XY(x, y) is valid if f_XY(x, y) ≥ 0 for all x, y and if ∫_{-∞}^{∞} ∫_{-∞}^{∞} f_XY(x, y) dy dx = 1
⚫ The marginal density functions are f_X(x) = ∫_{-∞}^{∞} f_XY(x, y) dy and f_Y(y) = ∫_{-∞}^{∞} f_XY(x, y) dx
⚫ X and Y are independent if f_XY(x, y) = f_X(x) f_Y(y) for all x, y
⚫ E[h(X, Y)] = ∫_{-∞}^{∞} ∫_{-∞}^{∞} h(x, y) f_XY(x, y) dy dx
⚫ A test column you built for your materials class can either fail via the rebars rusting, or by the concrete flaking off.
⚫ Let X be the years before the rebars rust to failure, and Y be the years before the concrete flakes off.
⚫ The joint pdf is f_XY(x, y) = c e^(-x) e^(-2y) for x ≥ 0, y ≥ 0
⚫ What value of c makes this a valid pdf?
⚫ Integrating, ∫_0^∞ ∫_0^∞ c e^(-x) e^(-2y) dy dx = c/2, so c = 2 and f_XY(x, y) = 2 e^(-x) e^(-2y) for x ≥ 0, y ≥ 0
⚫ What is the marginal distribution of X? This is the pdf for years till rebar rusting
⚫ What is the marginal distribution of Y? This is the pdf for years till concrete flaking
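A worked sketch of the two marginals (my derivation, assuming the joint pdf f_XY(x, y) = 2e^(-x)e^(-2y) above):

```latex
f_X(x) = \int_0^\infty 2 e^{-x} e^{-2y}\,dy
       = 2e^{-x}\left[-\tfrac{1}{2}e^{-2y}\right]_0^\infty
       = e^{-x}, \qquad x \ge 0

f_Y(y) = \int_0^\infty 2 e^{-x} e^{-2y}\,dx
       = 2e^{-2y}\left[-e^{-x}\right]_0^\infty
       = 2e^{-2y}, \qquad y \ge 0
```

So X is exponential with rate 1 and Y is exponential with rate 2; note also that f_XY(x, y) = f_X(x) f_Y(y) here.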
⚫ Recall f_XY(x, y) = 2 e^(-x) e^(-2y) for x ≥ 0, y ≥ 0
⚫ X and Y are independent if f_XY(x, y) = f_X(x) f_Y(y) for all x, y
⚫ Are X and Y independent?
⚫ Yes: f_XY(x, y) = (e^(-x)) (2 e^(-2y)) = f_X(x) f_Y(y), so the joint pdf factors into the marginals
⚫ What is the expected time till the rebars rust to failure?
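One way to answer, using the marginal f_X(x) = e^(-x) obtained by integrating out y (this derivation is mine, not spelled out on the slide):

```latex
E[X] = \int_0^\infty x \, f_X(x)\,dx
     = \int_0^\infty x e^{-x}\,dx
     = \left[-(x + 1)e^{-x}\right]_0^\infty
     = 1
```

So the expected time until the rebars rust to failure is 1 year.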
⚫ When two RVs are not independent, we require a measure of how dependent they are.
⚫ The covariance of RVs X and Y is defined as Cov(X, Y) = E[(X - E[X])(Y - E[Y])]
⚫ Equivalently, Cov(X, Y) = E[XY] - E[X] E[Y]
⚫ Recall E[X] = 82 and E[Y] = 55
⚫ E[XY] = 4550
⚫ Cov(X, Y) = 4550 - 82 × 55 = 40
P_XY(x, y):
        Y=0     Y=50    Y=150
X=0     0.25    0.06    0.15
X=100   0.07    0.15    0.04
X=200   0.14    0.05    0.09

h(x, y) = xy:
        Y=0     Y=50    Y=150
X=0     0       0       0
X=100   0       5000    15000
X=200   0       10000   30000
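The same expectation helper extends to covariance and the correlation coefficient (a Python sketch of mine; the probabilities are the table values):

```python
# Covariance and correlation from the joint PMF of the deductible example.
from math import sqrt

joint_pmf = {
    (0, 0): 0.25,   (0, 50): 0.06,   (0, 150): 0.15,
    (100, 0): 0.07, (100, 50): 0.15, (100, 150): 0.04,
    (200, 0): 0.14, (200, 50): 0.05, (200, 150): 0.09,
}

def expectation(h):
    return sum(h(x, y) * p for (x, y), p in joint_pmf.items())

e_x, e_y = expectation(lambda x, y: x), expectation(lambda x, y: y)
e_xy = expectation(lambda x, y: x * y)

cov_xy = e_xy - e_x * e_y                      # Cov(X,Y) = E[XY] - E[X]E[Y]
var_x = expectation(lambda x, y: x * x) - e_x ** 2
var_y = expectation(lambda x, y: y * y) - e_y ** 2
rho = cov_xy / sqrt(var_x * var_y)             # correlation coefficient
```

The covariance comes out to 40, matching the hand computation; the correlation coefficient `rho` is tiny (under 0.01), which previews the next point: the raw magnitude of covariance is hard to interpret without normalizing by the standard deviations.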
⚫ Interpretation:
⚫ If covariance is positive, when X is above average, Y usually is too; and when X is below average, Y usually is too.
⚫ If covariance is negative, when X is above average, Y is usually below average, and vice versa.
⚫ If X and Y are independent, their covariance is zero. (The converse is not true.)
⚫ The magnitude does not mean much on its own (it depends on the units of X and Y)
⚫ To gain more insight from the magnitude, we define the correlation coefficient as ρ_XY = Cov(X, Y) / (σ_X σ_Y)
⚫ The correlation coefficient is always between -1 and +1
⚫ It quantifies the strength of the linear relationship between X and Y
⚫ If ρ_XY = 1, then Y = aX + b for some a > 0
⚫ If ρ_XY = -1, then Y = aX + b for some a < 0
⚫ If ρ_XY = 0, there is no linear relationship between X and Y
⚫ But ρ_XY = 0 does not imply that X and Y are independent
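The last bullet deserves a concrete counterexample (my example, not from the slides): a pair with zero covariance that is clearly dependent.

```python
# Let X be uniform on {-1, 0, 1} and Y = X**2.
support = [(-1, 1), (0, 0), (1, 1)]   # the possible (x, y) pairs
p = 1 / 3                             # each pair has probability 1/3

e_x = sum(x * p for x, _ in support)        # = 0 by symmetry
e_y = sum(y * p for _, y in support)        # = 2/3
e_xy = sum(x * y * p for x, y in support)   # = E[X**3] = 0 by symmetry

cov_xy = e_xy - e_x * e_y                   # covariance is exactly 0

# Yet X and Y are dependent: P(X=0, Y=0) = 1/3,
# while P(X=0) * P(Y=0) = (1/3) * (1/3) = 1/9.
independent = abs(p - p * p) < 1e-12        # joint vs product of marginals
```

Knowing X pins down Y completely, yet the relationship is perfectly nonlinear, so the correlation is zero.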
⚫ Cov(X, X) = Var(X)
⚫ If X and Y are independent, Cov(X, Y) = 0
⚫ Cov(X, Y) = Cov(Y, X)
⚫ Cov(aX, Y) = a Cov(X, Y)
⚫ Cov(X + c, Y) = Cov(X, Y)
⚫ Cov(X + Y, Z) = Cov(X, Z) + Cov(Y, Z)
⚫ Cov(Σ_{i=1}^{m} a_i X_i, Σ_{j=1}^{n} b_j Y_j) = Σ_{i=1}^{m} Σ_{j=1}^{n} a_i b_j Cov(X_i, Y_j)
⚫ Var(aX + bY) = a² Var(X) + b² Var(Y) + 2ab Cov(X, Y)
⚫ Example: Cov(X1 + 2X2, 3Y1 + 4Y2) = 3 Cov(X1, Y1) + 6 Cov(X2, Y1) + 4 Cov(X1, Y2) + 8 Cov(X2, Y2)
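Because sample covariance is bilinear in the data just like Cov is in the random variables, the expansion above can be checked numerically; a Python sketch of mine, not from the slides:

```python
# Check: Cov(X1 + 2*X2, 3*Y1 + 4*Y2)
#      = 3Cov(X1,Y1) + 6Cov(X2,Y1) + 4Cov(X1,Y2) + 8Cov(X2,Y2)
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / n

random.seed(0)
n = 1000
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
y1 = [random.gauss(0, 1) for _ in range(n)]
y2 = [random.gauss(0, 1) for _ in range(n)]

lhs = cov([a + 2 * b for a, b in zip(x1, x2)],
          [3 * c + 4 * d for c, d in zip(y1, y2)])
rhs = (3 * cov(x1, y1) + 6 * cov(x2, y1)
       + 4 * cov(x1, y2) + 8 * cov(x2, y2))
```

The two sides agree to floating-point precision for any data, not just Gaussian draws, because the identity is algebraic rather than distributional.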
⚫ Exercise: let X and Y be independent standard normal random variables
⚫ Jointly distributed discrete (continuous) random variables have a joint PMF (PDF) and CDF
⚫ Marginal distributions for each of the RVs can be calculated by summing (integrating) over the other random variable
⚫ Expected values for functions of joint random variables work like expected values for single random variables
⚫ Covariance and the correlation coefficient are measures for determining the linear relationship between two RVs
⚫ Thank you for attending
⚫ Have a fun (and safe) spring break