Covariance and correlation
PRACTICING STATISTICS INTERVIEW QUESTIONS IN R
Zuzanna Chmielewska
Actuary
Formula for a sample:

$$\mathrm{cov}(X,Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x}) \cdot (y_i - \bar{y})}{n - 1}$$

Formula for a population:

$$\mathrm{cov}(X,Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x}) \cdot (y_i - \bar{y})}{n}$$
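$$x_1 = 3,\; x_2 = 5,\; x_3 = 7$$
$$y_1 = 6,\; y_2 = 11,\; y_3 = 13$$
$$\bar{x} = 5, \quad \bar{y} = 10$$
$$(x_1 - \bar{x}) \cdot (y_1 - \bar{y}) = 8$$
$$(x_2 - \bar{x}) \cdot (y_2 - \bar{y}) = 0$$
$$(x_3 - \bar{x}) \cdot (y_3 - \bar{y}) = 6$$
$$\sum_{i=1}^{n} (x_i - \bar{x}) \cdot (y_i - \bar{y}) = 14$$
$$\frac{\sum_{i=1}^{n} (x_i - \bar{x}) \cdot (y_i - \bar{y})}{n - 1} = 7$$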
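The arithmetic in this worked example can be checked in R. The sketch below recomputes the products of deviations by hand and compares the result with base R's `cov()`, which uses the sample (n - 1) denominator. The variable names are chosen here for illustration only.

```r
# Data from the worked example
x <- c(3, 5, 7)
y <- c(6, 11, 13)
n <- length(x)

# Products of deviations from the means
products <- (x - mean(x)) * (y - mean(y))

# Sample covariance divides by n - 1, population covariance divides by n
cov_sample <- sum(products) / (n - 1)
cov_population <- sum(products) / n

# Base R's cov() returns the sample covariance
cov_builtin <- cov(x, y)

print(c(sample = cov_sample, population = cov_population, builtin = cov_builtin))
```

Note that base R has no separate population-covariance function; rescaling the `cov()` result by `(n - 1) / n` is one way to obtain it.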
covariance
correlation coefficient
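As a small illustration of how the two quantities relate, the sketch below reuses the three-point data from the covariance example and computes the Pearson correlation coefficient as the covariance divided by the product of the standard deviations. The rescaling by 100 is an arbitrary choice made here to show that correlation, unlike covariance, does not depend on the units.

```r
x <- c(3, 5, 7)
y <- c(6, 11, 13)

# Pearson correlation: covariance scaled by both standard deviations
r_manual <- cov(x, y) / (sd(x) * sd(y))
r_builtin <- cor(x, y)

# Rescaling x changes the covariance but leaves the correlation unchanged
cov_rescaled <- cov(100 * x, y)
r_rescaled <- cor(100 * x, y)

print(c(r_manual = r_manual, r_builtin = r_builtin, r_rescaled = r_rescaled))
print(cov_rescaled)
```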
Linear regression model
$$y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip} + e_i$$

where:
$y_i$ - dependent variable, $x_{ij}$ - independent variables, $\beta_j$ - parameters, $e_i$ - error.
$$\hat{y}_i = \beta_0 + \beta_1 \cdot x_i$$
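For a single predictor, the least-squares estimates can be written in terms of covariance and variance: the slope is cov(x, y) / var(x) and the intercept is the mean of y minus the slope times the mean of x. The sketch below checks this against `lm()` on the built-in `cars` data set used in this deck's `lm()` examples; the names `beta0`, `beta1` and `fit` are illustrative.

```r
x <- cars$speed
y <- cars$dist

# Least-squares estimates for one predictor
beta1 <- cov(x, y) / var(x)         # slope
beta0 <- mean(y) - beta1 * mean(x)  # intercept

# The same estimates from lm()
fit <- lm(dist ~ speed, data = cars)

print(c(beta0 = beta0, beta1 = beta1))
print(coef(fit))
```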
Model assumptions:
Linear relationship
Normally distributed errors
Homoscedastic errors
Independent observations
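Some of these assumptions can be probed numerically from the residuals of a fitted model. The following is a rough sketch on the `cars` data, not a complete diagnostic procedure: the Shapiro-Wilk test addresses normality of the errors, and the correlation between absolute residuals and fitted values is an informal check chosen here for illustration of whether the error spread changes with the fitted value.

```r
model <- lm(dist ~ speed, data = cars)
res <- residuals(model)

# Normally distributed errors: Shapiro-Wilk test on the residuals
# (a small p-value suggests a departure from normality)
normality <- shapiro.test(res)

# Homoscedastic errors (informal check only):
# do the absolute residuals grow with the fitted values?
spread_trend <- cor(abs(res), fitted(model))

print(normality$p.value)
print(spread_trend)
```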
model <- lm(dist ~ speed, data = cars)
print(model)

Call:
lm(formula = dist ~ speed, data = cars)

Coefficients:
(Intercept)        speed
model <- lm(dist ~ speed, data = cars)
new_car <- data.frame(speed = 17.5)
predict(model, newdata = new_car)

       1
51.23806
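The prediction is the fitted line evaluated at the new speed. The sketch below reproduces it by hand from the coefficients and also asks `predict()` for a 95% prediction interval; the interval is an addition made here for illustration and is not part of the original slide.

```r
model <- lm(dist ~ speed, data = cars)
new_car <- data.frame(speed = 17.5)

# Point prediction by hand: intercept + slope * speed
b <- coef(model)
manual <- unname(b["(Intercept)"] + b["speed"] * new_car$speed)

# predict() gives the same value; interval = "prediction" adds lower and upper bounds
pred <- predict(model, newdata = new_car)
pred_int <- predict(model, newdata = new_car, interval = "prediction", level = 0.95)

print(manual)
print(pred_int)
```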
model <- lm(dist ~ speed, data = cars)
plot(model)
linear regression model
linear predictor function
lm() in R
diagnostic plots
Logistic regression model
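Probability prediction:

$$\hat{p}_i = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip})}}$$

Logit prediction:

$$\log\left(\frac{\hat{p}_i}{1 - \hat{p}_i}\right) = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip}$$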
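The probability prediction and the logit prediction are two views of the same linear predictor: the logit is the log-odds, and the probability is its inverse-logit transform. The sketch below shows the round trip with made-up coefficients and predictor values (all numbers are assumptions for illustration), using base R's `plogis()` and `qlogis()`.

```r
# Illustrative coefficients and predictor values (made up for this sketch)
beta <- c(-1.5, 0.8)   # beta_0, beta_1
x1 <- c(0, 1, 2, 3)

# Logit prediction: the linear predictor
logit_hat <- beta[1] + beta[2] * x1

# Probability prediction: inverse logit of the linear predictor
p_manual <- 1 / (1 + exp(-logit_hat))
p_plogis <- plogis(logit_hat)   # logistic distribution function in base R

# Going back: log-odds of the predicted probabilities
logit_back <- qlogis(p_plogis)

print(round(p_plogis, 3))
```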
model <- glm(y ~ x, data = df, family = "binomial")
predict(model, newdata = new_df, type = "response")
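The snippet above uses placeholder names (`df`, `new_df`, `y`, `x`). A runnable sketch, assuming the built-in `mtcars` data as a stand-in with `am` (0/1 transmission type) as the binary response and `wt` as the predictor, might look like this. It also contrasts `type = "link"` (logit scale) with `type = "response"` (probability scale); the 0.5 cutoff for class labels is a choice made here, not something the slides prescribe.

```r
# Assumption: mtcars stands in for `df`; am is the binary response
model <- glm(am ~ wt, data = mtcars, family = "binomial")
new_df <- data.frame(wt = c(2.5, 3.5))

# type = "link" returns the logit; type = "response" returns the probability
logit_pred <- predict(model, newdata = new_df, type = "link")
prob_pred <- predict(model, newdata = new_df, type = "response")

# Turn probabilities into class labels with a 0.5 cutoff (the cutoff is a choice)
class_pred <- as.integer(prob_pred > 0.5)

print(prob_pred)
print(class_pred)
```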
logistic regression model
prediction of a binary response variable
logistic regression in R with glm()
Model evaluation
$$\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\text{precision} = \frac{TP}{TP + FP}$$

$$\text{recall} = \frac{TP}{TP + FN}$$
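These metrics follow directly from the four cells of a confusion matrix. The sketch below builds one with `table()` from made-up label vectors (the data are invented for illustration) and computes the three metrics from the counts.

```r
# Made-up labels for illustration: 1 = positive, 0 = negative
actual    <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
predicted <- c(1, 1, 1, 0, 1, 0, 0, 0, 0, 0)

# Confusion matrix: rows are predictions, columns are actual classes
conf_mat <- table(predicted = predicted, actual = actual)

TP <- sum(predicted == 1 & actual == 1)  # true positives
TN <- sum(predicted == 0 & actual == 0)  # true negatives
FP <- sum(predicted == 1 & actual == 0)  # false positives
FN <- sum(predicted == 0 & actual == 1)  # false negatives

accuracy  <- (TP + TN) / (TP + TN + FP + FN)
precision <- TP / (TP + FP)
recall    <- TP / (TP + FN)

print(conf_mat)
print(c(accuracy = accuracy, precision = precision, recall = recall))
```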
[Figure: precision and recall]
Root Mean Squared Error

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$

high weight to large errors

Mean Absolute Error

$$MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$$

straightforward interpretation
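Both metrics are one-liners in R. The sketch below defines them as small helper functions (the function names and the data are made up for illustration) and applies them to predictions where one value misses badly, which shows how squaring lets a single large error dominate RMSE while MAE stays an average miss in the original units.

```r
rmse <- function(y, y_hat) sqrt(mean((y - y_hat)^2))
mae  <- function(y, y_hat) mean(abs(y - y_hat))

# Made-up values: the last prediction misses by 10, the others by 1
y     <- c(10, 12, 14, 16)
y_hat <- c(11, 11, 15, 26)

rmse_val <- rmse(y, y_hat)
mae_val  <- mae(y, y_hat)

print(c(RMSE = rmse_val, MAE = mae_val))
```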
validation set approach
cross-validation
confusion matrix
classification metrics
regression metrics
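As one possible illustration of cross-validation, the sketch below runs 5-fold cross-validation in base R for a linear model on the `cars` data and averages the per-fold RMSE. The number of folds, the seed and the variable names are assumptions made here; the validation set approach corresponds to doing a single such train/test split.

```r
set.seed(1)  # fold assignment is random; the seed is arbitrary

k <- 5
n <- nrow(cars)

# Assign each row to one of k folds of (nearly) equal size
folds <- sample(rep(1:k, length.out = n))

fold_rmse <- numeric(k)
for (j in 1:k) {
  train <- cars[folds != j, ]   # fit on k - 1 folds
  test  <- cars[folds == j, ]   # evaluate on the held-out fold
  fit   <- lm(dist ~ speed, data = train)
  pred  <- predict(fit, newdata = test)
  fold_rmse[j] <- sqrt(mean((test$dist - pred)^2))
}

# Cross-validated estimate of the out-of-sample RMSE
cv_rmse <- mean(fold_rmse)
print(cv_rmse)
```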
Probability distributions: discrete distributions, continuous distributions, central limit theorem
Exploratory Data Analysis: descriptive statistics, categorical data, time-series, principal component analysis
Statistical tests: normality tests, inference for a mean, comparing two means, ANOVA
Regression models: covariance and correlation, linear regression model, logistic regression model, model evaluation