- Day 3: Classification
Lucas Leemann
Essex Summer School
Introduction to Statistical Learning
- L. Leemann (Essex Summer School)
Day 3 Introduction to SL 1 / 33
1 Motivation for Classification
2 Logistic Regression
The Linear
[Figure: Binary Dependent Variable. Predicted values and actual values, with a region marked "Prediction >100%"]
[Figure: Continuous Dependent Variable. Residuals against predicted values]
[Figure: Binary Dependent Variable. Residuals against predicted values]
[Figure: Cumulative Distribution. Y against β0+β1X for the logistic and the normal distribution]
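The two curves in this figure are cumulative distribution functions. A minimal sketch in base R, assuming the standard forms `plogis()` and `pnorm()` (the slide does not say how the normal curve was scaled), evaluates both on a small illustrative grid:

```r
# Grid of values for the linear predictor beta0 + beta1 * X (illustrative range)
z <- seq(-4, 4, by = 2)

# Logistic CDF: 1 / (1 + exp(-z)); standard normal CDF: Phi(z)
cdf <- data.frame(z = z, logistic = plogis(z), normal = pnorm(z))
print(round(cdf, 3))
```

Both functions map any real number into the interval (0, 1) and equal 0.5 at zero; the logistic curve approaches 0 and 1 more slowly than the standard normal.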
[Figure: P(Y=1) against x]
[Figure: P(Y=1), 'Taxes Are Too High', against Income in GBP]
$$P(Y=1) = \frac{1}{1+\exp(-\beta_0 - \beta_1 \cdot X)}$$
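Rearranging the logistic formula shows why the coefficients are read on the log-odds scale. The derivation below writes $p$ for $P(Y=1)$, a shorthand introduced here for brevity:

```latex
\begin{aligned}
p &= \frac{1}{1+\exp(-\beta_0-\beta_1 X)} \\
1-p &= \frac{\exp(-\beta_0-\beta_1 X)}{1+\exp(-\beta_0-\beta_1 X)} \\
\frac{p}{1-p} &= \exp(\beta_0+\beta_1 X) \\
\log\frac{p}{1-p} &= \beta_0+\beta_1 X
\end{aligned}
```

The log-odds are linear in X even though the probability itself is not.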
[Figure: P(Y=1), 'Taxes Are Too High', against Income in GBP, for three curves: P(y=1)=F(1-2*x), P(y=1)=F(0-2*x), P(y=1)=F(1-1*x)]
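How the intercept and the slope move the curve can be checked numerically. The sketch below assumes that F is the logistic CDF (`plogis()` in R) and that x is on an arbitrary, rescaled axis rather than raw GBP; the three parameter pairs are the ones named in the figure.

```r
# Assumption: F() is the logistic CDF; x is an illustrative, rescaled predictor
x <- c(-1, 0, 0.5, 1, 2)

curves <- data.frame(
  x       = x,
  F_1_m2x = plogis(1 - 2 * x),  # intercept 1, slope -2
  F_0_m2x = plogis(0 - 2 * x),  # intercept 0, slope -2: same steepness, shifted
  F_1_m1x = plogis(1 - 1 * x)   # intercept 1, slope -1: flatter
)
print(round(curves, 3))
```

Changing the intercept shifts the curve horizontally, while changing the slope alters how steeply the probability falls with x. At x = 0 only the intercept matters, so the first and third curves coincide there.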
> m1 <- glm(inlf ~ kids + educ + age, data=data1, family=binomial(logit))
> summary(m1)

Call:
glm(formula = inlf ~ kids + educ + age, family = binomial(logit), data = data1)

Deviance Residuals:
    Min      1Q  Median      3Q     Max
      …       …  0.8026  1.0564  1.5875

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.11437    0.73459       …  0.87628
kids               …    0.19932       …  0.01154 *
educ         0.16902    0.03505   4.822 1.42e-06 ***
age                …    0.01137       …  0.00626 **
---
Signif. codes:  0 *** 0.001 ** 0.01 * 0.05 . 0.1   1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 1029.75 on … degrees of freedom
Residual deviance: 993.53 on … degrees of freedom
AIC: 1001.5
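Logit coefficients are on the log-odds scale, so exponentiating one gives an odds ratio. A small sketch using only the educ estimate and standard error printed in the output (the Wald interval is a standard approximation and is not shown on the slide):

```r
# Values copied from the summary(m1) output for educ
b_educ  <- 0.16902
se_educ <- 0.03505

# Odds ratio for one additional year of education
or_educ <- exp(b_educ)

# Approximate 95% Wald confidence interval on the odds-ratio scale
ci_educ <- exp(b_educ + c(-1, 1) * qnorm(0.975) * se_educ)

print(round(c(odds_ratio = or_educ, lower = ci_educ[1], upper = ci_educ[2]), 3))
```

An odds ratio above 1 with an interval that excludes 1 agrees with the positive, significant educ coefficient in the table.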
$$\frac{1}{1+\exp(0.11 + 0.50 \cdot 0 - 0.17 \cdot 13 + 0.03 \cdot 32)} = \frac{1}{1+\exp(-1.09)} = 0.75$$
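The same calculation can be reproduced in R. This sketch plugs the rounded coefficients from the formula into `plogis()` and assumes the three covariate values are kids = 0, educ = 13 and age = 32 (inferred from the order of the terms, not stated on the slide). Because the coefficients are rounded to two decimals here, the result lands close to, but not exactly on, the value in the formula.

```r
# Rounded coefficients as they appear in the worked formula (signs as in the model)
b <- c(intercept = -0.11, kids = -0.50, educ = 0.17, age = -0.03)

# Assumed covariate profile: no kids, 13 years of education, 32 years old
x <- c(intercept = 1, kids = 0, educ = 13, age = 32)

eta <- sum(b * x)   # linear predictor
p   <- plogis(eta)  # 1 / (1 + exp(-eta))
print(round(c(linear_predictor = eta, probability = p), 2))
```

With a fitted model object, `predict(m1, newdata = ..., type = "response")` would use the unrounded coefficients instead.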
> library(Zelig)
> z.out1 <- zelig(inlf ~ kids + age + educ + exper + huseduc + huswage,
+                 model = "logit", data = data1)
> average.woman <- setx(z.out1, kids=median(data1$kids), age=mean(data1$age),
+                       educ=mean(data1$educ), exper=mean(data1$exper),
+                       huseduc=mean(data1$huseduc), huswage=mean(data1$huswage))
> s.out <- sim(z.out1, x=average.woman)
> summary(s.out)

sim x :
          mean         sd       50%      2.5%     97.5%
[1,] 0.5746569 0.02574396 0.5754419 0.5232728 0.6217502
pv
         0     1
[1,] 0.432 0.568
(James et al, 2013: 140)
(James et al, 2013: 140)
[Figure: Histograms of experience in years (frequency) for women not in the labor force and women in the labor force]
> library(MASS)
> fit <- lda(inlf ~ exper, data=data1, na.action="na.omit", CV=TRUE)
> fit$class
[1] 1 0 1 0 0 1 1 1 1 1 1 1 0 1 0 1 1 0 1 1 0 1 1 1 0 1 1 1 0 0 1 1 1 1 1 0 1 1 1 1 0 1 0 1 0 0 1 1 1 0 1 1
[78] 1 1 0 1 1 1 0 1 1 1 1 1 1 1 1 1 0 1 1 1 0 0 0 0 1 1 1 1 0 1 1 1 0 0 1 1 1 1 1 1 1 0 1 1 1 0 1 1 1 0 1 1
[155] 0 1 0 1 1 1 1 1 1 1 1 0 1 1 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 0 1 1 1 1 1 0 0 1 1 1 0 1 0 0 1 1 1
[232] 1 1 0 1 0 0 1 1 1 1 1 1 0 1 0 0 1 1 1 0 0 1 0 1 1 0 1 0 1 0 1 1 1 1 0 1 0 0 0 1 1 1 0 0 1 1 0 1 1 1 0 1
[309] 1 1 1 1 1 1 1 0 1 0 1 1 1 0 0 1 1 0 1 0 0 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 0 1 1 0 1 1 1 1 1 1 0 1 0 1 1
[386] 1 0 1 1 1 1 1 1 0 1 0 1 1 1 1 0 0 0 0 1 1 1 0 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 0 1 1 1 0 0 0 1 0 1 0 1 1 0
[463] 1 0 0 1 0 1 0 0 1 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 1 1 1 0 0 0 1 0 1 1 1 0 0 0 0 1 1 0 0 1 0
[540] 1 1 1 1 1 0 0 0 0 1 1 0 0 0 0 1 0 1 1 1 1 1 0 1 0 1 1 0 0 1 0 1 0 0 1 1 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0
[617] 1 0 1 0 0 1 0 0 0 1 0 1 0 0 1 0 1 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0 1 0 0 1 1 0 0 1 0 0 1 1 0
[694] 0 1 0 1 1 1 1 0 1 1 1 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 1 1 0 1 0 1 1 0 1 1 1 1 0 0 1 1 0 0 1 0 0 1 1
Levels: 0 1
> table(fit$class)
  0   1
315 438
> table(fit$class, data1$inlf)
      0   1
  0 196 119
  1 129 309
> # several variables LDA
> library(MASS)
> fit <- lda(inlf ~ age + exper + faminc, data=data1, na.action="na.omit", CV=TRUE)
> fit$class
[1] 1 1 1 0 1 1 1 1 1 1 1 1 0 1 0 1 1 0 1 1 0 1 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 1 1 1 0 0 1 0 1 0 0 1 1 1 1 1
[78] 1 1 0 1 1 0 0 1 1 1 1 1 1 1 1 1 0 1 1 1 1 0 0 0 1 0 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 0 1 1 0 1 1 1 0 1 0
[155] 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 1 0 1 1 1 1 0 1 1 0 1 1 1 1 1 0 1 1 0 1 1 1 1 0 1 1 1 1 1 1 0 1 1 1 0 1 1
[232] 1 1 0 1 1 0 1 1 1 1 1 1 0 0 0 0 1 1 1 0 0 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 0 1 0 1 1 1 0 1 1 1 0 1 1 1 1 1
[309] 0 1 1 1 1 1 1 0 1 0 1 1 1 0 0 1 1 0 0 0 1 1 1 1 1 1 0 1 1 0 1 1 0 0 1 1 1 0 1 0 0 1 1 1 0 1 1 1 1 0 0 1
[386] 1 0 1 1 1 0 1 0 1 1 1 1 1 1 1 0 0 0 0 1 1 1 0 1 1 1 1 1 1 0 0 1 0 1 1 1 0 1 0 1 1 1 1 0 1 1 0 1 0 1 0 0
[463] 1 0 0 1 0 0 1 0 0 0 0 0 1 0 1 0 1 0 0 0 1 1 0 0 0 0 0 1 1 1 0 1 1 0 0 0 0 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0
[540] 0 1 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 1 1 0 1 1 0 0 0 1 0 0 0 1 0 1 0 0 1 1 1 0 0 0 0 0 0 1 1 1 0 1 1 0 0 0
[617] 1 0 1 0 0 1 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 1 0 0 1 0 1 0 1 0 0 0 0 0 1 1 0 1 0 1 0
[694] 0 1 0 1 1 1 0 0 1 1 1 1 1 0 0 1 0 1 1 0 0 1 0 0 0 0 1 1 1 0 1 0 0 0 1 1 1 1 1 1 1 0 1 1 1 1 0 1 0 0 0 1
Levels: 0 1
> table(fit$class)
  0   1
309 444
> table(fit$class, data1$inlf)
      0   1
  0 197 112
  1 128 316
> library(klaR)
> partimat(as.factor(inlf) ~ exper + faminc + age, data=data1, method="lda",
+          nplots.vert=2, nplots.hor=2)
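The two cross-tabulations can be turned into accuracy figures. The sketch below re-enters the reported counts by hand, assuming rows are the cross-validated LDA predictions and columns the observed inlf values, labelled 0 and 1 as in `Levels: 0 1`. The helper names `make_cm` and `accuracy` are hypothetical and introduced only for this illustration.

```r
# Hypothetical helper: build a 2x2 confusion matrix from counts given row by row
make_cm <- function(counts) {
  matrix(counts, nrow = 2, byrow = TRUE,
         dimnames = list(predicted = c("0", "1"), observed = c("0", "1")))
}

cm_exper <- make_cm(c(196, 119, 129, 309))  # inlf ~ exper
cm_multi <- make_cm(c(197, 112, 128, 316))  # inlf ~ age + exper + faminc

# Share of correctly classified observations: diagonal over total
accuracy <- function(cm) sum(diag(cm)) / sum(cm)

print(round(c(exper_only = accuracy(cm_exper),
              three_predictors = accuracy(cm_multi)), 3))
```

On these counts, adding age and faminc raises the cross-validated share of correct classifications only slightly compared with using exper alone.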
[Figure: Partition Plot. Three panels (exper against faminc, exper against age, faminc against age) with observations labelled by class, 0 or 1]
(James et al, 2013: 143)
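$x_{1i} \sim N(\mu_1, \sigma)$, $x_{2i} \sim N(\mu_2, \sigma)$, $\rho_{x_1,x_2} = 0$
$x_{1i} \sim N(\mu_1, \sigma)$, $x_{2i} \sim N(\mu_2, \sigma)$, $\rho_{x_1,x_2} = -0.5$
$x_{1i} \sim t_1$, $x_{2i} \sim t_2$
(James et al, 2013: 152)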
$x_{1i} \sim N(\mu_1, \Sigma_1)$, $x_{2i} \sim N(\mu_2, \Sigma_2)$, with $\rho_{x_{11},x_{12}} = 0.5$ but $\rho_{x_{21},x_{22}} = -0.5$
$P(k = 2) = \Delta(X_1^2 + X_2^2 + X_1 \cdot X_2)$
$P(k = 2) = f(X_1, X_2)$, where $f$ is highly non-linear
(James et al, 2013: 152)