Generalized Additive Models
September 10, 2019
Motto
"My nature is to be linear, and when I'm not, I feel really proud of myself." (Cynthia Weil, songwriter)
Introduction
1 This part is often replaced by the cross-validation approach that will be discussed
2 Review the concepts of conditional probability, the total probability formula, and Bayes' theorem!
Additive Logistic Regression
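$$L = \prod_{i=1}^{N} p_{x_i}^{y_i}\,(1 - p_{x_i})^{1 - y_i}$$
$$\ell = \log L = \sum_{i=1}^{N} \left[\, y_i \log p_{x_i} + (1 - y_i) \log (1 - p_{x_i}) \,\right]$$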
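The Bernoulli likelihood for binary responses $y_i$ with success probabilities $p_{x_i}$, and its logarithm, can be checked numerically. The following is a minimal sketch: the function names and the toy data are hypothetical and not taken from the slides.

```python
import math

def likelihood(y, p):
    """Product over observations of p_i^y_i * (1 - p_i)^(1 - y_i)."""
    value = 1.0
    for yi, pi in zip(y, p):
        value *= pi ** yi * (1.0 - pi) ** (1 - yi)
    return value

def log_likelihood(y, p):
    """Sum over observations of y_i*log(p_i) + (1 - y_i)*log(1 - p_i)."""
    return sum(yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
               for yi, pi in zip(y, p))

# Hypothetical toy data: three binary responses and their success probabilities.
y = [1, 0, 1]
p = [0.8, 0.3, 0.6]

L = likelihood(y, p)        # 0.8 * (1 - 0.3) * 0.6
ell = log_likelihood(y, p)  # equals log(L) up to rounding error
```

Working with the sum of logarithms rather than the product avoids numerical underflow when $N$ is large.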
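With $\operatorname{logit} p(x; \alpha, \beta) = \alpha + \beta^T x$, the first derivatives of the log-likelihood are
$$\frac{\partial \ell}{\partial \alpha} = \sum_{i=1}^{N} \bigl(y_i - p(x_i; \alpha, \beta)\bigr), \qquad \frac{\partial \ell}{\partial \beta} = \sum_{i=1}^{N} x_i \bigl(y_i - p(x_i; \alpha, \beta)\bigr),$$
and the second derivatives are
$$\frac{\partial^2 \ell}{\partial \alpha^2} = -\sum_{i=1}^{N} p(x_i; \alpha, \beta)\bigl(1 - p(x_i; \alpha, \beta)\bigr), \qquad \frac{\partial^2 \ell}{\partial \alpha\, \partial \beta} = -\sum_{i=1}^{N} x_i\, p(x_i; \alpha, \beta)\bigl(1 - p(x_i; \alpha, \beta)\bigr),$$
$$\frac{\partial^2 \ell}{\partial \beta\, \partial \beta^T} = -\sum_{i=1}^{N} x_i x_i^T\, p(x_i; \alpha, \beta)\bigl(1 - p(x_i; \alpha, \beta)\bigr).$$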
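$$\begin{pmatrix} \alpha^{new} \\ \beta^{new} \end{pmatrix} = \begin{pmatrix} \alpha^{old} \\ \beta^{old} \end{pmatrix} - \left( \frac{\partial^2 \ell(\alpha^{old}, \beta^{old})}{\partial(\alpha, \beta)\, \partial(\alpha, \beta)^T} \right)^{-1} \frac{\partial \ell(\alpha^{old}, \beta^{old})}{\partial(\alpha, \beta)}$$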
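The Newton-Raphson update can be sketched for the simplest case of a single scalar predictor, where the Hessian is a 2x2 matrix whose inverse can be written out by hand. This is an illustrative sketch only: it assumes the model $\operatorname{logit} p(x) = \alpha + \beta x$, and the function names and toy data are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def newton_logistic(x, y, n_iter=25, tol=1e-10):
    """Fit logit p(x) = alpha + beta * x by Newton-Raphson (scalar predictor)."""
    alpha, beta = 0.0, 0.0
    for _ in range(n_iter):
        p = [sigmoid(alpha + beta * xi) for xi in x]
        # Gradient of the log-likelihood at the current (alpha, beta).
        g_a = sum(yi - pi for yi, pi in zip(y, p))
        g_b = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
        # Hessian entries; each term carries the weight p(1 - p).
        w = [pi * (1.0 - pi) for pi in p]
        h_aa = -sum(w)
        h_ab = -sum(xi * wi for xi, wi in zip(x, w))
        h_bb = -sum(xi * xi * wi for xi, wi in zip(x, w))
        det = h_aa * h_bb - h_ab * h_ab
        # Newton step H^{-1} g, using the explicit inverse of the 2x2 Hessian.
        step_a = (h_bb * g_a - h_ab * g_b) / det
        step_b = (-h_ab * g_a + h_aa * g_b) / det
        alpha, beta = alpha - step_a, beta - step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    return alpha, beta

# Hypothetical toy data; the classes overlap, so the maximum likelihood estimate is finite.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0, 0, 1, 0, 1, 1]
alpha_hat, beta_hat = newton_logistic(x, y)
```

At convergence both gradient components vanish, which gives a simple check on the fit. If the two classes were perfectly separable, the iteration would diverge, which is why the toy data are chosen to overlap.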
Generalized Additive Models
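$$\mathrm{PRSS}(\alpha, f_1, \ldots, f_p) = \sum_{i=1}^{N} \Bigl( y_i - \alpha - \sum_{j=1}^{p} f_j(x_{ij}) \Bigr)^{2} + \sum_{j=1}^{p} \lambda_j \int f_j''(t)^2 \, dt, \qquad \sum_{i=1}^{N} f_j(x_{ij}) = 0 \ \text{for each } j.$$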
Smoothing splines and logistic additive regression
$$f(x) = \sum_{j=1}^{N+4} \gamma_j B_j(x), \qquad \Omega_{ij} = \int B_i''(t)\, B_j''(t) \, dt$$
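$$L = \prod_{i=1}^{N} p_{x_i}^{y_i}\,(1 - p_{x_i})^{1 - y_i}$$
$$\ell = \log L = \sum_{i=1}^{N} \left[\, y_i \log p_{x_i} + (1 - y_i) \log (1 - p_{x_i}) \,\right]$$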
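$$\sum_{i=1}^{N} y_i \log p_{x_i} + \sum_{i=1}^{N} (1 - y_i) \log (1 - p_{x_i}) - \sum_{j=1}^{p} \lambda_j \int f_j''(t)^2 \, dt$$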
We apply a generalized additive model to the spam data. The data consist of information from 4601 email messages (a random test set of size 1536; the rest form the training set), collected in a study to screen email for 'spam' (i.e., junk email, coded as one). The data were donated by George Forman from Hewlett-Packard laboratories, Palo Alto, California, which is why the count of the word george appears as a predictor.

After some tweaking, a generalized additive logistic regression model was fitted using a cubic smoothing spline with a nominal four degrees of freedom for each predictor. That is, for each predictor $X_j$, the smoothing-spline parameter $\lambda_j$ was chosen so that $\operatorname{trace}[S_j(\lambda_j)] - 1 = 4$, where $S_j(\lambda)$ is the smoothing-spline operator matrix constructed using the observed values $x_{ij}$, $i = 1, \ldots, N$. This is a convenient way of specifying the amount of smoothing in such a complex model. Most of the spam predictors have a very long-tailed distribution, so before fitting the GAM we log-transformed each variable (actually $\log(x + 0.1)$); the plots in Figure 9.1 are shown in the original variables.
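The two preprocessing choices described here, the $\log(x + 0.1)$ transform and the calibration of $\lambda_j$ to a nominal four degrees of freedom, can be illustrated with a small sketch. Note the assumptions: a discrete second-difference (Whittaker-type) smoother on equally spaced points stands in for the cubic smoothing spline, so the operator $S_\lambda = (I + \lambda D^T D)^{-1}$ is not the spline operator from the slides, and all function names are hypothetical.

```python
import numpy as np

def log_transform(x):
    """Log-transform for long-tailed predictors: log(x + 0.1)."""
    return np.log(np.asarray(x, dtype=float) + 0.1)

def second_difference_penalty(n):
    """D^T D, where D is the (n-2) x n second-difference matrix."""
    D = np.diff(np.eye(n), n=2, axis=0)
    return D.T @ D

def effective_df(lam, penalty_eigs):
    """trace(S_lam) for S_lam = (I + lam * D^T D)^{-1}, via the penalty eigenvalues."""
    return float(np.sum(1.0 / (1.0 + lam * penalty_eigs)))

def lambda_for_df(n, target_df, lo=1e-8, hi=1e12, n_steps=200):
    """Bisect on log(lambda) until trace(S_lam) - 1 equals target_df."""
    eigs = np.clip(np.linalg.eigvalsh(second_difference_penalty(n)), 0.0, None)
    for _ in range(n_steps):
        mid = np.sqrt(lo * hi)  # geometric midpoint, i.e. bisection in log(lambda)
        if effective_df(mid, eigs) - 1.0 > target_df:
            lo = mid  # fit is too wiggly: a larger penalty is needed
        else:
            hi = mid  # fit is too smooth: a smaller penalty is needed
    return np.sqrt(lo * hi), eigs

# Stand-in smoother on 50 equally spaced points, calibrated to four degrees of freedom.
lam, eigs = lambda_for_df(n=50, target_df=4.0)
df = effective_df(lam, eigs) - 1.0
```

Bisection works because the trace decreases monotonically in $\lambda$, from $N$ (no smoothing) down to 2 (a straight-line fit) for this stand-in penalty. Subtracting one accounts for the constant term, as in the trace condition above.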