

SLIDE 1

Lecture 14. Nonparametric GLMs (cont.) Nan Ye

School of Mathematics and Physics University of Queensland

1 / 22

SLIDE 2

Recall: Nonparametric Models

Parametric models

  • Fixed structure and number of parameters.
  • Represent a fixed class of functions.

Nonparametric models

  • Flexible structure where the number of parameters usually grows as more data becomes available.
  • The class of functions represented depends on the data.
  • Not models without parameters, but nonparametric in the sense that they do not have the fixed structure and fixed number of parameters of parametric models.

SLIDE 3

This Lecture

  • Smoothing splines
  • Generalized additive models

SLIDE 4

Smoothing Splines

If we fit a degree 8 polynomial on these 9 points, will the polynomial be a good fit?

[Figure: 9 data points on the actual curve; axes x and y, both from −1.0 to 1.0.]

SLIDE 5

No...

[Figure: the actual curve and the oscillating degree-8 polynomial fit; axes x and y, both from −1.0 to 1.0.]

Runge phenomenon: polynomial fits can be very unstable.
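The slide's data are not reproduced here, but the instability is easy to demonstrate with the textbook example behind the Runge phenomenon. A minimal sketch, assuming the classic function f(x) = 1/(1 + 25x²) and 9 equispaced points (not the lecture's curve):

```python
import numpy as np

# Interpolate f(x) = 1 / (1 + 25 x^2) at 9 equispaced points on [-1, 1]
# with a degree-8 polynomial (9 points, degree 8: exact interpolation).
f = lambda x: 1.0 / (1.0 + 25.0 * x**2)

x_knots = np.linspace(-1, 1, 9)
coeffs = np.polyfit(x_knots, f(x_knots), deg=8)

x_grid = np.linspace(-1, 1, 1001)
p = np.polyval(coeffs, x_grid)

max_err = np.max(np.abs(p - f(x_grid)))
print(f"max |p(x) - f(x)| on [-1, 1]: {max_err:.3f}")  # large despite an exact fit at the knots
```

Even though f is positive everywhere and the polynomial passes through all 9 points exactly, the fit dips well below zero near the boundary (around x ≈ ±0.875).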

SLIDE 6

Trade-off between smoothness and quality of fit

  • We want to find a curve f(x) that fits the data well, and is sufficiently smooth at the same time.
  • This can be formulated as finding f to minimize

        R(f) = ∑_{i=1}^n (y_i − f(x_i))² + λ J(f),

    where J(f) is a measure of the roughness of f, and λ > 0 is a parameter controlling the trade-off between smoothness and quality of fit.
  • J(f) is also called a regularizer.

SLIDE 7

Measuring roughness

  • For a quadratic function f(x) = cx², a large |f″(x)| = 2|c| indicates that the curve is very wiggly.
  • In general, for any function f, if |f″(x)| is usually large, then f looks very wiggly.
  • We can use

        J(f) = ∫_a^b f″(x)² dx

    as a measure of the overall roughness of f over [a, b].
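A quick numerical illustration of this roughness measure (the two functions are chosen for illustration, not taken from the lecture):

```python
import numpy as np

# Compare the roughness J(f) = ∫_0^1 f''(x)^2 dx of a gentle parabola
# and a wiggly sine wave by numerical quadrature.
def trapezoid(y, x):
    """Composite trapezoid rule for ∫ y dx on the grid x."""
    return float(np.sum((y[:-1] + y[1:]) * np.diff(x)) / 2.0)

x = np.linspace(0.0, 1.0, 10001)

# f(x) = x^2  =>  f''(x) = 2, so J(f) = ∫_0^1 2^2 dx = 4 exactly.
J_parabola = trapezoid(np.full_like(x, 2.0) ** 2, x)

# g(x) = sin(5x)  =>  g''(x) = -25 sin(5x), which is usually large.
J_sine = trapezoid((-25.0 * np.sin(5.0 * x)) ** 2, x)

print(J_parabola)  # 4.0 (up to rounding)
print(J_sine)      # ≈ 330: far rougher
```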

SLIDE 8

Smoothing splines

  • Assume that a < min_i x_i and b > max_i x_i.
  • Consider the problem of finding a function f minimizing

        R(f) = ∑_{i=1}^n (y_i − f(x_i))² + λ ∫_a^b f″(x)² dx.

  • When λ = 0, f can be any function passing through all the data points.
  • When λ = ∞, no curvature is tolerated (f″ ≡ 0), so f is the OLS straight-line fit.
  • When 0 < λ < ∞, the minimizing f is a natural cubic spline with knots at the unique x_i values.

SLIDE 9

Revisiting the example

[Figure: the actual curve and the smoothing spline fit; axes x and y, both from −1.0 to 1.0.]

A smoothing spline can fit the data well and is smooth!

SLIDE 10

A basis for natural cubic spline

  • Recall: natural splines are linear beyond the two boundary knots.
  • Assume that the knots are t_1, …, t_m.
  • A natural cubic spline is a linear combination of the following m basis functions:

        n_1(x) = 1,
        n_2(x) = x,
        n_{2+i}(x) = d_i(x) − d_{m−1}(x),   i = 1, …, m − 2,

    where d_i(x) = ((x − t_i)₊³ − (x − t_m)₊³) / (t_m − t_i).
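A minimal sketch of this basis with hypothetical knot values (the knots are illustrative, not from the lecture), numerically checking that each basis function is linear beyond the last knot:

```python
import numpy as np

# Natural cubic spline basis with hypothetical knots t_1 < ... < t_m.
# (x - t)_+^3 denotes the truncated cube max(x - t, 0)^3.
t = np.array([0.0, 0.3, 0.5, 0.8, 1.0])  # m = 5 knots (illustrative values)
m = len(t)

def d(i, x):
    """d_i(x) = ((x - t_i)_+^3 - (x - t_m)_+^3) / (t_m - t_i), i is 1-based."""
    pos = lambda u: np.maximum(u, 0.0) ** 3
    return (pos(x - t[i - 1]) - pos(x - t[m - 1])) / (t[m - 1] - t[i - 1])

def basis(x):
    """The m basis functions n_1, ..., n_m evaluated at the points x."""
    cols = [np.ones_like(x), x]
    cols += [d(i, x) - d(m - 1, x) for i in range(1, m - 1)]
    return np.column_stack(cols)

# Beyond the last knot every basis function is linear, so the numerical
# second derivative there is (approximately) zero.
x_right = np.array([1.2, 1.3, 1.4])
h = 1e-4
second_diff = (basis(x_right + h) - 2 * basis(x_right) + basis(x_right - h)) / h**2
print(np.max(np.abs(second_diff)))  # ~0: linear to the right of t_m
```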

SLIDE 11

Fitting a smoothing spline

  • Training data: (x_1, y_1), …, (x_n, y_n) ∈ R × R.
  • A smoothing spline is fitted by solving the penalized least squares problem

        β̂ = argmin_β ∑_{i=1}^n (β⊤z_i − y_i)² + λ β⊤Ωβ,

    where z_i = (n_1(x_i), …, n_n(x_i)), the n_j's use the x_i's as the knots, and Ω_jk = ∫ n_j″(x) n_k″(x) dx.
  • The fitted spline is

        f(x) = ∑_i β̂_i n_i(x).

SLIDE 12

Matrix form

  • Let Z be the n × n matrix with z_i as the i-th row.
  • Then β̂ can be written as

        β̂ = (Z⊤Z + λΩ)⁻¹ Z⊤y.

  • We thus have ŷ = Zβ̂ = S_λ y, where S_λ is the smoother matrix

        S_λ = Z (Z⊤Z + λΩ)⁻¹ Z⊤.
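A quick numerical check of these identities; the random Z, Ω, and y below are arbitrary stand-ins for a real spline basis, penalty, and response, not an actual spline fit:

```python
import numpy as np

# With any basis matrix Z and symmetric PSD penalty Omega, the penalized
# least squares solution and the smoother matrix satisfy y_hat = Z beta_hat
# = S_lambda y, and S_lambda is symmetric.
rng = np.random.default_rng(0)
n = 8
Z = rng.normal(size=(n, n))
A = rng.normal(size=(n, n))
Omega = A.T @ A                      # symmetric positive semidefinite penalty
y = rng.normal(size=n)
lam = 0.5

beta_hat = np.linalg.solve(Z.T @ Z + lam * Omega, Z.T @ y)
S = Z @ np.linalg.solve(Z.T @ Z + lam * Omega, Z.T)
y_hat = Z @ beta_hat

print(np.allclose(y_hat, S @ y))  # True
```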

SLIDE 13

Effective degrees of freedom

  • The effective degrees of freedom of a smoothing spline is

        df_λ = trace(S_λ),

    where the trace of a matrix is the sum of its diagonal elements.
  • The effective degrees of freedom can be considered a generalization of the number of free parameters.
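To see this behave like a parameter count, the sketch below evaluates trace(S_λ) for a toy random Z and PSD Ω (stand-ins, not a real spline basis): with no penalty the effective degrees of freedom equals the number of basis functions, and it shrinks as λ grows.

```python
import numpy as np

# Effective degrees of freedom df_lambda = trace(S_lambda) for a toy setup.
rng = np.random.default_rng(1)
n = 10
Z = rng.normal(size=(n, n))
A = rng.normal(size=(n, n))
Omega = A.T @ A                      # symmetric positive semidefinite penalty

def eff_df(lam):
    S = Z @ np.linalg.solve(Z.T @ Z + lam * Omega, Z.T)
    return float(np.trace(S))

print(eff_df(0.0))   # 10: no penalty, n free parameters
print(eff_df(1e6))   # near 0: heavy smoothing leaves almost no flexibility
```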

SLIDE 14

Selection of smoothing parameters

  • The effective degrees of freedom df_λ provides an intuitive way to specify the smoothing parameter λ manually.
  • There are also procedures for determining λ automatically, such as cross-validation and generalized cross-validation.

SLIDE 15

Smoothing splines in R

> fit.spline.df <- smooth.spline(cars$speed, cars$dist, df=9)
Smoothing Parameter  spar= 0.3858413  lambda= 0.0001576001 (11 iterations)
Equivalent Degrees of Freedom (Df): 8.998755
Penalized Criterion (RSS): 2054.319
GCV: 262.3012

> fit.spline.gcv <- smooth.spline(cars$speed, cars$dist)
Smoothing Parameter  spar= 0.7801305  lambda= 0.1112206 (11 iterations)
Equivalent Degrees of Freedom (Df): 2.635278
Penalized Criterion (RSS): 4187.776
GCV: 244.1044

  • By default, the smoothing parameter λ is determined using

generalized cross validation.
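For readers working in Python, scipy offers a loose analogue; note that scipy's `UnivariateSpline` controls smoothness through a residual bound `s` rather than a roughness penalty λ, and performs no GCV, so this is a sketch of the same idea, not a drop-in replacement for `smooth.spline`. Synthetic data stands in for the cars dataset:

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Synthetic speed/dist-like data (illustrative, not the cars dataset).
rng = np.random.default_rng(42)
x = np.linspace(4.0, 25.0, 50)
y = 0.2 * x**2 + rng.normal(scale=5.0, size=50)

smooth = UnivariateSpline(x, y, s=len(x) * 25.0)  # heavier smoothing
wiggly = UnivariateSpline(x, y, s=0.0)            # s=0: interpolates the data

print(np.max(np.abs(wiggly(x) - y)))  # ~0: exact fit at the data points
```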

SLIDE 16
[Figure: cars data (speed vs. dist) with the lm line and the two smoothing splines (df = 2.64 and df = 9); speed from 5 to 25, dist from 20 to 120.]

SLIDE 17

Generalized Additive Models

  • The smoothing spline is a nonparametric analogue of OLS.
  • We can extend the approach to GLMs.

SLIDE 18

Idea

  • Replace the linear predictor by β_0 + h_1(x_1) + … + h_d(x_d).
  • Maximize the roughness-penalized log-likelihood instead of the log-likelihood.

SLIDE 19

Generalized additive model (GAM)

  • Recall: a GLM has the following structure:

        (systematic)  E(Y | x) = h(β⊤x),
        (random)      Y | x follows an exponential family distribution.

  • A generalized additive model has the following structure:

        (systematic)  E(Y | x) = h(β_0 + ∑_i h_i(x_i)),
        (random)      Y | x follows an exponential family distribution.

    This defines a conditional probability model p(y | x, β_0, h_1, …, h_d).

SLIDE 20

Roughness penalty approach for GAM

  • We want to choose β_0, h_1, …, h_d to maximize

        ∑_i ln p(y_i | x_i, β_0, h_1, …, h_d) − ∑_j λ_j ∫ h_j″(x_j)² dx_j.

  • Again, if each λ_j > 0, then each h_j in the maximizer must be a natural cubic spline with knots at the unique values of x_j.
  • This reduces the problem to a finite-dimensional parametric regression problem.
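To see concretely how the problem becomes finite-dimensional, here is a deliberately simplified sketch for the Gaussian, identity-link case: each h_j is represented in a small polynomial basis with a generic quadratic coefficient penalty standing in for the roughness penalty (the bases, penalty, and data are all illustrative assumptions, not the lecture's natural-spline reduction). The penalized log-likelihood then reduces to penalized least squares in the coefficients.

```python
import numpy as np

# Additive model E(Y|x) = beta_0 + h_1(x_1) + h_2(x_2), identity link.
rng = np.random.default_rng(7)
n = 200
x1 = rng.uniform(-1, 1, n)
x2 = rng.uniform(-1, 1, n)
y = np.sin(3 * x1) + x2**2 + rng.normal(scale=0.1, size=n)

# Finite basis: intercept + cubic basis in x1 + cubic basis in x2.
B = np.column_stack([np.ones(n), x1, x1**2, x1**3, x2, x2**2, x2**3])
P = np.diag([0.0, 1, 1, 1, 1, 1, 1])  # penalize everything but the intercept
lam = 1e-3

# Penalized least squares in the coefficients: a parametric problem.
beta = np.linalg.solve(B.T @ B + lam * P, B.T @ y)
mse_fit = float(np.mean((B @ beta - y) ** 2))
mse_mean = float(np.mean((y - y.mean()) ** 2))
print(mse_fit < mse_mean)  # True: the additive fit beats the constant model
```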

SLIDE 21

Remarks

  • Higher-order derivatives may be used in the regularizer (smoothness penalty).
  • We can also use regression splines instead of smoothing splines to represent the h_i's.
  • The h_i's may use a mix of different representations, e.g. h_1(x_1) = x_1, h_2(x_2) a regression spline, h_3(x_3) a smoothing spline, ...

SLIDE 22

What You Need to Know

  • Smoothing splines
  • The roughness penalty approach
  • Natural cubic splines as smoothing splines
  • Smoothing parameter and effective degree of freedom
  • Generalized additive model
  • GAM as a generalization of GLM
  • Roughness penalty approach for GAM
