SLIDE 1 1D Regression
☛ $y_i = f(x_i) + \epsilon_i$, $\epsilon_i$ i.i.d. with mean 0.
☛ Univariate Linear Regression:
$f(x) = \beta_0 + \beta_1 x$
fit by least squares. Minimize:
$\sum_i \big(y_i - f(x_i)\big)^2 = \sum_i \big(y_i - \beta_0 - \beta_1 x_i\big)^2$
to get
$\hat\beta_0, \hat\beta_1$.
☛ The set of all possible functions is .....
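The least-squares fit above can be sketched in a few lines (simulated data; the true coefficients 2.0 and 3.0 are hypothetical choices for the illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: y_i = f(x_i) + eps_i with f linear and i.i.d. mean-zero noise
x = np.linspace(0.0, 1.0, 50)
y = 2.0 + 3.0 * x + rng.normal(0.0, 0.1, size=x.shape)

# Least squares: minimize sum_i (y_i - b0 - b1*x_i)^2
X = np.column_stack([np.ones_like(x), x])       # design matrix [1, x]
b0_hat, b1_hat = np.linalg.lstsq(X, y, rcond=None)[0]
```

With the small noise level used here, the estimates land close to the true intercept and slope.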
– 1 –
SLIDE 2
Non-linear problems
➠ What if the underlying function is not linear?
➠ Try: fit non-linear function from a bag of functions
➠ Problem: which bag? The space of all functions is HUGE
➠ Another problem: We only have SOME data: want to find the underlying function but avoid noise
➠ Need to be selective in choosing possible non-linear functions
– 2 –
SLIDE 3 Basis expansion: polynomial terms
➠ Univariate LS has two basis functions:
$h_1(x) = 1, \qquad h_2(x) = x$
➠ The resulting fit is a linear combination of $h_1, h_2$:
$\hat f(x) = \beta_0 h_1(x) + \beta_1 h_2(x) = \beta_0 + \beta_1 x$
➠ One way: add non-linear functions of $x$ to the $h$-bag. Polynomial terms seem as good as any:
$h_m(x) = x^m, \qquad m = 2, \ldots, M$
➠ Construct matrix $X$, with:
$(X)_{im} = h_m(x_i)$
➠ and fit linear regression with $M + 1$ terms
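A minimal sketch of this basis expansion (the target function, noise level, and degree $M = 5$ are all arbitrary choices for the example):

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 100)
y = np.sin(np.pi * x) + rng.normal(0.0, 0.05, size=x.shape)

M = 5  # highest polynomial degree (an arbitrary choice for this sketch)
# Basis h_m(x) = x^m, m = 0..M, so (X)_{im} = h_m(x_i): M + 1 columns
X = np.vander(x, M + 1, increasing=True)
beta = np.linalg.lstsq(X, y, rcond=None)[0]
y_hat = X @ beta
```

The fit is still linear regression; only the design matrix changed.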
– 3 –
SLIDE 4 Global vs Local fits
➠ One problem with polynomial regression: global fit
➠ Must find a very good global basis for a global fit: unlikely to find the “true” one
➠ Other way: fit locally with “simple” functions
➠ Why it works: It is easier to find a suitable basis for a part of a function.
➠ Tradeoff: in each part we only have a fraction of the data to work with: must be extra-careful not to overfit
– 4 –
SLIDE 5
Polynomial Splines
➠ Flexibility: fit low-order polynomials in small windows of the support of $x$.
➠ Most popular are order 4 (cubic) splines
➠ Must join the pieces somehow: with order-$M$ splines we make sure derivatives up to order $M - 2$ match at the knots
➠ “Naive” basis for cubic splines:
$1_j,\; x_j,\; x_j^2,\; x_j^3$ for each piece $j$,
but many coefficients are constrained by matching derivatives
➠ Truncated-power basis set:
$1,\; x,\; x^2,\; x^3,\; \big\{ (x - \xi_k)_+^3 \big\}_{k=1}^{K}$
equivalent to “naive” set plus constraints
➠ Procedure:
– 5 –
SLIDE 6
– Pick knots $\xi_k$, $k = 1, \ldots, K$
– Construct matrix $X$ using the truncated power basis set (in columns), each evaluated at all data points $x_i$ (rows)
– Run linear regression with ?? terms.
➠ Natural Cubic splines:
$f(x)$ linear beyond the data:
extra two constraints on each side
➠ The number of parameters (degrees of freedom) is now ?
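The truncated-power procedure can be sketched directly (knot locations and the test function are illustrative choices, not from the slides):

```python
import numpy as np

def truncated_power_basis(x, knots):
    """Cubic truncated power basis: 1, x, x^2, x^3, (x - xi_k)_+^3."""
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.clip(x - xi, 0.0, None) ** 3 for xi in knots]
    return np.column_stack(cols)

rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.1, size=x.shape)

knots = np.linspace(0.1, 0.9, 5)      # K = 5 interior knots (arbitrary here)
X = truncated_power_basis(x, knots)   # K + 4 columns
beta = np.linalg.lstsq(X, y, rcond=None)[0]
y_hat = X @ beta
```

With $K$ knots the regression has $K + 4$ terms, which answers the "??" above for the plain (non-natural) cubic case.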
– 6 –
SLIDE 7 Regularization
➠ Avoid the knot-selection problem. Use all possible knots (unique $x_i$'s)
➠ But then we have an over-parameterized regression (N + 2 parameters, N data points)
➠ Need to regularize (shrink) the coefficients:
$\min_{\theta} \sum_i \Big( y_i - \sum_m \theta_m h_m(x_i) \Big)^2$
subject to:
$\theta^{T} \Omega\, \theta \le t$
➠ Without the constraint we get the usual least squares fit: here we get an infinite number of them
➠ The constraint on $\theta$ only allows those fits with certain $\theta$.
➠ $\Omega$ controls the over-all smoothness of the final fit:
$\Omega_{mk} = \int h''_m(x)\, h''_k(x)\, dx$
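As an illustration of the penalized criterion, here is a sketch using a polynomial basis on $[0, 1]$, for which $\Omega$ has a closed form: $h''_m(x) = m(m-1)x^{m-2}$, so $\Omega_{mk} = m(m-1)k(k-1)/(m+k-3)$ for $m, k \ge 2$ (the basis, data, and $\lambda$ are choices made for the sketch, not part of the slides):

```python
import numpy as np

M = 7  # highest degree (arbitrary for the sketch)

def omega(M):
    """Closed-form Omega_{mk} = int_0^1 h''_m(x) h''_k(x) dx for h_m = x^m."""
    O = np.zeros((M + 1, M + 1))
    for m in range(2, M + 1):
        for k in range(2, M + 1):
            O[m, k] = m * (m - 1) * k * (k - 1) / (m + k - 3)
    return O

rng = np.random.default_rng(3)
x = np.linspace(0.0, 1.0, 100)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.1, size=x.shape)
X = np.vander(x, M + 1, increasing=True)         # (X)_{im} = x_i^m

lam = 1e-6                                        # Lagrange form of the constraint
theta = np.linalg.solve(X.T @ X + lam * omega(M), X.T @ y)
y_hat = X @ theta
```

The Lagrange-multiplier form $\min_\theta \|y - X\theta\|^2 + \lambda\, \theta^T \Omega \theta$ used in the code is equivalent to the constrained form above, with $\lambda$ playing the role described on the next slide.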
– 7 –
SLIDE 8
➠ This remarkably solves a general variational problem:
$\min_{f} \sum_i \big( y_i - f(x_i) \big)^2 + \lambda \int_a^b \left[ f''(u) \right]^2 du$
$\lambda$ is in one-to-one correspondence with $t$ above.
➠ Solution: Natural Cubic Spline with knots at each
$x_i$.
➠ Benefit: Can get all fits
$\hat f(x_i)$ in O(N).
– 8 –
SLIDE 9 B-spline Basis
☛ Most smoothing splines are computationally fitted using the B-spline basis
☛ B-splines are a basis for polynomial splines on a closed interval. Each cubic B-spline spans at most 5 knots.
☛ Computationally, one sets up an
$N \times (N + 4)$
matrix $X$ of the ordered, evaluated B-spline basis functions.
Each column, $k$, is the $k$th B-spline, and its center moves from the left-most to the right-most point.
☛ $X$ has banded structure and so does
$(X^{T} X + \lambda \Omega)$,
where:
$\Omega_{jk} = \int B''_j(u)\, B''_k(u)\, du$
☛ One then solves a penalized regression problem:
$\hat{y} = X \,(X^{T} X + \lambda \Omega)^{-1} X^{T} y$
☛ This is actually done using Cholesky factorization and back-substitution to get O(N) running time.
– 9 –
SLIDE 10
☛ Conceptually, the function
$f$ to be fitted is
expanded into a B-spline basis set:
$f(x) = \sum_k \theta_k B_k(x)$
and fit obtained by constrained least-squares:
$\hat f = \arg\min_{\theta}\; \| y - X\theta \|^2$
subject to penalty:
$J(f) \le t$
Here
$J(f)$ is the familiar squared
second-derivative functional:
$J(f) = \int \left[ f''(u) \right]^2 du$
and
$t$ is one-to-one with $\lambda$
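The B-spline pipeline of the last two slides can be sketched end-to-end. This assumes SciPy's `BSpline.design_matrix` is available (SciPy ≥ 1.8), and it approximates the penalty integral by finite differences on a fine grid rather than exact piecewise integration; data, knots, and $\lambda$ are illustrative choices:

```python
import numpy as np
from scipy.interpolate import BSpline

rng = np.random.default_rng(4)
x = np.linspace(0.0, 1.0, 50)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.1, size=x.shape)

k = 3                                            # cubic B-splines
interior = np.linspace(0.1, 0.9, 9)
t = np.r_[[0.0] * (k + 1), interior, [1.0] * (k + 1)]  # clamped knot vector

# N x n_basis matrix of evaluated B-splines; it has banded structure
X = BSpline.design_matrix(x, t, k).toarray()

# Omega_{jk} ~ int B''_j B''_k du, approximated by finite differences
grid = np.linspace(0.0, 1.0, 2001)
G = BSpline.design_matrix(grid, t, k).toarray()
d2 = np.gradient(np.gradient(G, grid, axis=0), grid, axis=0)
Omega = d2.T @ d2 * (grid[1] - grid[0])

lam = 1e-5
A = X.T @ X + lam * Omega                        # symmetric positive definite
L = np.linalg.cholesky(A)                        # Cholesky factor
z = np.linalg.solve(L, X.T @ y)                  # forward substitution
theta = np.linalg.solve(L.T, z)                  # back-substitution
y_hat = X @ theta
```

A production implementation would exploit the banded structure of $X^T X + \lambda \Omega$ (banded Cholesky) to reach the O(N) running time claimed above; the dense solve here is only for clarity.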
– 10 –
SLIDE 11 Equivalent DF
➠ Smoothing splines (and many other smoothing procedures) are usually called semi-parametric models
➠ Once you expand $x$ into a basis set, it looks like any other linear regression
➠ BUT the individual terms, $h_m(x)$, have no real meaning
➠ With penalties, one cannot count the number of terms to get the degrees of freedom
➠ An equivalent expression is needed for guidance and (approximate) inference
➠ In regular regression:
$df = \mathrm{tr}\!\left( X (X^{T} X)^{-1} X^{T} \right)$
this is the trace of a hat, or projection, matrix.
➠ All penalized regressions (including cubic
– 11 –
SLIDE 12
smoothing splines) are obtained by:
$\hat{y} = B \,(B^{T} B + \lambda \Omega)^{-1} B^{T} y = S\, y$
➠ While $S$ is not a projection matrix, it has similar properties (it is a shrunk projection operator)
➠ Define:
$df_{\lambda} = \mathrm{tr}\, S$
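A small sketch of the equivalent-df idea (the basis and the diagonal penalty that leaves the intercept and slope unpenalized are hypothetical choices for the illustration):

```python
import numpy as np

n, p = 40, 6
x = np.linspace(-1.0, 1.0, n)
X = np.vander(x, p, increasing=True)          # basis 1, x, ..., x^5

# Hypothetical penalty: leave the intercept and slope unpenalized
Omega = np.diag([0.0, 0.0, 1.0, 1.0, 1.0, 1.0])

def equivalent_df(lam):
    """tr(S_lambda) with S_lambda = X (X^T X + lam * Omega)^(-1) X^T."""
    S = X @ np.linalg.solve(X.T @ X + lam * Omega, X.T)
    return np.trace(S)

df_unpenalized = equivalent_df(0.0)   # projection matrix: trace = p = 6
df_shrunk = equivalent_df(1e6)        # heavy penalty: df approaches 2
```

With no penalty, $S$ is an ordinary projection and its trace counts the basis terms; as $\lambda$ grows, the trace shrinks toward the dimension of the unpenalized subspace, which is what makes $\mathrm{tr}\, S$ a sensible "equivalent" degrees of freedom.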
– 12 –