Linear Regression
David M. Blei, COS424, Princeton University, April 4, 2012
Regression
- We have studied classification, the problem of automatically categorizing data into a set of discrete classes.
- E.g., based on its words, is an email spam or ham?
- Regression is the problem of predicting a real-valued variable from input data.
Linear regression
[Figure: scatter plot of input vs. response]
Data are a set of inputs and outputs, $\mathcal{D} = \{(x_n, y_n)\}_{n=1}^{N}$.
Linear regression
The goal is to predict y from x using a linear function.
Examples
- Given today’s weather, how much will it rain tomorrow?
- Given today’s market, what will be the price of a stock tomorrow?
- Given her emails, how long will a user stay on a page?
- Others?
Linear regression
$f(x) = \beta_0 + \beta x$, fit to the data points $(x_n, y_n)$.
Multiple inputs
- Usually, we have a vector of inputs, each representing a different feature of the data that might be predictive of the response: $x = \langle x_1, x_2, \ldots, x_p \rangle$
- The response is assumed to be a linear function of the input:
$$f(x) = \beta_0 + \sum_{i=1}^{p} x_i \beta_i$$
- Here, $\beta^\top x = 0$ is a hyperplane.
Multiple inputs
[Figure: data points and a fitted regression plane over inputs $X_1$, $X_2$ with response $Y$]
Flexibility of linear regression
- This set-up is less limiting than you might imagine.
- Inputs can be:
- Any features of the data
- Transformations of the original features, e.g., $x_2 = \log x_1$ or $x_2 = \sqrt{x_1}$
- A basis expansion, e.g., $x_2 = x_1^2$ and $x_3 = x_1^3$
- Indicators of qualitative inputs, e.g., category membership
- Interactions between inputs, e.g., $x_1 = x_2 x_3$
- Its simplicity and flexibility make linear regression one of the most important and widely used statistical prediction techniques.
Polynomial regression example
[Figure: scatter plot of input vs. response for the polynomial example]
Linear regression
$f(x) = \beta_0 + \beta x$
Polynomial regression
$f(x) = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3$
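A cubic fit like this is still linear regression in the coefficients: expand $x$ into the basis $(1, x, x^2, x^3)$ and solve by least squares. A minimal NumPy sketch; the data and generating coefficients below are illustrative assumptions, not the figure's data:

```python
import numpy as np

# Synthetic cubic data (illustrative; not the data in the figure)
rng = np.random.default_rng(0)
x = np.linspace(-2, 2, 50)
y = 1.0 + 0.5 * x - 2.0 * x**2 + 0.8 * x**3 + rng.normal(0, 0.3, size=x.shape)

# Basis expansion: columns 1, x, x^2, x^3
X = np.column_stack([np.ones_like(x), x, x**2, x**3])

# Least squares fit of f(x) = b0 + b1*x + b2*x^2 + b3*x^3
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
```

With this much data and modest noise, `beta` should roughly recover the generating coefficients (1.0, 0.5, −2.0, 0.8).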
Fitting a regression
- Given data $\mathcal{D} = \{(x_n, y_n)\}_{n=1}^{N}$, find the coefficient $\beta$ that can predict $y_{\text{new}}$ from $x_{\text{new}}$.
- Simplifications:
- 0-intercept, i.e., $\beta_0 = 0$
- One input, i.e., $p = 1$
- How should we proceed?
Residual sum of squares
A reasonable approach is to minimize the sum of squared Euclidean distances between each prediction $\beta x_n$ and the truth $y_n$:
$$\mathrm{RSS}(\beta) = \frac{1}{2} \sum_{n=1}^{N} (y_n - \beta x_n)^2$$
RSS for two inputs
Optimizing β
The objective function is
$$\mathrm{RSS}(\beta) = \frac{1}{2} \sum_{n=1}^{N} (y_n - \beta x_n)^2$$
The derivative is
$$\frac{d}{d\beta} \mathrm{RSS}(\beta) = -\sum_{n=1}^{N} (y_n - \beta x_n) x_n$$
The optimal value is
$$\hat{\beta} = \frac{\sum_{n=1}^{N} y_n x_n}{\sum_{n=1}^{N} x_n^2}$$
The optimal β
- The optimal value is
$$\hat{\beta} = \frac{\sum_{n=1}^{N} y_n x_n}{\sum_{n=1}^{N} x_n^2}$$
- Positive values pull the slope up.
- Negative values pull the slope down.
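In code, this estimate is just two sums. A quick sketch on synthetic zero-intercept data; the slope 0.7 and the noise level are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 0.7 * x + rng.normal(0, 0.1, size=100)  # true slope 0.7, no intercept

# beta_hat = (sum_n y_n * x_n) / (sum_n x_n^2)
beta_hat = np.sum(y * x) / np.sum(x * x)
```

Points with large positive $y_n x_n$ increase the numerator and pull the slope up, matching the intuition above.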
Prediction
- After finding the optimal $\beta$, we would like to predict a new output from a new input.
- We use the point on the line at the input:
$$\hat{y}_{\text{new}} = \hat{\beta} x_{\text{new}}$$
Prediction
- Note the difference between classification and prediction.
- Note that linear regression assumes the input is always observed.
Multiple inputs
In general,
$$y = \beta_0 + \sum_{i=1}^{p} \beta_i x_i$$
To simplify, let $\beta$ be a $(p+1)$-vector and set $x_{p+1} = 1$. Now the RSS is
$$\mathrm{RSS}(\beta) = \frac{1}{2} \sum_{n=1}^{N} (y_n - \beta^\top x_n)^2$$
(Note that $\beta_{p+1}$ is $\beta_0$ in the old notation.)
Multiple inputs
The objective is:
$$\mathrm{RSS}(\beta) = \frac{1}{2} \sum_{n=1}^{N} (y_n - \beta^\top x_n)^2$$
The derivative with respect to $\beta_i$ is:
$$\frac{\partial}{\partial \beta_i} \mathrm{RSS} = -\sum_{n=1}^{N} (y_n - \beta^\top x_n) x_{n,i}$$
As a vector, the gradient is:
$$\nabla_\beta \mathrm{RSS} = -\sum_{n=1}^{N} (y_n - \beta^\top x_n) x_n$$
One option: optimize with some kind of gradient-based algorithm.
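A sketch of that option: batch gradient descent using the gradient above. The step size, iteration count, and synthetic data are hand-picked assumptions for this example, not a general recipe:

```python
import numpy as np

rng = np.random.default_rng(2)
N, p = 200, 3
X = rng.normal(size=(N, p))
true_beta = np.array([1.0, -2.0, 0.5])
y = X @ true_beta + rng.normal(0, 0.1, size=N)

beta = np.zeros(p)
step = 0.001                         # fixed step size, chosen by hand
for _ in range(2000):
    grad = -X.T @ (y - X @ beta)     # gradient of the RSS
    beta = beta - step * grad        # move against the gradient
```

With a small enough step size, the iterates converge to the least squares solution.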
The normal equations
The design matrix is an $N \times (p+1)$ matrix:
$$X = \begin{pmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,p} & 1 \\ x_{2,1} & x_{2,2} & \cdots & x_{2,p} & 1 \\ \vdots & & & & \vdots \\ x_{N,1} & x_{N,2} & \cdots & x_{N,p} & 1 \end{pmatrix}$$
The response vector is an $N$-vector: $y = \langle y_1, y_2, \ldots, y_N \rangle$. Recall that the parameter vector is a $(p+1)$-vector: $\beta = \langle \beta_1, \beta_2, \ldots, \beta_{p+1} \rangle$
The normal equations
With these definitions, the gradient of the RSS is
$$\nabla_\beta \mathrm{RSS} = -X^\top (y - X\beta)$$
Setting to the 0-vector and solving for $\beta$:
$$X^\top y - X^\top X \hat{\beta} = 0$$
$$X^\top X \hat{\beta} = X^\top y$$
$$\hat{\beta} = (X^\top X)^{-1} X^\top y$$
This works as long as $X^\top X$ is invertible, i.e., $X$ is full rank.
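A sketch of the normal equations in NumPy; solving the linear system $X^\top X \beta = X^\top y$ directly is preferable to forming the inverse explicitly. The data are synthetic assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
N, p = 100, 2
inputs = rng.normal(size=(N, p))
X = np.column_stack([inputs, np.ones(N)])  # design matrix with a constant column
true_beta = np.array([2.0, -1.0, 0.5])     # last entry plays the role of beta_0
y = X @ true_beta + rng.normal(0, 0.1, size=N)

# Solve X^T X beta = X^T y rather than inverting X^T X
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
```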
Probabilistic interpretation
- Our reasoning so far has not included any probabilities.
- It is no surprise that linear regression has a probabilistic interpretation.
- What do you think that it is?
Probabilistic interpretation
[Graphical model: $\beta$ and $x_n$ generate $Y_n$, with a plate over $N$]
- Linear regression assumes that the outputs are drawn from a Normal distribution whose mean is a linear function of the coefficients and the input:
$$Y_n \mid x_n, \beta \sim \mathcal{N}(\beta \cdot x_n, \sigma^2)$$
- This is like putting a Gaussian “bump” around the mean, which is a linear function of the input.
- Note that this is a conditional model. The inputs are not modeled.
Conditional maximum likelihood
We find the parameter vector $\beta$ that maximizes the conditional likelihood. The conditional log likelihood of data $\mathcal{D} = \{(x_n, y_n)\}_{n=1}^{N}$ is
$$\mathcal{L}(\beta) = \log \prod_{n=1}^{N} p(y_n \mid x_n, \beta) = \log \prod_{n=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ \frac{-(y_n - \beta^\top x_n)^2}{2\sigma^2} \right\} = \sum_{n=1}^{N} \left( -\frac{1}{2} \log 2\pi\sigma^2 - \frac{1}{2} (y_n - \beta^\top x_n)^2 / \sigma^2 \right)$$
Question: What happens when we optimize with respect to $\beta$?
Conditional maximum likelihood
Maximizing the conditional log likelihood with respect to $\beta$,
$$\mathcal{L}(\beta) = \sum_{n=1}^{N} \left( -\frac{1}{2} \log 2\pi\sigma^2 - \frac{1}{2} (y_n - \beta^\top x_n)^2 / \sigma^2 \right)$$
is the same as minimizing the residual sum of squares
$$\mathrm{RSS}(\beta) = \frac{1}{2} \sum_{n=1}^{N} (y_n - \beta^\top x_n)^2$$
The maximum likelihood estimates are identical to the estimates we obtained earlier. Question: What is the probabilistic interpretation of prediction?
Probabilistic prediction
- In prediction, we estimate the conditional expectation:
$$E[y_{\text{new}} \mid x_{\text{new}}] = \beta^\top x_{\text{new}}$$
- This is identical to the geometric treatment.
- Note: the variance term $\sigma^2$ does not play a role in estimation or prediction.
“Real-world” example
[Figure: scatter plot of eruption time (minutes) vs. waiting time to the next eruption (minutes)]
Important aside
- A pervasive concept in machine learning and statistics is the bias-variance trade-off.
- Consider a random data set that is drawn from a linear regression model, $Y_n \mid x_n, \beta \sim \mathcal{N}(\beta x_n, \sigma^2)$.
- We can contemplate the maximum likelihood estimate $\hat{\beta}$ as a random variable whose distribution is governed by the distribution of the data set $\mathcal{D} = \{(x_n, y_n)\}_{n=1}^{N}$.
Bias variance decomposition
Suppose we observe a new data input $x$. We can consider the mean squared error of our estimate of $E[y \mid x] = \hat{\beta} x$:
$$\mathrm{MSE}(\hat{\beta} x) = E[(\hat{\beta} x - \beta x)^2]$$
Note that $\beta$ is not random and $\hat{\beta}$ is random.
$$\mathrm{MSE} = E[(\hat{\beta} x)^2] - 2 E[\hat{\beta} x] \beta x + (\beta x)^2$$
$$= E[(\hat{\beta} x)^2] - 2 E[\hat{\beta} x] (\beta x) + (\beta x)^2 + E[\hat{\beta} x]^2 - E[\hat{\beta} x]^2$$
$$= \left( E[(\hat{\beta} x)^2] - E[\hat{\beta} x]^2 \right) + \left( E[\hat{\beta} x] - \beta x \right)^2$$
Bias variance decomposition
$$\mathrm{MSE} = \left( E[(\hat{\beta} x)^2] - E[\hat{\beta} x]^2 \right) + \left( E[\hat{\beta} x] - \beta x \right)^2$$
- The second term is the squared bias,
$$\mathrm{bias} = E[\hat{\beta} x] - \beta x$$
An estimate for which this term is zero is an unbiased estimate.
- The first term is the variance,
$$\mathrm{variance} = E[(\hat{\beta} x)^2] - E[\hat{\beta} x]^2$$
This reflects how sensitive the estimate is to the randomness inherent in the data.
Bias variance and prediction error
What about prediction error, which is what we ultimately care about? Suppose we see a new input $x$. The expected squared prediction error is
$$E[E_Y[(\hat{\beta} x - Y)^2]]$$
The first expectation is taken over the randomness of $\hat{\beta}$. The second is taken over the randomness of $Y$ given $x$.
$$E[E_Y[(\hat{\beta} x - Y)^2]] = \mathrm{Var}(Y) + \mathrm{MSE}(\hat{\beta} x) = \sigma^2 + \mathrm{Bias}^2(\hat{\beta} x) + \mathrm{Var}(\hat{\beta} x)$$
The first term is the inherent uncertainty around the true mean; the second two terms are the bias-variance decomposition of the estimator.
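The decomposition can be checked by simulation: draw many data sets from the model, compute $\hat{\beta} x$ for each, and compare the Monte Carlo MSE to bias² plus variance. The model parameters below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
beta_true, sigma, N = 0.7, 0.5, 20
x_new = 1.5

# Draw many data sets; compute the zero-intercept estimate for each
preds = []
for _ in range(5000):
    x = rng.normal(size=N)
    y = beta_true * x + rng.normal(0, sigma, size=N)
    beta_hat = np.sum(y * x) / np.sum(x * x)
    preds.append(beta_hat * x_new)
preds = np.asarray(preds)

target = beta_true * x_new
mse = np.mean((preds - target) ** 2)     # Monte Carlo MSE of the estimate
bias2 = (np.mean(preds) - target) ** 2   # squared bias
var = np.var(preds)                      # variance of the estimate
```

On the samples, `mse == bias2 + var` holds as an algebraic identity, and since least squares is unbiased here, the bias term is tiny.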
Gauss-Markov theorem
$$\mathrm{MSE} = \left( E[(\hat{\beta} x)^2] - E[\hat{\beta} x]^2 \right) + \left( E[\hat{\beta} x] - \beta x \right)^2$$
The Gauss-Markov theorem states that the MLE/least squares estimate of $\beta$ has the smallest variance among all linear unbiased estimates.
Bias variance trade-off
[Figure: sampling distribution of $\hat{\beta}$]
- Classical statistics focuses on unbiased estimates.
- Modern statistics has explored the trade-off.
- We might sacrifice a little bias for a larger reduction in variance.
Regularization
[Figure: RSS contours in $(\beta_1, \beta_2)$ with a constraint region; $\hat{\beta}$ marks the unconstrained minimum]
- In regression, we can make this trade-off with regularization, which means placing constraints on the coefficients $\beta$.
- Intuitively, this reduces the variance because it limits the space in which the parameter vector $\beta$ can live.
- If the true MLE of $\beta$ lives outside that space, then the resulting estimate must be biased, by the Gauss-Markov theorem.
Regularization
- Regularization encourages smaller and simpler models.
- Intuitively, simpler models are more robust to overfitting, i.e., generalizing poorly because of too close a match to the training data.
- Simpler models can also be more interpretable, which is another goal of regression.
Ridge regression
- In ridge regression, we optimize the RSS subject to a constraint on the sum of squares of the coefficients:
$$\text{minimize } \sum_{n=1}^{N} \frac{1}{2} (y_n - \beta x_n)^2 \quad \text{subject to } \sum_{i=1}^{p} \beta_i^2 \le s$$
- This constrains the coefficients to live within a sphere of radius $\sqrt{s}$.
Ridge regression
- What happens as s increases?
Ridge regression
- The ridge regression estimate can also be expressed as
$$\hat{\beta}_{\text{ridge}} = \arg\min_\beta \sum_{n=1}^{N} \frac{1}{2} (y_n - \beta x_n)^2 + \lambda \sum_{i=1}^{p} \beta_i^2$$
- This problem is convex.
- If the covariates are uncorrelated, it has an analytic solution. (You’ll see this on your homework.)
Ridge regression
$$\hat{\beta}_{\text{ridge}} = \arg\min_\beta \sum_{n=1}^{N} \frac{1}{2} (y_n - \beta x_n)^2 + \lambda \sum_{i=1}^{p} \beta_i^2$$
- There is a 1-1 mapping between $s$ and $\lambda$.
- $\lambda$ is the complexity parameter; it determines the radius of the sphere.
- Trades off an increase in bias for a decrease in variance.
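For this penalized objective, a closed form exists even with correlated covariates: setting the gradient of $\frac{1}{2}\sum_n (y_n - \beta^\top x_n)^2 + \lambda \sum_i \beta_i^2$ to zero gives $(X^\top X + 2\lambda I)\hat{\beta} = X^\top y$ (texts that scale the penalty by $\frac{1}{2}$ write $X^\top X + \lambda I$). A sketch on synthetic data (a standard identity, not derived on these slides):

```python
import numpy as np

rng = np.random.default_rng(5)
N, p = 100, 5
X = rng.normal(size=(N, p))
true_beta = np.array([3.0, -1.5, 0.0, 2.0, 0.5])
y = X @ true_beta + rng.normal(0, 0.5, size=N)

def ridge(X, y, lam):
    # Minimize (1/2) * sum (y_n - beta^T x_n)^2 + lam * sum beta_i^2:
    # the gradient vanishes where (X^T X + 2*lam*I) beta = X^T y.
    return np.linalg.solve(X.T @ X + 2.0 * lam * np.eye(X.shape[1]), X.T @ y)

beta_ols = ridge(X, y, 0.0)     # lam = 0 recovers least squares
beta_reg = ridge(X, y, 1e4)     # a large lam shrinks the estimate toward 0
```

As $\lambda$ grows, the coefficients shrink toward zero, trading an increase in bias for a decrease in variance.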
Prostate cancer data
- Study from Stamey et al. (1989)
- Examined the correlation between the level of prostate-specific antigen and a number of clinical measures in men about to receive a procedure
- Variables are
- log cancer volume
- log prostate weight
- age
- log of the amount of benign prostatic hyperplasia
- seminal vesicle invasion
- log of capsular penetration
- Gleason score
- percent of Gleason scores 4 or 5
Coefficients as a function of λ
[Figure: ridge coefficient profiles for lcavol, lweight, age, lbph, svi, lcp, gleason, and pgg45, plotted against df($\lambda$)]
How can we choose λ?
Choosing λ
- The choice of complexity parameter greatly affects our estimate.
- What would happen if we used training error as the criterion?
- In practice, $\lambda$ is chosen by cross validation.
- This is an attempt to minimize test error.
Cross-validation to choose the complexity parameter
- Divide the data into 10 folds.
- Decide on candidate values of $\lambda$ (e.g., a grid between 0 and 1).
- For each fold and value of $\lambda$,
- Estimate $\hat{\beta}_{\text{ridge}}$ on the out-of-fold samples.
- For each within-fold sample $x_n$, compute its squared error $\varepsilon_n = (\hat{y}_n - y_n)^2$.
- The score for that value of $\lambda$ is
$$\mathrm{MSE}(\lambda) = \frac{1}{N} \sum_{n=1}^{N} \varepsilon_n$$
- Choose the value of $\lambda$ that minimizes this score.
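The procedure above, sketched in NumPy. The synthetic data, the candidate grid, and the ridge solver's closed form (gradient of $\frac{1}{2}$RSS plus the squared penalty set to zero) are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(6)
N, p = 200, 5
X = rng.normal(size=(N, p))
y = X @ np.array([1.0, 0.5, 0.0, -1.0, 0.0]) + rng.normal(0, 0.3, size=N)

def ridge(X_train, y_train, lam):
    # Penalized least squares: (1/2) RSS + lam * sum beta_i^2
    A = X_train.T @ X_train + 2.0 * lam * np.eye(X_train.shape[1])
    return np.linalg.solve(A, X_train.T @ y_train)

def cv_mse(X, y, lam, n_folds=10):
    folds = np.array_split(np.arange(len(y)), n_folds)
    errs = np.empty(len(y))
    for fold in folds:
        train = np.setdiff1d(np.arange(len(y)), fold)
        beta = ridge(X[train], y[train], lam)           # fit on out-of-fold samples
        errs[fold] = (y[fold] - X[fold] @ beta) ** 2    # within-fold squared errors
    return errs.mean()                                  # MSE(lambda)

grid = np.linspace(0.0, 1.0, 11)                        # candidate values of lambda
best_lam = min(grid, key=lambda lam: cv_mse(X, y, lam))
```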
Cross-validation to choose the complexity parameter
- The score for that value of $\lambda$ is
$$\mathrm{MSE}(\lambda) = \frac{1}{N} \sum_{n=1}^{N} \varepsilon_n$$
- Choose the value of $\lambda$ that minimizes this value.
- Notice that each $\varepsilon_n$ was computed from a model that did not include the $n$th data point in its fit.
- Thus, $\mathrm{MSE}(\lambda)$ is an estimate of test error.
Aside: Bayesian statistics
- In Bayesian statistics, we treat the parameter as a random variable.
- In the model, it is endowed with a prior distribution.
- Rather than estimate the parameter, we perform posterior inference.
- In general,
$$\theta \sim G_0(\alpha) \qquad y_n \sim F(\theta)$$
and posterior inference is concerned with $p(\theta \mid y_1, \ldots, y_N, \alpha)$.
- The parameter to the prior, $\alpha$, is called a hyperparameter.
Aside: Bayesian statistics
There are two usual ways of using the posterior to obtain an estimate
- Maximum a posteriori estimate:
$$\theta_{\text{MAP}} = \arg\max_\theta p(\theta \mid y_1, \ldots, y_N, \alpha)$$
- Posterior mean estimate:
$$\theta_{\text{mean}} = E[\theta \mid y_1, \ldots, y_N, \alpha]$$
- Why are these different from the MLE?
Ridge regression
[Graphical model: $\lambda$ generates $\beta$; $\beta$ and $x_n$ generate $Y_n$, with a plate over $N$]
Ridge regression corresponds to MAP estimation in the following model:
$$\beta_i \sim \mathcal{N}(0, 1/\lambda)$$
$$Y_n \mid x_n, \beta \sim \mathcal{N}(\beta^\top x_n, \sigma^2)$$
Bayesian interpretation of ridge regression
Note that
$$p(\beta_i \mid \lambda) = \frac{1}{\sqrt{2\pi (1/\lambda)}} \exp\{-\lambda \beta_i^2 / 2\}$$
Let’s compute the MAP estimate of $\beta$:
$$\max_\beta p(\beta \mid y_{1:N}, x_{1:N}, \lambda) = \max_\beta \log p(\beta \mid y_{1:N}, x_{1:N}, \lambda)$$
$$= \max_\beta \log p(\beta, y_{1:N} \mid x_{1:N}, \lambda)$$
$$= \max_\beta \log \left[ p(y_{1:N} \mid x_{1:N}, \beta) \prod_{i=1}^{p} p(\beta_i \mid \lambda) \right]$$
$$= \max_\beta -\mathrm{RSS}(\beta; y_{1:N}, x_{1:N}) - \frac{\lambda}{2} \sum_{i=1}^{p} \beta_i^2$$
Bayesian intuitions
- The hyperparameter controls how far away the estimate will be from the MLE.
- A small hyperparameter (large variance) will choose the MLE, i.e., the data totally determine the estimate.
- As the hyperparameter gets larger, the estimate moves further from the MLE. The prior ($E[\beta] = 0$) becomes more influential.
- A theme in Bayesian estimation: both the data and the prior influence the answer.
Summary of ridge regression
- We constrain $\beta$ to be in a hypersphere around 0.
- This is equivalent to minimizing the RSS plus a regularization term.
- We no longer find the $\hat{\beta}$ that minimizes the RSS. (Contours illustrate constant RSS.)
- Also called shrinkage, because we are reducing the components to be close to 0 and close to each other.
- Ridge estimates trade off bias for variance.
The lasso
[Figure: RSS contours and a diamond-shaped constraint region in $(\beta_1, \beta_2)$]
- A related regularization method is called the lasso.
- We optimize the RSS subject to a different constraint:
$$\text{minimize } \sum_{n=1}^{N} \frac{1}{2} (y_n - \beta x_n)^2 \quad \text{subject to } \sum_{i=1}^{p} |\beta_i| \le s$$
- This small change yields very different estimates.
Lasso
- What happens as s increases?
- Where is the solution going to lie?
Lasso
- It’s a fact: unless it chooses the unconstrained $\hat{\beta}$, the lasso will set some of the coefficients to exactly zero.
- This is a form of feature selection, identifying a relevant subset of our inputs to perform prediction.
- Trades off an increase in bias with a decrease in variance.
- And, provides interpretable (sparse) models.
Lasso
- The lasso is equivalent to
$$\hat{\beta}_{\text{lasso}} = \arg\min_\beta \sum_{n=1}^{N} \frac{1}{2} (y_n - \beta x_n)^2 + \lambda \sum_{i=1}^{p} |\beta_i|$$
- Again, there is a 1-1 mapping between $\lambda$ and $s$.
- This objective is still convex!
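For intuition about where the exact zeros come from, consider the special case of an orthonormal design ($X^\top X = I$): the objective then separates across coordinates, and each $\hat{\beta}_{\text{lasso},i}$ has a closed-form "soft-thresholding" solution. This sketch assumes that special case; general designs need an iterative solver:

```python
import numpy as np

def soft_threshold(z, lam):
    # argmin_b 0.5 * (b - z)^2 + lam * |b|  (per-coordinate lasso solution)
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

rng = np.random.default_rng(7)
Q, _ = np.linalg.qr(rng.normal(size=(100, 5)))   # orthonormal columns: Q^T Q = I
true_beta = np.array([2.0, -1.0, 0.0, 0.5, 0.0])
y = Q @ true_beta + rng.normal(0, 0.1, size=100)

z = Q.T @ y                           # the least squares estimate when X^T X = I
beta_lasso = soft_threshold(z, 0.3)   # small coefficients are set exactly to 0
```

Coefficients whose least squares estimates are smaller than $\lambda$ in magnitude become exactly zero; larger ones are shrunk by $\lambda$. This is the feature selection described above.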
Why the lasso is exciting
$$\hat{\beta}_{\text{lasso}} = \arg\min_\beta \sum_{n=1}^{N} \frac{1}{2} (y_n - \beta x_n)^2 + \lambda \sum_{i=1}^{p} |\beta_i|$$
- Prior to the lasso, the only “sparse” method was subset selection, finding the best subset of features with which to model the data.
- But searching over all subsets is very computationally expensive.
- The lasso efficiently finds a sparse solution with convex optimization.
- This is akin to a “smooth version” of subset selection.
- Note: the lasso won’t consider all possible subsets.
Optimizing λ
[Figure: lasso coefficient profiles for lcavol, lweight, age, lbph, svi, lcp, gleason, and pgg45, plotted against the shrinkage factor $s$]
As we increase s (decrease λ), coefficients become non-zero.
Choosing λ with LARS
- Again, we choose the complexity parameter $\lambda$ with cross-validation.
- The LARS algorithm (Efron et al., 2004) lets us efficiently explore the entire regularization path of $\lambda$.
Bayesian interpretation of the lasso
[Graphical model: $\lambda$ generates $\beta$; $\beta$ and $x_n$ generate $Y_n$, with a plate over $N$]
Lasso regression corresponds to MAP estimation in the following model:
$$\beta_i \sim \mathrm{Laplace}(\lambda)$$
$$Y_n \mid x_n, \beta \sim \mathcal{N}(\beta^\top x_n, \sigma^2)$$
where the coefficients come from a Laplace distribution,
$$p(\beta_i \mid \lambda) = \frac{\lambda}{2} \exp\{-\lambda |\beta_i|\}$$
Generalized regularization
- In general, regularization can be seen as minimizing the RSS with a constraint on a $q$-norm:
$$\text{minimize } \sum_{n=1}^{N} \frac{1}{2} (y_n - \beta x_n)^2 \quad \text{subject to } \|\beta\|_q \le s$$
- The methods we discussed so far:
- $q = 2$: ridge regression
- $q = 1$: lasso
- $q = 0$: subset selection
Generalized regularization
[Figure: constraint regions $\|\beta\|_q \le s$ for $q = 4, 2, 1, 0.5, 0.1$]
- This brings us away from the minimum RSS solution, but might provide better test prediction via the bias/variance trade-off.
- Complex models have less bias; simpler models have less variance. Regularization encourages simpler models.
Generalized regularization
- Each of these methods corresponds to a Bayesian solution with a different choice of prior:
$$\hat{\beta} = \arg\min_\beta \sum_{n=1}^{N} \frac{1}{2} (y_n - \beta x_n)^2 + \lambda \|\beta\|_q$$
- The complexity parameter $\lambda$ can be chosen with cross validation.
- Lasso ($q = 1$) is the only norm that provides both sparsity and convexity.