SLIDE 1

CZECH TECHNICAL UNIVERSITY IN PRAGUE
Faculty of Electrical Engineering, Department of Cybernetics

Linear regression

Petr Pošík, © 2015, Artificial Intelligence

SLIDE 2

Linear regression

SLIDE 3

Linear regression

A regression task is a supervised learning task, i.e.

■ a training (multi)set T = {(x^(1), y^(1)), ..., (x^(|T|), y^(|T|))} is available, where
■ the labels y^(i) are quantitative, often continuous (as opposed to classification tasks, where the y^(i) are nominal).
■ Its purpose is to model the relationship between the independent variables (inputs) x = (x_1, ..., x_D) and the dependent variable (output) y.

Linear regression is a particular regression model which assumes (and learns) a linear relationship between the inputs and the output:

$$\hat{y} = h(\mathbf{x}) = w_0 + w_1 x_1 + \dots + w_D x_D = w_0 + \langle \mathbf{w}, \mathbf{x} \rangle = w_0 + \mathbf{x}\mathbf{w}^T,$$

where

■ ŷ is the model prediction (an estimate of the true value y),
■ h(x) is the linear model (a hypothesis),
■ w_0, ..., w_D are the coefficients of the linear function (w_0 is the bias), organized in a row vector w,
■ ⟨w, x⟩ is the dot (scalar) product of the vectors w and x,
■ which can also be computed as the matrix product xw^T if w and x are row vectors.
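To make the notation concrete, here is a minimal numpy sketch of evaluating the linear model for a single example; the function and variable names (h, w0, w, x) are our own illustrations, not code from the slides.

```python
import numpy as np

def h(x, w0, w):
    """Linear model: y_hat = w0 + <w, x> for one example x with D features."""
    return w0 + np.dot(w, x)

x = np.array([2.0, 3.0])            # one example, D = 2
w0, w = 1.0, np.array([0.5, -1.0])  # bias and coefficient vector
y_hat = h(x, w0, w)                 # 1 + 0.5*2 - 1.0*3 = -1.0
```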

SLIDE 4

Notation remarks

Homogeneous coordinates: if we add "1" as the first element of x, so that x = (1, x_1, ..., x_D), then we can write the linear model in an even simpler form (without the explicit bias term):

$$\hat{y} = h(\mathbf{x}) = w_0 \cdot 1 + w_1 x_1 + \dots + w_D x_D = \langle \mathbf{w}, \mathbf{x} \rangle = \mathbf{x}\mathbf{w}^T.$$

Matrix notation: if we organize the data into a matrix X and a vector y, such that

$$X = \begin{pmatrix} 1 & \mathbf{x}^{(1)} \\ \vdots & \vdots \\ 1 & \mathbf{x}^{(|T|)} \end{pmatrix} \qquad \text{and} \qquad \mathbf{y} = \begin{pmatrix} y^{(1)} \\ \vdots \\ y^{(|T|)} \end{pmatrix},$$

and similarly with ŷ, then we can write a batch computation of the predictions for all data in X as

$$\hat{\mathbf{y}} = X\mathbf{w}^T.$$
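A minimal sketch of the batch prediction ŷ = Xw^T, with made-up data; prepending the column of ones implements the homogeneous coordinates above.

```python
import numpy as np

X_raw = np.array([[2.0, 3.0],
                  [1.0, 0.0],
                  [4.0, 5.0]])                        # |T| = 3 examples, D = 2
X = np.hstack([np.ones((X_raw.shape[0], 1)), X_raw])  # homogeneous coordinates
w = np.array([1.0, 0.5, -1.0])                        # (w0, w1, w2)
y_hat = X @ w                                         # predictions for all rows of X
```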
SLIDE 5

Two operation modes

Any ML model has 2 operation modes:

1. learning (training, fitting), and
2. application (testing, making predictions).

The model h can be viewed as a function of 2 variables: h(x, w).

Model application: if the model is given (w is fixed), we can manipulate x to make predictions:

$$\hat{y} = h(\mathbf{x}, \mathbf{w}) = h_{\mathbf{w}}(\mathbf{x}).$$

Model learning: if the data is given (T is fixed), we can manipulate the model parameters w to fit the model to the data:

$$\mathbf{w}^* = \arg\min_{\mathbf{w}} J(\mathbf{w}, T).$$

How to train the model?
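The two modes can be sketched as two functions sharing the same model; this sketch assumes the mean squared error cost J introduced on the next slide, and the names are our own.

```python
import numpy as np

def predict(X, w):
    """Application mode: the parameters w are fixed, the inputs X vary."""
    return X @ w

def J(w, X, y):
    """Cost whose minimization over w (with X, y fixed) is the learning mode."""
    return np.mean((y - predict(X, w)) ** 2)
```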

SLIDE 6

Simple (univariate) linear regression

Simple (univariate) regression deals with cases where the feature vector x^(i) reduces to a single scalar x^(i), i.e. the examples are described by a single feature (they are 1-dimensional).

Fitting a line to data:

■ find the parameters w_0, w_1 of a linear model ŷ = w_0 + w_1 x,
■ given a training (multi)set T = {(x^(i), y^(i))}, i = 1, ..., |T|.

How to fit, depending on the number of training examples:

■ Given a single example (1 equation, 2 parameters) ⇒ infinitely many linear functions can be fitted.
■ Given 2 examples (2 equations, 2 parameters) ⇒ exactly 1 linear function can be fitted (see the sketch after this list).
■ Given 3 or more examples (> 2 equations, 2 parameters) ⇒ no line can be fitted without error ⇒ a line which minimizes the "size" of the error y − ŷ can be fitted:

$$\mathbf{w}^* = (w_0^*, w_1^*) = \arg\min_{w_0, w_1} J(w_0, w_1, T).$$
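For the exactly determined case with 2 examples, the line follows from solving a 2×2 linear system; a sketch with made-up data:

```python
import numpy as np

# Two examples determine the line exactly: solve w0 + w1*x^(i) = y^(i).
x = np.array([1.0, 3.0])
y = np.array([2.0, 8.0])
A = np.stack([np.ones_like(x), x], axis=1)  # rows: (1, x^(i))
w0, w1 = np.linalg.solve(A, y)              # here w0 = -1, w1 = 3
```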

SLIDE 7

The least squares method

The least squares method (LSM) chooses the parameters w that minimize the mean squared error

$$J(\mathbf{w}) = \frac{1}{|T|} \sum_{i=1}^{|T|} \left( y^{(i)} - \hat{y}^{(i)} \right)^2 = \frac{1}{|T|} \sum_{i=1}^{|T|} \left( y^{(i)} - h_{\mathbf{w}}(\mathbf{x}^{(i)}) \right)^2.$$

[Figure: a line ŷ = w_0 + w_1 x with intercept w_0 and slope w_1 fitted to three points (x^(i), y^(i)); the vertical distances |y^(i) − ŷ^(i)| are the errors being minimized.]

Explicit solution:

$$w_1 = \frac{\sum_{i=1}^{|T|} (x^{(i)} - \bar{x})(y^{(i)} - \bar{y})}{\sum_{i=1}^{|T|} (x^{(i)} - \bar{x})^2} = \frac{s_{xy}}{s_x^2}, \qquad w_0 = \bar{y} - w_1 \bar{x}.$$
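A direct numpy sketch of the explicit solution above; the data values are made up for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])
x_bar, y_bar = x.mean(), y.mean()

# w1 = s_xy / s_x^2, w0 = y_bar - w1 * x_bar
w1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
w0 = y_bar - w1 * x_bar
```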

SLIDE 8

Universal fitting method: minimization of the cost function J

[Figure: the landscape of J in the space of w_0 and w_1, shown as a surface plot of J(w_0, w_1) and as a contour plot over the (w_0, w_1) plane.]

[Figure: gradually better linear models found by an optimization method (BFGS): a sequence of four fitted lines on data with "disp" on the horizontal axis and "hp" on the vertical axis.]
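A sketch of this universal route using scipy's BFGS implementation; the data here are invented stand-ins for the disp/hp values shown in the figure.

```python
import numpy as np
from scipy.optimize import minimize

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])

def J(w):
    """Mean squared error of the line w[0] + w[1]*x on the data."""
    return np.mean((y - (w[0] + w[1] * x)) ** 2)

result = minimize(J, x0=np.zeros(2), method="BFGS")  # iteratively improves w
w0_opt, w1_opt = result.x
```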

SLIDE 9

Multivariate linear regression

Multivariate linear regression deals with cases where x^(i) = (x_1^(i), ..., x_D^(i)), i.e. the examples are described by more than 1 feature (they are D-dimensional).

Model fitting:

■ find the parameters w = (w_0, w_1, ..., w_D) of a linear model ŷ = xw^T (in homogeneous coordinates),
■ given the training (multi)set T = {(x^(i), y^(i))}, i = 1, ..., |T|.
■ The model is a hyperplane in the (D + 1)-dimensional space.

Fitting methods (a code sketch of the second follows below):

1. Numeric optimization of J(w, T):
■ Works as for simple regression; it only searches a space with more dimensions.
■ Sometimes one needs to tune some parameters of the optimization algorithm (the learning rate in gradient descent, etc.) for it to work properly.
■ May be slow (many iterations needed), but works even for very large D.

2. Normal equation:

$$\mathbf{w}^* = (X^T X)^{-1} X^T \mathbf{y}$$

■ A method to solve for the optimal w* analytically!
■ No need to choose optimization algorithm parameters.
■ No iterations.
■ Needs to compute (X^T X)^(-1), which is O(D^3): slow, or intractable, for large D.