

  1. Linear Regression. David M. Blei, COS424, Princeton University. April 4, 2012

  2. Regression • We have studied classification, the problem of automatically categorizing data into a set of discrete classes. • E.g., based on its words, is an email spam or ham? • Regression is the problem of predicting a real-valued variable from input data.

  3. Linear regression. [Scatterplot of response vs. input.] Data are a set of inputs and outputs, D = { (x_n, y_n) } for n = 1, ..., N.

  4. Linear regression. [Same scatterplot of response vs. input.] The goal is to predict y from x using a linear function.

  5. Examples. [Scatterplot of response vs. input.] • Given today’s weather, how much will it rain tomorrow? • Given today’s market, what will be the price of a stock tomorrow? • Given her emails, how long will a user stay on a page? • Others?

  6. Linear regression. [Scatterplot with a fitted line f(x) = β_0 + β x through the data points (x_n, y_n).]

  7. Multiple inputs • Usually, we have a vector of inputs, each representing a different feature of the data that might be predictive of the response: x = 〈 x_1, x_2, ..., x_p 〉. • The response is assumed to be a linear function of the input: f(x) = β_0 + Σ_{i=1}^p x_i β_i. • Here, β⊤x = 0 is a hyperplane.
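The linear function on slide 7 can be sketched in a few lines of NumPy. This is an illustrative example, not code from the lecture; the feature values, coefficients, and intercept below are made up.

```python
import numpy as np

def f(x, beta0, beta):
    """Linear function of a p-dimensional input: f(x) = beta_0 + sum_i x_i * beta_i."""
    return beta0 + np.dot(x, beta)

x = np.array([1.0, 2.0, 3.0])       # hypothetical input features
beta = np.array([0.5, -1.0, 2.0])   # hypothetical coefficients
beta0 = 0.25                        # hypothetical intercept

y = f(x, beta0, beta)
print(y)  # 0.25 + 0.5 - 2.0 + 6.0 = 4.75
```

The dot product computes the sum over the p features in one call, which is how the linear model is usually written in vector form.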

  8. Multiple inputs. [3-D scatterplot of the response Y over two inputs X_1 and X_2, with a fitted plane.]

  9. Flexibility of linear regression • This set-up is less limiting than you might imagine. • Inputs can be: • Any features of the data • Transformations of the original features, e.g., x_2 = log x_1 or x_2 = √x_1 • A basis expansion, e.g., x_2 = x_1^2 and x_3 = x_1^3 • Indicators of qualitative inputs, e.g., category • Interactions between inputs, e.g., x_1 = x_2 x_3 • Its simplicity and flexibility make linear regression one of the most important and widely used statistical prediction techniques.
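The transformed and expanded inputs on slide 9 can be built as ordinary array operations before fitting; a minimal sketch, using made-up feature vectors x1 and x2:

```python
import numpy as np

x1 = np.array([1.0, 2.0, 4.0])   # hypothetical original feature
x2 = np.array([0.5, 1.0, 1.5])   # a second hypothetical feature

x_log = np.log(x1)               # transformation: log x1
x_sqrt = np.sqrt(x1)             # transformation: sqrt(x1)

# Basis expansion: columns x1, x1^2, x1^3 for polynomial regression.
basis = np.column_stack([x1, x1**2, x1**3])

# Interaction: a new feature that is the product of two others.
interaction = x1 * x2
```

Each derived column is then just another input to the same linear model, which is why these expansions do not change the fitting machinery at all.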

  10. Polynomial regression example. [Scatterplot of response vs. input showing a nonlinear trend.]

  11. Linear regression. [Same scatterplot with a fitted line.] f(x) = β_0 + β x

  12. Polynomial regression. [Same scatterplot with a fitted cubic curve.] f(x) = β_0 + β_1 x + β_2 x^2 + β_3 x^3

  13. Fitting a regression • Given data D = { (x_n, y_n) } for n = 1, ..., N, find the coefficient β that can predict y_new from x_new. [Scatterplot of y vs. x.] • Simplifications: 0-intercept, i.e., β_0 = 0; one input, i.e., p = 1. • How should we proceed?

  14. Residual sum of squares. [Scatterplot of y vs. x with a fitted line; the vertical gap at each point is the residual |y_n − β x_n|.] A reasonable approach is to minimize the sum of squared distances between each prediction β x_n and the truth y_n: RSS(β) = (1/2) Σ_{n=1}^N (y_n − β x_n)^2
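The RSS objective on slide 14 is straightforward to compute directly. A small sketch in the slide's simplified setting (no intercept, one input), on synthetic data invented for illustration:

```python
import numpy as np

def rss(beta, x, y):
    """Residual sum of squares: (1/2) * sum_n (y_n - beta * x_n)^2."""
    return 0.5 * np.sum((y - beta * x) ** 2)

# Synthetic data roughly following a slope of 1.
x = np.array([-1.0, 0.0, 1.0, 2.0])
y = np.array([-0.9, 0.1, 1.2, 1.8])

print(rss(0.0, x, y))  # 2.75 -- a flat line fits poorly
print(rss(1.0, x, y))  # 0.05 -- a slope near the truth fits well
```

Evaluating RSS at a few candidate slopes already shows the bowl shape that the next slides exploit: the objective is a quadratic in β with a single minimum.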

  15. RSS for two inputs. [Same 3-D plot of Y over X_1 and X_2 with a fitted plane.]

  16. Optimizing β. The objective function is RSS(β) = (1/2) Σ_{n=1}^N (y_n − β x_n)^2. The derivative is (d/dβ) RSS(β) = − Σ_{n=1}^N (y_n − β x_n) x_n. The optimal value is β̂ = ( Σ_{n=1}^N y_n x_n ) / ( Σ_n x_n^2 ).
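The closed-form minimizer from slide 16 can be checked numerically: at β̂ the derivative of the RSS should vanish. A sketch on synthetic data with a known true slope of 2 (the data-generating choice is mine, not the lecture's):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 * x + 0.1 * rng.normal(size=100)  # true slope 2, small noise

# Closed-form least-squares slope (no intercept, one input):
beta_hat = np.sum(y * x) / np.sum(x ** 2)

# The derivative -sum_n (y_n - beta x_n) x_n should be (numerically) zero here.
grad = -np.sum((y - beta_hat * x) * x)
```

Because the objective is quadratic, this single stationary point is the global minimum; with low noise, β̂ also lands very close to the true slope of 2.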

  17. The optimal β. [Scatterplot of y vs. x.] • The optimal value is β̂ = ( Σ_{n=1}^N y_n x_n ) / ( Σ_n x_n^2 ). • Positive values pull the slope up. • Negative values pull the slope down.

  18. Prediction • After finding the optimal β, we would like to predict a new output from a new input. [Scatterplot of y vs. x with the fitted line.] • We use the point on the line at the input: ŷ_new = β̂ x_new.
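Slides 13 through 18 fit together as a short end-to-end sketch: fit β̂ with the closed form, then predict at a new input by evaluating the fitted line. The data here are noiseless and made up, so the recovered slope is exact:

```python
import numpy as np

# Noiseless synthetic data on the line y = 0.5 * x (illustrative choice).
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = 0.5 * x

# Fit: closed-form least squares (no intercept, one input).
beta_hat = np.sum(y * x) / np.sum(x ** 2)

# Predict: the point on the fitted line at a new input.
x_new = 4.0
y_hat = beta_hat * x_new
print(y_hat)  # 2.0
```

With noiseless data the closed form recovers the generating slope exactly, so the prediction at x_new = 4 is exactly 0.5 * 4 = 2.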
