CS480/680 Machine Learning Lecture 3 (May 13, 2019): Linear Regression
SLIDE 1

CS480/680 Machine Learning Lecture 3: May 13, 2019

Linear Regression [RN] Sec. 18.6.1, [HTF] Sec. 2.3.1, [D] Sec. 7.6, [B] Sec. 3.1, [M] Sec. 1.4.5

CS480/680 Spring 2019 Pascal Poupart 1 University of Waterloo

SLIDE 2

Linear model for regression

  • Simple form of regression
  • Picture:


SLIDE 3

Problem

  • Data: $\{(\mathbf{x}_1, y_1), (\mathbf{x}_2, y_2), \ldots, (\mathbf{x}_N, y_N)\}$

– $\mathbf{x} = \langle x_1, x_2, \ldots, x_d \rangle$: input vector
– $y$: target (continuous value)

  • Problem: find a hypothesis $h$ that maps $\mathbf{x}$ to $y$

– Assume that $h$ is linear: $h(\mathbf{x}, \mathbf{w}) = w_0 + w_1 x_1 + \cdots + w_d x_d = \mathbf{w}^T \bar{\mathbf{x}}$

  • Objective: minimize some loss function

– Euclidean loss: $E_2(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \left( h(\mathbf{x}_n, \mathbf{w}) - y_n \right)^2$
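The Euclidean loss can be sketched numerically; the toy inputs, targets, and candidate weights below are hypothetical choices, not data from the lecture:

```python
import numpy as np

# Sketch of the Euclidean loss E(w) = 1/2 * sum_n (h(x_n, w) - y_n)^2.
# Each row of Xbar is xbar_n = (1, x_n), so w[0] plays the role of the bias w_0.
Xbar = np.array([[1.0, 0.5],
                 [1.0, 1.5],
                 [1.0, 2.0]])
y = np.array([1.0, 2.0, 2.5])   # targets y_n
w = np.array([0.0, 1.0])        # candidate weights (w_0, w_1)

def euclidean_loss(w, Xbar, y):
    residuals = Xbar @ w - y    # h(x_n, w) - y_n for all n
    return 0.5 * np.sum(residuals ** 2)

print(euclidean_loss(w, Xbar, y))  # each residual is -0.5, so the loss is 0.375
```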


SLIDE 4

Optimization

  • Find the best $\mathbf{w}$ that minimizes the Euclidean loss:

$\mathbf{w}^* = \operatorname{argmin}_{\mathbf{w}} \; \frac{1}{2} \sum_{n=1}^{N} \left( y_n - \mathbf{w}^T \bar{\mathbf{x}}_n \right)^2$

  • Convex optimization problem

⟹ unique optimum (global)
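Because the loss is convex, an iterative method such as plain gradient descent reaches the same unique global optimum regardless of the starting point. A minimal sketch, with hypothetical data generated by $y = 1 + x$ and a hand-picked step size:

```python
import numpy as np

# Gradient descent on the Euclidean loss; the data, step size, and
# iteration count below are hypothetical choices.
Xbar = np.array([[1.0, 0.0],
                 [1.0, 1.0],
                 [1.0, 2.0]])
y = np.array([1.0, 2.0, 3.0])   # generated by y = 1 + x

w = np.zeros(2)
eta = 0.1                        # step size
for _ in range(2000):
    grad = Xbar.T @ (Xbar @ w - y)   # gradient of 1/2 * sum of squared errors
    w -= eta * grad

print(w)  # converges to the unique global optimum, w ≈ (1, 1)
```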


SLIDE 5

Solution

  • Let $\bar{\mathbf{x}} = \binom{1}{\mathbf{x}}$, then minimize $\frac{1}{2} \sum_{n=1}^{N} \left( y_n - \mathbf{w}^T \bar{\mathbf{x}}_n \right)^2$

  • Find $\mathbf{w}^*$ by setting the derivative to 0:

$\frac{\partial E}{\partial w_j} = -\sum_{n=1}^{N} \left( y_n - \mathbf{w}^T \bar{\mathbf{x}}_n \right) \bar{x}_{nj} = 0 \;\; \forall j \;\Longrightarrow\; \sum_{n=1}^{N} \left( y_n - \mathbf{w}^T \bar{\mathbf{x}}_n \right) \bar{\mathbf{x}}_n = \mathbf{0}$

  • This is a linear system in $\mathbf{w}$, so we rewrite it as $A \mathbf{w} = \mathbf{b}$,
where $A = \sum_{n=1}^{N} \bar{\mathbf{x}}_n \bar{\mathbf{x}}_n^T$ and $\mathbf{b} = \sum_{n=1}^{N} y_n \bar{\mathbf{x}}_n$


SLIDE 6

Solution

  • If the training instances span $\mathbb{R}^{d+1}$, then $A$ is invertible:

$\mathbf{w} = A^{-1} \mathbf{b}$

  • In practice it is faster to solve the linear system $A \mathbf{w} = \mathbf{b}$ directly instead of inverting $A$:

– Gaussian elimination
– Conjugate gradient
– Iterative methods
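Solving the system directly can be sketched as follows; the toy data is hypothetical, and `np.linalg.solve` (LU factorization, i.e. Gaussian elimination) stands in for the direct solvers listed above:

```python
import numpy as np

# Form A = sum_n xbar_n xbar_n^T and b = sum_n y_n xbar_n from toy data,
# then solve A w = b directly rather than computing A^{-1}.
Xbar = np.array([[1.0, 0.0],
                 [1.0, 1.0],
                 [1.0, 2.0]])
y = np.array([1.0, 2.0, 3.0])   # generated by y = 1 + x

A = Xbar.T @ Xbar          # equals sum_n xbar_n xbar_n^T
b = Xbar.T @ y             # equals sum_n y_n xbar_n

w = np.linalg.solve(A, b)  # preferred over np.linalg.inv(A) @ b
print(w)                   # the data fits y = 1 + x, so w ≈ (1, 1)
```

`np.linalg.solve` is both faster and numerically safer than forming the explicit inverse, which is the point the slide makes.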


SLIDE 7

Picture


SLIDE 8

Regularization

  • The least-squares solution may not be stable

– i.e., a slight perturbation of the input may cause a dramatic change in the output
– A form of overfitting
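This instability is easy to demonstrate: with two nearly identical training inputs, $A$ is nearly singular, and a tiny change in one target swings the fitted weights dramatically. The data values here are hypothetical:

```python
import numpy as np

# Two almost-collinear training rows make A = Xbar^T Xbar nearly singular.
Xbar = np.array([[1.0, 1.00],
                 [1.0, 1.01]])

def fit(y):
    A = Xbar.T @ Xbar
    b = Xbar.T @ y
    return np.linalg.solve(A, b)

w1 = fit(np.array([1.0, 1.00]))   # both targets 1        -> w ≈ (1, 0)
w2 = fit(np.array([1.0, 1.01]))   # second target +0.01   -> w ≈ (0, 1)
print(w1, w2)  # a 0.01 perturbation of a target moves the weights by ~1
```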


SLIDE 9

Example 1

  • Training data: two instances $(\bar{\mathbf{x}}_1, y_1)$ and $(\bar{\mathbf{x}}_2, y_2)$ (worked numerically on the slide)
  • $A$, $A^{-1}$, $\mathbf{b}$, and $\mathbf{w}$ are computed for this data


SLIDE 10

Example 2

  • Training data: two instances $(\bar{\mathbf{x}}_1, y_1)$ and $(\bar{\mathbf{x}}_2, y_2)$ (worked numerically on the slide)
  • $A$, $A^{-1}$, $\mathbf{b}$, and $\mathbf{w}$ are computed for this data


SLIDE 11

Picture


SLIDE 12

Regularization

  • Idea: favor smaller values
  • Tikhonov regularization: add $\mathbf{w}^T \mathbf{w}$ as a penalty term
  • Ridge regression:

$\mathbf{w}^* = \operatorname{argmin}_{\mathbf{w}} \; \frac{1}{2} \sum_{n=1}^{N} \left( y_n - \mathbf{w}^T \bar{\mathbf{x}}_n \right)^2 + \frac{\lambda}{2} \mathbf{w}^T \mathbf{w}$

where $\lambda$ is a weight to adjust the importance of the penalty


SLIDE 13

Regularization

  • Solution: $(A + \lambda I)\, \mathbf{w} = \mathbf{b}$
  • Notes:

– Without regularization: the eigenvalues of the linear system may be arbitrarily close to 0, so the eigenvalues of the inverse may be arbitrarily large.
– With Tikhonov regularization, the eigenvalues of the linear system are $\ge \lambda$ and therefore bounded away from 0. Similarly, the eigenvalues of the inverse are bounded above by $1/\lambda$.
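The regularized system can be sketched numerically; $\lambda$ and the toy data below are hypothetical choices. Since $A = \bar{X}^T\bar{X}$ is symmetric positive semidefinite, adding $\lambda I$ shifts every eigenvalue up by $\lambda$:

```python
import numpy as np

# Ridge regression: solve (A + lam*I) w = b instead of A w = b.
Xbar = np.array([[1.0, 1.00],
                 [1.0, 1.01]])   # nearly collinear -> plain least squares unstable
y = np.array([1.0, 1.0])
lam = 0.1                        # hypothetical regularization weight

A = Xbar.T @ Xbar
b = Xbar.T @ y

w_ridge = np.linalg.solve(A + lam * np.eye(2), b)

# Eigenvalues of A + lam*I are those of A shifted up by lam,
# hence bounded below by lam (A is symmetric PSD).
eigs = np.linalg.eigvalsh(A + lam * np.eye(2))
print(w_ridge, eigs.min())
```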


SLIDE 14

Regularized Examples

Example 1 Example 2
