CS480/680 Machine Learning Lecture 3 (May 13, 2019): Linear Regression
SLIDE 1

CS480/680 Machine Learning Lecture 3: May 13, 2019

Linear Regression [RN] Sec. 18.6.1, [HTF] Sec. 2.3.1, [D] Sec. 7.6, [B] Sec. 3.1, [M] Sec. 1.4.5

CS480/680 Spring 2019 Pascal Poupart 1 University of Waterloo

SLIDE 2

Linear model for regression

  • Simple form of regression
  • Picture:


SLIDE 3

Problem

  • Data: $\{(\mathbf{x}_1, y_1), (\mathbf{x}_2, y_2), \ldots, (\mathbf{x}_N, y_N)\}$

– $\mathbf{x} = \langle x_1, x_2, \ldots, x_d \rangle$: input vector
– $y$: target (continuous value)

  • Problem: find a hypothesis $h$ that maps $\mathbf{x}$ to $y$

– Assume that $h$ is linear: $h(\mathbf{x}, \mathbf{w}) = w_0 + w_1 x_1 + \cdots + w_d x_d = \mathbf{w}^T \bar{\mathbf{x}}$

  • Objective: minimize some loss function

– Euclidean loss: $E_2(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \left( h(\mathbf{x}_n, \mathbf{w}) - y_n \right)^2$
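The Euclidean loss can be sketched numerically; the toy inputs, targets, and candidate weights below are hypothetical choices, not data from the lecture:

```python
import numpy as np

# Sketch of the Euclidean loss E(w) = 1/2 * sum_n (h(x_n, w) - y_n)^2.
# Each row of Xbar is xbar_n = (1, x_n), so w[0] plays the role of the bias w_0.
Xbar = np.array([[1.0, 0.5],
                 [1.0, 1.5],
                 [1.0, 2.0]])
y = np.array([1.0, 2.0, 2.5])   # targets y_n
w = np.array([0.0, 1.0])        # candidate weights (w_0, w_1)

def euclidean_loss(w, Xbar, y):
    residuals = Xbar @ w - y    # h(x_n, w) - y_n for all n
    return 0.5 * np.sum(residuals ** 2)

print(euclidean_loss(w, Xbar, y))  # each residual is -0.5, so the loss is 0.375
```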


SLIDE 4

Optimization

  • Find the best $\mathbf{w}$ that minimizes the Euclidean loss:

$\mathbf{w}^* = \operatorname{argmin}_{\mathbf{w}} \; \frac{1}{2} \sum_{n=1}^{N} \left( y_n - \mathbf{w}^T \bar{\mathbf{x}}_n \right)^2$

  • Convex optimization problem

⟹ unique optimum (global)
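Because the loss is convex, an iterative method such as plain gradient descent reaches the same unique global optimum regardless of the starting point. A minimal sketch, with hypothetical data generated by $y = 1 + x$ and a hand-picked step size:

```python
import numpy as np

# Gradient descent on the Euclidean loss; the data, step size, and
# iteration count below are hypothetical choices.
Xbar = np.array([[1.0, 0.0],
                 [1.0, 1.0],
                 [1.0, 2.0]])
y = np.array([1.0, 2.0, 3.0])   # generated by y = 1 + x

w = np.zeros(2)
eta = 0.1                        # step size
for _ in range(2000):
    grad = Xbar.T @ (Xbar @ w - y)   # gradient of 1/2 * sum of squared errors
    w -= eta * grad

print(w)  # converges to the unique global optimum, w ≈ (1, 1)
```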


SLIDE 5

Solution

  • Let $\bar{\mathbf{x}} = \binom{1}{\mathbf{x}}$, then minimize $\frac{1}{2} \sum_{n=1}^{N} \left( y_n - \mathbf{w}^T \bar{\mathbf{x}}_n \right)^2$

  • Find $\mathbf{w}^*$ by setting the derivative to 0:

$\frac{\partial E}{\partial w_j} = -\sum_{n=1}^{N} \left( y_n - \mathbf{w}^T \bar{\mathbf{x}}_n \right) \bar{x}_{nj} = 0 \;\; \forall j \;\Longrightarrow\; \sum_{n=1}^{N} \left( y_n - \mathbf{w}^T \bar{\mathbf{x}}_n \right) \bar{\mathbf{x}}_n = \mathbf{0}$

  • This is a linear system in $\mathbf{w}$, so we rewrite it as $A \mathbf{w} = \mathbf{b}$,
where $A = \sum_{n=1}^{N} \bar{\mathbf{x}}_n \bar{\mathbf{x}}_n^T$ and $\mathbf{b} = \sum_{n=1}^{N} y_n \bar{\mathbf{x}}_n$


SLIDE 6

Solution

  • If the training instances span $\mathbb{R}^{d+1}$, then $A$ is invertible:

$\mathbf{w} = A^{-1} \mathbf{b}$

  • In practice it is faster to solve the linear system $A \mathbf{w} = \mathbf{b}$ directly instead of inverting $A$:

– Gaussian elimination
– Conjugate gradient
– Iterative methods
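Solving the system directly can be sketched as follows; the toy data is hypothetical, and `np.linalg.solve` (LU factorization, i.e. Gaussian elimination) stands in for the direct solvers listed above:

```python
import numpy as np

# Form A = sum_n xbar_n xbar_n^T and b = sum_n y_n xbar_n from toy data,
# then solve A w = b directly rather than computing A^{-1}.
Xbar = np.array([[1.0, 0.0],
                 [1.0, 1.0],
                 [1.0, 2.0]])
y = np.array([1.0, 2.0, 3.0])   # generated by y = 1 + x

A = Xbar.T @ Xbar          # equals sum_n xbar_n xbar_n^T
b = Xbar.T @ y             # equals sum_n y_n xbar_n

w = np.linalg.solve(A, b)  # preferred over np.linalg.inv(A) @ b
print(w)                   # the data fits y = 1 + x, so w ≈ (1, 1)
```

`np.linalg.solve` is both faster and numerically safer than forming the explicit inverse, which is the point the slide makes.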


SLIDE 7

Picture


SLIDE 8

Regularization

  • The least-squares solution may not be stable

– i.e., a slight perturbation of the input may cause a dramatic change in the output
– A form of overfitting
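This instability is easy to demonstrate: with two nearly identical training inputs, $A$ is nearly singular, and a tiny change in one target swings the fitted weights dramatically. The data values here are hypothetical:

```python
import numpy as np

# Two almost-collinear training rows make A = Xbar^T Xbar nearly singular.
Xbar = np.array([[1.0, 1.00],
                 [1.0, 1.01]])

def fit(y):
    A = Xbar.T @ Xbar
    b = Xbar.T @ y
    return np.linalg.solve(A, b)

w1 = fit(np.array([1.0, 1.00]))   # both targets 1        -> w ≈ (1, 0)
w2 = fit(np.array([1.0, 1.01]))   # second target +0.01   -> w ≈ (0, 1)
print(w1, w2)  # a 0.01 perturbation of a target moves the weights by ~1
```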


SLIDE 9

Example 1

  • Training data: two instances $(\bar{\mathbf{x}}_1, y_1)$ and $(\bar{\mathbf{x}}_2, y_2)$ (worked numerically on the slide)
  • $A$, $A^{-1}$, $\mathbf{b}$, and $\mathbf{w}$ are computed for this data


SLIDE 10

Example 2

  • Training data: two instances $(\bar{\mathbf{x}}_1, y_1)$ and $(\bar{\mathbf{x}}_2, y_2)$ (worked numerically on the slide)
  • $A$, $A^{-1}$, $\mathbf{b}$, and $\mathbf{w}$ are computed for this data


SLIDE 11

Picture


SLIDE 12

Regularization

  • Idea: favor smaller values
  • Tikhonov regularization: add $\mathbf{w}^T \mathbf{w}$ as a penalty term
  • Ridge regression:

$\mathbf{w}^* = \operatorname{argmin}_{\mathbf{w}} \; \frac{1}{2} \sum_{n=1}^{N} \left( y_n - \mathbf{w}^T \bar{\mathbf{x}}_n \right)^2 + \frac{\lambda}{2} \mathbf{w}^T \mathbf{w}$

where $\lambda$ is a weight to adjust the importance of the penalty


SLIDE 13

Regularization

  • Solution: $(A + \lambda I)\, \mathbf{w} = \mathbf{b}$
  • Notes:

– Without regularization: the eigenvalues of the linear system may be arbitrarily close to 0, so the eigenvalues of the inverse may be arbitrarily large.
– With Tikhonov regularization, the eigenvalues of the linear system are $\ge \lambda$ and therefore bounded away from 0. Similarly, the eigenvalues of the inverse are bounded above by $1/\lambda$.
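The regularized system can be sketched numerically; $\lambda$ and the toy data below are hypothetical choices. Since $A = \bar{X}^T\bar{X}$ is symmetric positive semidefinite, adding $\lambda I$ shifts every eigenvalue up by $\lambda$:

```python
import numpy as np

# Ridge regression: solve (A + lam*I) w = b instead of A w = b.
Xbar = np.array([[1.0, 1.00],
                 [1.0, 1.01]])   # nearly collinear -> plain least squares unstable
y = np.array([1.0, 1.0])
lam = 0.1                        # hypothetical regularization weight

A = Xbar.T @ Xbar
b = Xbar.T @ y

w_ridge = np.linalg.solve(A + lam * np.eye(2), b)

# Eigenvalues of A + lam*I are those of A shifted up by lam,
# hence bounded below by lam (A is symmetric PSD).
eigs = np.linalg.eigvalsh(A + lam * np.eye(2))
print(w_ridge, eigs.min())
```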


SLIDE 14

Regularized Examples

Example 1 Example 2
