
CS480/680 Lecture 11: June 12, 2019 Kernel methods


  1. CS480/680 Lecture 11: June 12, 2019
     Kernel methods
     [D] Chap. 11; [B] Sec. 6.1, 6.2; [M] Sec. 14.1, 14.2; [HTF] Chap. 6
     University of Waterloo, CS480/680 Spring 2019, Pascal Poupart

  2. Non-linear Models Recap
     • Generalized linear models:
     • Neural networks:

  3. Kernel Methods
     • Idea: use a large (possibly infinite) set of fixed non-linear basis functions
     • Normally, complexity depends on the number of basis functions, but by a “dual trick”, complexity depends on the amount of data
     • Examples:
       – Gaussian Processes (next class)
       – Support Vector Machines (next week)
       – Kernel Perceptron
       – Kernel Principal Component Analysis

  4. Kernel Function
     • Let $\phi(x)$ be a set of basis functions that map inputs $x$ to a feature space.
     • In many algorithms, this feature space only appears in the dot product $\phi(x)^T \phi(x')$ of input pairs $x, x'$.
     • Define the kernel function $k(x, x') = \phi(x)^T \phi(x')$ to be the dot product of any pair $x, x'$ in feature space.
       – We only need to know $k(x, x')$, not $\phi(x)$

  5. Dual Representations
     • Recall the linear regression objective:
       $E(w) = \frac{1}{2} \sum_{n=1}^{N} (w^T \phi(x_n) - y_n)^2 + \frac{\lambda}{2} w^T w$
     • Solution: set the gradient to 0
       $\nabla E(w) = \sum_{n} (w^T \phi(x_n) - y_n) \phi(x_n) + \lambda w = 0$
       $w = -\frac{1}{\lambda} \sum_{n} (w^T \phi(x_n) - y_n) \phi(x_n)$
     • $\therefore w$ is a linear combination of the inputs in feature space $\{\phi(x_n) \mid 1 \le n \le N\}$
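
The claim that $w$ is a linear combination of the $\phi(x_n)$ can be checked numerically. The sketch below is illustrative only: the random data, the sizes `N` and `D`, and the value of `lam` are assumptions, and the matrix `Phi` stores $\phi(x_n)$ as its n-th column.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, lam = 8, 3, 0.5              # assumed sizes and regularization weight
Phi = rng.normal(size=(D, N))      # column n plays the role of phi(x_n)
y = rng.normal(size=N)

# Setting the gradient to zero gives (Phi Phi^T + lam I) w = Phi y
w = np.linalg.solve(Phi @ Phi.T + lam * np.eye(D), Phi @ y)

# Coefficients a_n = -(1/lam) * (w^T phi(x_n) - y_n)
a = -(Phi.T @ w - y) / lam

# w should equal the linear combination sum_n a_n phi(x_n) = Phi a
w_from_a = Phi @ a
print(np.allclose(w, w_from_a))
```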

  6. Dual Representations
     • Substitute $w = \Phi a$
     • Where $\Phi = [\phi(x_1) \; \phi(x_2) \; \dots \; \phi(x_N)]$, $a = [a_1, a_2, \dots, a_N]^T$ and $a_n = -\frac{1}{\lambda} (w^T \phi(x_n) - y_n)$
     • Dual objective: minimize $E$ with respect to $a$
       $E(a) = \frac{1}{2} a^T \Phi^T \Phi \Phi^T \Phi a - a^T \Phi^T \Phi y + \frac{1}{2} y^T y + \frac{\lambda}{2} a^T \Phi^T \Phi a$
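
The substitution step can be written out as follows. This is a sketch of the intermediate algebra, assuming $y = [y_1, \dots, y_N]^T$ collects the training targets and using the fact that $\Phi^T \Phi$ is symmetric.

```latex
\begin{align*}
w^T \phi(x_n) &= a^T \Phi^T \phi(x_n)
  \quad\Rightarrow\quad
  \sum_{n=1}^{N} \bigl(w^T \phi(x_n) - y_n\bigr)^2 = \lVert \Phi^T \Phi a - y \rVert^2 \\
E(a) &= \tfrac{1}{2} (\Phi^T \Phi a - y)^T (\Phi^T \Phi a - y)
        + \tfrac{\lambda}{2} a^T \Phi^T \Phi a \\
     &= \tfrac{1}{2} a^T \Phi^T \Phi \Phi^T \Phi a - a^T \Phi^T \Phi y
        + \tfrac{1}{2} y^T y + \tfrac{\lambda}{2} a^T \Phi^T \Phi a
\end{align*}
```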

  7. Gram Matrix
     • Let $K = \Phi^T \Phi$ be the Gram matrix
     • Substitute in the objective:
       $E(a) = \frac{1}{2} a^T K K a - a^T K y + \frac{1}{2} y^T y + \frac{\lambda}{2} a^T K a$
     • Solution: set the gradient to 0
       $\nabla E(a) = K K a - K y + \lambda K a = 0$
       $K (K + \lambda I) a = K y$
       $a = (K + \lambda I)^{-1} y$
     • Prediction:
       $y_* = \phi(x_*)^T w = \phi(x_*)^T \Phi a = k(x_*, X) (K + \lambda I)^{-1} y$
       where $(X, y)$ is the training set and $(x_*, y_*)$ is a test instance
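
A minimal sketch of the dual solution and prediction, compared against the primal solution. The linear kernel (so that $\phi(x) = x$), the random data and the helper names `linear_kernel` and `fit_dual` are assumptions made for illustration.

```python
import numpy as np

def linear_kernel(A, B):
    """k(x, x') = x^T x' for every pair of rows of A and B."""
    return A @ B.T

def fit_dual(K, y, lam):
    """Solve (K + lam I) a = y for the dual coefficients a."""
    return np.linalg.solve(K + lam * np.eye(len(y)), y)

rng = np.random.default_rng(1)
X = rng.normal(size=(10, 3))       # training inputs, one row per instance
y = rng.normal(size=10)
X_test = rng.normal(size=(4, 3))
lam = 0.1

K = linear_kernel(X, X)            # Gram matrix
a = fit_dual(K, y, lam)
y_dual = linear_kernel(X_test, X) @ a   # k(x_*, X) (K + lam I)^{-1} y

# Primal regularized least squares with phi(x) = x, for comparison
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
y_primal = X_test @ w
print(np.allclose(y_dual, y_primal))
```

The dual solve works on an N x N system while the primal solve works on a D x D system, which is where the difference in complexity comes from.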

  8. Dual Linear Regression
     • Prediction: $y_* = \phi(x_*)^T \Phi a = k(x_*, X) (K + \lambda I)^{-1} y$
     • Linear regression where we find the dual solution $a$ instead of the primal solution $w$.
     • Complexity:
       – Primal solution: depends on the number of basis functions
       – Dual solution: depends on the amount of data
     • Advantage: can use a very large number of basis functions
     • Just need to know the kernel $k$

  9. Constructing Kernels
     • Two possibilities:
       – Find a mapping $\phi$ to feature space and let $K = \phi^T \phi$
       – Directly specify $K$
     • Can any function that takes two arguments serve as a kernel?
     • No, a valid kernel must be positive semi-definite
       – In other words, $k$ must factor into the product of a transposed matrix by itself (e.g., $K = \phi^T \phi$)
       – Or, all eigenvalues of the Gram matrix $K$ must be greater than or equal to 0.
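
The eigenvalue condition can be probed numerically on a sample of points. In the sketch below, the points in `X`, the helper names, and the choice of negative squared distance as a counterexample are assumptions. A non-negative spectrum on one sample does not prove that a function is a valid kernel, but a negative eigenvalue does rule it out.

```python
import numpy as np

def gram(kernel, X):
    """Gram matrix K[i, j] = kernel(X[i], X[j])."""
    return np.array([[kernel(x, z) for z in X] for x in X])

def min_eigenvalue(K):
    """Smallest eigenvalue of a symmetric matrix."""
    return np.linalg.eigvalsh(K).min()

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])

def linear(x, z):
    return float(x @ z)

def neg_sq_dist(x, z):
    # Symmetric two-argument function that is NOT a valid kernel
    return -float(np.sum((x - z) ** 2))

print(min_eigenvalue(gram(linear, X)))       # about 0, up to round-off
print(min_eigenvalue(gram(neg_sq_dist, X)))  # negative
```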

  10. Example
      • Let $k(x, z) = (x^T z)^2$
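
For two-dimensional inputs, the squared dot product expands to $x_1^2 z_1^2 + 2 x_1 x_2 z_1 z_2 + x_2^2 z_2^2$, which is a dot product of the features $(x_1^2, \sqrt{2} x_1 x_2, x_2^2)$. The sketch below checks this, assuming the exponent is 2 and the inputs are 2-D. The function names and test points are illustrative.

```python
import math

def kernel(x, z):
    """k(x, z) = (x^T z)^2 for 2-D inputs."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def phi(x):
    """Explicit feature map whose dot product reproduces the kernel."""
    return (x[0] ** 2, math.sqrt(2.0) * x[0] * x[1], x[1] ** 2)

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

x, z = (1.0, 2.0), (3.0, -1.0)
print(kernel(x, z), dot(phi(x), phi(z)))
```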

  11. Constructing Kernels
      • Can we construct $k$ directly without knowing $\phi$?
      • Yes, any positive semi-definite $k$ is fine since there is a corresponding implicit feature space. But positive semi-definiteness is not always easy to verify.
      • Alternatively, construct kernels from other kernels using rules that preserve positive semi-definiteness

  12. Rules to construct Kernels
      • Let $k_1(x, x')$ and $k_2(x, x')$ be valid kernels
      • The following kernels are also valid:
        1. $k(x, x') = c \, k_1(x, x')$, $\forall c > 0$
        2. $k(x, x') = f(x) \, k_1(x, x') \, f(x')$, $\forall f$
        3. $k(x, x') = q(k_1(x, x'))$, where $q$ is a polynomial with coefficients $\ge 0$
        4. $k(x, x') = \exp(k_1(x, x'))$
        5. $k(x, x') = k_1(x, x') + k_2(x, x')$
        6. $k(x, x') = k_1(x, x') \, k_2(x, x')$
        7. $k(x, x') = k_3(\phi(x), \phi(x'))$
        8. $k(x, x') = x^T A x'$, where $A$ is symmetric positive semi-definite
        9. $k(x, x') = k_a(x_a, x_a') + k_b(x_b, x_b')$
        10. $k(x, x') = k_a(x_a, x_a') \, k_b(x_b, x_b')$
        where $x = (x_a, x_b)$
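
One way to see why the sum and product rules hold is through explicit feature maps: a sum of kernels corresponds to concatenating features, and a product corresponds to taking all pairwise products of features. The sketch below illustrates this for two kernels with known finite feature maps. The particular maps `phi1` and `phi2` and the test points are assumptions.

```python
import numpy as np

def phi1(x):
    """Linear features, so k1(x, z) = x^T z."""
    return np.asarray(x, dtype=float)

def phi2(x):
    """Quadratic features for 2-D inputs, so k2(x, z) = (x^T z)^2."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2.0) * x1 * x2, x2 ** 2])

def k1(x, z):
    return float(phi1(x) @ phi1(z))

def k2(x, z):
    return float(phi2(x) @ phi2(z))

def phi_sum(x):
    """Feature map for k1 + k2: concatenate the two feature vectors."""
    return np.concatenate([phi1(x), phi2(x)])

def phi_prod(x):
    """Feature map for k1 * k2: all pairwise products of features."""
    return np.kron(phi1(x), phi2(x))

x, z = (1.0, 2.0), (3.0, -1.0)
print(np.isclose(k1(x, z) + k2(x, z), phi_sum(x) @ phi_sum(z)))
print(np.isclose(k1(x, z) * k2(x, z), phi_prod(x) @ phi_prod(z)))
```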

  13. Common Kernels
      • Polynomial kernel: $k(x, x') = (x^T x')^M$
        – $M$ is the degree
        – Feature space: all degree $M$ products of entries in $x$
        – Example: Let $x$ and $x'$ be two images, then the feature space could be all products of $M$ pixel intensities
      • More general polynomial kernel: $k(x, x') = (x^T x' + c)^M$ with $c > 0$
        – Feature space: all products of up to $M$ entries in $x$
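
A small sketch of the more general polynomial kernel for $M = 2$ and 2-D inputs, together with a count of how many distinct monomials of degree at most $M$ exist in $d$ variables. The feature map, the default `c = 1.0`, the 784-pixel image size and the function names are assumptions for illustration.

```python
import math

def poly_kernel(x, z, c=1.0, M=2):
    """k(x, z) = (x^T z + c)^M."""
    return (sum(xi * zi for xi, zi in zip(x, z)) + c) ** M

def phi_quadratic(x, c=1.0):
    """Explicit features for M = 2 and 2-D inputs (products of up to 2 entries)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2,
            math.sqrt(2.0 * c) * x1, math.sqrt(2.0 * c) * x2, c)

def num_monomials(d, M):
    """Number of distinct monomials of degree at most M in d variables."""
    return math.comb(d + M, M)

x, z = (1.0, 2.0), (3.0, -1.0)
explicit = sum(u * v for u, v in zip(phi_quadratic(x), phi_quadratic(z)))
print(poly_kernel(x, z), explicit)

# Feature count for 2-D inputs, and for a hypothetical 784-pixel image with M = 3
print(num_monomials(2, 2), num_monomials(784, 3))
```

Evaluating the kernel costs one dot product regardless of $M$, whereas the explicit feature vector grows quickly with the input dimension and the degree.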

  14. Common Kernels
      • Gaussian kernel: $k(x, x') = \exp\left(-\frac{\lVert x - x' \rVert^2}{2\sigma^2}\right)$
      • Valid kernel: it can be constructed from the linear kernel $x^T x'$ with the rules for constructing kernels
      • Implicit feature space is infinite!
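
One standard argument for validity expands $\lVert x - x' \rVert^2 = x^T x - 2 x^T x' + x'^T x'$, so the Gaussian kernel factors as $f(x) \exp(x^T x' / \sigma^2) f(x')$ with $f(v) = \exp(-v^T v / 2\sigma^2)$. That is the linear kernel passed through the scaling, exponential and $f(x) k_1 f(x')$ rules. The sketch below checks the factorization numerically. The function names, random points and `sigma` are assumptions.

```python
import numpy as np

def gaussian_kernel(x, z, sigma):
    """k(x, z) = exp(-||x - z||^2 / (2 sigma^2))."""
    return np.exp(-np.sum((x - z) ** 2) / (2.0 * sigma ** 2))

def factored_kernel(x, z, sigma):
    """Same kernel written as f(x) * exp(x^T z / sigma^2) * f(z)."""
    def f(v):
        return np.exp(-(v @ v) / (2.0 * sigma ** 2))
    return f(x) * np.exp((x @ z) / sigma ** 2) * f(z)

rng = np.random.default_rng(2)
X = rng.normal(size=(6, 3))
sigma = 1.5

K_direct = np.array([[gaussian_kernel(x, z, sigma) for z in X] for x in X])
K_factored = np.array([[factored_kernel(x, z, sigma) for z in X] for x in X])

print(np.allclose(K_direct, K_factored))
print(np.linalg.eigvalsh(K_direct).min() >= -1e-10)
```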

  15. Non-vectorial Kernels
      • Kernels can also be defined over objects other than vectors, such as sets, strings or graphs
      • Example for strings: $k(d_1, d_2)$ = similarity between two documents (weighted sum of all non-contiguous strings that appear in both documents $d_1$ and $d_2$)
      • Lodhi, Saunders, Shawe-Taylor, Cristianini, Watkins, Text Classification Using String Kernels, JMLR, p. 419-444, 2002.
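
As a much simpler stand-in for the subsequence kernel of Lodhi et al., the sketch below implements a p-spectrum kernel, which counts shared contiguous substrings of length `p`. It is not the kernel from the cited paper (no gaps, no decay weights). It only illustrates that a kernel on strings is a dot product between count vectors. The function names and the default `p = 3` are assumptions.

```python
from collections import Counter

def spectrum_features(text, p):
    """Counts of every contiguous substring of length p in text."""
    return Counter(text[i:i + p] for i in range(len(text) - p + 1))

def spectrum_kernel(s, t, p=3):
    """Dot product of the two substring-count vectors."""
    fs, ft = spectrum_features(s, p), spectrum_features(t, p)
    return sum(count * ft[sub] for sub, count in fs.items())

print(spectrum_kernel("kernel methods", "kernel machines"))
print(spectrum_kernel("kernel methods", "gaussian process"))
```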
