Advanced Machine Learning: Non-linear Regression Techniques (SVR and extensions, GPR, Gradient Boosting)




  1. ADVANCED MACHINE LEARNING: Non-linear regression techniques (SVR and extensions, GPR, Gradient Boosting)

  2. Regression: Principle
  Map an N-dimensional input $x \in \mathbb{R}^N$ to a continuous output $y \in \mathbb{R}$. Learn a function of the type $f: \mathbb{R}^N \to \mathbb{R}$, $y = f(x)$.
  Estimate the $f$ that best predicts the set of training points $\{x^i, y^i\}$, $i = 1, \dots, M$.
  [Figure: training points $x^1, \dots, x^4$ with outputs $y^1, \dots, y^4$, the true function, and an estimate.]
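To make the setting concrete, here is a minimal sketch of such training pairs and of scoring a candidate estimate; the sine true function and the Gaussian noise are assumptions for illustration, not part of the slide:

```python
import numpy as np

# Training pairs {x^i, y^i}, i = 1..M, sampled from an assumed true function
# (sin) plus Gaussian noise; neither choice comes from the slide.
rng = np.random.default_rng(0)
M = 50
x = rng.uniform(-3, 3, size=M)              # 1-D inputs for simplicity (N = 1)
y = np.sin(x) + 0.1 * rng.normal(size=M)    # noisy outputs

def mse(f_hat, x, y):
    """Score a candidate estimate f_hat by its mean squared error on the pairs."""
    return np.mean((f_hat(x) - y) ** 2)

print(mse(np.sin, x, y))   # the true function: error ~ noise variance
print(mse(np.cos, x, y))   # a poor estimate: much larger error
```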

  3. Regression: Issues
  Map an N-dimensional input $x \in \mathbb{R}^N$ to a continuous output $y \in \mathbb{R}$. Learn a function of the type $f: \mathbb{R}^N \to \mathbb{R}$, $y = f(x)$.
  The fit is strongly influenced by the choice of:
  - datapoints for training
  - complexity of the model (interpolation)
  [Figure: the same training points fitted by the true function and an estimate.]

  4. Regression Algorithms in this Course
  - Support Vector Machine → Support vector regression
  - Relevance Vector Machine → Relevance vector regression
  - Boosting: random projections, random Gaussians, random forest, gradient boosting
  - Gaussian Process → Gaussian process regression
  Not covered in class: locally weighted projected regression.

  5. Regression Algorithms in this Course
  - Support Vector Machine → Support vector regression
  - Relevance Vector Machine → Relevance vector regression
  - Boosting: random projections

  6. Support Vector Regression

  7. Support Vector Regression
  Assume a nonlinear mapping $f$ such that $y = f(x)$. How do we estimate the $f$ that best predicts the pairs of training points $\{x^i, y^i\}$, $i = 1, \dots, M$?
  How do we generalize the support vector machine framework for classification to estimate continuous functions?
  1. Assume a non-linear mapping through feature space, then perform linear regression in feature space.
  2. Supervised learning minimizes an error function, so first determine a way to measure the error on the training set in the linear case.

  8. Support Vector Regression
  Assume a linear mapping $f$ such that $y = f(x) = \langle w, x \rangle + b$. ($b$ is estimated, as in SVM, through least-squares regression on the support vectors; hence we omit it from the rest of the development.)
  How do we estimate $w$ and $b$ to best predict the pairs of training points $\{x^i, y^i\}$, $i = 1, \dots, M$?
  Measure the error on the prediction $y = f(x) = \langle w, x \rangle + b$.
  [Figure: a linear fit with the prediction error measured at one datapoint.]
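As a small illustration of the linear model and its raw prediction error (the weights, offset, and datapoint below are made up for the example):

```python
import numpy as np

# Prediction and raw error of the linear model on one (made-up) training pair.
w = np.array([0.8, -0.3])        # assumed weights
b = 0.5                          # assumed offset
x_i = np.array([1.0, 2.0])       # assumed input
y_i = 1.2                        # assumed target
f_x = w @ x_i + b                # f(x) = <w, x> + b
print(f_x - y_i)                 # signed prediction error, before any eps-tube
```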

  9. Support Vector Regression
  Set an upper bound ε on the error and consider as correctly predicted all points such that $|f(x) - y| \le \varepsilon$.
  Penalize only datapoints that are not contained in the ε-tube.
  [Figure: the linear fit $y = f(x) = \langle w, x \rangle + b$ with the ε-tube around it.]
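This criterion is the ε-insensitive loss. A minimal sketch in Python (the function name is ours; the criterion is the slide's):

```python
import numpy as np

def eps_insensitive_loss(y, f_x, eps):
    """Errors |f(x) - y| <= eps cost nothing; only the excess beyond eps counts."""
    return np.maximum(0.0, np.abs(f_x - y) - eps)

y = np.array([1.0, 2.0, 3.0])
f_x = np.array([1.05, 2.5, 1.0])
print(eps_insensitive_loss(y, f_x, eps=0.1))  # -> [0.  0.4 1.9]
```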

  10. Support Vector Regression
  The ε-margin is a measure of the width of the ε-insensitive tube, and hence of the precision of the regression.
  A small $\|w\|$ corresponds to a small slope for $f$. In the linear case, $f$ is more horizontal.
  [Figure: a shallow linear fit with a wide ε-margin.]

  11. Support Vector Regression
  A large $\|w\|$ corresponds to a large slope for $f$. In the linear case, $f$ is more vertical.
  The flatter the slope of the function $f$, the larger the ε-margin. To maximize the margin, we must therefore minimize the norm of $w$.
  [Figure: a steep linear fit with a narrow ε-margin.]

  12. Support Vector Regression
  This can be rephrased as a constrained optimization problem of the form:
  minimize $\frac{1}{2}\|w\|^2$
  subject to $y^i - \langle w, x^i \rangle - b \le \varepsilon$ and $\langle w, x^i \rangle + b - y^i \le \varepsilon$, $\forall i = 1, \dots, M$.
  We still need to penalize points outside the ε-insensitive tube.

  13. Support Vector Regression
  Introduce slack variables $\xi_i, \xi_i^* \ge 0$ to penalize points outside the ε-insensitive tube:
  minimize $\frac{1}{2}\|w\|^2 + \frac{C}{M}\sum_{i=1}^{M}(\xi_i + \xi_i^*)$
  subject to $y^i - \langle w, x^i \rangle - b \le \varepsilon + \xi_i$, $\langle w, x^i \rangle + b - y^i \le \varepsilon + \xi_i^*$, and $\xi_i, \xi_i^* \ge 0$.
  [Figure: points outside the tube with their slacks $\xi_i$ and $\xi_i^*$.]

  14. Support Vector Regression
  With the slack variables $\xi_i, \xi_i^* \ge 0$, all points outside the ε-tube become support vectors:
  minimize $\frac{1}{2}\|w\|^2 + \frac{C}{M}\sum_{i=1}^{M}(\xi_i + \xi_i^*)$
  subject to $y^i - \langle w, x^i \rangle - b \le \varepsilon + \xi_i$, $\langle w, x^i \rangle + b - y^i \le \varepsilon + \xi_i^*$, and $\xi_i, \xi_i^* \ge 0$.
  We now have the solution to the linear regression problem (a sketch with an off-the-shelf solver follows below). How do we generalize this to the nonlinear case?
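A hedged sketch of the linear case using scikit-learn's LinearSVR, which solves essentially this primal; note that its objective scales the slack penalty as $C\sum(\xi_i + \xi_i^*)$ rather than the $C/M$ scaling on the slide, so the two $C$ values are comparable only up to a factor of $M$, and the toy data are assumed:

```python
import numpy as np
from sklearn.svm import LinearSVR

# Assumed toy data for the linear primal above.
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(100, 1))
y = 0.5 * X.ravel() + 0.1 * rng.normal(size=100)

# epsilon is the tube half-width; C trades flatness against slack penalties.
svr = LinearSVR(epsilon=0.1, C=1.0, max_iter=10_000).fit(X, y)
print(svr.coef_, svr.intercept_)  # learned w and b
```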

  15. Support Vector Regression
  Lift $x$ into feature space and then perform linear regression in feature space.
  Linear case: $y = f(x) = \langle w, x \rangle + b$.
  Non-linear case: $x \to \phi(x)$, so $y = f(x) = \langle w, \phi(x) \rangle + b$. Note that $w$ now lives in feature space!
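A toy illustration of "lift, then regress linearly", with an explicit (assumed) polynomial feature map $\phi(x) = (1, x, x^2)$ and ordinary least squares standing in for the linear regression step; in SVR proper the map stays implicit through the kernel:

```python
import numpy as np

# Explicit (assumed) feature map phi(x) = (1, x, x^2); ordinary least squares
# stands in for the linear-regression-in-feature-space step.
rng = np.random.default_rng(2)
x = rng.uniform(-2, 2, size=80)
y = 1.0 + 0.5 * x - 0.8 * x**2 + 0.05 * rng.normal(size=80)

Phi = np.column_stack([np.ones_like(x), x, x**2])  # lift: x -> phi(x)
w = np.linalg.lstsq(Phi, y, rcond=None)[0]         # w lives in feature space
print(w)  # approximately [1.0, 0.5, -0.8]
```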

  16. Support Vector Regression
  In feature space, we obtain the same constrained optimization problem:
  minimize $\frac{1}{2}\|w\|^2 + \frac{C}{M}\sum_{i=1}^{M}(\xi_i + \xi_i^*)$
  subject to $y^i - \langle w, \phi(x^i) \rangle - b \le \varepsilon + \xi_i$, $\langle w, \phi(x^i) \rangle + b - y^i \le \varepsilon + \xi_i^*$, and $\xi_i, \xi_i^* \ge 0$.

  17. Support Vector Regression
  Again, we can solve this quadratic problem by introducing sets of Lagrange multipliers and writing the Lagrangian (objective function plus multipliers times constraints):
  $L(w, b, \xi, \xi^*) = \frac{1}{2}\|w\|^2 + \frac{C}{M}\sum_{i=1}^{M}(\xi_i + \xi_i^*) - \sum_{i=1}^{M}(\eta_i \xi_i + \eta_i^* \xi_i^*)$
  $\qquad - \sum_{i=1}^{M} \alpha_i \left( \varepsilon + \xi_i - y^i + \langle w, \phi(x^i) \rangle + b \right) - \sum_{i=1}^{M} \alpha_i^* \left( \varepsilon + \xi_i^* + y^i - \langle w, \phi(x^i) \rangle - b \right)$

  18. Support Vector Regression
  By complementary slackness, $\alpha_i = \alpha_i^* = 0$ for all points whose constraints are inactive, i.e. points strictly inside the ε-tube. Points on or outside the ε-tube have $\alpha_i > 0$ (above the tube, with slack $\xi_i$) or $\alpha_i^* > 0$ (below the tube, with slack $\xi_i^*$): these are the constraints on points lying on either side of the ε-tube.
  $L(w, b, \xi, \xi^*) = \frac{1}{2}\|w\|^2 + \frac{C}{M}\sum_{i=1}^{M}(\xi_i + \xi_i^*) - \sum_{i=1}^{M}(\eta_i \xi_i + \eta_i^* \xi_i^*)$
  $\qquad - \sum_{i=1}^{M} \alpha_i \left( \varepsilon + \xi_i - y^i + \langle w, \phi(x^i) \rangle + b \right) - \sum_{i=1}^{M} \alpha_i^* \left( \varepsilon + \xi_i^* + y^i - \langle w, \phi(x^i) \rangle - b \right)$

  19. Support Vector Regression
  (Recall: $\alpha_i, \alpha_i^*$ vanish for points strictly inside the ε-tube; only points on or outside it have $\alpha_i > 0$ or $\alpha_i^* > 0$.)
  Requiring that the partial derivatives are all zero:
  $\frac{\partial L}{\partial b} = \sum_{i=1}^{M} (\alpha_i^* - \alpha_i) = 0 \;\Rightarrow\; \sum_{i=1}^{M} \alpha_i = \sum_{i=1}^{M} \alpha_i^*$ (rebalancing the effect of the support vectors on both sides of the ε-tube)
  $\frac{\partial L}{\partial w} = w - \sum_{i=1}^{M} (\alpha_i - \alpha_i^*)\, \phi(x^i) = 0 \;\Rightarrow\; w = \sum_{i=1}^{M} (\alpha_i - \alpha_i^*)\, \phi(x^i)$ ($w$ is a linear combination of the support vectors)

  20. Support Vector Regression
  Replacing in the primal Lagrangian, we get the dual optimization problem, using the kernel trick $k(x^i, x^j) = \langle \phi(x^i), \phi(x^j) \rangle$:
  maximize over $\alpha, \alpha^*$:
  $-\frac{1}{2} \sum_{i,j=1}^{M} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\, k(x^i, x^j) \;-\; \varepsilon \sum_{i=1}^{M} (\alpha_i + \alpha_i^*) \;+\; \sum_{i=1}^{M} y^i (\alpha_i - \alpha_i^*)$
  subject to $\sum_{i=1}^{M} (\alpha_i - \alpha_i^*) = 0$ and $\alpha_i, \alpha_i^* \in \left[0, \frac{C}{M}\right]$.
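A sketch of solving this dual directly as a quadratic program with cvxopt (assuming it is installed), stacking $z = (\alpha, \alpha^*)$; the RBF kernel, the tiny ridge on P, and the way $b$ is read off an in-bounds support vector are our choices for the example, not part of the slide:

```python
import numpy as np
from cvxopt import matrix, solvers

def svr_dual(X, y, eps=0.1, C=1.0, gamma=1.0):
    """Solve the SVR dual QP in z = (alpha, alpha*); return beta = alpha - alpha* and b."""
    M = len(y)
    # RBF Gram matrix k(x^i, x^j) = exp(-gamma * ||x^i - x^j||^2)
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-gamma * sq)

    # Dual as a minimization: (1/2) z'Pz + q'z, 0 <= z <= C/M, sum(alpha - alpha*) = 0
    P = np.block([[K, -K], [-K, K]]) + 1e-8 * np.eye(2 * M)  # ridge keeps P numerically PSD
    q = np.concatenate([eps - y, eps + y])
    G = np.vstack([-np.eye(2 * M), np.eye(2 * M)])
    h = np.concatenate([np.zeros(2 * M), np.full(2 * M, C / M)])
    A = np.hstack([np.ones(M), -np.ones(M)]).reshape(1, -1)

    solvers.options['show_progress'] = False
    sol = solvers.qp(matrix(P), matrix(q), matrix(G), matrix(h), matrix(A), matrix(0.0))
    z = np.array(sol['x']).ravel()
    beta = z[:M] - z[M:]                     # alpha_i - alpha_i*
    # Read b off a support vector with 0 < alpha_i < C/M (on the tube boundary,
    # where y^i - f(x^i) = eps); assumes at least one such point exists.
    i = int(np.argmax((z[:M] > 1e-6) & (z[:M] < C / M - 1e-6)))
    b = y[i] - eps - K[i] @ beta
    return beta, b
```

Plugging the returned beta and b into the kernel expansion on the next slide reproduces $f(x) = \sum_i \beta_i k(x^i, x) + b$.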

  21. Support Vector Regression
  The solution is given by:
  $y = f(x) = \sum_{i=1}^{M} (\alpha_i - \alpha_i^*)\, k(x^i, x) + b$
  The linear coefficients are the Lagrange multipliers, one pair per constraint. If one uses an RBF kernel, the solution is a sum of M un-normalized isotropic Gaussians centered on each training datapoint.
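A direct numpy sketch of evaluating this kernel expansion; the support vectors, the coefficients $\beta_i = \alpha_i - \alpha_i^*$, and $b$ below are hypothetical placeholders (they could come, e.g., from the dual QP sketch above):

```python
import numpy as np

# Evaluate f(x) = sum_i beta_i k(x^i, x) + b with an RBF kernel (1-D inputs).
def rbf(xi, x, gamma=1.0):
    return np.exp(-gamma * (xi - x) ** 2)  # un-normalized Gaussian

def f(x, sv_x, beta, b, gamma=1.0):
    return sum(b_i * rbf(x_i, x, gamma) for x_i, b_i in zip(sv_x, beta)) + b

sv_x = np.array([0.0, 1.0, 2.5])    # hypothetical support vectors
beta = np.array([1.5, -2.0, 0.7])   # hypothetical alpha_i - alpha_i*
print(f(1.2, sv_x, beta, b=0.3))    # prediction at x = 1.2
```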

  22. Support Vector Regression
  The solution is given by:
  $y = f(x) = \sum_{i=1}^{M} (\alpha_i - \alpha_i^*)\, k(x^i, x) + b$
  The kernel places a Gaussian function on each support vector.
  [Figure: the regression curve as a sum of Gaussian bumps centered on the support vectors.]

  23. Support Vector Regression
  The solution is given by:
  $y = f(x) = \sum_{i=1}^{M} (\alpha_i - \alpha_i^*)\, k(x^i, x) + b$
  The Lagrange multipliers define the importance of each Gaussian function. Away from the support vectors, the effect of the kernels vanishes and $f(x)$ converges to $b$.
  [Figure: $y = f(x)$ built from Gaussians on six support vectors $x_1, \dots, x_6$ whose multipliers have different magnitudes (e.g. 1.5, 2, 2.5, 1.5, 3, 1).]
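A hedged illustration with scikit-learn's kernel SVR (the toy data are assumed): after fitting, `dual_coef_` holds $\alpha_i - \alpha_i^*$ for each support vector, i.e. the importance of each Gaussian, and `intercept_` is $b$:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(3)
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X).ravel() + 0.05 * rng.normal(size=60)

svr = SVR(kernel='rbf', epsilon=0.1, C=1.0, gamma=1.0).fit(X, y)
print(svr.support_vectors_.shape)   # only points on or outside the eps-tube
print(svr.dual_coef_)               # alpha_i - alpha_i* per support vector
print(svr.intercept_)               # b, the value f(x) approaches far from all SVs
```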
