Convex Programs — COMPSCI 371D Machine Learning (PowerPoint presentation)


  1. Convex Programs — COMPSCI 371D Machine Learning

  2. Support Vector Machines (SVMs) and Convex Programs
  • SVMs are linear predictors in their original form
  • Defined for both regression and classification
  • Multi-class versions exist; we will cover only binary SVM classification
  • Why do we need another linear classifier?
  • We’ll need some new math: convex programs — the optimization of convex functions subject to affine constraints

  3. Outline
  1 Logistic Regression → Support Vector Machines
  2 Local Convex Minimization → Convex Programs
  3 Shape of the Solution Set
  4 The Karush-Kuhn-Tucker Conditions

  4. Logistic Regression → SVMs
  • A logistic-regression classifier places the decision boundary somewhere (and approximately) between the two classes
  • The loss is never zero → the exact location of the boundary can be influenced by samples that are very distant from the boundary (even on the correct side of it)
  • SVMs place the boundary “exactly half-way” between the two classes (with exceptions to allow for classes that are not linearly separable)
  • Only samples close to the boundary matter: these are the support vectors
  • A “kernel trick” allows going beyond linear classifiers
  • We only look at the binary case

  5. Roadmap for SVMs
  • SVM training minimizes a convex function subject to constraints
  • Convex: a unique minimum value of the risk
  • Constraints: define a convex program, i.e., the minimization of a convex function subject to affine constraints
  • Representer theorem: the SVM hyperplane’s normal vector is a linear combination of a subset of the training samples (x_n, y_n); those x_n are the support vectors
  • The proof of the representer theorem is based on a characterization of the solutions of a convex program
  • Characterization for an unconstrained problem: ∇f(u) = 0
  • Characterization for a convex program: the Karush-Kuhn-Tucker (KKT) conditions
  • The representer theorem leads to the kernel trick, through which SVMs can be turned into nonlinear classifiers
  • The decision boundary is then no longer necessarily a hyperplane

  6. Roadmap Summary
  convex program → SVM formulation
  KKT conditions → representer theorem → kernel trick

  7. Local Convex Minimization → Convex Programs
  • Convex function f : ℝ^m → ℝ
  • f differentiable, with continuous first derivatives
  • Unconstrained minimization: u* ∈ arg min_{u ∈ ℝ^m} f(u)
  • Constrained minimization: u* ∈ arg min_{u ∈ C} f(u) with C = {u ∈ ℝ^m : Au + b ≥ 0}
  • f is a convex function
  • C is a convex set: if u, v ∈ C, then tu + (1 − t)v ∈ C for all t ∈ [0, 1]
  • This specific C is bounded by hyperplanes
  • This is a convex program
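The convexity of C can be checked numerically for a small made-up instance. The sketch below picks a hypothetical A and b (the triangle u₁ ≥ 0, u₂ ≥ 0, u₁ + u₂ ≤ 3 written as Au + b ≥ 0) and verifies that every convex combination of two feasible points stays feasible:

```python
import numpy as np

# Hypothetical feasible set C = {u in R^2 : A u + b >= 0}:
# the constraints u1 >= 0, u2 >= 0, and u1 + u2 <= 3.
A = np.array([[ 1.0,  0.0],
              [ 0.0,  1.0],
              [-1.0, -1.0]])
b = np.array([0.0, 0.0, 3.0])

def in_C(u, tol=1e-12):
    """Feasibility check: every affine inequality must hold."""
    return bool(np.all(A @ u + b >= -tol))

u = np.array([0.5, 0.5])   # a point in C
v = np.array([2.0, 1.0])   # another point in C
assert in_C(u) and in_C(v)

# Convexity of C: every point on the segment between u and v stays in C.
for t in np.linspace(0.0, 1.0, 11):
    assert in_C(t * u + (1 - t) * v)
print("all convex combinations of u and v are feasible")
```

This is only a spot check on one segment, of course; the general statement follows from the fact that each affine inequality is preserved under convex combinations.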

  8. Convex Program
  u* ∈ arg min_{u ∈ C} f(u) where C ≝ {u ∈ ℝ^m : c(u) ≥ 0}
  • f differentiable, with continuous gradient, and convex
  • The k inequalities in C are affine: c(u) = Au + b ≥ 0
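A minimal sketch of solving such a program numerically, for a made-up instance where the projection onto C is cheap: f(u) = ‖u − p‖² with C the nonnegative orthant (the special case A = I, b = 0), solved by projected gradient descent. The target point p is an assumption for illustration:

```python
import numpy as np

# Made-up convex program: minimize f(u) = ||u - p||^2 subject to u >= 0.
# With A = I and b = 0, projecting onto C is just clipping at zero.
p = np.array([-1.0, 2.0])

def f(u):
    return float(np.sum((u - p) ** 2))

def grad_f(u):
    return 2.0 * (u - p)

u = np.zeros(2)                   # feasible starting point
step = 0.1
for _ in range(200):
    u = u - step * grad_f(u)      # gradient step
    u = np.maximum(u, 0.0)        # project back onto C

# The unconstrained minimizer is p itself, but p is infeasible (p[0] < 0):
# the constrained solution clips the negative coordinate to the boundary.
print(u)   # close to [0., 2.]
```

For a general affine C the projection is itself an optimization problem, so in practice one would use a constrained solver; this special case is just meant to make the definition concrete.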

  9. Shape of the Solution Set
  • Just as for the unconstrained problem:
  • There is one optimal value f* but there can be multiple solution points u* (a flat valley)
  • The set of solution points u* is convex
  • If f is strictly convex at u*, then u* is the unique solution point
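A tiny made-up flat-valley example illustrates the convexity of the solution set: f(u) = (u₁ + u₂ − 1)² is convex but not strictly convex, and every point on the line u₁ + u₂ = 1 attains f* = 0, so any convex combination of two minimizers is again a minimizer:

```python
import numpy as np

# Flat valley: f is minimized on the whole line u1 + u2 = 1, with f* = 0.
def f(u):
    return (u[0] + u[1] - 1.0) ** 2

u_star = np.array([1.0, 0.0])   # one solution point
v_star = np.array([0.0, 1.0])   # another solution point
assert f(u_star) < 1e-12 and f(v_star) < 1e-12

# The solution set is convex: every convex combination is also a minimizer.
for t in np.linspace(0.0, 1.0, 5):
    assert f(t * u_star + (1 - t) * v_star) < 1e-12
print("every point on the segment attains f* = 0")
```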

  10. Zero Gradient → KKT Conditions
  • For the unconstrained problem, a solution is characterized by ∇f(u) = 0
  • Constraints can generate new minima and maxima
  • Example: f(u) = e^u on the interval [0, 1]
  [Figure: three plots of f(u) = e^u over u ∈ [0, 1], showing the minima and maxima created at the interval endpoints]
  • What is the new characterization?
  • The Karush-Kuhn-Tucker conditions, which are necessary and sufficient
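The slide's example can be worked through directly. Minimizing f(u) = eᵘ on [0, 1] (the affine constraints c₁(u) = u ≥ 0 and c₂(u) = 1 − u ≥ 0), the minimum sits at the boundary point u = 0 even though the gradient does not vanish there; a nonnegative multiplier on the active constraint accounts for it:

```python
import math

# Minimize f(u) = e^u subject to u in [0, 1],
# i.e. c1(u) = u >= 0 and c2(u) = 1 - u >= 0.
f = math.exp

# Coarse grid search over the feasible interval just to locate the minimum.
grid = [i / 1000 for i in range(1001)]
u_star = min(grid, key=f)
print(u_star)                # 0.0: the minimum sits on the constraint boundary

# The gradient does NOT vanish there: f'(u) = e^u, so f'(0) = 1.
grad_f = math.exp(u_star)

# KKT instead: at u = 0 the active constraint is c1, with gradient +1,
# so grad_f = alpha * 1 holds with alpha = 1 >= 0.
alpha = grad_f / 1.0
assert alpha >= 0.0
print("KKT multiplier alpha =", alpha)
```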

  11. Regular Points
  [Figure: a single constraint boundary through a point u, with gradient ∇f, direction s, and the half-spaces H− and H+]

  12. Corner Points
  [Figure: two constraints c1 and c2 meeting at a corner point u of the feasible set C, with gradient ∇f, direction s, and the half-spaces H− and H+]

  13. The Convex Cone of the Constraint Gradients
  [Figure: the convex cone spanned by the gradients of the active constraints c1 and c2 at u, together with ∇f and the half-spaces H− and H+]

  14. Inactive Constraints Do Not Matter
  [Figure: points u and v in a feasible set C bounded by constraints c1, c2, c3; only the constraints active at each point appear in the characterization]

  15. Conic Combinations
  [Figure: generators a1, a2 with normals n1, n2 at a point u, with ∇f and the half-spaces H− and H+; the shaded region is the cone]
  {v : v = α1 a1 + α2 a2 with α1, α2 ≥ 0}
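Membership in such a cone can be tested by solving for the combination coefficients and checking their signs. A sketch with two made-up, linearly independent generators in ℝ², where the coefficients come from a 2×2 linear system:

```python
import numpy as np

# v lies in the cone of a1, a2 iff v = alpha1*a1 + alpha2*a2
# with alpha1, alpha2 >= 0. Generators are made up for illustration.
a1 = np.array([1.0, 0.0])
a2 = np.array([1.0, 1.0])

def cone_coeffs(v):
    """Solve [a1 a2] @ alpha = v for the combination coefficients."""
    return np.linalg.solve(np.column_stack([a1, a2]), v)

alpha = cone_coeffs(np.array([3.0, 1.0]))   # v = 2*a1 + 1*a2
print(alpha, bool(np.all(alpha >= 0)))      # in the cone

alpha = cone_coeffs(np.array([-1.0, 0.5]))  # a point outside the cone
print(alpha, bool(np.all(alpha >= 0)))      # not in the cone
```

With more generators than dimensions the system is overdetermined in sign rather than in size, and one would instead solve a nonnegative least-squares problem.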

  16. The KKT Conditions
  u ∈ C is a solution to a convex program iff there exist α_i such that
  ∇f(u) = Σ_{i ∈ A(u)} α_i ∇c_i(u) with α_i ≥ 0
  where A(u) = {i : c_i(u) = 0} is the active set at u.
  [Figure: ∇f at u lying inside the cone of the gradients of the active constraints c1, c2, with the half-spaces H− and H+]
  Convention: Σ_{i ∈ ∅} = 0 (so the condition also holds in the interior of C)
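The KKT conditions can be verified numerically at a candidate point. A sketch for a made-up convex program: minimize f(u) = ‖u − p‖² subject to u ≥ 0, i.e. constraints cᵢ(u) = uᵢ ≥ 0 whose gradients are the standard basis vectors eᵢ, so the multipliers can be read off coordinate by coordinate:

```python
import numpy as np

# Made-up convex program: minimize ||u - p||^2 subject to u >= 0.
p = np.array([-1.0, 2.0])
u = np.array([0.0, 2.0])          # candidate solution on the boundary

grad_f = 2.0 * (u - p)            # gradient of ||u - p||^2

# Active set A(u) = {i : c_i(u) = 0}, here the coordinates where u_i = 0.
active = [i for i in range(len(u)) if u[i] == 0.0]
print(active)                     # [0]: only the first constraint is active

# KKT: grad_f must be a nonnegative combination of the active gradients e_i,
# i.e. grad_f[i] >= 0 for active i (these components are the alpha_i),
# and grad_f[i] == 0 for inactive i.
alphas = {i: grad_f[i] for i in active}
assert all(a >= 0.0 for a in alphas.values())
assert all(abs(grad_f[i]) < 1e-12 for i in range(len(u)) if i not in active)
print("KKT conditions hold at u =", u, "with alphas =", alphas)
```

Here ∇f(u) = (2, 0) = 2·e₀, so α₀ = 2 ≥ 0 and the condition holds, confirming that u = (0, 2) solves the program.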
