interior point methods

Interior-point methods: 10-725 Optimization, Geoff Gordon and Ryan Tibshirani


  1. Interior-point methods. 10-725 Optimization, Geoff Gordon and Ryan Tibshirani

  2. Review
     • SVM duality
       ‣ primal: $\min\ v^T v/2 + \mathbf{1}^T s$ s.t. $Av - yd + s - \mathbf{1} \ge 0$, $s \ge 0$
       ‣ dual: $\max\ \mathbf{1}^T \alpha - \alpha^T K \alpha/2$ s.t. $y^T \alpha = 0$, $0 \le \alpha \le \mathbf{1}$
       ‣ Gram matrix $K$
     • Interpretation
       ‣ support vectors & complementarity
       ‣ reconstruct primal solution from dual
     Geoff Gordon, 10-725 Optimization, Fall 2012

  3. Review
     • Kernel trick
       ‣ high-dim feature spaces, fast
       ‣ positive definite function
     • Examples
       ‣ polynomial
       ‣ homogeneous polynomial
       ‣ linear
       ‣ Gaussian RBF
     [Figure: 2-D plot, both axes running from -2 to 2]
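The four example kernels can be sketched in NumPy as below. This is only an illustration: the function names, the degree, offset, and width parameters, and the random test points are assumptions, not taken from the slides. The final dictionary records the smallest eigenvalue of each Gram matrix, which should be nonnegative (up to rounding) for a positive definite kernel.

```python
import numpy as np

def linear_kernel(X, Z):
    # k(x, z) = x^T z
    return X @ Z.T

def homogeneous_poly_kernel(X, Z, degree=2):
    # k(x, z) = (x^T z)^degree
    return (X @ Z.T) ** degree

def poly_kernel(X, Z, degree=2, offset=1.0):
    # k(x, z) = (x^T z + offset)^degree
    return (X @ Z.T + offset) ** degree

def rbf_kernel(X, Z, width=1.0):
    # k(x, z) = exp(-||x - z||^2 / (2 width^2))
    sq_dist = (np.sum(X**2, axis=1)[:, None]
               + np.sum(Z**2, axis=1)[None, :]
               - 2.0 * X @ Z.T)
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * width**2))

# Hypothetical data: six random points in the plane.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))

gram_matrices = {
    "linear": linear_kernel(X, X),
    "homogeneous polynomial": homogeneous_poly_kernel(X, X),
    "polynomial": poly_kernel(X, X),
    "Gaussian RBF": rbf_kernel(X, X),
}

# Smallest eigenvalue of each (symmetric) Gram matrix.
min_eigs = {name: np.linalg.eigvalsh(K).min() for name, K in gram_matrices.items()}
```

Only inner products (or distances) between data points are needed, which is why the Gram matrix $K$ is all the SVM dual requires.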

  4. Review: LF problem $Ax + b \ge 0$
     • Ball center
       ‣ bad summary of LF problem
     • Max-volume ellipsoid / ellipsoid center
       ‣ good summary (1/n of volume), but expensive
     • Analytic center of LF problem
       ‣ maximize product of distances to constraints
       ‣ $\min\ -\sum_i \ln(a_i^T x + b_i)$
     • Dikin ellipsoid @ analytic center: not quite as good (just 1/m < 1/n), but much cheaper

  5. Force-field interpretation of analytic center
     • Pretend constraints are repelling a particle
       ‣ normal force for each constraint
       ‣ force $\propto$ 1/distance
     • Analytic center = equilibrium = where forces balance

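  6. Newton for analytic center
     • $f(x) = -\sum_i \ln(a_i^T x + b_i)$, with $s = Ax + b$ and $S = \mathrm{diag}(s)$
       ‣ $df/dx = -\sum_i a_i/(a_i^T x + b_i) = -A^T S^{-1}\mathbf{1}$
       ‣ $d^2f/dx^2 = \sum_i a_i a_i^T/(a_i^T x + b_i)^2 = A^T S^{-2} A$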
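A minimal sketch of Newton's method for the analytic center, using the gradient $-A^T S^{-1}\mathbf{1}$ and Hessian $A^T S^{-2} A$ of the log barrier. The function name `analytic_center`, the backtracking line search, the tolerances, and the triangle example are assumptions made for illustration; the slides do not specify an implementation.

```python
import numpy as np

def analytic_center(A, b, x0, tol=1e-10, max_iter=50):
    """Damped Newton on f(x) = -sum(log(A x + b)) from a strictly feasible x0."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(max_iter):
        s = A @ x + b                        # slacks, must stay positive
        grad = -A.T @ (1.0 / s)              # -A^T S^{-1} 1
        hess = A.T @ (A / s[:, None] ** 2)   # A^T S^{-2} A
        step = np.linalg.solve(hess, -grad)
        lam2 = -grad @ step                  # squared Newton decrement
        if lam2 / 2.0 <= tol:
            break
        f = -np.sum(np.log(s))
        alpha = 1.0
        for _ in range(60):                  # backtrack: stay feasible, decrease f
            x_new = x + alpha * step
            s_new = A @ x_new + b
            if np.all(s_new > 0) and -np.sum(np.log(s_new)) <= f - 0.25 * alpha * lam2:
                break
            alpha *= 0.5
        x = x_new
    return x

# Hypothetical example: the triangle x1 >= 0, x2 >= 0, x1 + x2 <= 1.
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.array([0.0, 0.0, 1.0])
x_ac = analytic_center(A, b, x0=[0.2, 0.1])
```

For this triangle the analytic center maximizes $x_1 x_2 (1 - x_1 - x_2)$, so `x_ac` should land near $(1/3, 1/3)$.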

  7. Dikin ellipsoid
     • $E(x_0) = \{ x \mid (x - x_0)^T H (x - x_0) \le 1 \}$
       ‣ $H$ = Hessian of log barrier at $x_0$
       ‣ unit ball of Hessian norm at $x_0$
     • $E(x_0) \subseteq X$ for any strictly feasible $x_0$
       ‣ affine constraints can be just feasible
       ‣ $E(x_0)$: as above, but intersected w/ affine constraints
     • $\mathrm{vol}(E(x_{ac})) \ge \mathrm{vol}(X)/m$
       ‣ weaker than ellipsoid center, but still very useful

  8. $E(x_0) \subseteq X$
     • $E(x_0) = \{ x \mid (x - x_0)^T H (x - x_0) \le 1 \}$
       ‣ $H = A^T S^{-2} A$
       ‣ $S = \mathrm{diag}(s) = \mathrm{diag}(Ax_0 + b)$
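One way to see the containment, sketched directly from the definitions on this slide (writing $s_i = a_i^T x_0 + b_i > 0$ for the slacks at the strictly feasible point $x_0$): for any $x \in E(x_0)$,

$$
(x - x_0)^T A^T S^{-2} A\,(x - x_0) \;=\; \sum_{i=1}^m \left( \frac{a_i^T (x - x_0)}{s_i} \right)^2 \;\le\; 1
\quad\Longrightarrow\quad
|a_i^T (x - x_0)| \le s_i \ \text{ for every } i,
$$

$$
a_i^T x + b_i \;=\; s_i + a_i^T (x - x_0) \;\ge\; s_i - s_i \;=\; 0 ,
$$

so every constraint holds at $x$, i.e. $x \in X$.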

  9. $mE(x_0) \supseteq X$
     • Feasible point $x$: $Ax + b \ge 0$
     • Analytic center $x_{ac}$: $A^T y = 0$, $y = \mathbf{1} ./ (Ax_{ac} + b)$ (elementwise)
     • Let $Y = \mathrm{diag}(y_{ac})$, $H = A^T Y^2 A$; show:
       ‣ $(x - x_{ac})^T H (x - x_{ac}) \le m^2$ [+ m]
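A short derivation of the $m^2$ bound, offered as a sketch from the definitions on this slide. The ratios $r_i$ are notation introduced here; $s_i = a_i^T x + b_i \ge 0$ are the slacks at a feasible $x$ and $s_{ac,i} = 1/y_i > 0$ those at the analytic center. Let $r_i = s_i / s_{ac,i} \ge 0$. Since $A^T y = 0$,

$$
\sum_{i=1}^m (r_i - 1) \;=\; y^T (s - s_{ac}) \;=\; y^T A (x - x_{ac}) \;=\; 0
\quad\Longrightarrow\quad \sum_{i=1}^m r_i = m .
$$

Then

$$
(x - x_{ac})^T A^T Y^2 A\,(x - x_{ac}) \;=\; \sum_{i=1}^m (r_i - 1)^2 \;=\; \sum_{i=1}^m r_i^2 - m \;\le\; \Big(\sum_{i=1}^m r_i\Big)^2 - m \;=\; m^2 - m \;\le\; m^2 ,
$$

where the first inequality uses $r_i \ge 0$. Hence $x$ lies in the Dikin ellipsoid at $x_{ac}$ scaled by $m$ about its center.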

  10. Combinatorics v. analysis
     • Two ways to find a feasible point of $Ax + b \ge 0$
       ‣ find analytic center: minimize a smooth function
       ‣ find a feasible basis: combinatorial search

  11. Bad conditioning? No problem.
     • Analytic center & Dikin ellipsoids invariant to affine xforms $w = Mx + q$
       ‣ $W = \{ w \mid AM^{-1}(w - q) + b \ge 0 \}$
     • Can always xform so that a ball takes up $\ge \mathrm{vol}(Y)/m$
       ‣ Dikin ellipsoid @ac → sphere

  12. LF → LP: the central path
     • Analytic center was for: find $x$ s.t. $Ax + b \ge 0$
     • Now: $\min\ c^T x$ s.t. $Ax + b \ge 0$
     • Same trick:
       ‣ $\min\ f_t(x) = c^T x - (1/t) \sum_i \ln(a_i^T x + b_i)$
       ‣ parameter $t > 0$
       ‣ central path = $\{ x(t) : t > 0 \}$, where $x(t) = \arg\min_x f_t(x)$
       ‣ $t \to 0$: analytic center; $t \to \infty$: LP optimum

  13. Force-field interpretation of central path
     • Force along objective; normal forces for each constraint
     [Figure: two force-field diagrams, objective force $-c$ at t=1 and $-3c$ at t=3]

  14. Newton for central path
     • $\min\ f_t(x) = c^T x - (1/t) \sum_i \ln(a_i^T x + b_i)$, with $s = Ax + b$ and $S = \mathrm{diag}(s)$
       ‣ $df/dx = c - (1/t)\, A^T S^{-1}\mathbf{1}$
       ‣ $d^2f/dx^2 = (1/t)\, A^T S^{-2} A$
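A minimal sketch of following the central path with Newton's method. Rather than jumping straight to a huge $t$, it re-centers at a geometrically increasing sequence of $t$ values, warm-starting each solve from the previous center. This is a standard path-following scheme and an assumption here, not necessarily the alternative the lecture goes on to present. The function name `barrier_lp`, the schedule `mu=10`, the tolerances, and the small LP are hypothetical. Newton is applied to $t \cdot f_t$, which has the same minimizer and the same Newton step as $f_t$.

```python
import numpy as np

def barrier_lp(A, b, c, x0, t0=1.0, mu=10.0, gap_tol=1e-5, newton_tol=1e-8):
    """Approximately solve min c^T x s.t. A x + b >= 0 by tracing the central path."""
    x = np.asarray(x0, dtype=float).copy()
    m = A.shape[0]
    t = t0
    path = []                                 # (t, x(t)) pairs, for inspection
    while True:
        for _ in range(100):                  # Newton centering for this t
            s = A @ x + b
            g = t * c - A.T @ (1.0 / s)       # gradient of t * f_t
            H = A.T @ (A / s[:, None] ** 2)   # Hessian of t * f_t
            step = np.linalg.solve(H, -g)
            lam2 = -g @ step                  # squared Newton decrement
            if lam2 / 2.0 <= newton_tol:
                break
            phi = t * (c @ x) - np.sum(np.log(s))
            alpha = 1.0
            for _ in range(60):               # backtracking line search
                x_new = x + alpha * step
                s_new = A @ x_new + b
                if (np.all(s_new > 0) and
                        t * (c @ x_new) - np.sum(np.log(s_new)) <= phi - 0.25 * alpha * lam2):
                    break
                alpha *= 0.5
            x = x_new
        path.append((t, x.copy()))
        if m / t < gap_tol:                   # stop once m/t is small
            return x, path
        t *= mu

# Hypothetical LP: minimize -x1 - 2*x2 over the triangle x1, x2 >= 0, x1 + x2 <= 1.
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.array([0.0, 0.0, 1.0])
c = np.array([-1.0, -2.0])
x_opt, path = barrier_lp(A, b, c, x0=[0.2, 0.2])
```

For this LP the optimum is the vertex $(0, 1)$ with value $-2$; `x_opt` should be strictly feasible and close to that vertex, and `path` holds the intermediate centers.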

  15. Central path example
     [Figure: central path in a feasible region, with the objective direction and the ends $t \to 0$ and $t \to \infty$ marked]

  16. New LP algorithm?
     • Set $t = 10^{12}$. Find corresponding point on central path by Newton's method.
       ‣ worked for example on previous slide!
       ‣ but has convergence problems in general
     • Alternatives?

  17. Constraint form of central path
     • $\min\ -\sum_i \ln s_i$ s.t. $Ax + b \ge 0$, $c^T x \le \lambda$
     • $\exists$ a 1-1 mapping $\lambda(t)$ w/ $x(\lambda(t)) = x(t)\ \forall t > 0$
       ‣ but this form is slightly less convenient since we don't know minimal feasible value of $\lambda$ or maximal nontrivial value of $\lambda$

  18. Dual of central path
     • $\min\ c^T x - (1/t) \sum_i \ln s_i$ s.t. $Ax + b = s \ge 0$
       ‣ $\min_{x,s} \max_y L(x,s,y) = c^T x - (1/t) \sum_i \ln s_i + y^T (s - Ax - b)$

  19. Primal-dual correspondence
     • Primal and dual for central path:
       ‣ $\min\ c^T x - (1/t) \sum_i \ln s_i$ s.t. $Ax + b = s \ge 0$
       ‣ $\max\ (m \ln t)/t + m/t + (1/t) \sum_i \ln y_i - y^T b$ s.t. $A^T y = c$, $y \ge 0$
     • $L(x,s,y) = c^T x - (1/t) \sum_i \ln s_i + y^T (s - Ax - b)$
       ‣ grad wrt $s$: $-(1/t)\,(\mathbf{1} ./ s) + y = 0$, i.e. $s_i y_i = 1/t$
       ‣ to get $x$: set $s_i = 1/(t y_i)$ and solve $Ax + b = s$
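The dual objective can be recovered from the Lagrangian $L(x,s,y)$ on this slide; the following is a sketch of that calculation. Minimizing $L$ over $x$ is bounded only if the linear term vanishes, which gives the constraint $A^T y = c$. Minimizing over $s > 0$ gives, for each $i$,

$$
\frac{\partial L}{\partial s_i} = -\frac{1}{t s_i} + y_i = 0
\quad\Longrightarrow\quad
s_i = \frac{1}{t\, y_i}, \qquad y_i > 0 .
$$

Substituting back, with the $x$ terms cancelling because $A^T y = c$:

$$
\min_{x,s} L
= -\frac{1}{t} \sum_{i=1}^m \ln \frac{1}{t y_i} + \sum_{i=1}^m \frac{y_i}{t y_i} - y^T b
= \frac{m \ln t}{t} + \frac{1}{t} \sum_{i=1}^m \ln y_i + \frac{m}{t} - y^T b ,
$$

which matches the dual objective stated above term by term.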

  20. Duality gap
     • At optimum:
       ‣ primal value $c^T x - (1/t) \sum_i \ln s_i$ = dual value $(m \ln t)/t + m/t + (1/t) \sum_i \ln y_i - y^T b$
       ‣ $s \circ y = e/t$ (from the gradient of $L$ wrt $s$; $e$ is the all-ones vector)
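As a worked consequence, here is a sketch of the gap for the underlying LP, assuming its standard dual $\max\ -b^T y$ s.t. $A^T y = c$, $y \ge 0$ (the LP dual is not written on the slide). Stationarity of $L(x,s,y) = c^T x - (1/t) \sum_i \ln s_i + y^T (s - Ax - b)$ in $s$ gives $s_i y_i = 1/t$ for every $i$. With $A^T y = c$ and $Ax + b = s$,

$$
c^T x - (-b^T y) \;=\; y^T A x + y^T b \;=\; y^T (Ax + b) \;=\; y^T s \;=\; \sum_{i=1}^m s_i y_i \;=\; \frac{m}{t} .
$$

So a point on the central path with parameter $t$ comes with a dual feasible $y$ certifying that its LP objective is within $m/t$ of optimal.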
