Incremental Gradient, Subgradient, and Proximal Methods for Convex Optimization: A Survey (presentation slides, Chapter 4 of Optimization for Machine Learning)


  1. Incremental Gradient, Subgradient, and Proximal Methods for Convex Optimization: A Survey. Chapter 4 of Optimization for Machine Learning.

  2. Summary of Chapter 2 • Chapter 2: Convex Optimization with Sparsity-Inducing Norms • This chapter is on convex optimization problems of the form min_x f(x) + λ Ω(x), where f is a convex differentiable function and Ω is a sparsity-inducing non-smooth norm • Examples of Ω: the l1 norm, l1 + l1/lq combinations, and hierarchical l1/lq norms • Algorithms covered: subgradient methods, block coordinate descent, reweighted-l2 algorithms, etc.

  3. Summary of Chapter 3 • This chapter is on conic linear and quadratic programming, where constraints are expressed with a generalized inequality ⪯_C: x ⪯_C y means y - x ∈ C, with C a closed pointed cone • Examples of cones: 1) the non-negative orthant, 2) the second-order cone {(t, u) : ||u||_2 <= t} • The Python package CVXOPT can solve such conic problems (a small example follows).
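
As a concrete illustration of the last bullet, here is a minimal CVXOPT sketch (not from the chapter; the problem data are made up) that minimizes x1 + x2 over the unit disk, i.e., a tiny second-order cone program:

    from cvxopt import matrix, solvers

    # minimize x1 + x2  subject to  ||(x1, x2)||_2 <= 1
    # CVXOPT's socp() expects each cone constraint as hq[k] - Gq[k]*x lying in
    # the second-order cone {(t, u) : t >= ||u||_2}.
    c = matrix([1.0, 1.0])
    Gq = [matrix([[0.0, -1.0, 0.0], [0.0, 0.0, -1.0]])]  # 3x2, given column by column
    hq = [matrix([1.0, 0.0, 0.0])]                       # so hq - Gq*x = (1, x1, x2)
    sol = solvers.socp(c, Gq=Gq, hq=hq)
    print(sol['x'])  # roughly (-0.707, -0.707)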

  4. Introduction • This chapter considers optimization problems with additive cost functions of the form minimize f(x) = f_1(x) + ... + f_m(x) over x ∈ X, where the number of components m is very large • This motivates incremental methods that operate on a single component f_i at each iteration rather than on the entire cost function.

  5. Least Squares and Related Inference Problems • Classical regression: minimize Σ_{i=1}^m (c_i'x - d_i)^2 over x • l1-regularization problem: minimize γ||x||_1 + Σ_{i=1}^m (c_i'x - d_i)^2 over x • Other possibilities include non-quadratic convex loss functions.

  6. Dual Optimization in Separable Problems • Problems with a separable sum structure, possibly defined over a non-convex set Y, have a dual whose cost is again a sum of many components, so incremental methods apply to the dual.

  7. Weber Problem in Location Theory • Find a point x whose weighted sum of distances from a given set of points y_1, y_2, ..., y_m is minimized: minimize Σ_{i=1}^m w_i ||x - y_i|| (a sketch follows below).
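
A minimal sketch (mine, not the slides') of an incremental subgradient step for the Weber objective, visiting one anchor point per iteration; the stepsize rule and the data below are illustrative:

    import numpy as np

    def weber_incremental(points, weights, x0, n_iter=3000):
        """Incremental subgradient method for min_x sum_i w_i * ||x - y_i||."""
        x = np.asarray(x0, dtype=float)
        m = len(points)
        for k in range(n_iter):
            i = k % m                          # cyclic pass over the components
            diff = x - points[i]
            norm = np.linalg.norm(diff)
            # a subgradient of w_i * ||x - y_i|| (zero is valid at x = y_i)
            g = weights[i] * diff / norm if norm > 0 else np.zeros_like(x)
            x = x - g / (k + 1)                # diminishing stepsize 1/(k+1)
        return x

    points = np.array([[0.0, 0.0], [4.0, 0.0], [2.0, 3.0]])
    weights = np.array([1.0, 1.0, 1.0])
    print(weber_incremental(points, weights, x0=[1.0, 1.0]))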

  8. Incremental Gradient Methods • Differentiable problems: when the component functions are differentiable we may use incremental gradient methods of the form x_{k+1} = P_X( x_k - α_k ∇f_{i_k}(x_k) ), where i_k is the index of the cost component iterated on at step k • Such methods make fast progress when far from convergence but are slow when close to convergence • Fixes: use a diminishing stepsize, or a constant stepsize reduced to a small positive value (see the sketch below).
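
A minimal sketch of the incremental gradient iteration x_{k+1} = x_k - α_k ∇f_{i_k}(x_k) on quadratic components f_i(x) = 0.5*(c_i'x - d_i)^2; the synthetic data and the stepsize schedule are illustrative choices, not the chapter's:

    import numpy as np

    def incremental_gradient(C, d, x0, n_epochs=50, alpha0=0.1):
        """Incremental gradient method for f(x) = sum_i 0.5*(c_i'x - d_i)^2."""
        x = np.asarray(x0, dtype=float)
        m = C.shape[0]
        k = 0
        for _ in range(n_epochs):
            for i in range(m):                     # one component per inner step
                grad_i = (C[i] @ x - d[i]) * C[i]  # gradient of the i-th component
                x = x - (alpha0 / (1 + k / m)) * grad_i  # slowly diminishing stepsize
                k += 1
        return x

    rng = np.random.default_rng(0)
    C = rng.standard_normal((100, 3))
    d = C @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.standard_normal(100)
    print(incremental_gradient(C, d, x0=np.zeros(3)))  # close to (1, -2, 0.5)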

  9. Variants of the Incremental Gradient Method • Incremental gradient method with momentum • Incremental aggregated component gradient method • Incremental gradient methods are also closely related to the stochastic gradient method (a sketch of the aggregated-gradient idea follows).
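
Here is a hedged sketch of the aggregated component gradient idea (keep the most recently computed gradient of every component and step along their running sum); the function and names are my illustration of the idea, not the chapter's exact algorithm:

    import numpy as np

    def aggregated_gradient(grad_i, x0, m, n_iter=2000, alpha=0.1):
        """Aggregated gradient sketch; grad_i(i, x) returns the gradient of f_i at x."""
        x = np.asarray(x0, dtype=float)
        memory = np.zeros((m, x.size))      # last seen gradient of each component
        total = np.zeros(x.size)            # running sum of the stored gradients
        for k in range(n_iter):
            i = k % m
            g_new = grad_i(i, x)
            total += g_new - memory[i]      # swap in the fresh gradient of component i
            memory[i] = g_new
            x = x - (alpha / m) * total     # step along the average stored gradient
        return x

    rng = np.random.default_rng(1)
    C = rng.standard_normal((50, 2))
    d = C @ np.array([2.0, -1.0])
    print(aggregated_gradient(lambda i, x: (C[i] @ x - d[i]) * C[i], np.zeros(2), m=50))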

  10. Incremental Subgradient Methods • For cases where the component functions are convex but non-differentiable • In place of the gradient, an arbitrary subgradient of the selected component is used • Convexity of the f_i(x) is essential • Even non-incremental subgradient methods only achieve a sublinear rate of convergence, hence the cheaper incremental methods are favored (example below).
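
A minimal sketch of an incremental subgradient step for non-differentiable components, using f_i(x) = |c_i'x - d_i| purely as an illustrative choice:

    import numpy as np

    def incremental_subgradient(C, d, x0, n_epochs=100):
        """Incremental subgradient method for f(x) = sum_i |c_i'x - d_i|."""
        x = np.asarray(x0, dtype=float)
        m = C.shape[0]
        k = 0
        for _ in range(n_epochs):
            for i in range(m):
                r = C[i] @ x - d[i]
                g = np.sign(r) * C[i]            # a subgradient (0 is valid at r = 0)
                x = x - g / (k + 1)              # diminishing stepsize 1/(k+1)
                k += 1
        return x

    rng = np.random.default_rng(2)
    C = rng.standard_normal((200, 2))
    d = C @ np.array([1.0, 3.0])
    print(incremental_subgradient(C, d, x0=np.zeros(2)))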

  11. Incremental Proximal Methods • These methods use iterations of the form x_{k+1} = argmin_{x ∈ X} { f_{i_k}(x) + (1/(2α_k)) ||x - x_k||^2 } • This form is desirable because, for some components, the proximal iteration can be obtained in closed form • Proximal iterations are considered more stable than gradient or subgradient iterations (a closed-form example follows).
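
As an example of a closed-form proximal iteration, here is a sketch for a single quadratic component f_i(x) = 0.5*(c'x - d)^2; the formula follows from the optimality condition of the argmin, and the helper name is mine:

    import numpy as np

    def prox_quadratic(x, c, d, alpha):
        """Closed-form proximal step for f_i(x) = 0.5*(c'x - d)^2:
        argmin_z 0.5*(c'z - d)^2 + (1/(2*alpha)) * ||z - x||^2."""
        return x - alpha * (c @ x - d) / (1.0 + alpha * (c @ c)) * c

    # one proximal iteration x_{k+1} = prox_{alpha * f_{i_k}}(x_k)
    x = np.zeros(3)
    c, d, alpha = np.array([1.0, 2.0, -1.0]), 4.0, 0.5
    print(prox_quadratic(x, c, d, alpha))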

  12. Incremental Subgradient-Proximal Methods • These are incremental algorithms for costs with components of the form f_i(x) + h_i(x), combining a proximal iteration on f_i with a subgradient iteration on h_i.

  13. • The basic combined iteration is a proximal step on f_{i_k}, z_k = argmin_{x ∈ X} { f_{i_k}(x) + (1/(2α_k)) ||x - x_k||^2 }, followed by a subgradient step on h_{i_k}, x_{k+1} = P_X( z_k - α_k g_k ) with g_k ∈ ∂h_{i_k}(z_k) • Requiring both z_k and x_k to lie in the constraint set X can be relaxed in either the proximal or the subgradient iteration, which leads to easier computation, so the iterations above can be rewritten with the projection in only one of the two steps • Incremental proximal iterations are closely related to subgradient iterations: the proximal step can itself be written as z_k = x_k - α_k g̃_k for some subgradient g̃_k of f_{i_k} at z_k, so the two steps above can be combined into a single update (see the sketch below).
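
A hedged sketch of the combined iteration described above: a proximal step on f_{i_k} followed by a projected subgradient step on h_{i_k}; prox_f, subgrad_h, and project_X are caller-supplied callables whose names are mine:

    import numpy as np

    def incremental_prox_subgrad(prox_f, subgrad_h, project_X, x0, m, n_iter=1000):
        """x_{k+1} = P_X(z_k - alpha_k * g_k), where z_k is the proximal point of
        alpha_k * f_{i_k} at x_k and g_k is a subgradient of h_{i_k} at z_k."""
        x = np.asarray(x0, dtype=float)
        for k in range(n_iter):
            i = k % m                      # cyclic order; uniform sampling also works
            alpha = 1.0 / (k + 1)          # diminishing stepsize
            z = prox_f(i, x, alpha)        # proximal (implicit) step on f_i
            g = subgrad_h(i, z)            # subgradient of h_i at z
            x = project_X(z - alpha * g)   # subgradient step followed by projection
        return x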

  14. Order of Components • The effectiveness of the incremental subgradient-proximal method depends on the order in which the pairs {f_i, h_i} are chosen • 1) Cyclic order: the pairs {f_i, h_i} are taken in a fixed deterministic order • 2) Randomized order based on uniform sampling: at each iteration the pair {f_i, h_i} is chosen uniformly at random • Both orders converge; however, the randomized order is superior to the cyclic order (see the index-selection sketch below).
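
A small sketch contrasting the two orders of component selection, written as index generators (illustrative code, not from the chapter):

    import random

    def cyclic_order(m):
        """Fixed deterministic order: 0, 1, ..., m-1, 0, 1, ..."""
        k = 0
        while True:
            yield k % m
            k += 1

    def randomized_order(m, seed=0):
        """Each index drawn independently and uniformly at random from {0, ..., m-1}."""
        rng = random.Random(seed)
        while True:
            yield rng.randrange(m)

    gen = randomized_order(5)
    print([next(gen) for _ in range(10)])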

  15. Applications: Regularized Least Squares • Consider a problem of the form minimize R(x) + (1/2) Σ_{i=1}^m (c_i'x - d_i)^2 over x • where R(x) = γ||x||_1 is an l1-norm regularizer • The proximal iteration on the R(x) component then becomes the coordinate-wise shrinkage (soft-thresholding) operation.

  16. Applications: Regularized Least Squares (continued) • The cost decomposes into the regularizer R(x) and the m quadratic components (1/2)(c_i'x - d_i)^2 • Incremental algorithms are well suited to such problems because the proximal update on the regularizer can be done in closed form (soft-thresholding), followed by a gradient iteration on the selected quadratic component (sketch below).
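
A minimal sketch of one incremental pass for the l1-regularized least squares problem: a closed-form soft-thresholding update for the regularizer followed by a gradient step on the selected quadratic component; splitting γ evenly across the m components and the stepsize schedule are illustrative choices of mine:

    import numpy as np

    def soft_threshold(x, tau):
        """Closed-form prox of tau * ||x||_1: coordinate-wise shrinkage."""
        return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

    def incremental_prox_grad_lasso(C, d, gamma, x0, n_epochs=50, alpha0=0.05):
        """Incremental prox-gradient sketch for
        min_x gamma * ||x||_1 + 0.5 * sum_i (c_i'x - d_i)^2."""
        x = np.asarray(x0, dtype=float)
        m = C.shape[0]
        k = 0
        for _ in range(n_epochs):
            for i in range(m):
                alpha = alpha0 / (1 + k / m)
                z = soft_threshold(x, alpha * gamma / m)   # prox on (gamma/m)*||x||_1
                x = z - alpha * (C[i] @ z - d[i]) * C[i]   # gradient step on component i
                k += 1
        return x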

  17. Iterated Projection Algorithm for the Feasibility Problem • The feasibility problem has the form: minimize f(x) subject to x ∈ X_1 ∩ X_2 ∩ ... ∩ X_m • For Lipschitz continuous f and sufficiently large penalty parameter γ, it can be rewritten as minimize f(x) + γ Σ_{i=1}^m dist(x; X_i), to which incremental algorithms apply (sketch below).
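
A minimal sketch of the incremental projection idea behind this slide, restricted to halfspaces X_i = {x : a_i'x <= b_i} because their projections have a closed form; handling the f(x) term and the penalized distance form would add a gradient step, which is omitted here:

    import numpy as np

    def incremental_projections(A, b, x0, n_epochs=100):
        """Find a point in the intersection of halfspaces {x : a_i'x <= b_i}
        by cycling through them and projecting onto one set per step."""
        x = np.asarray(x0, dtype=float)
        m = A.shape[0]
        for _ in range(n_epochs):
            for i in range(m):
                viol = A[i] @ x - b[i]
                if viol > 0:                              # x lies outside halfspace i
                    x = x - viol * A[i] / (A[i] @ A[i])   # closed-form projection
        return x

    A = np.array([[1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
    b = np.array([1.0, 0.0, 0.0])
    print(incremental_projections(A, b, x0=np.array([3.0, 3.0])))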
