Incremental Gradient, Subgradient, and Proximal Methods for Convex Optimization: A Survey (PowerPoint presentation)


SLIDE 1

Incremental Gradient, Subgradient, and Proximal Methods for Convex Optimization: A Survey

Chapter 4 : Optimization for Machine Learning

SLIDE 2

Summary of Chapter 2

  • Chapter 2: Convex Optimization with Sparsity-Inducing Norms
  • That chapter covers convex optimization problems of the form

    min_x f(x) + λ Ω(x)

  • where f is a convex differentiable function and Ω is a sparsity-inducing non-smooth norm
  • Examples of Ω: the l1 norm, l1 + l1/lq, and hierarchical l1/lq norms
  • Algorithms covered: subgradient, block coordinate descent, reweighted-l2, and others
SLIDE 3

Summary of Chapter 3

  • This chapter is on cone linear and quadratic programming of the form

    min_x c'x  subject to  Ax ⪯_C b

  • where ⪯_C is the generalized inequality induced by a closed pointed convex cone C (u ⪯_C v iff v - u ∈ C)
  • Examples of cones: 1) the non-negative orthant 2) the second-order cone
  • The Python package CVXOPT can solve conic problems

SLIDE 4

Introduction

  • This chapter considers optimization problems with additive cost functions of the form

    min_x F(x) = f_1(x) + f_2(x) + … + f_m(x)

  • where m is very large
  • It is therefore attractive to use incremental methods that operate on a single component f_i at each iteration, rather than on the entire cost function
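As a minimal pure-Python sketch (with hypothetical data), an incremental method repeatedly cycles through the components of F(x) = (1/2) Σ_i (x - a_i)², updating x using the gradient of a single component at a time:

```python
# Hypothetical example: m = 4 quadratic components, minimizer = mean(a) = 2.5.
a = [1.0, 2.0, 3.0, 4.0]   # data defining the components f_i(x) = (1/2)(x - a_i)^2
x = 0.0                    # initial iterate
alpha = 0.1                # constant stepsize

for epoch in range(200):   # repeated cycles through the m components
    for a_i in a:          # each update touches only a single component
        x -= alpha * (x - a_i)   # gradient step on f_i alone
```

With a constant stepsize the iterates settle within O(alpha) of the minimizer 2.5 rather than converging to it exactly, matching the behavior described on the later slides.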
SLIDE 5

Least Squares and Related Inference Problems

  • Classical regression: min_x Σ_{i=1}^m (c_i'x - d_i)²
  • l1-regularized problem: min_x Σ_{i=1}^m (c_i'x - d_i)² + λ ||x||_1

Other possibilities include using non-quadratic convex loss functions

SLIDE 6

Dual Optimization in Separable Problems

  • Problems of separable form, min Σ_i f_i(y_i) subject to coupling constraints,
  • even over a non-convex set Y, have a dual function that is itself a sum of components, so incremental methods apply to the dual
SLIDE 7

Weber Problem in Location Theory

  • Find a point x whose weighted sum of distances from a given set of points y_1, y_2, …, y_m is minimized:

    min_x Σ_{i=1}^m w_i ||x - y_i||
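A classical method for the Weber problem (not shown on this slide) is the Weiszfeld iteration; a pure-Python sketch with hypothetical points and unit weights:

```python
import math

# Hypothetical data: four corner points with equal weights; by symmetry the
# weighted geometric median is the centroid (2, 2).
ys = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]
ws = [1.0, 1.0, 1.0, 1.0]
x = (1.0, 1.0)   # initial guess; must not coincide with any data point

for _ in range(100):
    num, den = [0.0, 0.0], 0.0
    for (px, py), w in zip(ys, ws):
        d = math.hypot(x[0] - px, x[1] - py)   # distance ||x - y_i||
        num[0] += w * px / d
        num[1] += w * py / d
        den += w / d
    x = (num[0] / den, num[1] / den)           # Weiszfeld update
```

Each update is a weighted average of the points with weights w_i / ||x - y_i||, which is exactly the fixed-point condition for minimizing the weighted sum of distances.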

SLIDE 8

Incremental Gradient Methods

  • Differentiable problems
  • When the component functions f_i are differentiable, we may use incremental gradient methods of the form

    x_{k+1} = x_k - α_k ∇f_{i_k}(x_k)

  • where i_k is the index of the cost component iterated on at step k

Such methods make fast progress when far from the solution but progress slowly (or oscillate) close to it. Fixes: use a diminishing stepsize α_k → 0, or a constant stepsize reduced to a small positive value.
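A pure-Python sketch of the diminishing-stepsize fix on hypothetical quadratic components f_i(x) = (1/2)(x - a_i)²:

```python
# Hypothetical data: f_i(x) = (1/2)(x - a_i)^2, true minimizer mean(a) = 2.5.
a = [1.0, 2.0, 3.0, 4.0]
x = 0.0
k = 0
for epoch in range(2000):
    for a_i in a:
        k += 1
        x -= (1.0 / k) * (x - a_i)   # incremental step, diminishing stepsize 1/k
```

With α_k = 1/k this recursion is exactly a running average of the visited a_i, so x converges to the minimizer 2.5; with a constant stepsize it would only settle within a band around it.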

SLIDE 9

Variants of the incremental gradient method

  • Gradient method with momentum
  • Aggregate component gradient
  • Incremental gradient methods are also closely related to the stochastic gradient method

SLIDE 10

Incremental Sub-gradient Methods

  • For cases where the component functions are convex but non-differentiable
  • In place of the gradient, an arbitrary subgradient is used:

    x_{k+1} = P_X( x_k - α_k g_k ),  g_k ∈ ∂f_{i_k}(x_k)

  • Convexity of the f_i is essential
  • Even non-incremental subgradient methods converge at a sublinear rate, hence incremental methods are favored
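A pure-Python sketch on hypothetical data: f(x) = Σ_i |x - a_i| is convex but nondifferentiable at each a_i, and its minimizer is a median of the data. Each update uses the subgradient sign(x - a_i) of a single component; the initial stepsize 5.0 is hand-tuned for this data:

```python
# Hypothetical data: minimize f(x) = sum_i |x - a_i|; the median 5.0 is optimal.
a = [1.0, 2.0, 5.0, 9.0, 10.0]
x = 0.0
k = 0
for epoch in range(2000):
    for a_i in a:
        k += 1
        g = (x > a_i) - (x < a_i)    # a subgradient of |x - a_i| (0 at the kink)
        x -= (5.0 / k) * g           # diminishing stepsize alpha_k = 5/k
```

Any value in [-1, 1] is a valid subgradient at the kink x = a_i; choosing 0 there is one convenient option.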

SLIDE 11

Incremental Proximal Methods

  • These methods apply to problems of the same additive form, using the proximal iteration

    x_{k+1} = argmin_{x∈X} { f_{i_k}(x) + (1/(2α_k)) ||x - x_k||² }

This form is desirable because, for some components, the proximal iteration can be obtained in closed form. Proximal iterations are considered more stable than gradient or subgradient iterations.
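For a quadratic component the proximal iteration has a simple closed form. A pure-Python sketch on hypothetical data: for f_i(x) = (1/2)(x - a_i)², solving argmin_x { f_i(x) + (1/(2α))(x - x_k)² } gives x_{k+1} = (x_k + α a_i)/(1 + α):

```python
# Hypothetical data: components f_i(x) = (1/2)(x - a_i)^2, minimizer mean(a) = 2.5.
a = [1.0, 2.0, 3.0, 4.0]
x = 0.0
k = 0
for epoch in range(2000):
    for a_i in a:
        k += 1
        alpha = 1.0 / k
        x = (x + alpha * a_i) / (1.0 + alpha)   # closed-form proximal step
```

Unlike a gradient step, this proximal step is a weighted average of x and a_i, hence stable for every α > 0, illustrating why proximal iterations are considered more robust.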

SLIDE 12

Incremental Subgradient-Proximal methods

  • These methods are incremental algorithms that combine a proximal iteration with a subgradient iteration, for costs of the form min_{x∈X} Σ_i ( f_i(x) + h_i(x) )

SLIDE 13
  • Both z_k and x_k are constrained to X; the constraint can be relaxed for either the proximal or the subgradient iteration, which leads to easier computation
  • So the iterations on the previous slide can be rewritten in the two-step form

    z_k = argmin_{x∈X} { h_{i_k}(x) + (1/(2α_k)) ||x - x_k||² }
    x_{k+1} = P_X( z_k - α_k g_k ),  g_k ∈ ∂f_{i_k}(z_k)

  • or with the roles of the proximal and subgradient steps interchanged

Incremental proximal iterations are closely related to subgradient iterations, so the two steps above can also be combined into a single step.
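A pure-Python sketch of the two-step iteration on a hypothetical scalar problem: minimize Σ_i [ (1/2)(x - a_i)² + (λ/m)|x| ], handling each quadratic h_i via its closed-form proximal step and each |x| term f_i via a subgradient step. For this data the optimum is mean(a) - λ/m = 2.0:

```python
# Hypothetical data: h_i(x) = (1/2)(x - a_i)^2 (proximal step, closed form),
# f_i(x) = (lam/m)*|x| (subgradient step); optimum = mean(a) - lam/m = 2.0.
a = [1.0, 2.0, 3.0, 4.0]
lam = 2.0
m = len(a)
x = 0.0
k = 0
for epoch in range(2000):
    for a_i in a:
        k += 1
        alpha = 1.0 / k
        z = (x + alpha * a_i) / (1.0 + alpha)      # proximal step on h_i
        g = (lam / m) * ((z > 0) - (z < 0))        # a subgradient of f_i at z
        x = z - alpha * g                          # subgradient step on f_i
```

Here the constraint set X is all of R, so the projection P_X is the identity.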

SLIDE 14

Order of components

  • The incremental subgradient-proximal method’s effectiveness depends
  • on the order in which the pairs {f_i, h_i} are chosen:
  • 1) Cyclic: the pairs {f_i, h_i} are taken in a fixed deterministic order
  • 2) Randomized (uniform sampling): at each iteration a pair {f_i, h_i} is chosen uniformly at random
  • Both orders converge; however, the randomized order has superior convergence guarantees to the cyclic order
SLIDE 15

Applications: Regularized least squares

  • Let’s consider a problem of the form

    min_x (1/2) Σ_{i=1}^m (c_i'x - d_i)² + R(x)

  • where R(x) is an l1 norm, R(x) = λ ||x||_1
  • Then the proximal iteration with respect to R reduces to the componentwise soft-thresholding (shrinkage) operation
SLIDE 16

Applications: Regularized least squares

  • The iteration decomposes into a proximal (soft-thresholding) step on the l1 term, which can be done in closed form,
  • followed by a gradient iteration on the least-squares term

Incremental algorithms are well suited to such problems because the proximal updates are available in closed form.
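A pure-Python scalar sketch of this decomposition on hypothetical data: an incremental gradient step on each least-squares term followed by the closed-form soft-thresholding (proximal) step on the l1 term. For this data the optimum is mean(b) - λ/m = 4.0:

```python
def soft_threshold(v, t):
    """Proximal operator of t*|.|: argmin_x { t*|x| + (1/2)*(x - v)**2 }."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

# Hypothetical data: minimize (1/2)*sum_i (x - b_i)^2 + lam*|x|; optimum 4.0.
b = [2.0, 4.0, 6.0, 8.0]
lam = 4.0
m = len(b)
x = 0.0
k = 0
for epoch in range(2000):
    for b_i in b:
        k += 1
        alpha = 1.0 / k
        z = x - alpha * (x - b_i)                 # gradient step on (1/2)(x - b_i)^2
        x = soft_threshold(z, alpha * lam / m)    # proximal step on (lam/m)*|x|
```

In the multidimensional case the same soft-thresholding is applied to each coordinate independently, which is what makes the l1 proximal step so cheap.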

SLIDE 17

Iterated Projection Algorithm for the Feasibility Problem

  • The feasibility problem has the form: find x ∈ X_1 ∩ X_2 ∩ … ∩ X_m, possibly while minimizing a cost f(x)
  • This can be rewritten, for Lipschitz continuous f and sufficiently large penalty γ, as

    min_x f(x) + γ Σ_{i=1}^m dist(x, X_i)

to which incremental algorithms apply, each iteration operating on the distance to a single set X_i.
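A pure-Python sketch of iterated (alternating) projections on two hypothetical sets, the disk ||x|| ≤ 2 and the halfspace x_1 ≥ 1: each iteration projects onto a single set, mirroring the incremental structure in which each dist(·, X_i) term is handled by one projection:

```python
import math

def project_disk(p, r=2.0):
    """Euclidean projection onto the disk of radius r centered at the origin."""
    n = math.hypot(p[0], p[1])
    return p if n <= r else (p[0] * r / n, p[1] * r / n)

def project_halfspace(p):
    """Euclidean projection onto the halfspace x1 >= 1."""
    return (max(p[0], 1.0), p[1])

x = (-3.0, 2.0)                              # infeasible starting point
for _ in range(100):
    x = project_halfspace(project_disk(x))   # one set per (incremental) step
```

After the loop x lies (to numerical tolerance) in the intersection of the two sets, i.e. it solves this small feasibility problem.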