Convergence of Cubic Regularization for Nonconvex Optimization under Łojasiewicz Property

Feb 04, 2024

Convergence of Cubic Regularization for Nonconvex Optimization under Łojasiewicz Property
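
Cubic-regularization (CR)
Problem: $\min_{x \in \mathbb{R}^d} f(x)$
CR: $x_{k+1} \in \operatorname{argmin}_{y} \; \langle y - x_k, \nabla f(x_k) \rangle + \frac{1}{2} (y - x_k)^\top \nabla^2 f(x_k) (y - x_k) + \frac{M}{6} \|y - x_k\|^3$
• Converges to a 2nd-order stationary point (Nesterov'06)
• Escapes strict-saddle points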
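
The CR update can be sketched in code as follows. This is a minimal illustration, not the authors' implementation: the names `cubic_step` and `cubic_regularization`, the bisection solver for the subproblem, the test function and the value `M = 10.0` are all assumptions made for the example, and the degenerate "hard case" of the subproblem is deliberately ignored.

```python
import numpy as np


def cubic_step(g, H, M, iters=200):
    """Return s minimising <s, g> + 0.5 s^T H s + (M/6) ||s||^3.

    Assumption: the degenerate "hard case" (g orthogonal to the eigenvectors
    of the smallest eigenvalue of H) is not handled in this sketch.
    """
    lam, Q = np.linalg.eigh(H)  # H = Q diag(lam) Q^T, eigenvalues ascending
    g_hat = Q.T @ g             # gradient expressed in the eigenbasis

    def step_norm(r):
        # Norm of s(r) = -(H + (M r / 2) I)^{-1} g.
        return np.linalg.norm(g_hat / (lam + 0.5 * M * r))

    # The global minimiser satisfies ||s(r)|| = r with H + (M r / 2) I PSD.
    lo = max(0.0, -2.0 * lam[0] / M)
    hi = 2.0 * max(lo, 1.0)
    while step_norm(hi) > hi:   # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(iters):      # bisection: ||s(r)|| - r is decreasing in r
        mid = 0.5 * (lo + hi)
        if step_norm(mid) > mid:
            lo = mid
        else:
            hi = mid
    return -Q @ (g_hat / (lam + 0.5 * M * hi))


def cubic_regularization(grad, hess, x0, M, num_iters=100):
    """Iterate x_{k+1} = x_k + cubic_step(grad f(x_k), hess f(x_k), M)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(num_iters):
        x = x + cubic_step(grad(x), hess(x), M)
    return x


# Hypothetical test function with a strict saddle at the origin:
# f(x) = (x_1^2 - 1)^2 / 4 + x_2^2 / 2, with minimisers at (+1, 0) and (-1, 0).
grad = lambda x: np.array([x[0] ** 3 - x[0], x[1]])
hess = lambda x: np.diag([3.0 * x[0] ** 2 - 1.0, 1.0])
x_final = cubic_regularization(grad, hess, x0=[0.01, 0.5], M=10.0)
print(x_final)
```

The starting point is placed close to the saddle on purpose: the negative curvature in the Hessian forces every early step to have a norm of at least $2|\lambda_{\min}|/M$, which is how the iteration leaves the saddle region even though the gradient there is small.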

Motivation and Contribution
• General nonconvex optimization
  - global sublinear convergence (Nesterov'06)
• Nonconvex + local geometry
  - gradient dominance (Nesterov'06) → super-linear convergence
  - error bound (Yue'18) → quadratic convergence
  - limited function class
• Our contributions
  - general Łojasiewicz property
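
Łojasiewicz Property
Definition (Łojasiewicz Property). Let $f$ take a constant value $f^*$ on a compact set $\Omega$. There exist $\varepsilon, \lambda > 0$ such that for all $x \in \{z : \operatorname{dist}(z, \Omega) < \varepsilon,\ f^* < f(z) < f^* + \lambda\}$,
$f(x) - f^* \le \tau \|\nabla f(x)\|^{\theta}$,
where $\tau > 0$ is a constant and $\theta \in (1, +\infty]$ is the Łojasiewicz exponent.
• Satisfied by a large function class: analytic functions, polynomials, exp-log functions, etc.
• ML examples: Lasso, phase retrieval, blind deconvolution, etc.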
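
A small worked example may help to read the exponent. It is not from the slides: it assumes the Łojasiewicz inequality takes the gradient form $f(x) - f^* \le \tau \|\nabla f(x)\|^{\theta}$ near the minimizer, and it uses the hypothetical one-dimensional family $f(x) = |x|^p$ with $p > 1$.

```latex
f(x) = |x|^p, \qquad f^* = 0, \qquad |f'(x)| = p\,|x|^{p-1}
\;\Longrightarrow\;
f(x) - f^* = |x|^p = \Big(\tfrac{|f'(x)|}{p}\Big)^{\frac{p}{p-1}}
\;\Longrightarrow\;
\theta = \frac{p}{p-1}, \qquad \tau = p^{-\frac{p}{p-1}}.
% p = 2: theta = 2;  p = 3: theta = 3/2;  p = 4: theta = 4/3;  p -> 1+: theta -> +infinity
```

Under this assumption, a sharper minimum (small $p$) corresponds to a large exponent $\theta$, and a flatter minimum (large $p$) pushes $\theta$ toward 1.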
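
Convergence to 2nd-order Stationary Point
Łojasiewicz exponent $\theta$ and convergence rate of $\mu(x_k)$, from sharp to flat:
• $\theta = +\infty$ (sharp): $\mu(x_{k_0}) = 0$, finite-step
• $\theta \in (\tfrac{3}{2}, +\infty)$: $\mu(x_k) \le \Theta\big(\exp(-(2(\theta - 1))^{k - k_0})\big)$, super-linear
• $\theta = \tfrac{3}{2}$: $\mu(x_k) \le \Theta\big(\exp(-(k - k_0))\big)$, linear
• $\theta \in (1, \tfrac{3}{2})$ (flat): $\mu(x_k) \le \Theta\big((k - k_0)^{-\frac{2(\theta - 1)}{3 - 2\theta}}\big)$, sub-linear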
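
Convergence of Function Value
Łojasiewicz exponent $\theta$ and convergence rate of $f(x_k) - f^*$:
• $\theta = +\infty$: $f(x_{k_0}) - f^* = 0$
• $\theta \in (\tfrac{3}{2}, +\infty)$: $f(x_k) - f^* \le \Theta\big(\exp(-(\tfrac{2\theta}{3})^{k - k_0})\big)$
• $\theta = \tfrac{3}{2}$: $f(x_k) - f^* \le \Theta\big(\exp(-(k - k_0))\big)$
• $\theta \in (1, \tfrac{3}{2})$: $f(x_k) - f^* \le \Theta\big((k - k_0)^{-\frac{2\theta}{3 - 2\theta}}\big)$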
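
The threshold $\theta = \tfrac{3}{2}$ can be motivated by a short recursion. The following is a sketch under stated assumptions, not the authors' proof: it assumes the gradient-form inequality $f(x) - f^* \le \tau \|\nabla f(x)\|^{\theta}$, and it assumes CR satisfies a cubic decrease bound and a quadratic gradient bound with placeholder constants $c_1, c_2 > 0$ (bounds of this type go back to Nesterov'06).

```latex
% Assumed bounds for CR (c_1, c_2 are placeholder constants):
%   f(x_k) - f(x_{k+1}) \ge c_1 \|x_{k+1} - x_k\|^3,
%   \|\nabla f(x_{k+1})\| \le c_2 \|x_{k+1} - x_k\|^2.
% With r_k := f(x_k) - f^*:
r_{k+1}
\;\le\; \tau \,\|\nabla f(x_{k+1})\|^{\theta}
\;\le\; \tau c_2^{\theta}\, \|x_{k+1} - x_k\|^{2\theta}
\;\le\; \tau c_2^{\theta} c_1^{-\frac{2\theta}{3}}\, (r_k - r_{k+1})^{\frac{2\theta}{3}}.
% 2 theta / 3 > 1 : r_{k+1} \lesssim r_k^{2\theta/3}, super-linear with order 2 theta / 3
% 2 theta / 3 = 1 : r_{k+1} \le \tfrac{C}{1 + C}\, r_k, linear
% 2 theta / 3 < 1 : r_k = O\big(k^{-2\theta/(3 - 2\theta)}\big), sub-linear
```

The exponent $\tfrac{2\theta}{3}$ combines the power 2 from the gradient bound, the power $\theta$ from the Łojasiewicz inequality, and the power 3 from the cubic decrease.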

Convergence of Variable Sequence
Theorem. Assume $f$ satisfies the Łojasiewicz property. Then the sequence $\{x_k\}$ generated by CR is absolutely summable:
$\sum_{k=0}^{\infty} \|x_{k+1} - x_k\| < +\infty$
• Implies Cauchy convergence
• (Nesterov'06): cubic-summable, $\sum_{k=0}^{\infty} \|x_{k+1} - x_k\|^3 < +\infty$
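
Why absolute summability of the steps gives Cauchy convergence, and why cubic summability alone does not, can be seen from the triangle inequality. The step lengths $1/k$ in the comment are a hypothetical counterexample chosen for illustration.

```latex
\|x_m - x_n\|
\;\le\; \sum_{k=n}^{m-1} \|x_{k+1} - x_k\|
\;\le\; \sum_{k=n}^{\infty} \|x_{k+1} - x_k\|
\;\xrightarrow[n \to \infty]{}\; 0
\qquad (m > n).
% Counterexample for cubic summability: steps of length 1/k satisfy
% \sum_k k^{-3} < \infty but \sum_k k^{-1} = \infty, so the iterates may drift forever.
```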
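
Convergence of Variable Sequence
Łojasiewicz exponent $\theta$ and convergence rate of $\|x_k - x^*\|$:
• $\theta = +\infty$: $\|x_{k_0} - x^*\| = 0$
• $\theta \in (\tfrac{3}{2}, +\infty)$: $\|x_k - x^*\| \le \Theta\big(\exp(-[\ldots])\big)$
• $\theta = \tfrac{3}{2}$: $\|x_k - x^*\| \le \Theta\big(\exp(-(k - k_0))\big)$
• $\theta \in (1, \tfrac{3}{2})$: $\|x_k - x^*\| \le \Theta\big((k - k_0)^{-\frac{2(\theta - 1)}{3 - 2\theta}}\big)$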
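
Comparison with First-order Algorithm
Łojasiewicz exponent $\theta$: gradient descent vs. cubic-regularization
• $\theta = +\infty$: gradient descent finite-step; cubic-regularization finite-step
• $\theta \in [2, +\infty)$: gradient descent linear; cubic-regularization super-linear
• $\theta \in [\tfrac{3}{2}, 2)$: gradient descent sub-linear; cubic-regularization super-linear
• $\theta \in (1, \tfrac{3}{2})$: gradient descent sub-linear, $\Theta\big(k^{-\frac{\theta - 1}{2 - \theta}}\big)$; cubic-regularization sub-linear, $\Theta\big(k^{-\frac{2(\theta - 1)}{3 - 2\theta}}\big)$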
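
The flat regime can be checked numerically on a toy problem. This is a hedged sketch, not an experiment from the slides: the test function $f(x) = x^4$, the starting point, the step size `eta = 1/12` and the constant `M = 24` (bounds on $f''$ and $f'''$ over $[-1, 1]$) are all assumptions chosen for the example. Under the gradient-form inequality $f(x) - f^* \le \tau |f'(x)|^{\theta}$, this function has $\theta = \tfrac{4}{3}$, so both methods should be sub-linear, with cubic regularization decaying at a visibly faster polynomial rate.

```python
import math


def gd_step(x, eta):
    """One gradient descent step on f(x) = x^4."""
    return x - eta * 4.0 * x ** 3


def cr_step(x, M):
    """One cubic-regularization step on f(x) = x^4 (closed form in 1D)."""
    g, h = 4.0 * x ** 3, 12.0 * x ** 2
    if g == 0.0:
        return x
    # Positive root t of (M/2) t^2 + h t - |g| = 0; the step is -sign(g) * t.
    t = (-h + math.sqrt(h * h + 2.0 * M * abs(g))) / M
    return x - math.copysign(t, g)


def run(step, x0, num_iters):
    """Return the list of iterates x_0, ..., x_{num_iters}."""
    xs = [x0]
    for _ in range(num_iters):
        xs.append(step(xs[-1]))
    return xs


def loglog_slope(xs, k1, k2):
    """Empirical decay exponent of |x_k| between iterations k1 and k2."""
    return math.log(xs[k2] / xs[k1]) / math.log(k2 / k1)


gd = run(lambda x: gd_step(x, eta=1.0 / 12.0), 1.0, 1000)
cr = run(lambda x: cr_step(x, M=24.0), 1.0, 1000)
slope_gd = loglog_slope(gd, 500, 1000)
slope_cr = loglog_slope(cr, 500, 1000)
print("gradient descent slope:", slope_gd)
print("cubic regularization slope:", slope_cr)
```

The printed slopes estimate the exponent $a$ in $|x_k| \approx k^{a}$. For this toy problem they should come out near $-\tfrac{1}{2}$ for gradient descent and near $-2$ for cubic regularization, so both decay polynomially but at clearly different speeds.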

Come to our poster: Thursday 05:00 PM, Room 210 & 230 AB, #4
Thank You!
