


  1. A Simple Proximal Stochastic Gradient Method for Nonsmooth Nonconvex Optimization. Zhize Li, Jian Li. IIIS, Tsinghua University. https://zhizeli.github.io/ Dec 6th, NeurIPS 2018

  2. Problem Definition. Machine learning problems, such as image classification or voice recognition, are usually modeled as a (nonconvex) optimization problem: $\min_{\theta} L(\theta)$. Goal: find a good enough solution (parameters) $\hat{\theta}$, e.g., $\|\nabla L(\hat{\theta})\|^2 \le \epsilon$. Zhize Li (Tsinghua) A Simple ProxSVRG+ Algorithm 2/7
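The stopping criterion on this slide can be illustrated with a minimal sketch. The toy loss $L(\theta) = \theta^4/4 - \theta^2/2$, the step size, and the function names below are assumptions chosen for illustration and do not come from the talk; plain gradient descent stands in for whatever optimizer is used in practice.

```python
def grad_L(theta):
    # Gradient of the toy nonconvex loss L(theta) = theta**4 / 4 - theta**2 / 2
    # (hypothetical example, not from the slides).
    return theta**3 - theta


def find_stationary_point(theta0, step=0.1, eps=1e-8, max_iters=10_000):
    # Plain gradient descent that stops once ||grad L(theta)||^2 <= eps.
    theta = theta0
    for _ in range(max_iters):
        g = grad_L(theta)
        if g * g <= eps:
            break
        theta -= step * g
    return theta


theta_hat = find_stationary_point(theta0=2.0)
print(theta_hat, grad_L(theta_hat) ** 2)
```

The returned point is only guaranteed to be approximately stationary, which is the usual notion of a "good enough" solution when the objective is nonconvex.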

  4. Problem Definition. We consider the more general nonsmooth nonconvex case: $\min_x \Phi(x) := f(x) + h(x) = \frac{1}{n} \sum_{i=1}^{n} f_i(x) + h(x)$, where $f(x)$ and all $f_i(x)$ are possibly nonconvex (loss on data samples), and $h(x)$ is nonsmooth but convex (e.g., the $\ell_1$ regularizer $\|x\|_1$ or the indicator function $I_C(x)$ for some convex set $C$). Benefit of $h(x)$: it lets us handle nonsmooth and constrained problems.
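Proximal methods handle the nonsmooth term through its proximal operator. As a minimal sketch of the two examples named on this slide, the snippet below gives the operator for the $\ell_1$ regularizer (soft-thresholding) and for the indicator function of a convex set, which reduces to a projection. The box-shaped set and the names `prox_l1`, `prox_box`, `lam`, and `eta` are assumptions made for illustration, not notation from the talk.

```python
import math


def prox_l1(x, lam, eta):
    # Proximal operator of eta * lam * ||x||_1: componentwise soft-thresholding.
    t = eta * lam
    return [math.copysign(max(abs(v) - t, 0.0), v) for v in x]


def prox_box(x, lo, hi):
    # Proximal operator of the indicator function of the box C = [lo, hi]^d,
    # which is the Euclidean projection onto C (the step size plays no role).
    return [min(max(v, lo), hi) for v in x]


print(prox_l1([3.0, -0.5, 1.0], lam=0.5, eta=2.0))
print(prox_box([2.0, -3.0, 0.5], lo=-1.0, hi=1.0))
```

Soft-thresholding sets small coordinates exactly to zero, which is why an $\ell_1$ term promotes sparsity, while the projection enforces the constraint $x \in C$ after every step.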

  7. Our Results. We propose a simple ProxSVRG+ algorithm, which recovers or improves several previous results (e.g., ProxGD, ProxSVRG/SAGA, SCSG). Benefits: simpler algorithm, simpler analysis, better theoretical results, and more attractive in practice: it prefers a moderate minibatch size and automatically adapts to local curvature, i.e., it switches to a faster linear convergence rate $O(\cdot \log(1/\epsilon))$ in such regions even though the objective function is nonconvex in general.
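The slides do not spell out the update rule, so the following is only a generic proximal SVRG-style loop, written as a sketch under stated assumptions rather than as the authors' exact ProxSVRG+ procedure. It assumes a snapshot gradient recomputed once per epoch, a variance-reduced minibatch gradient, and a proximal step for $h(x) = \lambda \|x\|_1$. The toy data, the least-squares loss (convex, chosen only to keep the example short), and all names and parameter values are hypothetical.

```python
import math
import random


def soft_threshold(x, t):
    # Proximal operator of t * ||x||_1.
    return [math.copysign(max(abs(v) - t, 0.0), v) for v in x]


def mean_grad(x, A, Y, idx):
    # Average gradient of f_i(x) = 0.5 * (a_i . x - y_i)^2 over the indices in idx.
    g = [0.0] * len(x)
    for i in idx:
        r = sum(a * v for a, v in zip(A[i], x)) - Y[i]
        for k, a in enumerate(A[i]):
            g[k] += r * a / len(idx)
    return g


def objective(x, A, Y, lam):
    # Phi(x) = (1/n) * sum_i f_i(x) + lam * ||x||_1
    loss = sum(0.5 * (sum(a * v for a, v in zip(row, x)) - y) ** 2
               for row, y in zip(A, Y)) / len(A)
    return loss + lam * sum(abs(v) for v in x)


def prox_svrg_sketch(A, Y, lam=0.01, eta=0.05, epochs=30, inner=10, batch=2, seed=0):
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    x = [0.0] * d
    for _ in range(epochs):
        snapshot = list(x)
        g_snap = mean_grad(snapshot, A, Y, range(n))  # gradient at the snapshot
        for _ in range(inner):
            idx = [rng.randrange(n) for _ in range(batch)]
            g_x = mean_grad(x, A, Y, idx)
            g_s = mean_grad(snapshot, A, Y, idx)
            # Variance-reduced gradient estimate.
            v = [a - b + c for a, b, c in zip(g_x, g_s, g_snap)]
            # Proximal gradient step on the nonsmooth term.
            x = soft_threshold([xi - eta * vi for xi, vi in zip(x, v)], eta * lam)
    return x


# Hypothetical toy data generated from x = [1, -2].
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
Y = [1.0, -2.0, -1.0, 3.0]
x_hat = prox_svrg_sketch(A, Y)
print(x_hat, objective(x_hat, A, Y, 0.01))
```

The `batch` parameter is where the minibatch-size preference mentioned on the slide would enter; the value used here is arbitrary.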

  10. Theoretical Results. Our ProxSVRG+ prefers a moderate minibatch size (red box), which is not too small for parallelism or vectorization and not too large for better generalization, and uses fewer proximal oracle (PO) calls than ProxSVRG. Recently, [Zhou et al., 2018] and [Fang et al., 2018] improved the stochastic first-order oracle (SFO) complexity to $O(n^{1/2}/\epsilon)$ in the smooth setting.

  11. Experimental Results. Our ProxSVRG+ prefers a much smaller minibatch size than ProxSVRG [Reddi et al., 2016], and performs much better than ProxGD and ProxSGD [Ghadimi et al., 2016].

  12. Thanks! Our poster: 5:00-7:00 PM, Room 210, #5.
