Pathwise Coordinate Optimization for Nonconvex Sparse Learning
Tuo Zhao http://www.princeton.edu/~tuoz
Department of Computer Science Johns Hopkins University
Mar. 25, 2015
A General Theory of Pathwise Coordinate Optimization
Tuo Zhao http://www.princeton.edu/~tuoz — Pathwise Coordinate Optimization for Nonconvex Sparse Learning 2/45
[Figure: two panels comparing the SCAD, MCP, and ℓ1 penalties as functions of θj over the range −3 to 3. One panel plots the regularizer rλ(θj) (vertical range 0.0 to 3.0); the other plots hλ(θj) (vertical range −2.5 to 0.0).]
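The three penalties compared in the figure can be written down directly. The sketch below uses the standard textbook forms of the ℓ1, MCP, and SCAD penalties. The slides do not give the formulas or the parameter values used in the plot, so the concavity parameter `gamma`, its defaults, and the reading of hλ as the penalty minus its ℓ1 part, hλ(θj) = rλ(θj) − λ|θj|, are assumptions made for illustration.

```python
import numpy as np

def l1_penalty(theta, lam):
    """Lasso penalty: lam * |theta|."""
    return lam * np.abs(theta)

def mcp_penalty(theta, lam, gamma=2.0):
    """MCP: quadratic taper up to |theta| = gamma*lam, constant beyond it."""
    a = np.abs(theta)
    return np.where(a <= gamma * lam,
                    lam * a - a ** 2 / (2.0 * gamma),
                    0.5 * gamma * lam ** 2)

def scad_penalty(theta, lam, gamma=3.0):
    """SCAD: l1 near zero, quadratic transition, constant for large |theta|."""
    a = np.abs(theta)
    middle = (2.0 * gamma * lam * a - a ** 2 - lam ** 2) / (2.0 * (gamma - 1.0))
    flat = 0.5 * (gamma + 1.0) * lam ** 2
    return np.where(a <= lam, lam * a, np.where(a <= gamma * lam, middle, flat))

def concave_part(penalty, theta, lam, **kwargs):
    """Assumed definition of h: the penalty minus its l1 part, lam * |theta|."""
    return penalty(theta, lam, **kwargs) - lam * np.abs(theta)

# Evaluate both quantities on the same grid as the figure's horizontal axis.
grid = np.linspace(-3.0, 3.0, 13)
r_mcp = mcp_penalty(grid, lam=1.0)
h_mcp = concave_part(mcp_penalty, grid, lam=1.0)
```

Under these definitions the ℓ1 penalty has hλ identically zero, while MCP and SCAD have hλ ≤ 0, which is consistent with the negative vertical range of the second panel.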
[Figure: flowchart of the algorithm. Labeled elements: initial solution, regularization parameter initialization, warm start initialization, active set identification, active coordinate minimization (inner loop, coordinate updating), convergence checks, active set, output solution.]
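The stages named in the flowchart can be sketched as three nested loops. The code below is a minimal illustration, not the authors' implementation: it assumes a least-squares loss with the MCP penalty, columns of `X` scaled so that each has mean square 1, a geometric sequence of regularization parameters, and a greedy rule for adding coordinates to the active set. The function names, defaults, and stopping tolerances are all hypothetical.

```python
import numpy as np

def soft_threshold(z, lam):
    return np.sign(z) * max(abs(z) - lam, 0.0)

def mcp_update(z, lam, gamma):
    """Minimizer of 0.5*(t - z)**2 + MCP(t; lam, gamma), valid for gamma > 1."""
    if abs(z) <= gamma * lam:
        return soft_threshold(z, lam) / (1.0 - 1.0 / gamma)
    return z

def pathwise_cd(X, y, n_lambdas=20, ratio=0.1, gamma=3.0, tol=1e-6, max_sweeps=1000):
    n, d = X.shape
    theta = np.zeros(d)              # initial solution
    resid = y.astype(float).copy()   # residual y - X @ theta
    # Regularization parameter initialization: smallest lam with theta = 0 optimal.
    lam_max = np.max(np.abs(X.T @ y)) / n
    lambdas = lam_max * ratio ** (np.arange(1, n_lambdas + 1) / n_lambdas)
    path = []
    for lam in lambdas:
        # Warm start: theta and its support carry over from the previous lam.
        active = set(np.flatnonzero(theta).tolist())
        while True:
            # Inner loop: cyclic coordinate minimization over the active set.
            for _ in range(max_sweeps):
                max_change = 0.0
                for j in sorted(active):
                    z = X[:, j] @ resid / n + theta[j]
                    new = mcp_update(z, lam, gamma)
                    delta = new - theta[j]
                    if delta != 0.0:
                        resid -= delta * X[:, j]
                        theta[j] = new
                        max_change = max(max_change, abs(delta))
                if max_change < tol:
                    break
            # Active set identification (greedy rule, an assumption): add the
            # inactive coordinate with the largest gradient if it exceeds lam.
            grad = np.abs(X.T @ resid) / n
            for j in active:
                grad[j] = 0.0
            j_new = int(np.argmax(grad))
            if grad[j_new] <= lam:
                break                # no violation left: converged for this lam
            active.add(j_new)
        path.append(theta.copy())
    return lambdas, np.array(path)

# Small synthetic usage example with three relevant coordinates.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
X = (X - X.mean(axis=0)) / X.std(axis=0)     # each column has mean square 1
theta_star = np.zeros(50)
theta_star[:3] = [2.0, -1.5, 1.0]
y = X @ theta_star + 0.1 * rng.standard_normal(200)
lambdas, path = pathwise_cd(X, y)            # path[k] is the solution at lambdas[k]
```

Each coordinate update only touches one column of `X` and the running residual, which is why restricting the inner loop to a small active set keeps the per-sweep cost low when the solutions are sparse.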
[Figure: the sequence of solutions θ̂{0}, θ̂{1}, ..., θ̂{K−2}, θ̂{K−1}, θ̂{K}, θ̂{K+1}, ..., θ̂{N} along the regularization path toward θ*, drawn with regions Cλ1, ..., CλK−1, CλK, CλK+1, ..., CλN, each labeled as the basin of attraction for the corresponding parameter λ1, ..., λK−1, λK, λK+1, ..., λN.]
j = 0} and yields highly sparse solutions.
[Figure: four panels showing the relevant blocks and the blocks added by each active set selection rule.]

Selection rule                Outcome
Cyclic search                 Failure
Greedy selection              Success
Randomized selection          Success
Truncated cyclic selection    Success
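The slides name the greedy, randomized, and truncated cyclic rules without defining them here, so the sketch below is only one plausible reading of the names, written for single coordinates rather than blocks. It assumes each rule picks one coordinate outside the current active set whose gradient magnitude reaches a threshold `thresh`. The threshold, the helper `violators`, and all function names are hypothetical.

```python
import numpy as np

def violators(grad, active, thresh):
    """Indices outside `active` whose gradient magnitude is at least `thresh`."""
    return [j for j in range(len(grad))
            if j not in active and abs(grad[j]) >= thresh]

def greedy_select(grad, active, thresh):
    """Pick the violating coordinate with the largest gradient magnitude."""
    cand = violators(grad, active, thresh)
    return max(cand, key=lambda j: abs(grad[j])) if cand else None

def randomized_select(grad, active, thresh, rng):
    """Pick one violating coordinate uniformly at random."""
    cand = violators(grad, active, thresh)
    return int(rng.choice(cand)) if cand else None

def truncated_cyclic_select(grad, active, thresh, start=0):
    """Scan coordinates cyclically from `start`; return the first violator."""
    d = len(grad)
    for step in range(d):
        j = (start + step) % d
        if j not in active and abs(grad[j]) >= thresh:
            return j
    return None

# Example: coordinate 3 is already active; coordinates 1 and 2 violate.
grad = np.array([0.1, -0.9, 0.5, 0.7])
active = {3}
picked_greedy = greedy_select(grad, active, thresh=0.4)
picked_cyclic = truncated_cyclic_select(grad, active, thresh=0.4, start=2)
picked_random = randomized_select(grad, active, 0.4, np.random.default_rng(0))
```

What the three sketches share is the threshold test: a coordinate whose gradient is small is never added, which is one way a rule can avoid the unfiltered additions that the figure associates with plain cyclic search.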
Method       (unlabeled)      (unlabeled)     (unlabeled)    (unlabeled)  Timing
G-PICASSO    0.8003(0.8908)   2.812(0.4997)   0.844(2.066)   666/1000     0.0169(0.0027)
R-PICASSO    0.8102(0.9663)   2.791(0.5355)   0.902(2.353)   653/1000     0.0186(0.0034)
TC-PICASSO   0.8057(0.8374)   2.800(0.4839)   0.888(2.038)   645/1000     0.0167(0.0024)
SPARSENET    1.1260(1.2708)   2.669(0.6942)   1.678(3.191)   514/1000     0.0171(0.0025)
PISTA        0.8135(0.8998)   2.797(0.5115)   0.881(2.112)   664/1000     2.1771(0.3805)