SLIDE 1

Screening Rules for Lasso with Non-Convex Sparse Regularizers

  • A. Rakotomamonjy

Joint work with G. Gasso and J. Salmon ICML 2019

This work benefited from the support of the project OATMIL ANR-17-CE23-0012 of the French National Research Agency (ANR), the Normandie Projet GRR-DAISI, and European funding FEDER DAISI.

SLIDE 2

Objective of the paper

Lasso and screening

Learning sparsity-inducing linear models from high-dimensional data X ∈ R^{n×d}, y ∈ R^n:

    min_{w ∈ R^d}  (1/2) ‖y − Xw‖₂² + Σ_{j=1}^d λ |w_j|

Screening rule: identify vanishing variables of w⋆. Example, with (ŵ, ŝ) an intermediate primal-dual pair:

    |x_j^⊤ ŝ| + r(ŵ, ŝ) ‖x_j‖ < 1  ⟹  w⋆_j = 0,

obtained by exploiting sparsity, convexity and duality.

Extension to non-convex regularizers

Non-convex regularizers lead to statistically better models, but how can screening be done when the regularizer is non-convex?
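The convex screening rule above can be sketched numerically. A minimal Gap Safe-style implementation, assuming the safe-ball radius takes the form r(ŵ, ŝ) = √(2·gap)/λ (the slide leaves the radius unspecified, so treat this as an illustrative choice):

```python
import numpy as np

def gap_safe_screen(X, y, w_hat, lam):
    """Sketch of a Gap Safe screening test for the Lasso.

    Returns a boolean mask: True means the rule certifies w*_j = 0.
    Assumes the safe-ball radius sqrt(2 * gap) / lam; the exact radius
    used in the paper may differ.
    """
    residual = y - X @ w_hat
    # Dual-feasible point s_hat: rescale the residual so ||X^T s_hat||_inf <= 1.
    s_hat = residual / max(lam, np.max(np.abs(X.T @ residual)))
    primal = 0.5 * residual @ residual + lam * np.abs(w_hat).sum()
    dual = 0.5 * y @ y - 0.5 * lam**2 * np.sum((s_hat - y / lam) ** 2)
    r = np.sqrt(2 * max(primal - dual, 0.0)) / lam
    col_norms = np.linalg.norm(X, axis=0)
    # Safe rule: |x_j^T s_hat| + r * ||x_j|| < 1  =>  w*_j = 0.
    return np.abs(X.T @ s_hat) + r * col_norms < 1
```

For λ ≥ ‖X^⊤ y‖_∞ with ŵ = 0 the duality gap vanishes, the radius is zero, and every coordinate is screened, consistent with w⋆ = 0 in that regime.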

[Figure: the ℓ1, log-sum, and MCP penalties plotted on [−2, 2].]

SLIDE 3

Non-convex Lasso

The problem

    min_{w ∈ R^d}  (1/2) ‖y − Xw‖₂² + Σ_{j=1}^d r_λ(|w_j|)

with the regularizer r_λ(·) smooth and concave on [0, ∞).

The proposed screening strategy

Solve by majorization-minimization:

    w^{k+1} = argmin_{w ∈ R^d}  (1/2) ‖y − Xw‖₂² + (1/(2α)) ‖w − w^k‖₂² + Σ_{j=1}^d λ_j |w_j|,   with λ_j = r′_λ(|w^k_j|)

Screen at two levels:

  • within each weighted Lasso;
  • propagate screened-variable information between two successive weighted Lassos.
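One concrete instance of the MM weight update, using the log-sum penalty r_λ(t) = λ log(1 + t/θ) as the concave regularizer (θ is an illustrative hyperparameter, not fixed by the slide):

```python
import numpy as np

def logsum_weights(w_k, lam, theta=1.0):
    """Weights lam_j = r'_lam(|w_j^k|) for the next weighted Lasso,
    with the log-sum penalty r_lam(t) = lam * log(1 + t / theta)."""
    # Derivative of lam * log(1 + t / theta) at t = |w_j^k|.
    return lam / (theta + np.abs(w_k))
```

Large |w^k_j| yields a small weight λ_j, so strong coefficients are shrunk less at the next MM iteration, which is the mechanism behind the statistical gain over a plain ℓ1 penalty.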


SLIDE 4

Screening weighted Lasso

Optimization problem and screening condition

    min_{w ∈ R^d}  (1/2) ‖y − Xw‖₂² + (1/(2α)) ‖w − w^k‖₂² + Σ_{j=1}^d λ_j |w_j|

    |x_j^⊤ s⋆ − v⋆_j| − λ_j < 0  ⟹  w⋆_j = 0

with s and v the dual variables, s⋆ = y − Xw⋆ and w⋆ − w^k = αv⋆.

Our screening test

    T_j^{(λ_j)}(ŵ, ŝ, v̂) := |x_j^⊤ ŝ − v̂_j| + √(2 G_Λ) (‖x_j‖ + 1/√α)  <  λ_j

given a primal-dual intermediate solution (ŵ, ŝ, v̂), with duality gap G_Λ.
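This test translates almost directly into code. A sketch, with the √(2 G_Λ) factor and the 1/√α scaling read off the slide (the paper's exact constants may differ); it returns the values T_j as well as the mask, since T_j is exactly what the propagation step on the next slide reuses:

```python
import numpy as np

def weighted_lasso_screen(X, s_hat, v_hat, lam_vec, gap, alpha):
    """Screening test for one weighted-Lasso subproblem (sketch).

    T_j = |x_j^T s_hat - v_hat_j| + sqrt(2 * gap) * (||x_j|| + 1 / sqrt(alpha));
    coordinate j is screened when T_j < lam_j.
    """
    col_norms = np.linalg.norm(X, axis=0)
    T = np.abs(X.T @ s_hat - v_hat) + np.sqrt(2 * gap) * (col_norms + 1 / np.sqrt(alpha))
    return T, T < lam_vec
```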


SLIDE 5

Screened variables propagation

Setting

After iteration k, we have a weighted Lasso with weights {λ_j} and approximate solutions ŵ, ŝ and v̂. Screened variables are those with

    T_j^{(λ_j)}(ŵ, ŝ, v̂) < λ_j

Before iteration k + 1:

  • change of weights {λ^ν_j}_{j=1,…,d};
  • new primal-dual triplet (ŵ^ν, ŝ^ν, v̂^ν).

Screening propagation test

    T_j^{(λ_j)}(ŵ, ŝ, v̂) + ‖x_j‖ (a + √(2b)) + c + √(2b)/√α  <  λ^ν_j

with ‖ŝ^ν − ŝ‖₂ ≤ a, |G_Λ(ŵ, ŝ, v̂) − G_{Λ^ν}(ŵ^ν, ŝ^ν, v̂^ν)| ≤ b and |v̂^ν_j − v̂_j| ≤ c.
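The propagation inequality also translates directly; a, b and c are the bounds defined above, and the √(2b)/√α term follows the same radius convention as the per-subproblem test (a sketch, not the paper's exact constants):

```python
import numpy as np

def propagate_screening(T_vals, col_norms, lam_new, a, b, c, alpha):
    """Propagate screening to the next weighted Lasso without a fresh test.

    T_vals are the previous test values T_j^{(lam_j)}(w_hat, s_hat, v_hat);
    a >= ||s_hat_new - s_hat||_2, b bounds the change in duality gap,
    c >= |v_hat_new_j - v_hat_j| per coordinate.
    """
    lhs = T_vals + col_norms * (a + np.sqrt(2 * b)) + c + np.sqrt(2 * b) / np.sqrt(alpha)
    return lhs < lam_new
```

With a = b = c = 0 the left-hand side reduces to T_j, i.e. the test collapses to the previous slide's condition against the new weights, as expected.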


SLIDE 6

Summary

  • First approach for screening with non-convex regularizers: convexification and propagation.
  • At poster #190, Pacific Ballroom: more technical details and experimental results on the computational gain and on the propagation strategy.

[Figure: running time as a percentage of ncxCD's, over a regularization path (n = 50, d = 100, p = 5, σ = 2.00), for tolerances 10⁻³, 10⁻⁴, 10⁻⁵; methods: ncxCD, GIST, MM genuine, MM screening.]

[Figure: ratio of screened variables over iterations, Pre-PWL vs. Post-PWL.]
