Faster and still safe: Combining screening techniques and structured dictionaries to accelerate the Lasso

Cássio F. DANTAS, Rémi GRIBONVAL (cassio.fraga-dantas@inria.fr, remi.gribonval@inria.fr), 03/04/18


slide-1
SLIDE 1

Faster and still safe: Combining screening techniques and structured dictionaries to accelerate the Lasso

03/04/18

Cássio F. DANTAS, Rémi GRIBONVAL cassio.fraga-dantas@inria.fr, remi.gribonval@inria.fr

slide-2
SLIDE 2

Accelerate the Lasso optimization by combining two strategies:

1) Safe Screening Rules
2) Fast Structured Dictionaries

slide-3
SLIDE 3

Contents

01. Context (Lasso problem)
02. Fast Structured Dictionaries
03. Screening Rules
04. Screening Rules w/ Approx. Dictionaries
05. Results
06. Conclusion

slide-4
SLIDE 4

01 Context

slide-5
SLIDE 5

01 Context

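Lasso problem

The l1-regularized least squares:

$$\min_{\mathbf{x}} \; \frac{1}{2}\|\mathbf{y} - \mathbf{A}\mathbf{x}\|_2^2 + \lambda\|\mathbf{x}\|_1$$

Denoting:

  • $\mathbf{y}$ the observation vector;
  • $\mathbf{A}$ the design matrix (or dictionary);
  • $\mathbf{x}$ the sparse representation vector;
  • $\lambda > 0$ the parameter controlling the sparsity of the solution.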
slide-6
SLIDE 6

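01 Context

Dual Lasso

Dual formulation of the Lasso problem:

$$\boldsymbol{\theta}^\star = \arg\max_{\boldsymbol{\theta} \in \Delta_{\mathbf{A}}} \; \frac{1}{2}\|\mathbf{y}\|_2^2 - \frac{\lambda^2}{2}\left\|\boldsymbol{\theta} - \frac{\mathbf{y}}{\lambda}\right\|_2^2$$

Denoting:

  • $\boldsymbol{\theta}$ the dual variable;
  • $\Delta_{\mathbf{A}} = \{\boldsymbol{\theta} : \|\mathbf{A}^T\boldsymbol{\theta}\|_\infty \le 1\}$ the feasible set.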
slide-7
SLIDE 7

02 Fast Structured Dictionaries

slide-8
SLIDE 8

Fast Structured Dictionaries

Motivation

  • Iterative algorithms are often used to solve the Lasso problem.
  • Example: ISTA (Iterative Shrinkage-Thresholding Algorithm).
  • Two matrix-vector multiplications at each iteration: quadratic complexity! Can it be reduced?
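As an illustration of where the two matrix-vector products come from, here is a minimal NumPy sketch of plain ISTA for the objective 0.5 * ||y - A x||^2 + lam * ||x||_1. The function names, the fixed iteration count and the step size 1/L (with L the squared spectral norm of A) are assumptions made for this sketch, not details taken from the slides.

```python
import numpy as np

def soft_threshold(v, tau):
    """Entrywise soft-thresholding: proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def lasso_objective(A, y, x, lam):
    """Primal Lasso objective 0.5 * ||y - A x||_2^2 + lam * ||x||_1."""
    return 0.5 * np.sum((y - A @ x) ** 2) + lam * np.sum(np.abs(x))

def ista(A, y, lam, n_iter=500):
    """Plain ISTA with constant step 1/L (assumed setup for this sketch)."""
    L = np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        residual = y - A @ x           # first matrix-vector product
        correlation = A.T @ residual   # second matrix-vector product
        x = soft_threshold(x + correlation / L, lam / L)
    return x
```

With a dense N x K dictionary, each of the two products costs on the order of N * K operations, which is the per-iteration cost that a structured dictionary aims to reduce.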

slide-9
SLIDE 9

Fast Approximate Dictionaries

Structure ⇒ Acceleration

Accelerate matrix-vector multiplications

Constrain the dictionary matrix to have a certain type of structure. Examples:

  • Kronecker product
  • Sparse factors
  • Circulant factors
  • (...)
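To make the "structure ⇒ acceleration" idea concrete, the sketch below multiplies by a Kronecker-structured dictionary without ever forming it. It relies on the identity (B ⊗ C) vec(X) = vec(B X C^T) with row-major vectorization; the factor names B and C and the helper name kron_matvec are hypothetical, chosen only for this example.

```python
import numpy as np

def kron_matvec(B, C, x):
    """Compute np.kron(B, C) @ x without building the Kronecker product.

    B has shape (p, q), C has shape (r, s), x has length q * s.
    Uses (B kron C) vec(X) = vec(B X C^T) with row-major vec.
    """
    q, s = B.shape[1], C.shape[1]
    X = x.reshape(q, s)            # un-vectorize x (row-major)
    return (B @ X @ C.T).ravel()   # two small products instead of one large one
```

For two n x n factors, the dense product costs about n^4 multiplications while the factored version costs about 2 n^3.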

slide-10
SLIDE 10

Fast Approximate Dictionaries

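Structured Approximation

If the dictionary matrix $\mathbf{A}$ is not structured, a structured approximation $\tilde{\mathbf{A}} \approx \mathbf{A}$ can be found, where $\mathbf{E}$ is the approximation error matrix (the difference between $\tilde{\mathbf{A}}$ and $\mathbf{A}$) and $\mathbf{e}_j$ is its $j$-th column.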

slide-11
SLIDE 11

01 Context

Algorithm (high level view)

1) Start the Lasso optimization using the structured approximation $\tilde{\mathbf{A}}$, to take advantage of its reduced multiplication cost.
2) As the algorithm approaches the solution, switch back to the original dictionary $\mathbf{A}$.
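The two steps above can be sketched as a two-phase ISTA loop. This is only an illustrative skeleton under strong assumptions: the switch happens after a fixed, hypothetical number of iterations (n_fast), screening is omitted, and A_tilde is a dense array standing in for a fast structured operator, so the sketch shows the control flow rather than any actual speed-up.

```python
import numpy as np

def soft_threshold(v, tau):
    """Entrywise soft-thresholding: proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista_steps(D, y, lam, x, n_iter):
    """Run n_iter ISTA iterations on dictionary D, starting from x."""
    L = np.linalg.norm(D, 2) ** 2
    for _ in range(n_iter):
        x = soft_threshold(x + D.T @ (y - D @ x) / L, lam / L)
    return x

def two_phase_ista(A, A_tilde, y, lam, n_fast=50, n_exact=200):
    """Phase 1 iterates on the approximation, phase 2 on the original A."""
    x = np.zeros(A.shape[1])
    x = ista_steps(A_tilde, y, lam, x, n_fast)   # cheap approximate iterations
    x = ista_steps(A, y, lam, x, n_exact)        # refine on the true dictionary
    return x
```

Phase 2 is warm-started from the phase 1 iterate, so the final iterations minimise the original Lasso objective rather than the approximate one.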
slide-13
SLIDE 13

03 Safe Screening Rules

slide-14
SLIDE 14

Safe Screening Rules

Safe Screening

  • Rules for identifying inactive dictionary atoms before completely solving the problem.
  • Inactive atoms: dictionary columns that will receive zero weight in the reconstruction of the input signal, i.e. columns outside the solution support.
  • We can eliminate such atoms.
  • Zero risk of false eliminations!

slide-19
SLIDE 19

Screening Test

  • Function $\mu$ of the atom $\mathbf{a}_j$ such that $\mu(\mathbf{a}_j) < 1 \Rightarrow \mathbf{a}_j$ is surely inactive.

Rejection set: $\mathcal{R} = \{ j : \mu(\mathbf{a}_j) < 1 \}$

Preserved set: the complement $\mathcal{R}^c$

slide-21
SLIDE 21

Safe Screening Rules

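Screening Test - Details

Given a region $\mathcal{S}$ (safe region) which contains $\boldsymbol{\theta}^\star$:

$$\mu(\mathbf{a}_j) = \max_{\boldsymbol{\theta} \in \mathcal{S}} |\mathbf{a}_j^T\boldsymbol{\theta}|$$

Sphere test

Safe region is a closed l2-ball with center $\mathbf{c}$ and radius $r$:

$$\mu(\mathbf{a}_j) = |\mathbf{a}_j^T\mathbf{c}| + r\|\mathbf{a}_j\|_2$$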
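The sphere test has a closed form because the maximum of |a^T theta| over a ball with center c and radius r equals |a^T c| + r * ||a||_2. A minimal NumPy sketch follows, assuming the atoms are the columns of A and that an atom is rejected when the test value is below 1; the function name sphere_test is an assumption of this example.

```python
import numpy as np

def sphere_test(A, c, r):
    """Boolean mask of atoms (columns of A) rejected by the sphere test.

    mu_j = |a_j^T c| + r * ||a_j||_2 is the maximum of |a_j^T theta|
    over the ball B(c, r); atom j is rejected when mu_j < 1.
    """
    mu = np.abs(A.T @ c) + r * np.linalg.norm(A, axis=0)
    return mu < 1.0
```

For example, with A the 2 x 2 identity, c = (0.9, 0) and r = 0.2, the test values are 1.1 and 0.2, so only the second atom is rejected.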

slide-22
SLIDE 22

04 Screening Rules with Approximate Dictionaries

slide-23
SLIDE 23

Extending Screening Rules

  • Rejection set: the atoms eliminated by the screening test.
  • Guarantee: every atom in the rejection set is inactive (zero risk of false eliminations).
  • Rejection set with the approximate dictionary: the extended rules must preserve this guarantee.

slide-28
SLIDE 28

Extending Screening Rules

Extending sphere tests

Suppose a safe sphere with center $\mathbf{c}$ and radius $r$ is given.

Sphere test: $\mu(\mathbf{a}_j) = |\mathbf{a}_j^T\mathbf{c}| + r\|\mathbf{a}_j\|_2$

Sphere test with approximate dictionary: a certain "security margin" must be added to account for the atom approximation error.
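One way to obtain such a security margin is the triangle inequality followed by Cauchy-Schwarz: if an atom differs from its approximation by an error vector e_j, then for every theta in the ball B(c, r), |a_j^T theta| <= |ã_j^T c| + r * ||ã_j||_2 + ||e_j||_2 * (||c||_2 + r). The sketch below implements this particular bound; it is a valid margin under these assumptions, not necessarily the exact expression used in the talk, and the names sphere_test_approx and err_norms are hypothetical.

```python
import numpy as np

def sphere_test_approx(A_tilde, err_norms, c, r):
    """Sphere test that only touches the approximate atoms.

    A_tilde   : approximate dictionary (atoms as columns)
    err_norms : upper bounds on ||e_j||_2, one per atom
    Rejects atom j when the upper bound on max |a_j^T theta| is below 1.
    """
    base = np.abs(A_tilde.T @ c) + r * np.linalg.norm(A_tilde, axis=0)
    margin = err_norms * (np.linalg.norm(c) + r)   # security margin
    return base + margin < 1.0
```

Because the bound is never smaller than the exact test value, every atom rejected here would also be rejected by the exact sphere test, at the price of rejecting fewer atoms when the approximation error is large.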

slide-30
SLIDE 30

Extending Screening Rules

Obtaining a safe sphere

Given a primal-dual estimation $(\mathbf{x}_t, \boldsymbol{\theta}_t)$ at iteration $t$.

GAP safe sphere:

$$\mathcal{B}(\boldsymbol{\theta}_t, r_t), \qquad r_t = \frac{1}{\lambda}\sqrt{2\,G(\mathbf{x}_t, \boldsymbol{\theta}_t)}$$

with $G(\mathbf{x}_t, \boldsymbol{\theta}_t)$ the duality gap at iteration $t$.
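A GAP safe sphere can be computed from any primal iterate, as in the following sketch. It assumes the standard Lasso primal 0.5 * ||y - A x||^2 + lam * ||x||_1, the dual 0.5 * ||y||^2 - (lam^2 / 2) * ||theta - y / lam||^2, and a dual feasible point obtained by rescaling the residual (a common "dual scaling" choice, assumed here rather than taken from the slides). The function name gap_safe_sphere is hypothetical.

```python
import numpy as np

def gap_safe_sphere(A, y, x, lam):
    """Return (center, radius) of the GAP safe sphere for the iterate x."""
    residual = y - A @ x
    # Dual scaling: shrink the residual so that ||A^T theta||_inf <= 1.
    theta = residual / max(lam, np.max(np.abs(A.T @ residual)))
    primal = 0.5 * residual @ residual + lam * np.sum(np.abs(x))
    dual = 0.5 * y @ y - 0.5 * lam**2 * np.sum((theta - y / lam) ** 2)
    gap = max(primal - dual, 0.0)   # guard against tiny negative round-off
    return theta, np.sqrt(2.0 * gap) / lam
```

The radius shrinks as the duality gap closes, so the same test rejects more atoms as the iterations progress, which is what makes the screening dynamic.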

slide-34
SLIDE 34

Extending Screening Rules

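Obtaining a safe sphere

Given a primal-dual estimation $(\mathbf{x}_t, \boldsymbol{\theta}_t)$ at iteration $t$.

GAP safe sphere: center $\boldsymbol{\theta}_t$ and radius $r_t = \frac{1}{\lambda}\sqrt{2\,G(\mathbf{x}_t, \boldsymbol{\theta}_t)}$.

GAP safe sphere with approximate dictionary: the primal objective cannot be calculated, since it depends on $\mathbf{A}$. Instead, we use a modified primal.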

slide-35
SLIDE 35

Dynamic screening

[Diagram: safe region (GAP sphere) and rejection set, with Guarantee 1 and Guarantee 2]

slide-36
SLIDE 36

Extended dynamic screening

[Diagram: safe region (extended GAP sphere) and rejection set, with Guarantee 1 and Guarantee 2]

slide-37
SLIDE 37

We now have safe screening rules that manipulate an approximate dictionary. But what is the impact of the numerous security margins? Is it still worth it?

slide-38
SLIDE 38

05 Results

slide-39
SLIDE 39

Results

Running times per iteration

  • Fewer inactive atoms are identified by the extended screening.
  • BUT the structured dictionary makes the initial iterations much faster.

slide-43
SLIDE 43

06 Conclusion

slide-44
SLIDE 44

Conclusion

  • The proposed approach combines screening rules and fast approximate dictionaries.
  • It reduces the execution time even further with respect to screening rules alone.

Potential extensions

  • Other region types (e.g. domes)
  • Problems other than the Lasso (e.g. Group-Lasso, Elastic-Net, Regularized Logistic Regression)
slide-45
SLIDE 45

Thank you!

Questions?

Contact me: cassio.fraga-dantas@inria.fr

slide-47
SLIDE 47

Safe Screening Rules

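Screening test – Details

Dual formulation of the Lasso problem: a projection problem!

At the dual solution $\boldsymbol{\theta}^\star$:

  ➢ Some constraints are active, i.e. $|\mathbf{a}_j^T\boldsymbol{\theta}^\star| = 1$.
  ➢ The remaining constraints are inactive, i.e. $|\mathbf{a}_j^T\boldsymbol{\theta}^\star| < 1$.

[Figure: feasible region]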

slide-48
SLIDE 48

Safe Screening Rules

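Screening test – Details

  • Every dictionary atom $\mathbf{a}_j$ for which $|\mathbf{a}_j^T\boldsymbol{\theta}^\star| < 1$ is inactive.
  • Then, simply calculate $|\mathbf{a}_j^T\boldsymbol{\theta}^\star|$ for all $j$ and discard all atoms for which the result is smaller than 1.

The dual solution $\boldsymbol{\theta}^\star$ is not known. Identify a region $\mathcal{S}$ (safe region) which contains $\boldsymbol{\theta}^\star$.

Sufficient condition: $\max_{\boldsymbol{\theta} \in \mathcal{S}} |\mathbf{a}_j^T\boldsymbol{\theta}| < 1 \Rightarrow \mathbf{a}_j$ is inactive.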
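The rule "correlation with the dual solution below 1 means inactive" can be checked by hand on a toy case. The sketch below assumes an orthonormal dictionary (the identity), for which the Lasso solution is simply the soft-thresholding of y, and uses the standard primal-dual link theta* = (y - A x*) / lam. All numeric values are illustrative.

```python
import numpy as np

def dual_solution(A, y, x_star, lam):
    """Dual solution of the Lasso recovered from a primal solution."""
    return (y - A @ x_star) / lam

# Orthonormal dictionary: the Lasso solution is a soft-thresholding of y.
A = np.eye(4)
y = np.array([3.0, -0.4, 0.2, -2.5])
lam = 1.0
x_star = np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

theta_star = dual_solution(A, y, x_star, lam)
correlations = np.abs(A.T @ theta_star)   # |a_j^T theta*| for every atom
inactive = correlations < 1.0             # atoms that can be discarded
```

Here the second and third atoms have correlations 0.4 and 0.2 and receive zero weight in x_star, while the first and fourth atoms sit exactly on the constraint boundary and belong to the support.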

slide-49
SLIDE 49

Extending Screening Rules

Switching criterion

Reasons to switch back from $\tilde{\mathbf{A}}$ to $\mathbf{A}$:

  • Convergence: to avoid converging to the solution of the approximate problem. The higher the approximation error, the sooner we need to switch.
  • Screening ratio: the number of active atoms may become so small that the use of $\tilde{\mathbf{A}}$ does not pay off anymore.

slide-50
SLIDE 50

Extending Screening Rules

Comparison

Fewer inactive atoms are identified by the extended screening.

slide-51
SLIDE 51

Extending Screening Rules

Switching criterion

slide-52
SLIDE 52

Extending Screening Rules

Complexity reduction

slide-54
SLIDE 54

Simulation Results

Impact of the Approximation Error