
Bilevel Learning of the Group Lasso Structure


  1. Bilevel Learning of the Group Lasso Structure
     Jordan Frecon¹, Saverio Salzo¹, Massimiliano Pontil¹ ²
     ¹ CSML - Istituto Italiano di Tecnologia
     ² Dept of Computer Science - University College London
     Thirty-second Conference on Neural Information Processing Systems (NIPS 2018), Montreal, Canada

  2. Linear Regression and Group Sparsity
     Problem: Predict $y \in \mathbb{R}^{N}$ from $X \in \mathbb{R}^{N \times P}$.
     Linear regression: Find $w \in \mathbb{R}^{P}$ such that $y \approx Xw$.
     In many applications, only a few groups of features are relevant for predicting $y$ $\Rightarrow$ group-sparse $w$. Examples:
     - Predict psychiatric disorders from activity in regions of the brain
     - Predict protein functions from their molecular composition

  3. Group Lasso
     Given $\lambda > 0$ and a group-structure $\{\mathcal{G}_1, \ldots, \mathcal{G}_L\}$, find
     $$\hat{w} \in \underset{w \in \mathbb{R}^{P}}{\operatorname{argmin}} \;\; \frac{1}{2} \|y - Xw\|^2 + \lambda \sum_{l=1}^{L} \|w_{\mathcal{G}_l}\|_2$$
     [Figure: group-sparse solution $\hat{w}$, with groups $\mathcal{G}_1, \ldots, \mathcal{G}_5$]
     Limitation: The group-structure $\{\mathcal{G}_1, \ldots, \mathcal{G}_L\}$ may be unknown.
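
As a minimal sketch of the Group Lasso objective on this slide, the snippet below evaluates the objective and applies block soft-thresholding, the standard proximal operator of the group penalty when the groups do not overlap. The function names, the list-of-index-lists encoding of the groups, and the non-overlap assumption are illustrative choices, not taken from the slides.

```python
import math


def group_lasso_objective(X, y, w, groups, lam):
    """Return 0.5 * ||y - Xw||^2 + lam * sum_l ||w_{G_l}||_2.

    X is a list of rows, y and w are lists, and groups is a list of
    index lists (hypothetical encoding of G_1, ..., G_L).
    """
    residual = [yi - sum(xij * wj for xij, wj in zip(row, w))
                for row, yi in zip(X, y)]
    fit = 0.5 * sum(r * r for r in residual)
    penalty = sum(math.sqrt(sum(w[j] ** 2 for j in g)) for g in groups)
    return fit + lam * penalty


def block_soft_threshold(w, groups, tau):
    """Proximal operator of tau * sum_l ||w_{G_l}||_2.

    Assumes the groups are non-overlapping. Each group is shrunk
    towards zero and set exactly to zero when its norm is at most tau.
    """
    out = list(w)
    for g in groups:
        norm = math.sqrt(sum(w[j] ** 2 for j in g))
        scale = max(0.0, 1.0 - tau / norm) if norm > 0 else 0.0
        for j in g:
            out[j] = scale * w[j]
    return out
```

Applied to `w = [3.0, 4.0, 0.1, 0.1]` with groups `[[0, 1], [2, 3]]` and `tau = 1.0`, the first group is shrunk and the second is zeroed as a whole, which is the group-sparsity effect the penalty is designed to produce.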

  4. Setting
     Setting: $T$ Group Lasso problems with a shared group-structure
     $$(\forall t \in \{1, \ldots, T\}) \quad \hat{w}_t(\theta) \in \underset{w_t \in \mathbb{R}^{P}}{\operatorname{argmin}} \;\; \frac{1}{2} \|y_t - X_t w_t\|^2 + \lambda \sum_{l=1}^{L} \|w_t \odot \theta_l\|_2$$
     where the vectors $\theta_l$ encode the groups.
     Goal: Estimation of the optimal group-structure $\theta^{*}$.
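
The following sketch illustrates how vectors $\theta_l$ can encode groups: when each $\theta_l$ is a 0/1 indicator of group $l$, the penalty $\sum_l \|w \odot \theta_l\|_2$ coincides with the usual group penalty. The helper names and the binary encoding are assumptions made for illustration; the slides do not specify which values $\theta$ may take.

```python
import math


def theta_penalty(w, theta):
    """Return sum_l ||w ⊙ theta_l||_2.

    theta is a list of L vectors, each of length P (hypothetical layout).
    """
    return sum(
        math.sqrt(sum((wj * tj) ** 2 for wj, tj in zip(w, theta_l)))
        for theta_l in theta
    )


def indicators_from_groups(groups, P):
    """Build 0/1 indicator vectors theta_l from lists of feature indices."""
    return [[1.0 if j in g else 0.0 for j in range(P)] for g in groups]


# Three groups over P = 5 features.
groups = [[0, 1], [2], [3, 4]]
theta = indicators_from_groups(groups, 5)
w = [3.0, 4.0, 0.0, 12.0, 5.0]
penalty = theta_penalty(w, theta)  # group norms are 5, 0 and 13
```

Because the penalty depends on $\theta$ only through the products $w \odot \theta_l$, the same function also accepts non-binary entries, which is what makes $\theta$ a continuous quantity that can be optimized.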

  5. A Bilevel Programming Approach
     Upper-level problem:
     $$\underset{[\theta_1 \cdots \theta_L] \in \Theta}{\text{minimize}} \;\; \mathcal{U}(\theta) := \sum_{t=1}^{T} E_t(\hat{w}_t(\theta)) \quad \text{(e.g., validation error)}$$
     where $\hat{w}(\theta) = [\hat{w}_1(\theta) \cdots \hat{w}_T(\theta)]$ solves the
     Lower-level problem ($T$ Group Lasso problems):
     $$\underset{w \in \mathbb{R}^{P \times T}}{\text{minimize}} \;\; \mathcal{L}(w, \theta) := \sum_{t=1}^{T} \Big( \frac{1}{2} \|y_t - X_t w_t\|^2 + \lambda \sum_{l=1}^{L} \|\theta_l \odot w_t\|_2 \Big)$$
     Difficulties:
     - $\hat{w}(\theta)$ is not available in closed form
     - $\theta \mapsto \hat{w}(\theta)$ is nonsmooth [$\Rightarrow \mathcal{U}$ is nonsmooth]
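
A small sketch of how the upper-level objective could be evaluated once lower-level solutions are available. It assumes, for illustration only, that each $E_t$ is a squared validation error on held-out data for task $t$; the slide only says "e.g., validation error", and the function names and data layout are hypothetical.

```python
def validation_error(X_val, y_val, w):
    """Assumed form of E_t: 0.5 * ||y_val - X_val w||^2 on held-out data."""
    return 0.5 * sum(
        (yi - sum(xij * wj for xij, wj in zip(row, w))) ** 2
        for row, yi in zip(X_val, y_val)
    )


def upper_level_objective(val_tasks, w_hat):
    """Return U = sum_t E_t(w_hat_t).

    val_tasks is a list of (X_val, y_val) pairs, one per task, and
    w_hat is a list holding one lower-level solution per task.
    """
    return sum(
        validation_error(X_val, y_val, w_t)
        for (X_val, y_val), w_t in zip(val_tasks, w_hat)
    )


# Two toy tasks with given (not computed) lower-level solutions.
val_tasks = [
    ([[1.0, 0.0], [0.0, 1.0]], [1.0, 2.0]),
    ([[1.0, 1.0]], [3.0]),
]
w_hat = [[1.0, 1.0], [1.0, 1.0]]
U = upper_level_objective(val_tasks, w_hat)
```

In the bilevel setting the vectors in `w_hat` depend on $\theta$ through the lower-level problem, which is why the lack of a closed form and the nonsmoothness listed on the slide matter.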

  6. Approximate Bilevel Problem
     Upper-level problem:
     $$\underset{[\theta_1 \cdots \theta_L] \in \Theta}{\text{minimize}} \;\; \mathcal{U}_K(\theta) := \sum_{t=1}^{T} E_t(w_t^{(K)}(\theta))$$
     where $w_t^{(K)}(\theta) \to \hat{w}_t(\theta)$.
     Dual algorithm:
     - $u^{(0)}(\theta)$ chosen arbitrarily
     - for $k = 0, 1, \ldots, K-1$: $\; u^{(k+1)}(\theta) = \mathcal{A}(u^{(k)}(\theta), \theta)$ (dual update)
     - $[w_1^{(K)}(\theta) \cdots w_T^{(K)}(\theta)] = \mathcal{B}(u^{(K)}(\theta), \theta)$ (primal-dual relationship)
     Goals:
     - Find $\mathcal{A}$ and $\mathcal{B}$ smooth [$\Rightarrow w^{(K)}$ is smooth $\Rightarrow \mathcal{U}_K$ is smooth]
     - Prove that the approximate bilevel scheme converges
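
The control flow of the K-step dual scheme can be sketched generically: iterate the dual map K times, then map the last dual iterate to the primal variable. The maps `toy_A` and `toy_B` below are hypothetical stand-ins chosen only so the example runs (a simple contraction and the identity); they are not the Bregman forward-backward updates from the talk.

```python
def unrolled_primal(A, B, u0, theta, K):
    """Run K dual updates u <- A(u, theta), then return w = B(u, theta)."""
    u = u0
    for _ in range(K):
        u = A(u, theta)
    return B(u, theta)


def toy_A(u, theta):
    """Hypothetical smooth dual update: a contraction with fixed point 2 * theta."""
    return 0.5 * u + theta


def toy_B(u, theta):
    """Hypothetical primal-dual map: the identity in u."""
    return u


w_3 = unrolled_primal(toy_A, toy_B, 0.0, 1.0, 3)    # iterates: 1.0, 1.5, 1.75
w_60 = unrolled_primal(toy_A, toy_B, 0.0, 1.0, 60)  # close to the fixed point 2.0
```

Since the output is a finite composition of the maps, it is smooth in `theta` whenever `A` and `B` are, which is the property the slide asks of $\mathcal{A}$ and $\mathcal{B}$.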

  7. Contributions
     - Bilevel framework for estimating the Group Lasso structure
     - Design of a dual forward-backward algorithm with Bregman distances such that
       - $\mathcal{A}$ and $\mathcal{B}$ are smooth $\Rightarrow \mathcal{U}_K$ is smooth
       - (1) $\min \mathcal{U}_K \to \min \mathcal{U}$ and (2) $\operatorname{argmin} \mathcal{U}_K \to \operatorname{argmin} \mathcal{U}$
     - Implementation of the proxSAGA algorithm: a nonconvex stochastic variant of
       $$\theta^{(q+1)} = P_{\Theta}\big(\theta^{(q)} - \gamma \nabla \mathcal{U}_K(\theta^{(q)})\big)$$
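
The deterministic update $\theta^{(q+1)} = P_{\Theta}(\theta^{(q)} - \gamma \nabla \mathcal{U}_K(\theta^{(q)}))$ can be sketched as plain projected gradient descent. This is not proxSAGA, which the slide describes as a stochastic variant of this step. The constraint set is assumed here to be the box $[0, 1]^n$ purely for illustration, since the slides do not define $\Theta$, and the quadratic objective is a hypothetical stand-in for $\mathcal{U}_K$.

```python
def project_box(theta, lo=0.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^n (assumed constraint set)."""
    return [min(hi, max(lo, t)) for t in theta]


def projected_gradient(grad, theta0, step, n_iters, project=project_box):
    """Iterate theta <- project(theta - step * grad(theta))."""
    theta = list(theta0)
    for _ in range(n_iters):
        g = grad(theta)
        theta = project([t - step * gi for t, gi in zip(theta, g)])
    return theta


# Hypothetical smooth objective sum_i (theta_i - c_i)^2 standing in for U_K.
c = [0.3, 1.5, -0.2]


def toy_grad(theta):
    return [2.0 * (t - ci) for t, ci in zip(theta, c)]


theta_star = projected_gradient(toy_grad, [0.5, 0.5, 0.5], step=0.25, n_iters=100)
```

On this toy problem the iterates approach the projection of `c` onto the box, so the second and third coordinates end on the boundary while the first stays in the interior.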

  8. Numerical Experiment
     Setting: $T = 500$ tasks, $N = 25$ noisy observations, $P = 50$ features.
     Estimate the group-structure, grouping the features into at most $L = 10$ groups.
     [Figure: experimental results]

  9. Conclusion
     Thank you.
     Our poster AB #92 will be presented in Room 210 & 230 at 5pm.
