Exact $\ell_0$-norm optimization via branch-and-bound methods

  1. Exact $\ell_0$-norm optimization via branch-and-bound methods
     Sébastien Bourguignon, Laboratoire des Sciences du Numérique de Nantes, École Centrale de Nantes
     GdR MIA, Thematic day on Non-Convex Sparse Optimization, Toulouse, October 9th 2020
     Joint work with Ramzi Ben Mhenni (LS2N-ECN, now LITIS, Université de Rouen), Jordan Ninin (Lab-STICC / ENSTA Bretagne), Marcel Mongeau (Université de Toulouse, ENAC), Hervé Carfantan (Université de Toulouse, IRAP)

  2. Outline
     1. Why?
     2. Who?
     3. How?
     4. Where?
     S. Bourguignon, $\ell_0$-norm optimization & branch-and-bound, 2 / 16

  3. Outline
     1. Why? Exact solutions to $\ell_0$-norm problems may achieve better estimates.
     2. Who? Small to moderate size sparse problems can be solved exactly.
     3. How? Dedicated branch-and-bound strategy.
     4. Where? Directions for further work.

  4. Exactness: exact criterion, exact optimization
     - True, unrelaxed, $\ell_0$-"norm" criterion (footnote 1): $\|x\|_0 := \mathrm{Card}\{x_p \mid x_p \neq 0\}$, to be compared with $\|x\|_q^q = \sum_p |x_p|^q$ and $\|x\|_1 = \sum_p |x_p|$.
     - (Figure: some sparsity-enhancing functions $\sum_p \varphi(|x_p|)$ and their unit balls.)
     - Global optimization: optimality guaranteed by the algorithm.
     Footnote 1: "We give up (relax) nothing!" (French pun: "On (re)lâche rien !")

  5. Exactness may be worth...
     - Natural formulation for many problems:
       $$P_{2/0}:\ \min_{x \in \mathbb{R}^P} \tfrac12 \|y - Ax\|_2^2 \quad \text{s.t.} \quad \|x\|_0 \le K$$
       $$P_{0/2}:\ \min_{x \in \mathbb{R}^P} \|x\|_0 \quad \text{s.t.} \quad \tfrac12 \|y - Ax\|_2^2 \le \epsilon$$
       $$P_{2+0}:\ \min_{x \in \mathbb{R}^P} \tfrac12 \|y - Ax\|_2^2 + \lambda \|x\|_0$$
     - Global optimum $\Rightarrow$ better solution [Bertsimas et al., 2016, Bourguignon et al., 2016].
     (Figure: five panels, horizontal axis from 0 to 100, each with its squared residual $\|y - Hx\|_2^2$: data and truth, 1.62; OMP, 6.07; $\ell_1$ relaxation, 2.36; SBR, 2.22; global optimum, 1.43. Results taken from [Bourguignon et al., 2016].)

  6. ... but exactness has a price
     Footnote 2: "One is never good at this kind of calculation." (French: "On n'est jamais fort pour ce calcul.")

  8. ... but exactness has a price
     NP-hard (see footnote 2): $\|x\|_0 \le K$ $\Rightarrow$ $\binom{P}{K}$ possible combinations... in the worst-case scenario!
     Branch-and-bound: eliminate (hopefully huge) sets of possible combinations without having to evaluate them.
     Moderate-size problems ($P \sim$ a few hundred, $K \sim$ a few tens):
     ◮ one-dimensional problems
     ◮ deconvolution, time series spectral analysis, spectral unmixing, ...
     ◮ variable/subset selection in statistics
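
To give a sense of scale for the binomial count above, here is a minimal Python sketch. The sizes `(100, 10)` and `(200, 20)` are illustrative assumptions chosen inside the "a few hundred / a few tens" range quoted on the slide, and `n_supports` is a hypothetical helper name.

```python
import math

def n_supports(P: int, K: int) -> int:
    """Number of supports of size exactly K among P variables (binomial coefficient)."""
    return math.comb(P, K)

# Illustrative sizes only (assumed, not taken from the slides).
for P, K in [(100, 10), (200, 20)]:
    print(f"P={P}, K={K}: {n_supports(P, K):.3e} candidate supports")
```

Exhaustive enumeration would solve one least-squares problem per support, which is why pruning large groups of supports at once matters.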

  9. Mixed Integer Programming (MIP) reformulation (see [Bienstock 1996, Bertsimas et al. 2016, Bourguignon et al. 2016])
     Big-M assumption: $\forall p,\ |x_p| \le M$. Then:
     $$\min_{x \in \mathbb{R}^P} \|y - Ax\|_2^2 \quad \text{s.t.} \quad \|x\|_0 \le K,\ \ \forall p,\ |x_p| \le M$$
     $$\Leftrightarrow \quad \min_{x \in \mathbb{R}^P,\ b \in \{0,1\}^P} \|y - Ax\|_2^2 \quad \text{s.t.} \quad \sum_p b_p \le K,\ \ \forall p,\ |x_p| \le M b_p$$
     Can be addressed by MIP solvers (CPLEX, GUROBI, ...), but computation time increases quickly and use is limited to small sizes.
     Here: no need for a MIP reformulation nor for binary variables. Specific branch-and-bound construction for problems $P_{2/0}$, $P_{0/2}$, and $P_{2+0}$.

  13. Branch-and-Bound resolution [Land & Doig, 1960]
      Decision tree for binary variables. At each node, a lower bound on all subproblems contained in this node is computed: the remaining binary variables are relaxed into $[0, 1]$. If this bound exceeds the best known solution, the branch is pruned.
      ⊲ Which variable $b_p$ to branch on? Highest relaxed variable.
      ⊲ Which side to explore first? $b_p = 1$.
      ⊲ Which node to explore first? Depth-first search.
      ⊲ Computation of relaxed solutions? Related to $\ell_1$-norm optimization...
      (Figure: binary decision tree. The root $P^{(0)}$ branches on $b_{p_0}$: $b_{p_0} = 1$ leads to $P^{(1)}$ and $b_{p_0} = 0$ to $P^{(4)}$. $P^{(1)}$ branches on $b_{p_1}$ into $P^{(2)}$ ($b_{p_1} = 1$) and $P^{(3)}$ ($b_{p_1} = 0$); $P^{(4)}$ branches on $b_{p_4}$ into $P^{(5)}$ ($b_{p_4} = 1$) and $P^{(6)}$ ($b_{p_4} = 0$).)
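
The exploration rules above (branch on the largest relaxed variable, try $b_p = 1$ first, depth-first search, prune on the lower bound) can be sketched in Python for the cardinality-constrained problem. This is only an illustration under stated assumptions, not the authors' algorithm: the lower bound used here is plain least squares on all non-excluded columns, a weaker bound swapped in for the $\ell_1$/box relaxation of the slides, and the big-M box constraint is ignored. The names `ls_fit` and `branch_and_bound` are hypothetical.

```python
import numpy as np

def ls_fit(A, y, support):
    """Least squares restricted to `support`; returns (0.5*||y - Ax||^2, x)."""
    x = np.zeros(A.shape[1])
    idx = sorted(support)
    if idx:
        x[idx] = np.linalg.lstsq(A[:, idx], y, rcond=None)[0]
    r = y - A @ x
    return 0.5 * float(r @ r), x

def branch_and_bound(A, y, K):
    """Depth-first branch-and-bound for min 0.5*||y - Ax||^2 s.t. ||x||_0 <= K."""
    P = A.shape[1]
    best_cost, best_x = ls_fit(A, y, [])            # x = 0 is always feasible
    stack = [(frozenset(), frozenset(range(P)))]    # node = (S1, free indices)
    while stack:
        S1, free = stack.pop()
        # Assumed lower bound: drop the cardinality constraint on free variables.
        bound, x_rel = ls_fit(A, y, S1 | free)
        if bound >= best_cost:
            continue                                # prune this branch
        if len(S1) + len(free) <= K:
            best_cost, best_x = bound, x_rel        # relaxed solution is feasible
            continue
        if len(S1) == K:
            cost, x = ls_fit(A, y, S1)              # free variables must be zero
            if cost < best_cost:
                best_cost, best_x = cost, x
            continue
        # Branch on the free variable with the largest relaxed magnitude.
        p = max(free, key=lambda j: abs(x_rel[j]))
        rest = free - {p}
        stack.append((S1, rest))                    # b_p = 0: explored later
        stack.append((S1 | {p}, rest))              # b_p = 1: explored first
    return best_cost, best_x
```

Calling `branch_and_bound(A, y, K)` returns the best cost and its minimizer. Because the bound is weak, this sketch prunes far fewer nodes than a tighter relaxation would.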

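  14. MIP continuous relaxation and $\ell_1$ norm
      $$P_{2/0}:\ \min_{x \in \mathbb{R}^P,\ b \in \{0,1\}^P} \tfrac12 \|y - Ax\|_2^2 \quad \text{s.t.} \quad \sum_p b_p \le K,\ \ \forall p,\ -M b_p \le x_p \le M b_p$$
      $$R_{2/0}:\ \min_{x \in \mathbb{R}^P,\ b \in [0,1]^P} \tfrac12 \|y - Ax\|_2^2 \quad \text{s.t.} \quad \sum_p b_p \le K,\ \ \forall p,\ -M b_p \le x_p \le M b_p$$
      (Figure: feasible values of $x_p$, between $-M$ and $M$, against $b_p$, between 0 and 1, for each problem.)
      We have (footnote 3):
      $$\min R_{2/0} = \min_{x \in \mathbb{R}^P} \tfrac12 \|y - Ax\|_2^2 \quad \text{s.t.} \quad \sum_p |x_p| \le MK,\ \ \forall p,\ |x_p| \le M.$$
      Footnote 3: Proof: for a solution $(x^\star, b^\star)$ of $R_{2/0}$, we have $|x^\star| = M b^\star$...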
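
The slide only hints at the proof. One way to see the equivalence is to eliminate $b$ for a fixed $x$; the following is a sketch of that argument, written with the symbols of the slide.

```latex
% Fix x with |x_p| \le M for all p. In R_{2/0}, b appears only in the constraints:
\[
  b_p \in [0,1], \qquad b_p \ge \frac{|x_p|}{M}, \qquad \sum_p b_p \le K .
\]
% The smallest admissible choice is b_p = |x_p| / M, so a feasible b exists iff
\[
  \sum_p \frac{|x_p|}{M} \le K
  \quad\Longleftrightarrow\quad
  \|x\|_1 \le M K .
\]
% The objective does not depend on b, hence
\[
  \min R_{2/0} \;=\; \min_{x \in \mathbb{R}^P} \tfrac12 \|y - Ax\|_2^2
  \quad \text{s.t.} \quad \|x\|_1 \le MK, \;\; \|x\|_\infty \le M .
\]
```

The choice $b_p = |x_p| / M$ is consistent with the relation $|x^\star| = M b^\star$ quoted in footnote 3.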

  15. Continuous relaxation within the branch-and-bound procedure
      At a given node:
      ◮ $b_{S_0} = 0$ and $x_{S_0} = 0$
      ◮ $b_{S_1} = 1$ and $|x_{S_1}| \le M$
      ◮ $b_S$ free and $|x_S| \le M b_S$
      so that $\sum_p b_p = \mathrm{Card}\, S_1 + \sum_{p \in S} b_p$.
      The relaxed problem at node $i$ reads equivalently:
      $$R^{(i)}_{2/0}:\ \min_{x_S,\ x_{S_1}} \tfrac12 \|y - A_S x_S - A_{S_1} x_{S_1}\|_2^2 \quad \text{s.t.} \quad \|x_S\|_1 \le M (K - \mathrm{Card}\, S_1),\ \ \|x_S\|_\infty \le M,\ \ \|x_{S_1}\|_\infty \le M$$
      ⊲ Least squares, $\ell_1$ norm (partially) and box constraints.
      ⊲ No binary variables!

  16. Optimization with (partial) $\ell_1$-norm and box constraints
      Homotopy continuation principle.
      Standard case [Osborne et al. 2000]:
      $$\min_x \tfrac12 \|y - Ax\|_2^2 + \lambda \|x\|_1$$
      With free variables and box constraints:
      $$\min_{x_S,\ x_{S_1}} \tfrac12 \|y - A_S x_S - A_{S_1} x_{S_1}\|_2^2 + \lambda \|x_S\|_1 \quad \text{s.t.} \quad \|x_S\|_\infty \le M,\ \ \|x_{S_1}\|_\infty \le M$$
      (Figure: solution paths $x^*_p$ against $\lambda$ for both cases, with breakpoints $\lambda^{(0)}, \lambda^{(1)}, \dots$; the box-constrained panel marks the levels $\pm M$ and a value $\lambda^*$.)
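
For the standard case only, the starting point of the path is easy to check numerically: the solution is zero whenever $\lambda \ge \|A^T y\|_\infty$, a standard property of the $\ell_1$-penalized least-squares problem. The sketch below verifies this with plain proximal-gradient (ISTA) iterations, which is a swapped-in solver and not the homotopy method of the slides; `ista`, `soft_threshold` and `lambda_max` are hypothetical helper names, and the box constraints and free variables are not handled.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t*||.||_1 (componentwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam, n_iter=500):
    """ISTA iterations for min_x 0.5*||y - Ax||^2 + lam*||x||_1, started at x = 0."""
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = soft_threshold(x - grad / L, lam / L)
    return x

def lambda_max(A, y):
    """Smallest lam for which x = 0 is optimal in the standard case."""
    return float(np.max(np.abs(A.T @ y)))
```

A homotopy method starts from this value and decreases $\lambda$ from one breakpoint to the next, whereas ISTA has to be rerun for each value of $\lambda$.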

  17. Homotopy continuation
      Similarly solves relaxations for the sparsity-constrained problem:
      $$R^{(i)}_{2/0}:\ \min_{x_S,\ x_{S_1}} \tfrac12 \|y - A_S x_S - A_{S_1} x_{S_1}\|_2^2 \quad \text{s.t.} \quad \|x_S\|_1 \le \tau^\star,\ \ \|x_S\|_\infty \le M,\ \ \|x_{S_1}\|_\infty \le M$$
      and for the error-constrained problem:
      $$R^{(i)}_{0/2}:\ \min_{x_S,\ x_{S_1}} \|x_S\|_1 \quad \text{s.t.} \quad \tfrac12 \|y - A_S x_S - A_{S_1} x_{S_1}\|_2^2 \le \epsilon^\star,\ \ \|x_S\|_\infty \le M,\ \ \|x_{S_1}\|_\infty \le M$$
      (Figure: Pareto curve relating $\|x^*_S\|_1$ and $\tfrac12 \|y - A_{S_1} x^*_{S_1} - A_S x^*_S\|_2^2$, with the points $\lambda^{(0)}, \dots, \lambda^{(4)}$ and the values $\lambda^*$, $\tau^\star$, $\epsilon^\star$ marked.)
