SLIDE 1

Heuristic Optimality Check and Computational Solver Comparison for Basis Pursuit

Andreas M. Tillmann

Research Group Optimization, TU Darmstadt, Germany

joint work with Dirk A. Lorenz (TU Braunschweig) and Marc E. Pfetsch (TU Darmstadt)

ISMP 2012, Berlin, Germany

SLIDE 2

Outline

Motivation
Infeasible-Point Subgradient Algorithm ISAL1
Comparison of ℓ1-Solvers: Testset Construction and Computational Results
Improvements with Heuristic Optimality Check
Possible Future Research

SLIDE 4

Sparse Recovery via ℓ1-Minimization

◮ Seek the sparsest solution to an underdetermined linear system:

  min ‖x‖₀  s.t.  Ax = b   (A ∈ ℝ^{m×n}, m < n)

◮ Finding a minimum-support solution is NP-hard.

◮ Convex "relaxation": ℓ1-minimization / Basis Pursuit:

  min ‖x‖₁  s.t.  Ax = b   (L1)

◮ Several conditions (RIP, Nullspace Property, etc.) ensure "ℓ0-ℓ1-equivalence".

SLIDE 6

Solving the Basis Pursuit Problem

◮ (L1) can be recast as a linear program (see the sketch after this list)
◮ Broad variety of specialized algorithms for (L1):
  ◮ direct or primal-dual approaches
  ◮ regularization, penalty methods
  ◮ further relaxations (e.g., ‖Ax − b‖₂ ≤ δ instead of Ax = b)
  ◮ ...
◮ Which algorithm is "the best"?
◮ A classic algorithm from nonsmooth optimization:
  the (projected) subgradient method – competitive?
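
As a minimal illustration of the LP recast (not part of the talk's software; dimensions and data are made up), one can split x = u − v with u, v ≥ 0, so that ‖x‖₁ = 1ᵀ(u + v), and hand the result to a generic LP solver:

import numpy as np
from scipy.optimize import linprog

# Random test instance: sparse x_true, b := A x_true.
rng = np.random.default_rng(0)
m, n = 20, 50
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[rng.choice(n, size=3, replace=False)] = rng.standard_normal(3)
b = A @ x_true

# min 1^T (u + v)  s.t.  [A, -A][u; v] = b,  u, v >= 0.
res = linprog(c=np.ones(2 * n), A_eq=np.hstack([A, -A]), b_eq=b,
              bounds=(0, None))
x_hat = res.x[:n] - res.x[n:]
print("recovery error:", np.linalg.norm(x_hat - x_true))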

SLIDE 7

Outline

Motivation
Infeasible-Point Subgradient Algorithm ISAL1
Comparison of ℓ1-Solvers: Testset Construction and Computational Results
Improvements with Heuristic Optimality Check
Possible Future Research

SLIDE 9

Projected Subgradient Methods

Problem: min f(x) s.t. x ∈ F   (f, F convex)

Standard projected subgradient iteration:

  x^{k+1} = P_F(x^k − α_k h^k),   α_k > 0, h^k ∈ ∂f(x^k)

Applicability: only reasonable if the projection P_F is "easy".

Idea: replace the exact projection by an approximation.

"Infeasible" subgradient iteration (sketched in code below):

  x^{k+1} = P_F^{ε_k}(x^k − α_k h^k),   ‖P_F^{ε_k}(y) − P_F(y)‖₂ ≤ ε_k
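
The whole iteration fits in a few lines; the following Python sketch is illustrative only, with subgrad and approx_proj as hypothetical user-supplied callables:

import numpy as np

def isa(subgrad, approx_proj, x0, stepsizes, accuracies):
    # Infeasible-point subgradient iteration:
    #   x^{k+1} = P_F^{eps_k}(x^k - alpha_k * h^k)
    x = np.asarray(x0, dtype=float)
    for alpha, eps in zip(stepsizes, accuracies):
        h = subgrad(x)                       # h^k in the subdifferential of f at x^k
        x = approx_proj(x - alpha * h, eps)  # projection onto F, accurate up to eps
    return x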

SLIDE 10

ISA = Infeasible-Point Subgradient Algorithm

◮ ... works for arbitrary convex objectives and constraint sets
◮ ... incorporates adaptive approximate projections P_F^ε
  such that ‖P_F^ε(y) − P_F(y)‖₂ ≤ ε for every ε ≥ 0
◮ ... converges to optimality (under reasonable assumptions)
  whenever the projection inaccuracies (ε_k) are sufficiently small,
  ◮ for stepsizes α_k > 0 with Σ_{k=0}^∞ α_k = ∞ and Σ_{k=0}^∞ α_k² < ∞
  ◮ for dynamic stepsizes α_k = λ_k (f(x^k) − ϕ) / ‖h^k‖₂² with ϕ ≤ ϕ*
◮ ... converges to ϕ with dynamic stepsizes using ϕ ≥ ϕ*

SLIDE 11

ISAL1 = Specialization of ISA to ℓ1-Minimization

◮ f(x) = ‖x‖₁,   F = { x : Ax = b },   sign(x) ∈ ∂‖x‖₁
◮ exact projected subgradient step for (L1):

  x^{k+1} = P_F(x^k − α_k h^k)
          = (x^k − α_k h^k) − A^T (A A^T)^{-1} (A(x^k − α_k h^k) − b)
SLIDE 12

ISAL1 = Specialization of ISA to ℓ1-Minimization

◮ f(x) = ‖x‖₁,   F = { x : Ax = b },   sign(x) ∈ ∂‖x‖₁
◮ exact projected subgradient step for (L1):

  y^k ← x^k − α_k h^k
  z^k ← solution of A A^T z = A y^k − b
  x^{k+1} ← y^k − A^T z^k = P_F(y^k)

A A^T is s.p.d. ⇒ may employ CG to solve the equation system
SLIDE 13

ISAL1 = Specialization of ISA to ℓ1-Minimization

◮ f(x) = ‖x‖₁,   F = { x : Ax = b },   sign(x) ∈ ∂‖x‖₁
◮ inexact projected subgradient step for (L1):

  y^k ← x^k − α_k h^k
  z^k ← approximate solution of A A^T z ≈ A y^k − b
  x^{k+1} ← y^k − A^T z^k = P_F^{ε_k}(y^k)

A A^T is s.p.d. ⇒ may employ CG to solve the equation system

◮ Approximation: stop after a few CG iterations (see the sketch below)
  (CG residual norm ≤ σ_min(A) · ε_k ⇒ P_F^{ε_k} fits the ISA framework)
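
A minimal NumPy/SciPy sketch of this inexact step, assuming a dense A; the fixed CG iteration cap is an illustrative stand-in for the residual-based stopping rule above:

import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def isal1_step(A, b, x, alpha, cg_maxiter=20):
    # One inexact projected subgradient step for min ||x||_1 s.t. Ax = b.
    h = np.sign(x)                  # subgradient of ||x||_1 at x
    y = x - alpha * h               # subgradient step: y^k = x^k - alpha_k h^k
    m = A.shape[0]
    # Approximate projection onto F = {x : Ax = b}:
    # solve A A^T z = A y - b by truncated CG, then x^{k+1} = y - A^T z.
    AAT = LinearOperator((m, m), matvec=lambda v: A @ (A.T @ v))
    z, _ = cg(AAT, A @ y - b, maxiter=cg_maxiter)
    return y - A.T @ z              # equals P_F(y) when CG solves exactly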

SLIDE 14

Why a simple subgradient scheme?

Drawbacks of standard subgradient algorithms can often be alleviated by bundle methods, especially concerning "excessive" parameter tuning. Experiments for (L1) with two bundle method implementations (E. Hübner's and ConicBundle):

◮ approach 1: choose a basis B s.t. A_B is regular; then, with d := A_B^{-1} b and D := A_B^{-1} A_{[n]\B}, (L1) becomes the unconstrained problem min ‖z‖₁ + ‖d − Dz‖₁
◮ approach 2: handle the constraint implicitly by using conditional subgradients
◮ tried various parameter settings (bundle size, periodic restarts)

Surprise: very often, these bundle solvers did not reach a solution (but ISA did).

SLIDE 15

Outline

Motivation
Infeasible-Point Subgradient Algorithm ISAL1
Comparison of ℓ1-Solvers: Testset Construction and Computational Results
Improvements with Heuristic Optimality Check
Possible Future Research

SLIDE 16

Our Testset

◮ 100 matrices A (74 dense, 26 sparse)

  dense:  512 × {1024, 1536, 2048, 4096},  1024 × {2048, 3072, 4096, 8192}
  sparse: 2048 × {4096, 6144, 8192, 12288},  8192 × {16384, 24576, 32768, 49152}

◮ random (e.g., partial Hadamard, random signs, ...)
◮ concatenations of dictionaries (e.g., [Haar, ID, RST], ...)
◮ columns normalized

◮ 4 or 6 vectors x* per matrix such that each resulting (L1) instance (with b := Ax*) has the unique optimum x*

SLIDE 17

Constructing Unique Solutions

548 instances with known, unique solution vectors x*:

◮ For each matrix A, choose a support S which obeys

  ERC(A, S) := max_{j ∉ S} ‖A_S^† a_j‖₁ < 1   (computed in the sketch below)

  1. pick S at random, and
  2. try enlarging S by repeatedly adding the respective arg max
  3. for dense A's, use L1TestPack to construct another unique solution support (via the optimality condition for (L1))

◮ Entries of x*_S random with
  i) high dynamic range (−10^5, 10^5)
  ii) low dynamic range (−1, 1)
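
A hedged sketch of the ERC test, assuming a dense NumPy array A and an index set S (illustrative code, not L1TestPack itself):

import numpy as np

def erc(A, S):
    # ERC(A, S) = max over j not in S of ||A_S^dagger a_j||_1;
    # a value < 1 certifies unique (L1) recovery of solutions supported on S.
    S = list(S)
    pinv_AS = np.linalg.pinv(A[:, S])       # A_S^dagger via pseudoinverse
    off = [j for j in range(A.shape[1]) if j not in set(S)]
    return max(np.linalg.norm(pinv_AS @ A[:, j], 1) for j in off)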

SLIDE 18

Comparison Setup

◮ Only exact solvers for (L1): min ‖x‖₁ s.t. Ax = b
◮ Tested algorithms: ISAL1, SPGL1, YALL1, ℓ1-Magic, SolveBP (SparseLab), ℓ1-Homotopy, CPLEX (Dual Simplex)
◮ Use default settings (black-box usage)
◮ Solution x̄ "optimal" if ‖x̄ − x*‖₂ ≤ 10^{-6}
◮ Solution x̄ "acceptable" if ‖x̄ − x*‖₂ ≤ 10^{-1}

SLIDE 19

Running Time vs. Distance from Unique Optimum

[Scatter plot over the whole testset: running times [sec] vs. ‖x̄ − x*‖₂ for ISAL1, SPGL1, YALL1, CPLEX, ℓ1-Magic, SparseLab/PDCO, and Homotopy]

SLIDE 20

Running Time vs. Distance from Unique Optimum

[Scatter plot over the high dynamic range instances: running times [sec] vs. ‖x̄ − x*‖₂ for the same seven solvers]

SLIDE 21

Running Time vs. Distance from Unique Optimum

[Scatter plot over the low dynamic range instances: running times [sec] vs. ‖x̄ − x*‖₂ for the same seven solvers]

SLIDE 22

Results: Average Performances

◮ CPLEX (Dual Simplex): most reliable solver
◮ SPGL1: apparently very fast, with usually acceptable solutions
◮ ISAL1: mostly very accurate, but pretty slow
◮ SolveBP: mostly produces acceptable solutions, but rather slow
◮ ℓ1-Magic: fast for sparse A, but results often unacceptable when the solution has high dynamic range
◮ YALL1: very fast, but almost always unacceptable for high dynamic range
◮ ℓ1-Homotopy: usually accurate (not always acceptable), not really fast

Can we achieve better performance without changing default (tolerance) settings?

SLIDE 24

New Optimality Check for ℓ1-Minimization

◮ Optimality criterion for (L1):

  x* ∈ arg min { ‖x‖₁ : Ax = b }  ⇔  ∂‖x*‖₁ ∩ Im(A^T) ≠ ∅

  (Frequent) exact evaluation is too expensive.

◮ Heuristic Optimality Check (HOC):

  Estimate the support S of a given x and (approximately) solve A_S^T w = sign(x_S).
  If w is dual-feasible (‖A^T w‖_∞ ≤ 1), compute x̄ from A_S x̄_S = b.
  If x̄ is primal-feasible, it is optimal if b^T w = ‖x̄‖₁.

  Allows safe "jumping" to the optimum (also from infeasible points) – see the sketch below.
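
A minimal sketch of such a check, assuming a dense A; the support threshold and tolerances are illustrative assumptions, not the values from the authors' MATLAB implementation:

import numpy as np

def hoc(A, b, x, supp_tol=1e-4, tol=1e-9):
    # Heuristic Optimality Check: try to jump from an iterate x to an
    # exact optimum of min ||x||_1 s.t. Ax = b; return it, or None.
    S = np.flatnonzero(np.abs(x) > supp_tol * np.max(np.abs(x)))  # support estimate
    if S.size == 0:
        return None
    # Dual candidate: (approximately) solve A_S^T w = sign(x_S).
    w = np.linalg.lstsq(A[:, S].T, np.sign(x[S]), rcond=None)[0]
    if np.max(np.abs(A.T @ w)) > 1 + tol:                   # dual feasibility
        return None
    x_bar = np.zeros_like(x)
    x_bar[S] = np.linalg.lstsq(A[:, S], b, rcond=None)[0]   # A_S x_bar_S = b
    if np.linalg.norm(A @ x_bar - b) > tol:                 # primal feasibility
        return None
    if abs(b @ w - np.linalg.norm(x_bar, 1)) > tol:         # equal objective values
        return None
    return x_bar                                            # certified optimal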

SLIDE 25

Impact of Heuristic Optimality Check (Example)

[Three iteration plots: HOC in ISAL1, HOC in SPGL1, and HOC in ℓ1-Magic, each showing an abrupt jump to the optimum once HOC succeeds]

(red curves: distance to known optimum, blue curves: feasibility violation)

SLIDE 26

Improvements with HOC – ISAL1

ISAL1 without HOC: ISAL1 with HOC:

[Two scatter plots: running times [sec] vs. ‖x̄ − x*‖₂ for ISAL1 without (left) and with (right) HOC]

SLIDE 27

Improvements with HOC – SPGL1

SPGL1 without HOC: SPGL1 with HOC:

[Two scatter plots: running times [sec] vs. ‖x̄ − x*‖₂ for SPGL1 without (left) and with (right) HOC]

SLIDE 28

Improvements with HOC – YALL1

YALL1 without HOC: YALL1 with HOC:

[Two scatter plots: running times [sec] vs. ‖x̄ − x*‖₂ for YALL1 without (left) and with (right) HOC]

SLIDE 29

Improvements with HOC – ℓ1-Magic

ℓ1-Magic without HOC: ℓ1-Magic with HOC:

[Two scatter plots: running times [sec] vs. ‖x̄ − x*‖₂ for ℓ1-Magic without (left) and with (right) HOC]

SLIDE 30

Improvements with HOC – ℓ1-Homotopy

ℓ1-Homotopy without HOC: ℓ1-Homotopy with HOC:

[Two scatter plots: running times [sec] vs. ‖x̄ − x*‖₂ for ℓ1-Homotopy without (left) and with (right) HOC]

SLIDE 31

Improvements with HOC (numbers)

(400+148 instances)      ISAL1     SPGL1    YALL1    ℓ1-Mag.   ℓ1-Hom.
solved faster w/ HOC     399/467   –/0      –/1      41/42     344/403
  high dyn. range        184/212   –/0      –/0      –/0       161/181
  low dyn. range         215/255   –/0      –/1      41/42     183/222
solved only w/ HOC       51        391      196      337       87
  high dyn. range        42        191      13       170       56
  low dyn. range         9         200      183      167       31
improved (ERC-based)     98.3%     88.5%    46.3%    85.0%     92.8%
  high dyn. range        99.5%     88.0%    6.5%     78.0%     91.0%
  low dyn. range         97.0%     89.0%    86.0%    92.0%     94.5%
improved (other part)    38.5%     25.0%    7.4%     25.7%     54.1%
  high dyn. range        36.5%     20.3%    0.0%     18.9%     74.3%
  low dyn. range         40.5%     29.7%    14.9%    32.4%     33.8%

SLIDE 32

Improvements with HOC (numbers)

(400+148 instances)      ISAL1     SPGL1    YALL1    ℓ1-Mag.   ℓ1-Hom.
improved (ERC-based)     98.3%     88.5%    46.3%    85.0%     92.8%
improved (other part)    38.5%     25.0%    7.4%     25.7%     54.1%

Explanation for the higher HOC success rate on the ERC-based testset: under the ERC, w* = (A_{S*}^T)† sign(x*_{S*}) satisfies A^T w* ∈ ∂‖x*‖₁, and the HOC implementation approximates exactly this w = (A_S^T)† sign(x_S) for the estimated support S.

SLIDE 33

HOC: Speed-up and Overhead

Overhead for HOC is usually not too high ⇒ speed-up on average:

                          ISAL1     SPGL1    YALL1     ℓ1-Mag.   ℓ1-Hom.
avg. rel. speed-up        60.10%    15.31%   −57.84%   9.39%     66.70%
  high dynamic range      62.67%    9.01%    −88.80%   7.07%     67.15%
  low dynamic range       57.53%    21.61%   −26.88%   11.72%    66.25%
w/ HOC faster (·/548)     456       375      132       415       452
                          (83.21%)  (68.43%) (24.09%)  (75.73%)  (82.48%)
avg. speed-up if faster   74.59%    25.10%   46.30%    13.09%    81.84%
avg. overhead if slower   11.72%    5.92%    90.88%    1.93%     4.59%

(negative values = average slowdown)

SLIDE 34

Rehabilitation of ℓ1-Homotopy

The Homotopy method provably solves (L1) via a sequence of problems

  min ½ ‖Ax − b‖₂² + λ ‖x‖₁

for parameters λ ≥ 0 decreasing to zero. It is also provably fast for sufficiently sparse solutions.

[Scatter plot: running times [sec] vs. ‖x̄ − x*‖₂ for ℓ1-Homotopy; observed behavior: not fast, inaccurate ?!]

SLIDE 35

Rehabilitation of ℓ1-Homotopy

[Two scatter plots: running times [sec] vs. ‖x̄ − x*‖₂ for ℓ1-Homotopy run to final λ = 0 (left) and final λ = 10^{-9} (right)]

SLIDE 36

Rehabilitation of ℓ1-Homotopy

[Two scatter plots: running times [sec] vs. ‖x̄ − x*‖₂ for ℓ1-Homotopy with HOC, run to final λ = 0 (left) and final λ = 10^{-9} (right)]

SLIDE 37

Possible Future Research

◮ Fine-tune the HOC integration into solvers?
◮ Is HOC helpful in the Approximate Homotopy Path algorithm?
◮ Extensions to denoising problems?
  ◮ P_F^ε for, e.g., F = { x : ‖Ax − b‖₂ ≤ δ }?
  ◮ HOC schemes?
◮ Testsets, solver comparisons (also for implicit matrices), ...?
◮ "Really hard" instances?
  ◮ e.g., based on the construction of Mairal & Yu, for which the (exact) Homotopy path has O(3^n) kinks?
◮ ...

SLIDE 38

SPEAR Project

ISA theory:
Lorenz, Pfetsch & Tillmann: "An Infeasible-Point Subgradient Method Using Adaptive Approximate Projections", 2011

ISAL1 theory, HOC & numerical results:
Lorenz, Pfetsch & Tillmann: "Solving Basis Pursuit: Subgradient Algorithm, Heuristic Optimality Check, and Solver Comparison", 2011/2012

Papers, MATLAB codes (ISAL1, HOC, L1TestPack), testset, slides, posters, etc. are available at:

www.math.tu-bs.de/mo/spear
