SLIDE 1

Optimal approximation for unconstrained non-submodular minimization

Marwa El Halabi, Stefanie Jegelka
CSAIL, MIT
ICML 2020

SLIDE 2

Set function minimization

Goal: select a collection S of items in V that minimizes the cost H(S)

Unconstrained non-submodular minimization Slide 2/ 17

SLIDE 3

Set function minimization in machine learning

[Figure: structured sparse learning (linear model y = A x♮ + ε); batch Bayesian optimization]

Figures from [Mairal et al., 2010, Krause et al., 2008]

SLIDE 4

Set function minimization

Ground set V = {1, · · · , d}, set function H : 2^V → ℝ

min_{S ⊆ V} H(S)

◮ Assume: H(∅) = 0, black-box oracle to evaluate H
◮ NP-hard to approximate in general
◮ Submodularity helps: diminishing returns (DR) property

H(A ∪ {i}) − H(A) ≥ H(B ∪ {i}) − H(B) for all A ⊆ B, i ∉ B

◮ Efficient minimization
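The DR property above is easy to verify by brute force on small ground sets. A minimal Python sketch (the helper names `coverage` and `is_submodular` are ours, for illustration), using a toy coverage function, which is a standard example of a submodular function:

```python
import itertools

def coverage(sets, S):
    """Coverage function: number of universe elements covered by the sets indexed by S."""
    covered = set()
    for i in S:
        covered |= sets[i]
    return len(covered)

def is_submodular(H, V):
    """Brute-force check of diminishing returns:
    H(A ∪ {i}) - H(A) >= H(B ∪ {i}) - H(B) for all A ⊆ B and i ∉ B."""
    elems = list(V)
    for r in range(len(elems) + 1):
        for B in itertools.combinations(elems, r):
            B_set = set(B)
            for q in range(r + 1):
                for A in itertools.combinations(B, q):
                    A_set = set(A)
                    for i in elems:
                        if i in B_set:
                            continue
                        if H(A_set | {i}) - H(A_set) < H(B_set | {i}) - H(B_set):
                            return False
    return True

sets = [{0, 1}, {1, 2}, {2, 3}, {0, 3}]   # toy ground set of 4 subsets
print(is_submodular(lambda S: coverage(sets, S), range(4)))  # → True
```

The same checker returns False for a supermodular function such as S ↦ |S|², whose marginal gains grow rather than diminish.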

SLIDE 8

Set function minimization in machine learning

[Figure: structured sparse learning (y = A x♮ + ε); Bayesian optimization]

H is not submodular, but it is “close” . . .

Figures from [Mairal et al., 2010, Krause et al., 2008]

SLIDE 10

Approximately submodular functions

What if the objective is not submodular, but “close”?

◮ Several works study non-submodular maximization [Das and Kempe, 2011, Bian et al., 2017, Kuhnle et al., 2018, Horel and Singer, 2016, Hassidim and Singer, 2018]
◮ For minimization, only the constrained non-submodular case has been studied [Wang et al., 2019, Bai et al., 2016, Qian et al., 2017, Sviridenko et al., 2017]

SLIDE 12

Approximately submodular functions

Can submodular minimization algorithms extend to such non-submodular functions?


SLIDE 13

Overview of main results

Can submodular minimization algorithms extend to such non-submodular functions? Yes!

◮ First approximation guarantee
◮ Efficient, simple algorithm: the projected subgradient method
◮ Extension to the noisy setting
◮ Matching lower bound showing optimality


SLIDE 14

Weakly DR-submodular functions

H is α-weakly DR-submodular [Lehmann et al., 2006], with α > 0, if

H(A ∪ {i}) − H(A) ≥ α (H(B ∪ {i}) − H(B)) for all A ⊆ B, i ∉ B

◮ H is submodular ⇒ α = 1
◮ Caveat: H should be monotone:
  H(A) ≤ H(B) for A ⊆ B ⇒ α ≤ 1
  H(A) ≥ H(B) for A ⊆ B ⇒ α ≥ 1
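On small ground sets, the ratio α can be computed exhaustively: it is the largest α for which the inequality above holds, i.e. the minimum of the marginal-gain ratios. A minimal brute-force sketch (the helper name `weak_dr_ratio` is ours; it assumes H is non-decreasing with at least one strictly positive marginal gain):

```python
import itertools

def weak_dr_ratio(H, V):
    """Brute-force weak DR-submodularity ratio alpha of a non-decreasing set
    function H: the largest alpha such that
    H(A ∪ {i}) - H(A) >= alpha * (H(B ∪ {i}) - H(B)) for all A ⊆ B, i ∉ B."""
    elems = list(V)
    alpha = float("inf")
    for r in range(len(elems) + 1):
        for B in itertools.combinations(elems, r):
            B_set = set(B)
            for q in range(r + 1):
                for A in itertools.combinations(B, q):
                    A_set = set(A)
                    for i in elems:
                        if i in B_set:
                            continue
                        num = H(A_set | {i}) - H(A_set)
                        den = H(B_set | {i}) - H(B_set)
                        if den > 0:
                            alpha = min(alpha, num / den)
    return alpha

print(weak_dr_ratio(lambda S: len(S) ** 0.5, range(3)))  # → 1.0 (submodular)
print(weak_dr_ratio(lambda S: len(S) ** 2, range(3)))    # → 0.2 (supermodular)
```

The two examples match the slide: a submodular non-decreasing function attains α = 1, while a supermodular one has α < 1.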

SLIDE 18

Problem set-up

min_{S ⊆ V} H(S) := F(S) − G(S)

◮ F and G are both non-decreasing
◮ F is α-weakly DR-submodular
◮ G is β-weakly DR-supermodular
◮ F(∅) = G(∅) = 0


SLIDE 19

What set functions have this form?

min_{S ⊆ V} H(S) := F(S) − G(S)

Objectives in several applications take this form: structured sparse learning, variance reduction in Bayesian optimization, Bayesian A-optimality in experimental design [Bian et al., 2017], and column subset selection [Sviridenko et al., 2017].


SLIDE 20

What set functions have this form?

min_{S ⊆ V} H(S) := F(S) − G(S)

Decomposition result

Given any set function H and α, β ∈ (0, 1] with αβ < 1, we can write H(S) = F(S) − G(S), where
◮ F is non-decreasing and α-weakly DR-submodular
◮ G is non-decreasing and β-weakly DR-supermodular


SLIDE 21

Submodular function minimization

min_{S ⊆ V} H(S) = min_{s ∈ [0,1]^d} h_L(s)   (|V| = d)

h_L is the Lovász extension of H

◮ H is submodular ⇔ its Lovász extension h_L is convex [Lovász, 1983]
◮ Subgradients are easy to compute [Edmonds, 2003]: one sort plus d function evaluations of H
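The greedy computation referenced above fits in a few lines: sort the coordinates of s in decreasing order and take the marginal gains of H along that order. A minimal sketch (function names `lovasz_subgradient` and `lovasz_value` are ours):

```python
def lovasz_subgradient(H, s):
    """Edmonds' greedy algorithm: sort the coordinates of s in decreasing
    order and record the marginal gains of H along that order. The result
    is a subgradient of the Lovasz extension when H is submodular.
    Cost: one sort plus d evaluations of H."""
    d = len(s)
    order = sorted(range(d), key=lambda j: -s[j])
    kappa = [0.0] * d
    prefix, h_prev = set(), H(set())   # H(∅) = 0 by assumption
    for j in order:
        prefix.add(j)
        h_curr = H(prefix)
        kappa[j] = h_curr - h_prev
        h_prev = h_curr
    return kappa

def lovasz_value(H, s):
    """h_L(s) = <kappa, s> with kappa the greedy (sub)gradient at s."""
    return sum(k * x for k, x in zip(lovasz_subgradient(H, s), s))

# Toy submodular function on V = {0, 1, 2}: H(S) = |S| * (3 - |S|)
H = lambda S: len(S) * (3 - len(S))
print(lovasz_value(H, [1.0, 0.0, 0.0]))  # → 2.0, agrees with H({0}) = 2
```

On indicator vectors the extension recovers H exactly, which is a quick sanity check for any implementation.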

SLIDE 23

Non-submodular function minimization

Can we use the same strategy? Almost.

min_{S ⊆ V} H(S) := F(S) − G(S) = min_{s ∈ [0,1]^d} h_L(s) := f_L(s) − g_L(s)   (|V| = d)

◮ The Lovász extension h_L is not convex anymore

Main result

◮ Approximate subgradients are easy to compute (they coincide with subgradients in the submodular case):

(1/α) f_L(s′) − β g_L(s′) ≥ h_L(s) + ⟨κ, s′ − s⟩ for all s′ ∈ [0, 1]^d

◮ H approximately submodular ⇒ h_L is approximately convex

SLIDE 26

Projected subgradient method (PGM)

s_{t+1} = Π_{[0,1]^d}(s_t − η κ_t)   (PGM)

κ_t is an approximate subgradient of h_L at s_t

min_{S ⊆ V} H(S) := F(S) − G(S)

◮ PGM does not need to know α, β, F, G, just H

Approximation guarantee

After T iterations of PGM + rounding, we obtain:

H(Ŝ) ≤ (1/α) F(S*) − β G(S*) + O(1/√T)

◮ The result extends to the noisy oracle setting, where P(|Ĥ(S) − H(S)| ≤ ε) ≥ 1 − δ
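The whole pipeline (PGM on the Lovász extension, then threshold rounding) fits in a short, self-contained sketch. The step size, iteration count, and helper name `pgm_minimize` are illustrative choices of ours, not the talk's exact settings:

```python
def pgm_minimize(H, d, T=200, eta=0.1):
    """Projected subgradient method on the Lovasz extension over [0,1]^d,
    followed by threshold rounding. Only needs oracle access to H
    (with H(∅) = 0); it never sees alpha, beta, F, or G."""
    def subgrad(s):
        # Edmonds' greedy (approximate) subgradient of the Lovasz extension at s
        order = sorted(range(d), key=lambda j: -s[j])
        kappa, prefix, h_prev = [0.0] * d, set(), H(set())
        for j in order:
            prefix.add(j)
            h_curr = H(prefix)
            kappa[j] = h_curr - h_prev
            h_prev = h_curr
        return kappa

    s = [0.5] * d
    best_set, best_val = set(), H(set())
    for _ in range(T):
        kappa = subgrad(s)
        # projected subgradient step, clipped back onto the box [0,1]^d
        s = [min(1.0, max(0.0, s[j] - eta * kappa[j])) for j in range(d)]
        # rounding: keep the best threshold set {j : s_j >= theta} seen so far
        for theta in set(s) | {0.0}:
            S = {j for j in range(d) if s[j] >= theta}
            if H(S) < best_val:
                best_set, best_val = S, H(S)
    return best_set, best_val

# Example: a modular (hence submodular) H with H(∅) = 0, minimized at S = {0, 1}
H = lambda S: len(S ^ {0, 1}) - 2
print(pgm_minimize(H, 4))  # → ({0, 1}, -2)
```

On this toy modular objective the method recovers the exact minimizer; for weakly DR-submodular/supermodular decompositions it attains the (1/α, β) guarantee stated above.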

SLIDE 30

Can we do better?

General set function minimization (in the value oracle model):

min_{S ⊆ V} H(S) := F(S) − G(S)

Inapproximability result

For any δ > 0, no (deterministic or randomized) algorithm achieves

E[H(Ŝ)] ≤ (1/α) F(S*) − β G(S*) − δ

with fewer than exponentially many queries.


SLIDE 31

Experiment: Structured sparse learning

Problem: learn x♮ ∈ ℝ^d, whose support is an interval, from noisy linear Gaussian measurements y = A x♮ + ε, with A ∈ ℝ^{n×d}

min_{S ⊆ V} H(S) := λ F(S) − G(S)

◮ Regularizer: F(S) = d + max(S) − min(S) for S ≠ ∅, F(∅) = 0; α = 1
◮ Loss: G(S) = ℓ(0) − min_{supp(x) ⊆ S} ℓ(x), where ℓ is the least-squares loss; G is β-weakly DR-supermodular with β > 0

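The interval regularizer from this experiment is simple enough to state in code; a minimal sketch (the helper name `interval_penalty` is ours), indexing items as 0, …, d−1:

```python
def interval_penalty(S, d):
    """Interval-sparsity regularizer from the experiment:
    F(S) = d + max(S) - min(S) for nonempty S, and F(∅) = 0.
    Among supports of a given size, it is smallest when S is a
    contiguous interval, so it encourages interval-shaped supports."""
    if not S:
        return 0
    return d + max(S) - min(S)

print(interval_penalty({3, 4, 5}, 10))  # → 12: contiguous interval of length 3
print(interval_penalty({0, 9}, 10))     # → 19: spread-out support costs more
```

Two supports of the same cardinality thus pay very different penalties depending on how spread out they are, which is exactly the structure the experiment seeks to recover.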

SLIDE 32

Experiment: Structured sparse learning

min_{S ⊆ V} H(S) := λ F(S) − G(S)

[Results figure: y = A x♮ + ε with A ∈ ℝ^{n×d}; d = 250, k = 20, σ = 0.01, n = 306]

SLIDE 34

Take home message

Approximate submodularity ⇒ guaranteed tight approximate solutions using efficient convex methods


SLIDE 35

References I

◮ Bai, W., Iyer, R., Wei, K., and Bilmes, J. (2016).

Algorithms for optimizing the ratio of submodular functions. In International Conference on Machine Learning, pages 2751–2759.

◮ Bian, A. A., Buhmann, J. M., Krause, A., and Tschiatschek, S. (2017).

Guarantees for greedy maximization of non-submodular functions with applications. In Proceedings of the 34th International Conference on Machine Learning, volume 70, pages 498–507. JMLR.org.

◮ Das, A. and Kempe, D. (2011).

Submodular meets spectral: Greedy algorithms for subset selection, sparse approximation and dictionary selection. arXiv preprint arXiv:1102.3975.

◮ Edmonds, J. (2003).

Submodular functions, matroids, and certain polyhedra. In Combinatorial Optimization—Eureka, You Shrink!, pages 11–26. Springer.


SLIDE 36

References II

◮ Hassidim, A. and Singer, Y. (2018).

Optimization for approximate submodularity. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, pages 394–405. Curran Associates Inc.

◮ Horel, T. and Singer, Y. (2016).

Maximization of approximately submodular functions. In Lee, D. D., Sugiyama, M., Luxburg, U. V., Guyon, I., and Garnett, R., editors, Advances in Neural Information Processing Systems 29, pages 3045–3053. Curran Associates, Inc.

◮ Krause, A., Singh, A., and Guestrin, C. (2008).

Near-optimal sensor placements in Gaussian processes: Theory, efficient algorithms and empirical studies. Journal of Machine Learning Research, 9(Feb):235–284.


SLIDE 37

References III

◮ Kuhnle, A., Smith, J. D., Crawford, V. G., and Thai, M. T. (2018).

Fast maximization of non-submodular, monotonic functions on the integer lattice. arXiv preprint arXiv:1805.06990.

◮ Lehmann, B., Lehmann, D., and Nisan, N. (2006).

Combinatorial auctions with decreasing marginal utilities. Games and Economic Behavior, 55(2):270–296.

◮ Lovász, L. (1983).

Submodular functions and convexity. In Mathematical Programming: The State of the Art, pages 235–257. Springer.

◮ Mairal, J., Jenatton, R., Bach, F. R., and Obozinski, G. R. (2010).

Network flow algorithms for structured sparsity. In Advances in Neural Information Processing Systems, pages 1558–1566.


SLIDE 38

References IV

◮ Qian, C., Shi, J.-C., Yu, Y., Tang, K., and Zhou, Z.-H. (2017).

Optimizing ratio of monotone set functions. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, IJCAI’17, pages 2606–2612. AAAI Press.

◮ Sviridenko, M., Vondrák, J., and Ward, J. (2017).

Optimal approximation for submodular and supermodular optimization with bounded curvature. Mathematics of Operations Research, 42(4):1197–1218.

◮ Wang, Y.-J., Xu, D.-C., Jiang, Y.-J., and Zhang, D.-M. (2019).

Minimizing ratio of monotone non-submodular functions. Journal of the Operations Research Society of China.
