Nearly Linear-Time Algorithms for Structured Sparsity - PowerPoint PPT Presentation

Nearly Linear-Time Algorithms for Structured Sparsity
Chinmay Hegde, Piotr Indyk, Ludwig Schmidt
MIT

Compressive sensing
• Setup:
  – Data/signal in n-dimensional space: $x$
    E.g., $x$ is a 256x256 image ⇒ $n = 65536$
  – Goal: compress $x$ into $Ax$, where $A$ is an $m \times n$ "measurement" or "sketch" matrix, $m \ll n$
• Goal: recover an "approximation" $x^*$ of a k-sparse $x$ from $Ax + e$, i.e.,
    $\|x^* - x\| \le C \, \|e\|$
• Want:
  – Good compression (small $m = m(k, n)$)
  – Efficient algorithms for encoding and recovery
[Figure: Lichtenstein image-processing test image; schematic of the product $Ax = A \cdot x$]

Bounds
• Want:
  – Good compression (small $m = m(k, n)$):
    $m = O(k \log(n/k))$ [Candes-Romberg-Tao'04, ...]
  – Efficient algorithms for encoding and recovery:
    L1 minimization, CoSaMP, IHT, SMP, ...
• Issue?
  – $\log(n/k)$ penalty compared to non-linear compression
  – Unavoidable in general [Donoho'04, Do Ba-Indyk-Price-Woodruff'10]

Structured sparsity
• Some signals contain more structure than mere sparsity
• Fewer sparsity patterns to worry about

Model-based compressive sensing [Baraniuk-Cevher-Duarte-Hegde'10]
• Idea: structure ⇔ restricted support
• Definition: a structured sparsity model $\mathcal{M}$ is defined by a set of allowed supports $\mathbb{M} = \{\Omega_1, \ldots, \Omega_{a_p}\}$, where $\Omega_i \subseteq [n]$:
    $\mathcal{M} = \{ x \in \mathbb{R}^n \mid \exists\, \Omega_i \in \mathbb{M} : \mathrm{supp}(x) \subseteq \Omega_i \}$
• Only $O(k + \log |\mathbb{M}|)$ measurements suffice if $|\Omega_i| \le k$
• For all models considered in this talk $|\mathbb{M}| = 2^{O(k)}$, so $m = O(k)$
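To see why a small number of allowed supports matters, the following sketch compares the logarithm of the number of unrestricted k-sparse supports with that of one structured family. The choice of family is an assumption made for illustration: supports that form a subtree containing the root of a binary tree, whose count is at most the k-th Catalan number. The function names and the values of n and k are hypothetical.

```python
import math

def log2_sparse_supports(n: int, k: int) -> float:
    """log2 of C(n, k), the number of supports of size exactly k in [n]."""
    return math.log2(math.comb(n, k))

def log2_rooted_subtrees(k: int) -> float:
    """log2 of the k-th Catalan number.

    The Catalan number counts binary trees with k nodes, so it upper-bounds
    the number of k-node subtrees that contain the root of a binary tree.
    """
    catalan = math.comb(2 * k, k) // (k + 1)
    return math.log2(catalan)

# Hypothetical sizes: a 256x256 image and k = 100 nonzero coefficients.
n, k = 65536, 100
print("log2(#k-sparse supports):", round(log2_sparse_supports(n, k), 1))
print("log2(#rooted subtrees)  :", round(log2_rooted_subtrees(k), 1))
```

The Catalan number is at most 4^k, so the second quantity is at most 2k regardless of n, which is the sense in which the number of supports is 2^{O(k)}.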

Model specs
• Recovery algorithm depends on the model $\mathcal{M}$
• Need a model-projection oracle:
    $M(x) = \arg\min_{x' \in \mathcal{M}} \|x' - x\|_2$
• Requirements:
  – The oracle should run fast (e.g., in polynomial or linear time)
  – The oracle must find the best approximation in the model (Necessary???)
• Model-IHT: iterate
    $x^i \leftarrow M(x^{i-1} + A^T (y - A x^{i-1}))$
• Summary: if $|\mathbb{M}| = 2^{O(k)}$ and the oracle $M(\cdot)$ is available, then Model-IHT recovers any $x$ with $\mathrm{supp}(x) \in \mathbb{M}$ from $O(k)$ measurements $Ax$
  – Stable generalizations exist as well
  – Theoretical and empirical improvement
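The Model-IHT iteration can be sketched in a few lines. The block-sparse model used here is an assumption chosen only because its exact projection is easy to write down; it is not one of the models named in the talk. The names `block_projection` and `model_iht` are hypothetical.

```python
import numpy as np

def block_projection(x, block_size, num_blocks):
    """Exact projection onto a block-sparse model (illustrative assumption).

    Splits x into consecutive blocks of length block_size and keeps the
    num_blocks blocks with the largest energy. Assumes num_blocks >= 1 and
    that len(x) is a multiple of block_size.
    """
    blocks = x.reshape(-1, block_size)
    energy = np.sum(blocks ** 2, axis=1)
    keep = np.argsort(energy)[-num_blocks:]
    out = np.zeros_like(blocks)
    out[keep] = blocks[keep]
    return out.reshape(-1)

def model_iht(A, y, project, num_iters=50):
    """Iterate x <- project(x + A^T (y - A x)), starting from x = 0."""
    x = np.zeros(A.shape[1])
    for _ in range(num_iters):
        x = project(x + A.T @ (y - A @ x))
    return x

# Sanity check with A = I, where one step already returns the signal.
x_true = np.array([1.0, 2.0, 0.0, 0.0, 3.0, 0.0])
x_hat = model_iht(np.eye(6), x_true, lambda v: block_projection(v, 2, 2))
print(x_hat)
```

With a genuinely compressive matrix (m < n), convergence depends on the matrix satisfying a restricted isometry condition for the model, which this sketch does not check.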

Why not approximations?
• E.g., why not an approximate model-projection: given $x$, find $x' \in \mathcal{M}$ s.t.
    $\|x - x'\|_2 \le c \, \min_{x'' \in \mathcal{M}} \|x'' - x\|_2$
• Turns out MIHT might not work if $c > 1$!
  – While iterating $x^i \leftarrow M(x^{i-1} + A^T (y - A x^{i-1}))$ we can keep $x^i = 0$ even though the optimal solution is 1-sparse

Our framework [HIS14]
• Approximation-Tolerant Model-Based Compressive Sensing
• Intuition:
  – Consider the oracle $M(x) = \arg\min_{\Omega \in \mathbb{M}} \|x - x_\Omega\|_2$
    • Minimizes the norm of the "tail"
  – Equivalently: $M(x) = \arg\max_{\Omega \in \mathbb{M}} \|x_\Omega\|_2$
    • Maximizes the norm of the "head"
  – However, these two problems are not equivalent if approximation is allowed
  – Our approach: require two separate approximate head and tail oracles...
  – ...and then things work :)
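The exact equivalence of the head and tail problems, and its failure under approximation, can be worked out as follows. The constants $c_T \ge 1$ and $c_H \le 1$ are the usual approximation factors for the tail and head problems, $\varepsilon$ is a hypothetical parameter, and the two cases are illustrative rather than taken from the slides.

```latex
% x_\Omega and x - x_\Omega are supported on disjoint coordinates, so
\begin{align*}
\|x\|_2^2 &= \|x_\Omega\|_2^2 + \|x - x_\Omega\|_2^2
\quad\Longrightarrow\quad
\arg\min_{\Omega} \|x - x_\Omega\|_2 = \arg\max_{\Omega} \|x_\Omega\|_2 .
\end{align*}

% Case 1: a good tail can be a useless head.
% Suppose the best head captures only an \varepsilon fraction of the energy:
\begin{align*}
\max_{\Omega} \|x_\Omega\|_2^2 = \varepsilon \|x\|_2^2
\quad\Longrightarrow\quad
\min_{\Omega} \|x - x_\Omega\|_2 = \sqrt{1 - \varepsilon}\,\|x\|_2 .
\end{align*}
% A support \Omega_0 on which x vanishes has tail \|x\|_2, which is within a
% factor c_T of optimal whenever c_T \ge 1/\sqrt{1 - \varepsilon}, yet its head
% is 0, which is not a c_H-approximation for any c_H > 0.

% Case 2: a good head can be a useless tail.
% If x lies in the model, the optimal tail is 0, so any tail approximation
% must be exact. A head approximation with c_H < 1 may still leave a tail of
\begin{align*}
\|x - x_\Omega\|_2 = \sqrt{\|x\|_2^2 - \|x_\Omega\|_2^2}
\le \sqrt{1 - c_H^2}\,\|x\|_2 ,
\end{align*}
% which can be strictly positive, i.e. an unbounded ratio to the optimal tail.
```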

Our framework ctd.
• Tail-approximation oracle $T(x, p)$
  – $T(x, p) = x_\Omega$ for some $\Omega \in \mathbb{M}$
  – Tail approximation: $\|x - T(x, p)\|_2 \le c_T \, \min_{\Omega \in \mathbb{M}} \|x - x_\Omega\|_2$
• Head-approximation oracle $H(x, p)$
  – $H(x, p) = x_\Omega$ for some $\Omega \in \mathbb{M}$
  – Head approximation: $\|H(x, p)\|_2 \ge c_H \, \max_{\Omega \in \mathbb{M}} \|x_\Omega\|_2$
• Approximation-Tolerant MIHT:
    $x^i \leftarrow T(x^{i-1} + H(A^T (y - A x^{i-1})))$
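A minimal sketch of the approximation-tolerant iteration, with the head and tail oracles passed in as functions. As an assumption for illustration, both oracles are instantiated with exact top-k thresholding for the plain k-sparse model, which satisfies the head and tail guarantees trivially with c_H = c_T = 1. The second oracle argument p from the slide is omitted, and the names `am_iht` and `top_k` are hypothetical.

```python
import numpy as np

def top_k(x, k):
    """Keep the k largest-magnitude entries of x (assumes k >= 1).

    This is the exact projection for the plain k-sparse model, so it is a
    valid head oracle and a valid tail oracle with c_H = c_T = 1.
    """
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]
    out[idx] = x[idx]
    return out

def am_iht(A, y, head, tail, num_iters=50):
    """Iterate x <- tail(x + head(A^T (y - A x))), starting from x = 0."""
    x = np.zeros(A.shape[1])
    for _ in range(num_iters):
        proxy = A.T @ (y - A @ x)   # gradient-style update direction
        x = tail(x + head(proxy))   # head on the proxy, tail on the sum
    return x

# Sanity check with A = I and a 2-sparse signal.
x_true = np.array([0.0, 3.0, 0.0, -2.0, 0.0])
x_hat = am_iht(np.eye(5), x_true, lambda v: top_k(v, 2), lambda v: top_k(v, 2))
print(x_hat)
```

Swapping in a genuinely approximate oracle only requires replacing the two lambdas; the loop itself does not change.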

Approximation-Tolerant M-IHT
Theorem [HIS14]: The iterates of AM-IHT satisfy
    $\|x - x^{i+1}\|_2 \le (1 + c_T) \left( \delta + \sqrt{1 - (c_H (1 - \delta) - \delta)^2} \right) \|x - x^i\|_2$
if $A$ satisfies the model-RIP with constant $\delta$ and $c_T$ and $c_H$ are as on the previous slide.
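To get a feel for when the bound in the theorem is an actual contraction, the factor multiplying the previous error can be evaluated numerically. This is a sketch that assumes the factor has the form (1 + c_T)(δ + sqrt(1 − (c_H(1 − δ) − δ)²)); the function name and the sample values are hypothetical.

```python
import math

def am_iht_contraction(c_t: float, c_h: float, delta: float) -> float:
    """Factor multiplying ||x - x^i||_2 in the AM-IHT bound (assumed form)."""
    alpha0 = c_h * (1.0 - delta) - delta
    radicand = 1.0 - alpha0 ** 2
    if radicand < 0.0:
        raise ValueError("parameters give a negative value under the root")
    return (1.0 + c_t) * (delta + math.sqrt(radicand))

# Exact oracles (c_T = c_H = 1) with a few model-RIP constants.
for delta in (0.0, 0.01, 0.1):
    print(delta, am_iht_contraction(1.0, 1.0, delta))
```

The error shrinks geometrically only when the factor is below 1, which under this form requires a small δ and a head constant c_H close to 1.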

Prior work
Paper                     Oracle       Stable recovery   Assumptions
Blumensath                Additive     No                -
Kyrillidis-Cevher         Head         No                -
Giryes-Elad               Tail         Yes               Bound on singular values of A
Davenport-Needell-Wakin   Head, Tail   Yes               Portion of optimal support identified
This work                 Head, Tail   Yes               None
[Giryes, Needell]: similar approach, somewhat different results

OK, but what does it lead to?
• Constrained Earth-Mover Distance [SAMPTA'13, SODA'14]
  – Columns with similar sparsity patterns
  – Polytime recovery from $O(k)$ measurements
• Tree-sparsity [ISIT'14, ICALP'14]
  – Supports form connected tree components
  – Near-linear time recovery from $O(k)$ measurements
• Clustered sparsity [???'15]
  – Supports form connected components in graphs
  – Near-linear time recovery from $O(k)$ measurements

Tree-Sparsity

Tree-sparsity
• Sparse signals whose large coefficients can be arranged in the form of a rooted, connected tree
• Applications: data streams, imaging, genomics, ...
[Figure panels: Natural images; Piecewise const. signals]

An optimization problem
• Given a signal $x$, compute the optimal (exact) tree-sparse projection of $x$, i.e., solve
    $\min_{|\Omega| \le k,\ \Omega \text{ tree}} \|x - x_\Omega\|_2^2$
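For intuition, the exact problem can be solved by a simple dynamic program over the tree. The sketch below is not the nearly linear-time algorithm of the talk. It assumes the signal is laid out on a complete binary tree in heap order (node i has children 2i+1 and 2i+2) and that the support must contain the root. As written it does at most O(k²) work per node. The name `tree_projection` is hypothetical.

```python
def tree_projection(x, k):
    """Exact tree-sparse projection by dynamic programming (illustrative).

    Returns the sorted support of at most k nodes, connected and containing
    the root (node 0), that maximizes the captured energy sum of x[i]**2.
    Maximizing the head energy is the same as minimizing ||x - x_support||.
    """
    n = len(x)
    neg_inf = float("-inf")
    best = [None] * n    # best[v][j]: max energy of a j-node subtree rooted at v
    split = [None] * n   # split[v][j]: nodes assigned to the left child of v
    for v in range(n - 1, -1, -1):           # children are handled before parents
        left, right = 2 * v + 1, 2 * v + 2
        lt = best[left] if left < n else [0.0]
        rt = best[right] if right < n else [0.0]
        size = min(k, len(lt) + len(rt) - 1)  # largest subtree size usable at v
        b = [0.0] + [neg_inf] * size
        s = [0] * (size + 1)
        for j in range(1, size + 1):
            lo = max(0, j - len(rt))
            hi = min(j - 1, len(lt) - 1)
            for a in range(lo, hi + 1):       # a nodes go left, j - 1 - a go right
                val = x[v] ** 2 + lt[a] + rt[j - 1 - a]
                if val > b[j]:
                    b[j], s[j] = val, a
        best[v], split[v] = b, s

    # Pick the best size at the root, then backtrack to recover the support.
    j_star = max(range(len(best[0])), key=best[0].__getitem__)
    support, stack = [], [(0, j_star)]
    while stack:
        v, j = stack.pop()
        if j == 0:
            continue
        support.append(v)
        a = split[v][j]
        if a > 0:
            stack.append((2 * v + 1, a))
        if j - 1 - a > 0:
            stack.append((2 * v + 2, j - 1 - a))
    return sorted(support)

# Seven-node tree: the large coefficient at node 5 is reachable only via node 2.
print(tree_projection([1.0, 5.0, 0.5, 0.0, 0.0, 4.0, 0.0], 4))
```

The example shows the structural constraint at work: node 2 has a small coefficient but must be included to reach node 5.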

Summary of results
Paper                  Runtime                       Guarantee
Baraniuk-Jones '94     $O(n \log n)$                 ?
Donoho '97             $O(n)$                        ?
Bohanec-Bratko '94     $O(n^2)$                      Exact
Cartis-Thompson '13    $O(nk)$                       Exact
This work              $O(n \log n)$                 Approximate Head
This work              $O(n \log n + k \log^2 n)$    Approximate Tail
Implication: stable recovery of tree-sparse signals from $O(k)$ measurements in time
    #iterations * ($n \log n + k \log^2 n$ + matrix-vector-mult-time)

Why n log n is better than nk
• Consider a 'moderate' problem size
  – e.g., $n = 10^6$, $k$ = 5% of $n$
• Then $nk \approx 50 \times 10^9$ while $n \log n \approx 20 \times 10^6$
• Really need near-linear time
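The arithmetic behind this comparison can be checked directly. The sketch assumes the logarithm is base 2, which is what makes n log n come out near 20 x 10^6; constant factors hidden by the O-notation are ignored.

```python
import math

n = 10 ** 6
k = n * 5 // 100                 # 5% of n, i.e. 50,000

nk = n * k                       # cost scale of an O(nk) method
n_log_n = n * math.log2(n)       # cost scale of an O(n log n) method (base 2 assumed)

print(f"nk       = {nk:.3e}")
print(f"n log n  = {n_log_n:.3e}")
print(f"ratio    = {nk / n_log_n:.0f}")
```

At this problem size the gap is a factor of a few thousand per projection, and it is paid again in every iteration of the recovery algorithm.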
