The Algorithmic Frontiers of Atomic Norm Minimization: Relaxation, Discretization, and Randomization
Benjamin Recht, University of California, Berkeley


  1. The Algorithmic Frontiers of Atomic Norm Minimization: Relaxation, Discretization, and Randomization Benjamin Recht University of California, Berkeley

  2. Linear Inverse Problems
  • Find me a solution of y = Φx, where Φ is n × p with n < p
  • Of the infinite collection of solutions, which one should we pick?
  • Leverage structure: sparsity, rank, smoothness, symmetry
  • How do we design algorithms to solve underdetermined problems with priors?
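The contrast between priors can be seen on a toy instance. The sketch below (Python with NumPy/SciPy; sizes, data, and support are all made up) compares the minimum ℓ2-norm solution of an underdetermined system with the minimum ℓ1-norm (basis pursuit) solution, which encodes a sparsity prior:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, p = 15, 30
Phi = rng.standard_normal((n, p))
x_true = np.zeros(p)
x_true[[2, 11, 25]] = [1.5, -2.0, 1.0]          # sparse ground truth
y = Phi @ x_true

# Minimum l2-norm solution: dense, ignores the sparsity prior
x_l2 = np.linalg.pinv(Phi) @ y

# Minimum l1-norm solution (basis pursuit) as an LP over [x; t]:
#   minimize sum(t)  subject to  -t <= x <= t,  Phi x = y
c = np.concatenate([np.zeros(p), np.ones(p)])
A_ub = np.block([[np.eye(p), -np.eye(p)],
                 [-np.eye(p), -np.eye(p)]])
b_ub = np.zeros(2 * p)
A_eq = np.hstack([Phi, np.zeros((n, p))])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y,
              bounds=[(None, None)] * p + [(0, None)] * p)
x_l1 = res.x[:p]
# x_l1 is typically much sparser than x_l2 and often recovers x_true
```

Both solutions satisfy Φx = y; the ℓ1 solution has the smallest ℓ1 norm among all of them, which is what makes it favor sparse candidates.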

  3. Atomic Decompositions
  model = Σ (weights × atoms)
  • Search for the best linear combination of the fewest atoms
  • “rank” = fewest atoms needed to describe the model

  4. Atomic Norms
  • Given a basic set of atoms A, define the function
    ‖x‖_A = inf { t > 0 : x ∈ t conv(A) }
  • Under mild conditions, we get a norm:
    ‖x‖_A = inf { Σ_{a∈A} |c_a| : x = Σ_{a∈A} c_a a }
  IDEA: minimize ‖z‖_A subject to Φz = y
  • When does this work?
  • How do we solve the optimization problem?
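For a finite atom set, the infimum defining ‖x‖_A is a linear program. A minimal sketch (Python/SciPy; the atom set {±e_i} is chosen as an illustration, since for it the atomic norm reduces to the familiar ℓ1 norm):

```python
import numpy as np
from scipy.optimize import linprog

def atomic_norm(x, atoms):
    """||x||_A = inf { sum_a c_a : x = sum_a c_a * a, c_a >= 0 },
    taken over the columns of `atoms` (a finite, sign-symmetric atom set,
    so nonnegative weights lose no generality)."""
    m = atoms.shape[1]
    res = linprog(np.ones(m), A_eq=atoms, b_eq=x, bounds=[(0, None)] * m)
    return res.fun

p = 5
atoms = np.hstack([np.eye(p), -np.eye(p)])       # atoms {+/- e_i}
x = np.array([1.0, -2.0, 0.0, 0.5, 3.0])
val = atomic_norm(x, atoms)                      # equals ||x||_1 = 6.5
```

Swapping in a different atom matrix (e.g. vectorized rank-one matrices) changes which structure the same LP/optimization template promotes.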

  5. Atomic Norm Minimization
  IDEA: minimize ‖z‖_A subject to Φz = y
  • Generalizes existing, powerful methods
  • Rigorous formula for developing new analysis algorithms
  • Precise, tight bounds on the number of measurements needed for model recovery
  • One algorithm prototype for a myriad of data-analysis applications
  (Chandrasekaran, Recht, Parrilo, and Willsky)

  6. Union of Subspaces
  • X has structured sparsity: a linear combination of elements from a set of subspaces {U_g}
  • Atomic set: unit-norm vectors living in one of the U_g
  Permutations and Rankings
  • X is a sum of a few permutation matrices
  • Examples: multi-object tracking, ranked elections, BCS
  • Convex hull of the permutation matrices: doubly stochastic matrices
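When the subspaces U_g are spanned by disjoint coordinate groups, the atomic norm for this atomic set reduces to the group-ℓ1 norm Σ_g ‖x_g‖₂. A tiny numeric check (Python; the group structure and vector are made-up examples):

```python
import numpy as np

# Disjoint coordinate groups standing in for the subspaces {U_g}
groups = [[0, 1, 2], [3, 4], [5, 6, 7, 8]]
x = np.array([1.0, 2.0, 2.0, 0.0, 0.0, 3.0, 0.0, 4.0, 0.0])

# Atomic norm w.r.t. unit-norm vectors supported on a single group:
# sum over groups of the Euclidean norm of that block
group_norm = sum(np.linalg.norm(x[g]) for g in groups)
# = ||(1,2,2)|| + ||(0,0)|| + ||(3,0,4,0)|| = 3 + 0 + 5 = 8
```

Minimizing this norm zeroes out whole groups at once, which is exactly the structured-sparsity prior the slide describes.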

  7. Moments: convex hull of [1, t, t², t³, t⁴, ...], t ∈ T, some basic set
  • Applications: system identification, image processing, numerical integration, statistical inference
  • Solve with semidefinite programming
  Cut matrices: sums of rank-one sign matrices
  • Applications: collaborative filtering, clustering in genetic networks, combinatorial approximation algorithms
  • Approximate with semidefinite programming
  Low-rank tensors: sums of rank-one tensors
  • Applications: computer vision, image processing, hyperspectral imaging, neuroscience
  • Approximate with alternating least squares

  8. Algorithms
  minimize_z ‖Φz − y‖₂² + μ‖z‖_A
  • Naturally amenable to a projected gradient algorithm:
    z_{k+1} = Π_{ημ}(z_k − η Φ* r_k), with residual r_k = Φz_k − y
    “shrinkage”: Π_τ(z) = argmin_u ½‖z − u‖₂² + τ‖u‖_A
  • Similar algorithm for the atomic norm constraint
  • Same basic ingredients for ALM, ADM, Bregman, Mirror Prox, etc.
  How to compute the shrinkage?
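For the ℓ1 atomic norm the shrinkage Π_τ is plain soft-thresholding, so the iteration above can be sketched directly (Python/NumPy; problem sizes, data, and the weight μ are made up for illustration):

```python
import numpy as np

def soft_threshold(z, tau):
    """Shrinkage Pi_tau for the l1 atomic norm (atoms = +/- e_i)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def proximal_gradient(Phi, y, mu, iters=1000):
    """minimize_z 0.5*||Phi z - y||_2^2 + mu*||z||_1 via
    z_{k+1} = Pi_{eta*mu}(z_k - eta*Phi^T r_k), r_k = Phi z_k - y."""
    eta = 1.0 / np.linalg.norm(Phi, 2) ** 2     # step size 1/L
    z = np.zeros(Phi.shape[1])
    for _ in range(iters):
        r = Phi @ z - y                          # residual
        z = soft_threshold(z - eta * Phi.T @ r, eta * mu)
    return z

rng = np.random.default_rng(0)
Phi = rng.standard_normal((20, 50))
x0 = np.zeros(50)
x0[[3, 17, 41]] = [2.0, -1.5, 1.0]
y = Phi @ x0
z_hat = proximal_gradient(Phi, y, mu=0.1)
```

Swapping `soft_threshold` for a different shrinkage operator (singular-value thresholding, group thresholding, ...) reuses the same loop for other atomic sets, which is the point of the prototype.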

  9. Shrinkage
    Π_τ(z) = argmin_u ½‖z − u‖₂² + τ‖u‖_A
    Λ_τ(z) = argmin_{‖v‖*_A ≤ τ} ½‖z − v‖₂²
    z = Π_τ(z) + Λ_τ(z)
  • Dual norm: ‖v‖*_A = max_{a∈A} ⟨v, a⟩
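The identity z = Π_τ(z) + Λ_τ(z) is the Moreau decomposition: shrinkage plus projection onto the dual-norm ball recovers z. For the ℓ1 norm the dual-norm ball is the ℓ∞ ball, so the identity is easy to verify numerically (Python; the vector z is arbitrary):

```python
import numpy as np

tau = 0.7
z = np.array([2.0, -0.3, 1.1, -4.0, 0.05])

# Pi_tau(z): shrinkage for the l1 atomic norm (soft-thresholding)
shrink = np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)
# Lambda_tau(z): projection onto the dual-norm ball {v : ||v||_inf <= tau}
project = np.clip(z, -tau, tau)

# Moreau decomposition: z = Pi_tau(z) + Lambda_tau(z)
recombined = shrink + project
```

Whichever of the two operators is easier to compute for a given atomic set gives the other one for free via this identity.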

  10. Relaxations
    ‖v‖*_A = max_{a∈A} ⟨v, a⟩
  • The dual norm is efficiently computable if the set of atoms is polyhedral or semidefinite representable
    A₁ ⊂ A₂ ⇒ ‖x‖*_{A₁} ≤ ‖x‖*_{A₂} and ‖x‖_{A₂} ≤ ‖x‖_{A₁}
  • Convex relaxations of the atoms yield approximations to the norm (NB: sample complexity increases)
  • A hierarchy of relaxations based on θ-bodies yields progressively tighter bounds on the atomic norm

  11. Theta Bodies
  • Suppose A is an algebraic variety: A = { x : f(x) = 0 ∀ f ∈ I }
    ‖v‖*_A = max_{a∈A} ⟨v, a⟩ ≤ τ  ⟺  τ − ⟨v, a⟩ = q(a), where q = h + g with h(x) ≥ 0 for all x (positive everywhere) and g ∈ I (vanishes on the atoms)
  • Relaxation: restrict h to be a sum of squares
  • Gives a lower bound on the atomic norm
  • Solvable by semidefinite programming (Gouveia, Parrilo, and Thomas, 2010)

  12. Approximation through Discretization
  • Relaxations: A₁ ⊂ A₂ ⇒ ‖x‖_{A₂} ≤ ‖x‖_{A₁}
  • Let A_ε ⊂ A be a finite net, and let Ψ be a matrix whose columns are the set A_ε
    ‖x‖_{A_ε} = inf { Σ_k |c_k| : x = Ψc }  (an ℓ₁ problem, plus extra equality constraints)
  • Oftentimes we can compute explicit bounds λ_ε such that λ_ε ‖x‖_{A_ε} ≤ ‖x‖_A ≤ ‖x‖_{A_ε}

  13. Discretization Theory
  • Discretize the parameter space to get a finite number of grid points
  • Enforce a finite number of constraints: |⟨Φ*z, a(ω_j)⟩| ≤ 1, ω_j ∈ Ω_m
  • Equivalently, in the primal, replace the atomic norm with a discrete one
  • What happens to the solutions as the grid is refined?
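Enumerating the discretized dual constraints is mechanical once the grid Ω_m is fixed. A sketch for trigonometric-moment atoms a(ω) = [cos(ωk)]_k (Python; Φ, z, the atom family, and the grid size are made-up stand-ins for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 16, 256
Phi = rng.standard_normal((8, n))          # made-up measurement operator
z = rng.standard_normal(8)                 # made-up dual variable

omega = np.linspace(0.0, np.pi, m)                 # grid Omega_m
atoms = np.cos(np.outer(omega, np.arange(n)))      # row j is a(omega_j)
vals = atoms @ (Phi.T @ z)                         # <Phi* z, a(omega_j)> for each j

grid_dual_norm = np.max(np.abs(vals))      # discretized dual norm of Phi* z
feasible = grid_dual_norm <= 1.0           # the m dual constraints at once
```

Refining the grid (larger m) only adds constraints, so the discretized dual norm increases toward the true dual norm, which is the mechanism behind the convergence result on the next slide.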

  14. Convergence in Dual
  • Assumption: there exist parameters ω₁, …, ω_m such that the corresponding atoms are linearly independent
  • Enforce finite constraints in the dual: |⟨Φ*z, a(ω_j)⟩| ≤ 1, ω_j ∈ Ω_m
  Theorem:
  • The discretized optimal objectives converge to the original objective
  • Any solution sequence {ẑ_m} of the discretized problems has a subsequence that converges to the solution set of the original problem
  • For the LASSO dual, the convergence speed is O(ρ^m)
  [Plots: log₂|f_m − f*| vs. log₂(m), and dual solutions ẑ_m for m = 32, 128, 512]

  15. Single Molecule Imaging Courtesy of Zhuang Research Lab

  16. Single Molecule Imaging
  • Bundles of 8 tubes of 30 nm diameter
  • Sparse density: 81,049 molecules over 12,000 frames
  • Resolution: 64×64 pixels; pixel size: 100 nm × 100 nm
  • Field of view: 6400 nm × 6400 nm; target resolution: 10 nm × 10 nm
  • Discretize the FOV into 640×640 pixels
  • Image model: I(x, y) = Σ_j c_j PSF(x − x_j, y − y_j), with (x_j, y_j) ∈ [0, 6400]² and (x, y) ∈ {50, 150, ..., 6350}²
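The image model on this slide can be simulated directly. A toy renderer (Python/NumPy; the Gaussian PSF, its width, and the molecule positions are assumptions for illustration, not the microscope's actual PSF or data):

```python
import numpy as np

fov = 6400.0                                     # field of view in nm
centers = np.arange(50.0, fov, 100.0)            # pixel centers {50, 150, ..., 6350}
xs, ys = np.meshgrid(centers, centers)           # 64 x 64 sampling grid

def render(points, weights, sigma=150.0):
    """I(x, y) = sum_j c_j * PSF(x - x_j, y - y_j), Gaussian PSF assumed."""
    img = np.zeros_like(xs)
    for (px, py), c in zip(points, weights):
        img += c * np.exp(-((xs - px) ** 2 + (ys - py) ** 2) / (2 * sigma ** 2))
    return img

# two hypothetical molecules with made-up positions and brightnesses
img = render([(1200.0, 3400.0), (5000.0, 800.0)], [1.0, 0.7])
```

Recovering the off-grid positions (x_j, y_j) and weights c_j from such an image is exactly the discretized atomic-norm problem of the previous slides, with the PSF translates as atoms.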

  17. Single Molecule Imaging

  18. Single Molecule Imaging
  [Plots: Precision, Recall, F-score, and Jaccard index vs. radius (0–50), comparing Sparse, CoG, and quickPALM]

  19. Atomic norms in sparse approximation
  • Greedy approximations: ‖f − f_n‖_{L₂} ≤ c₀ ‖f‖_A / √n
  • f_n: best n-term approximation to a function f in the convex hull of A
  • Maurey, Jones, and Barron (1980s–90s)
  • DeVore and Temlyakov (1996)
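The O(1/√n) rate is achieved by greedy updates of Frank–Wolfe type: at each step, add the atom best correlated with the residual and average it in. A sketch over a random finite dictionary (Python/NumPy; the dictionary, target, and step rule are made up, with the classical 2/(k+1) step):

```python
import numpy as np

rng = np.random.default_rng(1)
d, m = 50, 400
A = rng.standard_normal((d, m))
A /= np.linalg.norm(A, axis=0)                 # unit-norm atom dictionary
w = np.zeros(m)
w[:5] = rng.dirichlet(np.ones(5))              # so f lies in conv(A)
f = A @ w

g = np.zeros(d)
errs = []
for k in range(1, 201):
    atom = A[:, np.argmax(A.T @ (f - g))]      # greedy atom: best correlation
    step = 2.0 / (k + 1)                       # classical Frank-Wolfe step
    g = (1 - step) * g + step * atom           # n-term convex combination
    errs.append(np.linalg.norm(f - g))
# errs decays on the order of c0 * ||f||_A / sqrt(n)
```

After n steps, g is a convex combination of at most n atoms, so the recorded errors trace out exactly the n-term approximation bound on the slide.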

  20. If greedy is hard…
  • Training these networks is hard
  • But for fixed θ_k, the following can be feasible
  • Can we just not optimize the θ_k?
  • What if we randomly sample the parameters?

  21.
  • Fix parameterized basis functions
  • Fix a probability distribution
  • Our target space will be:
  • Example: Fourier basis functions with Gaussian parameters
  • In the Fourier example, membership in the target space means that the frequency distribution of f has subgaussian tails

  22.
  • Theorem 1: Let f be in the target space, and let ω₁, …, ω_n be sampled i.i.d. from p. Then the error bound holds with probability at least 1 − δ
  • Generalization error = estimation error + approximation error
  • It’s a finite-sized basis set!
  • Choosing n appropriately gives overall convergence

  23.
% Approximates Gaussian process regression
% with a Gaussian kernel of variance gamma
% lambda: regularization parameter
% dataset: X is d x N, y is 1 x N
% test: xtest is d x 1
% D: dimensionality of the random features

% training
w = randn(D, size(X,1));
b = 2*pi*rand(D,1);
Z = cos(sqrt(gamma)*w*X + repmat(b,1,size(X,2)));
% Equivalent to
% alpha = (lambda*eye(D) + Z*Z')\(Z*y(:));
alpha = symmlq(@(v)(lambda*v(:) + Z*(Z'*v(:))), ...
               Z*y(:), 1e-6, 2000);

% testing
ztest = alpha(:)'*cos(sqrt(gamma)*w*xtest(:) + b);

  24.
  • Relaxation: hierarchies for approximating complex priors via semidefinite programming
  • Discretization: fast convergence in distribution for models that admit tight discretizations
  • Randomization: efficient, practical algorithms for greedy methods
  • Challenge: integrate these ideas into fast, greedy algorithms
