The Algorithmic Frontiers of Atomic Norm Minimization: Relaxation, Discretization, and Randomization
Benjamin Recht University of California, Berkeley
Linear Inverse Problems

Find me a solution of $y = \Phi x$, where $\Phi$ is $n \times p$ with $n < p$. The system is underdetermined, so which solution should we pick? Regularize the problem with priors: sparsity, rank, smoothness, symmetry.

Build structured models from atoms: write $x = \sum_a c_a a$, a weighted combination of a few atoms $a$ with model weights $c_a$ (e.g., signed canonical basis vectors for sparse models, rank-one matrices for low-rank models).
The atomic norm induced by $\mathcal{A}$:

$$\|x\|_{\mathcal{A}} = \inf\Big\{ \sum_{a \in \mathcal{A}} |c_a| \,:\, x = \sum_{a \in \mathcal{A}} c_a a \Big\} = \inf\{\, t > 0 \,:\, x \in t\,\mathrm{conv}(\mathcal{A}) \,\}$$
IDEA (Chandrasekaran, R, Parrilo, and Willsky): recover the model by solving

$$\text{minimize}\ \|z\|_{\mathcal{A}} \quad \text{subject to}\ \Phi z = y.$$

Given an atomic set $\mathcal{A}$, we need algorithms to solve this optimization problem, analysis of the number of measurements needed for model recovery, and applications.
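For the simplest atomic set, $\mathcal{A} = \{\pm e_i\}$, the atomic norm is the $\ell_1$ norm and atomic norm minimization becomes basis pursuit. A minimal MATLAB sketch (the problem sizes and data are made up for illustration; linprog requires the Optimization Toolbox):

% Basis pursuit: minimize ||z||_1 subject to Phi*z = y,
% written as a linear program in (z, t) with -t <= z <= t.
n = 10; p = 40;                                   % made-up sizes, n < p
Phi = randn(n, p);
z0 = zeros(p,1); z0(randperm(p,3)) = randn(3,1);  % sparse ground truth
y = Phi*z0;
f = [zeros(p,1); ones(p,1)];                      % objective: sum(t)
Aineq = [eye(p), -eye(p); -eye(p), -eye(p)];      % z <= t and -z <= t
bineq = zeros(2*p,1);
Aeq = [Phi, zeros(n,p)];                          % equality: Phi*z = y
sol = linprog(f, Aineq, bineq, Aeq, y);
zhat = sol(1:p);                                  % typically recovers z0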
Many other atomic sets fit this framework: atoms drawn from a set of subspaces $\{U_g\}$; moment atoms $a(t) = [1, t, t^2, \dots]$, $t \in T$, some basic set. Applications range from numerical integration and statistical inference to networks, combinatorial approximation algorithms, integer programming, hyperspectral imaging, and neuroscience.
Algorithms

To solve the regularized least-squares problem

$$\text{minimize}_z\ \tfrac{1}{2}\|\Phi z - y\|_2^2 + \mu \|z\|_{\mathcal{A}},$$

use proximal gradient type methods (Mirror Prox, etc.):

$$z_{k+1} = \Pi_{\eta\mu}(z_k - \eta\,\Phi^* r_k), \qquad r_k = \Phi z_k - y \ \text{(the residual)},$$

where the "shrinkage" (proximal) operator is

$$\Pi_\tau(z) = \arg\min_u\ \tfrac{1}{2}\|z - u\|^2 + \tau\|u\|_{\mathcal{A}}.$$

How do we compute the shrinkage? By Moreau decomposition: defining the projection

$$\Lambda_\tau(z) = \arg\min_{\|v\|^*_{\mathcal{A}} \le \tau}\ \tfrac{1}{2}\|z - v\|^2,$$

we have $z = \Pi_\tau(z) + \Lambda_\tau(z)$, where the dual atomic norm is

$$\|v\|^*_{\mathcal{A}} = \max_{a \in \mathcal{A}} \langle v, a \rangle.$$

So computing the shrinkage is exactly as hard as projecting onto a scaled dual-norm ball.
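A minimal numerical check of the Moreau identity in the $\ell_1$ case (my illustration: for atoms $\{\pm e_i\}$ the dual norm is $\ell_\infty$, so $\Lambda_\tau$ is clipping to $[-\tau, \tau]$ and $\Pi_\tau$ is soft thresholding):

% Moreau decomposition z = Pi_tau(z) + Lambda_tau(z) for the l1 norm.
tau = 0.5;
z = randn(8,1);
Lambda = max(min(z, tau), -tau);         % project onto the tau l-inf ball
Pi = sign(z).*max(abs(z) - tau, 0);      % shrinkage = soft thresholding
assert(norm(z - (Pi + Lambda)) < 1e-12)  % the identity holds exactly

So whenever we can project onto the dual-norm ball, one proximal gradient step costs a matrix multiply plus a projection: $z_{k+1} = g - \Lambda_{\eta\mu}(g)$ with $g = z_k - \eta\,\Phi^* r_k$.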
Relaxation

If the convex hull of the atoms is polyhedral or semidefinite representable, we can compute the norm directly. Otherwise, relax: hierarchies of relaxations give progressively tighter bounds on the atomic norm. Since the dual norm is

$$\|v\|^*_{\mathcal{A}} = \max_{a \in \mathcal{A}} \langle v, a \rangle,$$

enlarging the atomic set can only grow the dual norm and shrink the primal norm:

$$\mathcal{A}_1 \subset \mathcal{A}_2 \implies \|x\|^*_{\mathcal{A}_1} \le \|x\|^*_{\mathcal{A}_2} \ \text{ and } \ \|x\|_{\mathcal{A}_2} \le \|x\|_{\mathcal{A}_1}.$$

NB! relaxing the atomic set means the sample complexity increases.
Certifying the dual norm: $\|v\|^*_{\mathcal{A}} = \max_{a \in \mathcal{A}} \langle v, a \rangle \le \tau$ if and only if $q(a) := \tau - \langle v, a \rangle \ge 0$ for all $a \in \mathcal{A}$. When the atomic set is algebraic, $\mathcal{A} = \{x : f(x) = 0\ \forall f \in I\}$ for an ideal $I$, theta bodies (Gouveia, Parrilo, and Thomas, 2010) certify this by writing

$$q = h + g, \qquad h(x) \ge 0\ \forall x \ \text{(a sum of squares: positive everywhere)}, \qquad g \in I \ \text{(vanishes on the atoms)}.$$

Bounding the degrees of $h$ and $g$ yields a hierarchy of semidefinite relaxations.
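A tiny worked instance of this decomposition (my own illustration under the definitions above): take $\mathcal{A} = \{-1, +1\} \subset \mathbb{R}$, so $I = \langle x^2 - 1 \rangle$ and $\|v\|^*_{\mathcal{A}} = |v|$. For $\tau > 0$ and $|v| \le \tau$,

$$\tau - vx \;=\; \underbrace{\tfrac{\tau}{2}\Big(x - \tfrac{v}{\tau}\Big)^2 + \tfrac{\tau}{2}\Big(1 - \tfrac{v^2}{\tau^2}\Big)}_{h(x),\ \text{a sum of squares}}\ \underbrace{-\ \tfrac{\tau}{2}\big(x^2 - 1\big)}_{g(x)\ \in\ I},$$

and a decomposition with $\deg h \le 2$ exists exactly when $|v| \le \tau$, so the first level of the hierarchy already computes this dual norm exactly.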
Discretization

Approximate the atomic norm with a discrete one: pick a finite subset of atoms $\mathcal{A}_\epsilon \subset \mathcal{A}$ supported on a finite number of grid points, and recall that $\mathcal{A}_1 \subset \mathcal{A}_2 \implies \|x\|_{\mathcal{A}_2} \le \|x\|_{\mathcal{A}_1}$. Collecting the gridded atoms as the columns of a matrix $\Psi$,

$$\|x\|_{\mathcal{A}_\epsilon} = \inf\Big\{ \sum_k |c_k| \,:\, x = \Psi c \Big\}, \qquad \lambda_\epsilon \|x\|_{\mathcal{A}_\epsilon} \le \|x\|_{\mathcal{A}} \le \|x\|_{\mathcal{A}_\epsilon},$$

so the discretized problem is an $\ell_1$ problem (+ extra equality constraints).
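A concrete sketch of the discretized problem (the atoms, grid size, and the simple iterative soft-thresholding solver are my assumptions for illustration): for line-spectral atoms $a(\omega) = n^{-1/2}[1, e^{i\omega}, \dots, e^{i(n-1)\omega}]^T$, grid $m$ candidate frequencies and solve the resulting $\ell_1$ problem:

% Gridded atomic-norm recovery for line-spectral atoms via ISTA.
n = 64; m = 512;                        % signal length, grid size
omega = 2*pi*(0:m-1)/m;                 % frequency grid Omega_m
Psi = exp(1i*(0:n-1)'*omega)/sqrt(n);   % columns are the gridded atoms
x = Psi(:,[37 205 411])*[1; 0.5; 2];    % made-up 3-atom signal
mu = 0.05; eta = 1/norm(Psi)^2; c = zeros(m,1);
for k = 1:1000
    g = c - eta*(Psi'*(Psi*c - x));                 % gradient step
    c = max(abs(g) - eta*mu, 0).*exp(1i*angle(g));  % complex soft threshold
end
% the few large entries of abs(c) localize the frequencies on the grid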
In the dual, discretization enforces the constraint $|\langle \Phi^* z, a(\omega_j) \rangle| \le 1$ only at the grid points $\omega_j \in \Omega_m$, chosen such that the corresponding atoms are linearly independent.

Theorem: As the grid is refined, the optimal values of the discretized problems converge to the original objective, and the sequence of discretized solutions $\{\hat z_m\}$ has a subsequence that converges to the solution set of the original problem. The convergence speed is $O(\rho_m)$.
[Figure: recovered solutions for grid sizes m = 32, 128, 512, with $\log_2(|f_m - f^*|)$ and $\log_2(\|\hat z_m - \hat z\|)$ plotted against $\log_2(m)$.]
Application: single-molecule imaging (data courtesy of the Zhuang Research Lab), 12000 frames of pixel measurements. Each frame is a superposition of point spread functions centered at the off-grid fluorophore locations:
$$I(x, y) = \sum_j c_j\, \mathrm{PSF}(x - x_j,\, y - y_j), \qquad (x_j, y_j) \in [0, 6400]^2, \quad (x, y) \in \{50, 150, \dots, 6350\}^2.$$
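A small simulation of this forward model (the Gaussian PSF and its width are my assumptions; the real point spread function is instrument-specific):

% Off-grid point sources observed on the coarse pixel grid through a PSF.
sigma = 250;                               % assumed PSF width, grid units
[gx, gy] = meshgrid(50:100:6350);          % pixel centers (x, y)
pos = 6400*rand(10,2); cw = rand(10,1);    % 10 sources (xj, yj), weights cj
I = zeros(size(gx));
for j = 1:10
    I = I + cw(j)*exp(-((gx - pos(j,1)).^2 + (gy - pos(j,2)).^2)/(2*sigma^2));
end
% I(x,y) = sum_j c_j PSF(x - xj, y - yj), sampled at the pixel centers

Localizing the sources from I is then exactly a gridded (or gridless) atomic-norm problem over candidate positions $(x_j, y_j)$.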
[Figure: precision, recall, Jaccard index, and F-score versus radius for Sparse, CoG, and quickPALM.]
Randomization

If $f$ lies in the convex hull of $\mathcal{A}$ (suitably scaled), then there is an $n$-term combination of atoms $f_n$ with

$$\|f - f_n\|_{L_2} \le \frac{c_0\,\|f\|_{\mathcal{A}}}{\sqrt{n}}.$$

For Fourier atoms, a finite atomic norm means that the frequency distribution of $f$ has subgaussian tails. Let $\omega_1, \dots, \omega_n$ be sampled i.i.d. from $p$. Then the total error splits into an estimation error plus an approximation error. This yields the random-features approximation to kernel regression:
% Approximates Gaussian process regression
% with a Gaussian kernel of variance gamma
% lambda:  regularization parameter
% dataset: X is d x N, y is 1 x N
% test:    xtest is d x 1
% D:       dimensionality of the random feature map

% training
w = randn(D, size(X,1));                            % random frequencies
b = 2*pi*rand(D,1);                                 % random phases
Z = cos(sqrt(gamma)*w*X + repmat(b,1,size(X,2)));   % D x N feature matrix
% Equivalent to
%   alpha = (lambda*eye(D) + Z*Z')\(Z*y(:));
alpha = symmlq(@(v)(lambda*v(:) + Z*(Z'*v(:))), ...
               Z*y(:), 1e-6, 2000);
% testing
ztest = alpha(:)'*cos(sqrt(gamma)*w*xtest(:) + b);
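A note on the solver choice in this sketch: the commented-out backslash line would form the D-by-D matrix lambda*eye(D) + Z*Z' explicitly, while the symmlq call solves the same symmetric positive definite system matrix-free, touching Z only through matrix-vector products, which scales better when D is large. At test time the same random w and b drawn during training must be reused, and the phase b is added as a single vector since xtest is one point.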
Summary

Three algorithmic frontiers: relaxation encodes priors via semidefinite programming; discretization applies to models that admit tight discretizations, reducing them to $\ell_1$ problems; and randomization yields methods with practical algorithms by sampling atoms at random.