MADMM: a generic algorithm for non-smooth optimization on manifolds
Michael Bronstein
Faculty of Informatics, University of Lugano, Switzerland
Perceptual Computing Group, Intel Corporation, Israel
Louvain-la-Neuve, 25 September 2015
$\min_{X \in \mathbb{R}^{n \times m}} f(X) \quad \text{s.t.} \quad X \in \mathcal{M}$
[1] Zhang, Fletcher 2013; [2] Boufounos, Baraniuk 2008; [3] Ten Berghe 1977; [4] Sun et al.
[12] Cayton, Dasgupta 2006; [13] Absil, Gallivan 2006; [14] Kleinsteuber, Shen 2012.
$\min_{x \in \mathbb{R}^n} x^\top A x$
$\min_{X \in \mathcal{M}} f(X)$
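As one illustration of optimization over a manifold, the sketch below runs Riemannian gradient descent for a quadratic objective $x^\top A x$ on the unit sphere. The sphere constraint, the step size, the iteration count, and the function name `sphere_gradient_descent` are assumptions made for this example; they are not taken from the slides.

```python
import numpy as np

def sphere_gradient_descent(A, x0, step=0.1, iters=500):
    """Minimize x^T A x over the unit sphere (assumed manifold for this sketch)."""
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        egrad = 2.0 * A @ x                 # Euclidean gradient of x^T A x (A symmetric)
        rgrad = egrad - (x @ egrad) * x     # project onto the tangent space at x
        x = x - step * rgrad                # move along the negative Riemannian gradient
        x = x / np.linalg.norm(x)           # retract back onto the sphere
    return x

# Hypothetical test matrix with known smallest eigenvalue 1.
A = np.diag([1.0, 2.0, 3.0])
x = sphere_gradient_descent(A, np.ones(3))
value = float(x @ A @ x)
```

On this toy matrix the iterate approaches the eigenvector of the smallest eigenvalue, so `value` approaches 1.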
$\min_{X \in \mathcal{M}} f(X) + g(AX)$
$\min_{X \in \mathcal{M},\, Z} f(X) + g(Z) \quad \text{s.t.} \quad Z = AX$
$\min_{X \in \mathcal{M},\, Z} f(X) + g(Z) + \tfrac{\rho}{2} \| AX - Z + U \|_F^2$
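$X^{(k+1)} = \arg\min_{X \in \mathcal{M}} f(X) + \tfrac{\rho}{2} \| AX - Z^{(k)} + U^{(k)} \|_F^2$
$Z^{(k+1)} = \arg\min_{Z} g(Z) + \tfrac{\rho}{2} \| AX^{(k+1)} - Z + U^{(k)} \|_F^2$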
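The alternating scheme can be sketched as a short loop. This is a minimal illustration, not the authors' implementation: the dual update `U = U + AX - Z` is the standard scaled-form ADMM step and is assumed here, and the test instance (a least-squares term on the unit sphere with an L1 penalty and A = I) as well as all function names are hypothetical.

```python
import numpy as np

def soft_threshold(V, t):
    """Prox of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(V) * np.maximum(np.abs(V) - t, 0.0)

def madmm(x_step, z_step, A, X0, rho=1.0, iters=100):
    """Scaled-form ADMM loop with a user-supplied manifold X-step."""
    X = X0
    Z = A @ X
    U = np.zeros_like(Z)
    for _ in range(iters):
        X = x_step(Z, U, rho)        # argmin over M of f(X) + rho/2 ||AX - Z + U||_F^2
        Z = z_step(A @ X + U, rho)   # argmin over Z of g(Z) + rho/2 ||AX - Z + U||_F^2
        U = U + A @ X - Z            # dual update (assumed: standard scaled ADMM)
    return X, Z, U

# Hypothetical instance: f(x) = 0.5 ||x - c||^2 on the unit sphere, g = mu ||.||_1, A = I.
c = np.array([3.0, 0.1, -2.0])
mu = 0.5

def x_step(Z, U, rho):
    v = c + rho * (Z - U)            # on the sphere the X-step maximizes <x, v>
    return v / np.linalg.norm(v)

def z_step(V, rho):
    return soft_threshold(V, mu / rho)

X, Z, U = madmm(x_step, z_step, np.eye(3), c / np.linalg.norm(c))
```

For this instance the X-step has a closed form; in general it calls for a manifold optimization solver.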
$\min_{\Phi \in \mathbb{R}^{n \times k}} \operatorname{tr}(\Phi^\top \Delta \Phi) \quad \text{s.t.} \quad \Phi^\top \Phi = I$
The objective $\operatorname{tr}(\Phi^\top \Delta \Phi)$ is a.k.a. the Dirichlet energy in physics
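A small numerical check of this problem: for a symmetric $\Delta$, the first $k$ eigenvectors are orthonormal and attain the minimum of the trace objective, which then equals the sum of the $k$ smallest eigenvalues. The path-graph Laplacian and the sizes `n = 100`, `k = 6` are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def path_laplacian(n):
    """Graph Laplacian of a path with n vertices (illustrative choice of Delta)."""
    L = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    L[0, 0] = L[-1, -1] = 1.0
    return L

n, k = 100, 6
Delta = path_laplacian(n)
eigvals, eigvecs = np.linalg.eigh(Delta)    # eigenvalues in ascending order
Phi = eigvecs[:, :k]                        # first k eigenvectors, orthonormal columns
dirichlet_energy = np.trace(Phi.T @ Delta @ Phi)
```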
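$\min_{\Phi \in \mathbb{R}^{n \times k}} \operatorname{tr}(\Phi^\top \Delta \Phi) + \mu \|\Phi\|_1 \quad \text{s.t.} \quad \Phi^\top \Phi = I$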
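$\min_{\Phi \in \mathbb{R}^{n \times k}} \operatorname{tr}(\Phi^\top \Delta \Phi) + \mu \|\Phi\|_1 \quad \text{s.t.} \quad \Phi^\top \Phi = I$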
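$\min_{\Phi, P, Q \in \mathbb{R}^{n \times k}} \operatorname{tr}(\Phi^\top \Delta \Phi) + \mu \|Q\|_1 \quad \text{s.t.} \quad \Phi = Q,\ \Phi = P,\ P^\top P = I$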
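$\Phi^{(k+1)} = \arg\min_{\Phi} \operatorname{tr}(\Phi^\top \Delta \Phi) + \tfrac{\rho}{2} \|\Phi - Q^{(k)} + U^{(k)}\|_F^2 + \tfrac{\rho'}{2} \|\Phi - P^{(k)} + V^{(k)}\|_F^2$
$Q^{(k+1)} = \arg\min_{Q} \mu \|Q\|_1 + \tfrac{\rho}{2} \|\Phi^{(k+1)} - Q + U^{(k)}\|_F^2$
$P^{(k+1)} = \arg\min_{P} \tfrac{\rho'}{2} \|\Phi^{(k+1)} - P + V^{(k)}\|_F^2 \quad \text{s.t.} \quad P^\top P = I$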
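Assuming the P-step carries the orthogonality constraint $P^\top P = I$ (the constraint is not visible in the extracted text), it reduces to finding the matrix with orthonormal columns nearest to $\Phi^{(k+1)} + V^{(k)}$ in Frobenius norm, which has a closed form via the SVD. The sketch below shows only that projection; the function name `nearest_orthonormal` and the random test data are hypothetical.

```python
import numpy as np

def nearest_orthonormal(Y):
    """Closest matrix with orthonormal columns to Y in Frobenius norm (polar factor)."""
    W, _, Vt = np.linalg.svd(Y, full_matrices=False)
    return W @ Vt

rng = np.random.default_rng(0)
Phi = rng.standard_normal((8, 3))   # stand-in for Phi^(k+1)
V = np.zeros((8, 3))                # stand-in for the scaled dual variable V^(k)
P = nearest_orthonormal(Phi + V)

# For comparison: another orthonormal candidate, from a QR factorization.
Q_alt, _ = np.linalg.qr(Phi + V)
```

Any other orthonormal candidate, such as `Q_alt`, is at least as far from `Phi + V` as `P` is.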
$\min_{\Phi \in S(n,k)} \operatorname{tr}(\Phi^\top \Delta \Phi) + \mu \|\Phi\|_1$
$\min_{\Phi \in S(n,k),\, Z} \operatorname{tr}(\Phi^\top \Delta \Phi) + \mu \|Z\|_1 + \tfrac{\rho}{2} \|\Phi - Z + U\|_F^2$
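$\min_{\Phi \in S(n,k)} \operatorname{tr}(\Phi^\top \Delta \Phi) + \tfrac{\rho}{2} \|\Phi - Z + U\|_F^2$
$\min_{Z} \mu \|Z\|_1 + \tfrac{\rho}{2} \|\Phi + U - Z\|_F^2$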
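$\Phi^{(k+1)} = \arg\min_{\Phi \in S(n,k)} \operatorname{tr}(\Phi^\top \Delta \Phi) + \tfrac{\rho}{2} \|\Phi - Z^{(k)} + U^{(k)}\|_F^2$
$Z^{(k+1)} = \operatorname{Shrink}_{\mu/\rho}(\Phi^{(k+1)} + U^{(k)})$ (entrywise soft-thresholding)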
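A minimal sketch of these two steps follows. The Z-step is exact soft-thresholding with threshold $\mu/\rho$. The $\Phi$-step is only approximated here by a few Riemannian gradient steps with a polar retraction; that inner solver is a stand-in chosen for brevity, not necessarily what the talk used. The dual update, the toy 1D Laplacian, all parameter values, and all function names are assumptions.

```python
import numpy as np

def shrink(V, t):
    """Entrywise soft-thresholding, the closed-form Z-step for the L1 term."""
    return np.sign(V) * np.maximum(np.abs(V) - t, 0.0)

def polar(Y):
    """Map Y to the nearest matrix with orthonormal columns (polar retraction)."""
    W, _, Vt = np.linalg.svd(Y, full_matrices=False)
    return W @ Vt

def phi_step(Delta, Phi, Z, U, rho, step=0.05, inner=20):
    """Approximate Phi-step: a few Riemannian gradient steps on S(n, k)."""
    for _ in range(inner):
        G = 2.0 * Delta @ Phi + rho * (Phi - Z + U)   # Euclidean gradient
        S = 0.5 * (Phi.T @ G + G.T @ Phi)
        Phi = polar(Phi - step * (G - Phi @ S))       # tangent projection, step, retract
    return Phi

def madmm_compressed_modes(Delta, k, mu=0.1, rho=1.0, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    Phi = polar(rng.standard_normal((Delta.shape[0], k)))   # random point on S(n, k)
    Z, U = Phi.copy(), np.zeros_like(Phi)
    for _ in range(iters):
        Phi = phi_step(Delta, Phi, Z, U, rho)
        Z = shrink(Phi + U, mu / rho)
        U = U + Phi - Z            # scaled dual update (assumed, standard ADMM)
    return Phi, Z

n = 30
Delta = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # toy 1D Laplacian
Phi, Z = madmm_compressed_modes(Delta, k=3)
```

`Phi` stays on the manifold by construction, while `Z` carries the sparsity; the two agree only as the iteration converges.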
$\min_{(X_1, \dots, X_L) \in S^L(k,k)} \sum_{i \neq j} \ldots + \sum_{i=1}^{L} \ldots (X_i^\top \Lambda_i X_i)$
$\| x_i - x_j \|_2 \approx d_{ij}$
$-\tfrac{1}{2} HDH$, where $H = I - \tfrac{1}{n} \mathbf{1}\mathbf{1}^\top$ is the double-centering matrix
$\min_{X \in \mathbb{R}^{n \times k}} \| HDH - XX^\top \|_F^2$
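A sketch of classical MDS in its textbook form, for orientation. It assumes that $D$ holds squared distances and that the double-centered matrix carries the factor $-\tfrac{1}{2}$; the objective as extracted above shows $HDH$ without that factor, so the scaling here is an assumption. The function name and the random test points are hypothetical.

```python
import numpy as np

def classical_mds(D_sq, k):
    """Classical MDS from a matrix of squared distances (textbook formulation)."""
    n = D_sq.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n        # double-centering matrix
    B = -0.5 * H @ D_sq @ H                    # Gram matrix of the centered points
    w, V = np.linalg.eigh(B)                   # eigenvalues in ascending order
    w, V = w[::-1][:k], V[:, ::-1][:, :k]      # keep the k largest eigenpairs
    return V * np.sqrt(np.maximum(w, 0.0))     # X such that X X^T approximates B

rng = np.random.default_rng(0)
pts = rng.standard_normal((10, 2))
D_sq = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
X = classical_mds(D_sq, 2)
D_rec = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
```

For exactly Euclidean input of dimension `k`, the recovered squared distances `D_rec` match `D_sq` up to rounding; the embedding itself is determined only up to rotation and reflection.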
[Figure: points labeled Seattle, SF, LA, Denver, NY, WDC, Atlanta, Miami, Houston, Chicago]
$\min_{D^* \in \mathrm{EDM}} \| D - D^* \|_1$
$\min_{B \in S_+(n,k)} \| D - \operatorname{dist}(B) \|_1$
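To make the objective concrete, the sketch below evaluates it for a given $B$. It assumes that $\operatorname{dist}(B)$ is the usual map from a Gram matrix to squared distances, $d_{ij} = b_{ii} + b_{jj} - 2 b_{ij}$, and that $S_+(n,k)$ denotes $n \times n$ positive semidefinite matrices of rank $k$; neither definition is spelled out in the extracted text. The names `l1_stress` and `D_outlier` are hypothetical.

```python
import numpy as np

def dist(B):
    """Squared distances induced by a Gram matrix: d_ij = b_ii + b_jj - 2 b_ij."""
    d = np.diag(B)
    return d[:, None] + d[None, :] - 2.0 * B

def l1_stress(D, B):
    """Entrywise L1 misfit between the data D and the distances induced by B."""
    return float(np.abs(D - dist(B)).sum())

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 2))
B = X @ X.T                      # PSD with rank at most 2
D = dist(B)                      # clean squared distances

D_outlier = D.copy()             # corrupt a single pair symmetrically
D_outlier[0, 1] += 5.0
D_outlier[1, 0] += 5.0
```

Under the L1 misfit a single corrupted pair contributes in proportion to its magnitude rather than its square, which is the motivation for the robust formulation.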
B∈S+(n,k)
F
ρ