Inertial Block Proximal Methods for Non-Convex Non-Smooth Optimization
- L. T. K. Hien 1
- N. Gillis 1
- P. Patrinos 2
1 University of Mons, 2 KU Leuven
The 37th International Conference on Machine Learning (ICML 2020)
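Nonnegativity of the factors is imposed through indicator functions of the nonnegative orthant: either $I_{\mathbb{R}^{m\times r}_+}(U)$ and $I_{\mathbb{R}^{r\times n}_+}(V)$, with $U$ and $V$ as the two blocks, or $I_{\mathbb{R}^{m}_+}(U_{:i})$, $i = 1, \dots, r$, and $I_{\mathbb{R}^{n}_+}(V_{i:})$, $i = 1, \dots, r$, with the columns of $U$ and the rows of $V$ as the blocks.
For NCPD, the same kind of indicator term is applied to each factor $X^{(i)}$.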
1 Classical BCD methods update each block of variables as follows
2 Proximal BCD methods update each block of variables as follows
[1] H. Attouch, J. Bolte, P. Redont, and A. Soubeyran. Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the Kurdyka-Łojasiewicz inequality. Mathematics of Operations Research, 35(2):438–457, 2010.
3 Proximal gradient BCD methods update each block of variables as follows
[2] J. Bolte, S. Sabach, and M. Teboulle. Proximal alternating linearized minimization for nonconvex and nonsmooth
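The three block updates named above can be written in a common notation. The following is a sketch in the standard form used in the cited literature, not a transcription of the slides. It assumes the objective is $F(x) = f(x) + \sum_i g_i(x_i)$ with $f$ smooth, that $F_i^{k}$ (resp. $f_i^{k}$) denotes $F$ (resp. $f$) as a function of block $i$ with the other blocks fixed at their latest values, and that $\beta_i^{k} > 0$ is a step size.

$$
\begin{aligned}
\text{Classical BCD:}\quad & x_i^{k} \in \operatorname*{argmin}_{x_i} \; F_i^{k}(x_i), \\
\text{Proximal BCD:}\quad & x_i^{k} \in \operatorname*{argmin}_{x_i} \; F_i^{k}(x_i) + \frac{1}{2\beta_i^{k}} \big\| x_i - x_i^{k-1} \big\|^2, \\
\text{Proximal gradient BCD:}\quad & x_i^{k} \in \operatorname*{argmin}_{x_i} \; \big\langle \nabla f_i^{k}(x_i^{k-1}),\, x_i - x_i^{k-1} \big\rangle + g_i(x_i) + \frac{1}{2\beta_i^{k}} \big\| x_i - x_i^{k-1} \big\|^2.
\end{aligned}
$$

Each line adds one approximation to the previous one: the proximal term keeps the new block close to the old one, and the linearization replaces the smooth part by its first-order model so that only the proximal operator of $g_i$ is needed.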
[3] B. Polyak. Some methods of speeding up the convergence of iteration methods. USSR Computational Mathematics and Mathematical Physics, 4(5):1–17, 1964.
[4] Y. Nesterov. A method of solving a convex programming problem with convergence rate O(1/k^2). Soviet Mathematics Doklady, 27(2), 1983.
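References [3] and [4] are the heavy-ball method and the accelerated gradient method. As a reminder of how the two extrapolation schemes differ, here is a minimal sketch on a one-dimensional problem. The function names, the step size, and the momentum schedule `(k - 1) / (k + 2)` are illustrative assumptions, not values taken from the talk.

```python
def heavy_ball(grad, x0, step, momentum, iters):
    """Polyak-style update: gradient at the current point plus an inertial term."""
    x_prev, x = x0, x0
    for _ in range(iters):
        x_next = x - step * grad(x) + momentum * (x - x_prev)
        x_prev, x = x, x_next
    return x


def nesterov(grad, x0, step, iters):
    """Nesterov-style update: gradient evaluated at the extrapolated point."""
    x_prev, x = x0, x0
    for k in range(1, iters + 1):
        # Assumed momentum schedule; other schedules are common as well.
        y = x + (k - 1) / (k + 2) * (x - x_prev)
        x_prev, x = x, y - step * grad(y)
    return x


# Toy objective 0.5 * (x - 3)^2, whose gradient is x - 3.
toy_grad = lambda x: x - 3.0
print(heavy_ball(toy_grad, 0.0, 0.5, 0.3, 200))
print(nesterov(toy_grad, 0.0, 0.5, 200))
```

The only structural difference is where the gradient is evaluated: at the current iterate for heavy ball, at the extrapolated point for Nesterov.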
1 Classical BCD
2 Proximal BCD
3 Proximal gradient BCD
Initialize: Choose $\tilde x^{(0)} = \tilde x^{(-1)}$.
for $k = 1, \dots$ do
  $x^{(k,0)} = \tilde x^{(k-1)}$.
  for $j = 1, \dots, T_k$ do
    Choose $i \in \{1, \dots, s\}$. Let $y_i$ be the value of the $i$th block before it was updated to $x_i^{(k,j-1)}$.
    Extrapolate
      $\hat x_i = x_i^{(k,j-1)} + \alpha_i^{(k,j)} \big( x_i^{(k,j-1)} - y_i \big)$   (3)
    and compute
      $x_i^{(k,j)} = \operatorname*{argmin}_{x_i} \; F_i^{(k,j)}(x_i) + \frac{1}{2\beta_i^{(k,j)}} \| x_i - \hat x_i \|^2$.   (4)
    Let $x_{i'}^{(k,j)} = x_{i'}^{(k,j-1)}$ for $i' \neq i$.
  end for
  Update $\tilde x^{(k)} = x^{(k,T_k)}$.
end for
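To make the extrapolation step (3) and the proximal step (4) concrete, here is a minimal sketch on a toy problem where (4) has a closed form. The problem, the cyclic block order, and the constant values of `alpha` and `beta` are assumptions chosen for illustration, and the sketch does not check any parameter conditions required by the convergence theory. The toy objective is $\tfrac12(a x_1^2 + b x_2^2) + c\,x_1 x_2 - d_1 x_1 - d_2 x_2$ subject to $x \ge 0$, with two scalar blocks.

```python
def ibp_toy(a=2.0, b=1.0, c=0.5, d=(1.0, 1.0), alpha=0.3, beta=1.0, iters=200):
    """Inertial block proximal iteration on a 2-variable nonnegative quadratic."""
    curv = [a, b]          # curvature of the objective along each block
    x = [0.0, 0.0]         # current value of each block
    y = [0.0, 0.0]         # value each block had before its last update
    for _ in range(iters):
        for i in (0, 1):   # assumed cyclic block selection
            other = x[1 - i]
            # Step (3): extrapolate using the previous value of this block.
            x_hat = x[i] + alpha * (x[i] - y[i])
            # Step (4): exact minimizer of the block objective plus the
            # proximal term, projected onto the nonnegative half-line.
            new = max(0.0, (d[i] - c * other + x_hat / beta) / (curv[i] + 1.0 / beta))
            y[i], x[i] = x[i], new
    return x


print(ibp_toy())
```

With the default values the unconstrained minimizer is already nonnegative, so the iterates approach $(2/7, 6/7)$.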
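Initialize: Choose $\tilde x^{(0)} = \tilde x^{(-1)}$.
for $k = 1, \dots$ do
  $x^{(k,0)} = \tilde x^{(k-1)}$.
  for $j = 1, \dots, T_k$ do
    Choose $i \in \{1, \dots, s\}$. Let $y_i$ be the value of the $i$th block before it was updated to $x_i^{(k,j-1)}$.
    Extrapolate
      $\hat x_i = x_i^{(k,j-1)} + \alpha_i^{(k,j)} \big( x_i^{(k,j-1)} - y_i \big)$,
      $\grave x_i = x_i^{(k,j-1)} + \gamma_i^{(k,j)} \big( x_i^{(k,j-1)} - y_i \big)$   (5)
    and compute
      $x_i^{(k,j)} = \operatorname*{argmin}_{x_i} \; \big\langle \nabla f_i^{(k,j)}(\grave x_i),\, x_i - x_i^{(k,j-1)} \big\rangle + g_i(x_i) + \frac{1}{2\beta_i^{(k,j)}} \| x_i - \hat x_i \|^2$.   (6)
    Let $x_{i'}^{(k,j)} = x_{i'}^{(k,j-1)}$ for $i' \neq i$.
  end for
  Update $\tilde x^{(k)} = x^{(k,T_k)}$.
end for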
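Step (6) is a proximal step on the nonsmooth term taken from $\hat x_i$ along the gradient evaluated at the second extrapolated point from (5). A minimal sketch follows, assuming a nonsmooth term $g_i(x_i) = \lambda |x_i|$ (whose proximal operator is soft-thresholding), the same toy quadratic as smooth part, cyclic block order, and constant illustrative parameters. None of these choices come from the talk, and the parameter conditions of the theory are not checked.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.| evaluated at v."""
    return max(abs(v) - t, 0.0) * (1.0 if v >= 0 else -1.0)


def ibpg_toy(a=2.0, b=1.0, c=0.5, d=(1.0, 1.0), lam=0.3,
             alpha=0.2, gamma=0.3, iters=300):
    """Inertial block proximal gradient iteration on a 2-variable l1 problem."""
    curv = [a, b]          # block Lipschitz constants of the smooth part
    x = [0.0, 0.0]         # current value of each block
    y = [0.0, 0.0]         # value each block had before its last update
    for _ in range(iters):
        for i in (0, 1):   # assumed cyclic block selection
            beta = 0.9 / curv[i]                   # assumed step size
            # Step (5): two extrapolated points built from the same direction.
            x_hat = x[i] + alpha * (x[i] - y[i])   # used in the proximal term
            x_grv = x[i] + gamma * (x[i] - y[i])   # used in the gradient
            grad = curv[i] * x_grv + c * x[1 - i] - d[i]
            # Step (6): equals prox of beta * g_i at x_hat - beta * grad.
            new = soft_threshold(x_hat - beta * grad, beta * lam)
            y[i], x[i] = x[i], new
    return x


print(ibpg_toy())
```

Completing the square in (6) shows why the update reduces to a single proximal evaluation: the linear term and the quadratic term combine into $\frac{1}{2\beta}\|x_i - (\hat x_i - \beta \nabla f_i(\grave x_i))\|^2$ up to a constant.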
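NMF with $U$ and $V$ as the two blocks:
$\min_{U, V} \; \frac{1}{2} \| X - UV \|_F^2 + I_{\mathbb{R}^{m\times r}_+}(U) + I_{\mathbb{R}^{r\times n}_+}(V)$.
The parameter choices involve the sequence $\tau_k$ and the Lipschitz constants $L_i^{(k-1)}$ and $\tilde L_i^{(k)}$.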
[5] N. Gillis and F. Glineur. Accelerated multiplicative updates and hierarchical ALS algorithms for nonnegative matrix
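NMF with the columns of $U$ and the rows of $V$ as the blocks: minimize $\frac{1}{2} \big\| X - \sum_{i=1}^{r} U_{:i} V_{i:} \big\|_F^2$ over $U_{:i}, V_{i:}$, plus the indicator functions $I_{\mathbb{R}^{m}_+}(U_{:i})$ and $I_{\mathbb{R}^{n}_+}(V_{i:})$, $i = 1, \dots, r$, for a total of $2r$ blocks.
The inertial proximal update of column $U_{:i}$ has a closed form:
$\operatorname*{argmin}_{U_{:i} \ge 0} \; \frac{1}{2} \Big\| X - \sum_{q=1}^{i-1} U_{:q} V_{q:} - \sum_{q=i+1}^{r} U_{:q} V_{q:} - U_{:i} V_{i:} \Big\|_F^2 + \frac{1}{2\beta_i} \| U_{:i} - \hat U_{:i} \|^2 = \max\left( 0, \; \frac{X V_{i:}^T - (UV) V_{i:}^T + U_{:i} V_{i:} V_{i:}^T + \frac{1}{\beta_i} \hat U_{:i}}{V_{i:} V_{i:}^T + \frac{1}{\beta_i}} \right)$,
where $U_{:i}$ on the right-hand side is the value of the column before the update.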
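A minimal sketch of a nonnegative column update of the form $\max\big(0, (X V_{i:}^T - (UV) V_{i:}^T + U_{:i} V_{i:} V_{i:}^T + \hat U_{:i}/\beta_i) / (V_{i:} V_{i:}^T + 1/\beta_i)\big)$, written with plain Python lists so that it has no dependencies. The function name and argument layout are hypothetical, and a practical implementation would use a matrix library and reuse the products $X V^T$ and $V V^T$ across columns.

```python
def update_column(X, U, V, i, U_hat_col, beta):
    """Return the updated column i of U; X is m x n, U is m x r, V is r x n."""
    rows, cols, rank = len(X), len(X[0]), len(V)
    v = V[i]
    v_sq = sum(t * t for t in v)                 # scalar V_i: V_i:^T
    new_col = []
    for p in range(rows):
        # Entry p of (X - UV) V_i:^T, using the current U.
        resid_dot = sum(
            (X[p][q] - sum(U[p][l] * V[l][q] for l in range(rank))) * v[q]
            for q in range(cols)
        )
        num = resid_dot + U[p][i] * v_sq + U_hat_col[p] / beta
        # Projection onto the nonnegative orthant is exact here because the
        # objective is separable across entries with identical curvature.
        new_col.append(max(0.0, num / (v_sq + 1.0 / beta)))
    return new_col


# Rank-one data X = u v with u = (1, 2), v = (3, 4); a very large beta makes
# the proximal term negligible, so the update should land close to u.
X = [[3.0, 4.0], [6.0, 8.0]]
print(update_column(X, [[0.5], [0.5]], [[3.0, 4.0]], 0, [0.5, 0.5], 1e12))
```

The same routine applies to the rows of $V$ by transposing the roles of the two factors.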
[6] T. Pock and S. Sabach. Inertial proximal alternating linearized minimization (iPALM) for nonconvex and nonsmooth
[7] N. Gillis and F. Glineur. Accelerated multiplicative updates and hierarchical ALS algorithms for nonnegative matrix
[8] A. M. S. Ang and N. Gillis. Accelerating nonnegative matrix factorization algorithms using extrapolation. Neural Computation, 31(2):417–439, 2019.
[9] Y. Xu and W. Yin. A block coordinate descent method for regularized multiconvex optimization with applications to nonnegative tensor factorization and completion. SIAM Journal on Imaging Sciences, 6(3):1758–1789, 2013.
[Figure: two panels plotted against time (s.), comparing A-HALS, E-A-HALS, IBPG-A, IBP, APGC, IBPG, and iPALM.]
[Figure: two panels plotted against time (s.), comparing A-HALS, E-A-HALS, IBPG-A, IBP, APGC, IBPG, and iPALM.]
[Figure: two panels showing $\|X - UV\|_F / \|X\|_F - e_{\min}$ versus time (s.).]
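The plotted quantity is the relative error $\|X - UV\|_F / \|X\|_F$ shifted by $e_{\min}$. A minimal sketch of how such curves could be computed follows, under the assumption that $e_{\min}$ is the smallest relative error reached by any of the compared methods. Both helper names are hypothetical.

```python
import math


def relative_error(X, U, V):
    """Frobenius-norm relative error of the factorization X ~ U V."""
    rows, cols, rank = len(X), len(X[0]), len(V)
    resid_sq = 0.0
    norm_sq = 0.0
    for p in range(rows):
        for q in range(cols):
            approx = sum(U[p][l] * V[l][q] for l in range(rank))
            resid_sq += (X[p][q] - approx) ** 2
            norm_sq += X[p][q] ** 2
    return math.sqrt(resid_sq / norm_sq)


def shift_by_best(curves):
    """Subtract the smallest error over all methods from every curve."""
    # Assumption: e_min is taken over every method and every recorded point.
    e_min = min(min(errors) for errors in curves.values())
    return {name: [e - e_min for e in errors] for name, errors in curves.items()}


X = [[3.0, 4.0], [6.0, 8.0]]
print(relative_error(X, [[1.0], [2.0]], [[3.0, 4.0]]))
print(shift_by_best({"method_a": [0.5, 0.2], "method_b": [0.4, 0.3]}))
```

Subtracting a common $e_{\min}$ lets curves that converge to nearly the same error be distinguished on a logarithmic axis.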