Setup Restarting FISTA Restarting APPROX Adaptive restart
Restarting accelerated gradient methods with a rough strong convexity estimate
Olivier Fercoq
Joint work with Zheng Qu
20 March 2017
$\min_{x \in \mathbb{R}^N} \{F(x) = f(x) + \psi(x)\}$
… over $y \in \mathbb{R}^N$: $\psi(y) + \frac{1}{L}\,\dots$
$\frac{1}{2}\|Ax - b\|^2 + \lambda\|x\|_1$
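A minimal Python sketch of the two ingredients a proximal method needs for this problem: the objective value and the proximal operator of $\lambda\|x\|_1$ (soft-thresholding). The names `soft_threshold` and `lasso_objective` are illustrative and do not come from the slides; `A` is assumed to be a list of rows.

```python
def soft_threshold(v, t):
    """Prox of t * ||.||_1: shrink every entry of v toward zero by t."""
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]


def lasso_objective(A, b, x, lam):
    """F(x) = 0.5 * ||Ax - b||^2 + lam * ||x||_1, with A given as a list of rows."""
    residual = [sum(a * xj for a, xj in zip(row, x)) - bi
                for row, bi in zip(A, b)]
    smooth = 0.5 * sum(r * r for r in residual)
    nonsmooth = lam * sum(abs(xj) for xj in x)
    return smooth + nonsmooth
```

Here $f(x) = \frac{1}{2}\|Ax-b\|^2$ plays the role of the smooth part and $\psi(x) = \lambda\|x\|_1$ the role of the simple nonsmooth part.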
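$x_{k+1} = \arg\min_{x} \langle \nabla f(y_k), x - y_k \rangle + \frac{1}{2}\|x - y_k\|_L^2 + \psi(x)$
$z_{k+1} = z_k + \frac{1}{\theta_k}(x_{k+1} - y_k)$
$\theta_{k+1} = \frac{\sqrt{\theta_k^4 + 4\theta_k^2} - \theta_k^2}{2}$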
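The FISTA updates can be sketched in Python on the Lasso problem. This is an illustration under stated assumptions, not the speaker's code: $L$ is taken to be a scalar so that $\|x\|_L^2 = L\|x\|^2$, the start is $\theta_0 = 1$ and $z_0 = x_0 = 0$, and the interpolation $y_k = (1-\theta_k)x_k + \theta_k z_k$ is the standard one (it is not visible in the extracted slide text). The function names are hypothetical.

```python
import math


def soft_threshold(v, t):
    """Prox of t * ||.||_1 applied entrywise."""
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]


def fista_lasso(A, b, lam, L, num_iters):
    """FISTA sketch for 0.5*||Ax-b||^2 + lam*||x||_1, started from x0 = 0.

    A is a list of rows; L is assumed to upper-bound the largest
    eigenvalue of A^T A (a scalar Lipschitz constant of the gradient).
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    z = list(x)      # z_0 = x_0
    theta = 1.0      # assumed theta_0 = 1
    for _ in range(num_iters):
        # Assumed interpolation: y_k = (1 - theta_k) x_k + theta_k z_k
        y = [(1.0 - theta) * xi + theta * zi for xi, zi in zip(x, z)]
        residual = [sum(a * yj for a, yj in zip(row, y)) - bi
                    for row, bi in zip(A, b)]
        grad = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        # x_{k+1}: minimizer of the quadratic model plus psi (a prox step)
        x_next = soft_threshold([yj - gj / L for yj, gj in zip(y, grad)], lam / L)
        # z_{k+1} = z_k + (x_{k+1} - y_k) / theta_k
        z = [zi + (xn - yj) / theta for zi, xn, yj in zip(z, x_next, y)]
        x = x_next
        theta = (math.sqrt(theta ** 4 + 4.0 * theta ** 2) - theta ** 2) / 2.0
    return x
```

With `A` equal to the identity, the minimizer is the soft-thresholded `b`, which gives a quick sanity check.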
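$z_{k+1} = \arg\min_{z} \langle \nabla f(y_k), z - y_k \rangle + \frac{\theta_k}{2}\|z - z_k\|_L^2 + \psi(z)$
$\theta_{k+1} = \frac{\sqrt{\theta_k^4 + 4\theta_k^2} - \theta_k^2}{2}$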
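$\theta_0 = \frac{\tau}{n}$ and $z_0 = x_0$.
$z_{k+1}^i = \arg\min_{z \in \mathbb{R}^{n_i}} \langle \nabla_i f(y_k), z - y_k^i \rangle + \frac{\theta_k n v_i}{2\tau}|z - z_k^i|^2 + \psi_i(z)$
$x_{k+1} = y_k + \frac{n}{\tau}\theta_k (z_{k+1} - z_k)$
$\theta_{k+1} = \frac{\sqrt{\theta_k^4 + 4\theta_k^2} - \theta_k^2}{2}$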
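A rough Python sketch of these coordinate updates on the Lasso problem, under several assumptions that are not stated in the extracted text: a single coordinate is sampled uniformly per iteration ($\tau = 1$), $v_i = \|A_{:,i}\|^2$ (no zero columns), and $y_k = (1-\theta_k)x_k + \theta_k z_k$. The full vectors are recomputed at every step for readability, which an efficient implementation would avoid. The name `approx_lasso` is hypothetical.

```python
import math
import random


def approx_lasso(A, b, lam, num_iters, seed=0):
    """APPROX-style accelerated coordinate descent sketch (tau = 1) for Lasso."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    tau = 1
    # Assumed coordinate-wise constants: squared column norms of A.
    v = [sum(A[k][j] ** 2 for k in range(m)) for j in range(n)]
    x = [0.0] * n
    z = [0.0] * n            # z_0 = x_0
    theta = tau / n          # theta_0 = tau / n
    for _ in range(num_iters):
        y = [(1.0 - theta) * xi + theta * zi for xi, zi in zip(x, z)]
        i = rng.randrange(n)  # sampled coordinate
        residual = [sum(a * yj for a, yj in zip(row, y)) - bk
                    for row, bk in zip(A, b)]
        g = sum(A[k][i] * residual[k] for k in range(m))  # i-th partial derivative at y
        # Closed-form minimizer of g*(z - y_i) + c/2*|z - z_i|^2 + lam*|z|
        c = theta * n * v[i] / tau
        u = z[i] - g / c
        z_new_i = max(abs(u) - lam / c, 0.0) * (1.0 if u > 0 else -1.0)
        # x_{k+1} = y_k + (n / tau) * theta_k * (z_{k+1} - z_k)
        x = list(y)
        x[i] = y[i] + (n / tau) * theta * (z_new_i - z[i])
        z[i] = z_new_i
        theta = (math.sqrt(theta ** 4 + 4.0 * theta ** 2) - theta ** 2) / 2.0
    return x
```

Only coordinate `i` of `z` changes per iteration, which is what makes the method a coordinate descent scheme.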
αµF − 1
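Restarting FISTA with a fixed period can be sketched as follows: run $K$ iterations, then start again from the last iterate with $\theta$ reset to 1. This is a generic illustration, not the speaker's code; `grad_f`, `prox`, `K` and `num_restarts` are hypothetical parameters, and the choice of `K` is left to the caller rather than taken from the slide.

```python
import math


def fista(grad_f, prox, L, x0, num_iters):
    """Plain FISTA with scalar L; prox(v, step) is the prox of step * psi."""
    x = list(x0)
    z = list(x0)
    theta = 1.0
    for _ in range(num_iters):
        # Assumed interpolation: y_k = (1 - theta_k) x_k + theta_k z_k
        y = [(1.0 - theta) * xi + theta * zi for xi, zi in zip(x, z)]
        g = grad_f(y)
        x_next = prox([yi - gi / L for yi, gi in zip(y, g)], 1.0 / L)
        z = [zi + (xn - yi) / theta for zi, xn, yi in zip(z, x_next, y)]
        x = x_next
        theta = (math.sqrt(theta ** 4 + 4.0 * theta ** 2) - theta ** 2) / 2.0
    return x


def restarted_fista(grad_f, prox, L, x0, K, num_restarts):
    """Run FISTA for K iterations, then restart from the output, repeatedly."""
    x = list(x0)
    for _ in range(num_restarts):
        x = fista(grad_f, prox, L, x, K)  # momentum is reset at each call
    return x
```

A possible usage on a strongly convex quadratic with $L = 1$ and $\mu_F = 0.01$, with $\psi = 0$ so the prox is the identity:

```python
d = [1.0, 0.01]
x = restarted_fista(lambda y: [di * yi for di, yi in zip(d, y)],
                    lambda v, step: v, 1.0, [1.0, 1.0], K=40, num_restarts=20)
```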
$\frac{1}{2}\|Ax - b\|_2^2 + \lambda\|x\|_1$, $N = 4$ (iris dataset)
$\mathbb{E}[\dots_{[S]})] \le F(x_k) + \frac{\tau}{n}\big(\dots + \frac{1}{2}\|h\|_v^2\big)$
$\gamma_k^i \ge 0$, $\sum_i \gamma_k^i = 1$ and $x_k = \sum_i \gamma_k^i z_i$
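$\frac{1}{Z}\Big(\sum_{i=0}^{k-1} \frac{\gamma_k^i}{\theta_{i-1}^2}\, x_i + \Big(\frac{1}{\theta_0\theta_{k-1}} - \frac{1-\theta_0}{\theta_0^2}\Big)\dots\Big)$
$m_k(\mu) = \frac{\mu\theta_0^2}{1+\mu(1-\theta_0)}\Big(\sum_{i=0}^{k-1} \frac{\gamma_k^i}{\theta_{i-1}^2} + \frac{1}{\theta_0\theta_{k-1}} - \frac{1-\theta_0}{\theta_0^2}\Big)$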
$\Delta(x) = \frac{1-\theta_0}{\theta_0^2}(F(x) - F^*) + \frac{1}{2\theta_0^2}\operatorname{dist}_v(x, X^*)^2$
$\frac{1}{1+m_K(\mu_{\mathrm{est}})}$.
0 K 2)
√ 3 θ0
1 µest − 2 θ0 + 1
τ
√µest, √µest µF
0∆(x0)
ε
4 √ 3 √µest
v ≤ ε.
[Figure: $1-\rho$ versus $\mu$, log-log axes. Curves: rate of restarted APPROX, bound on rate, rate of coordinate descent.]
[Figure: $1-\rho$ versus $\mu_F$, log-log axes. Curves: rate of restarted APPROX, bound on rate, rate of coordinate descent.]
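$\frac{\lambda_1}{2\|A^\top b\|_\infty}\sum_{j=1}^{m} \log(1 + \exp(b_j a_j^\top x)) + \|x\|_1 + \frac{\mu_\psi}{2}\|x\|^2$
[Figure: log(Primal Dual Gap) versus time on rcv1; $n = N = 47236$; $m = 20242$; $\lambda_1 = 10000$, $\mu_\psi = 1/(10n)$. Curves: APCG ($\mu_F$), Acc+Restart ($\mu_F$), CD.]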
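A small Python sketch of how the objective in this experiment could be evaluated, assuming it is the scaled logistic loss plus an $\ell_1$ term and a quadratic term with weight $\mu_\psi/2$. The function name and argument layout (`A` as a list of rows $a_j$, labels `b`) are hypothetical, and the loss is computed in a numerically stable way that the slides do not discuss.

```python
import math


def logistic_l1_objective(A, b, x, lam1, mu_psi):
    """Scaled logistic loss + ||x||_1 + (mu_psi / 2) * ||x||^2."""
    m, n = len(A), len(x)
    # ||A^T b||_inf: largest absolute entry of sum_j b_j * a_j
    atb_inf = max(abs(sum(A[j][i] * b[j] for j in range(m))) for i in range(n))
    scale = lam1 / (2.0 * atb_inf)
    loss = 0.0
    for row, bj in zip(A, b):
        t = bj * sum(a * xi for a, xi in zip(row, x))
        # log(1 + exp(t)) written so that exp never overflows
        loss += max(t, 0.0) + math.log1p(math.exp(-abs(t)))
    l1 = sum(abs(xi) for xi in x)
    ridge = 0.5 * mu_psi * sum(xi * xi for xi in x)
    return scale * loss + l1 + ridge
```

At $x = 0$ every logistic term equals $\log 2$, so the value is $m \log 2$ times the scaling factor.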
[Figure: log(Primal Dual Gap) versus time on rcv1; $\lambda_1 = 10000$, $\mu_\psi = 1/(10n)$. Curves: APCG (1-10 µF), Acc+Restart (1-10 µF), CD.]
[Figure: log(Primal Dual Gap) versus time on rcv1; $\lambda_1 = 10000$, $\mu_\psi = 1/(10n)$. Curves: APCG, Acc+Restart, CD.]
$T_L(x) = \operatorname{prox}_{\frac{1}{L}\psi}\big(x - \frac{1}{L}\nabla f(x)\big)$
$\dots\,\|T_L(x) - x^*\|^2$
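If $T_L$ denotes the usual proximal gradient map (an assumption, since the definition is only partly legible here), it can be sketched for the Lasso problem as below, together with the squared residual $\|x - T_L(x)\|^2$, which vanishes exactly at minimizers. The names `prox_grad_map` and `residual_sq` are hypothetical.

```python
def soft_threshold(v, t):
    """Prox of t * ||.||_1 applied entrywise."""
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]


def prox_grad_map(A, b, lam, L, x):
    """Assumed T_L(x) = prox_{psi/L}(x - grad f(x) / L) for the Lasso problem."""
    m, n = len(A), len(x)
    residual = [sum(a * xj for a, xj in zip(row, x)) - bi
                for row, bi in zip(A, b)]
    grad = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
    return soft_threshold([xj - gj / L for xj, gj in zip(x, grad)], lam / L)


def residual_sq(A, b, lam, L, x):
    """||x - T_L(x)||^2, a computable measure of distance to optimality."""
    t = prox_grad_map(A, b, lam, L, x)
    return sum((xj - tj) ** 2 for xj, tj in zip(x, t))
```

Unlike $\|x - x^*\|$, this residual needs no knowledge of the solution, which is why it is convenient for deciding when an estimate of the strong convexity parameter should be revised.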
$\frac{\dots}{2}\|x_1 - x^*\|^2$, and $x_1 = T_L(x_0)$,
… $F$ … $\mu_{\mathrm{est}}^2$ … $\|x_0 - T_L(x_0)\|^2$
… $\frac{4}{\sqrt{\mu_r}}$ and $\sigma_r = \frac{1}{1 + \mu_r/\theta_{K_r-1}^2}$