New primal-dual subgradient methods for Convex Problems with Functional Constraints

Yurii Nesterov, CORE/INMA (UCL)
January 12, 2015 (Les Houches)

Outline

1. Constrained optimization problem
2. Lagrange multipliers
3. Dual function and dual problem
4. Augmented Lagrangian
5. Switching subgradient methods
6. Finding the dual multipliers
7. Complexity analysis

Optimization problem: simple constraints

Consider the problem

    min_{x∈Q} f(x),

where Q is a closed convex set (x, y ∈ Q ⇒ [x, y] ⊆ Q) and f is a convex function, subdifferentiable on Q:

    f(y) ≥ f(x) + ⟨∇f(x), y − x⟩,  x, y ∈ Q,  ∇f(x) ∈ ∂f(x).

Optimality condition: a point x∗ ∈ Q is optimal iff

    ⟨∇f(x∗), x − x∗⟩ ≥ 0  for all x ∈ Q.

Interpretation: the function increases along any feasible direction.

Optimization problem: functional constraints

Problem:

    min_{x∈Q} { f0(x) : fi(x) ≤ 0, i = 1, . . . , m },

where Q is a closed convex set and all fi are convex and subdifferentiable on Q, i = 0, . . . , m:

    fi(y) ≥ fi(x) + ⟨∇fi(x), y − x⟩,  x, y ∈ Q,  ∇fi(x) ∈ ∂fi(x).

Optimality condition (KKT, 1951): a point x∗ ∈ Q is optimal iff there exist Lagrange multipliers λ^(i)_∗ ≥ 0, i = 1, . . . , m, such that

    (1)  ⟨∇f0(x∗) + Σ_{i=1}^m λ^(i)_∗ ∇fi(x∗), x − x∗⟩ ≥ 0,  ∀x ∈ Q,
    (2)  fi(x∗) ≤ 0,  i = 1, . . . , m,  (feasibility)
    (3)  λ^(i)_∗ fi(x∗) = 0,  i = 1, . . . , m.  (complementary slackness)

Lagrange multipliers: interpretation

Let I ⊆ {1, . . . , m} be an arbitrary set of indices. Denote

    f_I(x) = f0(x) + Σ_{i∈I} λ^(i)_∗ fi(x).

Consider the problem

    P_I :  min_{x∈Q} { f_I(x) : fi(x) ≤ 0, i ∉ I }.

Observation: in any case, x∗ is the optimal solution of problem P_I.

Interpretation: the λ^(i)_∗ are the shadow prices for resources (Kantorovich, 1939).

Application examples:
- Traffic congestion: car flows on roads ⇔ size of queues.
- Electrical networks: currents in the wires ⇔ voltage potentials, etc.

Main question: how to compute (x∗, λ∗)?

Algebraic interpretation

Consider the Lagrangian

    L(x, λ) = f0(x) + Σ_{i=1}^m λ^(i) fi(x).

Condition KKT(1),

    ⟨∇f0(x∗) + Σ_{i=1}^m λ^(i)_∗ ∇fi(x∗), x − x∗⟩ ≥ 0,  ∀x ∈ Q,

implies x∗ ∈ Arg min_{x∈Q} L(x, λ∗).

Define the dual function φ(λ) = min_{x∈Q} L(x, λ), λ ≥ 0. It is concave!

By Danskin's Theorem, ∇φ(λ) = (f1(x(λ)), . . . , fm(x(λ))), with x(λ) ∈ Arg min_{x∈Q} L(x, λ).

Conditions KKT(2,3),

    fi(x∗) ≤ 0,  λ^(i)_∗ fi(x∗) = 0,  i = 1, . . . , m,

imply (with x∗ = x(λ∗)) that λ∗ ∈ Arg max_{λ≥0} φ(λ).

Algorithmic aspects

Main idea: solve the dual problem max_{λ≥0} φ(λ) by the subgradient method:

1. Compute x(λk) and define ∇φ(λk) = (f1(x(λk)), . . . , fm(x(λk))).
2. Update λ_{k+1} = Project_{R^m_+}( λk + hk ∇φ(λk) ).

The step sizes hk > 0 are defined in the usual way.

Main difficulties:
- Each iteration is time consuming.
- Unclear termination criterion.
- Low rate of convergence (O(1/ε²) upper-level iterations).
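In code the upper-level loop is short; all the cost sits in the inner minimization. A minimal sketch, assuming hypothetical oracles lagrangian_argmin (an exact solver for min_{x∈Q} L(x, λ)) and constraint_values:

```python
import numpy as np

def dual_subgradient(lagrangian_argmin, constraint_values, lam0, step_sizes):
    """Projected subgradient ascent on the concave dual phi(lam).

    lagrangian_argmin(lam)  -> x(lam), a minimizer of L(x, lam) over Q
    constraint_values(x)    -> (f_1(x), ..., f_m(x)); by Danskin's theorem
                               this vector is a (super)gradient of phi at lam
    """
    lam = np.asarray(lam0, dtype=float)
    for h in step_sizes:                      # h_k > 0, e.g. h_k ~ 1/sqrt(k)
        x = lagrangian_argmin(lam)            # the expensive inner problem
        lam = np.maximum(lam + h * constraint_values(x), 0.0)  # project onto R^m_+
    return lam, lagrangian_argmin(lam)
```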
Augmented Lagrangian (1970's) [Hestenes, Powell, Rockafellar, Polyak, Bertsekas, . . .]

Define the Augmented Lagrangian

    L_K(x, λ) = f0(x) + (1/(2K)) Σ_{i=1}^m ( λ^(i) + K fi(x) )²_+ − (1/(2K)) ‖λ‖²₂,  λ ∈ R^m,

where (·)_+ = max{·, 0} and K > 0 is a penalty parameter. Consider the dual function

    φ̂(λ) = min_{x∈Q} L_K(x, λ).

Main properties:
- The function φ̂ is concave.
- Its gradient is Lipschitz continuous with constant 1/K.
- Its unconstrained maximum is attained at the optimal dual solution.
- The corresponding point x̂(λ∗) is the optimal primal solution.

Hint: check that the equation (λ^(i) + K fi(x))_+ = λ^(i) is equivalent to KKT(2,3).

Method of Augmented Lagrangians

Note that, componentwise,

    ∇φ̂(λ)^(i) = (1/K) ( λ^(i) + K fi(x̂(λ)) )_+ − (1/K) λ^(i).

Therefore, the usual gradient step λ_{k+1} = λk + K ∇φ̂(λk) reads exactly as follows.

Method: λ_{k+1} = ( λk + K f(x̂(λk)) )_+.

Advantage: fast convergence of the dual process.

Disadvantages:
- Difficult iteration.
- Unclear termination.
- No global complexity analysis.

Do we have an alternative?
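As a loop, the multiplier update is again one line; a hedged sketch, where the inner solver argmin_aug_lagrangian is a hypothetical stand-in for the "difficult iteration":

```python
import numpy as np

def augmented_lagrangian_method(argmin_aug_lagrangian, constraint_values, lam0, K, iters):
    """Multiplier iteration lam_{k+1} = (lam_k + K f(x_hat(lam_k)))_+.

    argmin_aug_lagrangian(lam, K) -> x_hat(lam), minimizing L_K(., lam) over Q
    constraint_values(x)          -> (f_1(x), ..., f_m(x))
    """
    lam = np.asarray(lam0, dtype=float)
    for _ in range(iters):
        x = argmin_aug_lagrangian(lam, K)               # the difficult inner problem
        lam = np.maximum(lam + K * constraint_values(x), 0.0)
    return x, lam
```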

Problem formulation

Problem:

    f^∗ = inf_{x∈Q} { f0(x) : fi(x) ≤ 0, i = 1, . . . , m },

where
- fi(x), i = 0, . . . , m, are closed convex functions on Q, each endowed with a first-order black-box oracle;
- Q ⊂ E is a bounded, simple, closed convex set. (Simple: we can solve some auxiliary optimization problems over Q.)

Defining the Lagrangian

    L(x, λ) = f0(x) + Σ_{i=1}^m λ^(i) fi(x),  x ∈ Q, λ ∈ R^m_+,

we can introduce the Lagrangian dual problem

    f_∗ := sup_{λ∈R^m_+} φ(λ),  where φ(λ) := inf_{x∈Q} L(x, λ).

Clearly, f^∗ ≥ f_∗. Later, we will show f^∗ = f_∗ algorithmically.

Bregman distances

Prox-function: d(·) is strongly convex on Q with parameter one:

    d(y) ≥ d(x) + ⟨∇d(x), y − x⟩ + (1/2) ‖y − x‖²,  x, y ∈ Q.

Denote by x0 the prox-center of the set Q: x0 = arg min_{x∈Q} d(x). Assume d(x0) = 0.

Bregman distance:

    β(x, y) = d(y) − d(x) − ⟨∇d(x), y − x⟩,  x, y ∈ Q.

Clearly, β(x, y) ≥ (1/2) ‖x − y‖² for all x, y ∈ Q.

Bregman mapping: for x ∈ Q, g ∈ E∗ and h > 0, define

    B_h(x, g) = arg min_{y∈Q} { h ⟨g, y − x⟩ + β(x, y) }.

Examples: Euclidean distance, entropy distance, etc.
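To make the closing examples concrete, here is a minimal sketch of B_h(x, g) in the two classical setups; these are standard mirror-descent formulas, assumed rather than taken from the talk:

```python
import numpy as np

def bregman_step_euclidean(x, g, h):
    """d(x) = (1/2)||x||_2^2: beta(x, y) = (1/2)||y - x||^2, so B_h(x, g)
    is a plain (sub)gradient step (compose with a projection for smaller Q)."""
    return x - h * g

def bregman_step_entropy(x, g, h):
    """d(x) = sum_i x_i ln x_i on the simplex Q: beta is the KL divergence,
    and B_h(x, g) is the multiplicative-weights / entropic mirror step."""
    y = x * np.exp(-h * g)
    return y / y.sum()   # renormalize back onto the simplex
```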

Switching subgradient methods: Primal Method

Input parameter: the step size h > 0.

Initialization: compute the prox-center x0.

Iteration k ≥ 0:
a) Define Ik = { i ∈ {1, . . . , m} : fi(xk) > h ‖∇fi(xk)‖∗ }.
b) If Ik = ∅, then compute x_{k+1} = B_h( xk, ∇f0(xk)/‖∇f0(xk)‖∗ ).
c) If Ik ≠ ∅, then choose an arbitrary ik ∈ Ik, define hk = f_{ik}(xk)/‖∇f_{ik}(xk)‖²∗, and compute x_{k+1} = B_{hk}( xk, ∇f_{ik}(xk) ).

After t ≥ 0 iterations, define Ft = { k ∈ {0, . . . , t} : Ik = ∅ } and denote N(t) = |Ft|. It is possible that N(t) = 0.
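A minimal sketch of the Primal Method in the Euclidean setup, where B_h(x, g) reduces to a projected step; the oracles and the projection are hypothetical names, and the step history is recorded for the multiplier formulas of the next slide:

```python
import numpy as np

def switching_subgradient(f, grads, project, x0, h, T):
    """Primal switching method with Euclidean prox: B_h(x, g) = project(x - h*g).

    f[i], grads[i]: value / subgradient oracles (i = 0 objective, 1..m constraints)
    project: Euclidean projection onto Q
    """
    x = np.asarray(x0, dtype=float)
    history = []                                  # records (k, i_k, h_k) of every step
    for k in range(T):
        I = [i for i in range(1, len(f))
             if f[i](x) > h * np.linalg.norm(grads[i](x))]
        if not I:                                 # "productive" step on the objective
            g0 = grads[0](x)
            x = project(x - h * g0 / np.linalg.norm(g0))
            history.append((k, 0, h / np.linalg.norm(g0)))
        else:                                     # "switching" step on a violated constraint
            i = I[0]                              # any i_k in I_k works
            gi = grads[i](x)
            hk = f[i](x) / np.linalg.norm(gi) ** 2
            x = project(x - hk * gi)
            history.append((k, i, hk))
    return x, history
```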

Finding the dual multipliers

If N(t) > 0, define the dual multipliers as follows:

    λ^(0)_t = h Σ_{k∈Ft} 1/‖∇f0(xk)‖∗,
    λ^(i)_t = (1/λ^(0)_t) Σ_{k∈Ai(t)} hk,  i = 1, . . . , m,

where Ai(t) = { k ∈ {0, . . . , t} : ik = i }, 0 ≤ i ≤ m. Denote

    St = Σ_{k∈Ft} 1/‖∇f0(xk)‖∗.

If Ft = ∅, then we define St = 0.

For proving convergence of the switching strategy, we find an upper bound for the gap

    δt = (1/St) Σ_{k∈Ft} f0(xk)/‖∇f0(xk)‖∗ − φ(λt),

assuming that N(t) > 0.
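Continuing the sketch above, the multipliers fall out of the recorded step history (this assumes N(t) > 0, so that λ^(0)_t > 0; the history format is the hypothetical one introduced there):

```python
def dual_multipliers(history, m):
    """Reconstruct (lambda_t^(0), ..., lambda_t^(m)) from (k, i_k, h_k) records.

    For k in F_t the recorded h_k equals h / ||grad f0(x_k)||, so summing
    them gives exactly lambda_t^(0) = h * S_t; requires at least one such step.
    """
    lam = [0.0] * (m + 1)
    lam[0] = sum(hk for (_, i, hk) in history if i == 0)   # = h * S_t > 0
    for (_, i, hk) in history:
        if i >= 1:
            lam[i] += hk / lam[0]    # lambda_t^(i) = (1/lambda_t^(0)) * sum of h_k
    return lam
```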

Convergence result

Main inequality:

    λ^(0)_t δt ≤ r0(x) + (1/2) N(t) h² − (1/2) (t − N(t)) h² = r0(x) − (1/2) t h² + N(t) h².

Denote D = max_{x∈Q} r0(x).

Theorem. If t ≥ (2/h²) D, then Ft ≠ ∅. In this case,

    δt ≤ Mh  and  max_{1≤i≤m} fi(xk) ≤ Mh,  k ∈ Ft,

where M = max_{0≤k≤t} max_{0≤i≤m} ‖∇fi(xk)‖∗.

Proof: If Ft = ∅, then N(t) = 0 and, consequently, λ^(0)_t = 0. By the main inequality, this is impossible for t big enough. Finally, λ^(0)_t ≥ (h/M) N(t). Therefore, if t is big enough, then

    δt ≤ N(t) h² / λ^(0)_t ≤ Mh.
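To read off a complexity estimate (a reconstruction from the theorem's constants, not stated explicitly above): to guarantee accuracy ε, one may take h = ε/M, giving

```latex
\[
  h = \frac{\epsilon}{M}
  \;\Longrightarrow\;
  t \;\ge\; \frac{2D}{h^{2}} \;=\; \frac{2DM^{2}}{\epsilon^{2}}
  \;\Longrightarrow\;
  \delta_t \le Mh = \epsilon
  \quad\text{and}\quad
  \max_{1\le i\le m} f_i(x_k) \le \epsilon, \;\; k \in F_t,
\]
```

which matches the O(1/ε²) upper-level rate quoted for the classical dual scheme.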

Dual subgradient method

Define the averaging coefficients {ak}_{k≥0} and the nondecreasing scaling coefficients {βk}_{k≥0}. Denote At = Σ_{k=0}^t ak.

Initialization: define ℓ0(x) ≡ 0, x ∈ Q.

Iteration k ≥ 0:
a) Compute xk = arg min_{x∈Q} { ℓk(x) + βk d(x) }.
b) Define Ik = { i ∈ [1 : m] : f^(i)(xk) ≥ ε }.
c) If Ik = ∅, then ℓ_{k+1}(x) = ℓk(x) + ak [ f^(0)(xk) + ⟨∇f^(0)(xk), x − xk⟩ ].
d) If Ik ≠ ∅, then choose an arbitrary ik ∈ Ik and define ℓ_{k+1}(x) = ℓk(x) + ak [ f^(ik)(xk) + ⟨∇f^(ik)(xk), x − xk⟩ ].

Convergence result

Define A0(t) = { k ∈ [0 : t] : Ik = ∅ }, N(t) = |A0(t)|, and

    σt = Σ_{k∈A0(t)} ak,    λ^(i)_t = (1/σt) Σ_{k∈Ai(t)} ak,  i = 1, . . . , m,

where Ai(t) = { k ∈ [0 : t] : ik = i }, 1 ≤ i ≤ m.

If N(t) > 0, then define the gap

    δt = (1/σt) Σ_{k∈A0(t)} ak f^(0)(xk) − φ(λt).

Denote D = max_{x∈Q} d(x).

Theorem. Let all subgradients be bounded by M. Then for any t ≥ 0,

    σt (δt − ε) + At ε ≤ β_{t+1} D + (1/2) M² Σ_{k=0}^t a²_k / βk.

If At ε > β_{t+1} D + (1/2) M² Σ_{k=0}^t a²_k / βk, then λ^(0)_t > 0 and δt ≤ ε.

Example: at ≡ 1, βt ≈ √t  ⇒  t ≈ O(1/ε²).
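A quick sanity check of the example (a reconstruction with constants suppressed): with a_k ≡ 1 and β_k ≈ √k, we have A_t = t + 1 and Σ_{k≤t} a²_k/βk ≈ 2√t, so the trigger condition holds once

```latex
\[
  (t+1)\,\epsilon \;>\; \sqrt{t+1}\,D + M^{2}\sqrt{t}
  \quad\Longleftarrow\quad
  \sqrt{t} \;\gtrsim\; \frac{D + M^{2}}{\epsilon},
  \qquad\text{i.e.}\qquad
  t \;\approx\; O\!\left(\frac{(D+M^{2})^{2}}{\epsilon^{2}}\right).
\]
```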
Quasi-monotone method

Initialization: define ℓ0(x) ≡ 0, x0 = ⊲⊳, and σ0 = 0.

Iteration t ≥ 0:
a) Set vt = arg min_{x∈Q} { ℓt(x) + βt d(x) } and It := { i ∈ [1 : m] : f^(i)(vt) ≥ ε }.
b) If It ≠ ∅, then set x_{t+1} = xt, σ_{t+1} = σt, choose an arbitrary it ∈ It, and update ℓ_{t+1}(x) = ℓt(x) + a_{t+1} [ f^(it)(vt) + ⟨∇f^(it)(vt), x − vt⟩ ].
c) Otherwise, set σ_{t+1} = σt + a_{t+1}, τt = a_{t+1}/σ_{t+1}, x_{t+1} = (1 − τt) xt + τt vt, and update ℓ_{t+1}(x) = ℓt(x) + a_{t+1} [ f^(0)(x_{t+1}) + ⟨∇f^(0)(x_{t+1}), x − x_{t+1}⟩ ].

The notation x0 = ⊲⊳ ∈ E indicates that x0 has not been chosen yet.
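A minimal sketch of the quasi-monotone method under the same assumptions as the previous sketches (hypothetical argmin_model, model stored as affine pieces); note how x_t is frozen on constraint steps and averaged toward v_t on objective steps:

```python
def quasi_monotone(f, grads, argmin_model, a, beta, eps, T):
    """Quasi-monotone method; x = None plays the role of x0 = "not chosen"."""
    pieces, x, sigma = [], None, 0.0
    for t in range(T):
        v = argmin_model(pieces, beta(t))
        I = [i for i in range(1, len(f)) if f[i](v) >= eps]
        if I:                                 # b) constraint step: x_t stays frozen
            j = I[0]
            pieces.append((a(t + 1), f[j](v), grads[j](v), v))
        else:                                 # c) objective step: average toward v_t
            sigma += a(t + 1)
            tau = a(t + 1) / sigma            # tau = 1 on the first productive step
            x = v if x is None else (1 - tau) * x + tau * v
            pieces.append((a(t + 1), f[0](x), grads[0](x), x))
    return x, pieces
```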

Convergence result

Theorem.
1. All points xt ≠ ⊲⊳ are ε-feasible.
2. If all subgradients are bounded by M, then for all x ∈ Q and t ≥ 0,

    σt ( f^(0)(xt) − ε ) + At ε ≤ ℓt(x) + βt d(x) + (1/2) M² Σ_{k=1}^t a²_k / β_{k−1}.

3. As soon as At ε > βt D + (1/2) M² Σ_{k=1}^t a²_k / β_{k−1}, we get σt > 0 and

    f^(0)(xt) − φ(λt) ≤ f^(0)(xt) − (1/σt) min_{x∈Q} ℓt(x) ≤ ε.

Example: at ≡ 1, βt ≈ √t  ⇒  t ≈ O(1/ε²).

NB: this is true for the whole sequence!

Conclusion

1. The optimal primal-dual solution can be approximated by simple switching subgradient schemes.
2. The approximate dual multipliers have a natural interpretation: the relative importance of the corresponding constraints during the adjustment process.
3. However, the method has an optimal worst-case efficiency estimate even if the dual optimal solution does not exist.
4. Many interesting questions remain (influence of smoothness, strong convexity, etc.).

Thank you for your attention!
