
Optimization with Online and Massive Data

Yinyu Ye

K.T. Li Chair Professor of Engineering
Department of Management Science and Engineering, Stanford University
(Guanghua School of Management, Peking University)

2014 Workshop on Optimization for Modern Computation

September 4, 2014


Outline

We present optimization models and computational algorithms dealing with online/streaming, structured, and massively distributed data:

◮ Online Linear Programming
◮ Least Squares with Nonconvex Regularization
◮ The ADMM Method with Multiple Blocks

Background

Consider a store that sells a number of goods/products:

◮ There is a fixed selling period
◮ There is a fixed inventory of goods
◮ Customers arrive, each requesting a bundle of goods and bidding a certain price
◮ Objective: maximize the revenue
◮ Decision: accept each bid or not?


An Example

            Order 1 (t=1)   Order 2 (t=2)   ...   Inventory (b)
Price (πt)  $100            $30             ...
Decision    x1              x2              ...
Pants       1                               ...   100
Shoes       1                               ...   50
T-shirts                    1               ...   500
Jackets                                     ...   200
Hats        1               1               ...   1000

Online Linear Programming Model

The classical offline version of the above problem can be formulated as a linear (integer) program, since all the data would have arrived:

\[
\begin{array}{ll}
\text{maximize}_x & \sum_{t=1}^{n} \pi_t x_t \\
\text{subject to} & \sum_{t=1}^{n} a_{it} x_t \le b_i, \quad \forall i = 1, \ldots, m, \\
& 0 \le x_t \le 1, \quad \forall t = 1, \ldots, n.
\end{array}
\]

Now we consider the online or streaming, data-driven version of this problem:

◮ We only know b and n at the start
◮ The constraint matrix is revealed column by column, sequentially, along with the corresponding objective coefficient
◮ An irrevocable decision must be made as soon as an order arrives, without observing or knowing the future data
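For concreteness, here is a minimal sketch (my illustration, not from the talk) of the offline LP relaxation solved with scipy.optimize.linprog on randomly generated placeholder data A, π, b:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n = 5, 200                              # m resources, n orders
A = rng.random((m, n))                     # a_it in [0, 1]
pi = rng.random(n)                         # bids pi_t >= 0
b = np.full(m, 20.0)                       # inventory vector b

# maximize pi^T x  <=>  minimize -pi^T x, s.t. A x <= b, 0 <= x_t <= 1
res = linprog(-pi, A_ub=A, b_ub=b, bounds=[(0, 1)] * n, method="highs")
print("offline OPT(A, pi):", -res.fun)
```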


Application Overview

◮ Revenue management problems: airline ticket booking, hotel booking
◮ Online network routing on an edge-capacitated network
◮ Combinatorial auctions
◮ Online adwords allocation

Model Assumptions

Main Assumptions

◮ The columns at arrive in a random order
◮ 0 ≤ ait ≤ 1 for all (i, t)
◮ πt ≥ 0 for all t

Denote the offline maximal value by OPT(A, π). We call an online algorithm A c-competitive if and only if

\[
\mathrm{E}_{\sigma}\!\left[\sum_{t=1}^{n} \pi_t x_t(\sigma, \mathcal{A})\right] \ge c \cdot \mathrm{OPT}(A, \pi),
\]

where σ is the permutation of the arriving order.

A Learning Algorithm is Needed

◮ No distribution is known, so no stochastic optimization model is applicable.
◮ Unlike in dynamic programming, the decision maker does not have full information/data, so a backward recursion cannot be carried out to find an optimal sequential decision policy.
◮ Thus, the online algorithm needs to be learning-based; in particular, learning-while-doing.

Sufficient and Necessary Results

Theorem

For any fixed ε > 0, there is a (1 − ε)-competitive online algorithm for the problem on all inputs when

\[
B = \min_i b_i \ge \Omega\!\left(\frac{m \log(n/\epsilon)}{\epsilon^2}\right).
\]

Theorem

For any online algorithm for the online linear program in the random order model, there exists an instance such that the competitive ratio is less than 1 − ε if

\[
B = \min_i b_i \le \frac{\log m}{\epsilon^2}.
\]

Agrawal, Wang and Y [Operations Research, to appear 2014]


Key Observation and Idea of the Online Algorithm I

The problem would be easy if there were a "fair and optimal price" vector:

            Order 1 (t=1)   Order 2 (t=2)   ...   Inventory (b)   p∗
Bid (πt)    $100            $30             ...
Decision    x1              x2              ...
Pants       1                               ...   100             $45
Shoes       1                               ...   50              $45
T-shirts                    1               ...   500             $10
Jackets                                     ...   200             $55
Hats        1               1               ...   1000            $15
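Reading the table as reconstructed above: order 1 requests pants, shoes and a hat and bids $100, while the priced bundle costs $45 + $45 + $15 = $105 > $100, so it would be rejected; order 2 requests a T-shirt and a hat and bids $30, while the priced bundle costs $10 + $15 = $25 < $30, so it would be accepted.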


Key Observation and Idea of the Online Algorithm II

◮ Pricing the bid: the optimal dual price vector p∗ of the offline problem can play such a role; that is, setting x∗_t = 1 if π_t > a_t^T p∗ and x∗_t = 0 otherwise yields a near-optimal solution as long as m/n is sufficiently small.
◮ Based on this observation, our online algorithm works by learning a threshold price vector p̂ and using p̂ to price the bids.
◮ One-time learning algorithm: learn the price vector once, using the initial εn inputs (1/ε³); a sketch of this policy follows after this list:

\[
\max_x \ \sum_{t=1}^{\epsilon n} \pi_t x_t \quad \text{s.t.} \quad \sum_{t=1}^{\epsilon n} a_{it} x_t \le (1-\epsilon)\epsilon b_i, \quad 0 \le x_t \le 1, \ \forall i, t.
\]

◮ Dynamic learning algorithm: dynamically update the price vector at a carefully chosen pace (1/ε²).
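Below is a minimal sketch of the one-time learning policy as I read it from the slide: solve the scaled LP on the first εn orders, extract dual prices p̂ (here via the HiGHS duals exposed by scipy ≥ 1.7), then accept a later order t exactly when π_t > a_t^T p̂ and inventory remains. The function name and data layout are my own.

```python
import numpy as np
from scipy.optimize import linprog

def one_time_learning(A, pi, b, eps):
    """Sketch of the one-time price-learning policy for online LP."""
    m, n = A.shape
    s = max(1, int(eps * n))                    # learning phase: first eps*n orders
    # max sum_{t<s} pi_t x_t  s.t.  sum_{t<s} a_it x_t <= (1-eps)*eps*b_i
    res = linprog(-pi[:s], A_ub=A[:, :s], b_ub=(1 - eps) * eps * b,
                  bounds=[(0, 1)] * s, method="highs")
    p_hat = -np.asarray(res.ineqlin.marginals)  # dual prices of the <= rows
    x, used = np.zeros(n), np.zeros(m)
    for t in range(s, n):                       # price every later bid
        if pi[t] > A[:, t] @ p_hat and np.all(used + A[:, t] <= b):
            x[t] = 1.0
            used += A[:, t]
    return x
```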


Geometric Pace of Price Updating

The dynamic learning algorithm re-estimates the dual prices at a geometric pace, e.g., at t = εn, 2εn, 4εn, ... (figure omitted).


Related Work on Random-Permutation

Table: Comparison of several existing results

Paper                       Sufficient Condition                      Learning
Kleinberg [2005]            B ≥ 1/ε² (for m = 1)                      Dynamic
Devanur et al [2009]        OPT ≥ m² log(n)/ε³                        One-time
Feldman et al [2010]        B ≥ m log(n)/ε³ and OPT ≥ m log(n)/ε      One-time
Agrawal et al [2010]        B ≥ m log(n)/ε² or OPT ≥ m² log(n)/ε²     Dynamic
Molinaro and Ravi [2013]    B ≥ m² log(m)/ε²                          Dynamic
Kesselheim et al [2014]     B ≥ log(m)/ε²                             Dynamic*
Gupta and Molinaro [2014]   B ≥ log(m)/ε²                             Dynamic*

Summary and Future Questions on OLP

◮ We have designed a dynamic, near-optimal online algorithm for a very general class of online linear programming problems.
◮ The algorithm is distribution-free, and thus robust to distribution/data uncertainty.
◮ The dynamic learning algorithm has the feature of "learning-while-doing", and the pace at which the price is updated is neither too fast nor too slow.
◮ Buy-and-sell model?
◮ Multi-product price-posting market?


Outline

◮ Online Linear Programming
◮ Least Squares with Nonconvex Regularization
◮ The ADMM Method with Multiple Blocks


Unconstrained L2+Lp Minimization

Consider the convex quadratic optimization problem with Lp quasi-norm regularization:

\[
\text{Minimize}_x \quad f_p(x) := \|Ax - b\|_2^2 + \lambda \|x\|_p^p, \quad x \in X, \tag{1}
\]

where X is a convex set, the data are A ∈ R^{m×n} and b ∈ R^m, the parameter satisfies 0 ≤ p < 1, and ‖x‖_p^p = Σ_j |x_j|^p.

When p = 0: ‖x‖_0^0 := ‖x‖_0 := |{j : x_j ≠ 0}|, that is, the number of nonzero entries in x.
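As a direct illustration (mine, not from the talk), the objective (1) in numpy, with the p = 0 case counting nonzeros:

```python
import numpy as np

def f_p(x, A, b, lam, p):
    """L2 + Lp objective: ||Ax - b||_2^2 + lam * ||x||_p^p, 0 <= p < 1."""
    fit = np.sum((A @ x - b) ** 2)
    if p == 0:
        reg = lam * np.count_nonzero(x)     # ||x||_0: number of nonzeros
    else:
        reg = lam * np.sum(np.abs(x) ** p)  # Lp quasi-norm to the p-th power
    return fit + reg
```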


Application and Motivation

The original goal is to control ‖x‖_0 = |{j : x_j ≠ 0}|, the size of the support set of x, for

◮ Cardinality-constrained portfolio management
◮ Sparse image reconstruction
◮ Sparse signal recovery
◮ Compressed sensing – reweighted L1 seems more effective

But L2 + L0 minimization is known to be NP-hard, and one hopes that L2 + Lp could be easier...


Modern Portfolio Theory

A case showing that p = 1 does not help:

\[
\text{Minimize}_x \ \|Ax - b\|_2^2, \quad e^T x = 1, \ x \ge 0;
\]

or, if "short" positions are allowed:

\[
\text{Minimize}_x \ \|Ax - b\|_2^2, \quad e^T x = 1.
\]

Let x = x^+ − x^-, with (x^+, x^-) ≥ 0. Then e^T x^+ − e^T x^- = 1, so that ‖x‖_1 = e^T x^+ + e^T x^- = 1 + 2 e^T x^-. Minimizing ‖x‖_1 thus controls the debt (short) exposure, not the cardinality.


The Hardness Result

Question: Is L2 + Lp minimization easier than L2 + L0 minimization?

Theorem

Deciding the global minimal objective value of L2 + Lp minimization is strongly NP-hard for any given 0 ≤ p < 1 and λ > 0.

Chen, Ge, Jiang, Wang and Y [Math Programming 2011 and 2014]


The Easiness Result

However,

Theorem

There are FPTAS algorithms that provably compute a (second-order) ε-KKT point of L2 + Lp minimization.

Bian, Chen, Ge, Jiang, and Y [Math Programming 2011 and 2014]

Question: Does any (second-order) KKT point or solution possess predictable sparsity properties?


Theory of Constrained L2+Lp: First-Order Bound

Theorem

Let x∗ be any first-order KKT point and let

\[
L_i = \left( \frac{\lambda p}{2 \|a_i\| \sqrt{f(x^*)}} \right)^{\frac{1}{1-p}}.
\]

Then, for any i, either x∗_i = 0 or |x∗_i| ≥ L_i.


Theory of Constrained L2+Lp: Second-Order Bound

Theorem

Let x∗ be any KKT point that satisfies the second-order necessary conditions and let

\[
L_i = \left( \frac{\lambda p (1-p)}{2 \|a_i\|^2} \right)^{\frac{1}{2-p}}.
\]

Then, for any i, either x∗_i = 0 or |x∗_i| ≥ L_i. Moreover, the support columns of x∗ are linearly independent.

Chen, Xu and Y [SIAM Journal on Scientific Computing 2010]
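A small numpy check (my sketch) of the second-order lower bound: every entry of a candidate second-order KKT point should be either zero or at least L_i in magnitude.

```python
import numpy as np

def second_order_lower_bounds(A, lam, p):
    """L_i = (lam*p*(1-p) / (2*||a_i||^2))^(1/(2-p)) for each column a_i of A."""
    col_norms_sq = np.sum(A ** 2, axis=0)
    return (lam * p * (1 - p) / (2 * col_norms_sq)) ** (1.0 / (2 - p))

def satisfies_lower_bound(x_star, A, lam, p, tol=1e-12):
    """True if every entry is (numerically) zero or clears its threshold L_i."""
    L = second_order_lower_bounds(A, lam, p)
    return bool(np.all((np.abs(x_star) <= tol) | (np.abs(x_star) >= L - tol)))
```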


Extension to other Regularizations

Consider the least squares problem with a general nonconvex regularization:

\[
\text{Minimize}_x \quad f(x) := \|Ax - b\|_2^2 + \lambda \sum_i \phi(|x_i|),
\]

where φ(·) is a concave increasing function.

First-order bound: either x∗_i = 0 or 2‖a_i‖√f(x∗) ≥ λ|φ′(|x∗_i|)|.

Second-order bound: either x∗_i = 0 or 2‖a_i‖² ≥ λ|φ″(|x∗_i|)|.


Summary and Future Questions on LSNR

◮ Unfortunately, finding the global minimizer of LSNR problems is (strongly) NP-hard;
◮ but fortunately, finding a KKT point is easy!
◮ Any KKT point of an LSNR problem has the desired structural properties above.
◮ Could one apply statistical analyses to local minimizers or KKT points of LSNR?
◮ When is a local minimizer of LSNR also global?
◮ Faster algorithms for solving LSNR, such as ADMM; does it converge for the two-block splitting

\[
\min \ f(x) + r(y), \quad \text{s.t.} \ x - y = 0, \ x \in X?
\]


Outline

◮ Distributionally Robust Optimization
◮ Online Linear Programming
◮ Least Squares with Nonconvex Regularization
◮ The ADMM Method with Multiple Blocks


Alternating Direction Method of Multipliers I

\[
\min \ \{\theta_1(x_1) + \theta_2(x_2) \mid A_1 x_1 + A_2 x_2 = b, \ x_1 \in X_1, \ x_2 \in X_2\}
\]

  • θ1(x1) and θ2(x2) are convex, closed, proper functions;
  • X1 and X2 are convex sets.

Original ADMM (Glowinski & Marrocco '75, Gabay & Mercier '76):

\[
\begin{cases}
x_1^{k+1} = \arg\min \{ L_A(x_1, x_2^k, \lambda^k) \mid x_1 \in X_1 \}, \\
x_2^{k+1} = \arg\min \{ L_A(x_1^{k+1}, x_2, \lambda^k) \mid x_2 \in X_2 \}, \\
\lambda^{k+1} = \lambda^k - \beta (A_1 x_1^{k+1} + A_2 x_2^{k+1} - b),
\end{cases}
\]

where the augmented Lagrangian function L_A is defined as

\[
L_A(x_1, x_2, \lambda) = \sum_{i=1}^{2} \theta_i(x_i) - \lambda^T \Big( \sum_{i=1}^{2} A_i x_i - b \Big) + \frac{\beta}{2} \Big\| \sum_{i=1}^{2} A_i x_i - b \Big\|^2.
\]
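To make the two-block scheme concrete, here is a minimal sketch specialized to the splitting min ‖Ax − b‖² + reg·‖y‖₁ subject to x − y = 0 (the splitting and names are my choice; it mirrors the update order above with multiplier λ and penalty β):

```python
import numpy as np

def soft_threshold(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def admm_two_block(A, b, reg, beta=1.0, iters=500):
    """Two-block ADMM for min ||Ax - b||_2^2 + reg*||y||_1, s.t. x - y = 0."""
    n = A.shape[1]
    x, y, lam = np.zeros(n), np.zeros(n), np.zeros(n)
    Q = 2 * A.T @ A + beta * np.eye(n)      # x-update normal equations
    for _ in range(iters):
        # x-step: minimize ||Ax-b||^2 - lam^T x + (beta/2)*||x - y||^2
        x = np.linalg.solve(Q, 2 * A.T @ b + lam + beta * y)
        # y-step: proximal operator of the l1 term
        y = soft_threshold(x - lam / beta, reg / beta)
        # multiplier step
        lam = lam - beta * (x - y)
    return x
```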


ADMM for Multi-block Convex Minimization Problems

Convex minimization problems with three blocks:

\[
\begin{array}{ll}
\min & \theta_1(x_1) + \theta_2(x_2) + \theta_3(x_3) \\
\text{s.t.} & A_1 x_1 + A_2 x_2 + A_3 x_3 = b \\
& x_1 \in X_1, \ x_2 \in X_2, \ x_3 \in X_3
\end{array}
\]

The direct and natural extension of ADMM:

\[
\begin{cases}
x_1^{k+1} = \arg\min \{ L_A(x_1, x_2^k, x_3^k, \lambda^k) \mid x_1 \in X_1 \} \\
x_2^{k+1} = \arg\min \{ L_A(x_1^{k+1}, x_2, x_3^k, \lambda^k) \mid x_2 \in X_2 \} \\
x_3^{k+1} = \arg\min \{ L_A(x_1^{k+1}, x_2^{k+1}, x_3, \lambda^k) \mid x_3 \in X_3 \} \\
\lambda^{k+1} = \lambda^k - \beta (A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3^{k+1} - b)
\end{cases}
\]

\[
L_A(x_1, x_2, x_3, \lambda) = \sum_{i=1}^{3} \theta_i(x_i) - \lambda^T \Big( \sum_{i=1}^{3} A_i x_i - b \Big) + \frac{\beta}{2} \Big\| \sum_{i=1}^{3} A_i x_i - b \Big\|^2
\]


Existing Theoretical Results of the Extended ADMM

It is not easy to analyze the convergence: the operator theory for the two-block ADMM cannot be directly extended to the ADMM with three blocks. There is a big difference between the ADMM with two blocks and with three blocks.

Existing results for global convergence:

  • Strong convexity, plus β in a specific range (Han & Yuan '12).
  • Certain conditions on the problem, plus a sufficiently small stepsize γ (Hong & Luo '12):

\[
\lambda^{k+1} = \lambda^k - \gamma \beta (A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3^{k+1} - b).
\]

  • A correction term (He et al. '12, He et al. IMA, Deng et al. '14, Ma et al. '14, ...).

But these results did not answer the open question of whether or not the direct extension of ADMM converges under the simple convexity assumption.


Divergent Example of the Extended ADMM I

We simply consider a system of homogeneous linear equations in three variables, A1x1 + A2x2 + A3x3 = 0, where

\[
A = (A_1, A_2, A_3) = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 2 \\ 1 & 2 & 2 \end{pmatrix}.
\]

Then the extended ADMM with β = 1 can be written as the linear map (λ ∈ R³, so both matrices are 6 × 6):

\[
\begin{pmatrix}
3 & 0 & 0 & 0 & 0 & 0 \\
4 & 6 & 0 & 0 & 0 & 0 \\
5 & 7 & 9 & 0 & 0 & 0 \\
1 & 1 & 1 & 1 & 0 & 0 \\
1 & 1 & 2 & 0 & 1 & 0 \\
1 & 2 & 2 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} x_1^{k+1} \\ x_2^{k+1} \\ x_3^{k+1} \\ \lambda^{k+1} \end{pmatrix}
=
\begin{pmatrix}
0 & -4 & -5 & 1 & 1 & 1 \\
0 & 0 & -7 & 1 & 1 & 2 \\
0 & 0 & 0 & 1 & 2 & 2 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} x_1^k \\ x_2^k \\ x_3^k \\ \lambda^k \end{pmatrix}.
\]
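The divergence can be checked numerically; the sketch below (mine) builds the two matrices above from the Gauss-Seidel structure of the three block minimizations and verifies that the iteration map has spectral radius greater than one:

```python
import numpy as np

A = np.array([[1., 1., 1.], [1., 1., 2.], [1., 2., 2.]])
G = A.T @ A                            # Gram matrix: G[i, j] = A_i^T A_j
L = np.zeros((6, 6)); R = np.zeros((6, 6))
L[:3, :3] = np.tril(G)                 # blocks solved sequentially (Gauss-Seidel)
L[3:, :3] = A                          # lambda^{k+1} = lambda^k - A x^{k+1}
L[3:, 3:] = np.eye(3)
R[:3, :3] = -np.triu(G, 1)             # lagged blocks on the right-hand side
R[:3, 3:] = A.T                        # block i sees A_i^T lambda^k
R[3:, 3:] = np.eye(3)
M = np.linalg.solve(L, R)              # one extended-ADMM sweep: z^{k+1} = M z^k
print(max(abs(np.linalg.eigvals(M))))  # about 1.028 = |0.9836 + 0.2984i| > 1
```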


Divergent Example of the Extended ADMM II

Or, equivalently, after eliminating x1,

\[
\begin{pmatrix} x_2^{k+1} \\ x_3^{k+1} \\ \lambda^{k+1} \end{pmatrix}
= M \begin{pmatrix} x_2^k \\ x_3^k \\ \lambda^k \end{pmatrix},
\quad \text{where} \quad
M = \frac{1}{162}
\begin{pmatrix}
144 & -9 & -9 & -9 & 18 \\
8 & 157 & -5 & 13 & -8 \\
64 & 122 & 122 & -58 & -64 \\
56 & -35 & -35 & 91 & -56 \\
-88 & -26 & -26 & -62 & 88
\end{pmatrix}.
\]


Divergent Example of the Extended ADMM III

The matrix M = V Diag(d) V⁻¹, where the eigenvalues of M include

\[
d = \begin{pmatrix} 0.9836 + 0.2984i \\ 0.9836 - 0.2984i \\ 0.8744 + 0.2310i \\ 0.8744 - 0.2310i \end{pmatrix}.
\]

Note that ρ(M) = |d1| = |d2| > 1.

Theorem

There exists an example where the direct extension of the ADMM with three blocks, started from any real initial point in a certain subspace, is not convergent for any choice of β.

Chen, He, Y, and Yuan [Manuscript 2013]

Corollary

For that example, when starting from a random point, the direct extension of the ADMM with three blocks is not convergent with probability one, for any choice of β.


Strong Convexity Helps?

Consider the following example:

\[
\begin{array}{ll}
\min & 0.05 x_1^2 + 0.05 x_2^2 + 0.05 x_3^2 \\
\text{s.t.} & \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 2 \\ 1 & 2 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = 0.
\end{array}
\]

◮ The linear mapping matrix M of the extended ADMM (β = 1) has ρ(M) = 1.0087 > 1.
◮ One can find a proper initial point such that the extended ADMM diverges.
◮ So even for strongly convex programming, the extended ADMM is not necessarily convergent for a certain β > 0.


The Small-Stepsized ADMM

Recall that in the small-stepsize ADMM, the Lagrangian multiplier is updated by

\[
\lambda^{k+1} := \lambda^k - \gamma \beta (A_1 x_1^{k+1} + A_2 x_2^{k+1} + A_3 x_3^{k+1} - b).
\]

Convergence is proved:

◮ One block (augmented Lagrangian method): γ ∈ (0, 2) (Hestenes '69, Powell '69).
◮ Two blocks (alternating direction method of multipliers): γ ∈ (0, (1 + √5)/2) (Glowinski '84).
◮ Three blocks: γ sufficiently small, provided additional conditions on the problem hold (Hong & Luo '12).

Question: Is there a problem-data-independent γ such that the method converges?


A Numerical Study

For any given γ > 0, consider the linear system

\[
\begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1+\gamma \\ 1 & 1+\gamma & 1+\gamma \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = 0.
\]

Table: The spectral radius of M as γ shrinks

γ       1        0.1      1e-2     1e-3   1e-4   1e-5   1e-6   1e-7
ρ(M)    1.0278   1.0026   1.0001   > 1    > 1    > 1    > 1    > 1

Thus, there seems to be no practical problem-data-independent γ for which the small-stepsize variant of ADMM works. A reproduction script follows below.
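The table can be reproduced with a short script (my sketch, assuming β = 1 and the same sequential block updates as in the earlier example, with only the multiplier step scaled by γ):

```python
import numpy as np

def rho_small_step(gamma):
    """Spectral radius of the small-stepsize ADMM map on the gamma example."""
    A = np.array([[1., 1., 1.],
                  [1., 1., 1. + gamma],
                  [1., 1. + gamma, 1. + gamma]])
    G = A.T @ A
    L = np.zeros((6, 6)); R = np.zeros((6, 6))
    L[:3, :3] = np.tril(G)           # sequential block minimizations
    L[3:, :3] = gamma * A            # lambda^{k+1} = lambda^k - gamma * A x^{k+1}
    L[3:, 3:] = np.eye(3)
    R[:3, :3] = -np.triu(G, 1)
    R[:3, 3:] = A.T
    R[3:, 3:] = np.eye(3)
    return max(abs(np.linalg.eigvals(np.linalg.solve(L, R))))

for g in [1.0, 0.1, 1e-2, 1e-3, 1e-4]:
    print(g, rho_small_step(g))      # stays above 1 for every gamma tried
```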


Summary and Future Questions on ADMM

◮ We construct examples showing that the direct extension of ADMM to multi-block convex minimization problems is not necessarily convergent for any given algorithm parameter β.
◮ Even when the objective function is strongly convex, the direct extension of ADMM loses convergence for certain values of β.
◮ There does not exist a practical problem-data-independent stepsize γ such that the small-stepsize variant of ADMM would work.
◮ Is there a cyclic non-converging example?
◮ Our results support the need for a correction step in ADMM-type methods (He & Tao & Yuan '12, He & Tao & Yuan IMA, ...).
◮ Question: Is there a "simple correction" of the ADMM for multi-block convex minimization problems? Or how can one treat the multiple blocks "equally"?


How to Treat All Blocks Equally?

Answer: independent random permutation in each iteration!

◮ Select the block-update order uniformly at random – this equivalently reduces the ADMM algorithm to one block.
◮ Or fix the first block, and then select the order of the remaining blocks uniformly at random – this equivalently reduces the ADMM algorithm to two blocks.
◮ It works for the divergent example – the expected ρ(M) equals 0.9723!
◮ Does it work in general? (A numerical sketch follows below.)
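A numerical sketch (mine) of the randomly permuted scheme on the divergent example: average the one-sweep linear maps over all six update orders and inspect the spectral radius of the expected map, to compare with the 0.9723 reported on the slide.

```python
import numpy as np
from itertools import permutations

A = np.array([[1., 1., 1.], [1., 1., 2.], [1., 2., 2.]])
G = A.T @ A
maps = []
for order in permutations(range(3)):
    idx = list(order)
    P = np.eye(3)[idx]                     # permutation: y = P x lists blocks
    Gp = P @ G @ P.T                       # Gram matrix in update order
    Ap = A[:, idx]                         # columns of A in update order
    L = np.zeros((6, 6)); R = np.zeros((6, 6))
    L[:3, :3] = np.tril(Gp); L[3:, :3] = Ap; L[3:, 3:] = np.eye(3)
    R[:3, :3] = -np.triu(Gp, 1); R[:3, 3:] = Ap.T; R[3:, 3:] = np.eye(3)
    Mp = np.linalg.solve(L, R)             # sweep in permuted coordinates
    S = np.zeros((6, 6)); S[:3, :3] = P; S[3:, 3:] = np.eye(3)
    maps.append(S.T @ Mp @ S)              # back to the fixed (x, lambda) order
expected_M = sum(maps) / len(maps)
print(max(abs(np.linalg.eigvals(expected_M))))  # spectral radius < 1
```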
