SLIDE 1
SDP Rank Reduction Yinyu Ye, EURO XXII 1
A Unified Theorem on SDP Rank Reduction
Yinyu Ye, Department of Management Science and Engineering and Institute of Computational and Mathematical Engineering, Stanford University, Stanford, CA 94305, U.S.A.
http://www.stanford.edu/~yyye
Joint work with Anthony So and Jiawei Zhang
SLIDE 2
Outline
- Problem Statement
- Application
- New SDP Rank Reduction Theorem and Algorithm
- Sketch of Proof
- Extension and Question
SLIDE 3
Problem Statement
- Consider the system of Semidefinite Programming constraints:
\[ A_i \bullet X = b_i, \quad i = 1, \dots, m, \qquad X \succeq 0, \]
where $A_1, \dots, A_m$ are given $n \times n$ symmetric positive semidefinite matrices, $b_1, \dots, b_m \ge 0$, and $A \bullet X = \sum_{i,j} a_{ij} x_{ij} = \mathrm{Tr}(A^T X)$.
- Clearly, the feasibility of the above system can be “decided” by using SDP
interior-point algorithms (Alizadeh 91, Nesterov and Nemirovskii 93, etc).
- More precisely, they find an $\epsilon$-approximate solution, where the solution time is linear in $\log(1/\epsilon)$.
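As a small illustration of the notation on this slide, the sketch below evaluates $A \bullet X$ and the constraint residuals for a candidate $X$. The helper names (`inner`, `residuals`, `random_psd`) and the random test data are assumptions made for this example only; they are not part of the talk.

```python
import numpy as np

def inner(A, X):
    """A • X = sum_ij a_ij * x_ij, which equals Tr(A^T X)."""
    return float(np.sum(A * X))

def residuals(As, bs, X):
    """Return (A_i • X - b_i for each i, smallest eigenvalue of X)."""
    res = np.array([inner(A, X) - b for A, b in zip(As, bs)])
    min_eig = float(np.linalg.eigvalsh(X).min())
    return res, min_eig

def random_psd(n, k, rng):
    """Hypothetical test data: a random n x n PSD matrix of rank at most k."""
    G = rng.standard_normal((n, k))
    return G @ G.T

rng = np.random.default_rng(0)
n, m = 5, 3
As = [random_psd(n, 2, rng) for _ in range(m)]
X = np.eye(n)
bs = [inner(A, X) for A in As]      # choose b_i so that X = I is feasible
res, min_eig = residuals(As, bs, X)
```

Here `res` is zero and `min_eig` is nonnegative exactly when `X` is feasible for the system.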
SLIDE 4
Problem Statement (Cont’d)
- However, we are interested in finding a low–rank solution to the above system.
- The low–rank problem arises in many applications, e.g.:
– sensor network localization (e.g., Biswas and Y 03, So and Y 04)
– metric embedding/dimension reduction (e.g., Johnson and Lindenstrauss 84, Matousek 90)
– approximating non-convex (complex, quaternion) quadratic optimization (e.g., Nemirovskii, Roos and Terlaky 99, Luo, Sidiropoulos, Tseng and Zhang 06, Faybusovich 07)
– graph rigidity/distance matrix (e.g., Alfakih, Khandani and Wolkowicz 99, etc.)
SLIDE 5
Graph Realization
Given a graph $G = (V, E)$ and sets of non–negative weights, say $\{d_{ij} : (i, j) \in E\}$ and $\{\theta_{ilj} : (i, l, j) \in \Theta\}$, the goal is to compute a realization of $G$ in the Euclidean space $\mathbb{R}^d$ for a given low dimension $d$, i.e.
- to place the vertices of $G$ in $\mathbb{R}^d$ such that
- the Euclidean distance between every pair of adjacent vertices $(i, j) \in E$ equals (or is bounded by) the prescribed weight $d_{ij}$, and
- the angle between edges $(i, l)$ and $(j, l)$ equals (or is bounded by) the prescribed weight $\theta_{ilj}$, $(i, l, j) \in \Theta$.
SLIDE 6
Figure 1: 50-node 2-D Sensor Localization (both axes range from −0.5 to 0.5)
SLIDE 7
Figure 2: A 3-D Tensegrity Graph Realization; provided by Anstreicher
SLIDE 8
Figure 3: Tensegrity Graph: A Needle Tower; provided by Anstreicher
SLIDE 9
Figure 4: Molecular Conformation: 1F39 (1534 atoms) with 85% of distances below 6 Å and 10% noise on upper and lower bounds
SLIDE 10
Math Programming: Rank-Constrained SDP
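Given $a_k \in \mathbb{R}^d$, $d_{ij}$ for $(i, j) \in N_x$, $\hat d_{kj}$ for $(k, j) \in N_a$, and $v_{ilj}$ for $(i, l, j) \in \Theta$, find $x_i \in \mathbb{R}^d$ such that
\[
\begin{aligned}
\|x_i - x_j\|^2 \ (\le)\ =\ (\ge)\ d_{ij}^2, &\quad \forall\, (i, j) \in N_x,\ i < j, \\
\|a_k - x_j\|^2 \ (\le)\ =\ (\ge)\ \hat d_{kj}^2, &\quad \forall\, (k, j) \in N_a, \\
(x_i - x_l)^T (x_j - x_l) \ (\le)\ =\ (\ge)\ v_{ilj}, &\quad \forall\, (i, l, j) \in \Theta,
\end{aligned}
\]
which lead to
\[ A_i \bullet X = b_i, \quad i = 1, \dots, m, \qquad X \succeq 0, \qquad \mathrm{rank}(X) \le d; \]
and relaxed to
\[ A_i \bullet X = b_i, \quad i = 1, \dots, m, \qquad X \succeq 0. \]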
SLIDE 11
Some Background
- Barvinok 95 showed that if the system is feasible, then there exists a solution $X$ whose rank is at most $\sqrt{2m}$ (see also Carathéodory's theorem). Moreover, Pataki 98 showed how to construct such an $X$ efficiently.
- Unfortunately, for the applications mentioned above, this is not enough.
– We want a fixed-low-rank (say d) solution!
- However, there are some issues:
– Such a solution may not exist!
– Even if it does, one may not be able to find it efficiently.
- So we consider an approximation of the problem.
SLIDE 12
Approximating the Problem
We consider the problem of finding an $\hat X \succeq 0$ of rank at most $d$ that satisfies the system approximately:
\[ \beta(m, n, d) \cdot b_i \ \le\ A_i \bullet \hat X \ \le\ \alpha(m, n, d) \cdot b_i \qquad \forall\, i = 1, \dots, m. \]
Here, the distortion factors satisfy $\alpha \ge 1$ and $\beta \in (0, 1]$. Clearly, the closer both are to 1, the better.
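To make the two distortion factors concrete, the following sketch computes the tightest pair $(\alpha, \beta)$ that a given candidate $\hat X$ achieves. The function name `achieved_distortion` and the toy data are hypothetical, and the sketch assumes every $b_i > 0$.

```python
import numpy as np

def achieved_distortion(As, bs, X_hat):
    """Tightest (alpha, beta) with beta*b_i <= A_i • X_hat <= alpha*b_i.

    Assumes every b_i > 0. The results are clamped so that alpha >= 1
    and beta <= 1, matching the convention on the slide.
    """
    ratios = np.array([np.sum(A * X_hat) / b for A, b in zip(As, bs)])
    alpha = max(1.0, float(ratios.max()))
    beta = min(1.0, float(ratios.min()))
    return alpha, beta

# Toy data: two diagonal constraints and a candidate that overshoots one
# constraint by 50% and undershoots the other by 50%.
As = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
bs = [1.0, 1.0]
X_hat = np.diag([1.5, 0.5])
alpha, beta = achieved_distortion(As, bs, X_hat)
```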
SLIDE 15
Our Result
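Theorem 1. Suppose that the original system is feasible. Let $r = \max_i \{\mathrm{rank}(A_i)\}$. Then, for any $d \ge 1$, there exists an $\hat X \succeq 0$ of rank at most $d$ such that $\beta(m, n, d) \cdot b_i \le A_i \bullet \hat X \le \alpha(m, n, d) \cdot b_i$ for all $i$, where:
\[
\alpha(m, n, d) =
\begin{cases}
1 + \dfrac{12 \log(4mr)}{d} & \text{for } 1 \le d \le 12 \log(4mr), \\[2ex]
1 + \sqrt{\dfrac{12 \log(4mr)}{d}} & \text{for } d > 12 \log(4mr),
\end{cases}
\]
\[
\beta(m, n, d) =
\begin{cases}
\dfrac{1}{5e} \cdot \dfrac{1}{m^{2/d}} & \text{for } 1 \le d \le \dfrac{2 \log m}{\log\log(2m)}, \\[2ex]
\dfrac{1}{4e} \cdot \dfrac{1}{\log^{f(m)/d}(2m)} & \text{for } \dfrac{2 \log m}{\log\log(2m)} < d \le 4 \log(4mr), \\[2ex]
1 - \sqrt{\dfrac{4 \log(4mr)}{d}} & \text{for } d > 4 \log(4mr),
\end{cases}
\]
where $f(m) = 3 \log m / \log\log(2m)$. Moreover, such an $\hat X$ can be found in randomized polynomial time.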
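For a feel of the magnitudes involved, the piecewise bounds of Theorem 1 can be tabulated with a few lines of Python. This is only a sketch: the function names are made up for the example, natural logarithms are assumed, $m \ge 2$ is required so that $\log\log(2m) > 0$, and the square-root form of the two large-$d$ branches is an assumption of this sketch.

```python
import math

def alpha_bound(m, r, d):
    """Upper distortion factor alpha(m, n, d); it depends on m, r and d only."""
    L = 12 * math.log(4 * m * r)
    return 1 + L / d if d <= L else 1 + math.sqrt(L / d)

def beta_bound(m, r, d):
    """Lower distortion factor beta(m, n, d); requires m >= 2."""
    t1 = 2 * math.log(m) / math.log(math.log(2 * m))
    t2 = 4 * math.log(4 * m * r)
    if d <= t1:
        return 1 / (5 * math.e * m ** (2 / d))
    if d <= t2:
        f = 3 * math.log(m) / math.log(math.log(2 * m))
        return 1 / (4 * math.e * math.log(2 * m) ** (f / d))
    return 1 - math.sqrt(t2 / d)

# Example: m = 100 constraints, each of rank r = 1, for a few target ranks d.
table = [(d, alpha_bound(100, 1, d), beta_bound(100, 1, d)) for d in (1, 2, 3, 10, 1000)]
```

Under these assumptions the two branches of `alpha_bound` agree at the threshold $d = 12\log(4mr)$, where both equal 2.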
SLIDE 16
Some Remarks
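In general, the data parameter $r$ can be bounded by $\sqrt{2m}$, so that
\[
\alpha(m, n, d) = 1 + O\!\left(\frac{\log m}{d}\right),
\qquad
\beta(m, n, d) =
\begin{cases}
\Omega\!\left(m^{-2/d}\right) & \text{for } d = O\!\left(\dfrac{\log m}{\log\log m}\right), \\[2ex]
\Omega\!\left((\log m)^{-3 \log m/(d \log\log m)}\right) & \text{otherwise.}
\end{cases}
\]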
SLIDE 17
Some Remarks (Cont’d)
- In the region 1 ≤ d ≤ 2 log m/ log log(2m), the lower bound β is
independent of the ranks of A1, . . . , Am.
- $f(m)/d \le 3/2$ in the region $d > 2 \log m / \log\log(2m)$.
- $\sqrt{4 \log(4mr)/d}$ is a constant in the region $d > 4 \log(4mr)$.
- Our result contains as special cases several well-known results in the
literature.
SLIDE 18
Early Result: Metric Embedding
- Given an n–point set $V = \{v_1, \dots, v_n\}$ in $\mathbb{R}^l$, we would like to embed it into a low–dimensional Euclidean space as faithfully as possible.
- Specifically, a map $f : V \to \mathbb{R}^d$ is an $\alpha$–embedding (where $\alpha \ge 1$) if
\[ \|u - v\|_2 \ \le\ \|f(u) - f(v)\|_2 \ \le\ \alpha \cdot \|u - v\|_2 \qquad \text{for all } u, v \in V. \]
The goal is to find an $f$ such that $\alpha$ is as small as possible.
– for any $\epsilon > 0$, a $(1 + \epsilon)$–embedding into $\mathbb{R}^{O(\epsilon^{-2} \log n)}$ exists (Johnson–Lindenstrauss);
– for any fixed $d \ge 1$, an $O(n^{2/d} d^{-1/2} \log^{1/2} n)$–embedding into $\mathbb{R}^d$ exists (Matousek).
SLIDE 19
Early Result: Metric Embedding (Cont’d)
We can get these results from our Theorem. We focus on the fixed d case.
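- Let $\{e_i\}_{i=1}^n$ be the standard basis vectors, and set $E_{ij} = (e_i - e_j)(e_i - e_j)^T$.
- Let $U$ be the $l \times n$ matrix whose $i$–th column is $v_i$. Then, $X = U^T U$ satisfies the system
\[ E_{ij} \bullet X = \|v_i - v_j\|_2^2 \qquad \text{for } 1 \le i < j \le n. \]
- By our Theorem, we can find an $\hat X \succeq 0$ of rank at most $d$ such that:
\[ \Omega(n^{-4/d}) \cdot \|v_i - v_j\|_2^2 \ \le\ E_{ij} \bullet \hat X \ \le\ O(\log n / d) \cdot \|v_i - v_j\|_2^2. \]
- Writing $\hat X = \hat U^T \hat U$, where $\hat U$ is $d \times n$, we recover points $\hat u_1, \dots, \hat u_n \in \mathbb{R}^d$ such that:
\[ \Omega(n^{-2/d}) \cdot \|v_i - v_j\|_2 \ \le\ \|\hat u_i - \hat u_j\|_2 \ \le\ O\!\left(\sqrt{\log n / d}\right) \cdot \|v_i - v_j\|_2. \]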
The embedding results imply only a weaker version (r = 1) of our theorem.
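The correspondence between the lifted matrix and the embedded points can be checked numerically. The sketch below assumes the rank-$d$ matrix is produced by a Gaussian projection (the randomized construction sketched later in the talk, applied to $X = U^T U$). The sizes and the seed are arbitrary, and the code calls the point matrix `V` where the slide calls it $U$.

```python
import numpy as np

rng = np.random.default_rng(1)
l, n, d = 50, 20, 5
V = rng.standard_normal((l, n))                       # columns are the points v_i in R^l
Xi = rng.normal(0.0, np.sqrt(1.0 / d), size=(l, d))   # i.i.d. N(0, 1/d) entries
U_hat = Xi.T @ V                                      # d x n; columns are the embedded points
X_hat = U_hat.T @ U_hat                               # n x n, PSD, rank at most d

def E(i, j, n):
    """E_ij = (e_i - e_j)(e_i - e_j)^T."""
    e = np.zeros(n)
    e[i], e[j] = 1.0, -1.0
    return np.outer(e, e)

i, j = 0, 1
lifted = np.sum(E(i, j, n) * X_hat)                   # E_ij • X_hat
embedded = np.sum((U_hat[:, i] - U_hat[:, j]) ** 2)   # squared distance after embedding
original = np.sum((V[:, i] - V[:, j]) ** 2)           # squared distance before embedding
ratio = embedded / original                           # realized squared distortion for this pair
```

The quantities `lifted` and `embedded` coincide, which is the identity $E_{ij} \bullet \hat X = \|\hat u_i - \hat u_j\|_2^2$ used on this slide.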
SLIDE 20
Early Result: Approximating QPs
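- Let $A_1, \dots, A_m$ be positive semidefinite. Consider the following QP:
\[
\begin{aligned}
v^* = \text{maximize} \quad & x^T A x \\
\text{subject to} \quad & x^T A_i x \le 1, \quad i = 1, \dots, m
\end{aligned}
\]
and its natural SDP relaxation:
\[
\begin{aligned}
v^*_{sdp} = \text{maximize} \quad & A \bullet X \\
\text{subject to} \quad & A_i \bullet X \le 1, \quad i = 1, \dots, m; \quad X \succeq 0.
\end{aligned}
\]
- Let $X^*$ be an optimal solution to the SDP.
- Nemirovskii et al. showed that one can randomly extract a rank–1 matrix $\hat X$ from $X^*$ such that it is feasible for the SDP and that
\[ \mathrm{E}[A \bullet \hat X] \ \ge\ \Omega(\log^{-1} m)\, v^*. \]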
SLIDE 21
Early Result: Approximating QPs (Cont’d)
We can obtain a similar result from our Theorem.
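- The matrix $X^*$ satisfies the system:
\[ A_i \bullet X^* = b_i \le 1, \qquad i = 1, \dots, m. \]
- Our proof of the Theorem shows that one can find a rank–1 matrix $\hat X \succeq 0$ such that:
\[ \mathrm{E}[A \bullet \hat X] = v^*_{sdp}, \qquad A_i \bullet \hat X \le O(\log m) \cdot b_i, \quad i = 1, \dots, m. \]
- Thus, by scaling down $\hat X$ by a factor of $O(\log m)$, we obtain a feasible rank–1 matrix $\hat X'$ that satisfies $\mathrm{E}[A \bullet \hat X'] \ge \Omega(\log^{-1} m)\, v^*$.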
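The scaling step can be written out explicitly. In the sketch below, $c$ is a placeholder for the constant hidden in the $O(\log m)$ bound (an assumption of this note, not a value given in the talk), and the last inequality uses $v^*_{sdp} \ge v^*$, which holds because the SDP relaxes a maximization problem.

```latex
\[
\hat X' = \frac{\hat X}{c \log m}
\quad\Longrightarrow\quad
A_i \bullet \hat X' \ \le\ \frac{c \log m \cdot b_i}{c \log m} = b_i \ \le\ 1,
\qquad
\mathrm{E}[A \bullet \hat X'] = \frac{v^*_{sdp}}{c \log m} \ \ge\ \frac{v^*}{c \log m}.
\]
```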
SLIDE 22
Early Result: Approximating QPs (Cont’d)
- Luo et al. considered the following real (complex) QP:
\[
\begin{aligned}
\text{minimize} \quad & x^T A x \\
\text{subject to} \quad & x^T A_i x \ge 1, \quad i = 1, \dots, m
\end{aligned}
\]
and its natural SDP relaxation:
\[
\begin{aligned}
\text{minimize} \quad & A \bullet X \\
\text{subject to} \quad & A_i \bullet X \ge 1, \quad i = 1, \dots, m; \quad X \succeq 0.
\end{aligned}
\]
- They showed how to extract a solution $\hat x$ from an optimal solution matrix to the SDP so that it is feasible for the SDP and that it is within a factor $O(m^{-2})$ ($O(m^{-1})$) of the optimal.
- Again, we can obtain the same results from our Theorem on both real ($d = 1$) and complex ($d = 2$) spaces.
SLIDE 23
How Sharp are the Bounds?
- For metric embedding, it is known that:
– for any $d \ge 1$, there exists an n–point set $V \subset \mathbb{R}^{d+1}$ such that any embedding of $V$ into $\mathbb{R}^d$ requires $D = \Omega(n^{1/\lfloor (d+1)/2 \rfloor})$ (Matousek);
– there exists an n–point set $V \subset \mathbb{R}^l$ for some $l$ such that for any $\epsilon \in (n^{-1/2}, 1/2)$, say, a $(1 + \epsilon)$–embedding of $V$ into $\mathbb{R}^d$ will require $d = \Omega((\epsilon^2 \log(1/\epsilon))^{-1} \log n)$ (Alon 03).
Thus, from the metric embedding perspective, the ratio of our upper and lower bounds is almost tight for $d \ge 3$.
SLIDE 24
How Sharp are the Bounds? (Cont’d)
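- For the QP
\[
\begin{aligned}
v^* = \text{maximize} \quad & x^T A x \\
\text{subject to} \quad & x^T A_i x \le 1, \quad i = 1, \dots, m
\end{aligned}
\]
and its natural SDP relaxation:
\[
\begin{aligned}
v^*_{sdp} = \text{maximize} \quad & A \bullet X \\
\text{subject to} \quad & A_i \bullet X \le 1, \quad i = 1, \dots, m; \quad X \succeq 0,
\end{aligned}
\]
Nemirovskii et al. showed that the ratio between $v^*$ and $v^*_{sdp}$ can be as large as $\Omega(\log m)$.
- For the minimization version, Luo et al. showed that the ratio can be as small as $\Omega(m^{-2})$.
Thus, from the QP perspective, the ratio of our upper and lower bounds is almost tight for $d = 1$.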
SLIDE 25
Sketch of Proof of the Theorem
- Without loss of generality, we may assume that $X = I$ is feasible for the original system, that is, our system becomes
\[ A_i \bullet X = \mathrm{Tr}(A_i), \quad i = 1, \dots, m; \qquad X \succeq 0. \]
- Thus, the Theorem becomes:
Theorem 2. Let $A_1, \dots, A_m$ be $n \times n$ positive semidefinite matrices. Then, for any $d \ge 1$, there exists an $\hat X \succeq 0$ with rank at most $d$ such that:
\[ \beta(m, n, d) \cdot \mathrm{Tr}(A_i) \ \le\ A_i \bullet \hat X \ \le\ \alpha(m, n, d) \cdot \mathrm{Tr}(A_i) \qquad \forall\, i = 1, \dots, m. \]
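One way to justify the "without loss of generality" step is the standard change of variables below. The symbols $U$, $A_i'$, $Z$ and $k$ are introduced here for illustration and do not appear in the talk.

```latex
% Let X be feasible, factor X = U U^T with U of size n x k, and set A_i' = U^T A_i U (PSD, k x k).
\[
A_i' \bullet I = \mathrm{Tr}(U^T A_i U) = A_i \bullet (U U^T) = A_i \bullet X = b_i ,
\qquad \text{so } \mathrm{Tr}(A_i') = b_i .
\]
% Any PSD matrix Z of rank at most d yields \hat X = U Z U^T, which is PSD with rank at most d, and
\[
A_i \bullet \hat X = \mathrm{Tr}(A_i U Z U^T) = (U^T A_i U) \bullet Z = A_i' \bullet Z .
\]
```

Hence bounds of the form $\beta\, \mathrm{Tr}(A_i') \le A_i' \bullet Z \le \alpha\, \mathrm{Tr}(A_i')$ translate directly into $\beta\, b_i \le A_i \bullet \hat X \le \alpha\, b_i$, and $\mathrm{rank}(A_i') \le \mathrm{rank}(A_i)$, so the parameter $r$ does not increase.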
SLIDE 26
Sketch of Proof of the Theorem (Cont’d)
The algorithm for generating $\hat X$ is simple:
- generate i.i.d. Gaussian RVs $\xi_i^j$ with mean 0 and variance $1/d$, and define the column vector $\xi^j = (\xi_1^j; \dots; \xi_n^j)$, for $1 \le i \le n$ and $1 \le j \le d$;
- set $\hat X = \sum_{j=1}^d \xi^j (\xi^j)^T$.
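The two steps above can be sketched in NumPy as follows. The slide treats the normalized case $X = I$; for a general feasible $X$ this sketch first factors $X = U U^T$ and returns $U \big(\sum_j \xi^j (\xi^j)^T\big) U^T$, which is how the "without loss of generality" reduction is read here (an assumption of this example). The function name `round_to_rank_d` and the test data are hypothetical.

```python
import numpy as np

def round_to_rank_d(X, d, rng):
    """Randomized rank reduction of a PSD matrix X to rank at most d."""
    w, Q = np.linalg.eigh(X)
    w = np.clip(w, 0.0, None)              # guard against tiny negative eigenvalues
    U = Q * np.sqrt(w)                     # scales the columns of Q, so X = U U^T
    # Columns of Xi are xi^1, ..., xi^d with i.i.d. N(0, 1/d) entries.
    Xi = rng.normal(0.0, np.sqrt(1.0 / d), size=(X.shape[0], d))
    W = U @ Xi                             # n x d factor of the rounded matrix
    return W @ W.T                         # equals U (sum_j xi^j xi^j^T) U^T

rng = np.random.default_rng(0)
n, m, d = 8, 4, 2
# Hypothetical data: random rank-2 PSD matrices A_i; X = I is feasible with b_i = Tr(A_i).
As = []
for _ in range(m):
    G = rng.standard_normal((n, 2))
    As.append(G @ G.T)
X = np.eye(n)
X_hat = round_to_rank_d(X, d, rng)
ratios = [np.sum(A * X_hat) / np.trace(A) for A in As]   # realized distortion per constraint
```

Since $\mathrm{E}[\hat X] = X$, each entry of `ratios` has expectation 1; the Theorem bounds how far the ratios can stray from 1 simultaneously.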
SLIDE 27
Sketch of Proof of the Theorem (Cont’d)
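The analysis makes use of the following Markov inequality:
Lemma 1. Let $\xi_1, \dots, \xi_n$ be i.i.d. standard Gaussian RVs. Let $\alpha \in (1, \infty)$ and $\beta \in (0, 1)$ be constants, and let $U_n = \sum_{i=1}^n \xi_i^2$ be the Chi-square variable. Then, the following hold:
\[
\Pr(U_n \ge \alpha n) \ \le\ \exp\!\left[\frac{n}{2}(1 - \alpha + \log \alpha)\right],
\qquad
\Pr(U_n \le \beta n) \ \le\ \exp\!\left[\frac{n}{2}(1 - \beta + \log \beta)\right].
\]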
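A quick Monte Carlo comparison shows how the two tail bounds of Lemma 1 relate to empirical frequencies. The parameters ($n = 10$, $\alpha = 2$, $\beta = 1/2$), the sample size and the seed are arbitrary choices for this sketch.

```python
import numpy as np

def upper_tail_bound(n, alpha):
    """Lemma 1 bound on Pr(U_n >= alpha * n), for alpha > 1."""
    return float(np.exp(0.5 * n * (1 - alpha + np.log(alpha))))

def lower_tail_bound(n, beta):
    """Lemma 1 bound on Pr(U_n <= beta * n), for 0 < beta < 1."""
    return float(np.exp(0.5 * n * (1 - beta + np.log(beta))))

rng = np.random.default_rng(2)
n, trials = 10, 200_000
U = rng.chisquare(n, size=trials)            # samples of U_n
emp_upper = float(np.mean(U >= 2.0 * n))     # empirical Pr(U_n >= 2n)
emp_lower = float(np.mean(U <= 0.5 * n))     # empirical Pr(U_n <= n/2)
bounds = (upper_tail_bound(n, 2.0), lower_tail_bound(n, 0.5))
```

In this setting both empirical frequencies fall below the corresponding bounds, as the lemma predicts.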
SLIDE 29
Sketch of Proof of the Theorem (Cont’d)
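Lemma 2. Let $H$ be an $n \times n$ positive semidefinite matrix and $r = \mathrm{rank}(H)$. Then, for any $\beta \in (0, 1)$, we have:
\[ \Pr\!\left(H \bullet \hat X \le \beta\, \mathrm{Tr}(H)\right) \ \le\ r \cdot \exp\!\left[\frac{d}{2}(1 - \beta + \log \beta)\right]. \tag{1} \]
On the other hand, if $\beta$ satisfies $e \beta \log r \le 1/5$, then the above can be sharpened to:
\[ \Pr\!\left(H \bullet \hat X \le \beta\, \mathrm{Tr}(H)\right) \ \le\ \left(\frac{5 e \beta}{2}\right)^{d/2}. \tag{2} \]
Note that (2) is independent of $r$! For any $\alpha > 1$, we have:
\[ \Pr\!\left(H \bullet \hat X \ge \alpha\, \mathrm{Tr}(H)\right) \ \le\ r \cdot \exp\!\left[\frac{d}{2}(1 - \alpha + \log \alpha)\right]. \tag{3} \]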
SLIDE 30
Sketch of Proof of the Theorem (Cont’d)
- It is easy to establish (1) and (3) using Lemma 1.
- By applying the bounds (1) and (3) of Lemma 2 to each A1, . . . , Am and
taking the union bound, we can get the upper bound in the Theorem. However, the lower bound obtained this way is weaker.
- To obtain a better lower bound (for the region
1 ≤ d ≤ 2 log m/ log log(2m)), we use the bound (2) in Lemma 2.
- To prove it, consider the spectral decomposition $H = \sum_{k=1}^r \lambda_k v_k v_k^T$ with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_r > 0$.
SLIDE 31
Sketch of Proof of the Theorem (Cont’d)
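- Recall that it says if $\beta \in (0, 1)$ satisfies $e \beta \log r \le 1/5$, then
\[ \Pr\!\left(H \bullet \hat X \le \beta\, \mathrm{Tr}(H)\right) \ \le\ (5 e \beta / 2)^{d/2}. \]
- First, by the spectral decomposition, we have
\[
H \bullet \hat X
= \left(\sum_{k=1}^r \lambda_k v_k v_k^T\right) \bullet \left(\sum_{j=1}^d \xi^j (\xi^j)^T\right)
= \sum_{k=1}^r \sum_{j=1}^d \lambda_k (v_k^T \xi^j)^2 .
\]
- Note that $(v_k^T \xi^j)_{k,j} \sim N(0, d^{-1})$ and they are mutually independent.
- Thus, $H \bullet \hat X$ has the same distribution as the weighted Chi-square $\sum_{k=1}^r \lambda_k \sum_{j=1}^d \tilde\xi_{kj}^2$, where the $\tilde\xi_{kj}$ are i.i.d. Gaussian $N(0, d^{-1})$.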
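The algebraic identity behind this step is easy to confirm numerically. The sketch below builds a random rank-$r$ matrix $H$ (hypothetical test data) and compares $H \bullet \hat X$ with the double sum over eigenpairs; the sizes and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
n, r, d = 6, 3, 4
G = rng.standard_normal((n, r))
H = G @ G.T                                  # PSD with rank r
lam, V = np.linalg.eigh(H)                   # eigenvalues in ascending order
lam, V = lam[-r:], V[:, -r:]                 # keep the r positive eigenpairs
Xi = rng.normal(0.0, np.sqrt(1.0 / d), size=(n, d))   # columns xi^1, ..., xi^d
X_hat = Xi @ Xi.T                            # sum_j xi^j (xi^j)^T

lhs = np.sum(H * X_hat)                      # H • X_hat
proj = V.T @ Xi                              # entry (k, j) is v_k^T xi^j
rhs = np.sum(lam[:, None] * proj ** 2)       # sum_k sum_j lambda_k (v_k^T xi^j)^2
```

Because the $v_k$ are orthonormal, the entries of `proj` are independent $N(0, 1/d)$ variables, which is what turns $H \bullet \hat X$ into a weighted Chi-square.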
SLIDE 32
Sketch of Proof of the Theorem (Cont’d)
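- Let $\bar\lambda_k = \lambda_k / \sum_{k=1}^r \lambda_k$. It then follows that:
\[
\Pr\!\left(H \bullet \hat X \le \beta\, \mathrm{Tr}(H)\right)
= \Pr\!\left(\sum_{k=1}^r \lambda_k \sum_{j=1}^d \tilde\xi_{kj}^2 \le \beta \sum_{k=1}^r \lambda_k\right)
= \Pr\!\left(\sum_{k=1}^r \bar\lambda_k \sum_{j=1}^d \tilde\xi_{kj}^2 \le \beta\right)
\equiv p(r, \bar\lambda, \beta).
\]
- We now bound $p(r, \bar\lambda, \beta)$. On the one hand, by replacing all $\bar\lambda_k$ by the smallest one $\bar\lambda_r$ and using the tail estimates of Lemma 1, we have:
\[
p(r, \bar\lambda, \beta)
\ \le\ \Pr\!\left(\bar\lambda_r \sum_{k=1}^r \sum_{j=1}^d \tilde\xi_{kj}^2 \le \beta\right)
\ \le\ \left(\frac{e \beta}{r \bar\lambda_r}\right)^{rd/2}.
\]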
SLIDE 33
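- On the other hand, by dropping the smallest term $\bar\lambda_r$ in the summation:
\[
p(r, \bar\lambda, \beta)
\ \le\ \Pr\!\left(\sum_{k=1}^{r-1} \sum_{j=1}^d \bar\lambda_k \tilde\xi_{kj}^2 \le \beta\right)
= \Pr\!\left(\sum_{k=1}^{r-1} \sum_{j=1}^d \frac{\bar\lambda_k}{1 - \bar\lambda_r}\, \tilde\xi_{kj}^2 \le \frac{\beta}{1 - \bar\lambda_r}\right)
\equiv p\!\left(r - 1,\ \frac{\bar\lambda_{1:r-1}}{1 - \bar\lambda_r},\ \frac{\beta}{1 - \bar\lambda_r}\right).
\]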
SLIDE 34
Sketch of Proof of the Theorem (Cont’d)
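- By unrolling the recursive formula, we have:
\[ p(r, \bar\lambda, \beta) \ \le\ \min_{1 \le k \le r} \left(\frac{e \beta}{k \bar\lambda_k}\right)^{kd/2}. \]
- Let $\gamma = p(r, \bar\lambda, \beta)^{2/d}$. Note that $\gamma \in (0, 1)$. From the above, we have
\[ \bar\lambda_k \ \le\ \frac{e \beta}{k \gamma^{1/k}} \qquad \text{for } k = 1, \dots, r. \]
- Upon summing over $k$ and using the fact that $\sum_{k=1}^r \bar\lambda_k = 1$, we obtain:
\[ e \beta \sum_{k=1}^r \frac{1}{k \gamma^{1/k}} \ \ge\ 1. \]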
SLIDE 35
Sketch of Proof of the Theorem (Cont’d)
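- Note that
\[
\sum_{k=1}^r \frac{1}{k \gamma^{1/k}}
\ \le\ \frac{1}{\gamma} + \int_1^r \frac{1}{t \gamma^{1/t}}\, dt
= \frac{1}{\gamma} + \int_{\log(1/\gamma)/r}^{\log(1/\gamma)} \frac{e^t}{t}\, dt .
\]
- Then, one can show that the above implies that:
\[ \frac{1}{e \beta} \ \le\ \frac{2}{\gamma} + \log r . \]
- Together with the assumption that $e \beta \log r \le 1/5$, we conclude that:
\[ \frac{5 e \beta}{2} \ \ge\ \gamma = \Pr\!\left(H \bullet \hat X \le \beta\, \mathrm{Tr}(H)\right)^{2/d} \]
as desired.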
SLIDE 36
SDP with an Objective Function
Our result can be used for solving SDP with an objective function:
\[ \min\ C \bullet X, \quad \text{subject to } A_i \bullet X = b_i \ \text{ for } i = 1, \dots, m; \quad X \succeq 0. \]
When $X$ is optimal, there must be a $(\bar S, \bar y)$ feasible for the dual such that $\bar S \bullet X = 0$ (under a mild condition).
One can treat $\bar S \bullet X = 0$ as an equality constraint. Thus, the rounding method will preserve $\bar S \bullet \hat X = 0$, that is, the low–rank $\hat X$ is optimal for a “nearby” problem to the original SDP with the identical objective.
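The reason the rounding preserves complementarity can be spelled out in two lines. The sketch assumes the rounded matrix has the form $\hat X = U Z U^T$ with $X = U U^T$ and $Z \succeq 0$ (notation introduced here, not in the talk), and uses $\bar S \succeq 0$ from dual feasibility.

```latex
\[
0 = \bar S \bullet X = \mathrm{Tr}(U^T \bar S U)
\quad\text{and}\quad U^T \bar S U \succeq 0
\quad\Longrightarrow\quad U^T \bar S U = 0 ,
\]
\[
\bar S \bullet \hat X = \mathrm{Tr}(\bar S\, U Z U^T) = \mathrm{Tr}\big((U^T \bar S U)\, Z\big) = 0 .
\]
```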
SLIDE 37
Question
- Is there a deterministic algorithm? Choose the $d$ largest eigenvalue components of $X$?
- In practical applications, we see much smaller distortion. Why?
- Add a regularization objective to find a low–rank SDP solution?