Sparsity and decomposition in semidefinite optimization

Sparsity and decomposition in semidefinite optimization
Lieven Vandenberghe, ECE Department, UCLA
Joint work with Joachim Dahl, Martin S. Andersen, Yifan Sun, Xin Jiang
DYSCO PAI/IUAP Network Study Day, Leuven, November 28, 2017

Semidefinite program (SDP)
\[
\begin{array}{ll}
\text{minimize} & \operatorname{tr}(CX) \\
\text{subject to} & \operatorname{tr}(A_i X) = b_i, \quad i = 1, \ldots, m \\
& X \succeq 0
\end{array}
\]
variable $X$ is an $n \times n$ symmetric matrix; $X \succeq 0$ means $X$ is positive semidefinite
• matrix inequalities arise naturally in many areas (for example, control, statistics)
• used in convex modeling systems (CVX, YALMIP, CVXPY, ...)
• relaxations of nonconvex quadratic and polynomial optimization
Algorithms
• primal-dual interior-point algorithms (used in SeDuMi, SDPT3, MOSEK)
• nonlinear programming methods based on parameterization $X = YY^T$
• first order methods
This talk: structure in solution $X$ that results from sparsity in coefficients $A_i$, $C$

Band structure
cost of solving SDP with banded matrices (bandwidth 11, 100 constraints)
[Figure: log-log plot of time per iteration (seconds) versus $n$ for SDPT3 and SeDuMi, with an $O(n^2)$ reference line]
• for bandwidth 1 (linear program), cost/iteration is linear in $n$
• for bandwidth $> 1$, cost grows as $n^2$ or faster
[Andersen, Dahl, Vandenberghe 2010]

Power flow optimization
an optimization problem with non-convex quadratic constraints
Variables
• complex voltage $v_i$ at each node (bus) of the network
• complex power flow $s_{ij}$ entering the link (line) from node $i$ to node $j$
Non-convex constraints
• (lower) bounds on voltage magnitudes: $v_{\min} \leq |v_i| \leq v_{\max}$
• flow balance equations: $s_{ij} + s_{ji} = \bar{g}_{ij} |v_i - v_j|^2$
[Figure: line between bus $i$ and bus $j$ with admittance $g_{ij}$ and power flows $s_{ij}$, $s_{ji}$ entering at the two ends]
$g_{ij}$ is admittance of line from node $i$ to $j$

Semidefinite relaxation of optimal power flow problem
• introduce matrix variable $X = \operatorname{Re}(vv^H)$, i.e., with elements $X_{ij} = \operatorname{Re}(v_i \bar{v}_j)$
• voltage bounds and flow balance equations are convex in $X$:
\[
\begin{array}{rcl}
v_{\min} \leq |v_i| \leq v_{\max} & \longrightarrow & v_{\min}^2 \leq X_{ii} \leq v_{\max}^2 \\
s_{ij} + s_{ji} = \bar{g}_{ij} |v_i - v_j|^2 & \longrightarrow & s_{ij} + s_{ji} = \bar{g}_{ij} (X_{ii} + X_{jj} - 2X_{ij})
\end{array}
\]
• replace constraint $X = \operatorname{Re}(vv^H)$ with weaker constraint $X \succeq 0$
• relaxation is exact if optimal $X$ has rank two
Sparsity in SDP relaxation: off-diagonal $X_{ij}$ appears in constraints only if there is a line between buses $i$ and $j$
[Jabr 2006] [Bai et al. 2008] [Lavaei and Low 2012], [Molzahn et al. 2013], ...
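The lifting step can be checked numerically. The following is a minimal sketch, assuming a randomly generated voltage vector (the size, seed, and bus indices are arbitrary choices, not data from the talk); it verifies the identity $|v_i - v_j|^2 = X_{ii} + X_{jj} - 2X_{ij}$ and that $X = \operatorname{Re}(vv^H)$ is positive semidefinite with rank at most two.

```python
import numpy as np

# Hypothetical data: a random complex voltage vector for a 5-bus network.
rng = np.random.default_rng(0)
n = 5
v = rng.normal(size=n) + 1j * rng.normal(size=n)

# Lifted variable X = Re(v v^H), so X[i, j] = Re(v_i * conj(v_j)).
X = np.real(np.outer(v, v.conj()))

# The quadratic term |v_i - v_j|^2 becomes linear in X.
i, j = 1, 3
lhs = abs(v[i] - v[j]) ** 2
rhs = X[i, i] + X[j, j] - 2 * X[i, j]

# X = Re(v) Re(v)^T + Im(v) Im(v)^T, hence PSD with rank at most two.
rank_X = np.linalg.matrix_rank(X)
min_eig = np.linalg.eigvalsh(X).min()
```

Here `lhs` and `rhs` agree up to rounding, which is why the flow balance equations become linear once $X$ replaces the products of voltages.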

Sparsity graph
\[
A = \begin{bmatrix}
A_{11} & A_{21} & A_{31} & 0 & A_{51} \\
A_{21} & A_{22} & 0 & A_{42} & 0 \\
A_{31} & 0 & A_{33} & 0 & A_{53} \\
0 & A_{42} & 0 & A_{44} & A_{54} \\
A_{51} & 0 & A_{53} & A_{54} & A_{55}
\end{bmatrix}
\]
[Figure: sparsity graph of $A$ with vertices 1, 2, 3, 4, 5]
• sparsity pattern of symmetric $n \times n$ matrix is set of 'nonzero' positions $E \subseteq \{\{i, j\} \mid i, j \in \{1, 2, \ldots, n\}\}$
• $A$ has sparsity pattern $E$ if $A_{ij} = 0$ if $i \neq j$ and $\{i, j\} \notin E$
• notation: $A \in \mathbf{S}^n_E$
• represented by undirected graph $(V, E)$ with edges $E$, vertices $V = \{1, \ldots, n\}$
• clique (maximal complete subgraph) forms maximal 'dense' principal submatrix
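As an illustration of these definitions, the sketch below extracts the edge set from a symmetric matrix with the pattern of the example and lists its maximal cliques by brute force. The function names and the 0-based indexing are assumptions made for this example, and the enumeration is only practical for very small $n$.

```python
from itertools import combinations

import numpy as np

def sparsity_pattern(A, tol=0.0):
    """Edge set {i, j} (i != j) of the off-diagonal nonzeros of symmetric A."""
    n = A.shape[0]
    return {frozenset((i, j)) for i in range(n) for j in range(i)
            if abs(A[i, j]) > tol}

def maximal_cliques(n, E):
    """Brute-force maximal cliques, largest first (exponential in n)."""
    cliques = []
    for size in range(n, 0, -1):
        for c in combinations(range(n), size):
            complete = all(frozenset(p) in E for p in combinations(c, 2))
            if complete and not any(set(c) <= set(d) for d in cliques):
                cliques.append(c)
    return cliques

# Same pattern as the 5 x 5 example, with vertices renumbered 0..4.
A = np.array([[4, 1, 1, 0, 1],
              [1, 4, 0, 1, 0],
              [1, 0, 4, 0, 1],
              [0, 1, 0, 4, 1],
              [1, 0, 1, 1, 4]], dtype=float)
E = sparsity_pattern(A)
cliques = maximal_cliques(5, E)
```

In this example the only dense principal submatrix of order three is the one indexed by vertices 1, 3, 5 of the slide (0, 2, 4 in the code); the remaining maximal cliques are single edges.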

Sparse matrix cones
we define two convex cones in $\mathbf{S}^n_E$ (symmetric $n \times n$ matrices with pattern $E$)
• positive semidefinite matrices
\[ \mathbf{S}^n_+ \cap \mathbf{S}^n_E = \{ X \in \mathbf{S}^n_E \mid X \succeq 0 \} \]
• matrices with a positive semidefinite completion
\[ \Pi_E(\mathbf{S}^n_+) = \{ \Pi_E(X) \mid X \succeq 0 \} \]
$\Pi_E$ is projection on $\mathbf{S}^n_E$
Properties
• two cones are convex
• closed, pointed, with nonempty interior (relative to $\mathbf{S}^n_E$)
• form a pair of dual cones (for the trace inner product)
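One direction of the duality can be illustrated numerically: for $S \in \mathbf{S}^n_+ \cap \mathbf{S}^n_E$ and $X = \Pi_E(Y)$ with $Y \succeq 0$, the entries of $Y$ outside the pattern are multiplied by zeros of $S$, so $\operatorname{tr}(XS) = \operatorname{tr}(YS) \geq 0$. The sketch below assumes the 5-vertex pattern of the earlier example (0-based) and random data; `project` is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
edges = [(0, 1), (0, 2), (0, 4), (1, 3), (2, 4), (3, 4)]

# Boolean mask of the pattern E (diagonal always included).
mask = np.eye(n, dtype=bool)
for i, j in edges:
    mask[i, j] = mask[j, i] = True

def project(Y):
    """Projection Pi_E: zero out the entries outside the pattern."""
    return np.where(mask, Y, 0.0)

# X in Pi_E(S^n_+): projection of a dense PSD matrix Y = G G^T.
G = rng.normal(size=(n, n))
Y = G @ G.T
X = project(Y)

# S in S^n_+ with pattern E: symmetric, sparse, strictly diagonally dominant.
R = project(rng.normal(size=(n, n)))
R = (R + R.T) / 2
np.fill_diagonal(R, 0.0)
S = R + np.diag(np.abs(R).sum(axis=1) + 0.1)

inner_sparse = np.trace(X @ S)   # tr(X S)
inner_dense = np.trace(Y @ S)    # tr(Y S), equal because S vanishes outside E
```

This only checks that the inner product is nonnegative for one sample pair; it does not prove that the two cones are exactly dual.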

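Sparse semidefinite program
Standard form SDP and dual (variables $X, S \in \mathbf{S}^n$, $y \in \mathbf{R}^m$)
\[
\begin{array}{ll}
\text{minimize} & \operatorname{tr}(CX) \\
\text{subject to} & \operatorname{tr}(A_i X) = b_i, \quad i = 1, \ldots, m \\
& X \succeq 0
\end{array}
\qquad
\begin{array}{ll}
\text{maximize} & b^T y \\
\text{subject to} & \sum_{i=1}^m y_i A_i + S = C \\
& S \succeq 0
\end{array}
\]
Equivalent pair of conic linear programs (variables $X, S \in \mathbf{S}^n_E$, $y \in \mathbf{R}^m$)
\[
\begin{array}{ll}
\text{minimize} & \operatorname{tr}(CX) \\
\text{subject to} & \operatorname{tr}(A_i X) = b_i, \quad i = 1, \ldots, m \\
& X \in K
\end{array}
\qquad
\begin{array}{ll}
\text{maximize} & b^T y \\
\text{subject to} & \sum_{i=1}^m y_i A_i + S = C \\
& S \in K^*
\end{array}
\]
• $E$ is union of sparsity patterns of $C, A_1, \ldots, A_m$
• $K = \Pi_E(\mathbf{S}^n_+)$ is cone of p.s.d. completable matrices with sparsity pattern $E$
• $K^* = \mathbf{S}^n_+ \cap \mathbf{S}^n_E$ is cone of positive semidefinite matrices with sparsity pattern $E$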

Outline
1. Sparse semidefinite programs
2. Chordal graphs
3. Decomposition of sparse matrix cones
4. Multifrontal algorithms for logarithmic barrier functions
5. Minimum rank positive semidefinite completion

Chordal graph
• undirected graph $G = (V, E)$ with vertex set $V$, edge set $E \subseteq \{\{v, w\} \mid v, w \in V\}$
• a chord of a cycle is an edge between non-consecutive vertices
• $G$ is chordal if every cycle of length greater than three has a chord
[Figure: two graphs on vertices a, b, c, d, e, f; one is not chordal, the other is chordal]
also known as triangulated, decomposable, rigid circuit graph, ...
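The definition can be tested on small graphs with a simple elimination procedure. The sketch below relies on a standard fact that is not stated on the slide: a graph is chordal if and only if its vertices can be removed one at a time so that each removed vertex is simplicial (its remaining neighbors are pairwise adjacent). The function name and the 0-based vertex labels are assumptions of this example, and the loop is not tuned for large graphs.

```python
def is_chordal(n, edges):
    """Return True if the graph on vertices 0..n-1 is chordal.

    Repeatedly removes a simplicial vertex; if none exists at some
    stage, the remaining graph contains a chordless cycle.
    """
    adj = {v: set() for v in range(n)}
    for u, w in edges:
        adj[u].add(w)
        adj[w].add(u)
    remaining = set(range(n))
    while remaining:
        for v in sorted(remaining):
            nbrs = adj[v] & remaining
            # v is simplicial if its remaining neighbors form a clique.
            if all(w in adj[u] for u in nbrs for w in nbrs if u < w):
                remaining.remove(v)
                break
        else:
            return False  # no simplicial vertex left
    return True

four_cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
with_chord = four_cycle + [(0, 2)]
# Pattern of the 5 x 5 sparsity graph example, renumbered 0..4.
example_5 = [(0, 1), (0, 2), (0, 4), (1, 3), (2, 4), (3, 4)]

results = (is_chordal(4, four_cycle),
           is_chordal(4, with_chord),
           is_chordal(5, example_5))
```

The 4-cycle fails the test, adding one chord repairs it, and the 5 x 5 example pattern is not chordal because the cycle through vertices 1, 2, 4, 5 of the slide has no chord.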

History
chordal graphs have been studied in many disciplines since the 1960s
• combinatorial optimization (a class of perfect graphs)
• linear algebra (sparse factorization, completion problems)
• database theory
• machine learning (graphical models, probabilistic networks)
• nonlinear optimization (partial separability)
first used in semidefinite optimization by Fujisawa, Kojima, Nakata (1997)

Chordal sparsity and Cholesky factorization
Cholesky factorization of positive definite $A \in \mathbf{S}^n_E$:
\[ PAP^T = LDL^T \]
$P$ a permutation, $L$ unit lower triangular, $D$ positive diagonal
• if $E$ is chordal, then there exists a permutation for which
\[ P^T (L + L^T) P \in \mathbf{S}^n_E \]
$A$ has a 'zero fill' Cholesky factorization
• if $E$ is not chordal, then for every $P$ there exist positive definite $A \in \mathbf{S}^n_E$ for which
\[ P^T (L + L^T) P \notin \mathbf{S}^n_E \]
[Rose 1970]
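The effect of the permutation can be seen on an arrow matrix, whose sparsity graph is a star and therefore chordal. The sketch below is an illustration under stated assumptions: the matrix, its size, and the helper `fill_in` are made up for this example, and it uses NumPy's dense $LL^T$ Cholesky factor, which has the same sparsity pattern as the unit triangular $L$ in $LDL^T$.

```python
import numpy as np

def fill_in(A, perm, tol=1e-12):
    """Count nonzeros of the Cholesky factor of P A P^T that are zero in P A P^T."""
    B = A[np.ix_(perm, perm)]          # symmetric permutation of A
    L = np.linalg.cholesky(B)          # B = L L^T, L lower triangular
    pattern_L = np.abs(L) > tol
    pattern_B = np.tril(np.abs(B) > tol)
    return int(np.sum(pattern_L & ~pattern_B))

# Arrow matrix: dense first row and column plus a diagonal (positive definite
# because it is strictly diagonally dominant with positive diagonal).
n = 6
A = n * np.eye(n)
A[0, 1:] = 1.0
A[1:, 0] = 1.0

fill_natural = fill_in(A, list(range(n)))         # eliminate the center first
fill_reversed = fill_in(A, list(range(n))[::-1])  # eliminate the center last
```

Eliminating the center vertex first fills in the whole trailing block, while the reversed ordering is a zero-fill ordering for this pattern.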

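Examples
Simple patterns
[Figure: examples of simple sparsity patterns]
Sparsity pattern of a Cholesky factor
[Figure: sparsity pattern of a Cholesky factor; legend: edges of non-chordal sparsity pattern, fill entries in Cholesky factorization]
• a chordal extension of non-chordal pattern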

Supernodal elimination tree (clique tree)
[Figure: chordal sparsity pattern of order 17 and its supernodal elimination tree, with root supernode 13, 14, 15, 16, 17]
• vertices of tree are cliques of chordal sparsity graph
• top row of each block is intersection of clique with parent clique
• bottom rows are (maximal) supernodes; form a partition of $\{1, 2, \ldots, n\}$
• for each $v$, cliques that contain $v$ form a subtree of elimination tree

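Positive semidefinite matrices with chordal sparsity pattern
$S \in \mathbf{S}^n_E$ is positive semidefinite if and only if it can be expressed as
\[ S = \sum_{\text{cliques } \gamma_i} P_{\gamma_i}^T H_i P_{\gamma_i} \quad \text{with } H_i \succeq 0 \]
(for an index set $\beta$, $P_\beta$ is 0-1 matrix of size $|\beta| \times n$ with $P_\beta x = x_\beta$ for all $x$)
[Figure: a matrix $S \succeq 0$ with three cliques written as the sum of $P_{\gamma_1}^T H_1 P_{\gamma_1} \succeq 0$, $P_{\gamma_2}^T H_2 P_{\gamma_2} \succeq 0$, and $P_{\gamma_3}^T H_3 P_{\gamma_3} \succeq 0$]
[Griewank and Toint 1984] [Agler, Helton, McCullough, Rodman 1988]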

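Decomposition from Cholesky factorization
• example with two cliques:
[Figure: a positive semidefinite matrix with two overlapping cliques written as the sum of two terms, $H_1$ and $H_2$, each supported on one clique]
$H_1$ and $H_2$ follow by combining columns in Cholesky factorization
[Figure: the Cholesky factorization written as a sum of two groups of columns]
• readily computed from update matrices in multifrontal Cholesky factorization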
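The idea of combining Cholesky columns can be sketched for a small tridiagonal matrix, whose pattern is chordal with cliques $\{k, k+1\}$. This is a minimal dense illustration, not the multifrontal procedure mentioned above: it assumes the natural ordering gives a zero-fill factorization, uses the $LL^T$ factor from NumPy, and the function name and the rule "assign each column to the first clique that contains its support" are choices made for this example.

```python
import numpy as np

def clique_decomposition(S, cliques, tol=1e-12):
    """Split positive definite S into PSD terms H_i, one per clique.

    Writes S = sum_k l_k l_k^T (columns of the Cholesky factor) and adds
    each rank-one term to a clique that contains the support of l_k.
    """
    n = S.shape[0]
    L = np.linalg.cholesky(S)
    H = [np.zeros((len(c), len(c))) for c in cliques]
    for k in range(n):
        support = set(np.flatnonzero(np.abs(L[:, k]) > tol).tolist())
        for i, c in enumerate(cliques):
            if support <= set(c):
                col = L[c, k]              # column restricted to the clique
                H[i] += np.outer(col, col)
                break
        else:
            raise ValueError("column support not contained in any clique")
    return H

# Tridiagonal example with cliques {0,1}, {1,2}, {2,3} (0-based).
S = np.array([[2., -1., 0., 0.],
              [-1., 2., -1., 0.],
              [0., -1., 2., -1.],
              [0., 0., -1., 2.]])
cliques = [[0, 1], [1, 2], [2, 3]]
H = clique_decomposition(S, cliques)

# Reassemble sum_i P_i^T H_i P_i and check it reproduces S.
total = np.zeros_like(S)
for c, Hi in zip(cliques, H):
    total[np.ix_(c, c)] += Hi
```

Each `H[i]` is a sum of outer products and is therefore positive semidefinite by construction; the reassembled `total` matches `S` up to rounding.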

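PSD completable matrices with chordal sparsity
$X \in \mathbf{S}^n_E$ has a positive semidefinite completion if and only if
\[ X_{\gamma_i \gamma_i} \succeq 0 \quad \text{for all cliques } \gamma_i \]
follows from duality and clique decomposition of positive semidefinite cone
Example (three cliques $\gamma_1$, $\gamma_2$, $\gamma_3$)
[Figure: PSD completable $X$ with the three clique submatrices $X_{\gamma_1 \gamma_1} \succeq 0$, $X_{\gamma_2 \gamma_2} \succeq 0$, $X_{\gamma_3 \gamma_3} \succeq 0$ highlighted]
[Grone, Johnson, Sá, Wolkowicz, 1984]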
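For the smallest chordal example, a 3 x 3 matrix with cliques $\{1, 2\}$ and $\{2, 3\}$ and unspecified entry $X_{13}$, the clique test and one explicit completion can be sketched as follows. The helper names are hypothetical, and the formula $X_{13} = X_{12} X_{23} / X_{22}$ is one particular choice of completion (it makes the Schur complement with respect to $X_{22}$ diagonal), not the only one.

```python
import numpy as np

def is_psd(M, tol=1e-10):
    """True if the symmetric matrix M has no eigenvalue below -tol."""
    return bool(np.linalg.eigvalsh(M).min() >= -tol)

def complete_chain3(x11, x12, x22, x23, x33):
    """PSD completion of a 3 x 3 matrix with unspecified (1,3) entry.

    Returns None if a clique submatrix is not PSD, in which case no
    PSD completion exists.
    """
    blocks = [np.array([[x11, x12], [x12, x22]]),
              np.array([[x22, x23], [x23, x33]])]
    if not all(is_psd(B) for B in blocks):
        return None
    # If x22 == 0, PSD blocks force x12 = x23 = 0 and x13 = 0 works.
    x13 = x12 * x23 / x22 if x22 > 0 else 0.0
    return np.array([[x11, x12, x13],
                     [x12, x22, x23],
                     [x13, x23, x33]])

X_ok = complete_chain3(1.0, 0.9, 1.0, 0.9, 1.0)    # both clique blocks PSD
X_bad = complete_chain3(1.0, 1.5, 1.0, 0.9, 1.0)   # first block indefinite
```

In the first call both clique submatrices are positive semidefinite and the filled-in entry is 0.81; in the second call the first clique submatrix is indefinite, so the function reports that no completion exists.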

Sparse semidefinite optimization
\[
\begin{array}{ll}
\text{minimize} & \operatorname{tr}(CX) \\
\text{subject to} & \operatorname{tr}(A_i X) = b_i, \quad i = 1, \ldots, m \\
& X \in K
\end{array}
\]
• $E$ is union of sparsity patterns of $C, A_1, \ldots, A_m$
• $K = \Pi_E(\mathbf{S}^n_+)$ is cone of p.s.d. completable matrices
• without loss of generality, can assume $E$ is chordal
Decomposition algorithms
• cone $K$ is intersection of simple cones ($X_{\gamma_i \gamma_i} \succeq 0$ for all cliques $\gamma_i$)
• first used in interior-point methods [Fukuda et al. 2000] [Nakata et al. 2003]
• first order, splitting, and dual decomposition methods [Lu, Nemirovski, Monteiro 2007] [Lam, Zhang, Tse 2011] [Sun et al. 2014, 2015] [Pakazad et al. 2017] [Zheng, Fantuzzi, Papachristodoulou, Goulart, Wynn 2017], ...
