  1. CS672: Approximation Algorithms, Spring 2020. Intro to Semidefinite Programming. Instructor: Shaddin Dughmi

  2. Outline: (1) Basics of PSD Matrices, (2) Semidefinite Programming, (3) Max Cut.

  3. Symmetric Matrices. A matrix A ∈ R^{n×n} is symmetric if and only if it is square and A_ij = A_ji for all i, j. We denote the set of n × n symmetric matrices by S^n.

  4. Fact: A matrix A ∈ R^{n×n} is symmetric if and only if it is orthogonally diagonalizable.

  5. That is, A = QDQ⊺, where Q is an orthogonal matrix and D = diag(λ_1, ..., λ_n). The columns of Q are the (normalized) eigenvectors of A, with corresponding eigenvalues λ_1, ..., λ_n. Equivalently: as a linear operator, A scales the space along the orthonormal basis Q, and the scaling factor λ_i along direction q_i may be negative, positive, or 0. Basics of PSD Matrices 1/13
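The spectral decomposition above is easy to verify numerically. A small NumPy check (an illustration on a random matrix, not part of the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                      # symmetrize: A is in S^4

# eigh is specialized to symmetric matrices: it returns eigenvalues in
# ascending order and orthonormal eigenvectors as the columns of Q
lam, Q = np.linalg.eigh(A)

assert np.allclose(Q @ Q.T, np.eye(4))          # Q is orthogonal
assert np.allclose(Q @ np.diag(lam) @ Q.T, A)   # A = Q D Q^T
```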

  6. Positive Semi-Definite Matrices. A matrix A ∈ R^{n×n} is positive semi-definite if it is symmetric and, moreover, all its eigenvalues are nonnegative. We denote the cone of n × n positive semi-definite matrices by S^n_+, and we use A ⪰ 0 as shorthand for A ∈ S^n_+.

  7. Equivalently, A = QDQ⊺, where Q is an orthogonal matrix and D = diag(λ_1, ..., λ_n) with each λ_i ≥ 0. As a linear operator, A performs nonnegative scaling along an orthonormal basis Q.

  8. Note: positive definite, negative semi-definite, and negative definite matrices are defined similarly. Basics of PSD Matrices 2/13
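The eigenvalue definition translates directly into code. A sketch of a membership test for S^n_+ (the helper name `is_psd` and the tolerance parameter are illustrative choices, not from the lecture; the tolerance absorbs floating-point round-off):

```python
import numpy as np

def is_psd(A, tol=1e-10):
    """True iff A is symmetric with all eigenvalues >= -tol."""
    A = np.asarray(A)
    if A.shape[0] != A.shape[1] or not np.allclose(A, A.T):
        return False
    # eigvalsh computes eigenvalues of a symmetric matrix, ascending
    return bool(np.linalg.eigvalsh(A).min() >= -tol)

assert is_psd([[2.0, 1.0], [1.0, 2.0]])       # eigenvalues 1 and 3
assert not is_psd([[1.0, 2.0], [2.0, 1.0]])   # eigenvalues -1 and 3
```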

  9. Geometric Intuition for PSD Matrices. For A ⪰ 0, let q_1, ..., q_n be the orthonormal eigenbasis for A, and let λ_1, ..., λ_n ≥ 0 be the corresponding eigenvalues. The linear operator x → Ax scales the q_i component of x by λ_i. Applied to every x in the unit ball, the image of A is an ellipsoid centered at the origin with principal directions q_1, ..., q_n and corresponding diameters 2λ_1, ..., 2λ_n. When A is positive definite (i.e., every λ_i > 0), and therefore invertible, the ellipsoid is the set {y : y⊺(AA⊺)^{−1} y ≤ 1}. Basics of PSD Matrices 3/13
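This picture can be sanity-checked numerically in 2D (the matrix below is a made-up example, not from the lecture): each eigenvector is stretched by its eigenvalue, and every unit vector lands inside the ellipse.

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 3.0]])   # PSD example: eigenvalues 2 and 4
lam, Q = np.linalg.eigh(A)

# Each eigenvector is scaled by exactly its eigenvalue
for i in range(2):
    assert np.allclose(A @ Q[:, i], lam[i] * Q[:, i])

# Every point of the unit circle maps to a point whose norm lies
# between the smallest and largest eigenvalue (the ellipse's semi-axes)
theta = np.linspace(0, 2 * np.pi, 100)
circle = np.vstack([np.cos(theta), np.sin(theta)])
norms = np.linalg.norm(A @ circle, axis=0)
assert norms.min() >= lam.min() - 1e-9
assert norms.max() <= lam.max() + 1e-9
```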

  10. Useful Properties of PSD Matrices. If A ⪰ 0, then:
  - x⊺Ax ≥ 0 for all x.
  - A has a positive semi-definite square root A^{1/2} = Q diag(√λ_1, ..., √λ_n) Q⊺.
  - A = B⊺B for some matrix B. Interpretation: PSD matrices encode the “pairwise similarity” relationships of a family of vectors; A_ij is the dot product of the i-th and j-th columns of B. Interpretation: the quadratic form x⊺Ax is the squared length of a linear transformation of x, namely ||Bx||₂².
  - The quadratic function x⊺Ax is convex.
  - A can be expressed as a sum of vector outer products: A = Σ_{i=1}^{n} v_i v_i⊺, where v_i = √λ_i q_i.

  11. As it turns out, each of the above is also sufficient for A ⪰ 0 (assuming A is symmetric). Basics of PSD Matrices 4/13
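Each of these properties can be checked numerically from the eigendecomposition. A short NumPy demonstration on a made-up 2×2 example (not from the slides):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # PSD: eigenvalues 1 and 3
lam, Q = np.linalg.eigh(A)

# PSD square root: A^{1/2} = Q diag(sqrt(lambda)) Q^T
root = Q @ np.diag(np.sqrt(lam)) @ Q.T
assert np.allclose(root @ root, A)

# Factorization A = B^T B, taking B = diag(sqrt(lambda)) Q^T
B = np.diag(np.sqrt(lam)) @ Q.T
assert np.allclose(B.T @ B, A)

# Sum of outer products: A = sum_i v_i v_i^T with v_i = sqrt(lambda_i) q_i
outer_sum = sum(lam[i] * np.outer(Q[:, i], Q[:, i]) for i in range(2))
assert np.allclose(outer_sum, A)

# Quadratic form: x^T A x = ||Bx||^2 >= 0
x = np.random.default_rng(1).standard_normal(2)
assert np.isclose(x @ A @ x, np.linalg.norm(B @ x) ** 2)
assert x @ A @ x >= 0
```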

  12. Properties of PSD Matrices Relevant for Computation.
  - The set of PSD matrices is convex. Follows from the characterization: x⊺Ax ≥ 0 for all x.
  - The set of PSD matrices admits an efficient separation oracle: given A that is not PSD, find an eigenvector v with negative eigenvalue, so that v⊺Av < 0.
  - A PSD matrix A ∈ R^{n×n} implicitly encodes the “pairwise similarities” of a family of vectors b_1, ..., b_n ∈ R^n. Follows from the characterization A = B⊺B for some B, with A_ij = ⟨b_i, b_j⟩. Moreover, one can convert between A and B efficiently: B to A is a matrix multiplication, and for A to B, B can be expressed in terms of the eigenvectors/eigenvalues of A, which can be computed to arbitrary precision via power-iteration methods. Alternatively: Cholesky decomposition, SVD, etc. Basics of PSD Matrices 5/13
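The separation oracle in the second bullet is a one-liner given an eigensolver: the eigenvector of the most negative eigenvalue certifies non-membership. A sketch (the function name `psd_separation_oracle` is a hypothetical helper, not from the lecture):

```python
import numpy as np

def psd_separation_oracle(A):
    """Return None if A is PSD; otherwise a unit vector v with
    v^T A v < 0, i.e. a violated inequality v^T X v >= 0."""
    lam, Q = np.linalg.eigh(A)      # eigenvalues in ascending order
    if lam[0] >= 0:
        return None
    return Q[:, 0]                  # eigenvector of the most negative eigenvalue

A_bad = np.array([[1.0, 2.0], [2.0, 1.0]])   # eigenvalues -1 and 3
v = psd_separation_oracle(A_bad)
assert v is not None and v @ A_bad @ v < 0
assert psd_separation_oracle(np.eye(2)) is None
```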

  13. Outline: (1) Basics of PSD Matrices, (2) Semidefinite Programming, (3) Max Cut.

  14. Convex Optimization. A convex optimization problem has the form: min (or max) f(x) subject to x ∈ X, for a convex set X. This generalizes LP:
  - The feasible set X is convex: αx + (1 − α)y ∈ X for all x, y ∈ X and α ∈ [0, 1].
  - The objective f is convex in case of minimization: f(αx + (1 − α)y) ≤ αf(x) + (1 − α)f(y) for all x, y ∈ X and α ∈ [0, 1].
  - The objective f is concave in case of maximization.

  15. Convex optimization problems are solvable efficiently (i.e., in polynomial time) to arbitrary precision under mild conditions: a separation oracle for X, and a first-order oracle for evaluating f(x) and ∇f(x). For more detail, take CSCI 675! Semidefinite Programming 6/13

  16. Semidefinite Programs. These are optimization problems where the feasible set is the PSD cone, possibly intersected with linear constraints. They generalize LP, and are a special case of convex optimization:
  maximize c⊺x
  subject to Ax ≤ b
  x_1 F_1 + x_2 F_2 + ... + x_n F_n + G ⪰ 0
  where F_1, ..., F_n, G, and A are given matrices, and c, b are given vectors.

  17. Examples: fitting a distribution, say a Gaussian, to observed data, where the variable is a positive semi-definite covariance matrix; or as a relaxation of combinatorial problems that encode pairwise relationships, e.g., finding the maximum cut of a graph.

  18. Fact: SDPs can be solved in polynomial time to arbitrary precision, since PSD constraints admit a polynomial-time separation oracle. Semidefinite Programming 7/13

  19. Outline: (1) Basics of PSD Matrices, (2) Semidefinite Programming, (3) Max Cut.

  20. The Max Cut Problem. Given an undirected graph G = (V, E), find a partition of V into (S, V \ S) maximizing the number of edges with exactly one endpoint in S:
  maximize Σ_{(i,j)∈E} (1 − x_i x_j)/2
  subject to x_i ∈ {−1, 1} for all i ∈ V.

  21. Instead of requiring each x_i to lie in {−1, 1} (the one-dimensional unit sphere), we relax and permit it to lie on the unit sphere in R^n, where n = |V|. Vector program relaxation:
  maximize Σ_{(i,j)∈E} (1 − v_i · v_j)/2
  subject to ||v_i||₂ = 1 and v_i ∈ R^n for all i ∈ V. Max Cut 8/13
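The integer program's objective is easy to verify on a small instance: (1 − x_i x_j)/2 is 1 exactly when an edge is cut. A brute-force check on a made-up example graph (the 4-cycle, whose max cut is 4):

```python
from itertools import product

# Example graph: the 4-cycle on vertices {0, 1, 2, 3}
E = [(0, 1), (1, 2), (2, 3), (3, 0)]

def cut_value(x):
    # (1 - x_i x_j)/2 contributes 1 for each cut edge, 0 otherwise
    return sum((1 - x[i] * x[j]) / 2 for i, j in E)

# Enumerate all +/-1 assignments; the 4-cycle is bipartite, so the
# alternating partition cuts every edge
best = max(cut_value(x) for x in product([-1, 1], repeat=4))
assert best == 4
assert cut_value((1, -1, 1, -1)) == 4
```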

  22. SDP Relaxation. Recall: a symmetric n × n matrix Y is PSD iff Y = V⊺V for some n × n matrix V. Equivalently, PSD matrices encode the pairwise dot products of the columns of V; when the diagonal entries of Y are all 1, the columns of V have unit length. Recall also: Y and V can be recovered from each other efficiently. Max Cut 9/13
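Recovering the unit vectors V from such a Y takes only a few lines via the eigendecomposition (the matrix Y below is a made-up example Gram matrix, standing in for an SDP solution):

```python
import numpy as np

# Example PSD matrix with unit diagonal (a Gram matrix of unit vectors)
Y = np.array([[ 1.0, 0.5, -0.5],
              [ 0.5, 1.0,  0.0],
              [-0.5, 0.0,  1.0]])

lam, Q = np.linalg.eigh(Y)
lam = np.clip(lam, 0.0, None)            # guard against tiny round-off
V = np.diag(np.sqrt(lam)) @ Q.T          # columns v_i satisfy v_i . v_j = Y_ij

assert np.allclose(V.T @ V, Y)                       # dot products match Y
assert np.allclose(np.linalg.norm(V, axis=0), 1.0)   # unit-length columns
```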
