
SI231 Matrix Computations Lecture 6: Positive Semidefinite Matrices



  1. SI231 Matrix Computations Lecture 6: Positive Semidefinite Matrices Ziping Zhao Fall Term 2020–2021 School of Information Science and Technology ShanghaiTech University, Shanghai, China

  2. Lecture 6: Positive Semidefinite Matrices
     • positive semidefinite matrices
     • application: subspace method for super-resolution spectral analysis
     • application: Euclidean distance matrices
     • matrix inequalities

  3. Highlights
     • a matrix $A \in \mathbb{S}^n$ is said to be positive semidefinite (PSD) if $x^T A x \ge 0$ for all $x \in \mathbb{R}^n$, and positive definite (PD) if $x^T A x > 0$ for all $x \in \mathbb{R}^n$ with $x \ne 0$
     • a matrix $A \in \mathbb{S}^n$ is PSD (resp. PD)
       – if and only if its eigenvalues are all non-negative (resp. positive);
       – if and only if it can be factored as $A = B^T B$ for some $B \in \mathbb{R}^{m \times n}$ (with $B$ of full column rank in the PD case)
     • in this lecture we deal with real symmetric matrices; the Hermitian case follows along the same lines

  4. Quadratic Form
     Let $A \in \mathbb{S}^n$. For $x \in \mathbb{R}^n$, the matrix product $x^T A x$ is called a quadratic form.
     • some basic facts (try to verify):
       – $x^T A x = \sum_{i=1}^{n} \sum_{j=1}^{n} x_i x_j a_{ij} = \sum_{i=1}^{n} a_{ii} x_i^2 + \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} 2 a_{ij} x_i x_j$
       – for general $A \in \mathbb{R}^{n \times n}$, $x^T A x = \sum_{i=1}^{n} a_{ii} x_i^2 + \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} (a_{ij} + a_{ji}) x_i x_j$, so there may exist $A_1 \ne A_2$ such that $x^T A_1 x = x^T A_2 x$
         ∗ it suffices to consider the unique symmetric matrix associated with a general $A \in \mathbb{R}^{n \times n}$, since $x^T A x = x^T \left[ \tfrac{1}{2} (A + A^T) \right] x$
       – complex case:
         ∗ the quadratic form is defined as $x^H A x$, where $x \in \mathbb{C}^n$
         ∗ for $A \in \mathbb{H}^n$, $x^H A x$ is real for any $x \in \mathbb{C}^n$
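
The symmetric-part identity can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the random matrix, seed, and variable names are illustrative and not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))   # general, non-symmetric matrix (illustrative)
A_sym = 0.5 * (A + A.T)           # its symmetric part
x = rng.standard_normal(4)

q_general = x @ A @ x             # x^T A x
q_sym = x @ A_sym @ x             # x^T [(A + A^T)/2] x
print(np.isclose(q_general, q_sym))
```

Both quadratic forms agree up to rounding, which is why only symmetric matrices need to be considered.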

  5. Positive Semidefinite Matrices
     A matrix $A \in \mathbb{S}^n$ is said to be
     • positive semidefinite (PSD) if $x^T A x \ge 0$ for all $x \in \mathbb{R}^n$
     • positive definite (PD) if $x^T A x > 0$ for all $x \in \mathbb{R}^n$ with $x \ne 0$
     • indefinite if neither $A$ nor $-A$ is PSD
     Notation:
     • $A \succeq 0$ means that $A$ is PSD
     • $A \succ 0$ means that $A$ is PD
     • $A \nsucceq 0$ means that $A$ is indefinite
     Remarks:
     • if $A$ is PD, then it is also PSD
     • the concepts of negative semidefinite and negative definite may be defined by reversing the inequalities or, equivalently, by saying that $-A$ is PSD or PD, respectively

  6. Example: Covariance Matrices
     • let $y_0, y_1, \ldots, y_{T-1} \in \mathbb{R}^n$ be a sequence of multi-dimensional data samples
       – examples: patches in image processing, multi-channel signals in signal processing, history of returns of assets in finance [Brodie-Daubechies-et al.'09], ...
     • sample mean: $\hat{\mu}_y = \frac{1}{T} \sum_{t=0}^{T-1} y_t$
     • sample covariance: $\hat{C}_y = \frac{1}{T} \sum_{t=0}^{T-1} (y_t - \hat{\mu}_y)(y_t - \hat{\mu}_y)^T$
     • a sample covariance is PSD: $x^T \hat{C}_y x = \frac{1}{T} \sum_{t=0}^{T-1} |(y_t - \hat{\mu}_y)^T x|^2 \ge 0$
     • the (statistical) covariance of $y_t$ is also PSD
       – to put this into context, assume that $y_t$ is a wide-sense stationary random process
       – the covariance, defined as $C_y = \mathrm{E}[(y_t - \mu_y)(y_t - \mu_y)^T]$ where $\mu_y = \mathrm{E}[y_t]$, can be shown to be PSD
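
A minimal numerical sketch of the sample-covariance argument, assuming NumPy and synthetic Gaussian data (the sizes `T`, `n` and the seed are arbitrary choices, not from the lecture). It forms the sample covariance with the 1/T normalization and compares the quadratic form with the averaged squared projections.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n = 200, 5
Y = rng.standard_normal((T, n))      # row t holds the sample y_t (synthetic data)

mu_hat = Y.mean(axis=0)              # sample mean
D = Y - mu_hat                       # centered samples
C_hat = (D.T @ D) / T                # sample covariance with 1/T normalization

x = rng.standard_normal(n)
quad = x @ C_hat @ x                 # x^T C_hat x
sum_sq = np.mean((D @ x) ** 2)       # (1/T) * sum_t |(y_t - mu_hat)^T x|^2
min_eig = np.linalg.eigvalsh(C_hat).min()
print(np.isclose(quad, sum_sq), min_eig >= -1e-10)
```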

  7. Example: Hessian
     • let $f : \mathbb{R}^n \to \mathbb{R}$ be a twice differentiable function
     • the Hessian of $f$, denoted by $\nabla^2 f(x) \in \mathbb{S}^n$, is a matrix whose $(i, j)$th entry is given by $\left[ \nabla^2 f(x) \right]_{i,j} = \frac{\partial^2 f}{\partial x_i \partial x_j}$
     • Fact: $f$ is convex if and only if $\nabla^2 f(x) \succeq 0$ for all $x$ in the problem domain
     • example: consider the quadratic function $f(x) = \frac{1}{2} x^T R x + q^T x + c$. It can be verified that $\nabla^2 f(x) = R$. Thus, $f$ is convex if and only if $R \succeq 0$
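
The claim that the Hessian of the quadratic function equals R can be checked with finite differences. This is only a sketch, assuming NumPy; the values of `R`, `q`, `c`, the evaluation point, and the helper `numerical_hessian` are illustrative choices introduced here.

```python
import numpy as np

R = np.array([[2.0, 1.0], [1.0, 3.0]])   # symmetric, illustrative values
q = np.array([1.0, -1.0])
c = 0.5

def f(x):
    # f(x) = 1/2 x^T R x + q^T x + c
    return 0.5 * x @ R @ x + q @ x + c

def numerical_hessian(fun, x, h=1e-4):
    # Central second differences; exact for quadratics up to rounding error.
    n = x.size
    E = np.eye(n)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            H[i, j] = (fun(x + h * E[i] + h * E[j]) - fun(x + h * E[i] - h * E[j])
                       - fun(x - h * E[i] + h * E[j]) + fun(x - h * E[i] - h * E[j])) / (4 * h * h)
    return H

H = numerical_hessian(f, np.array([0.3, -0.7]))
is_convex = bool(np.linalg.eigvalsh(R).min() >= 0)   # convex iff R is PSD
print(np.allclose(H, R, atol=1e-5), is_convex)
```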

  8. Illustration of Quadratic Functions [Figure: surface plots of f(x) over (x1, x2) for (a) PSD A and (b) indefinite A.]

  9. PSD Matrix Inequalities
     • the notion of PSD matrices can be used to define inequalities for matrices
     • PSD matrix inequalities are frequently used in topics like semidefinite programming
     • definition:
       – $A \succeq B$ means that $A - B$ is PSD
       – $A \succ B$ means that $A - B$ is PD
       – $A \nsucceq B$ means that $A - B$ is indefinite
     • results that immediately follow from the definition: let $A, B, C \in \mathbb{S}^n$.
       – $A \succeq 0$, $\alpha \ge 0$ (resp. $A \succ 0$, $\alpha > 0$) $\Longrightarrow$ $\alpha A \succeq 0$ (resp. $\alpha A \succ 0$)
       – $A, B \succeq 0$ (resp. $A \succeq 0$, $B \succ 0$) $\Longrightarrow$ $A + B \succeq 0$ (resp. $A + B \succ 0$)
       – $A \succeq B$, $B \succeq C$ (resp. $A \succeq B$, $B \succ C$) $\Longrightarrow$ $A \succeq C$ (resp. $A \succ C$)
       – $A \nsucceq B$ does not imply $B \succeq A$

  10. PSD Matrix Inequalities
     • more results: let $A, B \in \mathbb{S}^n$.
       – $A \succeq B \Longrightarrow \lambda_k(A) \ge \lambda_k(B)$ for all $k$; the converse is not always true
       – $A \succeq I$ (resp. $A \succ I$) $\iff$ $\lambda_k(A) \ge 1$ for all $k$ (resp. $\lambda_k(A) > 1$ for all $k$)
       – $I \succeq A$ (resp. $I \succ A$) $\iff$ $\lambda_k(A) \le 1$ for all $k$ (resp. $\lambda_k(A) < 1$ for all $k$)
       – if $A, B \succ 0$, then $A \succeq B \iff B^{-1} \succeq A^{-1}$
     • some consequences of the above results:
       – for $A \succeq B \succeq 0$, $\det(A) \ge \det(B)$
       – for $A \succeq B$, $\mathrm{tr}(A) \ge \mathrm{tr}(B)$
       – for $A \succeq B \succ 0$, $\mathrm{tr}(A^{-1}) \le \mathrm{tr}(B^{-1})$
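
A few of these inequalities can be spot-checked numerically. The sketch below assumes NumPy; the helper `is_psd`, its tolerance, and the random test matrices are illustrative and not from the lecture. It builds a pair with A − B PSD, then checks the eigenvalue, trace, determinant, and inverse orderings, and finally shows one counterexample to the converse of the eigenvalue statement.

```python
import numpy as np

def is_psd(M, tol=1e-9):
    # Hypothetical helper: PSD test via the smallest eigenvalue of the symmetrized matrix.
    M = 0.5 * (M + M.T)
    return bool(np.linalg.eigvalsh(M).min() >= -tol)

rng = np.random.default_rng(2)
n = 4
G = rng.standard_normal((n, n))
B = G @ G.T + np.eye(n)            # B is PD
M = rng.standard_normal((n, 2))
A = B + M @ M.T                    # A - B = M M^T is PSD, so A ⪰ B

# eigvalsh returns eigenvalues in ascending order for both matrices
eigs_ordered = bool(np.all(np.linalg.eigvalsh(A) >= np.linalg.eigvalsh(B) - 1e-9))
trace_ordered = bool(np.trace(A) >= np.trace(B))
det_ordered = bool(np.linalg.det(A) >= np.linalg.det(B))
inverse_reversed = is_psd(np.linalg.inv(B) - np.linalg.inv(A))

# Converse fails: equal sorted eigenvalues, yet A2 - B2 is indefinite.
A2 = np.diag([2.0, 1.0])
B2 = np.diag([1.0, 2.0])
converse_fails = not is_psd(A2 - B2)
print(eigs_ordered, trace_ordered, det_ordered, inverse_reversed, converse_fails)
```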

  11. PSD Matrix Inequalities
     • the Schur complement: let
       $$X = \begin{bmatrix} A & B \\ B^T & C \end{bmatrix},$$
       where $A \in \mathbb{S}^m$, $B \in \mathbb{R}^{m \times n}$, $C \in \mathbb{S}^n$ with $C \succ 0$. Let $S = A - B C^{-1} B^T$, which is called the Schur complement of $C$.
     • We have $X \succeq 0$ (resp. $X \succ 0$) $\iff$ $S \succeq 0$ (resp. $S \succ 0$)
       – example: let $C$ be PD. By the Schur complement, $C - b b^T \succeq 0 \iff 1 - b^T C^{-1} b \ge 0$
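
The rank-one example can be verified on small matrices. This is a sketch under stated assumptions: NumPy is available, and `is_psd`, `schur_complement`, `check`, and the test vectors are hypothetical names and values chosen for illustration. The block matrix used is X = [[1, bᵀ], [b, C]], whose Schur complement with respect to C is 1 − bᵀC⁻¹b and with respect to the scalar block is C − bbᵀ.

```python
import numpy as np

def is_psd(M, tol=1e-9):
    M = 0.5 * (M + M.T)
    return bool(np.linalg.eigvalsh(M).min() >= -tol)

def schur_complement(A, B, C):
    # S = A - B C^{-1} B^T, computed with a linear solve instead of an explicit inverse
    return A - B @ np.linalg.solve(C, B.T)

def check(b, C):
    b = b.reshape(-1, 1)
    one = np.ones((1, 1))
    X = np.block([[one, b.T], [b, C]])
    S = schur_complement(one, b.T, C)          # 1 - b^T C^{-1} b as a 1x1 matrix
    return is_psd(X), is_psd(S), is_psd(C - b @ b.T)

C = np.diag([2.0, 2.0])                        # PD, illustrative
result_small_b = check(np.array([1.0, 0.5]), C)   # b^T C^{-1} b = 0.625
result_large_b = check(np.array([2.0, 0.0]), C)   # b^T C^{-1} b = 2
print(result_small_b, result_large_b)
```

In both cases the three tests agree with each other, as the equivalence predicts.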

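  12. PSD Matrices and Eigenvalues
     Theorem 5.1. Let $A \in \mathbb{S}^n$, and let $\lambda_1, \ldots, \lambda_n$ be the eigenvalues of $A$. We have
     1. $A \succeq 0 \iff \lambda_i \ge 0$ for $i = 1, \ldots, n$
     2. $A \succ 0 \iff \lambda_i > 0$ for $i = 1, \ldots, n$
     • proof: let $A = V \Lambda V^T$ be the eigendecomposition of $A$. Then
       $$\begin{aligned}
       A \succeq 0 &\iff x^T V \Lambda V^T x \ge 0 \ \text{for all } x \in \mathbb{R}^n \\
       &\iff z^T \Lambda z \ge 0 \ \text{for all } z \in \mathcal{R}(V^T) = \mathbb{R}^n \\
       &\iff \textstyle\sum_{i=1}^{n} \lambda_i |z_i|^2 \ge 0 \ \text{for all } z \in \mathbb{R}^n \\
       &\iff \lambda_i \ge 0 \ \text{for all } i
       \end{aligned}$$
       The PD case is proven in the same manner.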
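
Theorem 5.1 suggests a simple numerical test for definiteness. The sketch below assumes NumPy; the function name `classify`, the tolerance, and the returned labels are illustrative choices rather than anything defined in the lecture.

```python
import numpy as np

def classify(A, tol=1e-10):
    # Classify a symmetric matrix by the signs of its eigenvalues.
    lam = np.linalg.eigvalsh(A)          # real eigenvalues, ascending order
    if lam.min() > tol:
        return "PD"
    if lam.min() >= -tol:
        return "PSD"
    if lam.max() <= tol:
        return "negative semidefinite"
    return "indefinite"

print(classify(np.array([[2.0, 0.0], [0.0, 1.0]])))    # eigenvalues 1, 2
print(classify(np.array([[1.0, 1.0], [1.0, 1.0]])))    # eigenvalues 0, 2
print(classify(np.array([[1.0, 0.0], [0.0, -1.0]])))   # eigenvalues -1, 1
```

The tolerance is needed because an eigenvalue that is exactly zero in theory is typically computed as a tiny positive or negative number.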

  13. Example: Ellipsoid
     • an ellipsoid of $\mathbb{R}^n$ centered at 0 is defined as $\mathcal{E} = \{ x \in \mathbb{R}^n \mid x^T P^{-1} x \le 1 \}$, for some PD $P \in \mathbb{S}^n$
       [Figure: an ellipse centered at 0 with semi-axes $\ell_1$ and $\ell_2$.]
     • let $P = V \Lambda V^T$ be the eigendecomposition
       – $V$ determines the directions of the semi-axes
       – $\lambda_1, \ldots, \lambda_n$ determine the lengths of the semi-axes
       – $\ell_i = \lambda_i^{1/2} v_i$
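
The semi-axes can be computed directly from the eigendecomposition. A minimal sketch, assuming NumPy and an illustrative 2×2 PD matrix `P`; it also checks that each semi-axis endpoint lies on the boundary of the ellipsoid.

```python
import numpy as np

P = np.array([[3.0, 1.0], [1.0, 2.0]])    # PD, illustrative values
lam, V = np.linalg.eigh(P)                # P = V diag(lam) V^T
semi_axes = V * np.sqrt(lam)              # column i is sqrt(lam_i) * v_i

# Each semi-axis endpoint l_i should satisfy l_i^T P^{-1} l_i = 1.
boundary_values = np.array([l @ np.linalg.solve(P, l) for l in semi_axes.T])
lengths = np.linalg.norm(semi_axes, axis=0)
print(boundary_values, lengths)
```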

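  14. Example: Ellipsoid
     • an ellipsoid of $\mathbb{R}^n$ centered at 0 is defined as $\mathcal{E} = \{ x \in \mathbb{R}^n \mid x^T P^{-1} x \le 1 \}$, for some PD $P \in \mathbb{S}^n$
       [Figure: an ellipse centered at 0 with semi-axes $\ell_1$ and $\ell_2$.]
     • note (taking the eigenvalues of $P$ ordered as $\lambda_1 \ge \cdots \ge \lambda_n$):
       – in direction $v_1$, $x^T P^{-1} x$ is small, hence the ellipsoid is fat in direction $v_1$
       – in direction $v_n$, $x^T P^{-1} x$ is large, hence the ellipsoid is thin in direction $v_n$
       – $\sqrt{\lambda_{\max} / \lambda_{\min}}$ gives the maximum eccentricity
     • let $\tilde{\mathcal{E}} = \{ x \in \mathbb{R}^n \mid x^T Q^{-1} x \le 1 \}$, for some PD $Q \in \mathbb{S}^n$; then $\mathcal{E} \supseteq \tilde{\mathcal{E}} \iff P \succeq Q$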
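
The containment statement can be illustrated in two dimensions: when the matrix of the outer ellipsoid minus the matrix of the inner one is PSD, every boundary point of the inner ellipsoid lies inside the outer one. This is a sketch assuming NumPy; the matrices `P` and `Q` and the sampling of the boundary are illustrative, and a sampled check is evidence rather than a proof.

```python
import numpy as np

Q = np.array([[2.0, 0.5], [0.5, 1.0]])         # PD, defines the inner ellipsoid
P = Q + np.array([[1.0, 0.0], [0.0, 0.5]])     # P - Q is PD, so P ⪰ Q

L = np.linalg.cholesky(Q)                      # Q = L L^T
theta = np.linspace(0.0, 2.0 * np.pi, 200)
U = np.vstack([np.cos(theta), np.sin(theta)])  # unit vectors as columns
X = L @ U                                      # columns satisfy x^T Q^{-1} x = 1

# Column-wise quadratic forms x^T P^{-1} x for the sampled boundary points.
vals = np.einsum("ij,ij->j", X, np.linalg.solve(P, X))
all_inside = bool(np.all(vals <= 1.0 + 1e-12))
print(all_inside)
```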

  15. Example: Multivariate Gaussian Distribution
     • probability density function for a Gaussian-distributed vector $x \in \mathbb{R}^n$:
       $$p(x) = \frac{1}{(2\pi)^{n/2} (\det(\Sigma))^{1/2}} \exp\left( -\tfrac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right)$$
       where $\mu$ and $\Sigma$ are the mean and covariance of $x$, resp.
       – $\Sigma$ is PD
       – $\Sigma$ determines how $x$ is spread, in the same way as for the ellipsoid
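
The density formula translates almost line by line into code. A minimal sketch, assuming NumPy; the function name `gaussian_pdf` is introduced here for illustration, and the two covariance matrices are the identity and a correlated example with off-diagonal entries 0.8.

```python
import numpy as np

def gaussian_pdf(x, mu, Sigma):
    # p(x) = exp(-1/2 (x-mu)^T Sigma^{-1} (x-mu)) / ((2 pi)^{n/2} det(Sigma)^{1/2})
    n = mu.size
    d = x - mu
    quad = d @ np.linalg.solve(Sigma, d)       # (x-mu)^T Sigma^{-1} (x-mu)
    norm = (2.0 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(Sigma))
    return np.exp(-0.5 * quad) / norm

mu = np.zeros(2)
Sigma_a = np.eye(2)
Sigma_b = np.array([[1.0, 0.8], [0.8, 1.0]])

p_a = gaussian_pdf(mu, mu, Sigma_a)            # peak value, equals 1 / (2 pi)
p_b = gaussian_pdf(mu, mu, Sigma_b)            # peak value, equals 1 / (2 pi * 0.6)
print(p_a, p_b)
```

The correlated case has the smaller determinant (0.36), so its density is more concentrated and its peak is higher.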

  16. Example: Multivariate Gaussian Distribution [Figure: surface plots of the density over (x1, x2) for (a) $\mu = 0$, $\Sigma = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ and (b) $\mu = 0$, $\Sigma = \begin{bmatrix} 1 & 0.8 \\ 0.8 & 1 \end{bmatrix}$.]

  17. Some Properties of PSD Matrices
     • it can be directly seen from the definition that
       – $A \succeq 0 \Longrightarrow a_{ii} \ge 0$ for all $i$
       – $A \succ 0 \Longrightarrow a_{ii} > 0$ for all $i$
     • if $A$ is PSD, then for any $x$, $x^T A x = 0 \iff A x = 0$ (consequently, a PSD matrix $A$ is PD $\iff$ $A$ is nonsingular)
     • extension (also direct): partition $A$ as
       $$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}.$$
       Then, $A \succeq 0 \Longrightarrow A_{11} \succeq 0$, $A_{22} \succeq 0$. Also, $A \succ 0 \Longrightarrow A_{11} \succ 0$, $A_{22} \succ 0$
     • further extension:
       – a principal submatrix of $A$, denoted by $A_{\mathcal{I}}$, where $\mathcal{I} = \{ i_1, \ldots, i_m \} \subseteq \{ 1, \ldots, n \}$, $m < n$, is a submatrix obtained by keeping only the rows and columns indicated by $\mathcal{I}$; i.e., $[A_{\mathcal{I}}]_{jk} = a_{i_j, i_k}$ for all $j, k \in \{ 1, \ldots, m \}$
       – if $A$ is PSD (resp. PD), then any principal submatrix of $A$ is PSD (resp. PD)
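
The principal-submatrix property is easy to check numerically. A minimal sketch, assuming NumPy; the random PSD matrix, the index set `idx`, and the tolerance are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
G = rng.standard_normal((5, 5))
A = G @ G.T                                   # PSD by construction

idx = [0, 2, 3]                               # index set I (0-based here)
A_sub = A[np.ix_(idx, idx)]                   # keep rows and columns in I

min_eig_sub = np.linalg.eigvalsh(A_sub).min()
diag_nonneg = bool(np.all(np.diag(A) >= 0))   # diagonal entries are 1x1 principal submatrices
print(min_eig_sub >= -1e-10, diag_nonneg)
```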

  18. Some Properties of PSD Matrices
     Property 5.1. Let $A \in \mathbb{S}^n$, $B \in \mathbb{R}^{n \times m}$, and $C = B^T A B$. We have the following properties:
     1. $A \succeq 0 \Longrightarrow C \succeq 0$ (in particular, $A \succ 0 \Longrightarrow C \succeq 0$)
     2. suppose $A \succ 0$. It holds that $C \succ 0 \iff B$ has full column rank
     3. suppose $B$ is nonsingular. It holds that $A \succ 0 \iff C \succ 0$, and that $A \succeq 0 \iff C \succeq 0$.
     • proof sketch: the 1st property is trivial. For the 2nd property, observe that
       $$C \succ 0 \iff z^T A z > 0 \ \text{for all } z = Bx \text{ with } x \ne 0. \qquad (*)$$
       If $A \succ 0$, $(*)$ reduces to $C \succ 0 \iff Bx \ne 0$ for all $x \ne 0$ (i.e., $B$ has full column rank). The 3rd property is proven in a similar manner.
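
The 2nd property can be illustrated with a small example. This is a sketch assuming NumPy; the matrices `A`, `B_full`, and `B_def` are illustrative values chosen so that one `B` has full column rank and the other does not.

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])               # PD (eigenvalues 2 and 2 ± sqrt(2))

B_full = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])               # full column rank
B_def = np.array([[1.0, 2.0],
                  [0.0, 0.0],
                  [1.0, 2.0]])                # second column = 2 * first column

C_full = B_full.T @ A @ B_full                # expected to be PD
C_def = B_def.T @ A @ B_def                   # expected to be PSD but singular

min_eig_full = np.linalg.eigvalsh(C_full).min()
min_eig_def = np.linalg.eigvalsh(C_def).min()
print(np.linalg.matrix_rank(B_full), min_eig_full)
print(np.linalg.matrix_rank(B_def), min_eig_def)
```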
