
Numerical Analysis of the Matrix Logarithm Nick Higham School of - PowerPoint PPT Presentation



  1. Numerical Analysis of the Matrix Logarithm
     Nick Higham, School of Mathematics, The University of Manchester
     higham@ma.man.ac.uk, http://www.ma.man.ac.uk/~higham/
     Computational Methods with Applications, Harrachov 2007

  2. Defining f(A) | Applications | Theory | Methods
     Outline: 1. Definition of $\log(A)$; 2. Applications; 3. Theory; 4. Numerical methods
     MIMS, Nick Higham, Matrix Logarithm, 2 / 42

  3. Matrix Logarithm
     A logarithm of $A \in \mathbb{C}^{n \times n}$ is any matrix $X$ such that $e^X = A$.
     Existence. Representation, classification. Computation. Conditioning.
     First, approach via theory of matrix functions...

  4. Multiplicity of Definitions
     "There have been proposed in the literature since 1880 eight distinct definitions of a matric function, by Weyr, Sylvester and Buchheim, Giorgi, Cartan, Fantappiè, Cipolla, Schwerdtfeger and Richter."
     (R. F. Rinehart, The Equivalence of Definitions of a Matric Function, Amer. Math. Monthly, 1955)

  5. Jordan Canonical Form
     $$Z^{-1} A Z = J = \operatorname{diag}(J_1, \dots, J_p), \qquad J_k = \begin{bmatrix} \lambda_k & 1 & & \\ & \lambda_k & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_k \end{bmatrix} \in \mathbb{C}^{m_k \times m_k}.$$
     Definition:
     $$f(A) = Z f(J) Z^{-1} = Z \operatorname{diag}(f(J_k)) Z^{-1}, \qquad f(J_k) = \begin{bmatrix} f(\lambda_k) & f'(\lambda_k) & \dots & \dfrac{f^{(m_k-1)}(\lambda_k)}{(m_k-1)!} \\ & f(\lambda_k) & \ddots & \vdots \\ & & \ddots & f'(\lambda_k) \\ & & & f(\lambda_k) \end{bmatrix}.$$
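
As a small illustration of the Jordan block formula, the sketch below builds $f(J_k)$ for $f = \log$ on a single $3 \times 3$ block and compares it with SciPy's `scipy.linalg.logm`. The eigenvalue `lam = 2.0` and the use of SciPy are assumptions made for this example, not part of the slides.

```python
import numpy as np
from scipy.linalg import expm, logm

lam = 2.0  # illustrative eigenvalue (assumption)
# 3x3 Jordan block with eigenvalue lam and ones on the superdiagonal
J = lam * np.eye(3) + np.diag([1.0, 1.0], k=1)

# For f = log: f' = 1/x and f'' = -1/x^2; the (1,3) entry is f''(lam) / 2!
f0, f1, f2 = np.log(lam), 1.0 / lam, -1.0 / lam**2
F = np.array([[f0, f1, f2 / 2.0],
              [0.0, f0, f1],
              [0.0, 0.0, f0]])

err_vs_logm = np.linalg.norm(F - logm(J))   # agreement with SciPy's logarithm
err_roundtrip = np.linalg.norm(expm(F) - J)  # F really is a logarithm of J
```

Both error norms should be at roundoff level if the formula has been applied correctly.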

  6. Interpolation Definition (Sylvester, 1883; Buchheim, 1886)
     Let $A$ have distinct eigenvalues $\lambda_1, \dots, \lambda_s$, and let $n_i$ be the maximum size of the Jordan blocks for $\lambda_i$. Then $f(A) = p(A)$, where $p$ is the unique Hermite interpolating polynomial of degree less than $\sum_{i=1}^{s} n_i$ satisfying
     $$p^{(j)}(\lambda_i) = f^{(j)}(\lambda_i), \qquad j = 0\colon n_i - 1, \quad i = 1\colon s.$$
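
When all eigenvalues are distinct (every $n_i = 1$), the Hermite conditions reduce to ordinary interpolation of $f$ at the eigenvalues. The following sketch, for an illustrative triangular matrix chosen here, fits that polynomial for $f = \log$ and evaluates it at $A$. It is meant only to show the definition at work, not as a practical algorithm.

```python
import numpy as np
from scipy.linalg import logm

# Illustrative matrix (assumption): triangular, distinct eigenvalues 4, 2, 1
A = np.array([[4.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 1.0]])
lam = np.diag(A)  # eigenvalues of a triangular matrix are its diagonal

# Solve the Vandermonde system for p(x) = c0 + c1*x + c2*x^2 with p(lam_i) = log(lam_i)
V = np.vander(lam, increasing=True)
c = np.linalg.solve(V, np.log(lam))

# Evaluate p(A) by Horner's rule
P = np.zeros_like(A)
for ck in c[::-1]:
    P = P @ A + ck * np.eye(3)

err = np.linalg.norm(P - logm(A))
```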

  7. Cauchy Integral Theorem
     Definition:
     $$f(A) = \frac{1}{2\pi i} \int_{\Gamma} f(z)\,(zI - A)^{-1}\,dz,$$
     where $f$ is analytic on and inside a closed contour $\Gamma$ that encloses $\lambda(A)$.
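
The contour integral can be approximated directly with the trapezoidal rule on a circle. The sketch below does this for $f = \log$; the matrix, the circle (centre 2.5, radius 1.5) and the number of nodes are illustrative assumptions, chosen so that the contour encloses the eigenvalues and stays away from the branch cut of the scalar logarithm.

```python
import numpy as np
from scipy.linalg import logm

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])        # eigenvalues 2 and 3 (illustrative)
n = A.shape[0]
center, radius, m = 2.5, 1.5, 200  # circle lies in Re z >= 1, so log is analytic on and inside it

theta = 2 * np.pi * np.arange(m) / m
z = center + radius * np.exp(1j * theta)                 # nodes on the contour
dz = 1j * radius * np.exp(1j * theta) * (2 * np.pi / m)  # z'(theta) * dtheta

F = np.zeros((n, n), dtype=complex)
for zk, dzk in zip(z, dz):
    F += np.log(zk) * np.linalg.inv(zk * np.eye(n) - A) * dzk
F /= 2j * np.pi

err = np.linalg.norm(F - logm(A))
```

For a periodic analytic integrand the trapezoidal rule converges geometrically, which is why a modest number of nodes is enough here.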

  8. Equivalence of Definitions
     Theorem: The three definitions are equivalent, modulo the analyticity assumption for the Cauchy integral definition.

  9. Composite Functions
     Theorem: $f(t) = g(h(t)) \Rightarrow f(A) = g(h(A))$, provided the latter matrix is defined.
     Corollary: $\exp(\log(A)) = A$ when $\log(A)$ is defined.
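
A quick numerical check of the corollary, using SciPy's `expm` and `logm` as stand-ins for the matrix exponential and principal logarithm (an assumption of this sketch; the test matrix is also chosen only for illustration):

```python
import numpy as np
from scipy.linalg import expm, logm

# Eigenvalues (5 ± i*sqrt(15))/2, so none lie on the negative real axis
A = np.array([[1.0, 2.0],
              [-3.0, 4.0]])

roundtrip = expm(logm(A))
err = np.linalg.norm(roundtrip - A)
```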

  10. Outline: 1. Definition of $\log(A)$; 2. Applications; 3. Theory; 4. Numerical methods

  11. Application: Markov Models
      Time-homogeneous continuous-time Markov process with transition probability matrix $P(t) \in \mathbb{R}^{n \times n}$.
      Transition intensity matrix $Q$ is related to $P$ by $P(t) = e^{Qt}$.
      Elements of $Q$ satisfy
      $$q_{ij} \ge 0, \quad i \ne j, \qquad \sum_{j=1}^{n} q_{ij} = 0.$$
      Embeddability problem: When does a given stochastic $P$ have a real logarithm $Q$ that is an intensity matrix?
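
The intensity-matrix conditions are easy to test numerically. The sketch below checks them for the principal logarithm of two stochastic matrices; the helper `is_intensity_matrix`, the tolerance and both test matrices are hypothetical choices for illustration. It only examines the principal logarithm, so a `False` result here does not by itself settle the embeddability question.

```python
import numpy as np
from scipy.linalg import expm, logm

def is_intensity_matrix(Q, tol=1e-10):
    """Check q_ij >= 0 for i != j and zero row sums, up to tol."""
    off_diag = Q - np.diag(np.diag(Q))
    return bool(np.all(off_diag >= -tol) and np.all(np.abs(Q.sum(axis=1)) <= tol))

# A known intensity matrix and the transition matrix P(1) = e^Q it generates
Q_true = np.array([[-0.3, 0.2, 0.1],
                   [0.1, -0.4, 0.3],
                   [0.2, 0.2, -0.4]])
P = expm(Q_true)
Q = np.real(logm(P))
principal_log_ok = is_intensity_matrix(Q)

# A stochastic matrix whose principal logarithm has a negative off-diagonal entry
P_cyclic = np.array([[0.5, 0.5, 0.0],
                     [0.0, 0.5, 0.5],
                     [0.5, 0.0, 0.5]])
Q_cyclic = np.real(logm(P_cyclic))
cyclic_log_ok = is_intensity_matrix(Q_cyclic)
```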

  12. The Average Eye
      First order character of an optical system is characterized by the transference matrix
      $$T = \begin{bmatrix} S & \delta \\ 0 & 1 \end{bmatrix} \in \mathbb{R}^{5 \times 5},$$
      where $S \in \mathbb{R}^{4 \times 4}$ is symplectic: $S^T J S = J$, where $J = \begin{bmatrix} 0 & I_2 \\ -I_2 & 0 \end{bmatrix}$.
      The average $m^{-1} \sum_{i=1}^{m} T_i$ is not a transference matrix.
      Harris (2005) proposes the average $\exp\bigl(m^{-1} \sum_{i=1}^{m} \log(T_i)\bigr)$.

  13. The Average Eye (continued)
      For Hermitian positive definite $A$ and $B$, Arsigny et al. (2007) define the log-Euclidean mean
      $$E(A, B) = \exp\bigl(\tfrac{1}{2}(\log(A) + \log(B))\bigr).$$
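
A minimal sketch of the log-Euclidean mean for two symmetric positive definite matrices follows. The function name `log_euclidean_mean` and the test matrices are assumptions for illustration. The determinant check uses the identity $\det(e^X) = e^{\operatorname{trace}(X)}$, which gives $\det E(A,B) = \sqrt{\det A \cdot \det B}$.

```python
import numpy as np
from scipy.linalg import expm, logm

def log_euclidean_mean(mats):
    """exp of the arithmetic mean of the principal logarithms."""
    return expm(sum(logm(M) for M in mats) / len(mats))

# Two symmetric positive definite matrices (det A = 5, det B = 4)
A = np.array([[2.0, 1.0], [1.0, 3.0]])
B = np.array([[5.0, -1.0], [-1.0, 1.0]])

E = log_euclidean_mean([A, B])
is_symmetric = np.allclose(E, E.T)
is_pos_def = bool(np.all(np.linalg.eigvalsh(E) > 0))
det_E = np.linalg.det(E)
```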

  14. Outline: 1. Definition of $\log(A)$; 2. Applications; 3. Theory; 4. Numerical methods

  15. Logs of $A = I_3$
      $$B = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad C = \begin{bmatrix} 0 & 2\pi - 1 & 1 \\ -2\pi & 0 & 0 \\ -2\pi & 0 & 0 \end{bmatrix}, \quad D = \begin{bmatrix} 0 & 2\pi & 1 \\ -2\pi & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
      $e^B = e^C = e^D = I_3$. $\Lambda(C) = \Lambda(D) = \{0, 2\pi i, -2\pi i\}$.
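
These three logarithms of the identity can be checked numerically. The sketch below uses SciPy's `expm` (an assumption of this example) and also confirms the stated spectrum of $C$ and $D$.

```python
import numpy as np
from scipy.linalg import expm

two_pi = 2 * np.pi
B = np.zeros((3, 3))
C = np.array([[0.0, two_pi - 1.0, 1.0],
              [-two_pi, 0.0, 0.0],
              [-two_pi, 0.0, 0.0]])
D = np.array([[0.0, two_pi, 1.0],
              [-two_pi, 0.0, 0.0],
              [0.0, 0.0, 0.0]])

# Distance of each exponential from the identity
errs = [np.linalg.norm(expm(M) - np.eye(3)) for M in (B, C, D)]

# Imaginary parts of the eigenvalues, sorted: expected -2*pi, 0, 2*pi
imag_C = np.sort(np.linalg.eigvals(C).imag)
imag_D = np.sort(np.linalg.eigvals(D).imag)
```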

  16. Principal Log and $p$th Root
      Let $A \in \mathbb{C}^{n \times n}$ have no eigenvalues on $\mathbb{R}^-$.
      Principal log: $X = \log(A)$ denotes the unique $X$ such that $e^X = A$ and $-\pi < \operatorname{Im} \lambda(X) < \pi$.
      For the next 2 slides only, allow $\operatorname{Im} \lambda(X) = \pi$.
      Principal $p$th root: for integer $p > 0$, $X = A^{1/p}$ is the unique $X$ such that $X^p = A$ and $-\pi/p < \arg(\lambda(X)) < \pi/p$.
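
The eigenvalue conditions that single out the principal logarithm and the principal square root ($p = 2$) can be inspected directly. This sketch assumes SciPy's `logm` and `sqrtm` return the principal functions for the illustrative matrix below, whose eigenvalues are $\pm 2i$.

```python
import numpy as np
from scipy.linalg import expm, logm, sqrtm

A = np.array([[0.0, -2.0],
              [2.0, 0.0]])   # eigenvalues ±2i, none on the negative real axis

X = logm(A)
im_parts = np.linalg.eigvals(X).imag        # should lie in (-pi, pi)

R = sqrtm(A)
args = np.angle(np.linalg.eigvals(R))       # should lie in (-pi/2, pi/2)

log_err = np.linalg.norm(expm(X) - A)
root_err = np.linalg.norm(R @ R - A)
```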

  17. All Solutions of $e^X = A$
      Theorem (Gantmacher): Let $A \in \mathbb{C}^{n \times n}$ be nonsingular with Jordan canonical form $Z^{-1} A Z = J = \operatorname{diag}(J_1, J_2, \dots, J_p)$. All solutions to $e^X = A$ are given by
      $$X = Z U \operatorname{diag}\bigl(L_1^{(j_1)}, L_2^{(j_2)}, \dots, L_p^{(j_p)}\bigr) U^{-1} Z^{-1},$$
      where
      $$L_k^{(j_k)} = \log(J_k(\lambda_k)) + 2 j_k \pi i\, I_{m_k},$$
      $j_k \in \mathbb{Z}$ is arbitrary, and $U$ is an arbitrary nonsingular matrix that commutes with $J$.
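
For a diagonalizable matrix every Jordan block is $1 \times 1$, so with $U = I$ the theorem reduces to shifting each scalar logarithm by a multiple of $2\pi i$. The sketch below builds a few such solutions and checks $e^X = A$; the matrix, the helper name `log_branch` and the branch indices are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 1.0],
              [0.0, 2.0]])        # distinct eigenvalues 1 and 2
lam, Z = np.linalg.eig(A)

def log_branch(idx):
    """X = Z diag(log(lam_k) + 2*pi*i*idx_k) Z^{-1} for integer branch indices idx."""
    L = np.log(lam.astype(complex)) + 2j * np.pi * np.asarray(idx)
    return Z @ np.diag(L) @ np.linalg.inv(Z)

branches = [(0, 0), (1, 0), (-2, 3)]   # (0, 0) gives the principal logarithm
errs = [np.linalg.norm(expm(log_branch(idx)) - A) for idx in branches]
```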

  18. All Solutions of $e^X = A$: Classified
      Theorem: Let $A \in \mathbb{C}^{n \times n}$ be nonsingular, with $p$ Jordan blocks and $s$ distinct eigenvalues. Then $e^X = A$ has a countable infinity of solutions that are primary functions of $A$:
      $$X_j = Z \operatorname{diag}\bigl(L_1^{(j_1)}, L_2^{(j_2)}, \dots, L_p^{(j_p)}\bigr) Z^{-1},$$
      where $\lambda_i = \lambda_k$ implies $j_i = j_k$.
      If $s < p$ then $e^X = A$ has non-primary solutions
      $$X_j(U) = Z U \operatorname{diag}\bigl(L_1^{(j_1)}, L_2^{(j_2)}, \dots, L_p^{(j_p)}\bigr) U^{-1} Z^{-1},$$
      where $j_k \in \mathbb{Z}$ is arbitrary, $U$ is arbitrary nonsingular with $UJ = JU$, and for each $j$ there exist $i$ and $k$ such that $\lambda_i = \lambda_k$ while $j_i \ne j_k$.

  19. Logs of $A = I_3$
      $$C = \begin{bmatrix} 0 & 2\pi - 1 & 1 \\ -2\pi & 0 & 0 \\ -2\pi & 0 & 0 \end{bmatrix}, \quad D = \begin{bmatrix} 0 & 2\pi & 1 \\ -2\pi & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
      $e^0 = e^C = e^D = I_3$. $\Lambda(C) = \Lambda(D) = \{0, 2\pi i, -2\pi i\}$.
      $$U = \begin{bmatrix} 1 & \alpha & 0 \\ 0 & 1 & \alpha \\ 0 & 0 & 1 \end{bmatrix}, \quad \alpha \in \mathbb{C},$$
      $$X = U \operatorname{diag}(2\pi i, -2\pi i, 0)\, U^{-1} = 2\pi i \begin{bmatrix} 1 & -2\alpha & 2\alpha^2 \\ 0 & -1 & \alpha \\ 0 & 0 & 0 \end{bmatrix}.$$
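
The parametrized family $X = U \operatorname{diag}(2\pi i, -2\pi i, 0)\, U^{-1}$ can be checked numerically for a sample parameter. In this sketch the value `alpha = 0.5` and the helper name `nonprimary_log` are assumptions; the closed form compared against is the one obtained by multiplying out the product with $U^{-1} = \begin{bmatrix} 1 & -\alpha & \alpha^2 \\ 0 & 1 & -\alpha \\ 0 & 0 & 1 \end{bmatrix}$.

```python
import numpy as np
from scipy.linalg import expm

def nonprimary_log(alpha):
    """U diag(2*pi*i, -2*pi*i, 0) U^{-1} with U unit upper bidiagonal in alpha."""
    U = np.array([[1, alpha, 0],
                  [0, 1, alpha],
                  [0, 0, 1]], dtype=complex)
    return U @ np.diag([2j * np.pi, -2j * np.pi, 0]) @ np.linalg.inv(U)

alpha = 0.5
X = nonprimary_log(alpha)

# Closed form from multiplying out the product by hand
closed_form = 2j * np.pi * np.array([[1, -2 * alpha, 2 * alpha**2],
                                     [0, -1, alpha],
                                     [0, 0, 0]])

err_exp = np.linalg.norm(expm(X) - np.eye(3))   # X is a logarithm of I_3
```

Each value of `alpha` gives a different logarithm of $I_3$, which illustrates that non-primary solutions form a continuum.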

  20. Two Facts on Commuting Matrices
      Theorem: If $A, B \in \mathbb{C}^{n \times n}$ commute then there exists a unitary $U \in \mathbb{C}^{n \times n}$ such that $U^* A U$ and $U^* B U$ are both upper triangular.

  21. Two Facts on Commuting Matrices (continued)
      Theorem: For $A, B \in \mathbb{C}^{n \times n}$, $e^{(A+B)t} = e^{At} e^{Bt}$ for all $t$ if and only if $AB = BA$.
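
The second fact can be illustrated at a single value of $t$. In the sketch below the commuting partner is built as a polynomial in $A$, and the non-commuting one is an arbitrary choice; both, and the helper `splitting_gap`, are assumptions for illustration. Agreement at one $t$ does not prove commutativity, but a visible gap does show the identity failing.

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.5, 1.0],
              [0.0, 1.0]])
B_comm = A @ A - A                 # a polynomial in A, so it commutes with A
B_non = np.array([[0.0, 0.0],
                  [1.0, 0.0]])     # does not commute with A

def splitting_gap(X, Y, t=1.0):
    """Norm of e^{(X+Y)t} - e^{Xt} e^{Yt}."""
    return np.linalg.norm(expm((X + Y) * t) - expm(X * t) @ expm(Y * t))

gap_comm = splitting_gap(A, B_comm)
gap_non = splitting_gap(A, B_non)
```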

  22. When Does $\log(BC) = \log(B) + \log(C)$?
      Theorem: Let $B, C \in \mathbb{C}^{n \times n}$ commute and have no eigenvalues on $\mathbb{R}^-$. If for every eigenvalue $\lambda_j$ of $B$ and the corresponding eigenvalue $\mu_j$ of $C$,
      $$|\arg \lambda_j + \arg \mu_j| < \pi,$$
      then $\log(BC) = \log(B) + \log(C)$.

  23. When Does $\log(BC) = \log(B) + \log(C)$? (continued)
      Proof. $\log(B)$ and $\log(C)$ commute, since $B$ and $C$ do. Therefore
      $$e^{\log(B) + \log(C)} = e^{\log(B)} e^{\log(C)} = BC.$$
      Thus $\log(B) + \log(C)$ is some logarithm of $BC$. Because $\log(B)$ and $\log(C)$ commute, they can be simultaneously triangularized, so the eigenvalues of $\log(B) + \log(C)$ are $\log \lambda_j + \log \mu_j$. Then
      $$\operatorname{Im}(\log \lambda_j + \log \mu_j) = \arg \lambda_j + \arg \mu_j \in (-\pi, \pi),$$
      so $\log(B) + \log(C)$ is the principal logarithm of $BC$.
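
Scaled plane rotations give a convenient test case, since they commute with each other and have eigenvalues $r e^{\pm i\theta}$. The sketch below (helper names `rot` and `log_gap` and all angles are assumptions for illustration) compares a pair that satisfies the argument condition with a pair that violates it.

```python
import numpy as np
from scipy.linalg import logm

def rot(theta, r=1.0):
    """r times a plane rotation; eigenvalues are r * exp(±i * theta)."""
    c, s = np.cos(theta), np.sin(theta)
    return r * np.array([[c, -s], [s, c]])

def log_gap(B, C):
    """Norm of log(BC) - log(B) - log(C) using principal logarithms."""
    return np.linalg.norm(logm(B @ C) - logm(B) - logm(C))

gap_ok = log_gap(rot(1.0, 2.0), rot(1.5, 0.5))   # 1.0 + 1.5 < pi: identity holds
gap_bad = log_gap(rot(2.0), rot(2.0))            # 2.0 + 2.0 > pi: identity fails
```

In the second case the product is a rotation by 4 radians, whose principal logarithm corresponds to the angle $4 - 2\pi$, so the two sides differ by a multiple of $2\pi$ in the rotation generator.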

  24. Outline: 1. Definition of $\log(A)$; 2. Applications; 3. Theory; 4. Numerical methods

  25. Henry Briggs (1561–1630)
      Arithmetica Logarithmica (1624): logarithms to base 10 of 1–20,000 and 90,000–100,000 to 14 decimal places.

  26. Henry Briggs (1561–1630) (continued)
      "Briggs must be viewed as one of the great figures in numerical analysis."
      (Herman H. Goldstine, A History of Numerical Analysis, 1977)
