
Approximate Factor Analysis Models – Lorenzo Finesso, Peter Spreij (PowerPoint PPT Presentation)



  1. Approximate Factor Analysis Models Lorenzo Finesso, Peter Spreij Brixen – July 19, 2007

  2.
     P_1 =     [ 1    0    0   ]
               [ 1/2  1/2  0   ]
               [ 0    1/2  1/2 ]

     P_2 =     [ 1/2  0    1/2 ]
               [ 1/2  0    1/2 ]
               [ 0    0    1   ]

     P_2 P_1 = [ 1/2  1/4  1/4 ]
               [ 1/2  1/4  1/4 ]
               [ 0    1/2  1/2 ]
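A quick NumPy check of the matrix product on this slide (an illustration of mine, with the matrices written out as arrays):

```python
import numpy as np

P1 = np.array([[1.0, 0.0, 0.0],
               [0.5, 0.5, 0.0],
               [0.0, 0.5, 0.5]])
P2 = np.array([[0.5, 0.0, 0.5],
               [0.5, 0.0, 0.5],
               [0.0, 0.0, 1.0]])

print(P2 @ P1)
# [[0.5  0.25 0.25]
#  [0.5  0.25 0.25]
#  [0.   0.5  0.5 ]]
```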

  3. Factor Analysis models
     Y = HX + ε, where X ∈ R^k and ε ∈ R^n are independent zero-mean normal vectors (k < n),
     with Cov(X) = I and Cov(ε) = D > 0 diagonal. Therefore
       Cov(Y) := Σ_0 = HH^⊤ + D
       Cov(Y | X) = D, diagonal
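As an illustration (my own, not part of the slides): a small NumPy simulation of the model for arbitrary n, k, H and D, checking that the sample covariance of Y approaches Σ_0 = HH^⊤ + D.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, N = 5, 2, 200_000                          # observation dim, factor dim (k < n), sample size

H = rng.standard_normal((n, k))                  # loading matrix
d = rng.uniform(0.5, 1.5, size=n)                # diagonal of D > 0

X   = rng.standard_normal((N, k))                # Cov(X) = I
eps = rng.standard_normal((N, n)) * np.sqrt(d)   # Cov(eps) = D
Y   = X @ H.T + eps                              # Y = HX + eps, one sample per row

Sigma0 = H @ H.T + np.diag(d)                    # Cov(Y) = HH^T + D
print(np.abs(np.cov(Y, rowvar=False) - Sigma0).max())   # small (sampling error only)
```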

  4. Exact (weak) realization of FA models
     Problem: Given the positive covariance matrix Σ_0 ∈ R^{n×n} and the integer k < n, find (H, D) such that
       H ∈ R^{n×k},
       D > 0 diagonal of size n×n,
       Σ_0 = HH^⊤ + D.

  5. Informational divergence between normal measures
     Given probability measures P_1 ≪ P_2 on the same space,
       D(P_1 || P_2) = E_{P_1} log (dP_1 / dP_2).
     Normal case on R^n, with P_1 = N(0, Σ_1) and P_2 = N(0, Σ_2):
       D(P_1 || P_2) := D(Σ_1 || Σ_2) = 1/2 log(|Σ_2| / |Σ_1|) + 1/2 tr(Σ_2^{-1} Σ_1) − n/2.
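This formula translates directly into a few lines of Python; the function name gauss_kl and the test matrices are mine.

```python
import numpy as np

def gauss_kl(S1, S2):
    """D(N(0, S1) || N(0, S2)) = 1/2 log(|S2|/|S1|) + 1/2 tr(S2^{-1} S1) - n/2."""
    n = S1.shape[0]
    _, logdet1 = np.linalg.slogdet(S1)
    _, logdet2 = np.linalg.slogdet(S2)
    return 0.5 * (logdet2 - logdet1) + 0.5 * np.trace(np.linalg.solve(S2, S1)) - 0.5 * n

S = np.array([[2.0, 0.5],
              [0.5, 1.0]])
print(gauss_kl(S, S))                 # 0.0: the divergence vanishes when the covariances coincide
print(gauss_kl(S, np.eye(2)) > 0)     # True: it is positive otherwise
```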

  6. Approximate FA models
     Problem: Given Σ_0 ∈ R^{n×n} positive and the integer k < n, minimize
       D(Σ_0 || HH^⊤ + D) = 1/2 log(|HH^⊤ + D| / |Σ_0|) + 1/2 tr((HH^⊤ + D)^{-1} Σ_0) − n/2
     over (H, D), where H ∈ R^{n×k} and D > 0 is diagonal of size n.
     Proposition: The approximate FA problem admits a (nonunique) solution.
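At least part of the nonuniqueness is easy to see numerically: the objective depends on H only through HH^⊤, so replacing H by HU for any orthogonal k×k matrix U leaves the divergence unchanged. A short sketch of my own (the concrete Σ_0, H, D are arbitrary):

```python
import numpy as np

def gauss_kl(S1, S2):                                   # same formula as on the previous slide
    n = S1.shape[0]
    return 0.5 * (np.linalg.slogdet(S2)[1] - np.linalg.slogdet(S1)[1]
                  + np.trace(np.linalg.solve(S2, S1)) - n)

rng = np.random.default_rng(1)
n, k = 4, 2
Sigma0 = np.cov(rng.standard_normal((n, 3 * n)))        # some positive definite "data" covariance
H = rng.standard_normal((n, k))
D = np.diag(rng.uniform(0.5, 1.5, n))
U, _ = np.linalg.qr(rng.standard_normal((k, k)))        # random orthogonal k x k matrix

print(np.isclose(gauss_kl(Sigma0, H @ H.T + D),
                 gauss_kl(Sigma0, (H @ U) @ (H @ U).T + D)))   # True: (H, D) and (HU, D) tie
```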

  7. Lifted version of the problem
     Definitions:
       𝚺 = { Σ ∈ R^{(n+k)×(n+k)} : Σ = [ Σ_11 , Σ_12 ; Σ_21 , Σ_22 ] > 0 }
     Two subsets of 𝚺 will play a special role:
       𝚺_0 = { Σ ∈ 𝚺 : Σ_11 = Σ_0 }
       𝚺_1 = { Σ ∈ 𝚺 : Σ = [ HH^⊤ + D , HQ ; (HQ)^⊤ , Q^⊤ Q ] }
     Elements of 𝚺_1 will often be denoted by Σ(H, D, Q).
     Remark: Y ∼ N(0, Σ_0) admits an exact FA model of size k iff 𝚺_0 ∩ 𝚺_1 ≠ ∅.
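A minimal NumPy sketch (my own illustration) of assembling an element Σ(H, D, Q) of 𝚺_1 and checking that it indeed belongs to 𝚺, i.e. that it is positive definite; the concrete H, D, Q are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 4, 2
H = rng.standard_normal((n, k))
D = np.diag(rng.uniform(0.5, 1.5, n))
Q = rng.standard_normal((k, k))                      # invertible with probability 1

Sigma_HDQ = np.block([[H @ H.T + D, H @ Q],
                      [(H @ Q).T,   Q.T @ Q]])       # Sigma(H, D, Q)

print(np.allclose(Sigma_HDQ[:n, :n], H @ H.T + D))   # (1,1) block is HH^T + D
print(np.all(np.linalg.eigvalsh(Sigma_HDQ) > 0))     # Sigma(H, D, Q) > 0
```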

  8. Lifted problem
     Problem:
       min_{Σ' ∈ 𝚺_0, Σ_1 ∈ 𝚺_1} D(Σ' || Σ_1)
     Proposition: Let Σ_0 be given. It holds that
       min_{H,D} D(Σ_0 || HH^⊤ + D) = min_{Σ' ∈ 𝚺_0, Σ_1 ∈ 𝚺_1} D(Σ' || Σ_1).

  9. First partial minimization
     Problem:
       min_{Σ' ∈ 𝚺_0} D(Σ' || Σ)
     This problem has a unique solution.

  10. First partial minimization – general solution
     Proposition: Let (Y, X) ∼ Q = Q_{Y,X} and let
       𝐏 = { P = P_{Y,X} : P_Y = P_0 }
     for a given P_0 ≪ Q_Y. Then
       min_{P ∈ 𝐏} D(P || Q) = D(P* || Q),
     where P* is given by P*_Y = P_0 and P*_{X|Y} = Q_{X|Y}.
     Moreover, for any P ∈ 𝐏, one has the Pythagorean law
       D(P || Q) = D(P || P*) + D(P* || Q).

  11. First partial minimization – normal case
     Proposition: Let Q ∼ N(0, Σ) and P_0 ∼ N(0, Σ_0), where Σ ∈ 𝚺 and Σ_0 ∈ R^{n×n}. Then
       min_{Σ' ∈ 𝚺_0} D(Σ' || Σ)
     is attained by P* ∼ N(0, Σ*) with
       Σ* = [ Σ_0 , Σ_0 Σ_11^{-1} Σ_12 ; Σ_21 Σ_11^{-1} Σ_0 , Σ_22 − Σ_21 Σ_11^{-1} (Σ_11 − Σ_0) Σ_11^{-1} Σ_12 ]
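A NumPy sketch of this construction (the function name and the random test matrices are my own). The quick check verifies that the result has (1,1) block equal to Σ_0, so it lies in 𝚺_0, and that it is positive definite.

```python
import numpy as np

def first_partial_min(Sigma, Sigma0):
    """Optimal Sigma* in the set Sigma_0 for min D(Sigma' || Sigma), per the block formula above."""
    n = Sigma0.shape[0]
    S11, S12 = Sigma[:n, :n], Sigma[:n, n:]
    S22 = Sigma[n:, n:]
    A = np.linalg.solve(S11, S12)                     # Sigma_11^{-1} Sigma_12
    return np.block([
        [Sigma0,        Sigma0 @ A],
        [A.T @ Sigma0,  S22 - A.T @ (S11 - Sigma0) @ A],
    ])

# quick check on a random instance
rng = np.random.default_rng(3)
n, k = 4, 2
Sigma  = np.cov(rng.standard_normal((n + k, 2 * (n + k))))   # positive definite (n+k) x (n+k)
Sigma0 = np.cov(rng.standard_normal((n, 3 * n)))             # positive definite n x n
Sigma_star = first_partial_min(Sigma, Sigma0)
print(np.allclose(Sigma_star[:n, :n], Sigma0))               # True: Sigma* lies in the set Sigma_0
print(np.all(np.linalg.eigvalsh(Sigma_star) > 0))            # True: Sigma* > 0
```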

  12. Second partial minimization
     Problem:
       min_{Σ_1 ∈ 𝚺_1} D(Σ || Σ_1)
     This problem has a unique solution Σ*_1 = Σ(H*, D*, Q*).

  13. Second partial minimization – normal case
     Notation: For M square, let ∆(M) be the diagonal matrix with ∆(M)_ii = M_ii.
     Proposition: An optimal point is (H*, D*, Q*) with
       H* = Σ_12 Σ_22^{-1/2}
       D* = ∆(Σ_11 − Σ_12 Σ_22^{-1} Σ_21)
       Q* = Σ_22^{1/2}
     thus
       Σ*_1 = [ Σ_12 Σ_22^{-1} Σ_21 + ∆(Σ_11 − Σ_12 Σ_22^{-1} Σ_21) , Σ_12 ; Σ_21 , Σ_22 ]
     Moreover,
       D(Σ || Σ(H, D, Q)) = D(Σ || Σ*_1) + D(Σ*_1 || Σ(H, D, Q))
     for any Σ(H, D, Q) ∈ 𝚺_1.
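A matching sketch for the second partial minimization; the symmetric square root of Σ_22 is computed through an eigendecomposition, and the helper names are mine.

```python
import numpy as np

def sym_sqrt(M):
    """Symmetric square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.T

def second_partial_min(Sigma, n):
    """Optimal (H*, D*, Q*) for min D(Sigma || Sigma_1) over Sigma_1 in the set Sigma_1."""
    S11, S12 = Sigma[:n, :n], Sigma[:n, n:]
    S21, S22 = Sigma[n:, :n], Sigma[n:, n:]
    Q_star = sym_sqrt(S22)                                               # Sigma_22^{1/2}
    H_star = S12 @ np.linalg.inv(Q_star)                                 # Sigma_12 Sigma_22^{-1/2}
    D_star = np.diag(np.diag(S11 - S12 @ np.linalg.solve(S22, S21)))     # Delta(Sigma_11 - Sigma_12 Sigma_22^{-1} Sigma_21)
    return H_star, D_star, Q_star
```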

  14. Alternating minimization algorithm
     Given Σ_0 > 0, pick (H_0, D_0, Q_0) and let Σ_1^(0) = Σ(H_0, D_0, Q_0).
     Construct the sequence
       Σ_1^(0) → Σ'^(1) → Σ_1^(1) → Σ'^(2) → Σ_1^(2) → ...
     where
       D(Σ'^(t+1) || Σ_1^(t)) = min_{Σ' ∈ 𝚺_0} D(Σ' || Σ_1^(t))
     and
       D(Σ'^(t+1) || Σ_1^(t+1)) = min_{Σ_1 ∈ 𝚺_1} D(Σ'^(t+1) || Σ_1).

  15. Algorithm
     At the t-th iteration the matrices H_t, D_t and Q_t are available.
       Q_{t+1} = ( Q_t^⊤ Q_t − Q_t^⊤ H_t^⊤ (H_t H_t^⊤ + D_t)^{-1} H_t Q_t
                   + Q_t^⊤ H_t^⊤ (H_t H_t^⊤ + D_t)^{-1} Σ_0 (H_t H_t^⊤ + D_t)^{-1} H_t Q_t )^{1/2}
       H_{t+1} = Σ_0 (H_t H_t^⊤ + D_t)^{-1} H_t Q_t Q_{t+1}^{-1}
       D_{t+1} = ∆(Σ_0 − H_{t+1} H_{t+1}^⊤)

  16. Algorithm
     Notice: the update rules can be written in terms of (H_t, D_t) only.
       R_t = I − H_t^⊤ (H_t H_t^⊤ + D_t)^{-1} (H_t H_t^⊤ + D_t − Σ_0) (H_t H_t^⊤ + D_t)^{-1} H_t
       H_{t+1} = Σ_0 (H_t H_t^⊤ + D_t)^{-1} H_t R_t^{-1/2}
       D_{t+1} = ∆(Σ_0 − H_{t+1} H_{t+1}^⊤)
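These (H_t, D_t)-only updates translate directly into a short NumPy loop. The initialization, the fixed iteration count, and the helper names below are my own choices, not prescribed by the slides; the final print inspects the residual of the limit relation stated in property (g) on the next slide.

```python
import numpy as np

def inv_sqrt_sym(M):
    """Inverse symmetric square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(M)
    return (V / np.sqrt(w)) @ V.T

def approximate_fa(Sigma0, k, n_iter=500, seed=0):
    """Iterate H <- Sigma0 (HH^T + D)^{-1} H R^{-1/2},  D <- Delta(Sigma0 - HH^T)."""
    n = Sigma0.shape[0]
    rng = np.random.default_rng(seed)
    H = rng.standard_normal((n, k))                  # full column rank with probability 1
    D = np.diag(np.diag(Sigma0))                     # a positive diagonal starting point
    for _ in range(n_iter):
        M = H @ H.T + D
        MinvH = np.linalg.solve(M, H)                # (HH^T + D)^{-1} H
        R = np.eye(k) - MinvH.T @ (M - Sigma0) @ MinvH
        H = Sigma0 @ MinvH @ inv_sqrt_sym(R)
        D = np.diag(np.diag(Sigma0 - H @ H.T))
    return H, D

# small demonstration on a synthetic covariance
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
Sigma0 = A @ A.T + 5 * np.eye(5)                     # positive definite test covariance
H, D = approximate_fa(Sigma0, k=2)
print(np.linalg.norm(H - (Sigma0 - H @ H.T) @ np.linalg.solve(D, H)))   # small near a limit point
```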

  17. Some properties of the algorithm
     Proposition:
       (a) D_t > 0
       (b) R_t is invertible
       (c) If H_0 is of full column rank, so is H_t
       (e) If Σ_0 = H_t H_t^⊤ + D_t, the algorithm stops
       (f) The objective function decreases at each iteration
       (g) The limit points (H, D) of the algorithm satisfy the relations
             H = (Σ_0 − HH^⊤) D^{-1} H,   D = ∆(Σ_0 − HH^⊤)
