Online Principal Component Analysis


  1. Online Principal Component Analysis. Edo Liberty.

  2. PCA Motivation.

  3. PCA Objective. Given X ∈ R^{d×n} and k < d, minimize over Y ∈ R^{k×n}: min_Φ ∥X − ΦY∥²_F, or equivalently min_Φ ∑_t ∥x_t − Φy_t∥². Think of X = [x_1, x_2, ...] and Y = [y_1, y_2, ...] as collections of column vectors.

  4. Optimal Offline Solution. Let U_k span the top k left singular vectors of X.
      ■ Set Y = U_k^T X.
      ■ Set Φ = U_k.
      ■ Computing U_k is possible offline using the Singular Value Decomposition.
      ■ The optimal reconstruction Φ turns out to be an isometry.
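
A minimal NumPy sketch of this offline solution (my own illustration; the variable names are not from the talk):

```python
import numpy as np

def offline_pca(X, k):
    """Optimal offline solution: project onto the top-k left singular vectors of X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)  # thin SVD of the d x n matrix
    Phi = U[:, :k]        # isometry Phi = U_k (d x k, orthonormal columns)
    Y = Phi.T @ X         # encodings y_t = U_k^T x_t (k x n)
    return Phi, Y

# Sanity check: the reconstruction error equals the tail of the squared spectrum.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 200))                    # d = 50 dimensions, n = 200 points
Phi, Y = offline_pca(X, k=5)
s = np.linalg.svd(X, compute_uv=False)
assert np.isclose(np.linalg.norm(X - Phi @ Y, "fro") ** 2, (s[5:] ** 2).sum())
```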

  5. Pass-efficient PCA. We can compute U_k from XX^T, and XX^T = ∑_t x_t x_t^T. This requires Θ(nd²) time (potentially) and Θ(d²) space. Approximating U_k in one pass more efficiently is possible [FKV04, DK03, Sar06, DMM08, DRVW06, RV07, WLRT08, CW09, Oli10, CW12, Lib13, GP14, GLPW15]. Nevertheless, a second pass is required to map x_t ↦ y_t = U_k^T x_t.
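
A sketch of the covariance-based route described here, assuming the data arrive as a stream of d-dimensional vectors (again my own illustration, not code from the talk):

```python
import numpy as np

def left_singular_vectors_one_pass(stream, d, k):
    """Accumulate XX^T = sum_t x_t x_t^T in one pass (Theta(d^2) space),
    then take its top-k eigenvectors, which are the top-k left singular
    vectors U_k of X. A second pass is still needed to emit y_t = U_k^T x_t."""
    C = np.zeros((d, d))
    for x in stream:
        C += np.outer(x, x)
    eigvals, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :k]         # U_k
```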

  6. Online PCA. Consider online clustering (e.g. [CCFM97, LSS14]) or online facility location (e.g. [Mey01]). The PCA algorithm must output y_t before receiving x_{t+1}.

  7. Online regression. Note that this is non-trivial even when d = 2 and k = 1. For x_1 there aren’t many options...

  8. Online regression. Note that this is non-trivial even when d = 2 and k = 1. For x_2 this is already a non-standard optimization problem.

  9. Online regression. Note that this is non-trivial even when d = 2 and k = 1. In general, the mapping x_i ↦ y_i is not necessarily linear.

  10. Online PCA, Possible Problem Definitions.
      ■ Stochastic model: bounds ∥X − ΦY∥²_F; assumes the x_t are i.i.d. from an unknown distribution. [OK85, ACS13, MCJ13, BDF13]
      ■ Regret minimization: minimizes ∑_t ∥x_t − P_{t−1} x_t∥²; commits to P_{t−1} before observing x_t. [WK06, NKW13]
      ■ Random projection: can guarantee online that ∥X − (XY⁺)Y∥²_F is small. [Sar06, CW09]

  11. Online PCA Problem Definitions. Definition of a (c, ε)-approximation algorithm for online PCA: given X ∈ R^{d×n} as vectors [x_1, x_2, ...] and k < d, produce Y = [y_1, y_2, ...] such that
      ■ y_t is produced before observing x_{t+1};
      ■ y_t ∈ R^ℓ and ℓ ≤ c·k;
      ■ ∥X − ΦY∥²_F ≤ ∥X − X_k∥²_F + ε∥X∥²_F for some isometry Φ.
      Main Contribution [BGKL15]: there exists a (Õ(ε⁻²), ε)-approximation algorithm for online PCA.
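
The guarantee only asks for *some* isometry Φ. For a fixed X and a candidate Y, the best such isometry is given by orthogonal Procrustes, which yields a simple empirical check of the guarantee. This is my own illustration and not part of the algorithm in [BGKL15]:

```python
import numpy as np

def best_isometry(X, Y):
    """Isometry Phi (orthonormal columns) minimizing ||X - Phi Y||_F,
    via orthogonal Procrustes: if X Y^T = W S V^T then Phi = W V^T."""
    W, _, Vt = np.linalg.svd(X @ Y.T, full_matrices=False)
    return W @ Vt

def frobenius_guarantee_holds(X, Y, k, eps):
    """Check ||X - Phi Y||_F^2 <= ||X - X_k||_F^2 + eps * ||X||_F^2."""
    Phi = best_isometry(X, Y)
    s = np.linalg.svd(X, compute_uv=False)
    lhs = np.linalg.norm(X - Phi @ Y, "fro") ** 2
    return lhs <= (s[k:] ** 2).sum() + eps * (s ** 2).sum()
```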

  12. Noisy Data Spectra. Setting Y = 0 gives a (0, ε)-approximation...

  13. Noisy Data Spectra. Sometimes, “poor reconstruction error” is algorithmically required.

  14. Online PCA Problem Definitions. Setting Y = U_k^T X and Φ = U_k minimizes ∥X − ΦY∥²_2. Definition of a (c, ε)-approximation algorithm for spectral online PCA: given X ∈ R^{d×n} as vectors [x_1, x_2, ...] and k < d, produce Y = [y_1, y_2, ...] such that
      ■ y_t is produced before observing x_{t+1};
      ■ y_t ∈ R^ℓ and ℓ ≤ c·k;
      ■ ∥X − ΦY∥²_2 ≤ ∥X − X_k∥²_2 + ε∥X∥²_2 for some isometry Φ.
      Main Contribution [KL15]: there exists a (Õ(ε⁻²), ε)-approximation algorithm for spectral online PCA.

  15. Some Intuition. The covariance matrix X^T X visualized as an ellipse.

  16. Some Intuition. The optimal residual is R = X − X_k.

  17. Some Intuition. Any residual R = X − ΦY such that ∥R^T R∥ ≤ σ²_{k+1} + εσ²_1 would work.

  18. Bad Algorithm, Big Step Forward.
      ∆ = σ²_{k+1} + εσ²_1
      U ← all-zeros matrix
      for x_t ∈ X do:
          if ∥(I − UU^T) X_{1:t}∥² ≥ ∆: add the top left singular vector of (I − UU^T) X_{1:t} to U
          yield y_t = U^T x_t
      Obvious problems with this algorithm (will be fixed later):
      ■ it must “guess” σ²_{k+1} + εσ²_1;
      ■ it stores the entire history X_{1:t};
      ■ it computes the top singular value of (I − UU^T) X_{1:t} at every round.
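
A direct, equally impractical NumPy transcription of this "bad algorithm", assuming ∆ is known up front (my sketch, not the authors' code; xs is a list of 1-D arrays):

```python
import numpy as np

def bad_online_pca(xs, delta):
    """Keeps the full history and recomputes an SVD every round,
    but commits to directions online, as in slide 18."""
    d = len(xs[0])
    U = np.zeros((d, 0))                 # committed directions (columns)
    history = []
    for x in xs:
        history.append(x)
        X1t = np.column_stack(history)   # X_{1:t}
        R = X1t - U @ (U.T @ X1t)        # residual (I - UU^T) X_{1:t}
        if np.linalg.norm(R, 2) ** 2 >= delta:
            # commit to the top left singular vector of the residual
            u = np.linalg.svd(R, full_matrices=False)[0][:, :1]
            U = np.column_stack([U, u])
        yield U.T @ x                    # y_t, produced before seeing x_{t+1}
```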

  19. Algorithm Intuition. Assume we know ∆ = σ²_{k+1} + εσ²_1.

  20. Algorithm Intuition. We start with mapping x_t ↦ 0 and R_{1:t} = X_{1:t}.

  21. Algorithm Intuition. This is continued as long as ∥R^T R∥ ≤ ∆.

  22. Algorithm Intuition. When ∥R^T R∥ > ∆ we commit to a new online PCA direction u_i.

  23. Algorithm Intuition. This prevents R^T R from growing any more in the direction u_i.

  24. Algorithm Properties.
      Theorems 2, 5 and 6 in [KL15]: ∥X − UY∥²_2 ≤ ∥R∥²_2 ≤ σ²_{k+1} + εσ²_1 + o(σ²_1). The “proof by drawing” above is deceptively simple. This bound is the main difficulty!
      Theorem 1 in [KL15]: the number of directions added by the algorithm is ℓ ≤ k/ε.
      Proof: we sum the inequality ∆ ≤ ∥u_i^T X∥² over all added directions u_1, ..., u_ℓ:
      ℓ∆ ≤ ∑_{i=1}^{ℓ} ∥u_i^T X∥² = ∥U^T X∥²_F ≤ ∑_{i=1}^{ℓ} σ²_i ≤ kσ²_1 + (ℓ − k)σ²_{k+1}.
      Rearranging gives ℓ ≤ (kσ²_1 − kσ²_{k+1}) / (∆ − σ²_{k+1}), and substituting ∆ = σ²_{k+1} + εσ²_1 gives ℓ ≤ k/ε.
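
Writing out the rearrangement in the last two steps (my transcription, in LaTeX):

```latex
\ell\,\Delta \;\le\; \|U^{\top}X\|_F^2 \;\le\; k\,\sigma_1^2 + (\ell-k)\,\sigma_{k+1}^2
\;\Longrightarrow\;
\ell\,(\Delta-\sigma_{k+1}^2) \;\le\; k\,(\sigma_1^2-\sigma_{k+1}^2)
\;\Longrightarrow\;
\ell \;\le\; \frac{k(\sigma_1^2-\sigma_{k+1}^2)}{\varepsilon\,\sigma_1^2} \;\le\; \frac{k}{\varepsilon},
\qquad \text{using } \Delta=\sigma_{k+1}^2+\varepsilon\,\sigma_1^2 .
```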

  25. Fixing the Algorithm.
      ■ Exponentially search for the right ∆: if we added more than k/ε directions to U, we can conclude that ∆ < σ²_{k+1} + εσ²_1.
      ■ Instead of keeping X_{1:t}, use covariance sketching: keep B such that XX^T ≈ BB^T and B requires only o(d²) space to store.
      ■ Only compute the top singular value of (I − UU^T) X_{1:t} “once in a while”.
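
A hedged sketch combining the first two fixes, only to make the flow concrete. The doubling rule, the unsketched d×d covariance, and all names here are my guesses at a plausible skeleton, not the actual algorithm of [KL15], which also sketches the covariance to o(d²) space and checks the residual norm only occasionally:

```python
import numpy as np

def online_pca_skeleton(xs, k, eps, delta0):
    """Sketch: keep a running covariance instead of the full history and
    double the guess for Delta whenever too many directions get committed."""
    d = len(xs[0])
    budget = int(np.ceil(k / eps))      # at most ~k/eps directions per guess of Delta
    delta, since = delta0, 0
    U = np.zeros((d, 0))                # committed directions
    C = np.zeros((d, d))                # running XX^T (a real implementation sketches this)
    for x in xs:
        C += np.outer(x, x)
        P = np.eye(d) - U @ U.T         # projection away from committed directions
        M = P @ C @ P                   # (I - UU^T) XX^T (I - UU^T)
        if np.linalg.eigvalsh(M)[-1] >= delta:   # i.e. ||(I - UU^T) X_{1:t}||^2 >= Delta
            u = np.linalg.eigh(M)[1][:, -1:]     # top residual direction
            U = np.column_stack([U, u])
            since += 1
            if since > budget:          # Delta was guessed too small
                delta *= 2
                since = 0
        yield U.T @ x                   # y_t
```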

  26. Visual Illustration and Open Problems.
      ■ Can we reduce the target dimension while keeping the approximation guarantee?
      ■ Would allowing scaled isometric registration help reduce the target dimension?
      ■ Can we avoid the exponential search for ∆?
      ■ Is there a simple way to update U that is more accurate than only adding columns?
      ■ Can we reduce the running time of online PCA? Currently the bottleneck is covariance sketching.

  27. Thank you.
