

SLIDE 1

Fast algorithms for sparse principal component analysis based on Rayleigh quotient iteration

Volodymyr Kuleshov

Department of Computer Science Stanford University

June 18, 2013

Volodymyr Kuleshov Algorithms for sparse PCA

SLIDE 2

Sparse principal component analysis

Seeks principal components that maximize variance subject to a sparsity constraint:

  PCA:          max (1/2) x^T Σ x   s.t. ||x||_2 ≤ 1
  Sparse PCA:   max (1/2) x^T Σ x   s.t. ||x||_2 ≤ 1, ||x||_0 ≤ k

where Σ ∈ R^(n×n), Σ = Σ^T, and k > 0.
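As a concrete check of this formulation, the objective and both constraints can be evaluated directly. The helper below is a minimal sketch; the function name and the numerical tolerance are illustrative, not from the slides:

```python
import numpy as np

def sparse_pca_objective(Sigma, x, k):
    """Evaluate the sparse PCA objective (1/2) x^T Sigma x and check
    feasibility: ||x||_2 <= 1 and ||x||_0 <= k."""
    value = 0.5 * x @ Sigma @ x
    feasible = (np.linalg.norm(x) <= 1 + 1e-12) and (np.count_nonzero(x) <= k)
    return value, feasible
```

For example, on Σ = diag(3, 1), the unit vector e1 is feasible for k = 1 and attains value 1.5, while a dense unit vector violates the l0 constraint.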


SLIDE 3

Current state-of-the-art

The most successful methods are variations of the generalized power method. The sparsification step typically consists of soft thresholding followed by scaling to unit norm.

Algorithm 1 GPM(Σ, x0, γ, ε)
  j ← 0
  repeat
    y ← Σ x^(j)
    x^(j+1) ← SparsifyAndScale_γ(y)    // new relative to the power method
    j ← j + 1
  until ||x^(j) − x^(j−1)|| < ε
  return x^(j)
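One plausible NumPy reading of this pseudocode, with soft thresholding and rescaling as the sparsification step as the slide describes; the defaults for the tolerance and iteration cap are illustrative:

```python
import numpy as np

def gpm(Sigma, x0, gamma, eps=1e-6, max_iter=1000):
    """Generalized power method sketch: power iteration with a
    soft-threshold-and-rescale step.  Larger gamma gives sparser output."""
    x = x0 / np.linalg.norm(x0)
    for _ in range(max_iter):
        y = Sigma @ x
        # soft thresholding: shrink magnitudes by gamma, zeroing small entries
        z = np.sign(y) * np.maximum(np.abs(y) - gamma, 0.0)
        if np.linalg.norm(z) == 0:
            return x  # gamma too aggressive: every entry was zeroed
        x_new = z / np.linalg.norm(z)  # scale back to unit norm
        if np.linalg.norm(x_new - x) < eps:
            return x_new
        x = x_new
    return x
```

With gamma = 0 this reduces to the plain power method; increasing gamma trades variance for sparsity.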


SLIDE 4

Rayleigh quotient iteration

A more sophisticated algorithm than the power method for computing eigenvalues and eigenvectors.

Algorithm 2 RayleighQuotientIteration(Σ, x0, ε)
  j ← 0
  repeat
    µ ← (x^(j))^T Σ x^(j) / (x^(j))^T x^(j)    // Rayleigh quotient
    x^(j+1) ← (Σ − µI)^(−1) x^(j) / ||(Σ − µI)^(−1) x^(j)||
    j ← j + 1
  until ||x^(j) − x^(j−1)|| < ε
  return x^(j)
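A minimal NumPy sketch of this iteration, assuming a symmetric Σ. The convergence test allows for the sign flips that shift-and-invert steps can produce, and the singular-solve fallback (reached when the shift hits an eigenvalue exactly) is my addition:

```python
import numpy as np

def rqi(Sigma, x0, eps=1e-10, max_iter=50):
    """Rayleigh quotient iteration sketch: shift-and-invert power
    iteration using the Rayleigh quotient as the shift."""
    x = x0 / np.linalg.norm(x0)
    n = Sigma.shape[0]
    for _ in range(max_iter):
        mu = x @ Sigma @ x  # Rayleigh quotient (x already has unit norm)
        try:
            y = np.linalg.solve(Sigma - mu * np.eye(n), x)
        except np.linalg.LinAlgError:
            return x, mu  # shift equals an eigenvalue: we have converged
        x_new = y / np.linalg.norm(y)
        # compare up to sign, since (Sigma - mu I)^(-1) can flip x
        if np.linalg.norm(x_new - np.sign(x_new @ x) * x) < eps:
            return x_new, x_new @ Sigma @ x_new
        x = x_new
    return x, x @ Sigma @ x
```

The solve becomes nearly singular as µ approaches an eigenvalue, which is exactly what makes the update direction snap onto the corresponding eigenvector.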


SLIDE 5

Generalized Rayleigh quotient iteration

Algorithm 3 GRQI(Σ, x0, k, J, ε)
  j ← 0
  repeat
    W ← {i | x^(j)_i ≠ 0}              // working set of nonzero coordinates
    x^(j)_W ← RQIStep(x^(j)_W, Σ_W)    // Rayleigh quotient update
    if j < J then
      x^(j) ← Σ x^(j) / ||Σ x^(j)||_2  // power method update
    end if
    x^(j+1) ← Project_k(x^(j))         // project onto the l0 ∩ l2 ball
    j ← j + 1
  until ||x^(j) − x^(j−1)|| < ε
  return x^(j)
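A rough NumPy sketch of this scheme, filling in the unstated details: RQIStep is taken as one shift-and-invert solve restricted to the current support, and Project_k as hard thresholding to the k largest magnitudes followed by rescaling. The exact update order is my reading of the pseudocode, not a definitive implementation:

```python
import numpy as np

def project_k(x, k):
    """Keep the k largest-magnitude entries, zero the rest, and rescale
    to unit norm (projection onto the l0 ball intersected with the l2 sphere)."""
    z = np.zeros_like(x)
    idx = np.argsort(-np.abs(x))[:k]
    z[idx] = x[idx]
    return z / np.linalg.norm(z)

def grqi(Sigma, x0, k, J=3, eps=1e-8, max_iter=100):
    """Generalized Rayleigh quotient iteration sketch: an RQI step on the
    working set W, a power-method step for the first J iterations, then
    projection back onto k-sparse unit vectors."""
    x = project_k(x0.astype(float), k)
    for j in range(max_iter):
        W = np.nonzero(x)[0]             # working set of nonzero coordinates
        S = Sigma[np.ix_(W, W)]          # restriction of Sigma to W
        xw = x[W]
        mu = xw @ S @ xw / (xw @ xw)     # Rayleigh quotient on the support
        try:
            y = np.linalg.solve(S - mu * np.eye(len(W)), xw)
            x_new = np.zeros_like(x)
            x_new[W] = y / np.linalg.norm(y)
        except np.linalg.LinAlgError:
            x_new = x                    # shift hit an eigenvalue of S exactly
        if j < J:                        # early power-method step explores
            x_new = Sigma @ x_new        # coordinates outside the support
            x_new /= np.linalg.norm(x_new)
        x_new = project_k(x_new, k)
        if min(np.linalg.norm(x_new - x),
               np.linalg.norm(x_new + x)) < eps:  # converged up to sign
            return x_new
        x = x_new
    return x
```

The power-method steps in the first J iterations let the support change, since the RQI step alone only touches coordinates already in W.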


SLIDE 6

Comparison

  Gen. power method             | Gen. Rayleigh quotient iter.
  ------------------------------|------------------------------------
  Extends power method          | Extends Rayleigh quotient iter.
  Form of gradient descent      | A second-order method
  Linear convergence            | Cubic convergence
  O(nk + n²) flops per iter.    | O(nk + k³) flops per iter.
  Converges in about 100 iter.  | Converges in about 10 iter.


SLIDE 7

Comparison

[Figure: two panels comparing GPower0, GPower1, and GRQI]

(a) Flops to compute an eigenvector as a function of sparsity (matrices in R^(1000×1000))

(b) Variance/sparsity tradeoff (random matrices in R^(1000×1000))


SLIDE 8

Summary

New algorithms for sparse PCA that:

  • Use 10-100x fewer flops than the best current methods;
  • Find sparse components that are as good as or better than ones from existing algorithms;
  • Generalize Rayleigh quotient iteration.

This motivates further research into second-order methods for matrix factorizations.
