EMPIRICAL COMPARISON OF COLUMN SUBSET SELECTION ALGORITHMS
Yining Wang, Aarti Singh Machine Learning Department, Carnegie Mellon University
COLUMN SUBSET SELECTION
Given $M \in \mathbb{R}^{n_1 \times n_2}$, choose a column subset $C \in \mathbb{R}^{n_1 \times s}$ of $M$ solving
$$\min_{|C| \le s} \|M - CC^\dagger M\|_F$$
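The objective above can be evaluated directly for any candidate subset. The following is a minimal NumPy sketch, not code from the presentation: the helper name `css_error` and the random rank-5 test matrix are illustrative assumptions.

```python
import numpy as np

def css_error(M, cols):
    """Frobenius-norm residual ||M - C C^+ M||_F for the columns `cols` of M."""
    C = M[:, cols]
    # C C^+ is the orthogonal projector onto the column space of C.
    return np.linalg.norm(M - C @ np.linalg.pinv(C) @ M, "fro")

rng = np.random.default_rng(0)
# Illustrative 20 x 30 matrix of rank 5 (assumption, not data from the slides).
M = rng.standard_normal((20, 5)) @ rng.standard_normal((5, 30))

err_few = css_error(M, [0, 1])               # two columns cannot span a rank-5 column space
err_enough = css_error(M, [0, 1, 2, 3, 4])   # five generic columns span it
```

Here `err_few` is clearly positive, while `err_enough` is zero up to rounding, since five generic columns of a rank-5 matrix already span its column space.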
Interpretable low-rank approximation (compared to PCA)
Applications:
  Unsupervised feature selection
  Image compression
  Genetic analysis: target SNP selection, etc.
Challenges:
  Exact column subset selection is NP-hard
Deterministic algorithms: most accurate, but expensive, $O(n^3)$
  Rank-revealing QR (RRQR) [Chan, 87]
Sampling-based algorithms: slightly inaccurate, but cheap, $O(n^2 k)$
  Norm sampling [Frieze et al., 04]
  Leverage score sampling [Drineas et al., 08]
  Iterative norm sampling (approximate volume sampling) [Deshpande & Vempala, 06]
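As a rough illustration of the deterministic family, the sketch below uses greedy column pivoting, the heuristic behind QR with column pivoting. It is a simplified stand-in, not Chan's RRQR algorithm itself. The function name `pivoted_qr_columns` and the tolerance `1e-12` are assumptions made for this example.

```python
import numpy as np

def pivoted_qr_columns(M, s):
    """Greedy column pivoting: pick the column with the largest residual norm,
    then project its direction out of every column. Repeats up to s times."""
    R = np.array(M, dtype=float)        # residual matrix (a copy of M)
    chosen = []
    for _ in range(s):
        sq_norms = np.sum(R * R, axis=0)
        j = int(np.argmax(sq_norms))
        if sq_norms[j] <= 1e-12:        # residual is numerically zero: stop early
            break
        q = R[:, j] / np.sqrt(sq_norms[j])
        R -= np.outer(q, q @ R)         # remove the component along q from all columns
        chosen.append(j)
    return chosen
```

Each step always takes the largest residual column, so the selection is deterministic for a given input matrix.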
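Norm sampling: sample column $i$ with probability $p_i \propto \|M^{(i)}\|_2^2$, where $M^{(i)}$ is the $i$-th column of $M$.
$$\|M - CC^\dagger M\|_F^2 \le \|M - M_k\|_F^2 + O(k/s) \cdot \|M\|_F^2$$
Here $M_k$ is the best rank-$k$ approximation of $M$.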
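A minimal sketch of sampling columns in proportion to their squared norms might look as follows. Drawing i.i.d. with replacement is an assumption of this example, and the name `norm_sampling` is hypothetical.

```python
import numpy as np

def norm_sampling(M, s, rng):
    """Draw s column indices i.i.d. with p_i proportional to ||M^(i)||_2^2.
    Returns the sampled indices and the sampling distribution p."""
    sq_norms = np.sum(M * M, axis=0)    # squared Euclidean norm of each column
    p = sq_norms / sq_norms.sum()
    idx = rng.choice(M.shape[1], size=s, replace=True, p=p)
    return idx, p
```

A call such as `idx, p = norm_sampling(M, 10, np.random.default_rng(0))` then gives the selected columns as `M[:, idx]`. Only one pass over the column norms is needed.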
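Leverage score sampling: with the rank-$k$ truncated SVD $M_k = U_k \Sigma_k V_k^\top$, sample column $i$ with probability $p_i \propto \|V_k^\top e_i\|_2^2$.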
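Leverage score sampling can be sketched with a truncated SVD. The code below assumes the standard definition of rank-k column leverage scores, namely the squared column norms of $V_k^\top$. The function names and the choice to sample with replacement are assumptions for illustration.

```python
import numpy as np

def leverage_scores(M, k):
    """Rank-k column leverage scores: squared norms of the columns of V_k^T."""
    _, _, Vt = np.linalg.svd(M, full_matrices=False)
    return np.sum(Vt[:k] ** 2, axis=0)  # sums to k because the rows of V_k^T are orthonormal

def leverage_score_sampling(M, k, s, rng):
    """Draw s column indices i.i.d. with probability proportional to leverage score."""
    scores = leverage_scores(M, k)
    p = scores / scores.sum()
    return rng.choice(M.shape[1], size=s, replace=True, p=p)
```

This sketch computes a full SVD for simplicity, which is more work than a truncated rank-k factorization would need.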
Iterative norm sampling: $O(n^2 s)$
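One common way to realize iterative norm sampling is to pick one column per round with probability proportional to its squared residual norm, then project the chosen direction out of the residual. The sketch below follows that reading. The function name, the early-exit tolerance, and the explicit zeroing of a chosen column are assumptions, not details taken from the slides.

```python
import numpy as np

def iterative_norm_sampling(M, s, rng):
    """Select up to s columns, one per round, each drawn with probability
    proportional to its squared norm in the current residual."""
    R = np.array(M, dtype=float)        # residual of M after projecting out chosen columns
    chosen = []
    for _ in range(s):
        sq_norms = np.sum(R * R, axis=0)
        total = sq_norms.sum()
        if total <= 1e-12:              # nothing left to approximate
            break
        j = int(rng.choice(M.shape[1], p=sq_norms / total))
        q = R[:, j] / np.sqrt(sq_norms[j])
        R -= np.outer(q, q @ R)         # project the chosen direction out of every column
        R[:, j] = 0.0                   # a chosen column is never drawn again
        chosen.append(j)
    return chosen
```

Each round touches every entry of the residual once, so s rounds cost on the order of n1 * n2 * s operations for an n1 x n2 matrix.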
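$$\|M - CC^\dagger M\|_F^2 \le \|M - M_k\|_F^2 + \epsilon \|M\|_F^2$$
$$\|M - CC^\dagger M\|_F^2 \le (1 + \epsilon) \|M - M_k\|_F^2$$
$$\|M - CC^\dagger M\|_F^2 \le (k + 1)! \, \|M - M_k\|_F^2$$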
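All three bounds compare the selection error against the best rank-k error $\|M - M_k\|_F^2$, which equals the sum of the squared singular values beyond the k-th. The sketch below computes both quantities and their ratio for an arbitrary subset. The helper names and the random test matrix are illustrative assumptions.

```python
import numpy as np

def best_rank_k_error_sq(M, k):
    """||M - M_k||_F^2: sum of squared singular values beyond the k-th."""
    sigma = np.linalg.svd(M, compute_uv=False)
    return float(np.sum(sigma[k:] ** 2))

def css_error_sq(M, cols):
    """||M - C C^+ M||_F^2 for the columns `cols` of M."""
    C = M[:, cols]
    return float(np.linalg.norm(M - C @ np.linalg.pinv(C) @ M, "fro") ** 2)

rng = np.random.default_rng(0)
M = rng.standard_normal((15, 25))       # illustrative data, not from the slides
k = 3
baseline = best_rank_k_error_sq(M, k)
achieved = css_error_sq(M, [0, 1, 2])   # any k columns; here simply the first three
ratio = achieved / baseline             # the quantity bounded by 1 + eps or (k + 1)!
```

With exactly k columns, $CC^\dagger M$ has rank at most k, so the ratio cannot fall below 1.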
T.F. Chan, “Rank Revealing QR Factorizations,” Linear Algebra and Its Applications,
Vempala, “Fast Monte-Carlo Algorithms for Finding Low-rank Approximations,” Journal of the ACM, vol. 51, no. 6, pp. 1025-1041, 2004.
P. Drineas, M.W. Mahoney and S. Muthukrishnan, “Relative-error CUR Matrix Decompositions,” SIAM Journal on Matrix Analysis and Applications, vol. 30, no. 2,
Vempala, “Adaptive Sampling and Fast Low-rank Matrix Approximation,” in Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, 2006, pp. 292-303.