
Large scale greedy feature-selection for multi-target learning - PowerPoint PPT Presentation



  1. Large scale greedy feature-selection for multi-target learning
  Antti Airola, Tapio Pahikkala et al.
  ECML 2015 BigTargets Workshop

  2. Overview
  Joint work with many authors. University of Turku: Antti Airola, Pekka Naula, Tapio Pahikkala, Tapio Salakoski (multi-target greedy RLS).

  3. Overview
  Large scale feature selection for multi-target learning.
  Task: select a minimal set of common features that allows accurate predictions over all target tasks.
  Greedy RLS: greedy regularized least-squares, running in linear time in #inputs, #features, #outputs and #selected.
  Highlights from experiments:
  - Broad-DREAM Gene Essentiality Prediction Challenge
  - Outperforms multi-task Lasso for small feature budgets
  - Also scales to full genome-wide association studies: thousands of samples, hundreds of thousands of features (recent PhD thesis: Sebastian Okser)

  4. Motivation
  Why feature selection?
  1. Accuracy: the regularizing effect of using few features helps avoid overfitting and leads to better generalization.
  2. Interpretability: a small set of features can be understood by a human expert.
  3. Budget constraints: obtaining features costs time and money.

  5. Model sparsity
  Example: two features-by-targets coefficient matrices (8 features, 4 targets),

  $$W_1 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 3 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 2 & 0 \end{pmatrix}, \quad W_2 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 2 & 3 & -1 & 2 \\ 0 & 0 & 0 & 0 \\ 3 & 1 & 4 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

  Every row of $W_1$ contains a nonzero coefficient, so all 8 features are needed for prediction; $W_2$ has nonzero rows only for 2 features, so 2 features suffice.
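  The feature count in this example is simply the number of nonzero rows of the coefficient matrix. The short NumPy sketch below is my own illustration (the matrix entries follow the reconstruction above) and makes that count explicit.

```python
import numpy as np

# Coefficient matrices from the slide example (8 features x 4 targets).
W1 = np.array([[1, 0, 0, 0], [3, 0, 0, 0], [0, 2, 0, 0], [0, -1, 0, 0],
               [0, 0, 0, 3], [0, 0, 0, 1], [0, 0, 2, 0], [0, 0, 2, 0]])
W2 = np.array([[0, 0, 0, 0], [2, 3, -1, 2], [0, 0, 0, 0], [3, 1, 4, 1],
               [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]])

# A feature is needed if its row has a nonzero coefficient for at least one target.
print(np.sum(np.any(W1 != 0, axis=1)))  # -> 8 features needed
print(np.sum(np.any(W2 != 0, axis=1)))  # -> 2 features needed
```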

  6. Learning task
  Least-squares formulation:

  $$\arg\min_{W \in \mathbb{R}^{d \times t}} \| XW - Y \|_F^2 \quad \text{subject to} \quad C(W)$$

  Notation: $X$ data matrix, $Y$ output matrix, $W$ model coefficients, $\| \cdot \|_F$ Frobenius norm, $C(\cdot)$ constraint (regularizer).
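  As a concrete reading of the notation, the snippet below (illustrative only; the sizes n, d, t and the random data are my assumptions, not from the slides) evaluates the unconstrained least-squares objective.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, t = 100, 20, 5                 # samples, features, targets (illustrative sizes)
X = rng.standard_normal((n, d))      # data matrix X
Y = rng.standard_normal((n, t))      # output matrix Y
W = rng.standard_normal((d, t))      # model coefficients W

objective = np.linalg.norm(X @ W - Y, ord='fro') ** 2   # ||XW - Y||_F^2
print(objective)
```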

  7. Multi-task Lasso (baseline)
  Multi-task Lasso (Zhang, 2006):

  $$\arg\min_{W \in \mathbb{R}^{d \times t}} \| XW - Y \|_F^2 \quad \text{subject to} \quad \sum_{i=1}^{d} \max_j |W_{i,j}| \le r$$

  The $L_{1,\infty}$ norm constraint enforces sparsity in the number of features; $r > 0$ is the regularization parameter.
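  To make the constraint concrete, the helper below (a hypothetical function name, not from the slides) evaluates the $L_{1,\infty}$ norm that the multi-task Lasso bounds by $r$.

```python
import numpy as np

def l1_inf_norm(W):
    """Sum over features (rows) of the largest absolute coefficient across targets.
    A feature whose row is entirely zero contributes nothing, so bounding this
    norm by r drives whole rows of W to zero, i.e. enforces feature-level sparsity."""
    return np.sum(np.max(np.abs(W), axis=1))
```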

  8. Greedy RLS
  Greedy RLS (proposed):

  $$\arg\min_{W \in \mathbb{R}^{d \times t}} \| XW - Y \|_F^2 \quad \text{subject to} \quad \| W \|_F^2 < r \ \text{ and } \ |\{ i \mid \exists j,\ W_{i,j} \ne 0 \}| \le k$$

  Here $r > 0$ is the regularization parameter and $k > 0$ the constraint on the number of features. Exact optimization would require a search over the power set of features, so heuristics are needed.

  9. Greedy RLS
  Greedy regularized least-squares (Greedy RLS):
  Starting from the empty feature set, at each step add the feature that most reduces the leave-one-out cross-validation error. Stop once k features have been selected.

  10. Greedy RLS
  Algorithm 1: Multi-target greedy RLS

  S ← ∅                                   ⊲ selected features, common to all tasks
  while |S| < k do                        ⊲ select k features
      e ← ∞
      b ← 0
      for i ∈ {1, ..., d} \ S do          ⊲ test all remaining features
          e_avg ← 0
          for j ∈ {1, ..., t} do
              e_{i,j} ← L(X_{:,S∪{i}}, Y_{:,j})    ⊲ LOO error for task j
              e_avg ← e_avg + e_{i,j} / t
          if e_avg < e then
              e ← e_avg
              b ← i
      S ← S ∪ {b}                         ⊲ add the feature with the lowest LOO error
  W ← A(X_{:,S}, Y)                       ⊲ train final models
  return W, S

  11. Greedy RLS
  Greedy RLS could be implemented as general wrapper code calling a black-box solver, but a naive implementation needs #selected × #features × #targets × #CV-rounds solver calls!
  Matrix-algebraic optimization of the feature-addition and leave-one-out steps (for all targets simultaneously) yields a linear time algorithm in #inputs, #features, #outputs and #selected.
  P. Naula, A. Airola, T. Salakoski and T. Pahikkala. Multi-label learning under feature extraction budgets. Pattern Recognition Letters, 2014.
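  A minimal sketch of such a wrapper, assuming scikit-learn's Ridge as the black-box solver and brute-force leave-one-out refits (the function name, lam, and the explicit LOO loop are my illustration, not the authors' implementation). The four nested loops make the #selected × #features × #targets × #CV-rounds cost visible.

```python
import numpy as np
from sklearn.linear_model import Ridge

def naive_greedy_rls(X, Y, k, lam=1.0):
    """Naive wrapper version of Algorithm 1: each candidate feature set is scored
    by explicit leave-one-out cross-validation with a black-box ridge solver."""
    n, d = X.shape
    t = Y.shape[1]
    S = []
    for _ in range(k):                             # select k features
        best_err, best_i = np.inf, -1
        for i in set(range(d)) - set(S):           # test all remaining features
            cols = S + [i]
            err = 0.0
            for j in range(t):                     # LOO error for each task
                for h in range(n):                 # leave sample h out and refit
                    mask = np.arange(n) != h
                    model = Ridge(alpha=lam, fit_intercept=False)
                    model.fit(X[mask][:, cols], Y[mask, j])
                    pred = model.predict(X[h:h + 1, cols])[0]
                    err += (pred - Y[h, j]) ** 2
            err /= t
            if err < best_err:
                best_err, best_i = err, i
        S.append(best_i)                           # feature with lowest LOO error
    final = Ridge(alpha=lam, fit_intercept=False).fit(X[:, S], Y)
    W = final.coef_.T                              # (|S|, t) coefficient matrix
    return W, S
```

  Each candidate triggers n ridge fits per task, which is exactly the cost that the matrix-algebraic version on the next slide avoids.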

  12. Greedy RLS
  Algorithm 2: Multi-target greedy RLS (matrix-algebraic implementation)

  A ← λ^{-1} Y
  g ← λ^{-1} 1
  C ← λ^{-1} X
  S ← ∅
  while |S| < k do
      e ← ∞
      b ← 0
      for i ∈ {1, ..., d} \ S do
          u ← C_{:,i} (1 + (X_{:,i})^T C_{:,i})^{-1}
          e_i ← 0
          Ã ← A − u ((X_{:,i})^T A)
          for h ∈ {1, ..., t} do
              for j ∈ {1, ..., n} do
                  g̃_j ← g_j − u_j C_{j,i}
                  e_i ← e_i + (g̃_j)^{-2} (Ã_{j,h})^2
          if e_i < e then
              e ← e_i
              b ← i
      u ← C_{:,b} (1 + (X_{:,b})^T C_{:,b})^{-1}
      A ← A − u ((X_{:,b})^T A)
      for j ∈ {1, ..., n} do
          g_j ← g_j − u_j C_{j,b}
      C ← C − u ((X_{:,b})^T C)
      S ← S ∪ {b}
  W ← (X_{:,S})^T A
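  A NumPy sketch of Algorithm 2, under the assumption that A, g and C track G·Y, diag(G) and G·X for G = (λI + X_{:,S} X_{:,S}^T)^{-1}, updated with Sherman-Morrison rank-one updates as in the listing; the function name and the lam default are illustrative, and RLScore contains the authors' actual implementation.

```python
import numpy as np

def greedy_rls(X, Y, k, lam=1.0):
    """Multi-target greedy RLS sketch: rank-one updates instead of refitting.

    X: (n, d) data matrix, Y: (n, t) targets, k: number of features to select,
    lam: ridge regularization parameter."""
    n, d = X.shape
    A = Y / lam                   # A = G Y with G = (lam*I + X_S X_S^T)^{-1}, S empty
    g = np.full(n, 1.0 / lam)     # g = diag(G)
    C = X / lam                   # C = G X
    S = []
    for _ in range(k):
        best_err, best_i = np.inf, -1
        for i in range(d):
            if i in S:
                continue
            xi = X[:, i]
            u = C[:, i] / (1.0 + xi @ C[:, i])   # Sherman-Morrison update vector
            A_new = A - np.outer(u, xi @ A)
            g_new = g - u * C[:, i]
            # sum of squared LOO residuals over all samples and targets
            err = np.sum((A_new / g_new[:, None]) ** 2)
            if err < best_err:
                best_err, best_i = err, i
        xb = X[:, best_i]
        u = C[:, best_i] / (1.0 + xb @ C[:, best_i])
        A = A - np.outer(u, xb @ A)
        g = g - u * C[:, best_i]
        C = C - np.outer(u, xb @ C)
        S.append(best_i)
    W = X[:, S].T @ A             # final coefficients, W = (X_{:,S})^T A, shape (|S|, t)
    return W, S
```

  Scoring a candidate costs O(nt) here, so one full run is O(k·d·n·t), matching the linear-time claim of the previous slides.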

  13. Benchmarking greedy RLS and multi-task Lasso
  Table: Mulan datasets (Tsoumakas et al. 2011)

  Data set     Domain    Labels   Features   Instances
  Scene        image     6        294        2407
  Yeast        biology   14       103        2417
  Emotions     music     6        72         593
  Mediamill*   text      9        120        41583
  Delicious    text      983      500        16105
  Tmc2007      text      22       49060      28596

  14. Greedy RLS vs. Lasso
  [Figure: macro-averaged AUC (M.avg.AUC) as a function of the number of selected features on the Scene and Yeast data, comparing MT-Lasso and ML-gRLS.]

  15. Greedy RLS vs. Lasso
  [Figure: M.avg.AUC as a function of the number of selected features on the Emotions and Mediamill data, comparing MT-Lasso and ML-gRLS.]

  16. Greedy RLS vs. Lasso
  [Figure: M.avg.AUC as a function of the number of selected features on the Delicious and Tmc2007 data, comparing MT-Lasso and ML-gRLS.]

  17. Conclusion
  Greedy RLS: a linear time algorithm for (multi-target) feature selection that selects joint features for all target tasks.
  Competitive when the number of features to be selected is small.
  Applications in genome-wide association studies.
  RLScore open source implementation at https://github.com/aatapa/RLScore
