SLIDE 1

Center for Evolutionary Medicine and Informatics

Sparse Screening for Exact Data Reduction

Jieping Ye

Arizona State University


Joint work with Jie Wang and Jun Liu


SLIDE 2

[Figure: wide data calls for feature reduction; tall data calls for sample reduction.]

SLIDE 3

Sparse Screening: A New Framework for Exact Data Reduction

The model learnt from the reduced data is identical to the model learnt from the full data. We focus on two models in this talk:

  • Lasso for wide data (feature reduction)
  • SVM for tall data (sample reduction)

SLIDE 4

SLIDE 5

Lasso/Basis Pursuit

(Tibshirani, 1996; Chen, Donoho, and Saunders, 1999)

$y = Ax + z$, where $y, z \in \mathbb{R}^{n}$, $A \in \mathbb{R}^{n \times p}$, and $x \in \mathbb{R}^{p}$ is sparse.

Simultaneous feature selection and regression
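Not on the slides: a minimal scikit-learn sketch on synthetic data showing the simultaneous selection and regression (scikit-learn's `alpha` plays the role of $\lambda$, with a $1/n$ scaling on the squared loss).

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 100, 1000                        # wide data: p >> n
A = rng.standard_normal((n, p))
x_true = np.zeros(p)
x_true[:5] = 3.0                        # only 5 relevant features
y = A @ x_true + 0.1 * rng.standard_normal(n)

model = Lasso(alpha=0.5).fit(A, y)
print(np.count_nonzero(model.coef_))    # small; most coefficients are exactly zero
```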

SLIDE 6

Imaging Genetics

(Thompson et al., 2013)


SLIDE 7

Sparse Reduced-Rank Regression


Vounou et al. (2010, 2012)

SLIDE 8

Structured Sparse Models


  • Group Lasso
  • Tree Lasso
  • Fused Lasso
  • Graph Lasso

SLIDE 9


Sparsity has become an important modeling tool in genomics, genetics, signal and audio processing, image processing, neuroscience (theory of sparse coding), machine learning, statistics …

SLIDE 10

Optimization Algorithms

  • Coordinate descent
  • Subgradient descent
  • Augmented Lagrangian Method
  • Gradient descent
  • Accelerated gradient descent


$\min_{x}\ \mathrm{loss}(x) + \lambda \cdot \mathrm{penalty}(x)$
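Not on the slides: a minimal sketch of the proximal gradient method (ISTA), one standard solver for this composite objective, instantiated for the Lasso with loss(x) = ½‖y − Ax‖² and penalty(x) = ‖x‖₁.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam, n_iters=500):
    """Proximal gradient descent for 0.5 * ||y - A x||^2 + lam * ||x||_1."""
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        grad = A.T @ (A @ x - y)         # gradient of the smooth part
        x = soft_threshold(x - grad / L, lam / L)
    return x
```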

SLIDE 11

Solvers in the SLEP package: Lasso, Fused Lasso, Group Lasso, Sparse Group Lasso, Tree Structured Group Lasso, Overlapping Group Lasso, Sparse Inverse Covariance Estimation, Trace Norm Minimization.

http://www.public.asu.edu/~jye02/Software/SLEP/


SLIDE 12

More Efficiency?


  • Very high dimensional data
  • Non-smooth sparsity-inducing norms
  • Multiple runs in model selection
  • A large number of runs in permutation tests

SLIDE 13

How to make any existing Lasso solver much more efficient?


SLIDE 14

Data Reduction/Compression

[Figure: original data with 1M features compressed to reduced data with 1K features.]

SLIDE 15

Data Reduction

  • Heuristic-based data reduction
– Sure screening, random projection/selection
– The resulting model is an approximation of the true model

  • Proposed data reduction method
– Exact data reduction via sparse screening
– The model based on the reduced data is identical to the one constructed from the complete data

SLIDE 16

Sparse Screening

[Figure: without screening, the solver runs on all 1M features; with screening, it runs on only 1K features, and both produce the same solution.]

SLIDE 17

Large-Scale Sparse Screening

SLIDE 18

Screening Rule: Motivation

El Ghaoui, Viallon, and Rabbani (2012), the SAFE rules.

SLIDE 19

Large-Scale Sparse Screening (Cont’d)

SLIDE 20

More on the Dual Formulation

  • Solving the dual formulation is difficult
  • Providing a good (not exact) estimate of the optimal dual solution is easier
  • A good estimate of the optimal dual solution is sufficient for effective feature screening, as made precise below
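For reference (a reconstruction, not text from the slide): writing the Lasso as $\min_x \frac{1}{2}\|y - Ax\|_2^2 + \lambda\|x\|_1$ with $a_j$ the $j$-th column of $A$, the dual problem used by these screening rules is

$\sup_{\theta}\ \frac{1}{2}\|y\|_2^2 - \frac{\lambda^2}{2}\Big\|\theta - \frac{y}{\lambda}\Big\|_2^2 \quad \text{s.t.}\quad |a_j^{\top}\theta| \le 1,\ j = 1, \dots, p,$

with the primal-dual link $\theta^* = (y - Ax^*)/\lambda$. The KKT conditions yield the screening implication $|a_j^{\top}\theta^*| < 1 \Rightarrow x_j^* = 0$, so any feature whose dual correlation can be certified to be below 1 is safely discarded before solving.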


SLIDE 21

Screening Rule


SLIDE 22

Sketch of Sparse Screening

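The sketch itself did not survive extraction; in outline, rules of this family estimate a region $\Theta$ guaranteed to contain the dual optimum $\theta^*$ and apply the relaxed test

$\max_{\theta \in \Theta} |a_j^{\top}\theta| < 1 \;\Rightarrow\; x_j^* = 0.$

When $\Theta$ is a ball $B(o, r)$, the test takes the closed form $|a_j^{\top}o| + r\,\|a_j\|_2 < 1$, which costs one inner product per feature.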

SLIDE 23

How to Estimate the Region Θ?

  • J. Wang et al. NIPS’13; J. Liu et al. ICML’14

Non-expansiveness of the projection $P_C$ onto a closed convex set $C$: $\|P_C(u) - P_C(v)\|_2 \le \|u - v\|_2$.

SLIDE 24

Enhanced DPP

EDPP estimates the dual optimum by projecting rays onto the dual feasible set.
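The slide's definitions did not survive extraction; the following is a reconstruction of the EDPP quantities from Wang et al. (NIPS'13; JMLR 2015) and should be checked against the paper. With $\lambda_{\max} = \max_j |a_j^{\top}y|$, a previously solved parameter $\lambda_0 \in (0, \lambda_{\max}]$, and its dual optimum $\theta^*(\lambda_0)$, define

$v_1(\lambda_0) = \begin{cases} y/\lambda_0 - \theta^*(\lambda_0), & \lambda_0 < \lambda_{\max},\\ \operatorname{sign}(a_*^{\top}y)\,a_*, & \lambda_0 = \lambda_{\max}, \end{cases} \qquad a_* = \arg\max_{a_j} |a_j^{\top}y|,$

$v_2(\lambda, \lambda_0) = y/\lambda - \theta^*(\lambda_0), \qquad v_2^{\perp} = v_2 - \frac{\langle v_1, v_2\rangle}{\|v_1\|_2^2}\,v_1.$

Enhanced DPP then discards feature $j$ at the new parameter $\lambda$ whenever

$\Big|a_j^{\top}\Big(\theta^*(\lambda_0) + \tfrac{1}{2}v_2^{\perp}\Big)\Big| < 1 - \tfrac{1}{2}\,\|v_2^{\perp}\|_2\,\|a_j\|_2.$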

SLIDE 25

Firmly Non-expansive Projection

Non-expansiveness: $\|P_C(u) - P_C(v)\|_2 \le \|u - v\|_2$

Firm non-expansiveness: $\|P_C(u) - P_C(v)\|_2^2 \le \langle P_C(u) - P_C(v),\ u - v \rangle$

SLIDE 26

Results on MNIST for a sequence of 100 parameter values on the λ/λ_max scale from 0.05 to 1. The data matrix is of size 784 × 50,000.

SLIDE 27

Evaluation on MNIST

Running time (in seconds) along the parameter sequence:

Method | Time (s)
-------|---------
Solver | 2245.26
SAFE   |  685.12
DPP    |  233.85
EDPP   |   45.56
SDPP   |    9.34

[Chart: speedup of SAFE, DPP, EDPP, and SDPP over the plain solver.]

SLIDE 28

Evaluation on ADNI

  • Problem: GWAS to MRI ROI prediction (ADNI)
– The data matrix is of size 747 × 504,095

Running time (in seconds) of the Lasso solver, the strong rule (SR; Tibshirani et al., 2012), and EDPP. The parameter sequence contains 100 values on the logarithmic λ/λ_max scale from 0.95^100 to 0.95.

Method       | ROI3     | ROI8     | ROI30    | ROI69    | ROI76    | ROI83
-------------|----------|----------|----------|----------|----------|---------
Lasso Solver | 37975.31 | 37097.25 | 38258.72 | 36926.81 | 38116.29 | 37251.03
SR           |    84.06 |    84.44 |    84.70 |    83.09 |    82.76 |    85.39
SR+Lasso     |   217.08 |   215.90 |   223.39 |   214.36 |   212.04 |   211.57
EDPP         |    43.56 |    45.75 |    45.70 |    45.01 |    44.31 |    44.16
EDPP+Lasso   |   183.64 |   190.43 |   182.87 |   170.71 |   177.41 |   178.98

SLIDE 29

Sparse Screening Extensions

  • Group Lasso
– J. Wang, J. Liu, J. Ye. Efficient Mixed-Norm Regularization: Algorithms and Safe Screening Methods. arXiv preprint arXiv:1307.4156.

  • Sparse Logistic Regression
– J. Wang, J. Zhou, P. Wonka, J. Ye. A Safe Screening Rule for Sparse Logistic Regression. arXiv preprint arXiv:1307.4145.

  • Sparse Inverse Covariance Estimation
– S. Huang, J. Li, L. Sun, J. Liu, T. Wu, K. Chen, A. Fleisher, E. Reiman, J. Ye. Learning brain connectivity of Alzheimer's disease by exploratory graphical models. NeuroImage 50, 935-949.
– Witten, Friedman, and Simon (2011); Mazumder and Hastie (2012)

  • Multiple Graphical Lasso
– S. Yang, Z. Pan, X. Shen, P. Wonka, J. Ye. Fused Multiple Graphical Lasso. arXiv preprint arXiv:1209.2139.

SLIDE 30

Wide versus Tall Data

[Figure: wide data (many features, few samples) versus tall data (many samples, few features).]

SLIDE 31

Support Vector Machines

  • SVM is a maximum margin classifier.

[Figure: two classes (+1 and −1) separated by a maximum-margin hyperplane.]
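Not on the slide, but useful for the screening discussion: the standard soft-margin SVM primal is

$\min_{w,\,b,\,\xi}\ \frac{1}{2}\|w\|_2^2 + C\sum_{i=1}^{n}\xi_i \quad \text{s.t.}\quad y_i(w^{\top}x_i + b) \ge 1 - \xi_i,\ \xi_i \ge 0,$

whose margin is $2/\|w\|_2$.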

SLIDE 32

Support Vectors

  • SVM is determined by the so-called support vectors.

[Figure: the support vectors are the data points that the margin pushes up against (classes +1 and −1).]

The non-support vectors are irrelevant to the classifier. Can we make use of this observation? A quick numerical check follows.
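Not from the slides: a minimal scikit-learn check of this observation on synthetic data (an illustration, not the screening method itself). Dropping the non-support vectors and retraining reproduces the classifier up to solver tolerance.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two-class synthetic data.
X, y = make_blobs(n_samples=500, centers=2, random_state=0)

full = SVC(kernel="linear", C=1.0).fit(X, y)

# Retrain on the support vectors of the full model only.
sv = full.support_
reduced = SVC(kernel="linear", C=1.0).fit(X[sv], y[sv])

# Both classifiers coincide up to solver tolerance.
print(np.linalg.norm(full.coef_ - reduced.coef_))    # ~0
print(np.abs(full.intercept_ - reduced.intercept_))  # ~0
```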

SLIDE 33

The Idea of Sample Screening

[Diagram: original problem → screening → smaller problem to solve.]

SLIDE 34

Guidelines for Sample Screening


  • J. Wang, P. Wonka, and J. Ye. ICML’14.
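The guideline formulas did not survive extraction; what follows is a summary (consistent with the standard C-SVM optimality conditions, not the slide text) of the KKT facts that make sample screening possible. With dual variables $\alpha_i \in [0, C]$ and decision function $f(x) = w^{*\top}x + b^*$:

$y_i f(x_i) > 1 \;\Rightarrow\; \alpha_i^* = 0$ (a non-support vector: sample $i$ can be discarded),
$y_i f(x_i) < 1 \;\Rightarrow\; \alpha_i^* = C.$

A screening rule certifies one of these inequalities from a cheap estimate of the optimum, fixing $\alpha_i^*$ without solving the full problem.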

SLIDE 35

Relaxed Guidelines


SLIDE 36

Sketch of SVM Screening


SLIDE 37

Synthetic Studies


  • We use the rejection rate to measure the performance of a screening rule: the ratio of the number of data instances whose membership can be identified by the rule to the total number of data instances.

SLIDE 38

Performance of DVI for SVM on Real Data Sets


Comparison of SSNSV (Ogawa et al., ICML'13), ESSNSV, and DVI for SVM on three real data sets. Running times are in seconds.

Data set  | Method          | Rule (s) | Init. (s) | Total (s) | Speedup
----------|-----------------|----------|-----------|-----------|--------
IJCNN     | Solver          |          |           |   4669.14 |
IJCNN     | Solver + SSNSV  |     2.08 |     92.45 |   2018.55 |    2.31
IJCNN     | Solver + ESSNSV |     2.09 |     91.33 |   1552.72 |    3.01
IJCNN     | Solver + DVI    |     0.99 |     42.67 |    828.02 |    5.64
Wine      | Solver          |          |           |     76.52 |
Wine      | Solver + SSNSV  |     0.02 |      1.56 |     21.85 |    3.50
Wine      | Solver + ESSNSV |     0.03 |      1.60 |     17.17 |    4.47
Wine      | Solver + DVI    |     0.01 |      0.67 |     11.62 |    6.59
Covertype | Solver          |          |           |   1675.46 |
Covertype | Solver + SSNSV  |     2.73 |     35.52 |    220.58 |    7.60
Covertype | Solver + ESSNSV |     2.89 |     36.13 |    156.23 |   10.72
Covertype | Solver + DVI    |     1.27 |     12.57 |     21.26 |   79.18

SLIDE 39

Experiments on Real Data Sets


Running time (in seconds) of the solver with and without DVI for LAD on three real data sets.

Data set  | Method       | Rule (s) | Init. (s) | Total (s) | Speedup
----------|--------------|----------|-----------|-----------|--------
Telescope | Solver       |          |           |    122.34 |
Telescope | Solver + DVI |     0.28 |      0.12 |     12.14 |    9.86
Computer  | Solver       |          |           |      5.85 |
Computer  | Solver + DVI |     0.08 |      0.05 |      0.28 |   19.21
Telescope | Solver       |          |           |     21.43 |
Telescope | Solver + DVI |     0.06 |      0.10 |      0.19 |  114.91

SLIDE 40

Resource

  • Tutorial webpages for our screening rules, including sample code, implementation instructions, illustration materials, etc.:
http://www.public.asu.edu/~jwang237/screening.html
  • Seven-line implementation of the EDPP rule (a rough Python sketch follows)
  • The list is growing quickly
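Not from the slides: a rough NumPy sketch of the EDPP test following the reconstruction on SLIDE 24. The compact reference implementation lives on the tutorial page above; the function name and layout here are my own.

```python
import numpy as np

def edpp_discard(A, y, lam, lam0, theta0):
    """Sketch of the EDPP screening test. Returns a boolean mask over
    features, True where the Lasso coefficient is guaranteed to be
    zero at the new parameter `lam`.

    A      -- (n, p) design matrix with columns a_j
    y      -- (n,) response vector
    lam    -- new parameter, 0 < lam < lam0
    lam0   -- previously solved parameter, lam0 <= lam_max
    theta0 -- dual optimum at lam0, i.e. (y - A @ x_star) / lam0
    """
    corr = A.T @ y
    lam_max = np.max(np.abs(corr))
    if np.isclose(lam0, lam_max):
        j = np.argmax(np.abs(corr))            # a_* achieving lam_max
        v1 = np.sign(corr[j]) * A[:, j]
    else:
        v1 = y / lam0 - theta0
    v2 = y / lam - theta0
    v2_perp = v2 - (v1 @ v2) / (v1 @ v1) * v1  # component orthogonal to v1
    lhs = np.abs(A.T @ (theta0 + 0.5 * v2_perp))
    rhs = 1.0 - 0.5 * np.linalg.norm(v2_perp) * np.linalg.norm(A, axis=0)
    return lhs < rhs
```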

SLIDE 41

Summary

  • Developed exact data reduction approaches
– Exact data reduction via feature screening
– Exact data reduction via sample screening
– The model based on the reduced data is identical to the one constructed from the complete data

  • Results show that screening leads to a significant speedup

  • Extending exact data reduction to other sparse learning formulations
– Sparsity on features, samples, networks, etc.