
  1. Second Workshop on Software Challenges to Exascale Computing (13th and 14th December 2018, New Delhi). A presentation on "A review of dimensionality reduction in high-dimensional data using multi-core and many-core architecture" by Mr. Siddheshwar Vilas Patil, Ph.D. Research Scholar (QIP, AICTE Scheme), under the guidance of Prof. Dr. D. B. Kulkarni, Registrar & Professor in Information Technology, Walchand College of Engineering, Sangli, MH, India (A Government Aided Autonomous Institute)

  2. Outline
  • Introduction
  • Dimensionality Reduction
  • Literature Review
  • Challenges
  • Parallel Computing Approaches
  • Conclusion
  • References
  1/24/2019 SCEC 2018

  3. Introduction
  • Massive amounts of high-dimensional data are now common.
  • Big Data: exponential growth and availability of data, characterized by the 3Vs (volume, velocity, variety).
  • This list was later extended with "Big Dimensionality" in Big Data.
  • The "Curse of Big Dimensionality" is driven by the explosion of features (thousands or even millions of features).
  • Early on, data scientists focused on huge numbers of instances while paying less attention to the feature aspect.

  4. Big Dimensionality: millions of dimensions (figure slide)

  5. Example: the libSVM Database
  • In the 1990s, the maximum dimensionality was about 62,000.
  • In the 2000s: 16 million.
  • In the 2010s: 29 million.
  • In this new scenario it is common to deal with millions of features, so existing learning methods need to be adapted.

  6. Summary of high-dimensional datasets (table slide)

  7. Scalability
  • Scalability is the effect that an increase in the size of the training set has on an algorithm's computational performance: accuracy, training time, and allocated memory.

  8. Methods to perform DR
  • Missing values
  • Low variance: consider a constant variable (all observations share the same value) in the data set; it cannot improve the power of the model because it has zero variance.
  • High correlation: it is not good to have multiple variables carrying similar information; use a Pearson correlation matrix to identify variables with high correlation.
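  The low-variance and high-correlation checks above can be sketched in a few lines of Python. This is a minimal illustration, not code from the presentation; the function name and both thresholds are my own choices.

  ```python
  import numpy as np

  def drop_low_variance_and_correlated(X, var_tol=1e-8, corr_tol=0.95):
      """Drop near-constant columns, then one of each highly correlated pair.

      X: (n_samples, n_features) array. Thresholds are illustrative.
      Returns the indices of the columns that are kept.
      """
      # Step 1: remove zero/near-zero variance features
      keep = [j for j in range(X.shape[1]) if np.var(X[:, j]) > var_tol]
      X = X[:, keep]
      # Step 2: Pearson correlation matrix over the surviving features
      corr = np.corrcoef(X, rowvar=False)
      selected = []
      for j in range(X.shape[1]):
          # keep column j only if it is not highly correlated with a kept one
          if all(abs(corr[j, k]) < corr_tol for k in selected):
              selected.append(j)
      return [keep[j] for j in selected]

  # Example: column 1 is constant, column 2 duplicates column 0
  X = np.array([[1.0, 5.0, 1.0, 0.3],
                [2.0, 5.0, 2.0, 0.1],
                [3.0, 5.0, 3.0, 0.9]])
  print(drop_low_variance_and_correlated(X))  # → [0, 3]
  ```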

  9. Dimensionality Reduction
  • Feature extraction: transforms the original features into a set of new features that are more compact and have stronger discriminating power.
  • Applications: image analysis, signal processing, and information retrieval

  10. Dimensionality Reduction
  • Feature selection: removes irrelevant and redundant features.
  • Two features are redundant to each other if their values are completely correlated.
  • A feature is irrelevant if it contains no information useful for the data mining task at hand.
  • A feature is relevant if it contains some information about the target (removing it will decrease classifier accuracy).

  11. Dimensionality reduction methods
  • Linear methods:
    – Principal Component Analysis (PCA)
    – Linear Discriminant Analysis (LDA)
    – Multidimensional Scaling (MDS)
    – Non-negative Matrix Factorization (NMF)
    – Lasso
  • Non-linear methods:
    – Locally Linear Embedding (LLE)
    – Isometric Feature Mapping (Isomap)
    – Hilbert-Schmidt Independence Criterion (HSIC)
    – Minimum Redundancy Maximum Relevance (mRMR)
  • Autoencoders (linear as well as non-linear)
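  As a concrete instance of the linear methods above, PCA can be sketched via the singular value decomposition of the centered data matrix. A minimal NumPy sketch, not taken from any of the cited implementations; the function name is hypothetical.

  ```python
  import numpy as np

  def pca_reduce(X, k):
      """Project X (n_samples, n_features) onto its top-k principal components."""
      Xc = X - X.mean(axis=0)              # center each feature
      # SVD of the centered data: rows of Vt are the principal directions,
      # ordered by decreasing singular value
      U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
      return Xc @ Vt[:k].T                 # (n_samples, k) reduced data

  rng = np.random.default_rng(0)
  X = rng.normal(size=(100, 10))
  Z = pca_reduce(X, 2)
  print(Z.shape)  # → (100, 2)
  ```

  The same projection could equally be computed from the eigenvectors of the covariance matrix; the SVD route is the numerically preferred one.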

  12. Feature selection methods
  • Individual evaluation (also known as feature ranking) assesses individual features by assigning them weights according to their degree of relevance.
  • Subset evaluation produces candidate feature subsets based on a certain search strategy; each candidate is compared with the previous best one with respect to a given measure.
  • While individual evaluation is incapable of removing redundant features (redundant features are likely to have similar rankings), subset evaluation can handle feature redundancy together with feature relevance.

  13. Feature Selection Steps
  • Feature selection is an optimization problem.
  • Step 1: search the space of possible feature subsets.
  • Step 2: pick the subset that is optimal or near-optimal with respect to some criterion.

  14. Feature Selection Steps (Cont'd)
  • Search strategies:
    – Exhaustive
    – Heuristic
  • Evaluation criteria:
    – Filter methods
    – Wrapper methods

  15. Search Strategies
  • Assuming d features, an exhaustive search would require examining all possible subsets of size m and selecting the subset that performs best according to the criterion.
  • Exhaustive search is usually impractical; in practice, heuristics are used to speed up the search.

  16. Evaluation Strategies
  • Filter methods:
    – Evaluation is independent of the classification method.
    – The criterion evaluates feature subsets based on their class discrimination ability (feature relevance), e.g. mutual information or correlation between the feature values and the class labels.
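  A filter criterion such as mutual information can be computed directly from empirical counts, with no classifier in the loop. A minimal sketch for discrete features; the function name and toy data are my own, not from the slides.

  ```python
  import numpy as np
  from collections import Counter

  def mutual_information(x, y):
      """MI (in nats) between two discrete sequences, from empirical counts."""
      n = len(x)
      px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
      # sum over observed joint values: p(a,b) * log(p(a,b) / (p(a) p(b)))
      return sum((c / n) * np.log((c / n) / ((px[a] / n) * (py[b] / n)))
                 for (a, b), c in pxy.items())

  # Rank two discrete features by relevance to the class label:
  # feature 0 equals the label, feature 1 is independent of it
  X = [[0, 1], [0, 0], [1, 1], [1, 0]]
  y = [0, 0, 1, 1]
  scores = [mutual_information([row[j] for row in X], y) for j in range(2)]
  print(int(np.argmax(scores)))  # → 0 (feature 0 is perfectly informative)
  ```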

  17. Evaluation Strategies
  • Wrapper methods:
    – Evaluation uses criteria related to the classification algorithm.
    – To compute the objective function, a classifier is built for each tested feature subset and its generalization accuracy is estimated (e.g. by cross-validation).

  18. Evaluation Strategies
  • Filter-based:
    – Chi-squared
    – Information gain
    – Correlation-based Feature Selection (CFS)
  • Wrapper methods:
    – Recursive feature elimination
    – Sequential feature selection algorithms
    – Genetic algorithms

  19. Feature Ranking
  • Evaluate all d features individually using the criterion, then select the top m features from this list.
  • Sequential forward selection (SFS) (heuristic search):
    – First, the best single feature is selected.
    – Then, pairs of features are formed using this best feature and each of the remaining features, and the best pair is selected.
    – Next, triplets are formed using these two best features and each of the remaining features, and the best triplet is selected.
    – This procedure continues until a predefined number of features has been selected.
  • Wrapper criteria (e.g. decision trees, linear classifiers) or filter criteria (e.g. mRMR) can be used.
  • Sequential backward selection (SBS) works in the opposite direction, starting from the full feature set and removing one feature at a time.
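  The SFS procedure above can be sketched as a short greedy loop over a pluggable scoring function (which could be a cross-validated accuracy or a filter criterion such as mRMR). A minimal illustration with a toy score of my own devising; none of the names come from the slides.

  ```python
  def sequential_forward_selection(features, score, m):
      """Greedy SFS: at each step, add the feature whose addition scores best.

      score(subset) returns a quality measure for a candidate feature subset.
      """
      selected = []
      while len(selected) < m:
          remaining = [f for f in features if f not in selected]
          best = max(remaining, key=lambda f: score(selected + [f]))
          selected.append(best)
      return selected

  # Toy criterion: each feature has a fixed utility; "a" and "c" are
  # redundant, so their combined contribution counts only once.
  utility = {"a": 3.0, "b": 2.0, "c": 2.9}

  def score(subset):
      s = sum(utility[f] for f in subset)
      if "a" in subset and "c" in subset:
          s -= 2.9  # redundancy penalty
      return s

  print(sequential_forward_selection(list(utility), score, 2))  # → ['a', 'b']
  ```

  Note how the redundancy penalty makes SFS skip "c" despite its high individual score, which is exactly what individual feature ranking cannot do.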

  20. Advantages of Dimensionality Reduction
  • Helps in data compression, and hence reduces storage space.
  • Reduces computation time.
  • Removes redundant and irrelevant features, if any.
  • Improves classification accuracy.

  21. Literature Review
  • Implementation of the Principal Component Analysis onto High-Performance Computer Facilities for Hyperspectral Dimensionality Reduction: Results and Comparisons
  • An Information Theory-Based Feature Selection Framework for Big Data Under Apache Spark
  • Ultra High-Dimensional Nonlinear Feature Selection for Big Biological Data

  22. Literature summary (Author | DR algorithm | Parallel programming model | H/W configuration | Datasets)
  • M. Yamada et al. [7] | Hilbert-Schmidt independence criterion lasso with least angle regression | MapReduce framework (Hadoop and Apache Spark) | Intel Xeon 2.4 GHz, 24 GB RAM (16 cores) | P53, Enzyme
  • Z. Wu et al. [12] | Principal component analysis | MapReduce framework (Hadoop and Apache Spark), MPI | Cloud computing (Intel Xeon E5630 CPUs (8 cores), 2.53 GHz, 5 GB RAM, 292 GB SAS HDD), cluster with 8 slaves (Intel Xeon E7-4807 CPUs (12 cores), 1.86 GHz) | AVIRIS Cuprite hyperspectral datasets
  • S. Ramirez-Gallego et al. [2] | Minimum redundancy maximum relevance (mRMR) | MapReduce on Apache Spark, CUDA on GPGPU | Cluster (18 computing nodes, 1 master node); computing nodes: Intel Xeon E5-2620, 6 cores/processor, 64 GB RAM | Epsilon, URL, Kddb

  23. Literature summary, continued (Author | DR algorithm | Parallel programming model | H/W configuration | Datasets)
  • E. Martel et al. [4] | Principal component analysis | CUDA on GPGPU | Intel Core i7-4790, 32 GB memory, NVIDIA GeForce GTX 680 GPU | Hyperspectral data
  • J. Zubova et al. [13] | Random projection | MPI | Cluster | URL, Kddb
  • L. Zhao et al. [5] | Distributed subtractive clustering | Cluster platforms | - | Economic data (China)
  • S. Cuomo et al. [8] | Singular value decomposition | CUDA on GPGPU | Intel Core i7, 8 GB RAM, 2.8 GHz, GPU NVIDIA Quadro K5000, 1536 CUDA cores | -
  • W. Li et al. [9] | Isometric mapping (ISOMAP) | CUDA on GPGPU | Intel Core i7-4790, 3.6 GHz, 8 cores, 32 GB RAM, GPU NVIDIA GTX 1080, 2560 CUDA cores, 8 GB RAM | HSI datasets: Indian Pines, Salinas, Pavia

  24. Challenges
  • Exponential growth in both dimensionality and sample size.
  • Existing algorithms do not always respond adequately when dealing with these new, extremely high dimensions.

  25. Challenges
  • Reducing data complexity is therefore crucial for data analysis tasks, knowledge inference using machine learning (ML) algorithms, and data visualization.
  • Example: feature selection in analyzing DNA microarrays, where there are many thousands of features but only a few tens to hundreds of samples.

