Learning From Data, Lecture 27: Learning Aides

  • Input Preprocessing
  • Dimensionality Reduction and Feature Selection
  • Principal Components Analysis (PCA)
  • Hints, Data Cleaning, Validation, . . .

M. Magdon-Ismail

CSCI 4100/6100

Learning Aides

Additional tools that can be applied to all techniques

  • Preprocess data to account for arbitrary choices made during data collection (input normalization)
  • Remove irrelevant dimensions that can mislead learning (PCA)
  • Incorporate known properties of the target function (hints and invariances)
  • Remove detrimental data (deterministic and stochastic noise)
  • Better ways to validate, i.e. estimate Eout, for model selection


Nearest Neighbor

  • Mr. Good and Mr. Bad were both given credit cards by the Bank of Learning (BoL).

    (Age in years, Income in $ × 1,000)
  • Mr. Good: (47, 35)
  • Mr. Bad: (22, 40)

  • Mr. Unknown, who has “coordinates” (21 yrs, $36K), applies for credit. Should the BoL give him credit, according to the nearest neighbor algorithm? What if income is measured in dollars instead of “K” (thousands of dollars)?

[Figure: the three applicants plotted with Age (yrs) against Income (K); in these units, Mr. Unknown is nearest to Mr. Bad.]

Nearest Neighbor Uses Euclidean Distance

  • Mr. Good and Mr. Bad were both given credit cards by the Bank of Learning (BoL).

    (Age in years, Income in $)
  • Mr. Good: (47, 35000)
  • Mr. Bad: (22, 40000)

  • Mr. Unknown, who has “coordinates” (21 yrs, $36,000), applies for credit. Should the BoL give him credit, according to the nearest neighbor algorithm? Measuring income in dollars instead of “K” changes the answer: the income axis now dominates the Euclidean distance, so Mr. Unknown is nearest to Mr. Good rather than Mr. Bad. An arbitrary choice of units flips the decision.

[Figure: the same three applicants plotted with Age (yrs) against Income ($); with income in raw dollars, Mr. Unknown is nearest to Mr. Good.]
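To make the unit sensitivity concrete, here is a minimal Python sketch (my own illustration, not from the lecture; the numbers follow the example above) that runs the nearest neighbor comparison in both unit systems:

```python
import numpy as np

# Credit histories: (age in years, income in $K)
good = np.array([47.0, 35.0])      # Mr. Good: credit card worked out
bad = np.array([22.0, 40.0])       # Mr. Bad: credit card did not
unknown = np.array([21.0, 36.0])   # Mr. Unknown: the new applicant

def nearest(query, points, names):
    """Return the name of the point nearest to query in Euclidean distance."""
    dists = [np.linalg.norm(query - p) for p in points]
    return names[int(np.argmin(dists))]

names = ["Mr. Good", "Mr. Bad"]

# Income in $K: both axes have comparable scale -> nearest is Mr. Bad (deny).
print(nearest(unknown, [good, bad], names))

# Income in raw dollars: the income axis dominates -> nearest is Mr. Good (approve).
scale = np.array([1.0, 1000.0])
print(nearest(unknown * scale, [good * scale, bad * scale], names))
```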


Uniform Treatment of Dimensions

Most learning algorithms treat every dimension the same:

  • Nearest neighbor: d(x, x′) = ‖x − x′‖
  • Weight decay: Ω(w) = λwᵀw
  • SVM: the margin is defined using Euclidean distance
  • RBF: the bump function decays with Euclidean distance

Input Preprocessing: unless you want to emphasize certain dimensions, preprocess the data so that each dimension is on an equal footing.


Input Preprocessing is a Data Transform

Each input is transformed, xₙ → zₙ = Φ(xₙ), so the data matrix

    X = [x₁ᵀ; x₂ᵀ; … ; x_Nᵀ]   becomes   Z = [z₁ᵀ; z₂ᵀ; … ; z_Nᵀ],

and the final hypothesis is g(x) = g̃(Φ(x)).

The raw {xₙ} have (for example) arbitrary scalings in each dimension; the transformed {zₙ} will not.


Centering

[Figure: raw data (x₁, x₂) vs. centered data (z₁, z₂).]

Centering:   zₙ = xₙ − x̄,  where x̄ = (1/N) Σₙ₌₁ᴺ xₙ   ⇒   z̄ = 0.
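A minimal numpy sketch of centering (my own toy data, not from the lecture):

```python
import numpy as np

# Synthetic raw data: 100 points with arbitrary offset and scale per dimension.
rng = np.random.default_rng(0)
X = rng.normal(loc=[5.0, -3.0], scale=[2.0, 0.5], size=(100, 2))

xbar = X.mean(axis=0)    # sample mean of each dimension
Z = X - xbar             # centering: zn = xn - xbar

print(Z.mean(axis=0))    # ~[0, 0]: the centered data has mean zero
```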

Normalizing

[Figure: raw data (x₁, x₂) → centered (z₁, z₂) → normalized (z₁, z₂).]

Normalizing (applied after centering):   zₙ = Dxₙ,  where Dᵢᵢ = 1/σᵢ and σᵢ is the standard deviation of dimension i   ⇒   z̄ = 0 and σ̃ᵢ = 1.

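The same sketch extended to normalizing; since D is diagonal, applying it is an elementwise rescale (again my own toy data):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=[5.0, -3.0], scale=[2.0, 0.5], size=(100, 2))  # toy raw data

Z = X - X.mean(axis=0)     # center first
d = 1.0 / Z.std(axis=0)    # diagonal entries of D: D_ii = 1/sigma_i
Z = Z * d                  # normalizing: zn = D zn, an elementwise rescale

print(Z.mean(axis=0))      # ~[0, 0]: still centered
print(Z.std(axis=0))       # ~[1, 1]: unit scale in every dimension
```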


Whitening

[Figure: raw data (x₁, x₂) → centered → normalized → whitened (z₁, z₂).]

Whitening (applied after centering):   zₙ = Σ^(−1/2)xₙ,  where Σ = (1/N) Σₙ₌₁ᴺ xₙxₙᵀ = (1/N)XᵀX   ⇒   Σ̃ = (1/N)ZᵀZ = I.

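For whitening, Σ^(−1/2) can be obtained from the eigendecomposition of the sample covariance: Σ = VΛVᵀ gives Σ^(−1/2) = VΛ^(−1/2)Vᵀ. A sketch on synthetic correlated data of my own:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2)) @ np.array([[2.0, 0.7], [0.0, 0.5]])  # correlated raw data
X = X - X.mean(axis=0)                 # center first

Sigma = (X.T @ X) / len(X)             # Sigma = (1/N) X^T X
vals, vecs = np.linalg.eigh(Sigma)     # Sigma = V diag(vals) V^T
Sigma_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T

Z = X @ Sigma_inv_sqrt                 # rows zn = Sigma^(-1/2) xn
print((Z.T @ Z) / len(Z))              # ~ identity: (1/N) Z^T Z = I
```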

Only Use Training Data For Preprocessing

WARNING!

Transforming the data into a more convenient format has a hidden trap that leads to data snooping.

When using a test set, determine the input transformation from the training data only. Rule: lock away the test data until you have your final hypothesis.

    D → D_train → input preprocessing z = Φ(x) → g(x) = g̃(Φ(x))
    D_test is set aside untouched and used only once, to compute E_test.

[Figure: cumulative profit (%) over 500 trading days for the same system with and without snooping in the input preprocessing.]
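A sketch of the rule in code (my own illustration; the data and split sizes are arbitrary): every preprocessing statistic is fit on the training split only, and the test split is merely transformed:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(loc=10.0, scale=3.0, size=(500, 4))
train, test = X[:400], X[400:]

# Fit the preprocessing on the training data ONLY.
mean = train.mean(axis=0)
std = train.std(axis=0)

Z_train = (train - mean) / std   # used to learn g
Z_test = (test - mean) / std     # same Phi applied unchanged; test data stays locked away

# Snooping would be: normalizing with X.mean(0) and X.std(0) before splitting.
```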

Principal Components Analysis

Original Data → Rotated Data

Rotate the data so that it is easy to

  • identify the dominant directions (information), and
  • throw away the smaller dimensions (noise).

[Figure: the original data in (x₁, x₂) beside the rotated data in (z₁, z₂).]

Projecting the Data to Maximize Variance

(Always center the data first.) Project each point onto a direction v:

    zₙ = xₙᵀv

Find the unit vector v that maximizes the variance of z.

[Figure: the original data with a candidate direction v drawn through it.]


Maximizing the Variance

var[z] = (1/N) Σₙ₌₁ᴺ zₙ²
       = (1/N) Σₙ₌₁ᴺ vᵀxₙxₙᵀv
       = vᵀ ( (1/N) Σₙ₌₁ᴺ xₙxₙᵀ ) v
       = vᵀΣv.

[Figure: the original data with the variance-maximizing direction v overlaid.]

Choose v = v₁, the top eigenvector of Σ: the top principal component.

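A numpy sketch (mine, on synthetic data) that computes v₁ as the top eigenvector of Σ and checks that the projected variance equals the top eigenvalue:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 2)) @ np.array([[3.0, 1.0], [0.0, 0.5]])
X = X - X.mean(axis=0)                    # always center the data first

Sigma = (X.T @ X) / len(X)                # sample covariance
vals, vecs = np.linalg.eigh(Sigma)        # eigenvalues in ascending order
v1 = vecs[:, -1]                          # top principal component

z = X @ v1                                # zn = xn^T v1
print(z.var(), vals[-1])                  # projected variance = top eigenvalue
```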

The Principal Components

z₁ = xᵀv₁,  z₂ = xᵀv₂,  z₃ = xᵀv₃,  …

where v₁, v₂, …, v_d are the eigenvectors of Σ with eigenvalues λ₁ ≥ λ₂ ≥ · · · ≥ λ_d.

Theorem [Eckart–Young]. These directions give the best reconstruction of the data; they also capture the maximum variance.

[Figure: the original data with the principal directions overlaid.]

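To illustrate the reconstruction view (again my own sketch on synthetic data, not the lecture's code): keep the top k principal components, reconstruct, and measure the % reconstruction error, as plotted for the digits data on the next slide:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 50)) @ rng.normal(size=(50, 50))  # correlated 50-d data
X = X - X.mean(axis=0)

Sigma = (X.T @ X) / len(X)
vals, vecs = np.linalg.eigh(Sigma)
V = vecs[:, ::-1]                       # columns v1, v2, ... by decreasing eigenvalue

for k in (1, 5, 20, 50):
    Vk = V[:, :k]
    Xhat = X @ Vk @ Vk.T                # reconstruct from the top-k features z = X Vk
    err = np.sum((X - Xhat) ** 2) / np.sum(X ** 2)
    print(k, f"{100 * err:.1f}% reconstruction error")
```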

PCA Features for Digits Data

[Figures: left, % reconstruction error vs. the number of components k; right, the digits data plotted in its top two PCA features (z₁, z₂), labeled “1” vs. “not 1”.]

Principal components are automated: they capture the dominant directions of the data. Those directions may not be the dominant dimensions for predicting the target f.


Other Learning Aides

  • 1. Nonlinear dimension reduction:

[Figure: three (x₁, x₂) panels illustrating a nonlinear dimension reduction.]

  • 2. Hints (invariances and prior information):

rotational invariance, monotonicity, symmetry, . . .

  • 3. Removing noisy data:

[Figure: three panels of the digits data plotted by intensity and symmetry, illustrating the removal of noisy data.]

  • 4. Advanced validation techniques: Rademacher and permutation penalties.

    More efficient than cross-validation; more convenient and accurate than the VC bound.
