
RF + RLSC
Kari Torkkola, Motorola, Intelligent Systems Lab, Tempe, AZ, USA, Kari.Torkkola@motorola.com
Eugene Tuv, Intel, Analysis and Control Technology, Chandler, AZ, USA, eugene.tuv@intel.com
NIPS 2003 Feature Selection Workshop

RF + RLSC
• Random Forests (RF) for feature selection
• Regularized Least Squares Classifiers (RLSC)
• Stochastic ensembles of RLSCs

Why Random Forests for Feature Selection?
• Basic idea: Train a classifier, then extract the features that are important to the classifier
• Features are not chosen in isolation!
• RF is extremely fast to train
• Allows for mixed data types and missing values

Random Forests for Feature Selection - How?
• RF
  – Trains a large forest of decision trees
  – Samples the training data for each tree
  – Samples the features to make each split
  – Error estimation from out-of-bag cases
  – Proximity measures, importance measures, …
• An Importance Measure
  – A split in a tree on a particular variable results in a decrease of the Gini index
  – The sum of these decreases over the forest ranks features by importance
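The importance measure above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names, the `(feature, parent, left, right)` record format, and the choice to weight each decrease by the number of cases in the split node are all assumptions made for the example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum_k p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_decrease(parent, left, right):
    """Impurity decrease of one split.

    Assumption: the decrease is weighted by the number of cases in the
    parent node, so splits near the root count more than tiny deep splits.
    """
    n = len(parent)
    children = (len(left) * gini(left) + len(right) * gini(right)) / n
    return n * (gini(parent) - children)

def rank_features(splits):
    """Sum the Gini decreases per feature over every recorded split.

    `splits` is a hypothetical flat list of (feature, parent_labels,
    left_labels, right_labels) tuples collected from all trees.
    """
    importance = Counter()
    for feature, parent, left, right in splits:
        importance[feature] += gini_decrease(parent, left, right)
    return importance.most_common()  # highest importance first

# Three splits gathered from a toy forest (illustrative data only).
splits = [
    ("x1", [0, 0, 1, 1], [0, 0], [1, 1]),  # perfect split on x1
    ("x2", [0, 0, 1, 1], [0, 1], [0, 1]),  # uninformative split on x2
    ("x1", [0, 1, 1, 1], [0], [1, 1, 1]),  # another pure split on x1
]
ranking = rank_features(splits)
```

Here `ranking` lists `x1` ahead of `x2`, because only the splits on `x1` reduce impurity. A cut-off such as the one reported for Madelon would then be read off the sorted importance values.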

Challenge Examples
Madelon
• 500 variables; the training set has 2000 cases
• Constructed 500 trees
• Variable importance has a clear cut-off point at 19 variables
• Validation set: 600 cases
• The top 19 variables are the same, but the cut-off point is not as clear
Dexter
• 20000 variables; 300 cases in both the training and the validation sets
• The top 50 variables from the two sets are 70% shared (stability)
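The stability figure quoted for Dexter is the fraction of top-ranked variables that two importance rankings have in common. A small sketch of that computation follows; the function name and the dictionary format for importance scores are hypothetical, and the numbers are made up for illustration.

```python
def top_k_overlap(importance_a, importance_b, k):
    """Fraction of the top-k features shared by two importance rankings.

    Each argument maps a feature name to its importance score.
    """
    top_a = set(sorted(importance_a, key=importance_a.get, reverse=True)[:k])
    top_b = set(sorted(importance_b, key=importance_b.get, reverse=True)[:k])
    return len(top_a & top_b) / k

# Illustrative scores from two hypothetical runs (e.g. training vs. validation).
train_scores = {"f1": 0.9, "f2": 0.5, "f3": 0.1, "f4": 0.05}
valid_scores = {"f1": 0.8, "f3": 0.6, "f2": 0.1, "f4": 0.0}
shared = top_k_overlap(train_scores, valid_scores, k=2)
```

In this toy case the two top-2 sets share only `f1`, so `shared` is 0.5.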

Why Ensembles of RLSCs as Classifiers?
• Why not just use RF?
  – The base learner is not good enough!
• RLSC solves a simple linear problem
  Given data $(x_i, y_i)_{i=1}^{m}$, find $f : X \to Y$ that generalizes:
  1. Choose a kernel, such as $K(x, x') = e^{-\|x - x'\|^2 / (2\sigma^2)}$,
  2. $f(x) = \sum_{i=1}^{m} c_i K_{x_i}(x)$, where the $c_i$ are the solution to $(m \gamma I + K) c = y$
• The square loss function works well in binary classification (Poggio, Smale, et al.)
• Use minimum regularization (just enough to guarantee a solution) to reduce bias; sample cases to produce diversity in the base learners
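The two steps above, plus the case sampling used to diversify the base learners, can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: all function names are hypothetical, the linear system is solved with a small hand-written Gaussian elimination to keep the example dependency-free, cases are sampled without replacement, and the ensemble output is taken to be the plain average of the base learners' outputs (the slides do not say how outputs are combined).

```python
import math
import random

def gaussian_kernel(a, b, sigma):
    """K(x, x') = exp(-||x - x'||^2 / (2 sigma^2))."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.

    Assumes A is nonsingular, which holds here whenever gamma > 0.
    """
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def train_rlsc(X, y, sigma, gamma):
    """Return the coefficients c solving (m*gamma*I + K) c = y."""
    m = len(X)
    A = [[gaussian_kernel(X[i], X[j], sigma) + (m * gamma if i == j else 0.0)
          for j in range(m)] for i in range(m)]
    return solve(A, y)

def predict_rlsc(X, c, sigma, x):
    """f(x) = sum_i c_i K(x_i, x); the sign gives the predicted class."""
    return sum(ci * gaussian_kernel(xi, x, sigma) for ci, xi in zip(c, X))

def train_ensemble(X, y, sigma, gamma, n_models, fraction, seed=0):
    """Train each RLSC on a random subset of the cases (no replacement)."""
    rng = random.Random(seed)
    k = max(2, int(round(fraction * len(X))))
    models = []
    for _ in range(n_models):
        idx = rng.sample(range(len(X)), k)
        Xs = [X[i] for i in idx]
        ys = [y[i] for i in idx]
        models.append((Xs, train_rlsc(Xs, ys, sigma, gamma)))
    return models

def predict_ensemble(models, sigma, x):
    """Assumption: combine base learners by averaging their real-valued outputs."""
    return sum(predict_rlsc(Xs, c, sigma, x) for Xs, c in models) / len(models)
```

With labels coded as -1 and +1, a call such as `train_ensemble(X, y, sigma=1.0, gamma=1e-3, n_models=5, fraction=0.67)` followed by `predict_ensemble(models, 1.0, x)` returns a score whose sign is the predicted class. The small `gamma` mirrors the "minimum regularization" bullet: it is only there to keep the linear system solvable.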

Things to worry about with RLSC Ensembles
• Kernel and its parameters?
• How many classifiers in the ensemble?
• What fraction of data to use to train each?
• How much to regularize (if at all)?
• Determine all of the above by cross-validation

Future Directions
• RF as one type of supervised kernel generator using the pairwise similarities
• Similarity between 2 cases could be defined (for a single tree) as the total number of common parent nodes, normalized by the level of the deepest case, and summed up for the ensemble
• The minimum number of common parents needed to define a nonzero similarity is another parameter, acting like the width in Gaussian kernels
• Works for any type of data (numeric, categorical, mixed, missing values)!
• Feature selection bypassed altogether!
[Figure: two panels, "Arcene: Gaussian kernel" and "Arcene: Supervised kernel"]
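One possible reading of the similarity defined above is sketched below. Everything here is an interpretation made for illustration: each case is represented by the list of parent (internal) nodes it passes through in a tree, "common parent nodes" is taken as the length of the shared path prefix, "level of the deepest case" as the longer of the two path lengths, and the function names and path format are hypothetical.

```python
def common_prefix_len(path_a, path_b):
    """Number of leading nodes two root-to-leaf parent paths share."""
    n = 0
    for u, v in zip(path_a, path_b):
        if u != v:
            break
        n += 1
    return n

def tree_similarity(path_a, path_b, min_common=1):
    """Common parents divided by the depth of the deeper case.

    Pairs sharing fewer than `min_common` parents get similarity 0;
    this threshold plays the role compared to a Gaussian kernel width.
    """
    common = common_prefix_len(path_a, path_b)
    if common < min_common:
        return 0.0
    return common / max(len(path_a), len(path_b))

def forest_kernel(paths_per_tree, min_common=1):
    """Sum the per-tree similarities over the ensemble.

    paths_per_tree[t][i] is the parent-node path of case i in tree t.
    """
    m = len(paths_per_tree[0])
    K = [[0.0] * m for _ in range(m)]
    for paths in paths_per_tree:
        for i in range(m):
            for j in range(m):
                K[i][j] += tree_similarity(paths[i], paths[j], min_common)
    return K

# Two toy trees and three cases; node ids are arbitrary labels.
paths_per_tree = [
    [["r", "a"], ["r", "a"], ["r", "b", "c"]],
    [["r"], ["r", "x"], ["r", "x"]],
]
K = forest_kernel(paths_per_tree)
```

In this toy example cases 0 and 1 fall together in the first tree and partly together in the second, so `K[0][1]` is 1.5. A matrix built this way could be plugged into the RLSC system in place of the Gaussian kernel, although whether it is positive semidefinite is not addressed by this sketch.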

Conclusion
• RF: Fast and robust feature selection
• RLSC: linear problem-solving
• Supervised kernels
• What we don’t know…
