Using Meta-learning for Model Type Selection in Predictive Big Data Analytics
Mustafa Nural, Hao Peng, John A. Miller Department of Computer Science University of Georgia
What is Predictive Analytics?
The process of building a
capture the relationships between variables in order to
$y = f(x) + \epsilon$
[Figure: meta-learning workflow diagram. Labels: Training Datasets, Modeling Techniques, Performance Statistics; Report most predictive technique for each dataset; Feature Extraction; Training Set; Train Meta-learner; Meta-learning; Candidate Dataset; Candidate's Meta-features; Suggestion Engine; Most Predictive Technique(s)]
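The suggestion step can be illustrated with a minimal sketch, assuming a k-nearest-neighbor meta-learner over meta-feature vectors. The function name `suggest_techniques`, the three-element meta-feature vectors, and the technique labels are hypothetical placeholders chosen for illustration, not the authors' implementation or data.

```python
import math
from collections import Counter

def suggest_techniques(training_set, candidate_features, k=3, top_n=3):
    """Rank techniques by votes from the k nearest training datasets.

    training_set: list of (meta_feature_vector, best_technique) pairs.
    candidate_features: meta-feature vector of the candidate dataset.
    """
    # Sort training datasets by Euclidean distance to the candidate.
    neighbors = sorted(
        training_set,
        key=lambda row: math.dist(row[0], candidate_features),
    )[:k]
    # Each neighbor votes for the technique that performed best on it.
    votes = Counter(label for _, label in neighbors)
    return [technique for technique, _ in votes.most_common(top_n)]

# Hypothetical training set: (meta-features, most predictive technique).
training_set = [
    ((0.90, 1.00, 0.00), "Ridge Regression"),
    ((0.85, 0.95, 0.05), "Ridge Regression"),
    ((0.10, 0.20, 0.80), "Quadratic Regression"),
    ((0.15, 0.30, 0.70), "Quadratic Regression"),
    ((0.50, 0.60, 0.40), "Lasso Regression"),
]

suggestion = suggest_techniques(training_set, (0.88, 0.97, 0.03), k=3)
print(suggestion)
```

Returning a ranked list rather than a single label lets the suggestion be scored with rank-aware measures, which matters when several techniques perform almost equally well on a dataset.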
Meta-features
response, distinct ratio of response, % numeric, % categorical, % binary variables
numeric variables
problems
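A few of the simple meta-features named above (distinct ratio of the response and the percentages of numeric, categorical, and binary variables) could be computed roughly as follows. This is a sketch under stated assumptions: the function name `extract_meta_features`, the column-dictionary input format, and the rule that a two-valued column counts as binary rather than numeric are all illustrative choices, not taken from the original slides.

```python
def extract_meta_features(columns, response):
    """Compute simple meta-features for one dataset.

    columns: dict mapping predictor name -> list of values.
    response: list of response values.
    """
    def is_binary(values):
        # Assumption: any column with exactly two distinct values is binary.
        return len(set(values)) == 2

    def is_numeric(values):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in values
        )

    n_vars = len(columns)
    n_binary = sum(1 for c in columns.values() if is_binary(c))
    n_numeric = sum(
        1 for c in columns.values() if is_numeric(c) and not is_binary(c)
    )
    n_categorical = n_vars - n_binary - n_numeric

    return {
        "distinct_ratio_response": len(set(response)) / len(response),
        "pct_numeric": 100.0 * n_numeric / n_vars,
        "pct_categorical": 100.0 * n_categorical / n_vars,
        "pct_binary": 100.0 * n_binary / n_vars,
    }

# Hypothetical toy dataset with four predictors and four rows.
columns = {
    "age": [23, 35, 41, 52],
    "income": [30.5, 42.0, 58.2, 61.9],
    "city": ["A", "B", "C", "A"],
    "smoker": [0, 1, 0, 1],
}
response = [1.2, 3.4, 3.4, 5.0]

meta = extract_meta_features(columns, response)
print(meta)
```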
Regression (ScalaTion)
Regression (ScalaTion)
(Quadratic Expansion) (ScalaTion)
Expansion) (ScalaTion)
(ScalaTion)
(ScalaTion)
(R)
(R)
etc.
dataset/technique to get more reliable estimates
techniques
[Figure: bar chart comparing Random Forest and kNN on LA@1, LA@3, MAP@3, NDCG@1, and NDCG@3 (axis 0.00 to 1.00). Bar values as extracted: 0.53, 0.77, 0.56, 0.70, 0.84, 0.45, 0.74, 0.55, 0.65, 0.83]
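For readers unfamiliar with rank-aware metrics, the sketch below shows the standard definition of normalized discounted cumulative gain at rank k (NDCG@k) applied to a ranked list of suggested techniques. How the authors assigned relevance scores to techniques is not stated here, so the `true_relevance` scores and technique names are hypothetical.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k relevance scores."""
    # Position i (0-based) is discounted by log2(i + 2).
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(predicted_ranking, true_relevance, k):
    """NDCG@k: DCG of the predicted order divided by the ideal DCG."""
    gains = [true_relevance.get(t, 0.0) for t in predicted_ranking]
    ideal = sorted(true_relevance.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0

# Hypothetical relevance: higher means the technique predicted better.
true_relevance = {"Ridge": 3.0, "Lasso": 2.0, "Quadratic": 1.0}
predicted = ["Lasso", "Ridge", "Quadratic"]

ndcg1 = ndcg_at_k(predicted, true_relevance, 1)
ndcg3 = ndcg_at_k(predicted, true_relevance, 3)
perfect = ndcg_at_k(["Ridge", "Lasso", "Quadratic"], true_relevance, 3)
print(ndcg1, ndcg3, perfect)
```

Because NDCG gives partial credit when a near-best technique is ranked first, it is more forgiving than a strict top-1 accuracy.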
including regression family of techniques
meta-learner for prediction
variable are the most important meta-features.
response variable.
are important indicators for using a regularization technique such as Lasso or Ridge.
SPSS Auto Modeler, DataRobot, …
A More Modern Approach
default model criteria and diagnostics
through decisions it's making
statistical insight
Screenshot taken from Watson Analytics platform
Analytics, Google Prediction API)
API)
Library
the GCRMA method as predictors of the 9 highest-variance proteins obtained from reverse-phase protein lysate arrays (RPLA).
https://github.com/scalation/data