Model evaluation
Practicing Machine Learning Interview Questions in R
Rafael Falcon
Data Scientist at Shopify
Q: What aspects need to be considered when you evaluate a Machine Learning model?

1. Type of Machine Learning task
Classification, regression, clustering
2. Data splitting: use training/validation/test sets, or cross-validation
Accuracy: the proportion of correctly classified examples. Useful when errors in predicting all classes are equally important. Beware of class-imbalance scenarios: always predicting the most frequent class yields high accuracy! A cost-sensitive accuracy weights each error type by its misclassification cost:
Accuracy = (TP + TN) / (TP + TN + FP + FN)

Cost-sensitive accuracy = (TP + TN) / (TP + TN + c1·FP + c2·FN)
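As a quick illustration of both formulas (a minimal sketch in Python rather than the course's R, with made-up confusion-matrix counts):

```python
def accuracy(tp, tn, fp, fn):
    """Plain accuracy: share of correct predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

def cost_sensitive_accuracy(tp, tn, fp, fn, c_fp, c_fn):
    """Weights each error type by its misclassification cost."""
    return (tp + tn) / (tp + tn + c_fp * fp + c_fn * fn)

# Imbalance pitfall: always predicting the majority class.
# 95 negatives (all correct), 5 positives (all missed):
print(accuracy(tp=0, tn=95, fp=0, fn=5))                      # 0.95
print(cost_sensitive_accuracy(0, 95, 0, 5, c_fp=1, c_fn=10))  # 95/145 ~ 0.655
```

With misclassification costs, the "always majority" shortcut looks much worse than plain accuracy suggests.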
Receiver Operating Characteristic (ROC): for models that return class probabilities. Is the model able to distinguish between the classes? For each possible classification threshold:

True Positive Rate: TPR = TP / (TP + FN)
False Positive Rate: FPR = FP / (FP + TN)

Area under the ROC Curve (AUC): higher is better.
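To make the threshold sweep concrete, here is a small sketch (in Python for illustration; the scores and labels are invented) that computes one (TPR, FPR) point of the ROC curve:

```python
def roc_point(scores, labels, threshold):
    """TPR and FPR when predicting positive for score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    return tp / (tp + fn), fp / (fp + tn)

scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1,   1,   0,   1,   0,   0]
print(roc_point(scores, labels, 0.5))  # TPR = 2/3, FPR = 1/3
```

Sweeping the threshold from 1 down to 0 traces the full curve; the AUC is the area under those points.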
Root Mean Squared Error (RMSE): the average distance between model predictions and the ground truth (actual values). Easy to compute, and in the same units as the response variable. Example: y = house price, RMSE = 8,000 → the model is $8,000 off from the true house price on average.
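A minimal sketch of the computation (Python for illustration; the prices are made up so the RMSE comes out to the $8,000 of the example):

```python
from math import sqrt

def rmse(y_true, y_pred):
    """Root Mean Squared Error, in the units of the response."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Hypothetical house prices (in $):
actual    = [200_000, 310_000, 150_000]
predicted = [208_000, 302_000, 158_000]
print(rmse(actual, predicted))  # 8000.0
```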
No label information. Two criteria to consider: compact clusters, well-separated clusters.
Several validity indices: Dunn's index, Davies-Bouldin index, Silhouette index, etc. Use multiple indices!
Class imbalance: a large disparity in the frequencies of the decision classes. The accuracy metric is especially sensitive to these scenarios: always predicting the majority class → high accuracy! Two popular avenues: cost-sensitive classification, and subsampling the imbalanced data.
The misclassification cost for minority classes is higher than for the majority class.
Subsample the training data in a way that mitigates the class imbalance. Three common approaches: downsampling, upsampling, SMOTE.
Downsampling: reduce the frequency of the overrepresented classes to match the frequency of the underrepresented classes.
Example:
Before: majority class (80 samples), minority class (20 samples)
After: majority class (20 samples), minority class (20 samples)
Upsampling: increase the frequency of the underrepresented classes (random sampling with replacement) to match the frequency of the overrepresented classes.
Example:
Before: majority class (80 samples), minority class (20 samples)
After: majority class (80 samples), minority class (80 samples)
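Both resampling schemes can be sketched in a few lines (Python for illustration; the 80/20 class sizes mirror the example above):

```python
import random
random.seed(0)

majority = ["neg"] * 80
minority = ["pos"] * 20

# Downsampling: shrink the majority class to the minority size
down = random.sample(majority, len(minority)) + minority

# Upsampling: grow the minority class, sampling WITH replacement
up = majority + random.choices(minority, k=len(majority))

print(len(down), len(up))  # 40 160
```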
SMOTE: Synthetic Minority Oversampling TEchnique. Generates new synthetic instances of the minority class by interpolating between existing minority examples.
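A deliberately simplified, hypothetical sketch of the core SMOTE step (Python, 1-D features): pick a minority sample, pick one of its nearest minority neighbours, and create a synthetic point somewhere on the segment between them.

```python
import random
random.seed(1)

def smote_point(minority, k=3):
    """Simplified SMOTE step: interpolate between a random minority
    sample and one of its k nearest minority neighbours (1-D here)."""
    x = random.choice(minority)
    neighbours = sorted(minority, key=lambda v: abs(v - x))[1:k + 1]
    n = random.choice(neighbours)
    lam = random.random()            # position along the segment
    return x + lam * (n - x)

minority = [1.0, 1.2, 1.5, 2.0, 2.3]
synthetic = smote_point(minority)
print(min(minority) <= synthetic <= max(minority))  # True
```

Real implementations work in the full feature space and use a proper k-nearest-neighbours search, but the interpolation idea is the same.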
Parameters vs. hyperparameters

Parameters: learned by the model during training.

Hyperparameters: not learned, but specified prior to training; influence different aspects of the training process; do not change as training unfolds; tuned as part of a meta-learning process.
Example: neural networks
Parameters: weight matrix, bias vector
Hyperparameters: learning rate, number of hidden layers, number of hidden neurons per layer
Example: K-means clustering
Parameters: cluster prototypes (centroids)
Hyperparameters: number of clusters K, centroid initialization method
Finding adequate hyperparameter values is an iterative process: generate a hyperparameter vector, train the model with this vector, evaluate model performance. Computationally expensive!
Three main strategies: grid search, random search, informed search.
Grid search: exhaustive search over a manually specified subset of the hyperparameter space. All possible combinations are considered: expensive, but highly parallelizable.

Example: α ∈ [0,1], β ∈ [2,5]. Sample each hyperparameter's range:

α ∈ {0.2, 0.5, 0.8}, β ∈ {2, 3, 4, 5}

12 hyperparameter vectors are tested (the Cartesian product).
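The Cartesian product from the example can be enumerated directly (Python sketch):

```python
from itertools import product

# Sampled values from the example: alpha in [0, 1], beta in [2, 5]
alphas = [0.2, 0.5, 0.8]
betas = [2, 3, 4, 5]

grid = list(product(alphas, betas))  # Cartesian product: 3 x 4 pairs
print(len(grid))                     # 12
```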
Random search: randomly selects hyperparameter vectors, sampling each hyperparameter from a discrete set or a continuous distribution. Highly parallelizable. Can outperform grid search when only a few hyperparameters actually affect model performance.
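The same 12-evaluation budget spent by random search instead (Python sketch; the distributions follow the α and β ranges of the grid-search example):

```python
import random
random.seed(42)

# Draw hyperparameter vectors instead of enumerating a grid:
# alpha from a continuous range, beta from a discrete set.
n_trials = 12
candidates = [
    {"alpha": random.uniform(0.0, 1.0), "beta": random.choice([2, 3, 4, 5])}
    for _ in range(n_trials)
]
print(len(candidates))  # 12
```

Note that alpha is no longer restricted to three fixed values, which is why random search explores influential hyperparameters more finely for the same budget.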
Bayesian optimization: probabilistically maps a hyperparameter vector to a model performance indicator, and samples more frequently around promising hyperparameter vectors. Achieves better results in fewer evaluations than grid and random search.
Several R packages available:
caret, mlr, h2o

Check out the Hyperparameter Tuning in R DataCamp course! We will show you how to tune hyperparameters using caret in the exercises.
Random Forests (RFs) and Gradient Boosted Trees (GBTs) are both top-performing ensemble models.
Suitable for both classification and regression tasks.
Decision trees as base learners.
Can handle missing values.
Provide a model-specific variable importance metric.
Random Forests: bagging ensemble; deeper decision trees; aimed at reducing variance; trees grown in parallel; easier to tune; all trees used.

Gradient Boosting Trees: boosting ensemble; shallower decision trees; aimed at reducing bias; trees grown sequentially; harder to tune; trees added as needed.
There are multiple R packages that implement RFs and GBTs.
library(randomForest)
library(ranger)
library(gbm)
library(xgboost)
library(caret)
library(randomForest)
# tuneRF() tunes mtry (note: the argument name is ntreeTry)
tunedModel <- tuneRF(x = predictors, y = response, ntreeTry = 500)

library(caret)
# train() tunes mtry by default for method = "rf";
# other hyperparameters can be tuned via tuneGrid
tunedModel <- train(x = predictors, y = response, method = "rf")
library(gbm)
# gbm.perf() selects the best n.trees based on CV/OOB error
# (sketch: trainData assumed to hold the training set)
model <- gbm(response ~ ., data = trainData, cv.folds = 5)
bestNTrees <- gbm.perf(model, method = "cv")

library(caret)
# train() tunes several hyperparameters for method = "xgbLinear"
model <- train(x = predictors, y = response, method = "xgbLinear")
Data normalization: max-min scaling vs. standardization
Handling missing data: exploration and visualization, imputation methods
Anomaly detection: IQR rule, KNN distance score, Local Outlier Factor (LOF)

Package list (in alphabetical order): dbscan, dplyr, FNN, ggplot2, naniar, simputation, tidyr
Model interpretability: linear regression, decision trees
Regularization: Ridge, Lasso and Elastic Net regression
Bias and variance: bias-variance analysis
Model ensembles: bagging, boosting, stacking

Package list (in alphabetical order): caret, caretEnsemble, dplyr, e1071, elasticnet, Metrics, nnet, rattle, rpart, rpart.plot
K-means clustering: checking assumptions, determining the optimal number of clusters
Clustering algorithms: hierarchical, K-means, PAM
Cluster validity indices
Feature selection: filter, wrapper, embedded methods
Feature extraction: PCA, LDA

Package list (in alphabetical order): caret, clValid, dplyr, MASS, Metrics, plot3D, stats
Model evaluation metrics: classification, regression, clustering
Handling imbalanced data: downsampling, upsampling, SMOTE
Hyperparameter tuning: grid search, random search
Random Forests vs. Gradient Boosted Trees: commonalities and differences

Package list (in alphabetical order): caret, clValid, dplyr, gbm, Metrics, randomForest
Keep learning! DataCamp courses on: Deep Learning, Model Validation, Machine Learning with Apache Spark, Linear Classifiers, etc. Your constructive feedback about this course is important!