Empirical Confidence Models for Supervised Machine Learning
Margarita Castro1, Meinolf Sellmann2, Zhaoyuan Yang2, Nurali Virani2
1 University of Toronto, Mechanical and Industrial Engineering 2 General Electric, Global Research Center
May, 2020
• Critical application domains: self-driving cars, healthcare diagnosis, cyber security.
• We can't expect the models to be correct on every input they see at run time.
• Summary statistics (e.g., average test error) only describe a model's performance in aggregate, not on a specific input.
"We develop techniques that learn when models generated by certain learning techniques on a particular data set can be expected to perform well, and when not."
[Diagram: a run-time instance Y is mapped to a prediction Z′ together with a competence level D ∈ {Trusted, Cautioned, Not Trusted}.]
Outline:
• Overall framework
• Meta-features
• Meta training data
• Experimental setting
• Results
• Conclusions
PART 1
Overall framework
[Diagram: the training set (Y, Z) and a learning technique (e.g., Random Forest) produce the primary model, a regressor g. At run time, an input y is passed (1) to the regressor, which returns the prediction z′ = g(y), and (2) to the Meta Feature Builder, whose output feeds the Competence Assessor, which returns the competence level D.]
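The wiring above can be summarized in a few lines. This is a minimal sketch of how the pieces could fit together; the class and method names are illustrative, not taken from the authors' code:

```python
# Minimal ECM pipeline sketch; names are hypothetical, not from the paper's implementation.
class EmpiricalConfidenceModel:
    def __init__(self, regressor, meta_builder, assessor):
        self.regressor = regressor        # primary model g, e.g. a fitted Random Forest
        self.meta_builder = meta_builder  # maps a run-time input to meta-features
        self.assessor = assessor          # classifier: meta-features -> competence level D

    def predict_with_competence(self, y):
        z_pred = self.regressor.predict(y)   # prediction z' = g(y)
        n = self.meta_builder.transform(y)   # meta-features N_1..N_6
        d = self.assessor.predict(n)         # Trusted / Cautioned / Not Trusted
        return z_pred, d
```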
Meta Feature Builder
• Distance measure $e: G \times G \to \mathbb{R}_+$; different distance measures can be used depending on the input space.
• Neighborhood $O(y)$ of the run-time input, based on the distance measure $e(\cdot,\cdot)$.
• We consider the $l$ nearest neighbors, with $l = 5$.
• From the training set $(Y, Z)$, the run-time input $y$, and the prediction $z' = g(y)$, the Meta Feature Builder computes six meta-features (a sketch of their computation follows the list):
1. Average distance to the neighborhood: $N_1(y) := \frac{1}{l} \sum_{(y_i, z_i) \in O(y)} e(y, y_i)$
   • Measures how far the run-time input is from the training data set.
2. Average prediction distance: $N_2(y) := \frac{1}{l} \sum_{(y_i, z_i) \in O(y)} |g(y) - g(y_i)|$
3. Deviation from the regressor's prediction: $N_3(y) := g(y) - \sum_{(y_i, z_i) \in O(y)} \frac{z_i}{t(y)\, e(y, y_i)}$, where $t(y) := \sum_{(y_j, z_j) \in O(y)} 1/e(y, y_j)$ normalizes the inverse-distance weights.
   • Features 2 and 3 capture the relationship between predictions in the vicinity of the current input.
4. Average training error on $O(y)$: $N_4(y) := \sum_{(y_i, z_i) \in O(y)} \frac{|g(y_i) - z_i|}{t(y)\, e(y_i, y)}$
5. Variance of the training error on $O(y)$: $N_5(y) := \frac{1}{l-1} \sum_{(y_i, z_i) \in O(y)} \big(|g(y_i) - z_i| - N_4(y)\big)^2$
   • Features 4 and 5 measure the accuracy of the regressor in the immediate vicinity.
6. Target-value variability on $O(y)$: $N_6(y) := \frac{1}{l-1} \sum_{(y_i, z_i) \in O(y)} (z_i - \bar{z})^2$, where $\bar{z}$ is the mean target value in $O(y)$.
   • Variance of the true target values in $O(y)$.
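A possible implementation of the six meta-features, using scikit-learn's nearest-neighbor search with Euclidean distance as $e(\cdot,\cdot)$ and $l = 5$. The inverse-distance weighting for $N_3$ and $N_4$ follows the reconstruction above and should be treated as an assumption:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def meta_features(y, Y_train, Z_train, g, l=5):
    """Compute the six neighborhood meta-features N_1..N_6 for one run-time input y."""
    nn = NearestNeighbors(n_neighbors=l).fit(Y_train)       # Euclidean distance as e(., .)
    dist, idx = nn.kneighbors(y.reshape(1, -1))
    dist, idx = dist[0], idx[0]                              # neighborhood O(y): l nearest points

    Yn, Zn = Y_train[idx], Z_train[idx]
    g_y = g.predict(y.reshape(1, -1))[0]                     # prediction z' = g(y)
    g_Yn = g.predict(Yn)                                     # predictions on the neighbors

    w = 1.0 / np.maximum(dist, 1e-12)                        # inverse-distance weights (assumed form)
    w /= w.sum()                                             # normalization t(y)

    err = np.abs(g_Yn - Zn)                                  # training errors on O(y)
    N1 = dist.mean()                                         # average distance to O(y)
    N2 = np.abs(g_y - g_Yn).mean()                           # average prediction distance
    N3 = g_y - np.sum(w * Zn)                                # deviation from weighted neighbor targets
    N4 = np.sum(w * err)                                     # weighted average training error
    N5 = np.sum((err - N4) ** 2) / (l - 1)                   # variance of the training error
    N6 = np.sum((Zn - Zn.mean()) ** 2) / (l - 1)             # target-value variability
    return np.array([N1, N2, N3, N4, N5, N6])
```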
Training data for the Competence Assessor
[Diagram: the Splitter divides the training set into Base and Validation parts. The technique trains the regressor on the Base part; its predictions Z′ on the Validation part are compared with the true targets Z to derive the competence labels D, and the Meta Feature Builder supplies the meta-features. Together these form the training data for the Competence Assessor.]
Splitter
• Random splitting into h ∈ {3, 5, 10} buckets.
• One bucket serves as validation and the rest as base.
• Assesses the i.i.d. assumption of the technique.
• Creates interpolation and extrapolation scenarios: project onto the 1st and 2nd PC dimensions and sort the training data before splitting (see the sketch below).
[Figure: training set split into Base and Validation buckets, for both the random split and the projected-and-sorted data.]
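A sketch of the two splitting strategies, assuming scikit-learn's PCA for the projection; the function names and the choice of h are illustrative:

```python
import numpy as np
from sklearn.decomposition import PCA

def random_buckets(n, h, rng):
    """Randomly split n training indices into h buckets; each bucket serves once as validation."""
    return np.array_split(rng.permutation(n), h)

def projected_buckets(Y, h, component=0):
    """Sort points by their projection onto a principal component and split into h contiguous
    buckets; holding out an end bucket mimics extrapolation, a middle bucket interpolation."""
    proj = PCA().fit_transform(Y)[:, component]
    return np.array_split(np.argsort(proj), h)

# usage: one bucket is the validation set, the remaining buckets form the base set
# rng = np.random.default_rng(0)
# buckets = random_buckets(len(Y_train), h=5, rng=rng)
```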
Competence labels
• Based on the true error of the learned model.
• Sort the absolute residuals in ascending order and set the labels as:
  • smallest 80% → Trusted
  • 80–95% → Cautioned
  • last 5% → Not Trusted
Note: the labeling can be modified for specific applications (a sketch follows below).
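The labeling rule can be expressed with residual quantiles. A minimal sketch, with the function name hypothetical and the thresholds as stated above:

```python
import numpy as np

def competence_labels(z_true, z_pred, trusted_q=0.80, cautioned_q=0.95):
    """Label validation points by where their absolute residuals fall when sorted:
    smallest 80% -> Trusted, 80-95% -> Cautioned, last 5% -> Not Trusted."""
    residuals = np.abs(np.asarray(z_true) - np.asarray(z_pred))
    t1, t2 = np.quantile(residuals, [trusted_q, cautioned_q])
    return np.where(residuals <= t1, "Trusted",
                    np.where(residuals <= t2, "Cautioned", "Not Trusted"))
```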
Competence Assessor
• Off-the-shelf SVM and Random Forest classifiers.
• Our goal is to test the framework on several datasets.
Note: more sophisticated techniques can be used for specific applications (sketch below).
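Given the meta-feature matrix and labels produced by the steps above, the competence assessor is simply an off-the-shelf classifier; the hyperparameters below are illustrative, not the authors' settings:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# N: meta-feature matrix (one row of N_1..N_6 per validation point), D: competence labels
assessor = RandomForestClassifier(n_estimators=200, random_state=0)   # or assessor = SVC()
# assessor.fit(N, D)
# D_runtime = assessor.predict(meta_features_runtime)
```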
PART 2
Objective: test the framework on several datasets, regressors, and train/test splits.
• Six UCI benchmark data sets.
• Regressors: Linear, Random Forest, and …
• Tasks: standard, interpolation, and extrapolation.
  • Standard: standard cross-validation.
  • Interpolation and extrapolation: cluster the data and take complete clusters as the test set (sketched below); PC projections (1st and 3rd).
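One way to realize the cluster-based test split, assuming k-means from scikit-learn; the number of clusters and the number of held-out clusters are illustrative choices:

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_holdout_split(Y, n_clusters=10, held_out=2, seed=0):
    """Hold out complete clusters as the test set, so test points lie away from the
    training data (non-i.i.d. interpolation/extrapolation scenarios)."""
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(Y)
    rng = np.random.default_rng(seed)
    test_clusters = rng.choice(n_clusters, size=held_out, replace=False)
    test_mask = np.isin(labels, test_clusters)
    return np.where(~test_mask)[0], np.where(test_mask)[0]   # train indices, test indices
```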
Illustrative example
• 1-dimensional data following a linear regression with random noise.
• Interpolation task.
• Regressors: Linear regression and Random Forest.
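A minimal way to generate toy data of this kind; the slope, noise level, and held-out middle interval are assumptions, not taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.uniform(-5, 5, size=(200, 1))                 # 1-D inputs
Z = 2.0 * Y[:, 0] + 1.0 + rng.normal(0, 0.5, 200)     # linear relation plus random noise

# interpolation task: hold out a middle interval so test inputs fall between training points
train_mask = (Y[:, 0] < -1) | (Y[:, 0] > 1)
Y_train, Z_train = Y[train_mask], Z[train_mask]
Y_test,  Z_test  = Y[~train_mask], Z[~train_mask]
```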
[Figure: Linear Regression Model, Random Forest Model, and ECM predictions, with points labeled Trusted, Cautioned, or Not Trusted.]
Baseline: competence assessor trained on the original data (only standard splitting and no meta-features).
Conclusions
• We present an Empirical Confidence Model (ECM) that assesses the competence of a trained model's predictions at run time.
• We show the effectiveness of ECM for i.i.d. and non-i.i.d. train/test splits.
• Future work:
  • Study other reliability measures as meta-features.
  • Integrate our methodology into an active learning setting.