  1. Explaining Machine Learning Models Armen Donigian Director of Data Science Engineering

  2. Roadmap
     + Definition of Interpretability
     + The Need for Interpretability
     + Role of Interpretability in the Data Science Process
     + Relevant Application Domains
     + Barriers to Adoption
     + Conveying Interpretations
     + Research Directions

  3. Working Definition of Interpretability: “The ability to explain or to present in understandable terms to a human.” (from the paper "Towards A Rigorous Science of Interpretable Machine Learning")

  4. The Need for Interpretability
     In supervised ML, we learn a model to accomplish a specific goal by minimizing a loss function. The purpose of interpretability is to trust & understand how the model uses inputs to make predictions. Validation loss is not enough! We can't encode the needs below into a single loss function:
     - Bias: non-stationarity
     - Fairness: overlooked gender-biased word embeddings (or other protected classes); refer to the paper "Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings"
     - Safety: infeasible to test all failure scenarios
     - Regulatory compliance: Adverse Action & Disparate Impact
     - Mismatched objectives: a single-objective model overly associates wolves with snow; multi-objective trade-offs, e.g. privacy vs. prediction quality
     - Security: is the model vulnerable to an adversarial user? e.g. a user asks for an increase in their credit ceiling to increase their credit score
     (Figure: train/validation/test split diagram)

  5. Interpretability: The Need to Keep Up
     As our methods for learning patterns from data become more complex (e.g. GoogLeNet), their failure modes become less intuitive. A key failure mode is adversarial examples: small, carefully constructed noise changes the prediction (Figure 1, from the paper "Explaining and Harnessing Adversarial Examples").
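
As a concrete illustration of "small, carefully constructed noise", here is a minimal sketch of the fast gradient sign method from that paper, written in PyTorch; the model, input, label, and epsilon below are placeholders (assumptions), not something shown in the talk.

    # Minimal fast gradient sign method (FGSM) sketch from "Explaining and
    # Harnessing Adversarial Examples". Model, input, label, and epsilon
    # are placeholders.
    import torch
    import torch.nn.functional as F

    def fgsm_attack(model, x, label, epsilon=0.01):
        """Return the input plus a small, carefully constructed perturbation."""
        x = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), label)   # loss the attacker wants to increase
        loss.backward()
        # One step in the direction that increases the loss, bounded by epsilon.
        return (x + epsilon * x.grad.sign()).detach()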

  6. Role of Interpretability in the Data Science Process
     (Figure 1: how interpretability connects the application, the ML engineer, and the end user)

  7. Application Domains for Interpretability
     Think of the cost of an incorrect prediction!
     - Credit underwriting (Equal Credit Opportunity Act): Adverse Action & Disparate Impact
     - Neural machine translation: bridge the translation gap between source & target languages; large corpora contain unwanted co-occurrences of words which bias the model (Figure 1)
     - Medical diagnoses: show the physician the regions of the retina where lesions appear (Figure 2)
     - Autonomous driving: saliency map of what the model used to predict the orientation & direction of steering
     - Scientific discoveries: show how molecules interact with enzymes; potential to learn causal relationships (Figure 3)

  8. Barriers to Adoption in Underwriting: The Explainable Machine Learning Challenge
     FICO teams up with Google, UC Berkeley, Oxford, Imperial, MIT, and UC Irvine to sponsor a contest to generate new research in the area of algorithmic explainability.
     - Home Equity Line of Credit (HELOC) dataset
     - Lines of credit from $5,000 to $150,000
     "The black box nature of machine learning algorithms means that they are currently neither interpretable nor explainable… Without explanations, these algorithms cannot meet regulatory requirements, and thus cannot be adopted by financial institutions." - FICO blog

  9. Catalogue Methods by Output
     - Visualizations (intuitive): partial dependence plots, correlations, dimensionality reduction, clustering. DeepDream, asked to find bananas, finds bananas in noise (Figure 1).
     - Text: for image captioning, we can use t-SNE (stochastic neighborhood embedding) in n dimensions to find relative neighborhoods (Figure 2).
     - Examples: influence functions find the most influential training samples by reweighting individual samples & observing the sensitivity of the model (Figure 3).
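
A minimal sketch of the t-SNE idea mentioned above, using scikit-learn; the digits dataset and plot are stand-ins for whatever features and figure the slide refers to.

    # Hypothetical t-SNE example: project high-dimensional features to 2-D so
    # relative neighborhoods can be inspected visually. The digits dataset is
    # only a stand-in for the features referenced in the slide.
    import matplotlib.pyplot as plt
    from sklearn.datasets import load_digits
    from sklearn.manifold import TSNE

    X, y = load_digits(return_X_y=True)
    embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

    plt.scatter(embedding[:, 0], embedding[:, 1], c=y, s=5, cmap="tab10")
    plt.title("t-SNE embedding: nearby points are similar examples")
    plt.show()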

  10. Ways to Convey Interpretability (Feature Level)
     Naturally interpretable models (next slide)
     Sensitivity analysis: “What makes the shark less/more a shark?”
     ● Measure the sensitivity of the output to changes made in the input features
     ● Randomly shuffle feature values one column at a time and measure the change in performance
     ● Saliency map of what the model was looking at when it made the decision
       ○ Which pixels lead to an increase/decrease of the prediction score when changed?
     Approach: permutation impact (Figure 1)
     Decomposition: “What makes the shark a shark?”
     ● Breaks down the relevance of each feature to the prediction as a whole
     ● Done with respect to some reference (e.g. select the bottom tier of good loans)
     ● Feature attributions must add up to the whole prediction (normalizing factor)
     Approach: backprop (Figure 2)

  11. Naturally Interpretable Models
     Linear models: f(x) = a_1 x_1 + a_2 x_2 + b, with contrib(x_i) = a_i x_i and f(x) - f(x_0) = Σ_i contrib(x_i) for the baseline x_0 = (0, 0). Example: Boston housing prices dataset (Figure 1).
     ● Sensitivity: take the gradient of the model with respect to the input; the coefficients remain.
     ● Decomposition: assigns blame to causes relative to some reference cause.
     Decision trees: trace the path of each decision & observe how it changes the regression value.
     ● Feature importances: how often is a feature used to make a decision?
     Check out SHapley Additive exPlanations (SHAP) and treeinterpreter.
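
A small sketch of the linear-model decomposition above, assuming scikit-learn and synthetic data in place of the Boston housing example; it checks that the per-feature contributions a_i * x_i add up to the prediction minus the intercept (the x_0 = 0 baseline).

    # Sketch of the linear-model decomposition: with baseline x0 = 0,
    # contrib(x_i) = a_i * x_i, and the contributions sum to f(x) - b.
    # Synthetic data stands in for the Boston housing example.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 + rng.normal(scale=0.1, size=200)

    model = LinearRegression().fit(X, y)

    x = X[0]
    contrib = model.coef_ * x                         # per-feature attribution
    reconstructed = model.intercept_ + contrib.sum()  # completeness: parts add up

    print("contributions:", contrib)
    print("reconstructed:", reconstructed, "predicted:", model.predict(x[None])[0])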

  12. Permutation Feature Importance
     Randomly shuffle feature values one column at a time and measure the change in performance.
     Pros: simple implementation; model agnostic.
     Cons: no variable interaction; computationally expensive; works only when a few features are important & operate independently; a single-pixel perturbation does not change the prediction.
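
A minimal sketch of permutation feature importance under these assumptions: `model` is any fitted estimator with a `predict` method and `metric` is a score where higher is better; both are placeholders.

    # Minimal permutation-importance sketch. `model` (anything with .predict)
    # and `metric` (higher is better, e.g. accuracy or R^2) are assumptions.
    import numpy as np

    def permutation_importance(model, X_val, y_val, metric, n_repeats=5, seed=0):
        rng = np.random.default_rng(seed)
        baseline = metric(y_val, model.predict(X_val))
        importances = np.zeros(X_val.shape[1])
        for j in range(X_val.shape[1]):
            drops = []
            for _ in range(n_repeats):
                X_shuffled = X_val.copy()
                # Shuffle one column: break that feature's link to the target.
                X_shuffled[:, j] = rng.permutation(X_shuffled[:, j])
                drops.append(baseline - metric(y_val, model.predict(X_shuffled)))
            importances[j] = np.mean(drops)   # average score drop = importance
        return importances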

  13. Surrogate Models (LIME): Local Interpretable Model-Agnostic Explanations
     Learn a simple interpretable model around the test point using proximity-weighted samples (Figure 1; Figure 2: top 3 predicted classes).
     Pros: model-agnostic.
     Cons: computationally expensive (Figure 3).
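
A rough sketch of the LIME idea (not the lime package itself): sample perturbations around the test point, weight them by proximity, and fit a simple weighted linear surrogate. `black_box_predict`, the noise scale, and the kernel width are all assumptions.

    # Rough sketch of the LIME idea: perturb around the test point, weight
    # samples by proximity, fit a simple linear surrogate.
    import numpy as np
    from sklearn.linear_model import Ridge

    def local_surrogate(black_box_predict, x, n_samples=500, scale=0.5,
                        kernel_width=1.0, seed=0):
        rng = np.random.default_rng(seed)
        Z = x + rng.normal(scale=scale, size=(n_samples, x.shape[0]))  # perturbations
        y = black_box_predict(Z)                      # black-box scores for one class
        dist = np.linalg.norm(Z - x, axis=1)
        weights = np.exp(-(dist ** 2) / kernel_width ** 2)  # proximity kernel
        surrogate = Ridge(alpha=1.0).fit(Z, y, sample_weight=weights)
        return surrogate.coef_                        # local feature weights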

  14. Backpropagation-Based Approaches: Gradients (Saliency Map)
     ● Start with a particular output
     ● Assign importance scores to the neurons in the layer below, depending on the function connecting those two layers
     ● Repeat the process until you reach the input
     ● With a single backward pass, you get importance scores for all features in the input
     The gradient with respect to the inputs gives us feature attributions.
     Pros: simple and efficient; GPU-optimized implementation.
     Cons: fails in flat regions (e.g. ReLU), giving 0 when the contribution isn't zero.
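
A minimal PyTorch sketch of the gradient saliency map described above; `model` is assumed to be any differentiable classifier taking a batch of size 1.

    # Gradient saliency sketch: one backward pass gives an importance score
    # for every input feature. `model` is an assumed differentiable classifier.
    import torch

    def saliency_map(model, x, target_class):
        x = x.clone().detach().requires_grad_(True)
        score = model(x)[0, target_class]   # scalar output we want to explain
        score.backward()                    # d(score)/d(input) in one backward pass
        return x.grad.abs().squeeze(0)      # per-feature / per-pixel importance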

  15. Backprop Approaches: Improving Gradients
     Dealing with the absence of signal: if two feature vectors x and x_0 differ only on a single feature but have different predictions (f(x) ≠ f(x_0)), then the differing feature should have a non-zero attribution, e.g. attribution(x_2) > 0.
     Towards decomposition, define a set of axioms:
     ● Sensitivity
     ● Implementation invariance
     ● Completeness/additivity: f(x) - f(x_0) = Σ_i attr(x_i)
     ● Linearity

  16. Backprop Approaches
     Better ways to backprop through ReLUs:
     - DeconvNet: equivalent to gradients, but with the ReLU applied in the backwards direction
     - Guided Backprop: gradients, but with the ReLU applied in both directions
     - PatternNet/PatternAttribution: corrects the gradient for correlated, distracting noise
     - Layerwise Relevance Propagation: positive & negative contribution scores
     Some other interesting approaches:
     - Integrated Gradients: path integral of gradients from a baseline; equivalent to input-scaled gradients. Pick a starting value, scale up linearly from the reference to the actual value, and compute gradients along the way.
     - DeepLIFT: compares the activation of each neuron to its reference activation and assigns contribution scores based on the difference relative to the baseline. A true decomposition even with a discrete jump; positive & negative contribution scores; importance is propagated even when the gradient is 0.
     - Deep Taylor Decomposition: a Taylor approximation about a baseline for each neuron; generalizes to all activations.
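
A short sketch of Integrated Gradients as described above, in PyTorch: interpolate from a baseline to the input, average the gradients along the straight-line path, and scale by (input - baseline). The model, baseline, and step count are assumptions.

    # Integrated Gradients sketch: average gradients along a straight-line
    # path from a baseline to the input, then scale by (input - baseline).
    import torch

    def integrated_gradients(model, x, baseline, target_class, steps=50):
        x, baseline = x.detach(), baseline.detach()      # shape (1, ...) each
        alphas = torch.linspace(0.0, 1.0, steps).view(-1, *([1] * (x.dim() - 1)))
        path = (baseline + alphas * (x - baseline)).requires_grad_(True)
        scores = model(path)[:, target_class].sum()
        grads = torch.autograd.grad(scores, path)[0]
        avg_grad = grads.mean(dim=0, keepdim=True)       # average gradient on the path
        # Completeness: these attributions approximately sum to f(x) - f(baseline).
        return (x - baseline) * avg_grad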

  17. Evaluating Interpretability Methods
     If we have a set of feature contributions, we can compare attribution methods with Spearman's rank-order correlation, or measure what % of the top-K features intersect.
     Experimental evaluation approaches: assign a user (domain expert) tasks based on the produced feature attributions:
     - Show saliency maps and ask the user to choose which classifier generalizes better
     - Show attributions and ask the user to perform feature selection to improve the model
     - Ask the user to identify classifier failure modes
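
A small sketch of the two agreement measures named above, using SciPy; the attribution vectors here are synthetic placeholders.

    # Spearman rank correlation between two attribution vectors, plus their
    # top-K overlap. The attribution vectors are made up for illustration.
    import numpy as np
    from scipy.stats import spearmanr

    def topk_intersection(attr_a, attr_b, k=10):
        top_a = set(np.argsort(-np.abs(attr_a))[:k])
        top_b = set(np.argsort(-np.abs(attr_b))[:k])
        return len(top_a & top_b) / k

    attr_a = np.random.default_rng(0).normal(size=50)                      # e.g. gradients
    attr_b = attr_a + np.random.default_rng(1).normal(scale=0.3, size=50)  # e.g. another method

    rho, _ = spearmanr(attr_a, attr_b)
    print("Spearman rho:", rho, "top-10 overlap:", topk_intersection(attr_a, attr_b))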

  18. Adversarial Examples
     Interpretability can suffer from adversarial attacks independently of the prediction (Figure 1; paper "Interpretation of Neural Networks is Fragile").
     Attack types (Figure 2):
     - Top-k attack: take the top 5 features and create a distortion which drops their rank
     - Center attack: take the center of mass of the saliency map and try to move it as far as possible under some constrained distortion
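
A rough sketch of the top-k attack idea in PyTorch: nudge the input within a small distortion budget so that the saliency mass on the originally top-ranked features shrinks. This needs second-order gradients; for piecewise-linear ReLU networks a smooth surrogate (e.g. softplus) is typically substituted when computing them. The model, budget, and step size here are assumptions, not the paper's exact procedure.

    # Rough sketch of a top-k interpretation attack: shrink the saliency mass
    # on the originally top-ranked features while staying inside a small
    # distortion budget. Model, budget, and step size are assumptions.
    import torch

    def topk_attack(model, x, target_class, k=5, epsilon=0.01, steps=20, lr=0.002):
        x0 = x.clone().detach()

        # Saliency of the clean input and its top-k feature indices.
        xg = x0.clone().requires_grad_(True)
        sal = torch.autograd.grad(model(xg)[0, target_class], xg)[0].abs()
        topk_idx = sal.flatten().topk(k).indices

        x_adv = x0.clone()
        for _ in range(steps):
            x_adv = x_adv.detach().requires_grad_(True)
            grad = torch.autograd.grad(model(x_adv)[0, target_class], x_adv,
                                       create_graph=True)[0]
            loss = grad.abs().flatten()[topk_idx].sum()   # saliency on original top-k
            step = torch.autograd.grad(loss, x_adv)[0]
            x_adv = x_adv - lr * step.sign()              # push saliency off those features
            x_adv = x0 + (x_adv - x0).clamp(-epsilon, epsilon)  # distortion budget
        return x_adv.detach()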

  19. Research Directions
     - Better loss functions for interpretability
     - Understand what makes certain models more interpretable and how interpretability fails
     - Explain models in unsupervised learning, sequence learning (RNNs), and reinforcement learning, e.g. generating text explanations of the actions of a reinforcement learning agent
     - Develop interpretability techniques into tools for model diagnostics, security, and compliance
