

1. Tufts COMP 135: Introduction to Machine Learning
https://www.cs.tufts.edu/comp/135/2019s/
Binary Classification
Many slides attributable to: Prof. Mike Hughes, Erik Sudderth (UCI), Finale Doshi-Velez (Harvard), James, Witten, Hastie, Tibshirani (ISL/ESL books)

2. Logistics
• Waitlist: We have some room, contact me
• HW2 due TONIGHT (Wed 2/6 at 11:59pm)
  • What you submit: PDF and zip
  • Please annotate pages in Gradescope!
• HW3 out later tonight, due a week from today
  • What you submit: PDF and zip
  • Please annotate pages in Gradescope!
• Next recitation is Mon 2/11
  • Practical binary classifiers in Python with sklearn
  • Numerical issues and how to address them

3. Objectives: Classifier Overview
• 3 steps of a classification task
  • Prediction
    • Making hard binary decisions
    • Predicting class probabilities
  • Training
  • Evaluation
• Performance Metrics
• A “taste” of 3 Methods
  • Logistic Regression
  • K-Nearest Neighbors
  • Decision Trees

4. What will we learn?
(Diagram: the supervised learning pipeline. Training consumes data as feature-label pairs {x_n, y_n}_{n=1}^N and produces a predictor; Prediction maps features x to a label y; Evaluation scores predictions against a performance measure. Unsupervised Learning and Reinforcement Learning appear for contrast.)

5. Before: Regression
y is a numeric variable, e.g. sales in $$
(Plot: regression fit of y against a single feature x.)

6. Task: Binary Classification
y is a binary variable (red or blue)
(Plot: scatter of features x_1 vs. x_2 with two class colors.)

7. Example: Hotdog or Not

8. Task: Multi-class Classification
y is a discrete variable (red or blue or green or purple)
(Plot: scatter of features x_1 vs. x_2 with four class colors.)

9. Classification Example: Swype
Many possible letters: multi-class classification

10. Binary Prediction Step
Goal: Predict label (0 or 1) given features x
• Input: “features” x_i ≜ [x_{i1}, x_{i2}, ... x_{if}, ... x_{iF}]
  Entries can be real-valued, or other numeric types (e.g. integer, binary).
  Also called “covariates”, “predictors”, or “attributes”.
• Output: y_i ∈ {0, 1}, a binary label (0 or 1).
  Also called “responses” or “labels”.

11. Binary Prediction Step

>>> # Given: a pretrained classifier object `model`
>>> # Given: a 2D array of features x_NF
>>> x_NF.shape
(N, F)
>>> yhat_N = model.predict(x_NF)
>>> yhat_N[:5]  # peek at predictions
[0, 0, 1, 0, 1]
>>> yhat_N.shape
(N,)

12. Types of binary predictions
TN: true negative (predicted 0, actual 0)
FP: false positive (predicted 1, actual 0)
FN: false negative (predicted 0, actual 1)
TP: true positive (predicted 1, actual 1)
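These four outcomes follow one simple rule. A minimal sketch (the function name is mine, not from the slides):

def outcome(y_true, y_pred):
    """Name the outcome for one (actual, predicted) label pair."""
    if y_true == 1:
        return "TP" if y_pred == 1 else "FN"  # positive example: caught, or missed
    else:
        return "FP" if y_pred == 1 else "TN"  # negative example: false alarm, or correct reject

The quiz slides that follow exercise exactly this rule.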

13.-20. Example: Which outcome is this?
(Slides 13-20 are an image-based quiz: each question slide shows a prediction scenario, and the following slide reveals the answer. The images are not recoverable here; the answers are:)
• Slides 13-14: True Positive (TP)
• Slides 15-16: True Negative (TN)
• Slides 17-18: False Negative (FN)
• Slides 19-20: False Positive (FP)
Legend: TN = true negative, FP = false positive, FN = false negative, TP = true positive

21. Probability Prediction Step
Goal: Predict probability p(Y = 1) given features x
• Input: “features” x_i ≜ [x_{i1}, x_{i2}, ... x_{if}, ... x_{iF}]
  Entries can be real-valued, or other numeric types (e.g. integer, binary).
  Also called “covariates”, “predictors”, or “attributes”.
• Output: p̂_i, a probability between 0 and 1, e.g. 0.001, 0.513, 0.987.
  Also called “probabilities”.

22. Probability Prediction Step

>>> # Given: a pretrained classifier object `model`
>>> # Given: a 2D array of features x_NF
>>> x_NF.shape
(N, F)
>>> yproba_N2 = model.predict_proba(x_NF)
>>> yproba_N2.shape
(N, 2)
>>> yproba_N2[:, 1]  # column index 1 gives probability of positive label p(Y = 1)
[0.003, 0.358, 0.987, 0.111, 0.656]

23.-25. Thresholding to get Binary Decisions
(Slides 23-25 show an interactive visualization: sliding a threshold across the predicted scores converts each score into a 0/1 decision, trading false positives against false negatives.)
Credit: Wattenberg, Viégas, Hardt
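In code, thresholding is a single comparison. A minimal sketch, assuming any sklearn-style classifier with predict_proba (the helper name is mine):

import numpy as np

def predict_with_threshold(model, x_NF, thresh=0.5):
    """Turn predicted probabilities into hard 0/1 decisions at a chosen threshold."""
    proba_N = model.predict_proba(x_NF)[:, 1]  # p(Y=1) for each example
    return (proba_N >= thresh).astype(int)

Raising thresh above 0.5 makes the classifier more conservative (fewer false positives, more false negatives); lowering it does the reverse.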

26. Pair Exercise
Interactive Demo: https://research.google.com/bigpicture/attacking-discrimination-in-ml/
Payoffs: Loan and pay back: +$300. Loan and not pay back: -$700.
Goals:
• What threshold maximizes accuracy?
• What threshold maximizes profit?
• What needs to be true of costs so the threshold is the same for profit and accuracy?
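One way to attack the second goal programmatically: scan candidate thresholds and score each by total profit. A sketch under the slide's payoffs (+$300 per repaid loan, -$700 per default); the arrays and function name are illustrative:

import numpy as np

def best_threshold_by_profit(proba_N, y_N, gain=300.0, loss=-700.0):
    """Scan thresholds; profit counts each approved loan as +gain if
    repaid (true positive) and +loss if it defaults (false positive)."""
    best_t, best_profit = 0.5, -np.inf
    for t in np.linspace(0.0, 1.0, 101):
        approve_N = proba_N >= t
        profit = (gain * np.sum(approve_N & (y_N == 1))
                  + loss * np.sum(approve_N & (y_N == 0)))
        if profit > best_profit:
            best_t, best_profit = t, profit
    return best_t, best_profit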

27. Classifier: Training Step
Goal: Given a labeled dataset, learn a function that can perform prediction well
• Input: Pairs of features and labels/responses {x_n, y_n}_{n=1}^N
• Output: ŷ(·) : R^F → {0, 1}
Useful to break into two steps:
1) Produce probabilities in [0, 1] OR real-valued scores
2) Threshold to make binary decisions

28. Classifier: Training Step

>>> # Given: 2D array of features x_NF
>>> # Given: 1D array of binary labels y_N
>>> y_N.shape
(N,)
>>> x_NF.shape
(N, F)
>>> model = BinaryClassifier()
>>> model.fit(x_NF, y_N)
>>> # Now can call predict or predict_proba
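The slide's BinaryClassifier is a stand-in for any sklearn estimator. A minimal runnable sketch using LogisticRegression (one of the three methods previewed); the toy data is invented for illustration:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data: N=6 examples, F=2 features (values made up for illustration)
x_NF = np.array([[0.1, 1.2], [0.4, 0.8], [0.9, 0.3],
                 [1.5, 0.2], [1.8, 0.9], [2.2, 1.1]])
y_N = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression()
model.fit(x_NF, y_N)                   # training step
yhat_N = model.predict(x_NF)           # hard 0/1 decisions (0.5 threshold by default)
proba_N2 = model.predict_proba(x_NF)   # column 1 holds p(Y=1)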

29. Classifier: Evaluation Step
Goal: Assess quality of predictions
Many ways in practice:
1) Evaluate probabilities / scores directly: logistic loss, hinge loss, …
2) Evaluate binary decisions at a specific threshold: accuracy, TPR, TNR, PPV, NPV, etc.
3) Evaluate across a range of thresholds: ROC curve, Precision-Recall curve

30. Metric: Confusion Matrix
Counting mistakes in binary predictions:
#TN: num. true negative    #FP: num. false positive
#FN: num. false negative   #TP: num. true positive

                Actual 1   Actual 0
Predicted 1     #TP        #FP
Predicted 0     #FN        #TN
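A quick sketch of getting these counts with sklearn's confusion_matrix (labels invented):

from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 0, 1, 0]
# sklearn convention: rows are actual class, columns are predicted class
TN, FP, FN, TP = confusion_matrix(y_true, y_pred).ravel()
print(TN, FP, FN, TP)  # 2 1 1 2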

31. Metric: Accuracy
accuracy = fraction of correct predictions = (TP + TN) / (TP + TN + FN + FP)
Potential problem: Suppose your dataset has 1 positive example and 99 negative examples. What is the accuracy of the classifier that always predicts “negative”?

32. Metric: Accuracy
accuracy = fraction of correct predictions = (TP + TN) / (TP + TN + FN + FP)
Potential problem: Suppose your dataset has 1 positive example and 99 negative examples. What is the accuracy of the classifier that always predicts “negative”? 99%!
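To see the pitfall concretely, a two-line check with sklearn's accuracy_score:

import numpy as np
from sklearn.metrics import accuracy_score

y_true = np.array([1] + [0] * 99)    # 1 positive, 99 negatives
y_pred = np.zeros(100, dtype=int)    # classifier that always predicts "negative"
print(accuracy_score(y_true, y_pred))  # 0.99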

33. Metrics for Binary Decisions
• TPR = TP / (TP + FN): “sensitivity”, “recall”
• TNR = TN / (TN + FP): “specificity”, equal to 1 - FPR
• PPV = TP / (TP + FP): “precision”
Emphasize the metrics appropriate for your application.
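These come straight from the confusion counts; a sketch using sklearn helpers where they exist and the counts directly where they don't (same invented labels as the confusion-matrix example):

from sklearn.metrics import confusion_matrix, precision_score, recall_score

y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 0, 1, 0]
print(recall_score(y_true, y_pred))     # TPR = TP/(TP+FN) = 2/3
print(precision_score(y_true, y_pred))  # PPV = TP/(TP+FP) = 2/3
TN, FP, FN, TP = confusion_matrix(y_true, y_pred).ravel()
print(TN / (TN + FP))                   # TNR = 2/3 (no dedicated sklearn helper)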

34. Goal: App to classify cats vs. dogs from images
Which metric might be most important? Could we just use accuracy?

35. Goal: Classifier to find relevant tweets to list on website
Which metric might be most important? Could we just use accuracy?

36. Goal: Detector for tumors based on medical image
Which metric might be most important? Could we just use accuracy?

37. ROC Curve (across thresholds)
(Plot: TPR (sensitivity) on the y-axis vs. FPR (1 - specificity) on the x-axis; each point corresponds to a specific threshold. A perfect classifier reaches the top-left corner; random guessing traces the diagonal.)

38. Area under ROC curve (aka AUROC or AUC or “C statistic”)
Area varies from 0.0 to 1.0. 0.5 is a random guess; 1.0 is perfect.
Graphical: the area under the TPR (sensitivity) vs. FPR (1 - specificity) curve.
Probabilistic: AUROC ≜ Pr(ŷ(x_i) > ŷ(x_j) | y_i = 1, y_j = 0)
For a random pair of examples, one positive and one negative, what is the probability the classifier will rank the positive one higher?
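A short sketch of computing the curve and its area with sklearn (scores invented, e.g. from predict_proba(...)[:, 1]):

from sklearn.metrics import roc_auc_score, roc_curve

y_true = [0, 0, 1, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.9]
fpr, tpr, thresholds = roc_curve(y_true, scores)
print(roc_auc_score(y_true, scores))  # 0.833...: 5 of 6 positive/negative pairs ranked correctly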

39. Precision-Recall Curve
(Plot: precision on the y-axis vs. recall (= TPR) on the x-axis, traced across thresholds.)
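The analogous sklearn call for this curve (same invented scores as the ROC sketch above):

from sklearn.metrics import precision_recall_curve

y_true = [0, 0, 1, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.9]
precision, recall, thresholds = precision_recall_curve(y_true, scores)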

40. AUROC not always best choice
(Plot: two ROC curves. Red has the higher AUROC overall, but blue performs much better in the low false-positive region that matters when avoiding alarm fatigue.)

41. Classifier: Evaluation Metrics
https://scikit-learn.org/stable/modules/model_evaluation.html
1) To evaluate predicted scores / probabilities
2) To evaluate specific binary decisions
3) To make ROC or PR curves and compute areas
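A minimal sketch mapping the three styles to one sklearn call each (data invented):

from sklearn.metrics import log_loss, accuracy_score, roc_auc_score

y_true = [0, 0, 1, 1, 1]
proba = [0.1, 0.4, 0.35, 0.8, 0.9]
yhat = [1 if p >= 0.5 else 0 for p in proba]

print(log_loss(y_true, proba))       # 1) scores/probabilities directly (logistic loss)
print(accuracy_score(y_true, yhat))  # 2) binary decisions at a specific threshold
print(roc_auc_score(y_true, proba))  # 3) across thresholds (area under the ROC curve)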
