Introduction to Machine Learning Evaluation: Measures for Binary Classification: ROC Visualization - PowerPoint PPT Presentation



  1. Introduction to Machine Learning Evaluation: Measures for Binary Classification: ROC visualization (compstat-lmu.github.io/lecture_i2ml)

  2. LABELS: ROC SPACE. Plot the True Positive Rate against the False Positive Rate. From the confusion matrix of true class y vs. predicted class ŷ (entries TP, FP, FN, TN):
       TPR = TP / (TP + FN)
       FPR = FP / (FP + TN)
     [Figure: classifiers C1, C2, C3 plotted as points in ROC space; one classifier dominates another, while for a different pair the winner is unclear.]
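A minimal Python sketch of these two formulas (the function name is illustrative; the example counts are the ones used in the label-distribution slide further below):

    def tpr_fpr(tp, fp, fn, tn):
        # True positive rate: share of actual positives that are predicted positive.
        tpr = tp / (tp + fn)
        # False positive rate: share of actual negatives that are predicted positive.
        fpr = fp / (fp + tn)
        return tpr, fpr

    print(tpr_fpr(tp=40, fp=25, fn=10, tn=25))  # (0.8, 0.5)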

  3. LABELS: ROC SPACE. The best classifier lies in the top-left corner. The diagonal corresponds to randomly assigned labels (with different proportions): assigning a positive x the label "pos" with 25% probability gives TPR = 0.25; assigning a negative x the label "pos" with 25% probability gives FPR = 0.25. [Figure: ROC space with the point "Best" in the top-left corner and the random classifiers "Pos-0%", "Pos-25%", "Pos-75%", and "Pos-100%" along the diagonal.]
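A small simulation, with made-up data, of why such a random labeler lands on the diagonal: predicting "pos" with probability p independently of the features gives TPR ≈ FPR ≈ p.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical labels: 10,000 observations, 30% positive.
    y = rng.binomial(1, 0.3, size=10_000)

    # Random classifier: predict "pos" with probability p, ignoring x entirely.
    p = 0.25
    y_hat = rng.binomial(1, p, size=y.size)

    tpr = y_hat[y == 1].mean()   # ~ 0.25
    fpr = y_hat[y == 0].mean()   # ~ 0.25, i.e. a point on the diagonal
    print(tpr, fpr)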

  4. LABELS: ROC SPACE. In practice, we should never obtain a classifier below the diagonal: inverting its predicted labels (0 → 1 and 1 → 0) maps the point (FPR, TPR) to (1 - FPR, 1 - TPR) and thus moves the classifier above the diagonal. [Figure: classifier C1 below the diagonal and classifier C2 above it.]
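A rough sketch with simulated data (the labels and the flip probability are made up) showing that inverting the predictions of a below-diagonal classifier moves it above the diagonal:

    import numpy as np
    from sklearn.metrics import confusion_matrix

    rng = np.random.default_rng(1)
    y = rng.binomial(1, 0.5, size=1_000)

    # A deliberately bad classifier: predicts the opposite of the truth 80% of the time.
    flip = rng.random(y.size) < 0.8
    y_hat = np.where(flip, 1 - y, y)

    def rates(y_true, y_pred):
        tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
        return tp / (tp + fn), fp / (fp + tn)   # (TPR, FPR)

    print(rates(y, y_hat))       # roughly (0.2, 0.8): below the diagonal
    print(rates(y, 1 - y_hat))   # roughly (0.8, 0.2): above the diagonal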

  5. LABEL DISTRIBUTION IN TPR AND FPR. TPR and FPR are insensitive to the class distribution: they are not affected by changes in the ratio n+/n- (at prediction time).
     Example 1 (proportion n+/n- = 1):
                        Actual Positive   Actual Negative
       Pred. Positive         40                25
       Pred. Negative         10                25
       MCE = 35/100, TPR = 0.8, FPR = 0.5
     Example 2 (proportion n+/n- = 2):
                        Actual Positive   Actual Negative
       Pred. Positive         80                25
       Pred. Negative         20                25
       MCE = 45/150 = 30/100, TPR = 0.8, FPR = 0.5
     Note: if the class proportions differ during training, the above is not true - the estimated posterior probabilities can change!
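Both confusion matrices can be checked directly; a minimal sketch (the helper name is illustrative):

    def rates(tp, fp, fn, tn):
        n = tp + fp + fn + tn
        mce = (fp + fn) / n        # misclassification error
        tpr = tp / (tp + fn)
        fpr = fp / (fp + tn)
        return mce, tpr, fpr

    # Example 1: n+/n- = 1
    print(rates(tp=40, fp=25, fn=10, tn=25))   # (0.35, 0.8, 0.5)
    # Example 2: n+/n- = 2, positives doubled
    print(rates(tp=80, fp=25, fn=20, tn=25))   # (0.30, 0.8, 0.5) - MCE changes, TPR/FPR do not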

  6. FROM PROBABILITIES TO LABELS: ROC CURVE. Remember: both probabilistic and scoring classifiers can output class labels by thresholding: h(x) := [π(x) ≥ c] or h(x) := [f(x) ≥ c]. To draw a ROC curve, iterate through all possible thresholds c → visual inspection of all possible thresholds / results. [Figure: ROC curve with True Positive Rate on the y-axis and False Positive Rate on the x-axis.]
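In scikit-learn this threshold sweep is performed by roc_curve, which returns one (FPR, TPR) point per relevant threshold; a minimal sketch with hypothetical scores:

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.metrics import roc_curve

    rng = np.random.default_rng(0)

    # Hypothetical labels and scores, just to exercise the API.
    y = rng.binomial(1, 0.5, size=200)
    scores = np.clip(0.35 + 0.3 * y + rng.normal(0, 0.2, size=y.size), 0, 1)

    fpr, tpr, thresholds = roc_curve(y, scores)

    plt.plot(fpr, tpr)
    plt.xlabel("False Positive Rate")
    plt.ylabel("True Positive Rate")
    plt.show()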

  7. ROC CURVE. Example data, sorted by decreasing score:
       #   Truth  Score
       1   Pos    0.95
       2   Pos    0.86
       3   Pos    0.69
       4   Neg    0.65
       5   Pos    0.59
       6   Neg    0.52
       7   Pos    0.51
       8   Neg    0.39
       9   Neg    0.28
      10   Neg    0.18
      11   Pos    0.15
      12   Neg    0.06
     Threshold c = 0.9 → TPR = 0.167, FPR = 0. [Figure: the corresponding point in ROC space.]

  8. ROC CURVE (same data as slide 7). c = 0.85 → TPR = 0.333, FPR = 0.

  9. ROC CURVE (same data). c = 0.66 → TPR = 0.5, FPR = 0.

  10. ROC CURVE (same data). c = 0.6 → TPR = 0.5, FPR = 0.167.

  11. ROC CURVE (same data). c = 0.55 → TPR = 0.667, FPR = 0.167.

  12. ROC CURVE (same data). c = 0.3 → TPR = 0.833, FPR = 0.5.

  13. ROC CURVE (same data). [Figure: the complete ROC curve traced through all thresholds.]
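The walk-through on slides 7-13 can be reproduced with a few lines of Python (a sketch; the data are the 12 observations from the table on slide 7):

    import numpy as np

    # 1 = Pos, 0 = Neg, already sorted by decreasing score.
    truth  = np.array([1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0])
    scores = np.array([0.95, 0.86, 0.69, 0.65, 0.59, 0.52, 0.51,
                       0.39, 0.28, 0.18, 0.15, 0.06])

    for c in [0.9, 0.85, 0.66, 0.6, 0.55, 0.3]:
        pred = scores >= c                               # threshold the scores
        tpr = (pred & (truth == 1)).sum() / (truth == 1).sum()
        fpr = (pred & (truth == 0)).sum() / (truth == 0).sum()
        print(f"c = {c:<4}  TPR = {tpr:.3f}  FPR = {fpr:.3f}")
    # Reproduces the points above, e.g. c = 0.9 -> TPR = 0.167, FPR = 0.000.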

  14. ROC CURVE. The closer the curve is to the top-left corner, the better. If ROC curves cross, a different model can be better in different parts of the ROC space. [Figure: ROC curves of four models labelled "very good", "ok1", "ok2", and "bad".]

  15. AUC: AREA UNDER ROC CURVE. The AUC (in [0, 1]) is a single metric to evaluate scoring classifiers. AUC = 1: perfect classifier. AUC = 0.5: the scores order the observations randomly. [Figure: ROC curve with the area underneath it shaded.]

  16. AUC: AREA UNDER ROC CURVE. Interpretation: the AUC is the probability that the classifier ranks a randomly chosen positive observation higher than a randomly chosen negative observation.
       Truth  Score
         1    0.9
         1    0.76
         1    0.76
         1    0.7
         0    0.5
         1    0.45
         0    0.3
         0    0.3
         0    0.1
     Choose a random positive and a random negative: with AUC = 0.9167, the classifier ranks the positive higher than the negative with probability 0.9167.
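The ranking interpretation can be verified in code on the 12-observation example from the ROC curve slides above (a sketch; the pairwise count matches the trapezoidal AUC computed by scikit-learn):

    import numpy as np
    from sklearn.metrics import roc_auc_score

    truth  = np.array([1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0])
    scores = np.array([0.95, 0.86, 0.69, 0.65, 0.59, 0.52, 0.51,
                       0.39, 0.28, 0.18, 0.15, 0.06])

    pos = scores[truth == 1]
    neg = scores[truth == 0]

    # Fraction of (positive, negative) pairs where the positive scores higher
    # (ties would count one half) - this is exactly the AUC.
    diffs = pos[:, None] - neg[None, :]
    auc_rank = (np.sum(diffs > 0) + 0.5 * np.sum(diffs == 0)) / diffs.size

    print(auc_rank)                       # 28/36 ~ 0.778 for this example
    print(roc_auc_score(truth, scores))   # same value from the trapezoidal rule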

  17. PARTIAL AUC. Sometimes it can be useful to look only at a specific region under the ROC curve ⇒ partial AUC (pAUC). Examples: focus on a region with low FPR or on a region with high TPR. [Figure: two ROC curves with shaded regions (partial AUC 0.086 and 0.128), one focusing on low FPR and one on high TPR.]
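A minimal sketch of a partial AUC restricted to low FPR values (the function name and the cut-off are illustrative; the data are the 12 observations from the ROC curve slides). Note that scikit-learn's roc_auc_score with the max_fpr argument returns a standardized version of this quantity rather than the raw area:

    import numpy as np
    from sklearn.metrics import roc_curve

    def partial_auc(y_true, scores, fpr_max=0.2):
        # Raw (unstandardized) area under the ROC curve for FPR in [0, fpr_max].
        fpr, tpr, _ = roc_curve(y_true, scores)
        keep = fpr <= fpr_max
        # Add a point exactly at fpr_max so the region has a clean right edge.
        fpr_cut = np.append(fpr[keep], fpr_max)
        tpr_cut = np.append(tpr[keep], np.interp(fpr_max, fpr, tpr))
        return np.trapz(tpr_cut, fpr_cut)

    truth  = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0]
    scores = [0.95, 0.86, 0.69, 0.65, 0.59, 0.52, 0.51, 0.39, 0.28, 0.18, 0.15, 0.06]

    print(partial_auc(truth, scores, fpr_max=0.2))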
