
Visualization for Classification: ROC, AUC, Confusion Matrix



  1. http://poloclub.gatech.edu/cse6242
 CSE6242 / CX4242: Data & Visual Analytics
 Visualization for Classification
 ROC, AUC, Confusion Matrix
 Duen Horng (Polo) Chau
 Assistant Professor
 Associate Director, MS Analytics
 Georgia Tech
 Partly based on materials by Professors Guy Lebanon, Jeffrey Heer, John Stasko, Christos Faloutsos, Parishit Ram (GT PhD alum; SkyTree), Alex Gray

  2. Visualizing Classification Performance
 Confusion matrix
 https://en.wikipedia.org/wiki/Confusion_matrix
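
A confusion matrix counts, for each actual class, how often each class was predicted. The sketch below is a minimal illustration in plain Python; the labels are made up for the example, and the function name `confusion_matrix` is a hypothetical helper, not something from the slides.

```python
from collections import Counter

def confusion_matrix(actual, predicted, classes):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]

# Hypothetical labels, invented for illustration only.
actual    = ["cat", "cat", "cat", "dog", "dog", "bird", "bird", "bird"]
predicted = ["cat", "dog", "cat", "dog", "dog", "bird", "cat", "bird"]
classes = ["bird", "cat", "dog"]

matrix = confusion_matrix(actual, predicted, classes)
for name, row in zip(classes, matrix):
    print(f"{name:>5}", row)
```

Correct predictions land on the diagonal; every off-diagonal cell names a specific kind of mistake (here, one bird called a cat and one cat called a dog).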

  3. Hard to spot trends and patterns
 Easier
 http://research.microsoft.com/en-us/um/redmond/groups/cue/publications/CHI2009-EnsembleMatrix.pdf

  4. Very important:
 Find out what “positive” means
 https://en.wikipedia.org/wiki/Confusion_matrix

  5. Very important:
 Find out what “positive” means
 https://en.wikipedia.org/wiki/Confusion_matrix
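
To see why the meaning of “positive” matters, the sketch below counts true/false positives and negatives for the same predictions twice, once with each class treated as positive. The data and the helper `binary_counts` are hypothetical, chosen only to make the effect visible.

```python
def binary_counts(actual, predicted, positive):
    """Count TP, FP, FN, TN with `positive` treated as the positive class."""
    pairs = list(zip(actual, predicted))
    return {
        "TP": sum(a == positive and p == positive for a, p in pairs),
        "FP": sum(a != positive and p == positive for a, p in pairs),
        "FN": sum(a == positive and p != positive for a, p in pairs),
        "TN": sum(a != positive and p != positive for a, p in pairs),
    }

def precision(c):
    return c["TP"] / (c["TP"] + c["FP"])

# Hypothetical data: 3 malware files, 7 benign files.
actual    = ["malware"] * 3 + ["benign"] * 7
predicted = ["malware", "malware", "benign", "malware"] + ["benign"] * 6

as_malware = binary_counts(actual, predicted, positive="malware")
as_benign = binary_counts(actual, predicted, positive="benign")

print("positive = malware:", as_malware, "precision", round(precision(as_malware), 2))
print("positive = benign: ", as_benign, "precision", round(precision(as_benign), 2))
```

The predictions are identical in both runs, yet the counts and the precision differ, so a reported metric is only interpretable once the positive class is known.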

  6. Visualizing Classification Performance 
 using ROC curve 
 (Receiver Operating Characteristic)

  7. Polonium’s ROC Curve
 Positive class: malware
 Negative class: benign
 Ideal
 85% True Positive Rate, 1% False Alarms
 True Positive Rate: % of bad correctly labeled
 False Positive Rate (False Alarms): % of good labeled as bad
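
An ROC curve can be traced by sweeping a decision threshold over the classifier's scores and recording the false positive rate and true positive rate at each step. The following is a minimal sketch under assumed inputs: the scores, labels, and the false-alarm budget `max_fpr` are invented for illustration and are not Polonium's data.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs, from the strictest threshold to the loosest.

    labels: 1 for the positive class (e.g. malware), 0 for the negative class.
    scores: higher means "more likely positive".
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

# Hypothetical scores for 4 positive and 4 negative examples.
labels = [1, 1, 0, 1, 0, 1, 0, 0]
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.10]

points = roc_points(labels, scores)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")

# Best true positive rate achievable within a false-alarm budget.
max_fpr = 0.25
best_tpr = max(tpr for fpr, tpr in points if fpr <= max_fpr)
print("best TPR with FPR <=", max_fpr, "is", best_tpr)
```

Reading off a single operating point, such as the true positive rate at a fixed false-alarm rate, is how a statement like the one on the slide is obtained from the curve.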

  8. Measuring Classification Performance
 using AUC (Area under the curve)
 Ideal
 85% True Positive Rate, 1% False Alarms
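
AUC summarizes a whole ROC curve as a single number: the area beneath it. A rough sketch using the trapezoid rule over (FPR, TPR) points is shown below; the point lists are illustrative, and `auc_trapezoid` is a hypothetical helper name.

```python
def auc_trapezoid(points):
    """Area under a curve given as (FPR, TPR) points, via the trapezoid rule."""
    pts = sorted(points)  # order by FPR, then TPR
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

ideal = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]   # hugs the top-left corner
chance = [(0.0, 0.0), (1.0, 1.0)]              # diagonal: random guessing
example = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]

print("ideal  :", auc_trapezoid(ideal))
print("chance :", auc_trapezoid(chance))
print("example:", auc_trapezoid(example))
```

An ideal classifier reaches an area of 1.0, while random guessing sits at 0.5, which is why AUC values are usually read against those two reference points.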

  9. If a machine learning algorithm achieves 0.9 AUC (out of 1.0),
 that’s a great algorithm, right?

  10. Be Careful with AUC!
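
One possible reason for caution (an illustration chosen here, not necessarily the one shown on the slide) is that two classifiers can share the same AUC while behaving very differently at low false-alarm rates. The sketch below uses made-up rankings; AUC is computed as the fraction of (positive, negative) pairs in which the positive example scores higher.

```python
def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def tpr_with_no_false_alarms(labels, scores):
    """Share of positives scoring above every negative (TPR at FPR = 0)."""
    top_neg = max(s for y, s in zip(labels, scores) if y == 0)
    pos = [s for y, s in zip(labels, scores) if y == 1]
    return sum(s > top_neg for s in pos) / len(pos)

# Hypothetical rankings: the same scores, two different label orderings.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels_a = [1, 1, 0, 0, 1, 1, 0, 0]  # classifier A: two positives ranked first
labels_b = [0, 1, 1, 1, 1, 0, 0, 0]  # classifier B: a negative ranked first

for name, labels in [("A", labels_a), ("B", labels_b)]:
    print(name, "AUC =", auc(labels, scores),
          "TPR at zero false alarms =", tpr_with_no_false_alarms(labels, scores))
```

Both rankings have the same AUC, yet A catches half of the positives before its first false alarm and B catches none. When false alarms are costly, the shape of the curve near the left edge can matter more than the total area.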

  11. Weights in combined models
 Bagging / Random forests
 • Majority voting
 Let people play with the weights?
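
The difference between plain majority voting and a weighted combination can be sketched as follows. This is an assumed, simplified setup: three hypothetical classifiers, made-up predictions, and a helper `weighted_vote` invented for the example; it is not the method of any particular system.

```python
def weighted_vote(votes, weights):
    """Return 1 if the weighted share of 1-votes exceeds one half, else 0."""
    share = sum(w for v, w in zip(votes, weights) if v == 1) / sum(weights)
    return 1 if share > 0.5 else 0

# Hypothetical 0/1 predictions of three classifiers on five examples.
predictions = [
    [1, 0, 1, 1, 0],  # classifier A
    [0, 0, 1, 0, 1],  # classifier B
    [0, 1, 1, 0, 1],  # classifier C
]
truth = [1, 0, 1, 1, 0]

def accuracy(weights):
    """Accuracy of the combined model for a given weight per classifier."""
    combined = [weighted_vote(votes, weights) for votes in zip(*predictions)]
    return sum(c == t for c, t in zip(combined, truth)) / len(truth)

print("majority voting (equal weights):", accuracy([1, 1, 1]))
print("more weight on classifier A:    ", accuracy([3, 1, 1]))
```

With equal weights every classifier counts the same; shifting weight toward the classifier that happens to be most reliable on this data changes the combined predictions, which is the knob a person could adjust interactively.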

  12. EnsembleMatrix
 http://research.microsoft.com/en-us/um/redmond/groups/cue/publications/CHI2009-EnsembleMatrix.pdf

  13. Improving performance
 • Adjust the weights of the individual classifiers
 • Data partition to separate problem areas
   o Adjust weights just for these individual parts
 • State-of-the-art performance, on one dataset
 http://research.microsoft.com/en-us/um/redmond/groups/cue/publications/CHI2009-EnsembleMatrix.pdf
