Visualization for Classification: ROC, AUC, Confusion Matrix


SLIDE 1

http://poloclub.gatech.edu/cse6242


CSE6242 / CX4242: Data & Visual Analytics


Visualization for Classification


ROC, AUC, Confusion Matrix

Duen Horng (Polo) Chau


Assistant Professor
 Associate Director, MS Analytics
 Georgia Tech

Partly based on materials by Professors Guy Lebanon, Jeffrey Heer, John Stasko, Christos Faloutsos, Parishit Ram (GT PhD alum; SkyTree), and Alex Gray.

SLIDE 2

Visualizing Classification Performance

Confusion matrix


https://en.wikipedia.org/wiki/Confusion_matrix
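The slide defers to Wikipedia for the cell definitions. As a concrete sketch (the function name and the malware/benign labels are illustrative, not from the slide), the four cells of a binary confusion matrix can be counted directly:

```python
from collections import Counter

def confusion_matrix(y_true, y_pred, positive="malware"):
    """Count TP, FP, FN, TN for a binary classifier.

    `positive` names the class treated as "positive"; here we assume
    a malware-vs-benign task like the Polonium example later on.
    """
    counts = Counter()
    for t, p in zip(y_true, y_pred):
        if p == positive:
            counts["TP" if t == positive else "FP"] += 1
        else:
            counts["FN" if t == positive else "TN"] += 1
    return counts

y_true = ["malware", "malware", "benign", "benign", "benign"]
y_pred = ["malware", "benign", "malware", "benign", "benign"]
print(confusion_matrix(y_true, y_pred))
# Counter({'TN': 2, 'TP': 1, 'FP': 1, 'FN': 1})
```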

SLIDE 3

http://research.microsoft.com/en-us/um/redmond/groups/cue/publications/CHI2009-EnsembleMatrix.pdf


A raw table of numbers is hard to spot trends and patterns in; a visual (color-coded) matrix makes them easier to see.

SLIDE 4

Very important: 
 Find out what “positive” means


https://en.wikipedia.org/wiki/Confusion_matrix
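Why this matters: every derived rate depends on which class is called "positive". A small sketch (hypothetical sick/well labels, not from the slide) showing that TPR and FPR change when the positive class is swapped:

```python
def rates(y_true, y_pred, positive):
    """Return (TPR, FPR) for the given choice of positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return tp / (tp + fn), fp / (fp + tn)

y_true = ["sick", "sick", "well", "well", "well", "well"]
y_pred = ["sick", "well", "sick", "well", "well", "well"]
print(rates(y_true, y_pred, positive="sick"))  # (0.5, 0.25)
print(rates(y_true, y_pred, positive="well"))  # (0.75, 0.5) -- different numbers!
```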


SLIDE 6

Visualizing Classification Performance
 using ROC curve (Receiver Operating Characteristic)

SLIDE 7

Polonium’s ROC Curve

Positive class: malware
 Negative class: benign

[ROC plot: Polonium achieves 85% True Positive Rate at 1% False Alarms; "Ideal" marks the top-left corner. Axes: True Positive Rate (% of bad correctly labeled) vs. False Positive Rate, i.e. False Alarms (% of good labeled as bad).]
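An ROC curve is traced by sweeping the decision threshold over the classifier's scores: each threshold yields one (FPR, TPR) point. A minimal sketch (ignoring score ties for simplicity; not Polonium's actual code):

```python
def roc_points(scores, labels):
    """(FPR, TPR) pairs as the decision threshold sweeps over the scores.

    labels: 1 = positive (e.g. malware), 0 = negative (benign).
    """
    pos = sum(labels)
    neg = len(labels) - pos
    # Sort by score descending; each step lowers the threshold past one example.
    order = sorted(zip(scores, labels), reverse=True)
    tp = fp = 0
    pts = [(0.0, 0.0)]
    for _, label in order:
        if label == 1:
            tp += 1
        else:
            fp += 1
        pts.append((fp / neg, tp / pos))
    return pts

# A perfectly separating scorer passes through the ideal corner (0, 1):
print(roc_points([0.9, 0.8, 0.4, 0.3], [1, 1, 0, 0]))
# [(0.0, 0.0), (0.0, 0.5), (0.0, 1.0), (0.5, 1.0), (1.0, 1.0)]
```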

SLIDE 8

Measuring Classification Performance
 using AUC (Area under the curve)

[Same Polonium ROC plot as before (85% True Positive Rate at 1% False Alarms; "Ideal" corner); AUC measures the area under this curve.]
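AUC also has a useful probabilistic reading: it is the probability that a randomly chosen positive example is scored above a randomly chosen negative one (ties counting half). A sketch of that pair-counting view, which for tie-free scores equals the area under the ROC curve:

```python
def auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    # Each positive/negative pair contributes 1 for a win, 0.5 for a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc([0.9, 0.8, 0.4, 0.3], [1, 1, 0, 0]))  # 1.0 (perfect ranking)
print(auc([0.9, 0.2, 0.8, 0.3], [1, 0, 0, 1]))  # 0.75
```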

SLIDE 9

If a machine learning algorithm achieves 0.9 AUC (out of 1.0), that's a great algorithm, right?


SLIDE 10

Be Careful with AUC!
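One reason for caution: AUC, and even a low false-alarm rate, say nothing about class imbalance. With hypothetical counts echoing the Polonium operating point (85% TPR, 1% FPR; the counts below are chosen for illustration, not from the slides), most alarms can still be false:

```python
# Hypothetical counts under heavy class imbalance:
n_bad, n_good = 1_000, 1_000_000
tpr, fpr = 0.85, 0.01          # 85% true positive rate, 1% false alarms

tp = tpr * n_bad               # 850 malware instances caught
fp = fpr * n_good              # 10,000 benign instances flagged
precision = tp / (tp + fp)
print(f"precision = {precision:.3f}")  # ~0.078: >90% of alarms are false
```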


SLIDE 11

Weights in combined models

  • Bagging / Random forests
  • Majority voting

Let people play with the weights?
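The majority-voting bullet can be made concrete. A minimal weighted-vote sketch (function name and spam/ham labels are illustrative, not from the slide):

```python
def weighted_vote(predictions, weights):
    """Combine classifier predictions by weighted majority vote.

    predictions: one label per classifier; weights: matching floats.
    """
    totals = {}
    for label, w in zip(predictions, weights):
        totals[label] = totals.get(label, 0.0) + w
    return max(totals, key=totals.get)

# Three classifiers disagree; the weights break the tie.
print(weighted_vote(["spam", "ham", "ham"], [0.6, 0.25, 0.15]))  # "spam"
```

With uniform weights this reduces to plain majority voting; letting people adjust the weights is exactly the interaction EnsembleMatrix explores next.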


SLIDE 12

EnsembleMatrix

http://research.microsoft.com/en-us/um/redmond/groups/cue/publications/CHI2009-EnsembleMatrix.pdf


SLIDE 13

Improving performance

  • Adjust the weights of the individual classifiers
  • Partition the data to separate problem areas
  • Adjust weights just for those individual parts
  • State-of-the-art performance, on one dataset

http://research.microsoft.com/en-us/um/redmond/groups/cue/publications/CHI2009-EnsembleMatrix.pdf
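EnsembleMatrix lets the user adjust classifier weights interactively; the same search can also be done programmatically. A toy sketch that sweeps the blend weight between two scoring classifiers and keeps the most accurate one (hypothetical data and names; this is a grid search, not EnsembleMatrix's interaction):

```python
def best_weight(y_true, scores_a, scores_b, threshold=0.5):
    """Try blend weights w in {0.0, 0.1, ..., 1.0} for the combined
    score w*a + (1-w)*b; return (w, accuracy) with highest accuracy."""
    best_w, best_acc = 0.0, -1.0
    for step in range(11):
        w = step / 10
        preds = [1 if w * a + (1 - w) * b >= threshold else 0
                 for a, b in zip(scores_a, scores_b)]
        acc = sum(p == t for p, t in zip(preds, y_true)) / len(y_true)
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc

# Classifier A is informative, classifier B is anti-correlated:
y_true   = [1, 0, 1, 0]
scores_a = [0.9, 0.1, 0.8, 0.2]
scores_b = [0.4, 0.6, 0.3, 0.7]
print(best_weight(y_true, scores_a, scores_b))  # (0.5, 1.0)
```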
