Visualization for Classification: ROC, AUC, Confusion Matrix (PowerPoint presentation by Mahdi Roozbahani)



SLIDE 1

Class Website

CX4242:

Visualization for Classification

ROC, AUC, Confusion Matrix

Mahdi Roozbahani, Lecturer, Computational Science and Engineering, Georgia Tech

SLIDE 2

Visualizing Classification Performance

Confusion matrix


https://en.wikipedia.org/wiki/Confusion_matrix

SLIDE 3

http://research.microsoft.com/en-us/um/redmond/groups/cue/publications/CHI2009-EnsembleMatrix.pdf

Raw tables of numbers make it hard to spot trends and patterns; visualizing the confusion matrix makes it much easier!

SLIDE 4

Very important: Find out what “positive” means


                 Predicted
                 Cat   Dog
Actual   Cat      5     3
         Dog      2     4
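The matrix above can be rebuilt with a short sketch; the label lists below are hypothetical examples chosen only to reproduce the slide's counts.

```python
# A minimal sketch of building the cat/dog confusion matrix from
# (hypothetical) actual and predicted label lists.
actual    = ["cat"] * 8 + ["dog"] * 6          # 8 actual cats, 6 actual dogs
predicted = (["cat"] * 5 + ["dog"] * 3 +       # 5 cats right, 3 cats called dog
             ["cat"] * 2 + ["dog"] * 4)        # 2 dogs called cat, 4 dogs right

labels = ["cat", "dog"]
matrix = {a: {p: 0 for p in labels} for a in labels}   # rows = actual, cols = predicted
for a, p in zip(actual, predicted):
    matrix[a][p] += 1

print(matrix)
# {'cat': {'cat': 5, 'dog': 3}, 'dog': {'cat': 2, 'dog': 4}}
```

Each row sums to the number of actual examples of that class, matching the table above.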

SLIDE 5

Very important: Find out what “positive” means


https://en.wikipedia.org/wiki/Confusion_matrix

“False Alarm” (false positive) is easy to remember in security applications.
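One way to see why the choice of “positive” matters: the same cat/dog matrix yields different TP/FP/FN/TN counts depending on which class is declared positive. A minimal sketch (the `binarize` helper is hypothetical, not from the slides):

```python
# Sketch: the same cat/dog matrix gives different TP/FP/FN/TN
# depending on which class you declare "positive".
counts = {("cat", "cat"): 5, ("cat", "dog"): 3,
          ("dog", "cat"): 2, ("dog", "dog"): 4}   # (actual, predicted) -> count

def binarize(counts, positive):
    """Collapse a multi-class count table into (TP, FP, FN, TN) for one positive class."""
    tp = sum(n for (a, p), n in counts.items() if a == positive and p == positive)
    fp = sum(n for (a, p), n in counts.items() if a != positive and p == positive)
    fn = sum(n for (a, p), n in counts.items() if a == positive and p != positive)
    tn = sum(n for (a, p), n in counts.items() if a != positive and p != positive)
    return tp, fp, fn, tn

print(binarize(counts, "cat"))   # (5, 2, 3, 4)
print(binarize(counts, "dog"))   # (4, 3, 2, 5)
```

Swapping the positive class swaps the roles of the off-diagonal cells, which is exactly why you must find out what “positive” means before reading any derived rate.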

SLIDE 6

Visualizing Classification Performance using ROC curve

(Receiver Operating Characteristic)

SLIDE 7

Polonium’s ROC Curve

Positive class: malware; negative class: benign

Polonium achieves an 85% true positive rate at only a 1% false positive rate, close to the ideal top-left corner of the ROC plot.

The two axes of an ROC curve:

  • True Positive Rate: % of bad correctly labeled as bad
  • False Positive Rate (“false alarms”): % of good labeled as bad
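Both rates come straight from confusion-matrix counts. A minimal sketch, with hypothetical counts chosen to match the 85% / 1% numbers above:

```python
# Sketch of the two ROC axes from raw confusion-matrix counts
# (tp, fn, fp, tn are hypothetical example counts).
tp, fn = 85, 15    # bad (positive) items: correctly vs incorrectly labeled
fp, tn = 1, 99     # good (negative) items: false alarms vs correct rejections

tpr = tp / (tp + fn)   # True Positive Rate: % of bad correctly labeled as bad
fpr = fp / (fp + tn)   # False Positive Rate: % of good labeled as bad

print(f"TPR = {tpr:.0%}, FPR = {fpr:.0%}")   # TPR = 85%, FPR = 1%
```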

SLIDE 8

Measuring Classification Performance using AUC (Area under the ROC curve)

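The AUC can be computed by sweeping the classification threshold across the scores and integrating the resulting (FPR, TPR) points. A minimal pure-Python sketch with made-up scores and labels:

```python
# Minimal sketch: build an ROC curve by sweeping the score threshold,
# then compute AUC with the trapezoid rule. Scores/labels are made up.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3]
labels = [1,   1,   0,   1,   1,    0,   0,   0]   # 1 = positive (e.g., malware)

pos = sum(labels)
neg = len(labels) - pos

# Sort by descending score; each step down the list lowers the threshold,
# turning one more example into a predicted positive.
points = [(0.0, 0.0)]
tp = fp = 0
for _, y in sorted(zip(scores, labels), reverse=True):
    if y == 1:
        tp += 1
    else:
        fp += 1
    points.append((fp / neg, tp / pos))   # (FPR, TPR)

# Trapezoid rule over the (FPR, TPR) points gives the area under the curve.
auc = sum((x2 - x1) * (y1 + y2) / 2
          for (x1, y1), (x2, y2) in zip(points, points[1:]))
print(round(auc, 3))   # 0.875 for this made-up data; 1.0 would be ideal
```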

SLIDE 9

If a machine learning algorithm achieves 0.9 AUC (out of 1.0), that’s a great algorithm, right?


SLIDE 10

Be Careful with AUC!

A single AUC number can hide important behavior: on highly imbalanced data (such as malware detection), a classifier with 0.9 AUC may still produce far too many false alarms to be usable in practice.

SLIDE 11

Weights in combined models (bagging / random forests):

  • Majority voting

Let people play with the weights?
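A minimal sketch of what “playing with the weights” could mean for majority voting; the classifier outputs and weight values below are hypothetical:

```python
# Sketch of weighted majority voting over an ensemble of classifiers;
# the per-classifier predictions and weights are hypothetical examples.
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Return the class whose supporting classifiers have the most total weight."""
    tally = defaultdict(float)
    for label, w in zip(predictions, weights):
        tally[label] += w
    return max(tally, key=tally.get)

# Three classifiers disagree; the weights decide the outcome.
preds   = ["cat", "dog", "dog"]
weights = [0.6,   0.25,  0.15]
print(weighted_vote(preds, weights))   # cat (0.6 beats 0.25 + 0.15)
```

With equal weights this reduces to plain majority voting; letting a user adjust the weights interactively is the idea EnsembleMatrix explores on the next slides.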

SLIDE 12

EnsembleMatrix

http://research.microsoft.com/en-us/um/redmond/groups/cue/publications/CHI2009-EnsembleMatrix.pdf


SLIDE 13

Improving performance

  • Adjust the weights of the individual classifiers
  • Partition the data to separate problem areas
  • Adjust weights just for those individual parts
  • Caveat: the evaluation used only one dataset

http://research.microsoft.com/en-us/um/redmond/groups/cue/publications/CHI2009-EnsembleMatrix.pdf
