Performance Metrics for Graph Mining Tasks
Outline:
– Introduction to Performance Metrics
– Supervised Learning Performance Metrics
– Unsupervised Learning Performance Metrics
– Optimizing Metrics
– Statistical Significance
A performance metric measures how well a data mining algorithm performs on a given dataset. For example, if we apply a classification algorithm to a dataset, we first check how many of the data points were classified correctly. This is a performance metric, and its formal name is "accuracy." Performance metrics also help us decide if one algorithm is better or worse than another. For example, suppose classification algorithm A classifies 80% of the data points correctly and classification algorithm B classifies 90% correctly. We immediately realize that algorithm B is doing better. There are some intricacies, however, that we will discuss in this chapter.
Outline: Supervised Learning Performance Metrics
A 2×2 matrix is used to tabulate the results of a 2-class supervised learning problem; entry (i, j) represents the number of elements with actual class label i that were predicted to have class label j. Here + and − are the two class labels.

                          Predicted Class
                          +                   −
Actual Class   +    True Positive       False Negative
               −    False Positive      True Negative
Results from a Classification Algorithm

Vertex ID   Actual Class   Predicted Class
1           +              +
2           +              +
3           +              +
4           +              +
5           +              −
(Vertices 6 to 8 have actual class −; two of them are predicted + and one is predicted −.)

Corresponding 2×2 matrix for the given table:

                     Predicted Class
                     +         −
Actual Class   +     4         1        C = 5
               −     2         1        D = 3
                   A = 6     B = 2      T = 8

We walk through the different metrics using this example.
Using the notation of the 2×2 matrix above (A = number predicted "+", C = number truly "+", T = total number of points):
– Accuracy = (True Positives + True Negatives) / T, the fraction of points classified correctly
– Error rate = (False Positives + False Negatives) / T
– Recall = True Positives / C, the fraction of points that are truly "+" that are also predicted as "+"
– Precision = True Positives / A, the fraction of points predicted as "+" that are truly "+"
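As a quick illustration, here is a minimal Python sketch that tabulates the 2×2 matrix and computes these metrics for the eight-vertex example above. The predictions for vertices 6 to 8 are assumed values chosen only to be consistent with the matrix on the previous slide.

```python
# Minimal sketch of the 2x2 confusion-matrix metrics above.
# Vertices 6-8 below are assumed assignments consistent with the matrix shown.
actual    = ["+", "+", "+", "+", "+", "-", "-", "-"]
predicted = ["+", "+", "+", "+", "-", "+", "+", "-"]

# Tabulate the 2x2 confusion matrix: entry (i, j) = actual i, predicted j
f11 = sum(a == "+" and p == "+" for a, p in zip(actual, predicted))  # true positives
f10 = sum(a == "+" and p == "-" for a, p in zip(actual, predicted))  # false negatives
f01 = sum(a == "-" and p == "+" for a, p in zip(actual, predicted))  # false positives
f00 = sum(a == "-" and p == "-" for a, p in zip(actual, predicted))  # true negatives

T = f11 + f10 + f01 + f00      # total number of points (8)
A, C = f11 + f01, f11 + f10    # predicted-"+" column sum, actual-"+" row sum

print("accuracy :", (f11 + f00) / T)   # 5/8 = 0.625
print("error    :", (f10 + f01) / T)   # 3/8 = 0.375
print("recall   :", f11 / C)           # 4/5 = 0.8
print("precision:", f11 / A)           # 4/6 ~ 0.67
```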
An n×n matrix, where n is the number of classes, and entry (i, j) represents the number of elements with class label i but predicted to have class label j
                          Predicted Class
                 Class 1   Class 2   Class 3   Marginal Sum of Actuals
Actual  Class 1     2         1         1                4
Class   Class 2     1         2         1                4
        Class 3     1         2         3                6
Marginal Sum
of Predictions      4         5         5             T = 14
2×2 Matrix Specific to Class 1

                              Predicted Class
                         Class 1 (+)   Not Class 1 (−)
Actual  Class 1 (+)           2               2          C = 4
Class   Not Class 1 (−)       2               8          D = 10
                            A = 4           B = 10       T = 14

Accuracy = (2 + 8)/14 = 10/14; Error rate = (2 + 2)/14 = 4/14; Recall = 2/4; Precision = 2/4
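The same numbers can be reproduced programmatically. The sketch below (plain NumPy) collapses the 3×3 confusion matrix above into the 2×2 matrix specific to Class 1 and recomputes the four metrics.

```python
import numpy as np

# Collapse the 3x3 confusion matrix into the 2x2 matrix specific to Class 1.
M = np.array([[2, 1, 1],
              [1, 2, 1],
              [1, 2, 3]])   # rows = actual class, columns = predicted class

k = 0                                         # index of the class of interest (Class 1)
tp = M[k, k]                                  # actual 1, predicted 1          -> 2
fn = M[k, :].sum() - tp                       # actual 1, predicted not-1      -> 2
fp = M[:, k].sum() - tp                       # actual not-1, predicted 1      -> 2
tn = M.sum() - tp - fn - fp                   # actual not-1, predicted not-1  -> 8
T = M.sum()                                   # 14

print("accuracy :", (tp + tn) / T)   # 10/14
print("error    :", (fp + fn) / T)   # 4/14
print("recall   :", tp / (tp + fn))  # 2/4
print("precision:", tp / (tp + fp))  # 2/4
```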
For a class L, the F-measure is the ratio of twice the number of vertices correctly predicted as L to the sum of the number of vertices that truly belong to L and the number of vertices predicted as L.
The bias for class L is the ratio of the number of points predicted as L to the number of points that truly belong to L. Bias helps understand whether a model is over- or under-predicting a class.
ROC (Receiver Operating Characteristic) curves: metrics that are plotted on a graph to obtain a visual picture of the performance of two-class classifiers.
[ROC plot: True Positive Rate (y-axis) vs. False Positive Rate (x-axis), each ranging from 0 to 1]
– (0, 1) is the ideal point
– (0, 0): predicts the −ve class all the time
– (1, 1): predicts the +ve class all the time
– The diagonal corresponds to random guessing (AUC = 0.5)
Plot the performance of multiple models on the same axes to decide which one performs best.
[ROC plot divided into regions relative to the AUC = 0.5 diagonal]
– Models that lie below the diagonal perform worse than random. Note: such models can be negated to move them above the diagonal.
– Models that lie in the upper left have good performance. Note: this is where you aim to get the model.
– Models that lie in the lower left are conservative: they will not predict "+" unless there is strong evidence, so they have low false positives but high false negatives.
– Models that lie in the upper right are liberal: they will predict "+" with little evidence, so they have high false positives.
[ROC plot with three models: M1 at (FPR, TPR) = (0.1, 0.8), M2 at (0.5, 0.5), M3 at (0.3, 0.5)]
M1's performance lies furthest in the upper-left direction, closest to the ideal point (0, 1), and hence M1 is considered the best model.
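A minimal sketch of this comparison: ranking the three models by the Euclidean distance of their (FPR, TPR) point from the ideal corner (0, 1). The distance-to-ideal heuristic is an assumption used here for illustration, not a criterion taken from the slides.

```python
import math

# Rank models by how close their ROC point is to the ideal corner (0, 1);
# smaller distance is better (a simple illustrative heuristic).
models = {"M1": (0.1, 0.8), "M2": (0.5, 0.5), "M3": (0.3, 0.5)}

for name, point in sorted(models.items(), key=lambda kv: math.dist(kv[1], (0.0, 1.0))):
    print(name, "distance to ideal:", round(math.dist(point, (0.0, 1.0)), 3))
# M1 is closest to (0, 1), matching the conclusion that M1 is the best model.
```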
Cross-validation, also called rotation estimation, is a way to analyze how a predictive data mining model will perform on an unknown dataset, i.e., how well the model generalizes. Strategy:
1. Divide the dataset into two non-overlapping subsets.
2. One subset is called the "test" set and the other the "training" set.
3. Build the model using the "training" set.
4. Obtain predictions for the "test" set.
5. Use the "test" set predictions to calculate all the performance metrics.
Typically cross-validation is performed for multiple iterations, selecting a different non-overlapping test and training set each time.
– Hold-out method: randomly select a portion of the data (e.g., 1/3rd) as the test set and the remaining 2/3rd as training.
– k-fold cross-validation: divide the data into k non-overlapping partitions, using one partition as the test set and the remaining k−1 partitions for training, and rotate so that each partition serves as the test set once.
Note: Selection of data points is typically done in a stratified manner, i.e., the class distribution in the test set is kept similar to that of the training set.
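A minimal k-fold cross-validation sketch (plain NumPy) is shown below; the dataset and the majority-class "model" are placeholders, and stratified selection of the folds is omitted for brevity.

```python
import numpy as np

# Minimal k-fold cross-validation sketch; any classifier could replace the
# placeholder "majority class" model used here.
rng = np.random.default_rng(0)
X = rng.normal(size=(90, 4))           # hypothetical feature matrix
y = rng.integers(0, 2, size=90)        # hypothetical binary labels
k = 3                                  # number of folds

indices = rng.permutation(len(X))      # shuffle once, then split into k folds
folds = np.array_split(indices, k)

accuracies = []
for i in range(k):
    test_idx = folds[i]                                                  # one fold as test
    train_idx = np.concatenate([folds[j] for j in range(k) if j != i])  # rest as training

    # Placeholder "model": predict the majority class of the training fold.
    majority = np.bincount(y[train_idx]).argmax()
    predictions = np.full(len(test_idx), majority)

    accuracies.append(np.mean(predictions == y[test_idx]))

print("per-fold accuracy:", [round(a, 2) for a in accuracies])
print("mean accuracy    :", round(float(np.mean(accuracies)), 2))
```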
Outline: Unsupervised Learning Performance Metrics
One way to test the effectiveness of an unsupervised learning method is to take a dataset D with known class labels, strip off the labels, and provide the set as input to the unsupervised learning algorithm U. The resulting clusters are then compared with the prior knowledge (the stripped labels) to judge the performance of U. To evaluate performance, a pairwise contingency table is constructed as follows.
                     Same Cluster    Different Cluster
Same Class                u11              u10
Different Class           u01              u00

(A) To fill the table, initialize u11, u01, u10, u00 to 0.
(B) Then, for each pair of points (v, w): increment u11 if v and w have the same class label and the same cluster, u10 if they have the same class label but different clusters, u01 if they have different class labels but the same cluster, and u00 if they have different class labels and different clusters.
Rand statistic: R = (u11 + u00) / (u11 + u10 + u01 + u00), where both placing a pair of points with the same class label in the same cluster and placing a pair of points with different class labels in different clusters are given equal importance, i.e., it accounts for both the specificity and sensitivity of the clustering.
Jaccard coefficient: J = u11 / (u11 + u10 + u01), used when placing a pair of points with the same class label in the same cluster is primarily important.
Example Matrix
                     Same Cluster    Different Cluster
Same Class                 9                4
Different Class            3               12
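From the definitions above, this example gives Rand = (9 + 12) / (9 + 4 + 3 + 12) = 21/28 = 0.75 and Jaccard = 9 / (9 + 4 + 3) = 9/16 ≈ 0.56. The sketch below shows how the u-counts and both indices can be computed from raw class labels and cluster assignments; the labels and clusters in it are made up purely for illustration.

```python
from itertools import combinations

# Fill the pairwise table from class labels and cluster assignments, then
# compute the Rand statistic and Jaccard coefficient defined above.
classes  = ["A", "A", "A", "B", "B", "B"]     # made-up class labels
clusters = [ 1 ,  1 ,  2 ,  2 ,  2 ,  2 ]     # made-up cluster assignments

u11 = u10 = u01 = u00 = 0
for v, w in combinations(range(len(classes)), 2):
    same_class   = classes[v]  == classes[w]
    same_cluster = clusters[v] == clusters[w]
    if   same_class and same_cluster:       u11 += 1
    elif same_class and not same_cluster:   u10 += 1
    elif same_cluster and not same_class:   u01 += 1
    else:                                   u00 += 1

rand    = (u11 + u00) / (u11 + u10 + u01 + u00)
jaccard = u11 / (u11 + u10 + u01)
print(u11, u10, u01, u00, round(rand, 2), round(jaccard, 2))

# For the example matrix above: Rand = (9 + 12) / 28 = 0.75,
# Jaccard = 9 / (9 + 4 + 3) = 0.5625.
```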
Given that the number of points is T, the ideal-matrix is a T×T matrix where cell (i, j) is 1 if points i and j belong to the same class and 0 if they belong to different classes. The observed-matrix is a T×T matrix where cell (i, j) is 1 if points i and j belong to the same cluster and 0 if they belong to different clusters. The clustering can then be evaluated by the correlation between the ideal and observed matrices, which have the same rank. The two matrices, in this case, are symmetric and, hence, it is sufficient to analyze the entries above (or below) the diagonal of each matrix.
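A minimal sketch of this matrix-based comparison, assuming the two matrices are compared via the correlation of their above-diagonal entries; the class labels and cluster assignments are made up for illustration.

```python
import numpy as np

# Build the ideal and observed T x T matrices, take the entries above the
# diagonal (both matrices are symmetric), and correlate them.
classes  = np.array(["A", "A", "A", "B", "B", "B"])
clusters = np.array([ 1,   1,   2,   2,   2,   2 ])

ideal    = (classes[:, None]  == classes[None, :]).astype(int)   # 1 if same class
observed = (clusters[:, None] == clusters[None, :]).astype(int)  # 1 if same cluster

iu = np.triu_indices(len(classes), k=1)          # indices above the diagonal
correlation = np.corrcoef(ideal[iu], observed[iu])[0, 1]
print("correlation between ideal and observed matrices:", round(correlation, 3))
```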
In the absence of prior knowledge we have to rely on information from the clusters themselves to evaluate performance.
1. Cohesion measures how closely related the objects in the same cluster are.
2. Separation measures how distinct or well-separated a cluster is from the other clusters.
Here, g_i refers to cluster i, W is the total number of clusters, x and y are data points, and proximity can be any similarity measure (e.g., cosine similarity):
Cohesion(g_i) = (1 / |g_i|²) · Σ_{x ∈ g_i} Σ_{y ∈ g_i} proximity(x, y)
Separation(g_i, g_j) = (1 / (|g_i| · |g_j|)) · Σ_{x ∈ g_i} Σ_{y ∈ g_j} proximity(x, y)
We want the cohesion to be close to 1 and the separation to be close to 0.
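A minimal sketch of cohesion and separation with cosine similarity as the proximity measure; the averaged form of the formulas and the two small clusters are assumptions made for illustration.

```python
import numpy as np

# Cohesion and separation with cosine similarity as the proximity measure.
def cosine(x, y):
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

g1 = np.array([[1.0, 0.1], [0.9, 0.2], [1.1, 0.0]])   # made-up cluster g1
g2 = np.array([[0.1, 1.0], [0.2, 0.9]])               # made-up cluster g2

def cohesion(g):
    # average proximity over all pairs of points within the same cluster
    return float(np.mean([cosine(x, y) for x in g for y in g]))

def separation(g_i, g_j):
    # average proximity between points of two different clusters
    return float(np.mean([cosine(x, y) for x in g_i for y in g_j]))

print("cohesion(g1)      :", round(cohesion(g1), 3))       # close to 1 is good
print("separation(g1, g2):", round(separation(g1, g2), 3)) # close to 0 is good
```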
Outline: Optimizing Metrics
The sum of squared errors (SSE) is typically used in clustering algorithms to measure the quality of the clusters obtained. It takes into consideration the distance between each point in a cluster and its cluster center (the centroid or some other chosen representative). For a point d_j in cluster g_i, where m_i is the cluster center of g_i and W is the total number of clusters, SSE is defined as follows:
SSE = Σ_{i=1}^{W} Σ_{d_j ∈ g_i} dist(d_j, m_i)²
This value is small when points are close to their cluster centers, indicating a good clustering; clustering algorithms therefore aim to minimize SSE.
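A minimal SSE sketch over made-up points and cluster assignments:

```python
import numpy as np

# SSE: sum of squared distances of each point to its cluster center,
# accumulated over all W clusters. Points and assignments are made up.
points = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],    # cluster 0
                   [5.0, 5.0], [5.1, 4.8]])               # cluster 1
assignment = np.array([0, 0, 0, 1, 1])

sse = 0.0
for i in np.unique(assignment):             # loop over the W clusters
    members = points[assignment == i]
    center = members.mean(axis=0)           # centroid m_i of cluster g_i
    sse += np.sum((members - center) ** 2)  # squared distances to the center
print("SSE:", round(float(sse), 4))
```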
Preserved variability is typically used in eigenvector-based dimension reduction techniques to quantify the variance preserved by the chosen dimensions. Given that the points are originally represented in r dimensions and k dimensions are retained (k << r), with eigenvalues λ1 ≥ λ2 ≥ ... ≥ λ(r−1) ≥ λr, the preserved variability (PV) is calculated as follows:
PV = (Σ_{i=1}^{k} λ_i) / (Σ_{i=1}^{r} λ_i)
The value of this parameter depends on the number of dimensions chosen: the more included, the higher the value. Choosing all the dimensions results in the perfect score of 1.
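A minimal preserved-variability sketch, assuming the eigenvalues come from the covariance matrix of the data (as in PCA); the data is made up.

```python
import numpy as np

# Preserved variability: keep the k largest eigenvalues of the covariance
# matrix and compute PV = (sum of top-k eigenvalues) / (sum of all eigenvalues).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ np.diag([3.0, 2.0, 1.0, 0.5, 0.1])  # r = 5 dims

eigenvalues = np.linalg.eigvalsh(np.cov(X, rowvar=False))  # ascending order
eigenvalues = eigenvalues[::-1]                            # sort descending

k = 2
pv = eigenvalues[:k].sum() / eigenvalues.sum()
print("preserved variability with k =", k, ":", round(float(pv), 3))
# Choosing all r dimensions gives PV = 1 by construction.
```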
Statistical significance of metrics. Scenario:
– We obtain, say, cohesion = 0.99 for clustering algorithm A. At first look, 0.99 feels like a very good score.
– However, it is possible that the underlying data is structured in such a way that you would get 0.99 no matter how you cluster the data.
– In that case, 0.99 is not very significant. One way to decide is by using statistical significance estimation.
We discuss the Monte Carlo procedure for this on the next slide.
The Monte Carlo procedure uses random sampling to assess whether a particular performance metric value we obtain could have been attained at random. For example, if we obtain a cohesion score of 0.99 for a cluster of size 5, we would be inclined to think that the cluster is very cohesive. However, this value could have resulted from the nature of the data and not from the algorithm. To test the significance of this 0.99 value we:
1. Sample N (usually 1000) random sets of size 5 from the dataset.
2. Compute the cohesion score of each random set.
3. Count the number R of random sets whose score is at least as good as the observed 0.99.
4. Estimate the p-value as R/N; a small p-value (e.g., below 0.05) indicates that the observed score is statistically significant.
Steps 1-4 are the Monte Carlo method for p-value estimation.
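A minimal sketch of steps 1-4 for the cohesion example; the dataset, the "observed" cluster, and the cohesion measure are all made up for illustration.

```python
import numpy as np

# Monte Carlo p-value estimation: sample N random sets of the same size,
# score each with the same metric, and report the fraction scoring at least
# as high as the observed value.
rng = np.random.default_rng(0)
data = rng.normal(size=(500, 3))
data /= np.linalg.norm(data, axis=1, keepdims=True)   # unit vectors for cosine

def cohesion(points):
    sims = points @ points.T                           # pairwise cosine similarities
    return float(sims.mean())

observed_cluster = data[:5]                            # a "cluster" of size 5
observed = cohesion(observed_cluster)

N = 1000
random_scores = np.array([cohesion(data[rng.choice(len(data), 5, replace=False)])
                          for _ in range(N)])
p_value = float(np.mean(random_scores >= observed))    # step 4: empirical p-value
print("observed cohesion:", round(observed, 3), " p-value:", round(p_value, 3))
```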
Metrics that compare the performance of different algorithms. Scenario:
1) Model 1 provides an accuracy of 70% and Model 2 provides a higher overall accuracy.
2) At first look Model 2 seems better; however, it could be that Model 1 predicts Class1 better than Model 2 does.
3) And Class1 is indeed more important than Class2 for our problem.
4) We can use model comparison methods to take this notion of "importance" into consideration when we pick one model over another.
Cost-based analysis is an important model comparison method, discussed in the next few slides.
In real-world applications, certain aspects of model performance are considered more important than others. For example, if a person with cancer is diagnosed as cancer-free, or vice versa, the prediction model should be especially penalized. This penalty can be introduced in the form of a cost matrix.
Cost Matrix
                           Predicted Class
                 +                                  −
Actual    +   c11 (associated with f11 or u11)   c10 (associated with f10 or u10)
Class     −   c01 (associated with f01 or u01)   c00 (associated with f00 or u00)

Given the cost and confusion matrices for a model M (below), the cost of model M is given as:
Cost(M) = c11·f11 + c10·f10 + c01·f01 + c00·f00
Cost Matrix                               Confusion Matrix
             Predicted Class                           Predicted Class
             +       −                                 +       −
Actual  +   c11     c10                  Actual  +    f11     f10
Class   −   c01     c00                  Class   −    f01     f00
This analysis is typically used to select one model when we have more than one choice, e.g., models obtained from different algorithms or from different parameter settings of the learning algorithms.
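A minimal sketch of the cost computation: element-wise multiply the cost matrix with each model's confusion matrix and sum. The cost matrix assumes only a false negative costs 100, mirroring the example that follows; confusion-matrix entries not shown in the original slides are illustrative.

```python
import numpy as np

# Cost-based model comparison: Cost(M) = sum over i, j of c_ij * f_ij.
cost = np.array([[0, 100],     # c11, c10  (only a false negative is penalized here)
                 [0,   0]])    # c01, c00

confusion = {
    "Mx": np.array([[4, 1],    # f11, f10
                    [1, 2]]),  # f01, f00  (bottom row illustrative)
    "My": np.array([[3, 2],
                    [1, 2]]),
}

costs = {name: int((cost * f).sum()) for name, f in confusion.items()}
print(costs)                                  # {'Mx': 100, 'My': 200}
print("best model:", min(costs, key=costs.get))
```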
Example: comparing two models, Mx and My.
Cost matrix: misclassifying an actual "+" as "−" (a false negative) costs 100; the other outcomes cost 0.
Confusion matrix of Mx: of the actual "+" points, 4 are predicted "+" and 1 is predicted "−" (one false negative).
Confusion matrix of My: of the actual "+" points, 3 are predicted "+" and 2 are predicted "−" (two false negatives).
Cost of Mx: 100. Cost of My: 200. Since C_Mx < C_My, purely based on the cost model, Mx is the better model.