A Quality Metric for Visualization of Clusters in Graphs

A Quality Metric for Visualization of Clusters in Graphs
Amyra Meidiana, Seok-Hee Hong, Peter Eades (University of Sydney, Australia)
Daniel Keim (University of Konstanz, Germany)

Motivation
● Clustering is an important task in graph analysis
● No metric exists that measures how faithfully a graph drawing displays the clustering structure of the graph
● Aim: define, implement and evaluate a quality metric quantifying how faithfully a graph drawing displays a graph’s clustering structure

Contribution
1. Design and implement a new clustering quality metric
2. Experiment 1: Validate the clustering quality metric through graph drawing deformation experiments
3. Experiment 2: Compare various graph drawing algorithms using the clustering quality metric

Clustering Quality Metric: Framework

Clustering Quality Metric: Details
● Geometric clustering C′: k-means clustering
● Clustering comparison metrics:
○ Adjusted Rand Index (ARI): measures clustering similarity based on the number of item pairs classified into the same cluster in both clusterings and into different clusters in both clusterings
○ Adjusted Mutual Information (AMI): measures how much information about one clustering can be gained from the other
○ Fowlkes-Mallows Index (FMI): measures the similarity of C′ to C using the number of true positives, false positives, and false negatives
○ Completeness (CMP): the extent to which all members of a cluster in C are assigned to the same cluster in C′
○ Homogeneity (HOM): the extent to which each cluster in C′ only contains members of the same cluster in C
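To make the comparison metrics concrete, here is a minimal pure-Python sketch of four of them, written from their standard textbook definitions rather than taken from the authors' implementation. The function names are illustrative, `truth` stands for the ground truth clustering C and `pred` for the geometric clustering C′. AMI is left out because its chance correction needs a longer computation.

```python
from collections import Counter
from math import comb, log, sqrt


def pair_counts(truth, pred):
    """Count item pairs placed together in both, in truth, in pred, and all pairs."""
    joint = Counter(zip(truth, pred))
    same_in_both = sum(comb(n, 2) for n in joint.values())
    same_in_truth = sum(comb(n, 2) for n in Counter(truth).values())
    same_in_pred = sum(comb(n, 2) for n in Counter(pred).values())
    return same_in_both, same_in_truth, same_in_pred, comb(len(truth), 2)


def adjusted_rand_index(truth, pred):
    both, in_truth, in_pred, total = pair_counts(truth, pred)
    if total == 0:  # fewer than two items: nothing to compare
        return 1.0
    expected = in_truth * in_pred / total  # agreement expected by chance
    max_index = 0.5 * (in_truth + in_pred)
    if max_index == expected:  # degenerate case, e.g. one cluster in both
        return 1.0
    return (both - expected) / (max_index - expected)


def fowlkes_mallows_index(truth, pred):
    # both = TP, in_pred = TP + FP, in_truth = TP + FN (counted over pairs)
    both, in_truth, in_pred, _ = pair_counts(truth, pred)
    if in_truth == 0 or in_pred == 0:
        return 0.0
    return both / sqrt(in_truth * in_pred)


def entropy(labels):
    n = len(labels)
    return -sum(c / n * log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """H(target | given), in nats."""
    n = len(target)
    joint = Counter(zip(target, given))
    given_sizes = Counter(given)
    return -sum(c / n * log(c / given_sizes[g]) for (_, g), c in joint.items())


def homogeneity(truth, pred):
    h_truth = entropy(truth)
    if h_truth == 0:
        return 1.0
    return 1.0 - conditional_entropy(truth, pred) / h_truth


def completeness(truth, pred):
    # Completeness is homogeneity with the two clusterings swapped.
    return homogeneity(pred, truth)
```

All four scores depend only on how items are grouped, not on the label values, so a geometric clustering that recovers C under different cluster ids still scores 1. Splitting one ground truth cluster in two keeps homogeneity at 1 but lowers completeness.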

Experiment 1: Validation Experiment
● Validation experiment steps:
1. Start with a good graph drawing with no cluster overlap
2. Perturb vertex positions to deform the cluster structures in the drawing
● Validation experiments performed on synthetic graphs with known ground truth clusters
● Hypothesis 1: Clustering quality metric scores will decrease as the drawings are further deformed
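The validation procedure can be sketched in a few lines of pure Python. This is only an illustration under stated assumptions, not the authors' setup: the "drawing" is three synthetic, well-separated point clusters, the perturbation is Gaussian jitter whose standard deviation grows with the step number (the slides do not say how positions are perturbed), k-means uses a deterministic farthest-point initialisation, and only the ARI variant of the score is computed. All names and parameters are hypothetical.

```python
import random
from collections import Counter
from math import comb


def adjusted_rand_index(truth, pred):
    """Pair-counting ARI between two labelings of the same items."""
    both = sum(comb(n, 2) for n in Counter(zip(truth, pred)).values())
    in_truth = sum(comb(n, 2) for n in Counter(truth).values())
    in_pred = sum(comb(n, 2) for n in Counter(pred).values())
    expected = in_truth * in_pred / comb(len(truth), 2)
    max_index = 0.5 * (in_truth + in_pred)
    if max_index == expected:
        return 1.0
    return (both - expected) / (max_index - expected)


def kmeans(points, k, iterations=50):
    """Lloyd's k-means with farthest-point initialisation (an assumption)."""
    def dist2(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    labels = []
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda j: dist2(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = (sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members))
    return labels


def deformation_scores(steps=10, per_cluster=20, noise_per_step=2.0, seed=0):
    """Score the drawing at each deformation step, from step 0 to `steps`."""
    rng = random.Random(seed)
    centres = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    truth, base = [], []
    for label, (cx, cy) in enumerate(centres):
        for _ in range(per_cluster):
            truth.append(label)
            base.append((cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1)))
    scores = []
    for step in range(steps + 1):
        sigma = noise_per_step * step  # step 0 leaves the drawing untouched
        drawing = [(x + rng.gauss(0, sigma), y + rng.gauss(0, sigma))
                   for x, y in base]
        labels = kmeans(drawing, len(centres))
        scores.append(adjusted_rand_index(truth, labels))
    return scores


if __name__ == "__main__":
    for step, score in enumerate(deformation_scores()):
        print(f"step {step:2d}: ARI-based score = {score:.3f}")
```

With random jitter the scores need not fall strictly at every step, so Hypothesis 1 is best read as a downward trend across steps rather than a strict step-by-step decrease.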

Validation Experiments Examples
[Figure: drawings at deformation Step 0, Step 3, Step 7, and Step 10]

Validation Experiments Results
● Scores decrease as the drawings are distorted, validating Hypothesis 1
● CQ ARI and CQ FMI are more sensitive in capturing changes in quality

Experiment 2: Layout Comparison
● Layout comparison using clustering quality metrics
● Cluster-focused layouts: LinLog, Backbone, tsNET
● Other layouts:
○ Force-directed layouts (Fruchterman-Reingold (FR), Organic)
○ Multilevel force-directed layouts (FM3, sfdp)
○ MDS-based layouts (Metric MDS, Pivot MDS)
○ Stress-based layouts (Stress Majorization, Sparse Stress Minimization)
○ Spectral layout
● Hypothesis 2: the cluster-focused layouts will score higher on clustering quality metrics than other layouts

Layout Comparison Example: synthetic dataset
[Figure: the same synthetic graph drawn with FR, Organic, Stress Maj., Metric MDS, Backbone, FM3, Spectral, S. Stress Min., tsNET, Pivot MDS, sfdp, and LinLog]

Layout Comparison Example: real-world dataset
[Figure: the same real-world graph drawn with FR, Organic, Stress Maj., Metric MDS, Backbone, FM3, Spectral, S. Stress Min., tsNET, Pivot MDS, sfdp, and LinLog]
Data taken from: Leskovec, J., Krevl, A.: SNAP Datasets: Stanford large network dataset collection. http://snap.stanford.edu/data (Jun 2014)

Layout Comparison Results
● LinLog and tsNET attain the top two scores averaged over all datasets, supporting Hypothesis 2
● Backbone is in the top three for real-world datasets
● sfdp scores highest among non-cluster-focused layouts
● Organic and MDS layouts fall on the low end of CQ scores
[Charts: average over all comparison datasets; average over real-world datasets]

Summary
● Designed, implemented, and validated a clustering quality metric for graph drawings
● Evaluated various graph layout algorithms using the metric and validated the claims of some cluster-focused layouts

Future work
● Combination with readability metrics (e.g. to address node overlap issues)
● Use other geometric clustering methods
● Extension to data clustering metrics
