Focused Clustering and Outlier Detection in Large Attributed Graphs
ACM SIG-KDD
August 26, 2014
Bryan Perozzi, Leman Akoglu (Stony Brook University)
Patricia Iglesias Sánchez*, Emmanuel Müller*†
*Karlsruhe Institute of Technology †University of Antwerp
Attributed graph:
Examples:
Age, School, Relationship Status
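As a rough illustration, an attributed graph can be held as an adjacency map plus one attribute record per node. This is only a sketch: the node ids, attribute values, and the `build_adjacency` helper are hypothetical, and only the attribute names (Age, School, Relationship Status) come from the slide.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Turn an undirected edge list into a node -> set-of-neighbors map."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


# Hypothetical nodes; every node carries the same attribute names.
attributes = {
    "u1": {"age": 24, "school": "A", "relationship_status": "single"},
    "u2": {"age": 25, "school": "A", "relationship_status": "married"},
    "u3": {"age": 41, "school": "B", "relationship_status": "single"},
}
edges = [("u1", "u2"), ("u2", "u3")]
adjacency = build_adjacency(edges)
```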
Bryan Perozzi Focused Clustering and Outlier Detection in Large Attributed Graphs 2
Numerous attributes (ex: Facebook profiles)
Many irrelevant for most queries
Ex: When trying to sell mortgages
Useful: Income, Credit Score, Employer
Not Useful: Hair Color, # Apps Installed
Ex: When trying to sell makeup
Useful: Hair Color, Skin Tone, Gender
Not Useful: Shoe Size
Users provide examples of the kind of
We infer the similarity function that matters to
Introduction
New Problem:
Our Approach: FocusCO
Evaluation
Conclusion
Users provide examples of
Ex: ‘Yann LeCun’ and
We learn a focus
Education Level Location
We extract clusters which agree with the focus
We detect outliers which don’t agree with the focus
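One way to picture the "learn a focus" step is a heuristic that gives more weight to attributes on which the user's examples are unusually alike. This sketch is an assumption for illustration, not the method presented in the talk: the variance-ratio rule, the function name `infer_focus_weights`, and the toy feature values are all hypothetical.

```python
def infer_focus_weights(features, exemplar_ids):
    """Weight each attribute by how much tighter the exemplars are
    than the whole population on that attribute (simple stand-in heuristic).

    features: dict node -> list of numeric attribute values
    exemplar_ids: nodes the user marked as similar
    """
    def variance(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / len(values)

    dims = len(next(iter(features.values())))
    raw = []
    for j in range(dims):
        var_all = variance([f[j] for f in features.values()])
        var_ex = variance([features[n][j] for n in exemplar_ids])
        # An attribute matters if exemplars vary less on it than everyone does.
        raw.append(max(0.0, 1.0 - var_ex / var_all) if var_all > 0 else 0.0)
    total = sum(raw)
    # Fall back to uniform weights if no attribute stands out.
    return [r / total for r in raw] if total > 0 else [1.0 / dims] * dims


# Hypothetical data: attribute 0 is shared by the exemplars, attribute 1 is not.
features = {
    "a": [1.0, 10.0],
    "b": [1.1, 50.0],
    "c": [5.0, 30.0],
    "d": [9.0, 20.0],
}
weights = infer_focus_weights(features, ["a", "b"])
```

Clusters that "agree with the focus" would then be judged by similarity under these weights, and nodes that sit inside a cluster structurally but differ on the heavily weighted attributes are candidates for focused outliers.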
Graph Clustering, Attributed Graphs, Attribute Subspace, User Preference, Outlier Detection
Local clustering algorithm
Does not cluster the whole graph
Expands a cluster around
Two procedures:
1.
2.
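The local clustering idea can be sketched as follows. The slide does not name the two procedures or the objective, so this sketch assumes a grow step and a shrink step that greedily lower weighted conductance, starting from a single seed node. Edge weights are assumed to already encode similarity under the focus, and all function names are hypothetical.

```python
def build_graph(weighted_edges):
    """Undirected weighted graph as node -> {neighbor: weight}."""
    adj = {}
    for u, v, w in weighted_edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def conductance(adj, cluster):
    """Weight of edges leaving the cluster, over the smaller side's volume."""
    vol_in = sum(w for u in cluster for w in adj[u].values())
    vol_total = sum(w for nbrs in adj.values() for w in nbrs.values())
    cut = sum(w for u in cluster for v, w in adj[u].items() if v not in cluster)
    denom = min(vol_in, vol_total - vol_in)
    return cut / denom if denom > 0 else 1.0


def expand(adj, cluster):
    """Assumed procedure 1: add the frontier node that lowers conductance most."""
    while True:
        best_phi, best_node = conductance(adj, cluster), None
        frontier = {v for u in cluster for v in adj[u] if v not in cluster}
        for v in sorted(frontier):
            phi = conductance(adj, cluster | {v})
            if phi < best_phi:
                best_phi, best_node = phi, v
        if best_node is None:
            return cluster
        cluster = cluster | {best_node}


def contract(adj, cluster, seed):
    """Assumed procedure 2: drop non-seed nodes whose removal lowers conductance."""
    improved = True
    while improved:
        improved = False
        current = conductance(adj, cluster)
        for v in sorted(cluster - {seed}):
            if conductance(adj, cluster - {v}) < current:
                cluster = cluster - {v}
                improved = True
                break
    return cluster


def local_cluster(adj, seed):
    """Alternate the two steps until the cluster stops changing."""
    cluster = {seed}
    while True:
        updated = contract(adj, expand(adj, cluster), seed)
        if updated == cluster:
            return cluster
        cluster = updated


# Two triangles joined by one weak (low-similarity) edge.
graph = build_graph([
    ("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 1.0),
    ("d", "e", 1.0), ("e", "f", 1.0), ("d", "f", 1.0),
    ("c", "d", 0.1),
])
cluster = local_cluster(graph, "a")
```

Because only the seed's neighborhood is explored, the second triangle is never touched beyond the one frontier node that is tested and rejected, which is the point of not clustering the whole graph.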
Synthetic and Real World Graphs
Performance measures:
Cluster quality: NMI
Outlier accuracy: precision, F1
Compared to:
CODA [Gao+’10]
METIS (no outlier detection) [Karypis+’98]
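For reference, the performance measures listed above can be computed as in the sketch below. The slides do not say which NMI normalization was used, so the arithmetic mean of the two entropies is an assumption here. The function names and the toy labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(true_labels, pred_labels):
    """Normalized mutual information, normalized by the mean entropy (assumed)."""
    n = len(true_labels)
    joint = Counter(zip(true_labels, pred_labels))
    count_t = Counter(true_labels)
    count_p = Counter(pred_labels)
    mutual_info = sum(
        (c / n) * math.log(c * n / (count_t[t] * count_p[p]))
        for (t, p), c in joint.items()
    )
    mean_entropy = (entropy(true_labels) + entropy(pred_labels)) / 2
    return mutual_info / mean_entropy if mean_entropy > 0 else 1.0


def precision_recall_f1(true_outliers, flagged):
    """Outlier accuracy from the set of true outliers and the set flagged."""
    true_outliers, flagged = set(true_outliers), set(flagged)
    tp = len(true_outliers & flagged)
    precision = tp / len(flagged) if flagged else 0.0
    recall = tp / len(true_outliers) if true_outliers else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Same partition under different label names scores as a perfect match.
same_partition = nmi([0, 0, 1, 1], [1, 1, 0, 0])
# Independent partitions share no information.
unrelated = nmi([0, 0, 1, 1], [0, 1, 0, 1])
scores = precision_recall_f1({"x", "y"}, {"x", "z"})
```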
Focused Outlier publishes in IR
Focused Outlier did not mention Waas.
Diagram labels: user, examples, infer focus attributes, focus, clustering