An expressive dissimilarity measure for relational clustering using neighbourhood trees

An expressive dissimilarity measure for relational clustering using neighbourhood trees. Sebastijan Dumančić, Hendrik Blockeel, DTAI, CS Department, KU Leuven. ECML PKDD 2017, Journal track.


  1. An expressive dissimilarity measure for relational clustering using neighbourhood trees. Sebastijan Dumančić, Hendrik Blockeel. DTAI, CS Department, KU Leuven. ECML PKDD 2017, Journal track.

  2. 1 – Outline. (1) Overture; (2) How do we do it now?; (3) An expressive dissimilarity for relational data; (4) Experiments and results; (5) Summary. Relational clustering over neighbourhood trees – S. Dumančić, H. Blockeel

  3. 1 – Identifying groups in data

  4. 1 – Identifying groups in data

  5. 1 – Identifying groups in data

  6. 1 – Identifying groups in data

  7. 1 – Which clustering is correct?

  8. 1 – Which clustering is correct?

  9. 1 – What about relational data?

  10. 1 – (Statistical) relational machine learning. Machine learning with a powerful knowledge representation language, usually based on first-order logic. A common representation for vectors, graphs, sequences, ... with a unifying reasoning and learning engine.

  11. 1 – Many faces of relational data

  12. 1 – Many faces of relational data

  13. 2 – Outline. (1) Overture; (2) How do we do it now?; (3) An expressive dissimilarity for relational data; (4) Experiments and results; (5) Summary.

  14. 2 – How do we do it now? Hybrid similarities: incorporate link information into attribute-based similarity; measure the similarity of connected vertices. Graph kernels: structural similarities of graphs; random walks, propagation of information. Relational similarities: comparing logical constructs; logical formulas in common, matching terms.

  15. 2 – How do we do it now? Hybrid similarities: incorporate link information into attribute-based similarity; measure the similarity of connected vertices. Graph kernels: structural similarities of graphs; random walks, propagation of information. Relational similarities: comparing logical constructs; logical formulas in common, matching terms. These approaches impose a fixed bias.

  16. 3 – Outline. (1) Overture; (2) How do we do it now?; (3) An expressive dissimilarity for relational data; (4) Experiments and results; (5) Summary.

  17. 3 – How similar are ProfA and ProfB?

  18. 3 – Main motivations. A similarity measure for relational data should: incorporate multiple views of similarity; be easily adaptable; take attributes and relationships into account; be insensitive to neighbourhood size; be efficient.

  19. 3 – Neighbourhood trees. Neighbourhood trees summarize the neighbourhood of an instance/example. (Figure panels: Data; Neighbourhood tree.)

  20. 3 – Neighbourhood trees. Neighbourhood trees summarize the neighbourhood of an instance/example. (Figure panels: Data; Neighbourhood tree.) Similarity of instances = similarity of their neighbourhood trees.
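
To make the construction more concrete, here is a minimal sketch of building a depth-limited neighbourhood tree around one instance. It assumes the data is given as a list of labelled binary edges; the edge-list format, the function name `build_neighbourhood_tree`, and the toy facts are illustrative assumptions and not the authors' implementation.

```python
from collections import defaultdict


def build_neighbourhood_tree(edges, root, depth):
    """Return a nested dict describing the neighbourhood of `root`.

    `edges` is a list of (label, source, target) triples (an assumed format).
    Each tree level holds the vertices reachable over one more edge.
    """
    adjacency = defaultdict(list)
    for label, source, target in edges:
        adjacency[source].append((label, target))
        adjacency[target].append((label, source))  # treat edges as undirected

    def expand(vertex, remaining):
        if remaining == 0:
            return {"vertex": vertex, "children": []}
        # Note: this simple version may step back to the parent at deeper levels.
        children = [
            {"edge": label, **expand(neighbour, remaining - 1)}
            for label, neighbour in adjacency[vertex]
        ]
        return {"vertex": vertex, "children": children}

    return expand(root, depth)


# Hypothetical toy data in the spirit of the ProfA/ProfB example.
edges = [
    ("Advised", "ProfA", "StudA"),
    ("Advised", "ProfA", "StudB"),
    ("TaughtBy", "Course1", "ProfA"),
]
tree = build_neighbourhood_tree(edges, "ProfA", depth=1)
print([(child["edge"], child["vertex"]) for child in tree["children"]])
```

Two instances can then be compared by comparing the trees built around them, which is the idea stated on the slide.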

  21. 3 – Comparing neighbourhood trees. Decompose NTs into semantic parts.

  22. 3 – Comparing neighbourhood trees. Decompose NTs into semantic parts; similarity = linear combination of the similarities of the individual semantic parts, with weights $(w_1, w_2, w_3, w_4, w_5)$.
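
The weighted combination can be sketched as follows. This is a minimal illustration, assuming the five per-part dissimilarities have already been computed; the function name `combined_dissimilarity` and the example values are hypothetical.

```python
def combined_dissimilarity(part_dissimilarities, weights):
    """Linear combination of per-part dissimilarities with non-negative weights."""
    if len(part_dissimilarities) != len(weights):
        raise ValueError("need exactly one weight per semantic part")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    return sum(w * d for w, d in zip(weights, part_dissimilarities))


# Hypothetical dissimilarities of the five semantic parts for one pair of instances.
parts = [0.1, 0.4, 0.0, 0.3, 0.2]

# Equal weights treat every view of similarity as equally important.
uniform = combined_dissimilarity(parts, [0.2] * 5)

# Putting all weight on one part reduces the measure to that single view.
first_part_only = combined_dissimilarity(parts, [1.0, 0.0, 0.0, 0.0, 0.0])
print(uniform, first_part_only)
```

Changing the weight vector is what makes the bias of the measure adjustable rather than fixed.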

  23. 3 – Comparing semantic parts. Decompose each NT into multisets of attributes, edge labels, and vertex identities, per level and vertex type. Example, multiset of edge labels (level 1): { (Advised,2), (Advised,2), (TaughtBy,2) }. Compare two multisets, A and B, with the χ² distance: $\chi^2(A, B) = \sum_{x \in A \cup B} \frac{(f_A(x) - f_B(x))^2}{f_A(x) + f_B(x)}$
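
The χ² comparison of two multisets can be sketched in a few lines. The slide does not define $f_A(x)$; the sketch below assumes it is the relative frequency of element $x$ in multiset $A$, and the function name `chi2_distance` is illustrative.

```python
from collections import Counter


def chi2_distance(multiset_a, multiset_b):
    """Chi-squared distance between two multisets given as iterables.

    Assumption: f_A(x) is the relative frequency of x in A (count / size).
    """
    counts_a, counts_b = Counter(multiset_a), Counter(multiset_b)
    total_a = sum(counts_a.values()) or 1  # avoid division by zero for empty input
    total_b = sum(counts_b.values()) or 1
    distance = 0.0
    # Iterate over the unique elements of A ∪ B only.
    for x in counts_a.keys() | counts_b.keys():
        f_a = counts_a[x] / total_a
        f_b = counts_b[x] / total_b
        distance += (f_a - f_b) ** 2 / (f_a + f_b)
    return distance


a = ["Advised", "Advised", "TaughtBy"]
b = ["Advised", "TaughtBy", "TaughtBy"]
print(round(chi2_distance(a, b), 3))  # 2/9, roughly 0.222
```

Under this assumption the distance is 0 for identical frequency profiles and 2 for multisets with no element in common, and the loop touches each unique element once.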

  24. 3 – Generality of the approach. Many of the existing similarities are a special case (hybrid similarities, relational similarities, ...) or can be defined over neighbourhood trees (graph kernels) with different biases. This makes it easier to compare the imposed biases.

  25. 3 – Generality of the approach. Many of the existing similarities are a special case (hybrid similarities, relational similarities, ...) or can be defined over neighbourhood trees (graph kernels) with different biases. This makes it easier to compare the imposed biases. Additionally, the measure is efficient: linear in the number of unique elements in a multiset.

  26. 4 – Outline. (1) Overture; (2) How do we do it now?; (3) An expressive dissimilarity for relational data; (4) Experiments and results; (5) Summary.

  27. 4 – Experimental setup. Datasets: IMDB, UWCSE, Mutagenesis, WebKB, TerroristAttacks. Questions: Quality of the obtained clustering? Are different views really necessary? Can we learn the bias from data? Can we learn the bias from labels? Setup: combined with spectral and hierarchical clustering; a wide range of existing similarity measures; performance measure: ARI/Accuracy.
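
For readers unfamiliar with ARI (adjusted Rand index), the sketch below computes it from two label assignments using the standard pair-counting formula. It is a plain-Python illustration, not the evaluation code used in the experiments; the function name is an assumption, and in practice a library routine such as `sklearn.metrics.adjusted_rand_score` would normally be used.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two clusterings of the same instances."""
    n = len(labels_true)
    if n != len(labels_pred):
        raise ValueError("both labelings must cover the same instances")
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Pairs placed together in both clusterings, in the first, and in the second.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    together_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    together_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = together_true * together_pred / comb(n, 2)
    maximum = (together_true + together_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (together_both - expected) / (maximum - expected)


print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # same grouping, renamed clusters
print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))  # grouping that disagrees on every pair
```

The index is 1 for identical groupings regardless of cluster names, close to 0 for random assignments, and can be negative when agreement is worse than chance.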

  28. 4 – Quality of the obtained clusterings. Takeaway message: incorporating multiple biases consistently performs well.

  29. 4 – Are different views needed? Takeaway message: relational data requires multiple views of similarity in order to find informative clusters.

  30. 4 – Learning the weights from data. Comparison: ReCeNT with $w_i = 0.2$ vs. AASC + ReCeNT. AASC: given multiple similarity matrices, find an optimal combination for clustering. Result: barely any benefit. Reference: Huang, Chuang, Chen: Affinity Aggregation for Spectral Clustering.

  31. 4 – Learning weights from labels. Similarity measure in combination with a kNN (parameters optimised with CV). Takeaway message: when labels are provided, ReCeNT outperforms the competing similarities.
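
Using a dissimilarity measure inside a kNN classifier can be sketched as below. This is a minimal illustration that assumes the dissimilarities between a query instance and all training instances are already computed; `knn_predict`, the labels, and the numbers are hypothetical, and the cross-validated parameter search mentioned on the slide is not shown.

```python
from collections import Counter


def knn_predict(distances_to_train, train_labels, k):
    """Majority vote among the k training instances closest to the query.

    `distances_to_train[i]` is the dissimilarity between the query and
    training instance i (assumed to be precomputed).
    """
    if len(distances_to_train) != len(train_labels):
        raise ValueError("need one distance per training instance")
    nearest = sorted(range(len(train_labels)), key=lambda i: distances_to_train[i])[:k]
    votes = Counter(train_labels[i] for i in nearest)
    # On a tie, the label seen first (closest neighbour) wins.
    return votes.most_common(1)[0][0]


# Hypothetical dissimilarities from one query instance to four labelled instances.
distances = [0.1, 0.5, 0.2, 0.9]
labels = ["professor", "student", "professor", "student"]
print(knn_predict(distances, labels, k=3))  # professor
```

Because the classifier only consumes distances, any weight setting of the combined measure can be plugged in and scored on held-out labels.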

  32. 5 – Outline. (1) Overture; (2) How do we do it now?; (3) An expressive dissimilarity for relational data; (4) Experiments and results; (5) Summary.

  33. 5 – Summary. A similarity measure for relational data that: is versatile (meta-similarity); is easily adaptable; is efficient; is a generalization of many existing structured/relational similarities; works well across many different tasks.

  34. 5 – Summary. A similarity measure for relational data that: is versatile (meta-similarity); is easily adaptable; is efficient; is a generalization of many existing structured/relational similarities; works well across many different tasks. Code: https://dtai.cs.kuleuven.be/software/recent See also: S. Dumancic, H. Blockeel: Clustering-Based Unsupervised Relational Representation Learning with an Explicit Distributed Representation, IJCAI ’17; S. Dumancic, H. Blockeel: Demystifying Relational Latent Representations, ILP ’17.
