Learning to rank and compare graph layouts

  1. Learning to rank and compare graph layouts Toby Dylan Hocking toby@sg.cs.titech.ac.jp http://sugiyama-www.cs.titech.ac.jp/~toby/ joint work with Supaporn Spanurattana and Masashi Sugiyama 6 Aug 2013

  2. Introduction: what makes a graph layout good or bad?
     Learning to rank and compare graph layouts

  3. Biology is full of networks (graphs) Source: Kyoto encyclopedia of genes and genomes (KEGG).

  4. Biology is full of networks (graphs) Source: Wikipedia “Citric acid cycle.”

  5. Goal: find a good layout for a particular graph
     Two categories of methods for graph layout:
     ◮ Heuristic layout algorithms:
        ◮ Force-directed
        ◮ Hierarchical clustering (trees/dendrograms)
        ◮ Hive plots
        ◮ ...
     ◮ Manual layout using programs such as:
        ◮ Cytoscape/cytoscape.js
        ◮ Gephi
        ◮ Image processing: GIMP/Inkscape
        ◮ ...

  6. Force-directed layout has many tuning parameters
     Source: Data-Driven Documents (D3) JavaScript visualization library (Bostock 2011).

     parameter       min    default    max
     size            ?      1 x 1      ?
     link distance   0      20         ∞
     link strength   0      1          1
     friction        0      0.9        1
     charge          −∞     −30        ∞
     theta           0      0.8        ∞
     gravity         0      0.1        ∞

     Question: how to tune these parameters for a specific graph?

  7. Manual layout using a GUI is time-consuming
     ◮ Try default parameters of several different algorithms.
     ◮ Play with tuning parameters, select a combination that looks good.
     ◮ Finally, refine the algorithm’s layout by dragging nodes to positions that look better.
     Goal: learn from a database of manually labeled graphs.

  9. Pairwise comparison in the graph layout literature Source: Holten and van Wijk, “Force-Directed Edge Bundling for Graph Visualization,” EuroVis 2009.

  10. Pairwise comparison in the graph layout literature Source: Muelder and Ma, “Rapid Graph Layout Using Space Filling Curves,” InfoVis 2008.

  11. Pairwise comparison in the graph layout literature Source: Gorochowski et al., “Using Aging to Visually Uncover Evolutionary Processes on Networks,” IEEE Trans. Viz 2012.

  12. Introduction: what makes a graph layout good or bad?
      Learning to rank and compare graph layouts

  13. Learning a comparison function
      We are given $n$ training pairs $(G_i, x_i, x'_i, y_i)$ where we have
      ◮ a graph $G_i$,
      ◮ two layouts $x_i, x'_i \in \mathbb{R}^p$ of that graph (feature vectors),
      ◮ a comparison
        $$y_i = \begin{cases} -1 & \text{if } x_i \text{ is better} \\ 0 & \text{if } x_i \text{ is as good as } x'_i \\ 1 & \text{if } x'_i \text{ is better.} \end{cases}$$
      Goal: find a comparison function $g : \mathbb{R}^p \times \mathbb{R}^p \to \{-1, 0, 1\}$
      ◮ Symmetry: $g(x, x') = -g(x', x)$.
      ◮ Good prediction with respect to the zero-one loss $E$:
        $$\operatorname*{minimize}_{g} \ \sum_{i \in \text{test}} E\left[ y_i, g(x_i, x'_i) \right]$$

  14. Learning to rank and compare
      We will learn a
      ◮ Ranking function $f : \mathbb{R}^p \to \mathbb{R}$. Bigger means a better layout.
      ◮ Threshold $t \in \mathbb{R}_+$. A small difference $|f(x') - f(x)| \le t$ is not significant.
      ◮ Comparison function
        $$g_t(x, x') = \begin{cases} -1 & \text{if } f(x') - f(x) < -t \\ 0 & \text{if } |f(x') - f(x)| \le t \\ 1 & \text{if } f(x') - f(x) > t. \end{cases}$$
      The problem becomes
        $$\operatorname*{minimize}_{f, t} \ \sum_{i=1}^{n} E\left[ y_i, g_t(x_i, x'_i) \right]$$
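
The thresholded comparison on this slide can be sketched in a few lines of Python. This is only an illustration: the names `compare` and `zero_one_loss`, the toy ranking function `f`, and its weights are assumptions made up for the example, not part of the talk.

```python
def compare(f, t, x, x_prime):
    """Return -1, 0 or 1 by thresholding the rank difference f(x') - f(x)."""
    if t < 0:
        raise ValueError("threshold t must be non-negative")
    diff = f(x_prime) - f(x)
    if diff < -t:
        return -1  # x is ranked significantly better
    if diff > t:
        return 1   # x' is ranked significantly better
    return 0       # |f(x') - f(x)| <= t: difference is not significant


def zero_one_loss(f, t, pairs):
    """Count pairs (x, x_prime, y) whose predicted comparison differs from y."""
    return sum(1 for x, x_prime, y in pairs if compare(f, t, x, x_prime) != y)


# Hypothetical linear ranking function on two features (weights are made up).
def f(x):
    return -0.01 * x[0] + 1.0 * x[1]


pairs = [
    ((100.0, 0.2), (100.0, 1.5), 1),   # x' ranked clearly higher
    ((100.0, 1.5), (100.0, 0.2), -1),  # same pair, opposite direction
    ((100.0, 0.2), (120.0, 0.3), 0),   # small difference, treated as a tie
]
loss = zero_one_loss(f, 1.0, pairs)
```

Because the prediction depends only on the sign and size of f(x') − f(x), swapping the two arguments negates the output, which is the symmetry property required of g.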

  15. Some labeled layouts of a 2-node graph
      [Figure: six panels (good 1, good 2, good 3, bad 11, bad 12, bad 13), each showing one layout in the x-y plane.]

  16. Map 20 layouts $x_i \in \mathbb{R}^2$ to a feature space
      [Figure: scatter plot of angle versus distance, points colored by label (good, bad).]
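
A minimal sketch of such a feature map for a 2-node graph, in Python. The slide only names the two features (distance and angle); the exact definitions below, in particular folding the angle into [0, π/2] to match the plotted axis range, and the function name `two_node_features`, are assumptions for illustration.

```python
import math


def two_node_features(layout):
    """Map a 2-node layout ((x1, y1), (x2, y2)) to (distance, angle).

    distance: Euclidean distance between the two nodes.
    angle: angle of the edge to the horizontal axis, folded into [0, pi/2]
           (an assumed definition, chosen to match the plotted range).
    """
    (x1, y1), (x2, y2) = layout
    dx, dy = x2 - x1, y2 - y1
    distance = math.hypot(dx, dy)
    angle = math.atan2(abs(dy), abs(dx))
    return distance, angle


horizontal = two_node_features(((-300.0, 50.0), (0.0, 50.0)))
vertical = two_node_features(((-100.0, 50.0), (-100.0, 200.0)))
```

With this definition a horizontal edge has angle 0 and a vertical edge has angle π/2, regardless of which node is listed first.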

  17. Generate 10 pairwise constraints $x'_i - x_i \in \mathbb{R}^2$
      [Figure: scatter plot of angle versus distance, points colored by label (good, bad).]

  18. 10 labeled difference vectors $x'_i - x_i \in \mathbb{R}^2$
      [Figure: difference vectors plotted as angle versus distance, colored by comparison $y_i \in \{-1, 0, 1\}$.]

  19. All 190 labeled difference vectors $x'_i - x_i \in \mathbb{R}^2$
      [Figure: difference vectors plotted as angle versus distance, colored by comparison $y_i \in \{-1, 0, 1\}$.]
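
The count 190 is the number of unordered pairs among 20 layouts, 20·19/2. The Python sketch below enumerates them and builds the difference vectors. The slides do not say how comparisons were derived from the good/bad labels, so the labeling rule here (same label gives 0, a good layout beats a bad one) is an assumption, as are the function name and the synthetic feature values.

```python
from itertools import combinations


def labeled_differences(features, labels):
    """Return a list of (x' - x, y) over all unordered pairs of layouts.

    Assumed labeling rule: y = 0 if both layouts share a label,
    y = -1 if x is 'good' and x' is 'bad', y = 1 otherwise.
    """
    out = []
    for i, j in combinations(range(len(features)), 2):
        x, x_prime = features[i], features[j]
        diff = tuple(b - a for a, b in zip(x, x_prime))
        if labels[i] == labels[j]:
            y = 0
        elif labels[i] == "good":
            y = -1  # x is better
        else:
            y = 1   # x' is better
        out.append((diff, y))
    return out


# Synthetic (distance, angle) features for 20 layouts: 10 good, 10 bad.
features = [(10.0 * k, 0.05 * k) for k in range(20)]
labels = ["good"] * 10 + ["bad"] * 10
diffs = labeled_differences(features, labels)
n_pairs = len(diffs)
n_strict = sum(1 for _, y in diffs if y != 0)
```

With 10 good and 10 bad layouts, 100 of the pairs mix the two labels and the remaining 90 are ties under this rule.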

  20. Max margin comparison function
      [Figure: labeled difference vectors (angle versus distance, colored by comparison $y_i \in \{-1, 0, 1\}$) with lines $f(x') - f(x) = -1$ and $f(x') - f(x) = 1$. Legend: line (margin, decision); constraint (active, inactive).]

  21. Invariance of $\hat{g}$ when switching train direction $x_i, x'_i$
      [Figure: labeled difference vectors (angle versus distance, colored by comparison $y_i \in \{-1, 0, 1\}$) with lines $f(x') - f(x) = -1$ and $f(x') - f(x) = 1$. Legend: line (margin, decision); constraint (active, inactive).]

  22. Defining the margin
      Recall: for all pairs $i \in \{1, \dots, n\}$ we have
      ◮ features $x_i, x'_i \in \mathbb{R}^p$ and
      ◮ comparisons $y_i \in \{-1, 0, 1\}$.
      We define
      ◮ Ranking function $f(x) = w^\intercal x \in \mathbb{R}$.
      ◮ Threshold $t = 1$.
      ◮ Comparison function $g_1(x, x') \in \{-1, 0, 1\}$.
      [Figure: margin $\mu$ as a function of the predicted rank difference $f(x'_i) - f(x_i)$, one panel each for $y_i = -1$, $y_i = 0$, $y_i = 1$.]
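
A small Python sketch of the margin for a linear ranking function with threshold t = 1. It assumes the per-pair margin is the quantity bounded by the constraints of the max margin linear program on the next slide: 1 − |w⊺(x′ − x)| for ties and −1 + y·w⊺(x′ − x) otherwise. The function names and the toy data are illustrative only.

```python
def pair_margin(w, x, x_prime, y):
    """Margin of one labeled pair under f(x) = w . x and threshold t = 1."""
    d = sum(wk * (b - a) for wk, a, b in zip(w, x, x_prime))  # f(x') - f(x)
    if y == 0:
        return 1.0 - abs(d)   # ties want a small rank difference
    return -1.0 + y * d       # strict comparisons want a large signed difference


def dataset_margin(w, pairs):
    """Smallest per-pair margin; positive means every pair is predicted correctly."""
    return min(pair_margin(w, x, xp, y) for x, xp, y in pairs)


w = (0.0, 2.0)  # made-up weights
pairs = [
    ((0.0, 0.0), (0.0, 1.0), 1),    # rank difference  2.0 -> margin 1.0
    ((0.0, 1.0), (0.0, 0.0), -1),   # rank difference -2.0 -> margin 1.0
    ((0.0, 0.0), (0.0, 0.25), 0),   # rank difference  0.5 -> margin 0.5
]
mu = dataset_margin(w, pairs)
```

The margin is unchanged when a pair is reversed and its label negated, which is consistent with the symmetry required of the comparison function.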

  23. Max margin comparison is a linear program (LP)
      For $y \in \{-1, 0, 1\}$, let $I_y = \{ i \mid y_i = y \}$ be the corresponding training indices.
      $$\begin{aligned}
      \operatorname*{maximize}_{\mu \in \mathbb{R},\ w \in \mathbb{R}^p} \quad & \mu \\
      \text{subject to} \quad & \mu \le 1 - |w^\intercal (x'_i - x_i)|, && \forall i \in I_0 \\
      & \mu \le -1 + w^\intercal (x'_i - x_i)\, y_i, && \forall i \in I_1 \cup I_{-1}.
      \end{aligned}$$
      Note: if the optimal $\mu > 0$ then the data are separable.
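
The absolute value in the constraint for ties does not break linearity, because an upper bound by 1 − |a| is equivalent to two linear upper bounds. A short worked step, where the shorthand $d_i$ is introduced here only for brevity:

```latex
\begin{align*}
d_i &:= x'_i - x_i, \\
\mu \le 1 - |w^\intercal d_i|
  &\iff \mu \le 1 - w^\intercal d_i
  \ \text{ and } \
  \mu \le 1 + w^\intercal d_i,
  \qquad \forall i \in I_0 .
\end{align*}
```

Each tie therefore contributes two linear inequality constraints in $(\mu, w)$, and each strict comparison contributes one.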

  24. Related work: reject, rank, and rate

      Outputs \ Inputs       single items x     pairs of items x, x′
      y ∈ {−1, 1}            SVM                SVMrank
      y ∈ {−1, 0, 1}         Reject option      this work

      ◮ PL Bartlett and MH Wegkamp. Classification with a reject option using a hinge loss. JMLR, 9:1823–1840, 2008. (statistical properties of the hinge loss)
      ◮ T Joachims. Optimizing search engines using clickthrough data. KDD 2002. (SVMrank)
      ◮ K Zhou et al. Learning to rank with ties. SIGIR 2008. (boosting, ties are more effective with more output values)
      ◮ R Herbrich et al. TrueSkill: a Bayesian skill rating system. NIPS 2006. (generalization of Elo for chess)

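  25. SVMrank is a quadratic program (QP)
      $$\begin{aligned}
      \operatorname*{minimize}_{w \in \mathbb{R}^p} \quad & w^\intercal w \\
      \text{subject to} \quad & w^\intercal (x'_i - x_i)\, y_i \ge 1, && \forall i \in I_1 \cup I_{-1}.
      \end{aligned}$$
      [Figure: labeled difference vectors (angle versus distance, colored by comparison $y_i \in \{-1, 0, 1\}$) with lines $f(x') - f(x) = -1$, $f(x') - f(x) = 0$, and $f(x') - f(x) = 1$. Legend: line (margin, decision); constraint (active, inactive).]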

  26. Conclusions and future work
      Learned a function $f(x)$ for ranking a graph layout $x$.
      ◮ Features for good performance on real graphs?
      ◮ Tune layout algorithm parameters to maximize $f$.
      ◮ SVMrank is sufficient under what assumption?

  27. Thank you! Supplementary slides appear after this one.

  28. Layout evaluation metrics (features $x_i, x'_i$)
      ◮ Number of crossing edges (smaller is better)
      ◮ Aspect ratio (closer to 1:1 is better?)
      ◮ Symmetry (more is better when the graph has symmetries)
      ◮ Edge length (small and less variable is better?)
      ◮ Angle between edge pairs (big is better?)
      ◮ Area of smallest bounding box (smaller is better to let small features be more legible)
      Source: http://en.wikipedia.org/wiki/Graph_drawing#Quality_measures
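
Several of these metrics are simple to compute from node positions. The Python sketch below covers edge crossings, edge length statistics, and the bounding box; it is one possible reading of the metrics, not the feature set used in the talk. It assumes straight-line edges, an axis-aligned bounding box, and counts only proper crossings between edges that share no endpoint. All names are hypothetical.

```python
import math
from itertools import combinations


def _orient(p, q, r):
    """Signed area test: > 0 if r is left of the directed line p -> q."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def _properly_cross(a, b, c, d):
    """True if segments ab and cd intersect at a single interior point."""
    return (_orient(a, b, c) * _orient(a, b, d) < 0
            and _orient(c, d, a) * _orient(c, d, b) < 0)


def layout_features(pos, edges):
    """pos: dict node -> (x, y); edges: list of (u, v) node pairs."""
    lengths = [math.dist(pos[u], pos[v]) for u, v in edges]
    mean_len = sum(lengths) / len(lengths)
    sd_len = math.sqrt(sum((l - mean_len) ** 2 for l in lengths) / len(lengths))
    xs = [p[0] for p in pos.values()]
    ys = [p[1] for p in pos.values()]
    width, height = max(xs) - min(xs), max(ys) - min(ys)
    crossings = sum(
        1
        for (a, b), (c, d) in combinations(edges, 2)
        if len({a, b, c, d}) == 4  # skip edge pairs sharing an endpoint
        and _properly_cross(pos[a], pos[b], pos[c], pos[d])
    )
    return {
        "crossings": crossings,
        "edge_length_mean": mean_len,
        "edge_length_sd": sd_len,
        "bbox_area": width * height,
        "aspect_ratio": width / height if height > 0 else math.inf,
    }


# Unit square with both diagonals and one side.
pos = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (1.0, 1.0), "d": (0.0, 1.0)}
edges = [("a", "c"), ("b", "d"), ("a", "b")]
feats = layout_features(pos, edges)
```

The resulting dictionary can be turned into a feature vector for a layout by fixing an order for its keys.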
