
Big graphs for big data: parallel matching and clustering on billion-vertex graphs



  1. Big graphs for big data: parallel matching and clustering on billion-vertex graphs. Rob H. Bisseling, Mathematical Institute, Utrecht University. Collaborators: Bas Fagginger Auer, Fredrik Manne, Albert-Jan Yzelman. Asia-trip A-Eskwadraat, July 2014.

  2. Outline
     ◮ Graph Matching: Introduction, Greedy algorithm, Parallelisable 1/2-approximation algorithm, BSP algorithm, GPU algorithm, Results
     ◮ Clustering: Introduction, Sequential algorithm, GPU algorithm, Results
     ◮ Conclusion

  3. Matching can win you a Nobel prize. Source: Slate magazine, October 15, 2012.

  4. Motivation of graph matching
     ◮ Graph matching is a pairing of neighbouring vertices.
     ◮ It has applications in
       • medicine: finding suitable donors for organs
       • social networks: finding partners
       • scientific computing: finding pivot elements in matrix computations
       • graph coarsening: making the graph smaller by merging similar vertices

  5. Motivation of greedy/approximation graph matching
     ◮ An optimal solution is possible in polynomial time.
     ◮ Time for weighted matching in a graph $G = (V, E)$ is $O(mn + n^2 \log n)$, with $n = |V|$ the number of vertices and $m = |E|$ the number of edges (Gabow 1990).
     ◮ The aim is a billion vertices, $n = 10^9$, with 100 edges per vertex, i.e. $m = 10^{11}$.
     ◮ Thus, a time of $O(10^{20})$ = 100,000 Petaflop units is far too long. The fastest supercomputer today, the Chinese Tianhe-2 (Milky-Way 2), performs 33.8 Petaflop/s.
     ◮ We need linear-time greedy or approximation algorithms.
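The arithmetic behind the $10^{20}$ figure can be checked with a few lines of Python. This is only a rough sketch: constant factors are ignored, and the base-2 logarithm is an assumption, since the slide does not state a base.

```python
import math

n = 10**9    # vertices (from the slide)
m = 10**11   # edges: 100 per vertex (from the slide)

# Two terms of the O(mn + n^2 log n) bound, constants ignored.
mn_term = m * n
n2logn_term = n * n * math.log2(n)   # base 2 is an assumption

petaflop = 10**15
units = mn_term / petaflop           # the slide's "Petaflop units"

print(f"mn         = {mn_term:.1e}")
print(f"n^2 log n  = {n2logn_term:.1e}")
print(f"mn / 10^15 = {units:,.0f} Petaflop units")
```

Under these assumptions the $mn$ term dominates, which is why the slide quotes $O(10^{20})$.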

  6. Formal definition of graph matching
     ◮ A graph is a pair $G = (V, E)$ with vertices $V$ and edges $E$.
     ◮ All edges $e \in E$ are of the form $e = (v, w)$ for vertices $v, w \in V$.
     ◮ A matching is a collection $M \subseteq E$ of disjoint edges.
     ◮ Here, the graph is undirected, so $(v, w) = (w, v)$.

  7. Maximal matching
     ◮ A matching is maximal if we cannot enlarge it further by adding another edge to it.

  8. Maximum matching
     ◮ A matching is maximum if it possesses the largest possible number of edges, compared to all other matchings.
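The difference between maximal and maximum can be seen on a path with four vertices. The sketch below is illustrative only: the helper names `is_matching`, `is_maximal` and `maximum_cardinality` are hypothetical, and the brute-force search is only sensible for tiny graphs.

```python
from itertools import combinations

def is_matching(edges):
    """True if no vertex occurs in more than one edge."""
    seen = set()
    for v, w in edges:
        if v == w or v in seen or w in seen:
            return False
        seen.update((v, w))
    return True

def is_maximal(matching, all_edges):
    """True if no further edge of the graph can be added to the matching."""
    matched = {v for e in matching for v in e}
    return is_matching(matching) and all(
        v in matched or w in matched for v, w in all_edges
    )

def maximum_cardinality(all_edges):
    """Brute force over edge subsets, largest first (exponential time)."""
    for k in range(len(all_edges), 0, -1):
        if any(is_matching(sub) for sub in combinations(all_edges, k)):
            return k
    return 0

# Path a - b - c - d: the middle edge alone cannot be extended,
# yet the two outer edges together form a larger matching.
path = [("a", "b"), ("b", "c"), ("c", "d")]
middle = [("b", "c")]
print(is_maximal(middle, path), len(middle), maximum_cardinality(path))
```

Here the single middle edge is a maximal matching of size 1, while the maximum matching has size 2.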

  9. Edge-weighted matching
     ◮ If the edges are provided with weights $\omega : E \to \mathbb{R}_{>0}$, finding a matching $M$ which maximises
       $$\omega(M) = \sum_{e \in M} \omega(e)$$
       is called edge-weighted matching.
     ◮ Greedy matching provides us with maximal matchings, but not necessarily with maximum possible weight.

  10. Sequential greedy matching
     ◮ In random order, vertices $v \in V$ select and match neighbours one-by-one.
     ◮ Here, we can pick
       • the first available neighbour $w$ of $v$ (greedy random matching),
       • the available neighbour $w$ with maximum $\omega(v, w)$ (greedy weighted matching).
     ◮ Or: we sort all the edges by weight, and successively match the vertices $v$ and $w$ of the heaviest available edge $(v, w)$. This is commonly called greedy matching.
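The three variants above can be sketched in Python as follows. This is a minimal illustration, not the speaker's implementation: the adjacency format (a dict of dicts with symmetric weights) and the function names are assumptions made for the example.

```python
import random

def greedy_vertex_matching(adj, weighted=False, seed=0):
    """Visit vertices in random order; each free vertex matches a free neighbour.

    adj maps each vertex to {neighbour: weight} and is assumed symmetric.
    weighted=False takes the first free neighbour (greedy random matching),
    weighted=True takes the heaviest free neighbour (greedy weighted matching).
    """
    rng = random.Random(seed)
    order = list(adj)
    rng.shuffle(order)
    mate = {}
    for v in order:
        if v in mate:
            continue
        free = [w for w in adj[v] if w not in mate and w != v]
        if not free:
            continue
        w = max(free, key=lambda u: adj[v][u]) if weighted else free[0]
        mate[v], mate[w] = w, v
    return {frozenset((v, w)) for v, w in mate.items()}

def greedy_edge_matching(adj):
    """Sort edges by decreasing weight; take every edge whose ends are both free."""
    edges = {frozenset((v, w)): wt
             for v in adj for w, wt in adj[v].items() if v != w}
    mate = {}
    for e in sorted(edges, key=edges.get, reverse=True):
        v, w = tuple(e)
        if v not in mate and w not in mate:
            mate[v], mate[w] = w, v
    return {frozenset((v, w)) for v, w in mate.items()}

# Path a - b - c - d with weights 1, 2, 1.
adj = {"a": {"b": 1}, "b": {"a": 1, "c": 2},
       "c": {"b": 2, "d": 1}, "d": {"c": 1}}
print(greedy_vertex_matching(adj, weighted=True, seed=1))
print(greedy_edge_matching(adj))
```

The vertex-based variants depend on the random visiting order, whereas the edge-sorted variant always starts from the heaviest edge, here the middle one.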

  11-27. Sequential greedy random matching [Figure: animation over 17 slides of greedy random matching on an example graph; the labels 1 to 9 appear in the figure.]

  28. Greedy matching is a 1/2-approximation algorithm
     ◮ Weight $\omega(M) \geq \omega_{\text{optimal}} / 2$.
     ◮ Cardinality $|M| \geq |M_{\text{card-max}}| / 2$, because $M$ is maximal.
     ◮ Time complexity is $O(m \log m)$, because all edges must be sorted.
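A small example shows how close greedy matching can come to the factor 1/2. The sketch below compares heaviest-edge-first greedy with an exhaustive optimum on a four-vertex path; the edge weights and the helper names `greedy` and `optimum_weight` are assumptions chosen for illustration.

```python
from itertools import combinations

def greedy(edges):
    """edges: list of (weight, v, w). Heaviest-edge-first greedy matching."""
    used, chosen = set(), []
    for wt, v, w in sorted(edges, reverse=True):
        if v not in used and w not in used:
            used.update((v, w))
            chosen.append((wt, v, w))
    return chosen

def optimum_weight(edges):
    """Exhaustive search over edge subsets; exponential, tiny inputs only."""
    best = 0
    for k in range(1, len(edges) + 1):
        for sub in combinations(edges, k):
            ends = [x for _, v, w in sub for x in (v, w)]
            if len(ends) == len(set(ends)):          # subset is a matching
                best = max(best, sum(wt for wt, _, _ in sub))
    return best

# Path a - b - c - d: the middle edge is only slightly heavier than the others.
edges = [(1.0, "a", "b"), (1.1, "b", "c"), (1.0, "c", "d")]
g = sum(wt for wt, _, _ in greedy(edges))
opt = optimum_weight(edges)
print(g, opt, g >= opt / 2)
```

Greedy takes the middle edge and blocks both outer edges, so it ends with weight 1.1 against an optimum of 2.0. Making the middle edge lighter (but still heaviest) pushes the ratio towards 1/2 without ever going below it.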

  29. Parallel greedy matching: trouble. Suppose we match vertices simultaneously. [Figure: example graph]

  30. Parallel greedy matching: trouble. Two vertices each find an unmatched neighbour... [Figure: example graph]

  31. Parallel greedy matching: trouble. ... but generate an invalid matching. [Figure: example graph]
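One way such a conflict can arise is sketched below. This is a sequential simulation of a simultaneous step, not actual parallel code, and the three-vertex graph is an assumption for the example rather than the graph in the figure: every active vertex decides from the same snapshot of the matching state.

```python
def simultaneous_step(adj, active):
    """Each active vertex claims its first unmatched neighbour.

    All vertices read the same snapshot, in which nobody is matched yet,
    so no vertex sees the claims made by the others.
    """
    snapshot_matched = set()
    claims = {}
    for v in active:
        for w in adj[v]:
            if w not in snapshot_matched:
                claims[v] = w
                break
    return claims

# Path a - b - c: both end vertices see b as unmatched.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
claims = simultaneous_step(adj, active=["a", "c"])
targets = list(claims.values())
conflict = len(targets) != len(set(targets))
print(claims, conflict)
```

Both `a` and `c` claim `b`, so the edges (a, b) and (c, b) share a vertex and do not form a valid matching.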

  32. Parallelisable dominant-edge algorithm
         while $E \neq \emptyset$ do
             pick a dominant edge $(v, w) \in E$
             $M := M \cup \{(v, w)\}$
             $E := E \setminus \{(x, y) \in E : x = v \lor x = w\}$
             $V := V \setminus \{v, w\}$
         return $M$
     ◮ An edge $(v, w) \in E$ is dominant if
       $$\omega(v, w) = \max \{\omega(x, y) : (x, y) \in E \land (x = v \lor x = w)\}$$
     [Figure: vertices $v$ and $w$ with surrounding edges; number labels 2, 5, 3, 9, 6, 7, 6, 8.]
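The pseudocode above can be turned into a short sequential sketch. It assumes edges are stored as `frozenset({v, w})` keys so that $(v, w) = (w, v)$ holds automatically; the function name and the example weights are hypothetical, and the quadratic search for a dominant edge is for illustration only.

```python
def dominant_edge_matching(weights):
    """weights maps frozenset({v, w}) to the positive weight of that edge."""
    E = dict(weights)
    M = set()
    while E:
        # A dominant edge is at least as heavy as every edge sharing an
        # endpoint with it. The globally heaviest edge always qualifies,
        # so this loop is guaranteed to reach the break.
        for e, wt in E.items():
            if all(wt >= E[f] for f in E if e & f):
                break
        M.add(e)
        # Remove every edge incident to either endpoint of e (including e).
        E = {f: wf for f, wf in E.items() if not (e & f)}
    return M

# Path a - b - c - d - e with weights 2, 5, 3, 9.
weights = {
    frozenset("ab"): 2,
    frozenset("bc"): 5,
    frozenset("cd"): 3,
    frozenset("de"): 9,
}
result = dominant_edge_matching(weights)
print(sorted(sorted(e) for e in result))
```

In this example both (b, c) and (d, e) are dominant at the start, and they share no vertex. Edges of this kind can be matched independently of each other, which is what makes the approach a candidate for parallel execution.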
