Big graphs for big data: parallel matching and clustering

Big graphs for big data: parallel matching and clustering on billion-vertex graphs
Rob H. Bisseling
Mathematical Institute, Utrecht University
Collaborators: Bas Fagginger Auer, Fredrik Manne, Mostofa Patwary, Daan Pelt, Albert-Jan Yzelman
Workshop AMLaGAP, Orléans, May 19, 2014

Outline
◮ Graph matching
• Introduction
• Greedy algorithm
• Parallelisable 1/2-approximation algorithm
• BSP algorithm
• GPU algorithm
• Results
◮ Clustering
• Introduction
• Sequential algorithm
• GPU algorithm
• Results
◮ 2D sparse matrix partitioning
◮ 2D (edge-based) matching
◮ Conclusion

Matchmaker, Matchmaker, Make me a match
From the film Fiddler on the Roof
◮ Hodel: Well, somebody has to arrange the matches. Young people can’t decide these things themselves.
◮ Hodel: For Papa, make him a scholar.
◮ Chava: For Mama, make him rich as a king.

Matching can win you a Nobel prize
Source: Slate magazine, October 15, 2012

Motivation of graph matching
◮ Graph matching is a pairing of neighbouring vertices.
◮ It has applications in
• medicine: finding suitable donors for organs
• social networks: finding partners
• scientific computing: finding pivot elements in matrix computations
• graph coarsening: making the graph smaller by merging similar vertices before partitioning it for parallel computations
• bioinformatics: finding similarity in Protein-Protein Interaction networks

Motivation of greedy/approximation graph matching
◮ Optimal solution is possible in polynomial time.
◮ Time for weighted matching in graph $G = (V, E)$ is $O(mn + n^2 \log n)$ with $n = |V|$ the number of vertices, and $m = |E|$ the number of edges (Gabow 1990).
◮ The aim is a billion vertices, $n = 10^9$, with 100 edges per vertex, i.e. $m = 10^{11}$.
◮ Thus, a time of $O(10^{20})$ = 100,000 Petaflop units is far too long. Fastest supercomputer today, the Tianhe-2, performs 33.8 Petaflop/s.
◮ We need linear-time greedy or approximation algorithms.

Formal definition of graph matching
◮ A graph is a pair $G = (V, E)$ with vertices $V$ and edges $E$.
◮ All edges $e \in E$ are of the form $e = (v, w)$ for vertices $v, w \in V$.
◮ A matching is a collection $M \subseteq E$ of disjoint edges.
◮ Here, the graph is undirected, so $(v, w) = (w, v)$.

Maximal matching
◮ A matching is maximal if we cannot enlarge it further by adding another edge to it.

Maximum matching
◮ A matching is maximum if it possesses the largest possible number of edges, compared to all other matchings.
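The difference between maximal and maximum is easy to see on a small path graph. The sketch below is an illustration only, not code from the talk: the function names `is_matching`, `is_maximal` and `maximum_matching_size` are hypothetical, and the brute-force search for the maximum is only usable on tiny graphs.

```python
from itertools import combinations

def is_matching(edges, matching):
    """True if `matching` consists of pairwise disjoint edges of the graph."""
    edge_set = {frozenset(e) for e in edges}  # undirected: (v, w) = (w, v)
    seen = set()
    for v, w in matching:
        if v == w or frozenset((v, w)) not in edge_set or v in seen or w in seen:
            return False
        seen.update((v, w))
    return True

def is_maximal(edges, matching):
    """True if no edge of the graph can be added to the matching."""
    matched = {v for e in matching for v in e}
    return is_matching(edges, matching) and all(
        v in matched or w in matched for v, w in edges
    )

def maximum_matching_size(edges):
    """Brute force over edge subsets, largest first (tiny graphs only)."""
    for k in range(len(edges), 0, -1):
        if any(is_matching(edges, c) for c in combinations(edges, k)):
            return k
    return 0

# Path graph 1 - 2 - 3 - 4.
path = [(1, 2), (2, 3), (3, 4)]
middle = [(2, 3)]          # maximal, but not maximum
ends = [(1, 2), (3, 4)]    # maximal and maximum
```

On this path, the single middle edge blocks both remaining edges, so it is maximal with one edge, while the two end edges form a matching with two edges.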

Edge-weighted matching
◮ If the edges are provided with weights $\omega : E \to \mathbb{R}_{>0}$, finding a matching $M$ which maximises
$$\omega(M) = \sum_{e \in M} \omega(e),$$
is called edge-weighted matching.
◮ Greedy matching provides us with maximal matchings, but not necessarily with maximum possible weight.
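A small worked example may help here. The sketch below assumes a hypothetical path a - b - c - d with made-up weights 2, 3, 2 and a helper `matching_weight` that is not from the talk. It compares two maximal matchings of this path by their total weight.

```python
def matching_weight(weights, matching):
    """Sum of the edge weights over the edges of the matching."""
    return sum(weights[frozenset(e)] for e in matching)

# Hypothetical weighted path a - b - c - d (weights chosen for illustration).
weights = {
    frozenset(("a", "b")): 2.0,
    frozenset(("b", "c")): 3.0,
    frozenset(("c", "d")): 2.0,
}

heavy_middle = [("b", "c")]            # maximal: blocks both other edges
both_ends = [("a", "b"), ("c", "d")]   # maximal, and heavier in total

print(matching_weight(weights, heavy_middle))
print(matching_weight(weights, both_ends))
```

Taking the heaviest edge first gives weight 3, whereas the two lighter edges together give weight 4, so a maximal matching need not have maximum weight.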

Sequential greedy matching
◮ In random order, vertices $v \in V$ select and match neighbours one-by-one.
◮ Here, we can pick
• the first available neighbour $w$ of $v$, greedy random matching
• the neighbour $w$ with maximum $\omega(v, w)$, greedy weighted matching
◮ Or: we sort all the edges by weight, and successively match the vertices $v$ and $w$ of the heaviest available edge $(v, w)$. This is commonly called greedy matching.
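The three variants above can be sketched in a few lines. This is a minimal sequential illustration under assumptions of my own, not the implementation used in the talk: edges are given as hypothetical `(v, w, weight)` triples, the random vertex order comes from a seeded shuffle, and the names `greedy_vertex_matching` and `greedy_edge_matching` are invented for this example.

```python
import random

def build_adjacency(weighted_edges):
    """Adjacency lists of (neighbour, weight) pairs for an undirected graph."""
    adj = {}
    for v, w, wt in weighted_edges:
        adj.setdefault(v, []).append((w, wt))
        adj.setdefault(w, []).append((v, wt))
    return adj

def greedy_vertex_matching(weighted_edges, weighted=False, seed=0):
    """Visit vertices in random order; each unmatched vertex picks a neighbour.

    weighted=False: first available neighbour (greedy random matching).
    weighted=True:  available neighbour with maximum edge weight
                    (greedy weighted matching).
    """
    adj = build_adjacency(weighted_edges)
    order = list(adj)
    random.Random(seed).shuffle(order)
    matched, matching = set(), []
    for v in order:
        if v in matched:
            continue
        free = [(w, wt) for w, wt in adj[v] if w != v and w not in matched]
        if not free:
            continue
        w, _ = max(free, key=lambda p: p[1]) if weighted else free[0]
        matched.update((v, w))
        matching.append((v, w))
    return matching

def greedy_edge_matching(weighted_edges):
    """Sort edges by decreasing weight; take each edge whose ends are both free."""
    matched, matching = set(), []
    for v, w, _ in sorted(weighted_edges, key=lambda e: e[2], reverse=True):
        if v != w and v not in matched and w not in matched:
            matched.update((v, w))
            matching.append((v, w))
    return matching

# Hypothetical weighted path a - b - c - d.
edges = [("a", "b", 2.0), ("b", "c", 3.0), ("c", "d", 2.0)]
print(greedy_edge_matching(edges))
print(greedy_vertex_matching(edges, weighted=True, seed=1))
```

Both functions return maximal matchings. The vertex-based variants avoid the global sort, at the price of a result that depends on the visiting order.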

Sequential greedy random matching
[Figure: example graph with labels 1 to 9.]
