

SLIDE 1

Distributed Graph-Parallel Computation on Natural Graphs

Joseph Gonzalez

Joint work with: Yucheng Low, Danny Bickson, Haijie Gu, and Carlos Guestrin

SLIDE 2

Graphs are ubiquitous…

SLIDE 3

Social Media · Science · Advertising · Web

  • Graphs encode relationships between: people, facts, products, interests, and ideas
  • Big: billions of vertices and edges, with rich metadata

SLIDE 4

Graphs are Essential to Data-Mining and Machine Learning

  • Identify influential people and information
  • Find communities
  • Target ads and products
  • Model complex data dependencies

SLIDE 5


Natural Graphs

Graphs derived from natural phenomena

SLIDE 6


Problem:

Existing distributed graph computation systems perform poorly on Natural Graphs.

SLIDE 7

PageRank on Twitter Follower Graph

Natural Graph with 40M Users, 1.4 Billion Links

Hadoop results from [Kang et al. '11]; Twister (in-memory MapReduce) results from [Ekanayake et al. '10].

[Bar chart: runtime per iteration for Hadoop, Twister, Piccolo, GraphLab, and PowerGraph]

PowerGraph gains an order of magnitude by exploiting the properties of Natural Graphs.
SLIDE 8

Properties of Natural Graphs


Power-Law Degree Distribution

SLIDE 9

Power-Law Degree Distribution

[Log-log plot: number of vertices vs. degree, AltaVista WebGraph (1.4B vertices, 6.6B edges)]

More than 10^8 vertices have one neighbor.

Top 1% of vertices are adjacent to 50% of the edges!

High-Degree Vertices

SLIDE 10

Power-Law Degree Distribution


“Star-like” motif: e.g., President Obama and his followers

SLIDE 11

Power-Law Graphs are Difficult to Partition

  • Power-Law graphs do not have low-cost balanced cuts [Leskovec et al. 08, Lang 04]
  • Traditional graph-partitioning algorithms perform poorly on Power-Law Graphs [Abou-Rjeili et al. 06]

[Figure: a two-machine edge-cut (CPU 1 / CPU 2) of a power-law graph]

SLIDE 12

Properties of Natural Graphs

Power-Law Degree Distribution → High-degree Vertices → Low-Quality Partition

SLIDE 13

[Figure: a high-degree vertex split between Machine 1 and Machine 2]

  • Split High-Degree vertices
  • New Abstraction → Equivalence on Split Vertices

Program For This. Run on This.

SLIDE 14

How do we program graph computation?

“Think like a Vertex.”

— Malewicz et al. [SIGMOD’10]

SLIDE 15

The Graph-Parallel Abstraction

  • A user-defined Vertex-Program runs on each vertex
  • Graph constrains interaction along edges

– Using messages (e.g., Pregel [PODC’09, SIGMOD’10])
– Through shared state (e.g., GraphLab [UAI’10, VLDB’12])

  • Parallelism: run multiple vertex programs simultaneously

SLIDE 16

Example

What’s the popularity of this user? Popular?

Depends on the popularity of her followers… which in turn depends on the popularity of their followers.

SLIDE 17

PageRank Algorithm

  • Update ranks in parallel
  • Iterate until convergence

Rank of user i = 0.15 + weighted sum of neighbors’ ranks:

R[i] = 0.15 + \sum_{j \in Nbrs(i)} w_{ji} R[j]
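To make the update concrete, here is a minimal sketch of this rule in plain Python (not any system's API); the toy graph, initial ranks, and convergence tolerance are assumptions for illustration:

    # PageRank update from the slide: R[i] = 0.15 + sum_j w_ji * R[j]
    # in_nbrs[i] lists (j, w_ji) pairs for the in-neighbors of i (toy data).
    in_nbrs = {
        "a": [("b", 0.5), ("c", 0.5)],
        "b": [("a", 0.5)],
        "c": [("a", 0.5)],
    }
    R = {v: 1.0 for v in in_nbrs}              # initial ranks
    tol = 1e-6                                 # assumed convergence tolerance

    while True:                                # update ranks in parallel, iterate until convergence
        R_new = {i: 0.15 + sum(w * R[j] for j, w in nbrs)
                 for i, nbrs in in_nbrs.items()}
        converged = max(abs(R_new[i] - R[i]) for i in R) < tol
        R = R_new
        if converged:
            break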

SLIDE 18

The Pregel Abstraction

Vertex-Programs interact by sending messages.

Pregel_PageRank(i, messages):
  // Receive all the messages
  total = 0
  foreach(msg in messages):
    total = total + msg
  // Update the rank of this vertex
  R[i] = 0.15 + total
  // Send new messages to neighbors
  foreach(j in out_neighbors[i]):
    Send msg(R[i] * w_ij) to vertex j

Malewicz et al. [PODC’09, SIGMOD’10]
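For intuition, here is a minimal single-process sketch of the Pregel superstep loop in Python; the toy graph and the fixed superstep count are assumptions, and this is not Google's actual API:

    # Each superstep: every vertex consumes its inbox, updates its rank,
    # and sends messages along its out-edges for the next superstep.
    out_nbrs = {"a": [("b", 0.5), ("c", 0.5)], "b": [("a", 0.5)], "c": [("a", 0.5)]}
    R = {v: 1.0 for v in out_nbrs}
    inbox = {v: [] for v in out_nbrs}

    for superstep in range(30):                # assumed fixed number of supersteps
        outbox = {v: [] for v in out_nbrs}
        for i in out_nbrs:
            R[i] = 0.15 + sum(inbox[i])        # receive messages, update rank
            for j, w_ij in out_nbrs[i]:
                outbox[j].append(R[i] * w_ij)  # send new messages to neighbors
        inbox = outbox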

SLIDE 19

The GraphLab Abstraction

Vertex-Programs directly read their neighbors’ state.

GraphLab_PageRank(i):
  // Compute sum over neighbors
  total = 0
  foreach(j in in_neighbors(i)):
    total = total + R[j] * w_ji
  // Update the PageRank
  R[i] = 0.15 + total
  // Trigger neighbors to run again
  if R[i] not converged then
    foreach(j in out_neighbors(i)):
      signal vertex-program on j

Low et al. [UAI’10, VLDB’12]

SLIDE 20

Challenges of High-Degree Vertices

  • Sequentially process edges
  • Touch a large fraction of the graph (GraphLab)
  • Asynchronous execution requires heavy locking (GraphLab)
  • Send many messages (Pregel)
  • Edge meta-data too large for a single machine (Pregel)
  • Synchronous execution prone to stragglers (Pregel)

SLIDE 21

Communication Overhead for High-Degree Vertices

Fan-In vs. Fan-Out

SLIDE 22

Pregel Message Combiners on Fan-In

[Figure: vertices B, C, D on Machine 1 each send a message to A on Machine 2; the messages are combined with + (Sum) before crossing the network]

  • User-defined commutative, associative (+) message operation
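A small Python sketch of the combiner idea (the message values and destinations are made up): messages bound for the same remote vertex are pre-aggregated with the user-defined (+) before anything crosses the network.

    # Messages generated on Machine 1, destined for vertices on Machine 2.
    messages = [("A", 0.12), ("A", 0.40), ("A", 0.03), ("E", 0.25)]

    combined = {}
    for dst, val in messages:
        # Sum is commutative and associative, so partial aggregation is safe.
        combined[dst] = combined.get(dst, 0.0) + val

    for dst, total in combined.items():
        print("send", total, "to", dst)        # one message per destination, not per edge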

SLIDE 23

Pregel Struggles with Fan-Out

[Figure: vertex A on Machine 1 broadcasts to B, C, D on Machine 2]

  • Broadcast sends many copies of the same message to the same machine!

SLIDE 24

Fan-In and Fan-Out Performance

  • PageRank on synthetic Power-Law Graphs
    – Piccolo was used to simulate Pregel with combiners

[Plot: total communication (GB) vs. power-law constant α from 1.8 to 2.2; smaller α means more high-degree vertices]

SLIDE 25

GraphLab Ghosting

  • Changes to the master vertex are synced to its ghosts

[Figure: Machine 1 holds A, B, C plus a ghost of D; Machine 2 holds D plus ghosts of A, B, C]

SLIDE 26

GraphLab Ghosting

  • Changes to the neighbors of high-degree vertices create substantial network traffic

[Figure: same ghosting layout as before; updates to A, B, C on Machine 1 must all be synced to their ghosts on Machine 2]

SLIDE 27

Fan-In and Fan-Out Performance

  • PageRank on synthetic Power-Law Graphs
  • GraphLab’s ghosting is undirected, so fan-in and fan-out incur the same communication

[Plot: total communication (GB) vs. power-law constant α from 1.8 to 2.2; smaller α means more high-degree vertices]

SLIDE 28

Graph Partitioning

  • Graph parallel abstractions rely on partitioning:

– Minimize communication
– Balance computation and storage

[Figure: an edge-cut between Machine 1 and Machine 2; data transmitted across the network is O(# cut edges)]

SLIDE 29

Random Partitioning

  • Both GraphLab and Pregel resort to random (hashed) partitioning on natural graphs.

For p machines, the expected fraction of edges cut is

E[|Edges Cut|] / |E| = 1 − 1/p

10 machines → 90% of edges cut
100 machines → 99% of edges cut!
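The percentages above follow directly from the formula; a quick check in Python:

    # Expected fraction of edges cut when vertices are hashed to p machines.
    for p in (10, 100):
        print(p, "machines:", 1 - 1 / p)       # prints 0.9 and 0.99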

SLIDE 30

In Summary

GraphLab and Pregel are not well suited for natural graphs

  • Challenges of high-degree vertices
  • Low quality partitioning

SLIDE 31
  • GAS Decomposition: distribute vertex-programs
    – Move computation to data
    – Parallelize high-degree vertices
  • Vertex Partitioning:
    – Effectively distribute large power-law graphs

SLIDE 32

A Common Pattern for Vertex-Programs

GraphLab_PageRank(i):
  // Gather information about neighborhood:
  //   compute sum over neighbors
  total = 0
  foreach(j in in_neighbors(i)):
    total = total + R[j] * w_ji
  // Update vertex:
  //   update the PageRank
  R[i] = 0.15 + total
  // Signal neighbors & modify edge data:
  //   trigger neighbors to run again
  if R[i] not converged then
    foreach(j in out_neighbors(i)):
      signal vertex-program on j

SLIDE 33

GAS Decomposition

Gather (Reduce): accumulate information about the neighborhood.
  User-defined: Gather(…) → Σ
  Partial results combine with a parallel sum: Σ1 + Σ2 → Σ3

Apply: apply the accumulated value Σ to the center vertex.
  User-defined: Apply(Y, Σ) → Y’

Scatter: update adjacent edge data and activate neighbors.
  User-defined: Scatter(Y’, …) → updated edge data

SLIDE 34

PageRank in PowerGraph

PowerGraph_PageRank(i):
  Gather(j → i):
    return w_ji * R[j]
  sum(a, b):
    return a + b
  Apply(i, Σ):
    R[i] = 0.15 + Σ
  Scatter(i → j):
    if R[i] changed then trigger j to be recomputed

R[i] = 0.15 + \sum_{j \in Nbrs(i)} w_{ji} R[j]
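To show how the user-defined functions fit together, here is a minimal single-machine GAS engine in Python running the PageRank program above; the edge list, scheduler, and convergence threshold are illustrative assumptions, not the actual PowerGraph C++ API:

    # Toy GAS engine. Edges are (src, dst, w) triples; PageRank as Gather/Apply/Scatter.
    edges = [("b", "a", 0.5), ("c", "a", 0.5), ("a", "b", 0.5), ("a", "c", 0.5)]
    verts = {v for s, d, _ in edges for v in (s, d)}
    in_edges = {v: [(s, w) for s, d, w in edges if d == v] for v in verts}
    out_nbrs = {v: [d for s, d, _ in edges if s == v] for v in verts}
    R = {v: 1.0 for v in verts}

    gather = lambda j, i, w_ji: w_ji * R[j]          # Gather(j -> i)
    combine = lambda a, b: a + b                     # sum(a, b)
    apply_ = lambda i, acc: 0.15 + acc               # Apply(i, sum)

    active = set(verts)
    while active:
        i = active.pop()
        acc = 0.0
        for j, w in in_edges[i]:                     # gather phase
            acc = combine(acc, gather(j, i, w))
        new_r = apply_(i, acc)                       # apply phase
        changed = abs(new_r - R[i]) > 1e-6           # assumed threshold
        R[i] = new_r
        if changed:
            for j in out_nbrs[i]:                    # scatter phase: signal neighbors
                active.add(j)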

SLIDE 35

Distributed Execution of a PowerGraph Vertex-Program

[Figure: vertex Y spans Machines 1–4, with one master and three mirrors; each machine gathers a partial sum Σ1…Σ4, the partial sums are combined at the master, Apply produces Y’, and Y’ is sent back to the mirrors for the scatter phase]

SLIDE 36

Minimizing Communication in PowerGraph

A vertex-cut minimizes the number of machines each vertex spans.

  • Communication is linear in the number of machines each vertex spans.
  • Percolation theory suggests that power-law graphs have good vertex-cuts. [Albert et al. 2000]

SLIDE 37

New Approach to Partitioning

  • Rather than cut edges: an edge-cut forces CPU 1 and CPU 2 to synchronize many edges.
  • Instead, we cut vertices: a vertex-cut requires synchronizing only a single vertex.

New Theorem: For any edge-cut we can directly construct a vertex-cut which requires strictly less communication and storage.

SLIDE 38

Constructing Vertex-Cuts

  • Evenly assign edges to machines
    – Minimize machines spanned by each vertex
  • Assign each edge as it is loaded
    – Touch each edge only once
  • Propose three distributed approaches:
    – Random Edge Placement
    – Coordinated Greedy Edge Placement
    – Oblivious Greedy Edge Placement

SLIDE 39

Random Edge-Placement

  • Randomly assign edges to machines

[Figure: the edges of vertices Y and Z assigned across Machines 1–3; Y spans 3 machines, Z spans 2 machines; the result is a balanced vertex-cut, and no edge is cut]

SLIDE 40

Analysis of Random Edge-Placement

  • Expected number of machines spanned by a vertex:

[Plot: expected # of machines spanned vs. number of machines (8–48), predicted curve vs. measured random placement, on the Twitter Follower Graph (41 million vertices, 1.4 billion edges)]

The prediction accurately estimates memory and communication overhead.
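The predicted curve is the standard balls-into-bins expectation: if a vertex of degree d has its edges placed uniformly at random on p machines, each machine receives at least one of them with probability 1 − (1 − 1/p)^d, so the expected span is p(1 − (1 − 1/p)^d). A quick check in Python (the degrees are illustrative):

    def expected_span(d, p):
        # Expected machines spanned by a degree-d vertex under random placement.
        return p * (1 - (1 - 1 / p) ** d)

    for d in (1, 10, 100, 10_000):
        print(d, round(expected_span(d, 48), 2))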

SLIDE 41

Random Vertex-Cuts vs. Edge-Cuts

  • Expected improvement from vertex-cuts:

[Plot: reduction in communication and storage (1–100x, log scale) vs. number of machines (50–150): an order-of-magnitude improvement]

SLIDE 42

Greedy Vertex-Cuts

  • Place edges on machines which already have the vertices in that edge.

[Figure: two machines, each already holding some vertices; a new edge is placed on a machine that already has its endpoints]

SLIDE 43

Greedy Vertex-Cuts

  • De-randomization → greedily minimize the expected number of machines spanned
  • Coordinated Edge Placement
    – Requires coordination to place each edge
    – Slower, but higher-quality cuts
  • Oblivious Edge Placement
    – Approximates the greedy objective without coordination
    – Faster, but lower-quality cuts

A sketch of the greedy placement heuristic follows.
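This is a minimal sketch of greedy edge placement in Python, under simplifying assumptions: a single coordinator with an exact global view and ties broken by load (oblivious placement would use stale, per-machine views instead, and a real implementation also enforces balance across machines):

    import collections

    def greedy_place(edge_list, p):
        machines_of = collections.defaultdict(set)   # vertex -> machines holding it
        load = [0] * p                                # edges per machine
        for u, v in edge_list:
            # Prefer machines already holding both endpoints, then either,
            # then any machine; break ties by current load.
            both = machines_of[u] & machines_of[v]
            either = machines_of[u] | machines_of[v]
            candidates = both or either or set(range(p))
            m = min(candidates, key=lambda k: load[k])
            machines_of[u].add(m)
            machines_of[v].add(m)
            load[m] += 1
        # Replication factor: average machines spanned per vertex (lower is better).
        return sum(map(len, machines_of.values())) / len(machines_of)

    print(greedy_place([("a", "b"), ("a", "c"), ("b", "c"), ("a", "d")], 4))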

SLIDE 44

Partitioning Performance

Twitter Graph: 41M vertices, 1.4B edges

Oblivious placement balances partition cost and construction time.

[Plots: (left) cost: average # of machines spanned vs. number of machines (8–64); (right) construction time: partitioning time in seconds vs. number of machines, for random, oblivious, and coordinated placement; lower is better for both]

SLIDE 45

Greedy Vertex-Cuts Improve Performance

[Bar chart: runtime relative to random partitioning (0–1) for PageRank, Collaborative Filtering, and Shortest Path under random, oblivious, and coordinated placement]

Greedy partitioning improves computation performance.

SLIDE 46

Other Features (See Paper)

  • Supports three execution modes:
    – Synchronous: bulk-synchronous GAS phases
    – Asynchronous: interleaved GAS phases
    – Asynchronous + Serializable: neighboring vertices do not run simultaneously
  • Delta Caching
    – Accelerates the gather phase by caching partial sums for each vertex (see the sketch below)
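A sketch of the delta-caching idea in Python, assuming gather values combine with an additive, invertible sum (all names here are illustrative): each vertex caches its last gather accumulator, and neighbors push only the change, so a full re-gather is needed only on a cache miss.

    cache = {}                                   # vertex -> cached gather sum

    def run_vertex(i, in_edges, R):
        if i not in cache:                       # full gather only on a cache miss
            cache[i] = sum(w * R[j] for j, w in in_edges[i])
        R[i] = 0.15 + cache[i]                   # apply

    def push_delta(i, delta):
        # A neighbor's contribution to i changed by `delta`; patch the cached
        # sum instead of re-reading every in-edge of i.
        if i in cache:
            cache[i] += delta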

SLIDE 47

System Evaluation

SLIDE 48

System Design

  • Implemented as a C++ API
  • Uses HDFS for graph input and output
  • Fault tolerance is achieved by checkpointing
    – Snapshot time < 5 seconds for the Twitter network

System stack: the PowerGraph (GraphLab2) System on top of MPI/TCP-IP, PThreads, and HDFS, running on EC2 HPC nodes.

SLIDE 49

Implemented Many Algorithms

  • Collaborative Filtering
    – Alternating Least Squares
    – Stochastic Gradient Descent
    – SVD
    – Non-negative Matrix Factorization
  • Statistical Inference
    – Loopy Belief Propagation
    – Max-Product Linear Programs
    – Gibbs Sampling
  • Graph Analytics
    – PageRank
    – Triangle Counting
    – Shortest Path
    – Graph Coloring
    – K-core Decomposition
  • Computer Vision
    – Image Stitching
  • Language Modeling
    – LDA

SLIDE 50

Comparison with GraphLab & Pregel

  • PageRank on synthetic Power-Law Graphs:

[Plots: runtime in seconds and total network traffic (GB) vs. power-law constant α (1.8–2.2) for Pregel (Piccolo) and GraphLab; smaller α means more high-degree vertices]

PowerGraph is robust to high-degree vertices.

SLIDE 51

PageRank on the Twitter Follower Graph

[Bar charts: total network traffic (GB) and runtime in seconds for GraphLab, Pregel (Piccolo), and PowerGraph]

Natural Graph with 40M users, 1.4 billion links: PowerGraph reduces communication and runs faster.

32 nodes x 8 cores (EC2 HPC cc1.4x)

SLIDE 52

PowerGraph is Scalable

Yahoo AltaVista Web Graph (2002): one of the largest publicly available web graphs, with 1.4 billion webpages and 6.6 billion links.

1024 cores (2048 HT) on 64 HPC nodes: 7 seconds per iteration.

1B links processed per second, with 30 lines of user code.

SLIDE 53

Topic Modeling

  • English-language Wikipedia
    – 2.6M documents, 8.3M words, 500M tokens
    – Computationally intensive algorithm

[Bar chart: million tokens per second. Smola et al.: 100 Yahoo! machines, specifically engineered for this task. PowerGraph: 64 cc2.8xlarge EC2 nodes, 200 lines of code and 4 human hours.]

SLIDE 54

Triangle Counting on the Twitter Graph

Identify individuals with strong communities. Counted: 34.8 billion triangles.

Hadoop [WWW’11]: 1536 machines, 423 minutes
PowerGraph: 64 machines, 1.5 minutes (282x faster)

Why is Hadoop so much slower? Wrong abstraction → broadcasts O(degree^2) messages per vertex.

[WWW’11] S. Suri and S. Vassilvitskii, “Counting triangles and the curse of the last reducer,” WWW 2011.

SLIDE 55

Summary

  • Problem: Computation on Natural Graphs is challenging
    – High-degree vertices
    – Low-quality edge-cuts
  • Solution: the PowerGraph system
    – GAS Decomposition: split vertex-programs
    – Vertex-partitioning: distribute natural graphs
  • PowerGraph theoretically and experimentally outperforms existing graph-parallel systems.

SLIDE 56

PowerGraph (GraphLab2) System

Machine Learning and Data-Mining Toolkits: Graph Analytics, Graphical Models, Computer Vision, Clustering, Topic Modeling, Collaborative Filtering

SLIDE 57

Future Work

  • Time-evolving graphs
    – Support structural changes during computation
  • Out-of-core storage (GraphChi)
    – Support graphs that don’t fit in memory
  • Improved fault tolerance
    – Leverage vertex replication to reduce snapshots
    – Asynchronous recovery

SLIDE 58

PowerGraph is GraphLab Version 2.1, released under the Apache 2 License.

http://graphlab.org

Documentation… Code… Tutorials… (more on the way)