

SLIDE 1

SC 2015

Data Partitioning Strategies for Graph Workloads on Heterogeneous Clusters

Michael LeBeane, Shuang Song, Reena Panda, Jee Ho Ryoo, Lizy K. John
The University of Texas at Austin
mlebeane@utexas.edu

SLIDE 2

Motivation

▪ Heterogeneity is pervasive in modern data centers [][]
▪ Graph analytics are a pervasive workload in the data center []
– Many frameworks available to efficiently and easily perform graph analytics [][][][]
▪ Most frameworks are not equipped to deal with heterogeneity in the data center

2 Michael LeBeane 11/18/2015

[Figure: six compute nodes, each holding one block of data, connected by a network.]

SLIDE 3

Background

▪ All work performed on PowerGraph[] framework
▪ Three relevant graph partitioning topics:
– Online vs. Offline Partitioning
– Vertex vs. Edge Cut
– Gather/Apply/Scatter

▪ Online vs. Offline Partitioning

[Figure: online vs. offline partitioning; only the partition labels 1 and 2 survive from the diagram.]

SLIDE 4

Background

▪ Vertex vs. Edge Cut

[Figure: (a) vertex cut and (b) edge cut of a small graph split across Machine X and Machine Y, with master and ghost vertices marked.]

▪ Gather/Apply/Scatter

[Figure: the (a) Gather, (b) Apply, and (c) Scatter phases on a five-vertex example; Apply evaluates F(x) and turns vertices 1 to 5 into updated vertices 1’ to 5’.]
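As a rough illustration of the Gather/Apply/Scatter model, the sketch below writes PageRank as three explicit phases on a single machine. It is not PowerGraph code: the function name, the damping value, and the synchronous loop are assumptions made for the example.

```python
from collections import defaultdict

def pagerank_gas(edges, num_iters=20, damping=0.85):
    """Synchronous PageRank written as explicit Gather/Apply/Scatter phases."""
    vertices = sorted({v for edge in edges for v in edge})
    out_degree = defaultdict(int)
    in_neighbors = defaultdict(list)
    for src, dst in edges:
        out_degree[src] += 1
        in_neighbors[dst].append(src)

    rank = {v: 1.0 for v in vertices}
    for _ in range(num_iters):
        # Gather: each vertex sums the contributions arriving on its in-edges.
        gathered = {
            v: sum(rank[u] / out_degree[u] for u in in_neighbors[v])
            for v in vertices
        }
        # Apply: F(x) turns the gathered sum into the new vertex value.
        new_rank = {v: (1 - damping) + damping * gathered[v] for v in vertices}
        # Scatter: a distributed engine would signal out-neighbors here;
        # publishing new_rank for the next iteration plays that role.
        rank = new_rank
    return rank

ranks = pagerank_gas([(1, 2), (2, 3), (3, 1)])
```

On a three-vertex cycle every vertex keeps a rank of about 1.0, which makes the three phases easy to trace by hand.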

SLIDE 5

Workload Skew in Heterogeneous Data Centers

▪ Normal Data Partitioning

[Figure: timeline of compute and communication phases on a fast node and a slow node that hold equal amounts of data; the fast node sits idle at each barrier.]

▪ Skewed Data Partitioning

[Figure: the same timeline with more data placed on the fast node; the idle periods disappear and the overall runtime improves.]
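The effect in the two timelines can be reproduced with a toy model in which each barrier-synchronized step lasts as long as the slowest node. The throughput values and edge count below are hypothetical, and communication time is ignored.

```python
def iteration_time(edges_per_node, throughput):
    """Time of one barrier-synchronized step: the slowest node sets the pace."""
    return max(e / t for e, t in zip(edges_per_node, throughput))

def split(total_edges, weights):
    """Divide total_edges across nodes in proportion to weights."""
    total_weight = sum(weights)
    return [total_edges * w / total_weight for w in weights]

throughput = [3.0, 1.0]  # hypothetical: the fast node is 3x the slow node
total = 1200             # hypothetical number of edges

normal = iteration_time(split(total, [1, 1]), throughput)      # 600.0
skewed = iteration_time(split(total, throughput), throughput)  # 300.0
```

With an even split the slow node needs 600 time units while the fast node finishes in 200 and idles; splitting 900/300 lets both finish in 300.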

SLIDE 6

Heterogeneous Graph Analytics

▪ Local node computation time dependent on data distribution
▪ To properly balance work, we need:
– Estimation of each node’s computational capacity
– Partitioning algorithms that account for skewed computational capacity

[Figure: ingress pipeline (Loading Files, Partitioning Graph, Finalizing Graph, App Execution) that reads File 1 to File N and distributes the graph across Node 1 to Node n. A heterogeneity-aware partitioner, driven by a computation capacity estimate, is shown alongside the baseline partitioner.]

SLIDE 7

Heterogeneous Computation Capacity

▪ Computation capacity is complex
▪ Dependent on many factors:
– Hardware of the node
– Nature of the graph
– Nature of the algorithm
– Communication patterns
▪ Can we determine a simple, static estimate?

SLIDE 8

Skew Factor Calculation

▪ Static estimate of node computational capacity could be based on:
– Threads: Logical compute threads on node (default N – 2, where N is the number of hardware threads)
– Memory: Physical memory assigned to a node
– Profiling: Local throughput of graph subset and algorithm
▪ We will refer to the estimated ratios of computation capacity as the skew factor of the heterogeneous data center

Name        HW Threads  Memory  Network                Thread Skew Factor  Memory Skew Factor
c4.xlarge   4           7.5 GB  100 Mbps to 1.86 Gbps  1                   1
c4.2xlarge  8           15 GB   100 Mbps to 1.86 Gbps  3                   2
c4.4xlarge  16          30 GB   100 Mbps to 1.86 Gbps  7                   4
c4.8xlarge  36          60 GB   up to 8.86 Gbps        17                  8
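The skew factors in the table follow from normalizing each node's capacity to the smallest node. A minimal sketch, assuming the thread-based estimate uses N – 2 compute threads per node (the function name is made up for this example):

```python
def skew_factor(capacities):
    """Ratio of each node's capacity to the smallest node's capacity."""
    smallest = min(capacities)
    return [c / smallest for c in capacities]

hw_threads = [4, 8, 16, 36]    # c4.xlarge ... c4.8xlarge, from the table
memory_gb = [7.5, 15, 30, 60]

# Thread-based estimate: N - 2 compute threads per node.
thread_skew = skew_factor([n - 2 for n in hw_threads])  # [1.0, 3.0, 7.0, 17.0]
memory_skew = skew_factor(memory_gb)                    # [1.0, 2.0, 4.0, 8.0]
```

The thread-based ratios come out as 2:6:14:34, that is 1:3:7:17, which differs from the raw hardware thread ratio of 1:2:4:9.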

SLIDE 9

Heterogeneous Partitioning Algorithm

▪ Online partitioning algorithms must be modified to support skew factor
▪ Easy to modify current online partitioning algorithms
▪ We have modified 5 popular algorithms from multiple sources

SLIDE 10

Problem Formulation

▪ Statically estimated based on:
– Threads: Logical compute threads on node (default N – 2)
– Memory: Physical memory assigned to a node
– Profiling: Local throughput of graph subset and algorithm

SLIDE 11

Random Skewed Partitioner

▪ Random assignment of edges to nodes

▪ Original

[Figure: each edge is randomly assigned to one of Node 0 to Node n.]

▪ Skewed

[Figure: the same random assignment, with the skew factor as an additional input.]
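One way to picture the skewed variant is a weighted random draw in which each node's probability is proportional to its skew factor. The sketch below is an assumption about how that could look, not the PowerGraph implementation; the function name and the synthetic edge list are hypothetical.

```python
import random
from collections import Counter

def random_skewed_partition(edges, skew, seed=0):
    """Assign each edge to a node with probability proportional to its skew."""
    rng = random.Random(seed)
    nodes = range(len(skew))
    return {edge: rng.choices(nodes, weights=skew, k=1)[0] for edge in edges}

edges = [(i, i + 1) for i in range(28000)]   # synthetic edge list
assignment = random_skewed_partition(edges, skew=[1, 3, 7, 17])
counts = Counter(assignment.values())
# Expected shares are 1/28, 3/28, 7/28, and 17/28 of the edges.
```

Setting every skew factor to 1 reduces this to the original uniform random assignment.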

SLIDE 12

Greedy Skewed Partitioner

▪ Greedy decision using current distribution of edges
– Either locally or coordinated

▪ Original

[Figure: each edge passes through a heuristic assignment that takes the current balance into account before landing on one of Node 0 to Node n.]

▪ Skewed

[Figure: the same heuristic assignment, with the skew factor as an additional input next to the balance.]
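A greedy placement of this kind can be sketched as a score per node that combines locality (the node already holds an edge of either endpoint) with a balance term. The scoring rule below is an assumption loosely modeled on greedy vertex-cut heuristics, and the skew-aware part, dividing each node's load by its skew factor before computing balance, is likewise illustrative.

```python
from collections import defaultdict

def greedy_skewed_partition(edges, skew, epsilon=1.0):
    """Place each edge on the node with the best locality + balance score."""
    num_nodes = len(skew)
    load = [0] * num_nodes
    placed = defaultdict(set)  # vertex -> nodes already holding one of its edges
    assignment = {}
    for u, v in edges:
        # Normalize load by capacity so "balanced" means proportional to skew.
        norm = [load[p] / skew[p] for p in range(num_nodes)]
        hi, lo = max(norm), min(norm)

        def score(p):
            balance = (hi - norm[p]) / (epsilon + hi - lo)
            locality = (p in placed[u]) + (p in placed[v])
            return balance + locality

        target = max(range(num_nodes), key=score)
        assignment[(u, v)] = target
        load[target] += 1
        placed[u].add(target)
        placed[v].add(target)
    return assignment, load

# Edges with no shared vertices isolate the balance term.
disjoint = [(2 * i, 2 * i + 1) for i in range(400)]
_, load = greedy_skewed_partition(disjoint, skew=[1, 3])
```

With no locality to exploit, the 400 edges end up split 100/300, matching the 1:3 skew.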

SLIDE 13

Grid Skewed Partitioner

▪ Greedy decision using current distribution of edges
– Either locally or coordinated

▪ Original

[Figure: each edge is hashed onto a grid of nodes (Grid Hash), followed by a random selection among the candidate nodes Node 0 to Node n.]

▪ Skewed

[Figure: the same grid hash and random selection, with the skew factor as an additional input to the selection.]
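The grid hash and random selection steps can be sketched as follows, assuming the usual grid constraint: each vertex hashes to a grid cell, its candidate nodes are that cell's row and column, and an edge goes to a node in the intersection of its endpoints' candidates. Weighting the final selection by skew factor is this sketch's assumption; the helper names and the square-grid restriction are also illustrative.

```python
import math
import random

def constraint_set(vertex, side):
    """Nodes sharing a row or column with the grid cell `vertex` hashes to."""
    cell = hash(vertex) % (side * side)  # integer vertex ids hash deterministically
    row, col = divmod(cell, side)
    same_row = {row * side + c for c in range(side)}
    same_col = {r * side + col for r in range(side)}
    return same_row | same_col

def grid_skewed_partition(edges, skew, seed=0):
    """Pick each edge's node from the grid intersection, weighted by skew."""
    side = math.isqrt(len(skew))
    if side * side != len(skew):
        raise ValueError("this sketch assumes a square number of nodes")
    rng = random.Random(seed)
    assignment = {}
    for u, v in edges:
        # A row and a column always cross, so the intersection is never empty.
        candidates = sorted(constraint_set(u, side) & constraint_set(v, side))
        weights = [skew[p] for p in candidates]
        assignment[(u, v)] = rng.choices(candidates, weights=weights, k=1)[0]
    return assignment

edges = [(i, (7 * i + 3) % 50) for i in range(200)]  # synthetic edges
assignment = grid_skewed_partition(edges, skew=[1, 3, 7, 17])
```

Restricting candidates to the intersection bounds how many nodes can hold a copy of any one vertex, while the weights steer more edges toward the larger nodes.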

SLIDE 14

Hybrid Skewed Partitioner

▪ Random assignment of edges/vertices to nodes based on degree

▪ Original

[Figure: edges and vertices reach Node 0 to Node n through a random assignment; vertices with degree above a threshold go through a heuristic assignment instead.]

▪ Skewed

[Figure: the same two assignment paths, each with the skew factor as an additional input.]
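The slide does not spell out the degree rule, so the sketch below assumes a PowerLyra-style hybrid cut: in-edges of a low-degree target stay together on the target's node, while in-edges of a high-degree target are spread by source vertex. The skew-weighted hash, the helper names, and the threshold value are all assumptions made for illustration.

```python
import random
from collections import Counter

def weighted_node(key, skew, seed=0):
    """Deterministic skew-weighted choice: equal keys always map to one node."""
    rng = random.Random(hash((key, seed)))  # assumes integer vertex ids
    return rng.choices(range(len(skew)), weights=skew, k=1)[0]

def hybrid_skewed_partition(edges, skew, threshold=100):
    """Degree-based placement with skew-weighted hashing."""
    in_degree = Counter(dst for _, dst in edges)
    assignment = {}
    for src, dst in edges:
        # Low-degree target: keep all of its in-edges on the target's node.
        # High-degree target: spread its in-edges by hashing the source.
        key = dst if in_degree[dst] <= threshold else src
        assignment[(src, dst)] = weighted_node(key, skew)
    return assignment

hub_edges = [(i, 0) for i in range(1, 501)]     # vertex 0 has in-degree 500
leaf_edges = [(1000, 2000), (1001, 2000)]       # vertex 2000 has in-degree 2
assignment = hybrid_skewed_partition(hub_edges + leaf_edges, skew=[1, 3, 7, 17])
```

Both in-edges of vertex 2000 land on the same node, while the hub's 500 in-edges are scattered, with larger nodes receiving more of them on average.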

SLIDE 15

Ginger Skewed Partitioner

▪ Random assignment of edges/vertices to nodes based on degree

▪ Original

[Figure: the same random and heuristic assignment paths as the hybrid partitioner, with the current balance as an additional input to the assignment.]

▪ Skewed

[Figure: the same paths, with the skew factor as an additional input next to the balance.]

SLIDE 16

Experimental Setup

▪ Algorithms
– Graph: PageRank (PR), Connected Components (CC), Triangle Count (TC)
– Matrix: Stochastic Gradient Descent (SGD), Alternating Least Squares (ALS)

▪ Data Sets

Name            Vertices    Edges          Size (Uncompressed)  Type              Algorithms
amazon          403,394     3,384,388      46MB                 Directed Graph    PR,CC,TC
citation        3,774,768   16,518,948     268MB                Directed Graph    PR,CC,TC
netflix         NA          NA             100MB                Sparse Matrix     ALS,SGD
road-map        1,379,917   1,921,660      84MB                 Undirected Graph  PR,CC,TC
social-network  4,847,571   68,993,773     1.1GB                Directed Graph    PR,CC,TC
twitter         41,000,000  1,400,000,000  25GB                 Directed Graph    PR,CC,TC
wiki            2,394,385   5,021,410      64MB                 Directed Graph    PR,CC,TC

SLIDE 17

Experimental Setup

▪ Data Center

▪ Skew Factor
– Results use Thread Based Skew Factor

SLIDE 18

Execution Time

▪ PageRank

[Figure: PageRank runtime (s), baseline vs. skewed, for the Random, Greedy, Grid, Hybrid, and Ginger partitioners on social_network, amazon, citation, road_map, and wiki; each bar is broken down into Transmit, Receive, Gather, Apply, and Scatter.]

SLIDE 19

Execution Time

▪ Connected Components

[Figure: Connected Components runtime (s), baseline vs. skewed, for the Random, Greedy, Grid, Hybrid, and Ginger partitioners on social_network, amazon, citation, road_map, and wiki; each bar is broken down into Transmit, Receive, Gather, Apply, and Scatter. The plot has two runtime axes, and part of the data is read against the right axis.]

SLIDE 20

Execution Time

▪ Triangle Count

[Figure: Triangle Count runtime (s), baseline vs. skewed, for the Random, Greedy, Grid, Hybrid, and Ginger partitioners on social_network, amazon, citation, road_map, and wiki; each bar is broken down into Transmit, Receive, Gather, Apply, and Scatter. The plot has two runtime axes, and part of the data is read against the right axis.]

SLIDE 21

Execution Time

▪ Stochastic Gradient Descent

[Figure: SGD runtime (s) on netflix, baseline vs. skewed, for the Random, Greedy, Grid, Hybrid, and Ginger partitioners; bars are broken down into TX, RX, G, A, and S.]

▪ Alternating Least Squares

[Figure: ALS runtime (s) on netflix, baseline vs. skewed, for the Random, Greedy, Grid, Hybrid, and Ginger partitioners; bars are broken down into TX, RX, G, A, and S.]

SLIDE 22

Data distribution

▪ Ideal distribution 17-7-3-1

[Figure: relative edge distribution (0 to 1) across Node (1), Node (3), Node (7), and Node (17) for SRandom, SGreedy, SGrid, SHybrid, and SGinger on social_network, amazon, citation, road_map, and wiki, shown next to the Non-Skew and Target-Skew (optimal) distributions.]
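The ideal 17-7-3-1 distribution translates into target edge shares by dividing each skew factor by their sum. A small sketch, with function names chosen for this example, that also measures how far an observed distribution sits from the target:

```python
from fractions import Fraction

def target_shares(skew):
    """Fraction of all edges each node should hold under a given skew."""
    total = sum(skew)
    return [Fraction(s, total) for s in skew]

def max_deviation(edge_counts, skew):
    """Largest absolute gap between observed and target edge shares."""
    total = sum(edge_counts)
    return max(
        abs(Fraction(count, total) - share)
        for count, share in zip(edge_counts, target_shares(skew))
    )

shares = target_shares([17, 7, 3, 1])
print([round(float(s), 3) for s in shares])    # [0.607, 0.25, 0.107, 0.036]
print(max_deviation([7, 7, 7, 7], [17, 7, 3, 1]))  # 5/14
```

An unskewed partitioner gives every node 25% of the edges, so the largest node falls 5/14 (about 36 percentage points) short of its 61% target.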

SLIDE 23

Results

▪ Skewed approach generally decreases network communication

[Figure: replication factor (0 to 4), baseline vs. skewed, for the Random, Greedy, Grid, Hybrid, and Ginger partitioners on social_network, amazon, citation, road_map, and wiki.]
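Replication factor is commonly defined for vertex cuts as the average number of nodes that hold a copy of each vertex; fewer copies mean fewer ghosts to synchronize over the network. The sketch below computes it from an edge-to-node assignment; the function name and the dictionary layout are assumptions for the example.

```python
from collections import defaultdict

def replication_factor(assignment):
    """Average number of nodes holding a copy of each vertex under a vertex cut."""
    replicas = defaultdict(set)
    for (u, v), node in assignment.items():
        replicas[u].add(node)
        replicas[v].add(node)
    return sum(len(nodes) for nodes in replicas.values()) / len(replicas)

# Triangle split over two nodes: vertices 1 and 2 are replicated, vertex 3 is not.
rf = replication_factor({(1, 2): 0, (2, 3): 1, (3, 1): 1})  # 5 copies / 3 vertices
```

Placing all three edges on one node would give a replication factor of exactly 1.0, the minimum.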

SLIDE 24

Results

▪ Data Ingress Time

[Figure: ingress time (s), baseline vs. skewed, for the Random, Greedy, Grid, Hybrid, and Ginger partitioners on social_network, amazon, citation, road_map, and wiki. The axis runs to 40 s; one off-scale bar is labeled 58.]

SLIDE 25

Scale-out Results

▪ Extremely large Twitter graph
▪ No benefits after 36 nodes

Configuration Name  c4.2xlarge  c4.4xlarge  c4.8xlarge
Config 1            12          8           4
Config 2            8           8           8
Config 3            4           8           12
Config 4            3           5           16

[Figure: percentage improvement for Config 1 to Config 4 with the Random, Greedy, Grid, Hybrid, and Ginger partitioners.]

[Figure: runtime (s) vs. cluster size for each baseline partitioner and its skewed counterpart (SRandom, SGreedy, SGrid, SHybrid, SGinger).]

SLIDE 26

Future Work

▪ Incorporate better network model
▪ Profile-based partitioning scheme
– How do we sample graph inputs?

SLIDE 27

Conclusion

▪ Simple, static throughput estimation can greatly improve performance
▪ We modify 5 existing online graph partitioning strategies for heterogeneous environments
▪ Our modified algorithms improve runtime by as much as 64% and on average 32% on Amazon EC2
▪ We show that our strategies also work up to 48 nodes, achieving 18% performance improvement on scale-out

SLIDE 28

Thank You!

SLIDE 29

References

[1] S. Garg, S. Sundaram, and H. D. Patel. Robust heterogeneous data center design: A principled approach. SIGMETRICS Perform. Eval. Rev., 39(3):28–30, Dec. 2011.
[2] B.-G. Chun, G. Iannaccone, G. Iannaccone, R. Katz, G. Lee, and L. Niccolini. An energy case for hybrid datacenters. SIGOPS Oper. Syst. Rev., 44(1):76–80, Mar. 2010.
[3] J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin. PowerGraph: Distributed graph-parallel computation on natural graphs. In OSDI, pages 17–30. USENIX Association, 2012.