X-Stream: A Case Study in Building a Graph Processing System
Amitabha Roy (LABOS)
X-Stream
Graph processing system
Single machine
Works on graphs stored:
  Entirely in RAM
  Entirely on SSD
  Entirely on magnetic disk
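I/O cost, with V vertices, E edges, U updates, main memory M and block size B:

Vertex partitions V(1), V(2), ...
Number of partitions: $O\left(\frac{V}{M}\right)$

Scatter: $O\left(\frac{E}{B} + \frac{V}{B} + \frac{U}{B}\right)$

Shuffle: $O\left(\frac{U}{B} \log_{M/B} \frac{V}{M}\right)$

Gather: $O\left(\frac{U}{B} + \frac{V}{B}\right)$

Total: $O\left(\frac{V+E}{B} + \frac{E}{B} \log_{M/B} \frac{V}{M}\right)$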
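A minimal sketch that plugs numbers into the cost terms on these slides. It assumes V, E and U count vertices, edges and updates, that M is the main memory size and B the block size in the same unit, and that the three terms correspond to the scatter, shuffle and gather phases. The function name `xstream_io_blocks` and the sample values are hypothetical, and Big-O constants are dropped, so the output only indicates relative magnitudes.

```python
import math


def xstream_io_blocks(V, E, U, M, B):
    """Evaluate the per-phase I/O terms, in block transfers, constants dropped.

    V, E, U: number of vertices, edges, updates (assumed meaning of the symbols)
    M, B:    main memory size and block size, in the same unit as V, E, U
    """
    if not (M > B > 0):
        raise ValueError("expected M > B > 0")
    # log base M/B of V/M, clamped at zero when all vertices fit in memory.
    shuffle_passes = max(0.0, math.log(V / M, M / B))
    scatter = (E + V + U) / B          # read edges and vertices, write updates
    shuffle = (U / B) * shuffle_passes  # route updates to their partitions
    gather = (U + V) / B               # read updates, read/write vertices
    return {
        "scatter": scatter,
        "shuffle": shuffle,
        "gather": gather,
        "total": scatter + shuffle + gather,
    }


if __name__ == "__main__":
    # Hypothetical sizes: 2^20 vertices, 2^24 edges, one update per edge.
    costs = xstream_io_blocks(V=2**20, E=2**24, U=2**24, M=2**16, B=2**8)
    for phase, blocks in costs.items():
        print(f"{phase:8s} {blocks:12.0f}")
```

Because U is at most E when each edge emits at most one update, the sum of the three phases stays within a constant factor of the combined expression (V+E)/B + (E/B) log_{M/B}(V/M).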
Option Buffers controlled by Overhead
levels in the hierarchy
Scatter

SOURCE  DEST
1       3
1       5
2       7
2       4
3       2
3       8
4       3
4       7
4       8
5       6
6       1
8       5
8       6

V: 1 2 3 4 5 6 7 8
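A minimal sketch of an edge-centric scatter step over the edge list and vertex set shown on this slide. The slide only names the scatter phase; the choice of breadth-first search as the computation, the matching gather step, and all function names here are assumptions made for illustration.

```python
# Edge list and vertex set as they appear on the slide.
EDGES = [(1, 3), (1, 5), (2, 7), (2, 4), (3, 2), (3, 8), (4, 3),
         (4, 7), (4, 8), (5, 6), (6, 1), (8, 5), (8, 6)]
VERTICES = range(1, 9)


def scatter(edges, level, round_no):
    """Stream the edges in order; emit an update for every edge whose
    source vertex was first reached in round `round_no`."""
    return [(dst, round_no + 1) for src, dst in edges if level[src] == round_no]


def gather(updates, level):
    """Apply updates to destination vertices; report whether any changed."""
    changed = False
    for dst, new_level in updates:
        if level[dst] is None:      # first time this vertex is reached
            level[dst] = new_level
            changed = True
    return changed


def bfs_levels(edges, vertices, root):
    """Alternate scatter and gather until no vertex state changes."""
    level = {v: None for v in vertices}
    level[root] = 0
    round_no = 0
    while True:
        updates = scatter(edges, level, round_no)
        if not gather(updates, level):
            return level
        round_no += 1


if __name__ == "__main__":
    print(bfs_levels(EDGES, VERTICES, root=1))
```

Note that `scatter` reads the edge list strictly front to back and only indexes into the vertex state, so the edges never need to be sorted or indexed by vertex.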
SOURCE  DEST
1       3
8       6
5       6
2       4
3       2
4       7
4       3
3       8
4       8
2       7
6       1
8       5
1       5
$\frac{\text{Scatters} \times \text{Edge Data}}{\text{Sequential Access Bandwidth}}$

$\frac{\text{Edge Data}}{\text{Random Access Bandwidth}}$
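The two expressions on this slide can be compared numerically. The sketch below assumes the first is the time to stream all edge data once per scatter at sequential bandwidth, and the second is the time to touch the edge data once at random-access bandwidth. The function names and the bandwidth figures in the usage example are hypothetical, not measurements from the talk.

```python
def streaming_time(scatters, edge_bytes, seq_bandwidth):
    """Scatters x Edge Data / Sequential Access Bandwidth, in seconds."""
    return scatters * edge_bytes / seq_bandwidth


def random_access_time(edge_bytes, rand_bandwidth):
    """Edge Data / Random Access Bandwidth, in seconds."""
    return edge_bytes / rand_bandwidth


def break_even_scatters(seq_bandwidth, rand_bandwidth):
    """Number of scatters at which both expressions are equal."""
    return seq_bandwidth / rand_bandwidth


if __name__ == "__main__":
    # Hypothetical device: 500 MB/s sequential, 5 MB/s random, 1 GB of edges.
    edge_bytes = 1_000_000_000
    seq_bw, rand_bw = 500_000_000, 5_000_000
    print(streaming_time(10, edge_bytes, seq_bw))   # ten streaming passes
    print(random_access_time(edge_bytes, rand_bw))  # one random-access pass
    print(break_even_scatters(seq_bw, rand_bw))
```

Under these assumed numbers, streaming stays cheaper as long as the number of scatters is below the ratio of sequential to random bandwidth.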
V 1 2 3 4 5 6 7 8
V1: 1 2 3 4
SOURCE  DEST
1       5
4       7
2       7
4       3
4       8
3       8
2       4
1       3
3       2

V2: 5 6 7 8
SOURCE  DEST
5       6
8       6
8       5
6       1
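The split shown on this slide places each edge with the partition that owns its source vertex. A minimal sketch of that split follows, assuming a single pass over an unordered edge list; `partition_edges` is a hypothetical name, and the input order is taken from the unordered edge list earlier in the deck.

```python
def partition_edges(edges, vertex_partitions):
    """Assign every edge to the partition that contains its source vertex.

    edges:             iterable of (source, destination) pairs, in any order
    vertex_partitions: list of disjoint vertex sets, e.g. [V1, V2]
    Returns one edge list per partition, preserving the input order.
    """
    owner = {v: i for i, part in enumerate(vertex_partitions) for v in part}
    out = [[] for _ in vertex_partitions]
    for src, dst in edges:  # one sequential pass, no sorting required
        out[owner[src]].append((src, dst))
    return out


if __name__ == "__main__":
    unordered = [(1, 3), (8, 6), (5, 6), (2, 4), (3, 2), (4, 7), (4, 3),
                 (3, 8), (4, 8), (2, 7), (6, 1), (8, 5), (1, 5)]
    v1, v2 = {1, 2, 3, 4}, {5, 6, 7, 8}
    e1, e2 = partition_edges(unordered, [v1, v2])
    print(len(e1), e1)
    print(len(e2), e2)
```

Edges inside a partition remain unordered; only the source vertex decides where an edge goes, so a destination may belong to a different partition.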
[Figure: Netflix/ALS, Twitter/Pagerank, Twitter/Belief Propagation, RMAT27/WCC; axis range 1 to 6]
[Figure: Graphchi sharding and X-Stream runtime; axis: Time (sec), 500 to 3000]
[Figure: Read (MB/s) over a 5 minute window, X-Stream and Graphchi; axis 100 to 1000]
[Figure: Write (MB/s) over a 5 minute window, X-Stream and Graphchi; axis 100 to 800]
Weakly Connected Components
[Figure: Time (HH:MM:SS), 0:00:01 to 24:00:00, against input edge data]

Storage      Vertices      Edges         Time
16 GB RAM    8 million     128 million   8 sec
400 GB SSD   256 million   4 billion     33 mins
6 TB disk    4 billion     64 billion    26 hours
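A short worked calculation of input edges per second of total runtime for the three data points on this slide. It assumes the storage tiers pair with the graph sizes in the order listed, and that "million" and "billion" mean 10^6 and 10^9. The reported times cover the whole Weakly Connected Components run, so these are not raw device streaming rates. The names `RUNS` and `edges_per_second` are hypothetical.

```python
# (storage, edges, runtime in seconds), pairing assumed from the slide order.
RUNS = [
    ("16 GB RAM", 128_000_000, 8),
    ("400 GB SSD", 4_000_000_000, 33 * 60),
    ("6 TB Disk", 64_000_000_000, 26 * 3600),
]


def edges_per_second(edges, seconds):
    """Input edges divided by total runtime."""
    return edges / seconds


if __name__ == "__main__":
    for storage, edges, seconds in RUNS:
        rate = edges_per_second(edges, seconds)
        print(f"{storage:12s} {rate / 1e6:8.2f} million edges/sec")
```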
[Figure: two graphs on vertices 1 to 8]
D=3, BFS in 3 steps (most real-world graphs)
D=7, BFS in 7 steps
[Figure: Graphchi Runtime Breakdown. Fraction of runtime (0 to 1) per benchmark (Netflix/ALS, Twitter/Pagerank, Twitter/Belief Propagation, RMAT27/WCC), split into Compute + I/O and Re-sort shard]