StreamBox: Modern Stream Processing on a Multicore Machine
Hongyu Miao and Heejin Park, Purdue ECE; Myeongjae Jeon and Gennady Pekhimenko, Microsoft Research; Kathryn S. McKinley, Google; Felix Xiaozhu Lin, Purdue ECE
http://xsel.rocks/p/streambox
[Figure: an infinite data stream flowing through a pipeline (Input, Transform-0, Transform-1, Transform-2, Output) as a sequence of epochs]
Intel Xeon E7-4830 v4
[Figure: machine layout with NUMA nodes labeled up to NUMA 3 and cores labeled up to Core 13; cache labels: 35MB L3, 960KB, 3584KB]
[Figure: animation of epochs in the infinite data stream; timestamp labels range from 0:00 to 20:00]
[Figure: two views of a pipeline (Transform 0, Transform 1, Transform 2) processing epochs; timestamp labels range from 0:00 to 20:00]
[Figure: throughput (KRec/s) versus number of cores (4, 12, 32, 56) for StreamBox, Spark Streaming, and Beam; data labels: 7K, 10K, 10K, 8K]
Processing System
[Figure: animation of an infinite input stream grouped into windows by event time: 1:00 – 1:05, 1:05 – 1:10, 1:10 – 1:15]
[Figure: animation of watermarks 1:05 and 1:10 in the infinite input stream, shown with the event-time windows 1:05 – 1:10 and 1:10 – 1:15]
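The window and watermark slides can be illustrated with a small sketch. It is a generic illustration of event-time windowing, not StreamBox's implementation: the names `WindowedCounter`, `on_record`, `on_watermark`, and the helper `t` are hypothetical, the 5-unit window size is taken from the 1:00 – 1:05 style labels, and a watermark is assumed to mean that no later record carries an earlier event time.

```python
from collections import defaultdict

WINDOW = 5  # window length in time units (assumed from labels like 1:00 – 1:05)


def t(major, minor):
    """Convert a label such as 1:05 into a single integer time (hypothetical helper)."""
    return major * 60 + minor


def window_start(event_time):
    """Start of the fixed window that contains event_time."""
    return event_time - event_time % WINDOW


class WindowedCounter:
    """Counts records per event-time window; a watermark closes complete windows."""

    def __init__(self):
        self.open_windows = defaultdict(int)  # window start -> record count

    def on_record(self, event_time):
        # Records may arrive out of order; they are grouped by event time,
        # not by arrival order.
        self.open_windows[window_start(event_time)] += 1

    def on_watermark(self, watermark):
        # Assumption: after watermark w, no record with event time < w arrives,
        # so every window that ends at or before w is complete.
        closed = sorted(
            (start, count)
            for start, count in self.open_windows.items()
            if start + WINDOW <= watermark
        )
        for start, _ in closed:
            del self.open_windows[start]
        return closed


counter = WindowedCounter()
for ts in [t(1, 7), t(1, 2), t(1, 11), t(1, 4), t(1, 6)]:  # out-of-order arrivals
    counter.on_record(ts)

closed_first = counter.on_watermark(t(1, 5))    # closes window 1:00 – 1:05
closed_second = counter.on_watermark(t(1, 10))  # closes window 1:05 – 1:10
```

In this sketch the window 1:10 – 1:15 stays open after the second watermark, since a later record could still fall into it.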
[Figure: animation of records labeled with event times 0:05, 0:10, 0:11, 0:12, 0:18, 0:20, and 0:22]
End watermark 20:00
[Figure: animation with timestamp labels 15:00 and 20:00]
[Figure: Transform 0 through Transform 3 arranged between upstream and downstream, with an oldest-to-newest axis and timestamp labels 25:00, 20:00, 15:00, 10:00, 09:00, 04:00]
CM56: 256GB DRAM, 4 x 14 cores
CM12: 256GB DRAM, 2 x 6 cores
[Figure: throughput (KRec/s) versus number of cores (4, 12, 32, 56) on CM56 and CM12 for six benchmarks. Word Count, Temporal Join, Counting Distinct URLs, and Windowed Grep: series labeled (1sec) and (50ms). Network Latency Monitoring and Tweets Sentiment Analysis: series labeled (1sec) and (500ms).]
Spark: v2.1.0 Beam: v0.5.0
[Figure: throughput (KRec/s) versus number of cores (4, 12, 32, 56) for StreamBox, Spark Streaming, and Beam; data labels: 7K, 10K, 10K, 8K]
[Figure: throughput (KRec/s) versus number of cores (4, 12, 32, 56) for Netmon, Tweets, and WordCount, with a percentage scale from 0% to 40%; annotation: "Drop 7%"]
[Figure: throughput (KRec/s) at 32 and 56 cores, StreamBox versus In-order, in two charts; annotation: "Drop 87%"]
[Figure: prior work versus StreamBox on a pipeline (Transform 0, Transform 1, Transform 2) processing epochs; annotation "NO parallel" appears twice]
cluster with a few hundred CPU cores