dist-gem5: Distributed Simulation of Compute Clusters
Mohammad Alian, Umur Darbaz, Gabor Dozsa, Stephan Diestelhorst, Daehoon Kim, Nam Sung Kim
University of Illinois Urbana-Champaign; ARM Ltd., Cambridge, UK
Outline: motivation, what is gem5, dist-gem5 architecture, evaluation, conclusion
[Figure: overview of gem5 components.
Category labels: Core Integrated IP, ARM ISA Support, CPU Models, IO components, Simulation support, Memory, Interconnect, GPU models.
CPU models: Atomic, Timing, Out of Order, In Order.
Component labels: ARMv7a, ARMv8, GICv2, L1-L3 $, SCU, ArchTimer, PMU, UART, UHDLCD, 10Gb NIC, NVMe, DMA, KVMv7, KVMv8, Traffic Gen, Traffic Monitor, Flash, DRAM, HMC, Crossbar, Snoop filter, Bridges, Stream Line, Sim Points, Power Model, Int., FracFact, PCA, UFS, Timers, RTC, NoMali.]
[Figure: what gem5 models. Labels: cores, caches, memory, network devices, OS, ISAs; performance, power; scale.]
[Figure: dist-gem5 overview. Each physical machine (host #1 to host #4) runs a gem5 process. Labels: host #1 with simulated system #1, host #2 with simulated system #2, host #3 with simulated system #3, host #4, simulated network switch.]
[Figure: physical cluster and the simulation running on it. Physical hosts #1 to #3 connect through phys NIC#1 to phys NIC#3 to phys port1 to phys port3 of a physical switch. gem5 #1 and gem5 #2 run simulated systems #1 and #2, each with a sim NIC; gem5 #3 runs the simulated switch with sim port0 and sim port1.]
[Figure: packet forwarding between gem5 processes. Labels: sim pkt, TCP; simulated packets travel between processes over TCP.]
[Figure: inside a gem5 process on a physical host. Labels: simulation thread, eventQ, send pkt, recv pkt, receiver thread, phys NIC.]
[Figure: packet timing without synchronization. Timelines of gem5#0 and gem5#1 against wall clock time. Labels: send time, simulated network delay, expected delivery time, recv time, late packet arrival.]
[Figure: packet timing with synchronization. Timelines of gem5#0 and gem5#1 against wall clock time, with a global sync at every quantum. Labels: send time, simulated network delay, expected delivery time, packet arrival, quantum, global sync.]
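The relationship between the quantum and the simulated network delay in these two figures can be illustrated with a small model. This is a minimal sketch, not dist-gem5 code: the function name `late_packets`, the integer tick units, and the worst-case assumption (the receiver has already run ahead to the next global sync barrier when the packet physically arrives) are all assumptions made for illustration.

```python
def late_packets(send_ticks, link_delay, quantum):
    """Count packets that could arrive late under quantum-based global sync.

    Hypothetical worst-case model: when a packet physically arrives, the
    receiving simulator has already advanced to the end of the quantum in
    which the packet was sent (it cannot pass that barrier before the sync).
    """
    late = 0
    for send in send_ticks:
        # Simulated time at which the packet must be delivered.
        expected_delivery = send + link_delay
        # End of the quantum that contains the send tick (next global sync).
        barrier = (send // quantum + 1) * quantum
        # Late arrival: the receiver's clock is already past the delivery time.
        if barrier > expected_delivery:
            late += 1
    return late


if __name__ == "__main__":
    sends = list(range(0, 100, 7))
    for q in (5, 10, 40):
        print(f"quantum={q:3d} link_delay=10 late={late_packets(sends, 10, q)}")
```

In this model a quantum no larger than the simulated network delay never yields a late packet, because the barrier is at most one quantum after the send time. A larger quantum lets the receiver run past the expected delivery time.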
[Figure: simulated cluster topology. Servers #0 to #63 are grouped eight per rack: servers #0 to #7 under top of rack switch #0, servers #8 to #15 under top of rack switch #1, and so on up to servers #56 to #63 under top of rack switch #7. The top of rack switches connect to one aggregate switch.]
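The topology in the figure can be written down as a small data structure. This is a minimal sketch for illustration only: the names `rack_of`, `build_topology`, and `switch_hops`, and the string labels such as `tor0` and `server56`, are hypothetical and are not gem5 configuration identifiers.

```python
SERVERS_PER_RACK = 8  # servers under each top of rack switch (from the figure)
NUM_RACKS = 8         # top of rack switches #0 to #7 (from the figure)


def rack_of(server_id):
    """Index of the top of rack switch that a server hangs off."""
    return server_id // SERVERS_PER_RACK


def build_topology():
    """Map each switch name to the list of nodes attached below it."""
    topo = {"aggregate": [f"tor{r}" for r in range(NUM_RACKS)]}
    for r in range(NUM_RACKS):
        first = r * SERVERS_PER_RACK
        topo[f"tor{r}"] = [f"server{s}" for s in range(first, first + SERVERS_PER_RACK)]
    return topo


def switch_hops(src, dst):
    """Switches traversed between two servers: 1 within a rack, 3 across racks."""
    return 1 if rack_of(src) == rack_of(dst) else 3


if __name__ == "__main__":
    topo = build_topology()
    print(topo["tor7"])
    print(switch_hops(0, 7), switch_hops(0, 8))
```

Traffic between racks crosses the source top of rack switch, the aggregate switch, and the destination top of rack switch, which is why `switch_hops` returns 3 in that case.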
[Figure: switch hierarchy simulated on a physical host. Top of rack switches #0, #1, ... #7 and the aggregate switch, with port labels p0, p1, p7, p8. Legend: gem5, simulated etherLink, simulated port, distEtherLink, simulated etherSwitch.]
[Figure: etherSwitch internals. Labels: MAC Table, In-orderQ#0 to In-orderQ#n, IPORT#0 to IPORT#n, OPORT#0 to OPORT#n.]
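The labels in the switch figure (MAC table, input ports, output ports, in-order queues) match the usual structure of a learning Ethernet switch. The sketch below shows standard MAC learning and forwarding under that assumption. It is not gem5's etherSwitch implementation: the class `LearningSwitch`, its method `receive`, and the use of one FIFO per output port to stand in for the in-order queues are all hypothetical.

```python
from collections import deque

BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Toy learning switch: a MAC table plus one FIFO queue per output port."""

    def __init__(self, num_ports):
        self.mac_table = {}  # source MAC address -> port it was last seen on
        self.out_queues = [deque() for _ in range(num_ports)]

    def receive(self, in_port, src_mac, dst_mac, payload):
        """Learn the sender's port, then enqueue the frame on the output port(s)."""
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # Unknown destination or broadcast: flood to every other port.
            targets = [p for p in range(len(self.out_queues)) if p != in_port]
        elif out_port == in_port:
            # Destination is on the ingress port: nothing to forward.
            targets = []
        else:
            targets = [out_port]
        for p in targets:
            # FIFO append keeps frames in arrival order on each output port.
            self.out_queues[p].append((src_mac, dst_mac, payload))
        return targets


if __name__ == "__main__":
    sw = LearningSwitch(3)
    print(sw.receive(0, "a", "b", "x"))  # destination unknown: flooded
    print(sw.receive(1, "b", "a", "y"))  # "a" was learned on port 0
    print(sw.receive(0, "a", "b", "z"))  # "b" was learned on port 1
```

The first frame is flooded because the table is empty; later frames go to a single port once both addresses have been learned.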
[Figure: mapping an eight-node simulation (systems #0 to #6 plus a switch) onto quad core physical hosts. In one configuration each simulated node has its own gem5 process (gem5#0 to gem5#6 for systems #0 to #6, gem5#7 for the switch), with four gem5 processes per quad core physical host. In the other configuration a single gem5 process (gem5#0) on one quad core physical host simulates systems #0 to #6 and the switch.]
category    gem5 configuration
O3 core     4 cores; 4 way superscalar
memory      8GB DDR3 1600 MHz
network     Intel GbE NIC; 1 μs link latency
OS          Linux Ubuntu 14.04 (Kernel 4.3)
[Figure: validation against a physical cluster, two plots. First plot: latency (ms) versus bandwidth (Gbps), series dist-gem5 and phys. Second plot: memcached latency (ms) versus distribution percentile (1 to 95), series dist-gem5 and phys.]
Speedup (normalized to single-threaded-gem5) versus number of simulated nodes:

simulated nodes    3      7      15     31     63
dist-gem5          2.7    6.3    21.8   36.0   83.1
parallel-gem5      2.7    3.7    6.6    6.0    6.5
[Figure: normalized simulation time (log scale, 1.0 to 100.0) versus number of simulated nodes, series dist-gem5, parallel-gem5, and single-threaded-gem5.]
[Figure: normalized simulation time and number of requests (K Req) versus synchronization quantum size (0.5 to 128 μs), series Simulation Time and Req#.]