Modeling Communication Costs in Blade Servers
Qiuyun Wang, Benjamin Lee (Duke University)


SLIDE 1

Duke Computer Architecture

Modeling Communication Costs in Blade Servers

Qiuyun Wang, Benjamin Lee Duke University October 4th, 2015

SLIDE 2

Case for Blade Servers

An era of big data


SLIDE 3

Case for Blade Servers

An era of big data needs big memory.


  • Machines with large memory (e.g., HP Moonshot server cartridge)
  • Distributed memory systems

SLIDE 4


Figure 1: Two blade server nodes connected through Ethernet [1,2]

Case for Blade Servers


[1] K. Lim, J. Chang, T. Mudge, P. Ranganathan. Disaggregated memory for expansion and sharing in blade servers.
[2] R. Hou, T. Jiang, L. Zhang, P. Qi, J. Dong. Cost effective data center servers.

SLIDE 5

Figure 2: A server node design (blades 0-3) with inter-processor links (e.g., HyperTransport) and inter-blade links (e.g., PCIe).

Figure 1: Two blade server nodes connected through Ethernet [1,2]

Case for Blade Servers


Blade servers provide compute and memory capacity in a dense form factor.

SLIDE 6


Applications: in-memory computational frameworks

  • Big-data analytics frameworks: e.g., Spark
  • Graph workloads: e.g., GraphLab, Spark GraphX
  • In-memory databases: e.g., MonetDB

Challenges: hardware-software co-design costs both engineering effort and time. A fast and cost-effective way to understand the system is through technology models.

Case for Blade Servers

SLIDE 7

Motivation for Technology Models

We identify and derive key technology parameters and analyze their effects on system performance, throughput, and energy. These models can help to

  • choose hardware technologies and configurations
  • understand performance and energy impacts
  • close the loop for hardware and software co-design


SLIDE 8

  • 1. Derive technology models
  • 2. Characterize non-uniform memory access
  • 3. Develop NUMA-aware schedulers


Agenda

SLIDE 9

Figure 2: A blade server node design (blades 0-3) with inter-processor links (e.g., HyperTransport) and inter-blade links (e.g., PCIe).

Communication Technologies

  • Memory: DDR3
  • Inter-processor: HyperTransport, Intel QuickPath
  • Inter-blade: PCIe 3.0, InfiniBand
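As a rough illustration of how such per-link parameters turn into delay estimates, a first-order model adds fixed per-hop latency and payload serialization time. All numbers below are illustrative placeholders, not the surveyed values of Figure 3:

```python
def transfer_delay_ns(payload_bytes, hops, per_hop_latency_ns, bandwidth_gbps):
    """First-order delay: fixed latency per hop plus time to serialize
    the payload onto the link (payload_bytes * 8 bits at bandwidth_gbps)."""
    serialization_ns = payload_bytes * 8 / bandwidth_gbps  # bits / (Gb/s) = ns
    return hops * per_hop_latency_ns + serialization_ns

# Illustrative (not surveyed) numbers for moving a 64 B cache line:
local = transfer_delay_ns(64, hops=0, per_hop_latency_ns=0,   bandwidth_gbps=12.8)  # DDR3 channel
ip1   = transfer_delay_ns(64, hops=1, per_hop_latency_ns=40,  bandwidth_gbps=25.6)  # HyperTransport
ib    = transfer_delay_ns(64, hops=1, per_hop_latency_ns=500, bandwidth_gbps=8.0)   # PCIe-attached blade
```

Even this crude model reproduces the ordering that motivates the rest of the deck: local DRAM access is fastest, then inter-processor, then inter-blade.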
SLIDE 10


Figure 3: Derived and surveyed technology and architectural parameters

Delay and Energy Estimates

Key Estimates

SLIDE 11

  • Explore system organizations for blade servers
  • Analyze communication delay and energy
  • Address challenges in system management
  • e.g.: non-uniform memory access (NUMA)


With these Estimates

SLIDE 12

  • 1. Derive technology models
  • 2. Characterize non-uniform memory access
  • 3. Develop NUMA-aware schedulers


Agenda

SLIDE 13


  • Processors access different memory regions with different latencies: non-uniform memory access (NUMA)
  • NUMA degrades application performance
  • Multiple communication paths introduce multiple levels of NUMA

Figure 4: Single-thread performance (CPI, normalized to local access) degradation under NUMA for inter-processor and inter-blade access, for workloads including wordcount, PageRank, and logistic regression.

NUMA Effects
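The gap between compute- and memory-intensive workloads in Figure 4 can be reasoned about with a first-order CPI model; the parameters below are invented for illustration, not measured values:

```python
def remote_cpi(local_cpi, accesses_per_instr, miss_rate, extra_latency_cycles):
    """First-order NUMA model: every off-chip access that crosses a remote
    link stalls the instruction stream for extra_latency_cycles more cycles."""
    return local_cpi + accesses_per_instr * miss_rate * extra_latency_cycles

# The same 100-cycle link penalty hurts a memory-intensive workload far more:
compute_bound = remote_cpi(1.0, accesses_per_instr=0.3, miss_rate=0.005,
                           extra_latency_cycles=100)  # few off-chip accesses
memory_bound  = remote_cpi(1.0, accesses_per_instr=0.3, miss_rate=0.02,
                           extra_latency_cycles=100)  # many off-chip accesses
```

The model suggests why CPI degradation varies widely across benchmarks: it scales with how often a workload actually crosses the remote link.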

SLIDE 14


Figure 5: NUMA-aware scheduling algorithms [3]

[3] M. Zaharia et al. Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling.

NUMA-aware Scheduling Policies

SLIDE 15


  • Local execution
  • IP-1: inter-processor 1-hop execution
  • IP-2: inter-processor 2-hop execution
  • IB: inter-blade execution

Applications’ NUMA effects vary; throughput and latency goals differ. Choose the optimal policy accordingly.

NUMA-aware Scheduling Policies
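The guidelines above can be sketched as a simple placement selector; the utilization threshold and the exact rules are assumptions for illustration, not the deck's precise policy:

```python
def choose_placement(memory_intensive, utilization):
    """Pick an execution placement for a task.
    Compute-intensive tasks tolerate NUMA, so inter-blade (IB) placement
    keeps them out of the queue; memory-intensive tasks stay local unless
    the server is heavily loaded, when a 1-hop compromise is preferable."""
    if not memory_intensive:
        return "IB"      # CPI barely degrades; permit NUMA freely
    if utilization > 0.8:
        return "IP-1"    # selectively permit NUMA under high load
    return "Local"       # otherwise wait for a local slot
```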

SLIDE 16


Methods - NUMA Simulation

MARSSx86 + DRAMSim simulates CPUs with local DRAM connected by interconnects; additional latency is added for each type of communication path.

Characterize application sensitivity to NUMA over each type of communication technology
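A minimal sketch of that latency-injection idea (the cycle penalties are placeholders, not the actual MARSSx86 + DRAMSim hook):

```python
# Extra round-trip cycles injected on top of the simulated DRAM latency
# for each communication path -- placeholder values, not derived parameters.
EXTRA_LATENCY_CYCLES = {"Local": 0, "IP-1": 60, "IP-2": 120, "IB": 600}

def memory_access_cycles(dram_cycles, path):
    """Model a NUMA access as baseline DRAM latency plus the interconnect
    penalty for the chosen communication path."""
    return dram_cycles + EXTRA_LATENCY_CYCLES[path]
```

Sweeping the penalty table then characterizes each application's sensitivity to NUMA over each communication technology.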

SLIDE 17


Figure 7: Fraction of local vs. remote accesses per benchmark, assuming the heap is remote (x-axis: benchmark ID 1-20; y-axis: fraction of accesses).

Benchmarks:

  • 1-7: Apache Spark
  • 8-11: Phoenix MapReduce
  • 12-20: PARSEC 2.0

Methods - Remote vs Local

SLIDE 18


Figure 8: Queueing simulation parameters

  One blade server node:
  Cores per socket: 16
  Sockets per blade: 4
  Blades per node: 4
  Task size: 100M instructions
  Inter-arrival time: exponential distribution, λ = 6000 tasks/s
  Service time per core: # instructions / IPC / core frequency

Service time per core changes with NUMA effects. Inter-arrival time is varied to change system utilization.

Model task queues and analyze queueing dynamics.

Methods - Queueing Simulation
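The setup in Figure 8 can be sketched as a toy multi-core queueing simulation with Poisson arrivals and a deterministic service time of instructions / (IPC × frequency); the core count, IPC, and frequency below are assumed values, not the exact configuration:

```python
import random

def simulate_queue(n_tasks, arrival_rate, instructions, ipc, freq_hz, n_cores, seed=1):
    """Toy queueing simulation: exponential inter-arrival times feed a pool
    of cores; each task's service time is instructions / (IPC * frequency)."""
    rng = random.Random(seed)
    service_s = instructions / (ipc * freq_hz)
    free_at = [0.0] * n_cores          # time at which each core becomes free
    now, latencies = 0.0, []
    for _ in range(n_tasks):
        now += rng.expovariate(arrival_rate)            # next arrival time
        core = min(range(n_cores), key=lambda c: free_at[c])
        start = max(now, free_at[core])                 # queue if all cores busy
        free_at[core] = start + service_s
        latencies.append(free_at[core] - now)           # waiting + service
    return latencies

# NUMA lowers effective IPC, raising service time and lengthening the tail:
lat = simulate_queue(n_tasks=1000, arrival_rate=6000, instructions=100e6,
                     ipc=2.0, freq_hz=3e9, n_cores=256)
```

Sweeping `arrival_rate` varies utilization, and scaling `ipc` down for remote placements mimics the NUMA-dependent service times described above.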

SLIDE 19


Results — Throughput

  • Increase the system load to test the maximum sustained throughput.
  • Avoiding NUMA always increases throughput.
  • Compute-intensive: 7, 9-11, 13-20
  • Memory-intensive: 1-6, 8, 12

Figure: Maximum sustained throughput per benchmark (1-20) for Local, IP-1, and IP-2 placements, normalized to IB (roughly 0.9-1.6x).

SLIDE 20


Results — Latency/QoS

Figure: 95th-percentile response time at high utilization for Local, IP-1, and IP-2 placements, shown as speed-up relative to IB (roughly 0.8-2.2x) across benchmarks 1-20.

  • Permitting NUMA can improve the quality of service.
  • CI tasks should choose IB to permit NUMA.
  • MI tasks should choose IP-1 and IP-2 to selectively permit NUMA in highly loaded servers.

  • Compute-intensive: 7, 9-11, 13-20
  • Memory-intensive: 1-6, 8, 12
SLIDE 21


Results — Communication Energy

Figure: Data migration energy for inter-processor 1-hop, inter-processor 2-hop, and inter-blade links across benchmarks 1-20, normalized to remote access (roughly 1-8x).

  • Compute-intensive: 7, 9-11, 13-20
  • Memory-intensive: 1-6, 8, 12
  • 18-20 are out of scope
  • If data is near, remote access is more beneficial (3-4x) for saving energy.
  • If data is far, remote access is less beneficial because of high-cost links.
  • Energy benefits depend on page reuse rate and communication channels.

SLIDE 22


Results — Communication Channels

Figure 9: Link utilization percentages for application 1 under Local, IP-1, IP-2, and IB placements, broken down into local DRAM, inter-processor 1-hop, inter-processor 2-hop, and inter-blade links.

  • Use link utilization percentages to estimate average communication power.
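A sketch of that estimate, weighting each link's peak power by its utilization fraction; the peak-power numbers are placeholders, not the surveyed parameters:

```python
# Peak power per link type in watts -- illustrative placeholders.
PEAK_POWER_W = {"DRAM": 5.0, "IP-1": 10.0, "IP-2": 10.0, "IB": 15.0}

def average_power_w(link_utilization):
    """Estimate average communication power as the utilization-weighted
    sum of each active link's peak power (in the spirit of Figure 9)."""
    return sum(PEAK_POWER_W[link] * u for link, u in link_utilization.items())

# A local run touches only DRAM; an inter-blade run also loads the IB link:
local_w = average_power_w({"DRAM": 0.8})
ib_w    = average_power_w({"DRAM": 0.3, "IB": 0.25})
```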

SLIDE 23


Results — Communication Power

  • Compute-intensive: 7, 9-11, 13-20
  • Memory-intensive: 1-6, 8, 12
  • 12 is out-of-scope

Figure: Communication power (W) per benchmark (1-20) for Local, IP-1, IP-2, and IB placements.

  • HyperTransport and PCIe dissipate around 40 W and 60 W, respectively, at peak utilization.
  • Benchmarks S1-S6 suggest that these Spark workloads use about 25% of the link bandwidth.

SLIDE 24


  • Model blade servers for emerging big-data applications.
  • Study NUMA-aware schedulers and their effects on throughput, latency, and power.

  • Provide guidelines for choosing an optimal policy.

Future directions:

  • Extend validation to real system measurements.

Conclusions and Future Directions

SLIDE 25

Modeling Communication Costs in Blade Servers

Qiuyun Wang, Benjamin Lee Duke University October 4th, 2015