Empya: Saving Energy in the Face of Varying Workloads
Christopher Eibel, Thao-Nguyen Do, Robert Meißner, and Tobias Distler
System Software Group, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU)
The 2018 IEEE International
Introduction & Motivation
Execution platforms vs. energy demand in data centers
Programming and execution platforms are generally not energy aware
Dynamic applications in data centers are faced with varying workloads
Resources are often statically assigned
➔ Consequence: Lots of energy is being wasted
IC2E ’18 Empya: Saving Energy in the Face of Varying Workloads Introduction & Motivation 2
Example application: key–value store
Receives and processes user requests with basic operations (e.g., get(key))
Programmers may choose between two configuration options:
➔ Static_energy: Lower performance when load is high
➔ Static_perf: Wasting energy when load is low
[Figure: power demand [W] at 70 kOps/s and maximum throughput [kOps/s], each for Static_perf and Static_energy]
Goals
Control the energy–performance tradeoff
Propose a platform that
❶ frees programmers from taking care of energy optimizations
❷ uses available techniques at hardware and software level
❸ adapts dynamically to varying workloads
General Approach
EMPYA: energy-aware middleware platform for dynamic applications
Key design principles for EMPYA
❶ Energy-efficiency awareness
Avoid high CPU utilization because of its disproportionate power-to-performance ratio
Do not necessarily select the configuration with full resource allocation
❷ Multi-level awareness
Exploit available techniques at multiple levels
Coordinate techniques: Best energy efficiency with respect to required performance
❸ Energy awareness
Integrated regulator making energy-aware reconfigurations
No additional services
EMPYA – Exploiting Techniques at Different Levels
Platform Level
➔ Vary #threads
➔ Mapping of application components to threads
OS Level
➔ Vary #(in)active cores
➔ Mapping of application threads to active cores
Hardware Level
➔ Vary upper power limits
➔ Instruct hardware to enforce them
Energy Regulator
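A configuration in this scheme combines one setting per level. The following is a minimal sketch, not EMPYA's implementation: the names `Configuration` and `assign_threads_to_cores` are hypothetical, and the round-robin placement is an assumption, since the slides do not say how threads are mapped to cores. The example values (12 threads, 6 cores, 22 W cap) are those of configuration λ in the energy-profile table.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Configuration:
    threads: int                      # platform level: number of threads
    active_cores: int                 # OS level: number of cores left active
    power_cap_watts: Optional[float]  # hardware level: None means no cap


def assign_threads_to_cores(config: Configuration) -> Dict[int, List[int]]:
    """Spread thread indices round-robin over the active cores (assumed policy)."""
    mapping: Dict[int, List[int]] = {core: [] for core in range(config.active_cores)}
    for thread in range(config.threads):
        mapping[thread % config.active_cores].append(thread)
    return mapping


# Hypothetical instance using the values of configuration λ.
lam = Configuration(threads=12, active_cores=6, power_cap_watts=22.0)
placement = assign_threads_to_cores(lam)
```

Keeping all three settings in one immutable value makes it easy to treat a reconfiguration as a switch from one such value to another.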
[Figure: actors A0, A1, A2 on Thread Pool0 and Thread Pool1 (platform level); cores C0, C1, C2, C3 marked active, active, inactive, active (OS level); power control and energy accounting (hardware level); energy regulator]
❶ Actors
Each actor maintains its own state
Communication via message passing
Actor is independent of executing thread
➔ Implementation: Akka toolkit
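The actor properties listed above can be illustrated with a toy example. This is a sketch in Python of the general idea only and does not use or mirror Akka's API (Akka runs on the JVM); the class `CounterActor` and its methods are hypothetical. The actor keeps private state, receives messages through a mailbox, and is drained by whichever pool thread happens to be free, so the pool size can change without touching the actor.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


class CounterActor:
    """Toy actor: private state, a mailbox, and no fixed thread."""

    def __init__(self, pool: ThreadPoolExecutor) -> None:
        self._pool = pool
        self._mailbox: "queue.Queue[int]" = queue.Queue()
        self._lock = threading.Lock()  # guards _scheduled
        self._scheduled = False
        self.count = 0                 # state, touched only while draining

    def tell(self, message: int) -> None:
        """Enqueue a message and make sure one drain task is scheduled."""
        self._mailbox.put(message)
        with self._lock:
            if self._scheduled:
                return
            self._scheduled = True
        self._pool.submit(self._drain)

    def _drain(self) -> None:
        """Process messages one at a time on any pool thread."""
        while True:
            try:
                message = self._mailbox.get_nowait()
            except queue.Empty:
                with self._lock:
                    if self._mailbox.empty():
                        self._scheduled = False
                        return
                continue
            self.count += message


with ThreadPoolExecutor(max_workers=4) as pool:
    actor = CounterActor(pool)
    for _ in range(100):
        actor.tell(1)
# Leaving the with-block waits for all submitted drain tasks to finish.
```

Because at most one drain task runs per actor at a time, the state needs no lock of its own, which is what makes remapping actors to a different number of threads safe.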
❷ Power limiting
Running average power limit (RAPL)
Originally developed for power limiting (e.g., temperature issues)
Enables power and energy measurements
Power capping is very effective at reducing energy demand
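As an illustration of RAPL-based measurement, the sketch below derives average power from two readings of a cumulative energy counter. It assumes the Linux powercap sysfs interface and a zone path that may differ per host; the slides do not say how EMPYA accesses RAPL, and the function names are hypothetical.

```python
from pathlib import Path

# Linux powercap zone for CPU package 0. The path is an assumption about
# the host; the slides do not state which interface EMPYA uses.
RAPL_ZONE = Path("/sys/class/powercap/intel-rapl:0")


def read_energy_uj(zone: Path = RAPL_ZONE) -> int:
    """Read the cumulative energy counter in microjoules."""
    return int((zone / "energy_uj").read_text())


def average_power_watts(start_uj: int, end_uj: int, seconds: float,
                        max_range_uj: int) -> float:
    """Average power between two readings, tolerating one counter wraparound."""
    delta = end_uj - start_uj
    if delta < 0:  # the counter wrapped around its maximum range
        delta += max_range_uj
    return delta / 1e6 / seconds
```

On Linux, the same zone directory also exposes `max_energy_range_uj` for the wraparound range and `constraint_0_power_limit_uw` for setting a power limit; writing the limit typically requires root privileges.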
EMPYA – Energy Regulator
Self-adapting system with continuous feedback loop
Monitor application performance
Emit dynamic HW/SW reconfigurations
Energy-profile database
Configuration characteristics
Workload-specific power values
Energy policies
Primary performance goal (e.g., throughput)
Secondary performance goal (e.g., latency)
[Figure: energy regulator with observer, configurator, control unit, profiles, and energy policy; it interacts with the platform, OS, and hardware levels via getValues() and reconfigure()]
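The feedback loop can be sketched as follows. The component names follow the figure labels (observer, configurator, control unit, `getValues()`, `reconfigure()`), but the class structure, the `decide` callback, and the 20 kOps/s threshold are assumptions made for illustration, not EMPYA's actual interfaces.

```python
from typing import Callable, Dict, List, Optional


class Observer:
    """Wraps a probe that reports current performance values."""

    def __init__(self, probe: Callable[[], Dict[str, float]]) -> None:
        self._probe = probe

    def get_values(self) -> Dict[str, float]:
        return self._probe()


class Configurator:
    """Records requested reconfigurations instead of touching real resources."""

    def __init__(self) -> None:
        self.applied: List[str] = []

    def reconfigure(self, config_id: str) -> None:
        self.applied.append(config_id)


class ControlUnit:
    """One loop iteration: observe, decide, and reconfigure only on change."""

    def __init__(self, observer: Observer, configurator: Configurator,
                 decide: Callable[[Dict[str, float]], str]) -> None:
        self.observer = observer
        self.configurator = configurator
        self.decide = decide
        self.current: Optional[str] = None

    def step(self) -> None:
        target = self.decide(self.observer.get_values())
        if target != self.current:
            self.configurator.reconfigure(target)
            self.current = target


# Simulated load trace in kOps/s; the threshold below is hypothetical.
loads = iter([15.0, 15.0, 200.0])
observer = Observer(lambda: {"throughput_kops": next(loads)})
configurator = Configurator()
unit = ControlUnit(
    observer, configurator,
    decide=lambda v: "omega" if v["throughput_kops"] <= 20.0 else "alpha")
for _ in range(3):
    unit.step()
```

Skipping the reconfiguration when the target is unchanged keeps a steady workload from triggering repeated switches.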
    Configuration            Performance             Power
ID  #Threads  #Cores  Cap    Throughput    Latency   usage
α   24        8       None   390.5 kOps/s  0.42 ms   51.2 W
                             70.4 kOps/s   0.37 ms   19.3 W
λ   12        6       22 W   224.8 kOps/s  0.62 ms   22.0 W
                             50.5 kOps/s   0.25 ms   15.3 W
ω   1         1       10 W   20.6 kOps/s   0.22 ms   10.0 W
                             15.1 kOps/s   0.21 ms   9.7 W
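One way to use such profiles is to pick the least power-hungry configuration that can still sustain the current load. The sketch below uses the higher-throughput row of each configuration from the profile table and assumes that row is the configuration's maximum throughput; the selection rule and all names (`Profile`, `select_config`) are hypothetical and simpler than whatever EMPYA actually does.

```python
from typing import List, NamedTuple


class Profile(NamedTuple):
    config_id: str
    max_throughput_kops: float  # assumed: higher-throughput row of the table
    power_at_max_watts: float


PROFILES: List[Profile] = [
    Profile("alpha", 390.5, 51.2),
    Profile("lambda", 224.8, 22.0),
    Profile("omega", 20.6, 10.0),
]


def select_config(load_kops: float,
                  profiles: List[Profile] = PROFILES) -> Profile:
    """Cheapest profile that sustains the load, else the fastest one."""
    feasible = [p for p in profiles if p.max_throughput_kops >= load_kops]
    if not feasible:
        return max(profiles, key=lambda p: p.max_throughput_kops)
    return min(feasible, key=lambda p: p.power_at_max_watts)
```

For example, `select_config(100.0)` picks the λ profile, because ω cannot reach 100 kOps/s and α draws more power at its maximum.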
energy policy {
  application = key-value-store;
  throughput_min_ops_per_sec = 10k;
  throughput_priority = pri;
  latency_max_msec = 0.5;
  latency_priority = sec;
}
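To show how such a policy could be consumed, here is a small parser sketch. The grammar is inferred from this single example and is an assumption: entries are `key = value;` pairs inside braces, and a trailing `k` on a number is taken to mean a factor of 1000. The function names are hypothetical.

```python
import re
from typing import Dict, Union

POLICY_TEXT = """
energy policy {
  application = key-value-store;
  throughput_min_ops_per_sec = 10k;
  throughput_priority = pri;
  latency_max_msec = 0.5;
  latency_priority = sec;
}
"""

Value = Union[str, float]


def _convert(value: str) -> Value:
    """Turn '10k' into 10000.0 and '0.5' into 0.5; leave other text as is."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(k?)", value)
    if not match:
        return value
    number = float(match.group(1))
    return number * 1000 if match.group(2) else number


def parse_policy(text: str) -> Dict[str, Value]:
    """Parse the 'key = value;' entries between the outer braces."""
    body = text[text.index("{") + 1:text.rindex("}")]
    policy: Dict[str, Value] = {}
    for entry in body.split(";"):
        if "=" not in entry:
            continue
        key, value = (part.strip() for part in entry.split("=", 1))
        policy[key] = _convert(value)
    return policy


policy = parse_policy(POLICY_TEXT)
```

A regulator could then treat the entry marked `pri` as the goal it must meet first and the entry marked `sec` as a tie-breaker among the remaining configurations.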
Evaluation – Evaluation Setup
Hardware
Client and server machines, switched 1 Gbps Ethernet
Intel Xeon E3-1245 v3 & Xeon E3-1275 v5 processors
8 cores with Hyper-Threading enabled, 3.40 GHz
SpeedStep and Turbo Boost enabled
Application classes
Use case A: Key–value store with mixed operations (get, set, exists)
Use case B: MapReduce running single, different jobs
[Figure: client machine and Xeon server; key–value store with entries Key1/Values1, Key2/Values2, ..., Keyn/Valuesn; MapReduce stages Data, Map, Shuffle, Reduce, Output]
Evaluation – Use Case A: Key–Value Store
Static_perf vs. EMPYA
Throughput as primary performance goal
[Figure: throughput [kOps/s] and power [W] over time [s] for Static_perf and EMPYA]
Evaluation – Use Case A: Key–Value Store
Static_perf vs. EMPYA_latency
Throughput as primary and latency as secondary performance goal
[Figure: latency [ms] and power [W] over time [s] for Static_perf, EMPYA_default, and EMPYA_latency]
Evaluation – Use Case B: MapReduce
Static_energy/Static_perf vs. EMPYA
Performance goal: Specifying maximum execution-time penalties
Energy [J]
Job      Static_perf  Static_energy  Empya_5%  Empya_15%  Empya_30%
sort     1507.7       517.5          752.9     594.2      529.6
kmeans   2982.5       1456.6         2124.9    1827.7     1815.1
wc       2402.4       887.1          1537.8    1346.1     889.7
topn     2527.7       843.9          1271.6    1244.3     947.1
mean     1967.3       724.8          1328.0    923.4      817.9
join     643.0        242.4          423.3     397.9      367.5
overlap  1415.8       432.8          511.4     481.3      443.2

Runtime [s]
Job      Static_perf  Static_energy  Empya_5%  Empya_15%  Empya_30%
sort     64.9         84.3           66.2      72.9       83.5
kmeans   102.1        196.0          109.8     117.5      123.7
wc       62.2         119.0          63.3      64.3       77.2
topn     67.6         113.6          70.6      74.8       92.8
mean     60.1         98.1           61.5      68.4       75.1
join     15.3         32.5           15.9      16.0       20.4
overlap  41.9         59.1           44.2      45.2       51.7
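To read these numbers as a tradeoff, the worked example below relates each variant to Static_perf for the first set of five energy and runtime values above (1507.7 J and 64.9 s for Static_perf). The helper name `relative_to_baseline` is hypothetical; the values are copied from the slide.

```python
from typing import Dict, Tuple

# (energy in J, runtime in s) for the first job, copied from the slide.
FIRST_JOB: Dict[str, Tuple[float, float]] = {
    "Static_perf": (1507.7, 64.9),
    "Static_energy": (517.5, 84.3),
    "Empya_5%": (752.9, 66.2),
    "Empya_15%": (594.2, 72.9),
    "Empya_30%": (529.6, 83.5),
}


def relative_to_baseline(results: Dict[str, Tuple[float, float]],
                         baseline: str = "Static_perf"
                         ) -> Dict[str, Tuple[float, float]]:
    """Return (energy saving, runtime penalty) as fractions of the baseline."""
    base_energy, base_runtime = results[baseline]
    return {
        name: (1 - energy / base_energy, runtime / base_runtime - 1)
        for name, (energy, runtime) in results.items()
    }


tradeoff = relative_to_baseline(FIRST_JOB)
```

For this job, the 5% variant saves roughly half of the energy while running about 2% longer, and the 15% and 30% variants stay below their respective execution-time penalties.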
Conclusion
EMPYA
Self-adaptive middleware platform enforcing HW and SW reconfigurations
Exploiting actors and operating-system functionality
Power capping as an effective power- and energy-reduction measure
Key–value store: Up to 34 % less power demand
MapReduce: Energy savings of 22–64 %
Future and ongoing work
Making decisions in a distributed manner for multiple machines
Carefully increasing the configuration space → heterogeneity
- C. Eibel, C. Gulden, W. Schröder-Preikschat, and T. Distler
Strome: Energy-Aware Data-Stream Processing
In Proceedings of the 18th IFIP International Conference on Distributed Applications and Interoperable Systems (DAIS 2018), 2018.