Evaluation of the HPC Challenge Benchmarks in Virtualized Environments
Vince Weaver, ICL Lunch Talk, 8 July 2011
VHPC'11 Paper
6th Workshop on Virtualization in High-Performance Cloud Computing
Piotr Luszczek, Eric Meek, Shirley Moore, Dan Terpstra, Vince Weaver, Jack Dongarra
Traditional HPC
Cloud-based HPC
Cloud Tradeoffs
Pros
- No AC bill
- No electricity bill
- No need to spend $$$ on infrastructure
Cons
- Unexpected outages
- Data held hostage
- Infrastructure not designed for HPC
Measuring Performance in the Cloud
First, let's just measure runtime. This is difficult because in virtualized environments, time loses all meaning.
Simplified Model of Time Measurement
[Diagram: the Application reads time through the Operating System, which reads it from the Hardware]
Then the VM gets involved
[Diagram: the VM layer now sits between the Operating System and the Hardware on the path to the time source]
Then you have multiple VMs
[Diagram: multiple guests (OS1, OS2) share the VM layer and hardware; which time each application actually sees is unclear]
So What Can We Do?
Hope we have exclusive access and measure wall-clock time.
Measuring Time Externally
- Ideally, have local hardware access, root, and hooks into the VM system
- Otherwise, you can sit there with a watch
- Danciu et al. send a UDP packet to a remote server
- Most of these are not possible in a true “cloud” setup
Measuring Time From Within Guest
- Use gettimeofday() or clock_gettime()
- This might be the only interface we have
- How bad can it be?
Our Experimental Setup
- 8-core Nehalem-class machine (dual 4-core 2.93 GHz Xeon X5570)
- VMware Player 3.1.4, VirtualBox 4.0.8, KVM 2.6.35
- HPC Challenge Benchmarks, Open MPI
- Time measured by gettimeofday(), invoked via MPI_Wtime()
Accuracy Drift
- Typical development model is to re-run the app over and over with slight changes while monitoring performance
- In a virtualized environment, factors inherent in the virtualization can change the runtime from run to run more than any optimization tuning does
Ascending vs Descending – HPL
Bare metal showed no difference
[Plot: percentage difference vs. matrix size (2000–18000) for VMware Player, VirtualBox, and KVM]
Performance Results
We use a relative metric, defined as:

(performance_VM / performance_bare_metal) × 100%
HPL – Low OS/Communication Overhead
[Plot: % of bare-metal performance vs. problem size (5000–15000) for VMware Player, VirtualBox, and KVM]
MPIRandomAccess – High OS/Communication Overhead
[Plot: % of bare-metal performance vs. log of problem size (18–28) for VMware Player, VirtualBox, and KVM]
Conclusion
- Virtualization exacerbates the existing problem of accurate performance measurement
- Different workloads can stress the VM layer in drastically different ways
- Extra care needs to be taken to generate repeatable results
Future Work
- Validate internal time measurements with external ones
- More analysis of sources of VM overhead
- Performance of larger systems with off-node network activity
Future Work – PAPI-V
- “Improved” timer support. Direct wall-clock access?
- Virtualized performance counters
- Components for the virtualized hardware: network interfaces, etc.
Questions?
vweaver1@eecs.utk.edu