Virtualization to Improve Compute Node Performance, Brian Kocoloski



SLIDE 1

Department of Computer Science

Better than Native: Using Virtualization to Improve Compute Node Performance

Brian Kocoloski, Jack Lange
Department of Computer Science, University of Pittsburgh

6/29/2012

SLIDE 2

Linux is becoming the dominant supercomputing OS …

Source: http://en.wikipedia.org/wiki/File:Operating_systems_used_on_top_500_supercomputers.svg

SLIDE 3

… but some applications need less overhead

  • Lightweight Kernels (LWKs) provide low-overhead access to hardware
  • Q: How do we provide LWKs to applications that need them, but not to those that don’t?
  • A: Virtualization
  • Applications running in a virtual environment can outperform the same applications running natively

SLIDE 4

Drawbacks of Linux

  • Memory Management
    – Biggest problem
    – Widely recognized as a source of overhead
  • OS Noise
    – HPC apps are tightly synchronized
    – Timing is a big deal
  • Non-technical

SLIDE 5

Disadvantages of Current Schemes

  • ZeptoOS
    – “Big Memory”
    – Memory is statically sized, allocated at boot time
    – Compatibility
  • Cray’s CNL
    – HugeTLBfs
    – Maximum of 2 MB-sized memory regions available

SLIDE 6

Our Approach

[Figure: decision flow. Labels: HPC Application; Needs LWK? (Yes / No); VMM layer; Lightweight Kernel; Linux Compute Node OS; Hardware; Application Completes]

SLIDE 7

Palacios

  • OS-independent embeddable virtual machine monitor
  • Strip resources away from host OS
  • Low-noise, low-overhead memory management
  • Developed at Northwestern University, University of New Mexico, and University of Pittsburgh
  • Open source and freely available

http://www.v3vee.org/palacios

SLIDE 8

Kitten

  • Lightweight Kernel from Sandia National Labs
  • Moves resource management as close to application as possible
  • Mostly Linux-compatible user environment
  • Modern code-base with Linux-like organization
  • Open source and freely available

http://software.sandia.gov/trac/kitten

SLIDES 9-11

System Architecture

[Figure: system architecture diagram; image not preserved]

SLIDE 12

Palacios’ Approach

[Figure labels: Linux Management (three times); Offlined; Palacios Management]

  • Memory Management
    – Bypass the Linux memory management strategies completely, at run time
  • OS Noise
    – Control when the Linux scheduler is able to run
    – Take advantage of tickless host kernel

SLIDE 13

Evaluation

  • Two-part evaluation:
    1. Microbenchmarks: Stream, Selfish Detour
    2. Miniapplications: HPCCG, pHPCCG
  • Evaluation is preliminary
    1. Currently limited to a single node running a commodity Fedora 15 kernel
    2. Environments are not fully optimized

SLIDE 14

Environment

  • Two 6-core processors and 16 GB memory
    – NUMA design
  • Kitten VM was configured with 1 GB of memory
  • Stream and HPCCG used OpenMP for shared memory and were each run 10 times

SLIDE 15

Stream

  • Palacios provides ~400 MB/s (4.74%) better memory performance than Linux on average
  • 0.34 GB/s lower standard deviation on average
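The two numbers on the Stream slide imply a rough value for the native Linux baseline. The following is a minimal sketch of that arithmetic, assuming the 4.74% is relative to the Linux average and treating the rounded "~400 MB/s" as exactly 400. The variable names are illustrative, and the derived bandwidths are not stated in the slides.

```python
# Back-of-the-envelope check of the Stream slide.
# Assumptions (not stated in the slides): the 4.74% improvement is relative
# to the native Linux average, and "~400 MB/s" is taken as exactly 400.

improvement_mb_s = 400.0        # absolute gain reported for Palacios
improvement_fraction = 0.0474   # relative gain reported (4.74%)

# gain = fraction * baseline  =>  baseline = gain / fraction
linux_mb_s = improvement_mb_s / improvement_fraction
palacios_mb_s = linux_mb_s + improvement_mb_s

print(f"implied Linux average:    {linux_mb_s / 1000:.2f} GB/s")
print(f"implied Palacios average: {palacios_mb_s / 1000:.2f} GB/s")
```

Under these assumptions the Linux average comes out at roughly 8.4 GB/s. Because the absolute gain is rounded on the slide, this is an estimate only.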

SLIDE 16

Selfish Detour

[Figure: Selfish Detour results; panels: Linux, Kitten]

SLIDE 17

Selfish Detour

[Figure: Selfish Detour results; panels: Linux, Virtualized Kitten]
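The slides do not describe how Selfish Detour measures OS noise. As the benchmark is commonly described, it spins in a tight loop reading a timer and records every iteration that takes much longer than usual, since the extra time was taken by something other than the application. The sketch below illustrates that idea only. The function name, the threshold, and the use of Python's `time.perf_counter_ns` are assumptions made for illustration; the real benchmark is a native program, and the Python interpreter adds noise of its own.

```python
import time


def selfish_detour(duration_s=0.05, threshold_ns=10_000):
    """Spin reading a clock and record unusually long gaps ("detours").

    Illustrative sketch only, not the actual Selfish Detour code.
    Returns a list of (offset_ns, gap_ns) tuples, where offset_ns is the
    time since the start of the run and gap_ns is the length of the gap.
    """
    detours = []
    start = prev = time.perf_counter_ns()
    end = start + int(duration_s * 1e9)
    while prev < end:
        now = time.perf_counter_ns()
        gap = now - prev
        # A gap well above the normal loop time means the loop was
        # interrupted (scheduler tick, interrupt, other process, ...).
        if gap > threshold_ns:
            detours.append((prev - start, gap))
        prev = now
    return detours


if __name__ == "__main__":
    events = selfish_detour()
    print(f"{len(events)} detours longer than 10 us in 50 ms")
    for offset_ns, gap_ns in events[:5]:
        print(f"  at {offset_ns / 1e6:8.3f} ms: {gap_ns / 1e3:.1f} us")
```

A quiet system produces few and short detours; a noisy one produces many. Plotting detour length against offset gives a scatter plot of noise events over time.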

SLIDE 18

HPCCG

  • Performs the conjugate gradient method to solve a system of linear equations represented by a sparse matrix
  • Workload similar to that of many HPC applications
  • Separate experiments to represent both CPU and memory intensive workloads
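The kind of computation HPCCG performs can be sketched as a small conjugate gradient solver over a sparse matrix. This is a textbook version in pure Python, not HPCCG's code. The compressed sparse row (CSR) layout, the function names, and the small tridiagonal test matrix are assumptions chosen for illustration, and the test matrix is a stand-in rather than the problem HPCCG generates.

```python
import math


def csr_matvec(indptr, indices, data, x):
    """Return y = A @ x for a matrix A stored in CSR form."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(indptr, indices, data, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite A in CSR form.

    Returns (x, iterations). Starts from x = 0.
    """
    x = [0.0] * len(b)
    r = list(b)              # residual r = b - A x, with x = 0
    p = list(r)              # first search direction
    rs_old = dot(r, r)
    if math.sqrt(rs_old) < tol:
        return x, 0
    for it in range(1, max_iter + 1):
        Ap = csr_matvec(indptr, indices, data, p)
        alpha = rs_old / dot(p, Ap)          # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        if math.sqrt(rs_new) < tol:
            return x, it
        beta = rs_new / rs_old               # keeps directions A-conjugate
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x, max_iter


def tridiagonal_csr(n):
    """Hypothetical test matrix: 2 on the diagonal, -1 next to it."""
    indptr, indices, data = [0], [], []
    for i in range(n):
        for j, v in ((i - 1, -1.0), (i, 2.0), (i + 1, -1.0)):
            if 0 <= j < n:
                indices.append(j)
                data.append(v)
        indptr.append(len(indices))
    return indptr, indices, data


if __name__ == "__main__":
    A = tridiagonal_csr(20)
    b = csr_matvec(*A, [1.0] * 20)    # right-hand side with known solution
    x, iterations = conjugate_gradient(*A, b)
    print(f"converged in {iterations} iterations")
```

Each iteration is dominated by one sparse matrix-vector product plus a few vector updates, which is why this workload stresses both the CPU and the memory system.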

SLIDE 19

HPCCG – CPU intensive

SLIDE 20

HPCCG – memory intensive

SLIDE 21

Future Work

  • Extend to actual Cray hardware with a CNL host
    – Show definitively if this approach can work
  • Explore the possibility that this approach can be deployed in a cloud setting to provide virtual HPC environments on commodity clouds
    – Previously infeasible, due to the contention, noise, etc.
    – Problems we think can be solved by the same techniques used in this work

SLIDE 22

Conclusions

  • Palacios is capable of providing superior performance to native Linux
  • Palacios can provide a low noise environment, even when running on a noisy Linux host
  • While results are preliminary, they show that this approach is feasible at small scales

SLIDE 23

Acknowledgments

  • Palacios: http://www.v3vee.org/palacios
  • Kitten: https://software.sandia.gov/trac/kitten
  • Email: briankoco@cs.pitt.edu, jacklange@cs.pitt.edu