



  1. Pegasus: Coordinated Scheduling for Virtualized Accelerator-based Systems. Vishakha Gupta, Karsten Schwan (Georgia Tech); Niraj Tolia (Maginatics); Vanish Talwar, Parthasarathy Ranganathan (HP Labs). USENIX ATC 2011, Portland, OR, USA

  2. Increasing Popularity of Accelerators
  • 2007: IBM Cell-based PlayStation; CUDA and programmable GPUs for developers
  • 2008: IBM Cell-based RoadRunner
  • 2009: Increasing popularity of NVIDIA GPU-powered desktops and laptops
  • 2010: Tegras in cellphones; Keeneland; Tianhe-1A and Nebulae supercomputers in the Top500
  • 2011: Amazon EC2 adopts GPUs

  3. Example x86-GPU System
  [diagram: x86 host connected to a GPU over PCIe]

  4. Example x86-GPU System
  Proprietary NVIDIA driver and CUDA runtime:
  • Memory management
  • Communication with the device
  • Scheduling logic
  • Binary translation
  (host and GPU connected over PCIe)

  5. Example x86-GPU System
  C-like CUDA-based applications (host portion) run on top of the proprietary NVIDIA driver and CUDA runtime:
  • Memory management
  • Communication with the device
  • Scheduling logic
  • Binary translation
  (host and GPU connected over PCIe)

  6. Example x86-GPU System
  C-like CUDA-based applications (host portion) issue CUDA kernels through the proprietary NVIDIA driver and CUDA runtime:
  • Memory management
  • Communication with the device
  • Scheduling logic
  • Binary translation
  (host and GPU connected over PCIe)

  7. Example x86-GPU System
  C-like CUDA-based applications (host portion) issue CUDA kernels through the proprietary NVIDIA driver and CUDA runtime:
  • Memory management
  • Communication with the device
  • Scheduling logic
  • Binary translation
  (host and GPU connected over PCIe)
  Design flaw: the bulk of this logic lives in drivers, which were meant for simple operations such as read, write, and interrupt handling.
  Shortcoming: the logic is inaccessible, and one scheduling policy must fit all workloads.
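The slide above separates a CUDA application into a host portion and device kernels, with the runtime and driver mediating every step. A minimal sketch of that typical host-side call sequence, with the runtime mocked in plain Python (all class and call names here are illustrative stand-ins, not the real CUDA API):

```python
# Illustrative mock of the host-portion call sequence the slide describes
# (allocate, copy in, launch kernel, copy out). "MockCudaRuntime" is a plain
# Python stand-in for the proprietary runtime/driver layer; it only records
# the requests the host portion would issue.

class MockCudaRuntime:
    """Records the ordered requests a host portion sends to the runtime/driver."""
    def __init__(self):
        self.calls = []          # ordered log of runtime requests
        self.device_mem = {}     # simulated device allocations

    def malloc(self, name, nbytes):
        self.calls.append(("cudaMalloc", name, nbytes))
        self.device_mem[name] = bytearray(nbytes)

    def memcpy_h2d(self, name, data):
        self.calls.append(("cudaMemcpyHostToDevice", name, len(data)))
        self.device_mem[name][:len(data)] = data

    def launch(self, kernel, *args):
        # In a real system this is where the driver's scheduling logic runs.
        self.calls.append(("kernelLaunch", kernel, args))

    def memcpy_d2h(self, name):
        self.calls.append(("cudaMemcpyDeviceToHost", name))
        return bytes(self.device_mem[name])


def host_portion(rt):
    """The C-like host portion of a CUDA application, as a sequence of runtime calls."""
    rt.malloc("d_in", 16)
    rt.memcpy_h2d("d_in", b"0123456789abcdef")
    rt.launch("cp_kernel", "d_in")   # device kernel would run on the GPU
    return rt.memcpy_d2h("d_in")


rt = MockCudaRuntime()
out = host_portion(rt)
```

Every one of these steps funnels through the driver stack, which is why the slide calls burying the scheduling logic there a design flaw: applications cannot see or influence it.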

  8. Sharing Accelerators
  • 2010: Tegras in cellphones; HPC GPU cluster (Keeneland)
  • 2011: Amazon EC2 adopts GPUs; other cloud offerings by AMD, NVIDIA

  9. Sharing Accelerators
  • 2010: Tegras in cellphones; HPC GPU cluster (Keeneland)
  • 2011: Amazon EC2 adopts GPUs; other cloud offerings by AMD, NVIDIA
  • Most applications fail to occupy GPUs completely, with the exception of extensively tuned (e.g. supercomputing) applications

  10. Sharing Accelerators
  • 2010: Tegras in cellphones; HPC GPU cluster (Keeneland)
  • 2011: Amazon EC2 adopts GPUs; other cloud offerings by AMD, NVIDIA
  • Most applications fail to occupy GPUs completely, with the exception of extensively tuned (e.g. supercomputing) applications
  • Expected utilization of GPUs across applications in some domains "may" follow patterns that allow sharing

  11. Sharing Accelerators
  • 2010: Tegras in cellphones; HPC GPU cluster (Keeneland)
  • 2011: Amazon EC2 adopts GPUs; other cloud offerings by AMD, NVIDIA
  • Most applications fail to occupy GPUs completely, with the exception of extensively tuned (e.g. supercomputing) applications
  • Expected utilization of GPUs across applications in some domains "may" follow patterns that allow sharing
  Need for accelerator sharing: resource sharing is now supported in NVIDIA's Fermi architecture.
  Concern: can driver scheduling do a good job?

  12. NVIDIA GPU Sharing: Driver Default
  • Xeon quad-core with two NVIDIA 8800GTX GPUs, driver 169.09, CUDA SDK 1.1
  • Coulomb Potential [CP] benchmark from the Parboil benchmark suite
  • Result of sharing the two GPUs among four instances of the application
  [figure: min, median, and max performance across the four instances]

  13. NVIDIA GPU Sharing: Driver Default
  • Xeon quad-core with two NVIDIA 8800GTX GPUs, driver 169.09, CUDA SDK 1.1
  • Coulomb Potential [CP] benchmark from the Parboil benchmark suite
  • Result of sharing the two GPUs among four instances of the application
  [figure: min, median, and max performance across the four instances]
  Driver can: efficiently implement computation and data interactions between host and accelerator.
  Limitations: call ordering suffers when sharing; any scheme used is static and cannot adapt to different system expectations.
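The limitation called out above is that the driver's ordering of calls from competing applications is static. A toy simulation of why that matters, contrasting a fixed first-come ordering with a round-robin interleaving; this is purely illustrative and is neither the driver's nor Pegasus's actual policy:

```python
# Toy contrast between a static, drain-one-queue-at-a-time ordering and a
# round-robin interleaving across four applications sharing GPU time.
# Hypothetical names; each "call" stands for one GPU request.

from collections import deque

def fifo_schedule(queues):
    """Drain each application's queue to completion before the next (static ordering)."""
    order = []
    for app, q in queues.items():
        order.extend((app, call) for call in q)
    return order

def round_robin_schedule(queues):
    """Interleave one call per application per round (fairness-oriented sharing)."""
    qs = {app: deque(q) for app, q in queues.items()}
    order = []
    while any(qs.values()):
        for app, q in qs.items():
            if q:
                order.append((app, q.popleft()))
    return order

queues = {f"app{i}": [f"call{j}" for j in range(3)] for i in range(4)}
fifo = fifo_schedule(queues)
rr = round_robin_schedule(queues)
```

Under the static ordering, one instance completes all of its calls before the others start, which matches the wide min-to-max spread the slide's figure reports; the interleaved schedule services every application once per round.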

  14. Re-thinking Accelerator-based Systems

  15. Re-thinking Accelerator-based Systems
  • Accelerators as first-class citizens
  − Why treat such powerful processing resources as devices?
  − How can such heterogeneous resources be managed, especially with evolving programming models, evolving hardware, and proprietary software?

  16. Re-thinking Accelerator-based Systems
  • Accelerators as first-class citizens
  − Why treat such powerful processing resources as devices?
  − How can such heterogeneous resources be managed, especially with evolving programming models, evolving hardware, and proprietary software?
  • Sharing of accelerators
  − Are there efficient methods to utilize a heterogeneous pool of resources?
  − Can applications share accelerators without a big hit in efficiency?

  17. Re-thinking Accelerator-based Systems
  • Accelerators as first-class citizens
  − Why treat such powerful processing resources as devices?
  − How can such heterogeneous resources be managed, especially with evolving programming models, evolving hardware, and proprietary software?
  • Sharing of accelerators
  − Are there efficient methods to utilize a heterogeneous pool of resources?
  − Can applications share accelerators without a big hit in efficiency?
  • Coordination across different processor types
  − How do you deal with multiple scheduling domains?
  − Does coordination obtain any performance gains?

  18. Pegasus addresses the urgent need for systems support to smartly manage accelerators.

  19. Pegasus addresses the urgent need for systems support to smartly manage accelerators. (Demonstrated through x86 and NVIDIA GPU-based systems)

  20. Pegasus addresses the urgent need for systems support to smartly manage accelerators. (Demonstrated through x86 and NVIDIA GPU-based systems) It leverages new opportunities presented by the increased adoption of virtualization technology in commercial, cloud computing, and even high-performance infrastructures.

  21. Pegasus addresses the urgent need for systems support to smartly manage accelerators. (Demonstrated through x86 and NVIDIA GPU-based systems) It leverages new opportunities presented by the increased adoption of virtualization technology in commercial, cloud computing, and even high-performance infrastructures. (Virtualization provided by the Xen hypervisor and the Dom0 management domain)

  22. ACCELERATORS AS FIRST-CLASS CITIZENS

  23. Manageability: Extending Xen for Closed NVIDIA GPUs
  [diagram: a guest VM and the Management Domain (Dom0, running Linux with traditional device drivers) sit on the Xen hypervisor, which runs on general-purpose multicores and traditional devices]

  24. Manageability: Extending Xen for Closed NVIDIA GPUs
  [diagram: compute accelerators (NVIDIA GPUs) are added to the hardware layer alongside the general-purpose multicores and traditional devices]

  25. Manageability: Extending Xen for Closed NVIDIA GPUs
  [diagram: a runtime + GPU driver is added to Dom0's Linux stack, next to the traditional device drivers]

  26. Manageability: Extending Xen for Closed NVIDIA GPUs
  [diagram: a GPU application issues CUDA API calls into the runtime + GPU driver]
  NVIDIA's CUDA: Compute Unified Device Architecture, for managing GPUs

  27. Manageability: Extending Xen for Closed NVIDIA GPUs
  [diagram: the CUDA API is interposed; a GPU frontend in the guest VM forwards the application's calls to a GPU backend in Dom0, which sits above the runtime + GPU driver]

  28. Manageability: Extending Xen for Closed NVIDIA GPUs
  [diagram: a second guest VM, with its own GPU application, CUDA API, and GPU frontend, shares the same GPU backend in Dom0]

  29. Manageability: Extending Xen for Closed NVIDIA GPUs
  [diagram: a management extension is added in Dom0 alongside the GPU backend, coordinating access for both guest VMs]
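The frontend/backend split in the diagrams above can be sketched in a few lines: a per-VM frontend intercepts CUDA API calls and marshals them to a backend in Dom0, which keeps one queue per VM and is therefore free to apply its own scheduling policy before touching the real driver. This is a minimal sketch under those assumptions; all class and method names are hypothetical, and the round-robin policy is just one example:

```python
# Sketch of Pegasus-style API interposition: guest-side frontends forward
# marshalled CUDA calls to a Dom0 backend with per-VM queues. Names are
# hypothetical stand-ins; no real guest/Dom0 channel is modeled.

from collections import deque

class GPUBackend:
    """Runs in Dom0; holds one queue per VM so a scheduler can pick among them."""
    def __init__(self):
        self.queues = {}

    def enqueue(self, vm_id, call):
        self.queues.setdefault(vm_id, deque()).append(call)

    def dispatch_round_robin(self):
        """One possible policy: service one pending call per VM per round."""
        executed = []
        while any(self.queues.values()):
            for vm_id, q in self.queues.items():
                if q:
                    name, args = q.popleft()
                    executed.append((vm_id, name))  # real driver call would go here
        return executed

class GPUFrontend:
    """Runs inside a guest VM; forwards CUDA calls instead of executing them."""
    def __init__(self, vm_id, backend):
        self.vm_id = vm_id
        self.backend = backend

    def cuda_call(self, name, *args):
        # Marshal the call; in the real design this crosses a shared
        # channel between the guest VM and Dom0.
        self.backend.enqueue(self.vm_id, (name, args))


backend = GPUBackend()
vm1 = GPUFrontend("vm1", backend)
vm2 = GPUFrontend("vm2", backend)
vm1.cuda_call("cudaMalloc", 1024)
vm1.cuda_call("kernelLaunch", "cp")
vm2.cuda_call("cudaMalloc", 2048)
trace = backend.dispatch_round_robin()
```

The design point the diagrams build toward is exactly this indirection: once calls are queued per VM in Dom0 rather than pushed straight into the closed driver, the backend (and the management extension) can coordinate GPU scheduling with the hypervisor's CPU scheduling.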
