S7281: Device Lending: Dynamic Sharing of GPUs in a PCIe Cluster
Jonas Markussen PhD student Simula Research Laboratory
Outline
Motivation
PCIe Overview
Non-Transparent Bridges
Device Lending
Distributed applications may need
Figure: cluster implementation (a front-end connected to compute nodes through an interconnect) compared with the front-end's logical view of resources.
Handled in software
Figure: a local resource compared with a remote resource accessed through middleware.
Local path: Application, CUDA library + driver, PCIe IO bus.
Remote path: Application, CUDA – middleware integration, interconnect transport (RDMA) on both hosts, interconnect, middleware service/daemon, CUDA driver, PCIe IO bus.
Figure: components of a PCIe-based interconnect: CPU and chipset, RAM (memory bus), PCIe IO device and PCIe interconnect host adapter (PCIe bus), external PCIe cable, and PCIe interconnect switch.
Figure: a local resource compared with a remote resource over native fabric.
Local path: Application, CUDA library + driver, PCIe IO bus.
Remote path: Application, CUDA library + driver, PCIe IO bus, PCIe-based interconnect, PCIe IO bus.
PCI-SIG. PCI Express 3.1 Base Specification, 2010. http://www.eetimes.com/document.asp?doc_id=1259778
Figure: PCIe bandwidth in gigabytes per second (GB/s) for Gen 2, Gen 3, and Gen 4 (axis ticks from 5 to 35 GB/s).
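The per-generation figures can be approximated from the per-lane signalling rate and the line encoding. The sketch below assumes an x16 link, one direction, and ignores packet (TLP) overhead; the function name and the choice of 16 lanes are illustrative assumptions, not taken from the slides.

```python
# Per-lane signalling rate (GT/s) and line-code efficiency per generation.
# Gen 2 uses 8b/10b encoding; Gen 3 and Gen 4 use 128b/130b.
PCIE_GENERATIONS = {
    "Gen 2": (5.0, 8 / 10),
    "Gen 3": (8.0, 128 / 130),
    "Gen 4": (16.0, 128 / 130),
}


def link_bandwidth_gb_per_s(generation, lanes=16):
    """Approximate one-way link bandwidth in GB/s, before protocol overhead."""
    rate_gt_per_s, efficiency = PCIE_GENERATIONS[generation]
    # One transfer carries one bit per lane; divide by 8 to get bytes.
    return rate_gt_per_s * efficiency * lanes / 8


if __name__ == "__main__":
    for name in PCIE_GENERATIONS:
        print(f"{name}: {link_bandwidth_gb_per_s(name):.2f} GB/s (x16)")
```

Real transfers land somewhat below these values because of packet headers and flow control.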
Figure: RAM, PCIe devices, and the CPU and chipset share one address space, from 0x00000… to 0xFFFFF…, that contains RAM regions, IO device regions, and interrupt vectors at 0xfee00xxx.
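Because RAM, device memory, and interrupt vectors live in one address space, a memory access is routed by address alone. A minimal sketch of that routing follows; only the 0xfee00xxx interrupt range comes from the slide, while the RAM and IO device ranges and all names are hypothetical.

```python
from typing import NamedTuple, Optional


class Region(NamedTuple):
    start: int   # first address in the region
    end: int     # one past the last address
    target: str  # who receives accesses to this region


# Hypothetical layout; only the interrupt range is taken from the slide.
ADDRESS_MAP = [
    Region(0x0000_0000, 0x8000_0000, "RAM"),
    Region(0xD000_0000, 0xD100_0000, "IO device 0"),
    Region(0xD100_0000, 0xD200_0000, "IO device 1"),
    Region(0xFEE0_0000, 0xFEE0_1000, "Interrupt vecs"),
]


def route(addr: int) -> Optional[str]:
    """Return the target that claims addr, or None if nothing is mapped there."""
    for region in ADDRESS_MAP:
        if region.start <= addr < region.end:
            return region.target
    return None
```

With this layout, a write to 0xfee00010 is routed to the interrupt vectors, which is how a device can raise an interrupt with an ordinary memory write.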
Figure: a local host and a remote host, each with a CPU and chipset, RAM, and a PCIe NTB adapter. Part of the local host's address space maps through the NTB to RAM on the remote host.
NTB addr mapping: local 0xf000 maps to remote 0x9000.
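The NTB mapping can be pictured as a window that rewrites addresses. The sketch below uses the 0xf000 and 0x9000 addresses from the slide; the class name, the 0x1000 window size, and the bounds check are assumptions made for illustration.

```python
class NtbWindow:
    """One NTB translation window: local addresses are rewritten to remote ones."""

    def __init__(self, local_base, remote_base, size):
        self.local_base = local_base
        self.remote_base = remote_base
        self.size = size

    def translate(self, local_addr):
        """Map a local address inside the window to the remote address space."""
        offset = local_addr - self.local_base
        if not 0 <= offset < self.size:
            raise ValueError(f"{local_addr:#x} is outside the NTB window")
        return self.remote_base + offset


# Addresses from the slide; the window size is an assumed value.
window = NtbWindow(local_base=0xF000, remote_base=0x9000, size=0x1000)
```

A load or store to local address 0xf000 is then forwarded to 0x9000 on the remote host, with no software on the data path.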
Figure: NTB-based interconnect forming a global address space. Each host's address space (shown for A and C) holds its local RAM and local IO devices plus regions that map the address spaces of the other hosts (A, B, C); an exported address range is visible through the global address space.
Figure: device lending through PCIe hot-plug. Owner and borrower each have a CPU and chipset, RAM, and an NTB adapter (at 0xe000 and 0x1000). The owner holds the physical device at 0xb000; the borrower sees an inserted device at 0x2000, used by its device driver.
NTB addr mapping: local 0x2000 maps to remote 0xb000.
Figure: DMA with a borrowed device. Owner and borrower each have a CPU and chipset, RAM, and an NTB adapter (at 0xe000 and 0x1000); physical device at 0xb000 on the owner, inserted device at 0x2000 on the borrower.
Device driver: dma_addr = dma_map_page(0x9000);
IOMMU mapping: IOV 0x5000 maps to Phys 0x9000.
NTB addr mapping: local 0xf000 maps to remote 0x5000.
Use addr 0xf000
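The DMA path chains two translations: the owner-side NTB mapping and the borrower-side IOMMU. The sketch below replays the addresses from the slide; the table class, the 0x1000 page size, and the function names are hypothetical and only model the lookups, not the driver or kernel code.

```python
PAGE_SIZE = 0x1000  # assumed page granularity


class TranslationTable:
    """Page-granular address translation, used here for both IOMMU and NTB."""

    def __init__(self):
        self.pages = {}

    def map(self, src_page, dst_page):
        self.pages[src_page] = dst_page

    def translate(self, addr):
        page = addr - addr % PAGE_SIZE
        return self.pages[page] + addr % PAGE_SIZE


borrower_iommu = TranslationTable()  # IOV -> Phys on the borrower
owner_ntb = TranslationTable()       # owner-local -> borrower address


def lend_dma_map(phys_page, iov_page, ntb_page):
    """Set up both mappings and return the address the device should use."""
    borrower_iommu.map(iov_page, phys_page)
    owner_ntb.map(ntb_page, iov_page)
    return ntb_page


def device_dma_target(device_addr):
    """Follow a DMA access from the device back to borrower physical memory."""
    return borrower_iommu.translate(owner_ntb.translate(device_addr))


dma_addr = lend_dma_map(phys_page=0x9000, iov_page=0x5000, ntb_page=0xF000)
```

Here dma_addr is 0xf000, and a device access to that address resolves to physical 0x9000 on the borrower, matching the tables on the slide.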
Figure: borrowed remote resource.
Path: Application, CUDA library + driver, PCIe IO bus, PCIe NTB interconnect, PCIe IO bus, borrowed remote resource.
Unmodified local driver (with hot-plug support)
Resource appears local to OS, driver, and app
Hardware mappings ensure fast data path
Works with any PCIe device (even individual SR-IOV functions)
Figure: a borrowed remote resource compared with a remote resource accessed through middleware.
Borrowed path: Application, CUDA library + driver, PCIe IO bus, PCIe NTB interconnect, PCIe IO bus.
Middleware path: Application, CUDA – middleware integration, interconnect transport (RDMA) on both hosts, interconnect, middleware service/daemon, CUDA driver, PCIe IO bus.
Figure: a borrowed remote resource compared with a local resource.
Borrowed path: Application, CUDA library + driver, PCIe IO bus, PCIe NTB interconnect, PCIe IO bus.
Local path: Application, CUDA library + driver, PCIe IO bus.
Figure: bandwidth in gigabytes per second (GB/s, axis ticks from 2 to 14) against transfer size (4 KB to 16 MB) for bandwidthTest (Local), bandwidthTest (Borrowed), and PXH830 DMA (GPUDirect RDMA).
https://github.com/Dolphinics/cuda-rdma-bench
GPU: Quadro P400
Nvidia driver: Version 375.26 (CentOS 7)
CPU: Xeon E5-1630 3.7 GHz
Memory: DDR4 2133 MHz
Figure: three hosts (CPU + chipset, RAM, NTB) share a pool of GPUs, SSDs, NICs, and FPGAs; devices from the pool are assigned to Task A, Task B, and Task C.
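The bookkeeping behind such a shared pool can be sketched as a simple lend-and-return registry. Everything here (class, method names, device naming) is hypothetical and is not the interface of the actual device lending implementation; it only illustrates assigning devices to tasks and reclaiming them.

```python
class DevicePool:
    """Tracks which devices in the cluster are free and which are lent to a task."""

    def __init__(self, devices):
        self.free = list(devices)  # device names such as "GPU0" or "SSD1"
        self.lent = {}             # device name -> task currently using it

    def borrow(self, task, kind, count=1):
        """Lend `count` free devices whose name starts with `kind` to `task`."""
        matches = [d for d in self.free if d.startswith(kind)][:count]
        if len(matches) < count:
            raise RuntimeError(f"not enough free {kind} devices")
        for device in matches:
            self.free.remove(device)
            self.lent[device] = task
        return matches

    def give_back(self, task):
        """Return every device held by `task` to the pool."""
        returned = [d for d, t in self.lent.items() if t == task]
        for device in returned:
            del self.lent[device]
            self.free.append(device)
        return returned
```

A task that needs several GPUs for a short burst can borrow them and hand them back afterwards, leaving them available to the next task.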
Figure: example deployment with two examination rooms and a server room.
http://mlab.no/blog/2016/12/eir/
Selected publications
“Device Lending in PCI Express Networks”, ACM NOSSDAV 2016
“Efficient Processing of Video in a Multi Auditory Environment using Device Lending of GPUs”, ACM Multimedia Systems 2016 (MMSys’16)
“PCIe Device Lending”, University of Oslo, 2015
My email address