Boosting Performance and Earnings of Cloud Computing Deployments with rCUDA
Federico Silla
Universitat Politècnica de València Spain
GPU Technology Conference 2017
Outline
Using CUDA GPUs from virtual machines
… virtual machine?
… to a virtual machine
… be larger than the number of GPUs present in the host
virtual machines ≤ GPUs
… share the GPU in the host
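The constraint stated above (virtual machines ≤ GPUs) can be illustrated with a minimal sketch. It assumes that each GPU is assigned exclusively to one virtual machine; the function name and the VM and GPU labels are hypothetical and do not belong to any real hypervisor API.

```python
def assign_exclusive_gpus(vms, gpus):
    """Give each VM its own GPU; fail when there are more VMs than GPUs."""
    if len(vms) > len(gpus):
        raise ValueError(
            f"{len(vms)} virtual machines need GPUs, "
            f"but the host only has {len(gpus)}"
        )
    # zip stops at the shorter list, so spare GPUs simply stay unassigned.
    return dict(zip(vms, gpus))


# Hypothetical host with two GPUs and two virtual machines.
host_gpus = ["gpu0", "gpu1"]
print(assign_exclusive_gpus(["vm-a", "vm-b"], host_gpus))
```

Under this assumption, a third virtual machine on the same host would raise an error, which is the limitation that sharing the GPU in the host is meant to remove.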
rCUDA … CUDA … they sound similar
Basics of GPU computing
Basic behavior of CUDA
rCUDA … remote CUDA
A software technology that enables a more flexible use of GPUs in computing facilities
rCUDA is a development by Universitat Politècnica de València, Spain
Basics of rCUDA
rCUDA GPU virtualization vision
rCUDA allows a new vision of a GPU deployment, moving from the usual cluster configuration (physical configuration) to a logical configuration.
[Figure, physical configuration: nodes 1 to n on an interconnection network; each node has CPUs, RAM, a network adapter, and a GPU with its own RAM attached through PCIe]
[Figure, logical configuration: nodes 1 to n on the same interconnection network, with the GPUs reached through logical connections]
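The difference between the two configurations can be sketched as a small model. This is an illustrative assumption, not rCUDA code: the node and GPU names and both helper functions are hypothetical. In the physical view a node reaches only the GPU on its own PCIe bus; in the logical view any node can reach any GPU over the interconnection network.

```python
# Hypothetical cluster: each node hosts one GPU on its PCIe bus.
cluster = {"node1": "gpu1", "node2": "gpu2", "node3": "gpu3"}


def reachable_physical(node, cluster):
    """Physical configuration: only the GPU installed in the node itself."""
    return [cluster[node]]


def reachable_logical(node, cluster):
    """Logical configuration: every GPU in the cluster, local or remote."""
    if node not in cluster:
        raise KeyError(node)
    return sorted(cluster.values())


print(reachable_physical("node2", cluster))
print(reachable_logical("node2", cluster))
```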
Performance of applications using rCUDA
… CUDA and rCUDA
[Figure: lower is better]
[Figures: CUDA-MEME and BarraCUDA, with EDR InfiniBand and a P100 GPU; lower is better]
Why the good performance of rCUDA?
The low overhead of applications using rCUDA is due to:
… remote GPU than CUDA does to the local GPU
… than in CUDA
“Ideas Are Easy, Implementation Is Hard”
Guy Kawasaki, marketing specialist and Silicon Valley venture capitalist
Example of performance with P2P copies
[Figures: CUDA model, rCUDA model, rCUDA scenario 1, rCUDA scenario 2]
rCUDA provides the same semantics as CUDA
[Figure: rCUDA scenario 2; higher is better]
Using rCUDA to access the GPU
… allows the use of more than one GPU at the host
Low-performance network fabric available
[Figure labels: KVM]
… may be placed in the native domain and the rCUDA client would be placed inside the VMs
… hypervisor would be used to exchange data between the rCUDA clients and the rCUDA server
High-performance network fabric available
[Figure labels: KVM]
… available, the rCUDA server can be placed in another node
… VMs, either in a single remote node or in several remote nodes
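A client-side configuration for remote GPUs in one or several remote nodes might look like the sketch below. It assumes the environment variables RCUDA_DEVICE_COUNT and RCUDA_DEVICE_<n>, with values of the form server:gpu_index, as described in the rCUDA user guide; the names and value format should be checked against the guide for the installed version. The server addresses and the helper function are hypothetical.

```python
import os


def rcuda_client_env(remote_gpus, base_env=None):
    """Build an environment that points an rCUDA client at remote GPUs.

    remote_gpus is a list of (server, gpu_index) pairs. The variable names
    are assumed from the rCUDA user guide and should be verified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["RCUDA_DEVICE_COUNT"] = str(len(remote_gpus))
    for i, (server, gpu_index) in enumerate(remote_gpus):
        env[f"RCUDA_DEVICE_{i}"] = f"{server}:{gpu_index}"
    return env


# Hypothetical addresses: two GPUs in one remote node, one GPU in another.
env = rcuda_client_env(
    [("192.168.0.10", 0), ("192.168.0.10", 1), ("192.168.0.11", 0)],
    base_env={},
)
for name in sorted(env):
    print(name, "=", env[name])
```

The resulting dictionary could be passed as the `env` argument of `subprocess.run` when launching an unmodified CUDA application inside the virtual machine.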
Application performance with KVM
[Figures: LAMMPS, CUDA-MEME, CUDASW++, and GPU-BLAST, with FDR InfiniBand and K20]
CUDA approach
rCUDA approach
… two GPUs can be either in the same host or in another computer
Performance comparison
… possible of one of the 4 following applications: …
… machines
Sharing GPUs among applications increases the overall number of executed jobs
Conclusions
… modified in order to use rCUDA
… GPUs are not shared is not significantly reduced
… increased when GPUs are shared among virtual machines
Get a free copy of rCUDA at
http://www.rcuda.net
@rcuda_
More than 800 requests worldwide
Jaime Sierra Pablo Higueras Carlos Reaño Javier Prades Tony Díaz