DELIVERING HIGH-PERFORMANCE REMOTE GRAPHICS WITH NVIDIA GRID - - PowerPoint PPT Presentation

SLIDE 1

DELIVERING HIGH-PERFORMANCE REMOTE GRAPHICS WITH NVIDIA GRID VIRTUAL GPU

Andy Currid NVIDIA

SLIDE 2

WHAT YOU’LL LEARN IN THIS SESSION

  • NVIDIA's GRID Virtual GPU architecture
      - What it is and how it works
  • Using GRID Virtual GPU on Citrix XenServer
  • How to deliver great remote graphics from GRID Virtual GPU
SLIDE 3

WHY VIRTUALIZE?

[Diagram: user types (engineer / designer, knowledge worker, power user) shown with desktop hardware classes (workstation, high-end PC, entry-level PC)]

SLIDE 4

WHY VIRTUALIZE?

Desktop workstation with Quadro GPU

  • Awesome performance!
  • High cost
  • Hard to fully utilize, limited mobility
  • Challenging to manage
  • Data security can be a problem

SLIDE 5

… CENTRALIZE THE WORKSTATION

[Diagram: notebook or thin client connected by remote graphics to a desktop workstation with Quadro GPU in the datacenter]

  • Awesome performance!
  • Easier to fully utilize, manage and secure
  • Even more expensive!

SLIDE 6

… VIRTUALIZE THE WORKSTATION

[Diagram: notebook or thin client connected by remote graphics to a GPU-enabled server in the datacenter. The hypervisor hosts virtual machines, each with a guest OS, NVIDIA driver and apps, backed by NVIDIA GRID GPUs.]

  • Direct GPU access from guest VM
  • Dedicated GPU per user
  • Hypervisors: Citrix XenServer, VMware ESX, Red Hat Enterprise Linux, open source Xen, KVM

SLIDE 7

… SHARE THE GPU

[Diagram: notebook or thin client connected by remote graphics to a GPU-enabled server in the datacenter. The hypervisor runs the GRID Virtual GPU Manager; several virtual machines, each with a guest OS, NVIDIA driver and apps, share an NVIDIA GRID vGPU.]

  • Direct GPU access from guest VMs
  • Physical GPU management

SLIDE 8

NVIDIA GRID VIRTUAL GPU

[Diagram: GPU-enabled server; hypervisor with GRID Virtual GPU Manager; VM 1 and VM 2, each with a guest OS, NVIDIA driver and apps, sharing an NVIDIA GRID vGPU]

  • Standard NVIDIA driver stack in each guest VM
      - API compatibility
  • Direct hardware access from the guest
      - Highest performance
  • GRID Virtual GPU Manager
      - Increased manageability

SLIDE 9

VIRTUAL GPU RESOURCE SHARING

[Diagram: VM 1 and VM 2 above the hypervisor and GRID Virtual GPU Manager. GPU engines (3D, CE, NVENC, NVDEC) under timeshared scheduling; framebuffer divided into VM1 FB and VM2 FB; channels exposed through the GPU BAR, divided into VM1 BAR and VM2 BAR, behind the CPU MMU]

  • Frame buffer
      - Allocated at VM startup
  • Channels
      - Used to post work to the GPU
      - VM accesses its channels via the GPU Base Address Register (BAR), isolated by the CPU’s Memory Management Unit (MMU)
  • GPU engines
      - Timeshared among VMs, like multiple contexts on a single OS

SLIDE 10

VIRTUAL GPU ISOLATION

[Diagram: GPU engines (3D, CE, NVENC, NVDEC) issue untranslated accesses to the GPU MMU. The GPU MMU reads the per-VM pagetables (VM1 pagetables, VM2 pagetables) and issues translated DMA accesses to VM physical memory and to the framebuffer (VM1 FB, VM2 FB).]

  • GPU MMU controls access from engines to framebuffer and system memory
  • vGPU Manager maintains per-VM pagetables in the GPU’s framebuffer
  • Valid accesses are routed to framebuffer or system memory
  • Invalid accesses are blocked
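The isolation rules on this slide can be illustrated with a toy model. The sketch below is not NVIDIA's implementation: the class name `ToyGpuMmu`, the page-number granularity, and the `map_page` / `translate` methods are all invented for illustration. It only shows the idea of per-VM pagetables where mapped accesses are routed and unmapped accesses are blocked.

```python
from dataclasses import dataclass, field


class BlockedAccess(Exception):
    """Raised when an engine access has no valid translation."""


@dataclass
class ToyGpuMmu:
    # Hypothetical layout: VM name -> {virtual page -> (target, physical page)}
    pagetables: dict = field(default_factory=dict)

    def map_page(self, vm, vpage, target, ppage):
        """Stand-in for the vGPU Manager granting a VM access to one page."""
        if target not in ("framebuffer", "system"):
            raise ValueError("target must be 'framebuffer' or 'system'")
        self.pagetables.setdefault(vm, {})[vpage] = (target, ppage)

    def translate(self, vm, vpage):
        """Route a mapped access; block anything missing from the VM's pagetable."""
        entry = self.pagetables.get(vm, {}).get(vpage)
        if entry is None:
            raise BlockedAccess(f"{vm}: no mapping for virtual page {vpage}")
        return entry


mmu = ToyGpuMmu()
mmu.map_page("VM1", 0, "framebuffer", 100)  # VM1's slice of the framebuffer
mmu.map_page("VM2", 0, "framebuffer", 200)  # VM2's slice of the framebuffer
mmu.map_page("VM1", 1, "system", 7)         # a page of VM1's system memory
```

Because each VM is looked up in its own table, the same virtual page number in VM1 and VM2 resolves to different physical pages, and `mmu.translate("VM2", 1)` raises `BlockedAccess` since VM2 was never given that mapping.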

SLIDE 11

VIRTUAL GPU DISPLAY

[Diagram: VM 1 above the hypervisor and GRID Virtual GPU Manager; GPU engines (3D, CE, NVENC); VM1 FB in the framebuffer holding Head 1 and Head 2]

  • Virtual GPU exposes virtual display heads for each VM
      - E.g. 2 heads at 2560x1600 resolution
  • Primary surfaces (front buffers) for each head are maintained in a VM’s framebuffer
  • Physical scanout to a monitor is replaced by hardware delivery direct to system memory

SLIDE 12

NVIDIA GRID REMOTE GRAPHICS SDK

  • Available on vGPU and passthrough GPU
  • Fast readback of desktop or individual render targets
  • Hardware H.264 encoder
  • Citrix XenDesktop
  • VMware View
  • NICE DCV
  • HP RGS

[Diagram: apps send graphics commands to the GRID GPU or vGPU (3D engine, NVENC, framebuffer). NVIFR reads back a render target and NVFBC reads back the front buffer; H.264 or raw streams go to the remote graphics stack and on to the network.]

SLIDE 13

USING NVIDIA GRID vGPU

  • Citrix XenServer
      - First hypervisor to support GRID vGPU
      - Also supports GPU passthrough
      - Open source
      - Full tools integration for GPU
      - GRID certified server platforms
  • VMware vSphere
      - Coming soon!

SLIDE 14

XENSERVER SETUP

  • Install XenServer
  • Install XenCenter management GUI on PC
  • Install GRID Virtual GPU Manager

rpm -i NVIDIA-vgx-xenserver-6.2-331.30.i386.rpm

SLIDE 15

ASSIGNING A vGPU TO A VIRTUAL MACHINE

  • Citrix XenCenter management GUI
  • Assignment of virtual GPU, or passthrough of dedicated GPU

SLIDE 16

BOOT, INSTALL OF NVIDIA DRIVERS

  • VM’s console accessed through XenCenter
  • Install NVIDIA guest vGPU driver

SLIDE 17

vGPU OPERATION

  • NVIDIA driver now loaded, vGPU is fully operational
  • Verify with NVIDIA control panel

SLIDE 18

DELIVERING GREAT REMOTE GRAPHICS

  • Use a high performance remote graphics stack
  • Tune the platform for best graphics performance

SLIDE 19

TUNING THE PLATFORM

  • Platform basics
  • GPU selection
  • NUMA considerations

SLIDE 20

PLATFORM BASICS

  • Use sufficient CPU!
      - Graphically intensive apps typically need multiple cores
  • Ensure CPUs can reach their highest clock speeds
      - Enable extended P-states / TurboBoost in the system BIOS
      - Set XenServer’s frequency governor to performance mode

xenpm set-scaling-governor performance
/opt/xensource/libexec/xen-cmdline --set-xen cpufreq=xen:performance

  • Use sufficient RAM! - don’t overcommit memory
  • Fast storage subsystem - local SSD or fast NAS / SAN

SLIDE 21

MEASURING UTILIZATION

  • nvidia-smi command line utility
  • Reports GPU utilization, memory usage, temperature, and much more

[root@xenserver-vgx-test2 ~]# nvidia-smi
Mon Mar 24 09:56:42 2014
+------------------------------------------------------+
| NVIDIA-SMI 331.62     Driver Version: 331.62         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GRID K1             On   | 0000:04:00.0     Off |                  N/A |
| N/A   31C    P0    20W /  31W |    530MiB /  4095MiB |     61%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GRID K1             On   | 0000:05:00.0     Off |                  N/A |
| N/A   29C    P0    19W /  31W |    270MiB /  4095MiB |     46%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GRID K1             On   | 0000:06:00.0     Off |                  N/A |
| N/A   26C    P0    15W /  31W |    270MiB /  4095MiB |      7%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GRID K1             On   | 0000:07:00.0     Off |                  N/A |
| N/A   28C    P0    19W /  31W |    270MiB /  4095MiB |     46%      Default |
+-------------------------------+----------------------+----------------------+
|   4  GRID K1             On   | 0000:86:00.0     Off |                  N/A |
| N/A   26C    P0    19W /  31W |    270MiB /  4095MiB |     45%      Default |
+-------------------------------+----------------------+----------------------+
|   5  GRID K1             On   | 0000:87:00.0     Off |                  N/A |
| N/A   27C    P0    15W /  31W |     10MiB /  4095MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   6  GRID K1             On   | 0000:88:00.0     Off |                  N/A |
| N/A   33C    P0    19W /  31W |    270MiB /  4095MiB |     53%      Default |
+-------------------------------+----------------------+----------------------+
|   7  GRID K1             On   | 0000:89:00.0     Off |                  N/A |
| N/A   32C    P0    19W /  31W |    270MiB /  4095MiB |     46%      Default |
+-------------------------------+----------------------+----------------------+
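For scripted monitoring, the boxed table is awkward to parse. A minimal sketch, assuming the installed nvidia-smi supports its CSV query mode (`--query-gpu=... --format=csv,noheader,nounits`; not shown in the talk, so check it on your driver version): the code below parses text in that shape. The sample rows reuse the figures of the first three GPUs in the listing above, and the function and field names are illustrative only.

```python
import csv
import io

# Text in the shape assumed to be produced by:
#   nvidia-smi --query-gpu=index,name,utilization.gpu,memory.used,memory.total \
#              --format=csv,noheader,nounits
# Values copied from GPUs 0-2 of the nvidia-smi listing on this slide.
SAMPLE = """\
0, GRID K1, 61, 530, 4095
1, GRID K1, 46, 270, 4095
2, GRID K1, 7, 270, 4095
"""


def parse_gpu_csv(text):
    """Turn CSV rows into dictionaries with numeric fields."""
    rows = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not row:
            continue  # ignore blank lines
        index, name, util, used, total = row
        rows.append({
            "index": int(index),
            "name": name,
            "util_pct": int(util),
            "mem_used_mib": int(used),
            "mem_total_mib": int(total),
        })
    return rows


def busiest(rows):
    """Return the row with the highest GPU utilization."""
    return max(rows, key=lambda r: r["util_pct"])
```

On a live host the same parser could be fed the command's standard output instead of `SAMPLE`.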

SLIDE 22

MEASURING UTILIZATION

  • GPU utilization graph in XenCenter

SLIDE 23

PICK THE RIGHT GRID GPU

User types: engineer / designer, knowledge worker, power user

GRID K2
  • 2 high-end Kepler GPUs
  • 3072 CUDA cores (1536 / GPU)
  • 8GB GDDR5 (4GB / GPU)

GRID K1
  • 4 entry Kepler GPUs
  • 768 CUDA cores (192 / GPU)
  • 16GB DDR3 (4GB / GPU)

SLIDE 24

SELECT THE RIGHT vGPU

User types: engineer, designer, knowledge worker, power user

GRID K2
  • 2 high-end Kepler GPUs
  • 3072 CUDA cores (1536 / GPU)
  • 8GB GDDR5 (4GB / GPU)

vGPU type     Framebuffer   Heads   Resolution
GRID K200     256MB         2       1920x1200
GRID K240Q    1GB           2       2560x1600
GRID K260Q    2GB           4       2560x1600

SLIDE 25

SELECT THE RIGHT vGPU

User types: knowledge worker, power user

GRID K1
  • 4 entry Kepler GPUs
  • 768 CUDA cores (192 / GPU)
  • 16GB DDR3 (4GB / GPU)

vGPU type     Framebuffer   Heads   Resolution
GRID K100     256MB         2       1920x1200
GRID K140Q    1GB           2       2560x1600
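The vGPU figures on the GRID K2 and GRID K1 slides lend themselves to a simple lookup. The sketch below is one possible way to pick the smallest listed vGPU type that meets a display requirement. The table values come from the slides; the `VgpuType` structure, the `smallest_fit` function, and the "smallest framebuffer wins" rule are assumptions made for illustration, not NVIDIA sizing guidance.

```python
from typing import NamedTuple, Optional


class VgpuType(NamedTuple):
    name: str
    board: str
    framebuffer_mb: int
    heads: int
    max_width: int
    max_height: int


# Figures as listed on the GRID K2 and GRID K1 slides.
VGPU_TYPES = [
    VgpuType("GRID K100", "GRID K1", 256, 2, 1920, 1200),
    VgpuType("GRID K140Q", "GRID K1", 1024, 2, 2560, 1600),
    VgpuType("GRID K200", "GRID K2", 256, 2, 1920, 1200),
    VgpuType("GRID K240Q", "GRID K2", 1024, 2, 2560, 1600),
    VgpuType("GRID K260Q", "GRID K2", 2048, 4, 2560, 1600),
]


def smallest_fit(board, heads, width, height,
                 min_framebuffer_mb=0) -> Optional[VgpuType]:
    """Return the listed vGPU type with the least framebuffer that still
    satisfies the head count, resolution and framebuffer requirement."""
    candidates = [
        t for t in VGPU_TYPES
        if t.board == board
        and t.heads >= heads
        and t.max_width >= width
        and t.max_height >= height
        and t.framebuffer_mb >= min_framebuffer_mb
    ]
    return min(candidates, key=lambda t: t.framebuffer_mb, default=None)
```

For example, `smallest_fit("GRID K2", 3, 1920, 1200)` selects GRID K260Q, because it is the only listed K2 type with more than two heads, while a request no listed type can meet returns `None`.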

SLIDE 26

TAKE ACCOUNT OF NUMA

  • Non-Uniform Memory Access
  • Memory and GPUs connected to each CPU
  • CPUs connected via proprietary interconnect
  • CPU/GPU access to memory on same socket is fastest
  • Access to memory on remote socket is slower

[Diagram: CPU Socket 0 and CPU Socket 1 joined by the CPU interconnect. Each socket has six cores, its own memory (Memory 0, Memory 1), and two GPUs attached over PCI Express.]

SLIDE 27

PIN vCPUS TO SOCKETS

  • VM pinned to CPU socket by restricting its vCPUs to run only on that socket
  • xe vm-param-set uuid=<vm-uuid> VCPUs-params:mask=0,1,2,3,4,5

[Diagram: a virtual machine with four vCPUs placed on CPU Socket 0, alongside that socket's memory (Memory 0) and GPUs; CPU Socket 1 with its cores, GPUs on PCI Express, and Memory 1]
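The mask in the command above lists the physical CPUs of one socket. A small helper can generate it, as sketched below. This assumes CPUs are numbered contiguously per socket (socket 0 owns CPUs 0 to 5 on a six-core socket, as in the slide's example); that numbering is an assumption and should be checked against the host's real topology. The function names are hypothetical.

```python
def socket_cpu_mask(socket, cores_per_socket):
    """Comma-separated CPU numbers for one socket.

    Assumes contiguous numbering: socket s owns CPUs
    s * cores_per_socket .. (s + 1) * cores_per_socket - 1.
    """
    if socket < 0 or cores_per_socket <= 0:
        raise ValueError("socket must be >= 0 and cores_per_socket > 0")
    first = socket * cores_per_socket
    return ",".join(str(cpu) for cpu in range(first, first + cores_per_socket))


def pin_command(vm_uuid, socket, cores_per_socket):
    """Build the xe command line from the slide for the given socket."""
    mask = socket_cpu_mask(socket, cores_per_socket)
    return f"xe vm-param-set uuid={vm_uuid} VCPUs-params:mask={mask}"
```

The helper only builds the command string; running it, and choosing the socket that matches the VM's GPU, is left to the operator.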

SLIDE 28

SELECTING A vGPU ON A SPECIFIC SOCKET

[Diagram: CPU Socket 0 and CPU Socket 1, each with its own cores and memory (Memory 0, Memory 1) and two GRID K2 boards; the physical GPUs are labeled GPU 1 to GPU 8]

SLIDE 29

GPU GROUPS

  • XenServer manages physical GPUs by means of GPU groups
  • Default behavior: all physical GPUs of same type are placed in one GPU group
  • GPU group allocation policy:
      - Depth first: allocate vGPU on most loaded GPU
      - Breadth first: allocate vGPU on least loaded GPU

[Diagram: GPU group "GRID K2", allocation policy depth first, containing physical GPUs GPU 1 to GPU 8 across CPU Socket 0 and CPU Socket 1]
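The two allocation policies can be compared with a short model. This is a sketch of the idea only, not XenServer's actual placement code: it assumes a uniform, hypothetical `capacity` (maximum vGPUs per physical GPU) and reads "most loaded" as "most loaded GPU that still has room".

```python
def pick_gpu(load, capacity, policy):
    """Choose a physical GPU for a new vGPU.

    load:     {gpu name: number of vGPUs already placed on it}
    capacity: maximum vGPUs per physical GPU (hypothetical uniform limit)
    policy:   "depth-first" or "breadth-first"
    Returns the chosen GPU name, or None if every GPU is full.
    """
    free = {gpu: n for gpu, n in load.items() if n < capacity}
    if not free:
        return None
    if policy == "depth-first":
        return max(free, key=free.get)  # most loaded GPU that still has room
    if policy == "breadth-first":
        return min(free, key=free.get)  # least loaded GPU
    raise ValueError(f"unknown policy: {policy}")
```

Depth first packs vGPUs onto as few physical GPUs as possible, while breadth first spreads them out, which gives each vGPU a larger share of its GPU while the group is lightly loaded.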

SLIDE 30

GPU GROUPS

  • Default GPU group takes no account of where a VM is running
  • Your VM may end up using a vGPU that’s allocated on a GPU on a remote CPU socket

[Diagram: GPU group "GRID K2", allocation policy depth first. A virtual machine with four vCPUs and Memory 0 on CPU Socket 0; the K2 GPU in use (GPU 7) attached to CPU Socket 1]

SLIDE 31

GPU GROUPS

  • Create custom GPU groups
      - Per socket, or per GPU for ultimate control
  • xe gpu-group-create name-label="GRID K2 Socket 0"
  • xe pgpu-param-set uuid=<pgpu-uuid> gpu-group-uuid=<group-uuid>
  • xe gpu-group-param-set uuid=<group-uuid> allocation-algorithm=breadth-first

[Diagram: GPU group "GRID K2 Socket 0" and GPU group "GRID K2 Socket 1", each with allocation policy breadth first and its own list of physical GPUs. Each CPU socket has its own memory and two GRID K2 boards; the physical GPUs are labeled GPU 1 to GPU 8.]
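The three xe commands above have to be repeated for each socket and each physical GPU. The sketch below only assembles the command lines in order, so they can be reviewed before being run by hand. The function name is hypothetical, every UUID is a placeholder supplied by the caller, and it assumes the group UUID is obtained from the `xe gpu-group-create` step before the later commands are issued.

```python
def socket_group_commands(socket, group_uuid, pgpu_uuids,
                          policy="breadth-first"):
    """Build the xe command lines for one per-socket GPU group.

    group_uuid: UUID of the group created by the first command (placeholder).
    pgpu_uuids: UUIDs of the physical GPUs attached to this socket.
    """
    if policy not in ("breadth-first", "depth-first"):
        raise ValueError(f"unknown policy: {policy}")
    commands = [f'xe gpu-group-create name-label="GRID K2 Socket {socket}"']
    for pgpu in pgpu_uuids:
        # Move each physical GPU on this socket into the new group.
        commands.append(
            f"xe pgpu-param-set uuid={pgpu} gpu-group-uuid={group_uuid}"
        )
    commands.append(
        f"xe gpu-group-param-set uuid={group_uuid} "
        f"allocation-algorithm={policy}"
    )
    return commands
```

Printing the result of `socket_group_commands(0, "<group-uuid>", ["<pgpu-a>", "<pgpu-b>"])` gives one create command, one `pgpu-param-set` line per GPU, and a final line setting the allocation algorithm.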

SLIDE 32

WRAP UP

  • NVIDIA's GRID Virtual GPU architecture
  • GRID Virtual GPU on Citrix XenServer
  • Remote graphics performance

SLIDE 33

RESOURCES

  • NVIDIA GRID vGPU User Guide
      - Included with GRID vGPU drivers
      - Visit http://www.nvidia.com/vgpu, look for driver download link
  • Citrix XenServer with 3D Graphics Pack
      - Visit http://www.citrix.com/go/vgpu
  • Qualified server platforms
      - Visit http://www.nvidia.com/buygrid

SLIDE 34

RESOURCES

  • Remote Graphics
      - Citrix XenDesktop: http://www.citrix.com/xendesktop
      - HP Remote Graphics Software (RGS): http://www8.hp.com/us/en/campaigns/workstations/remote-graphics-software.html
      - NICE Desktop Cloud Visualization (DCV): https://www.nice-software.com/products/dcv
  • XenServer CPU performance tuning
      - http://www.xenserver.org/partners/developing-products-for-xenserver/19-dev-help/138-xs-dev-perf-turbo.html

SLIDE 35

THANK YOU!

NVIDIA GRID Resources

  • GRID Website: www.nvidia.com/vdi
  • Sign up for the monthly GRID VDI Newsletter: http://tinyurl.com/gridinfo
  • GRID YouTube Channel: http://tinyurl.com/gridvideos
  • Questions? Ask on our Forums: https://gridforums.nvidia.com
  • NVIDIA GRID on LinkedIn: http://linkd.in/QG4A6u
  • Follow us on Twitter: @NVIDIAGRID