April 4-7, 2016 | Silicon Valley
Erik Bohnhorst, GRID Performance Architect, 04/04/2016
UNVEILING THE IMPACT OF TIME SLICING WITH NVIDIA GRID
AGENDA
- GRID vGPU resource sharing (simplified)
- How time slicing works
- Using “benchmarks” to evaluate GRID
- Impact of realistic recommendations
Tesla GPU (simplified view): vGPU-1, vGPU-2, ... vGPU-n share the GPU engines (Graphics/Compute, Video Encode, Video Decode, Copy Engine), while the framebuffer is partitioned into per-VM regions (VM-1 FB, VM-2 FB, ... VM-n FB). Each vGPU is assigned a fixed range of framebuffer for its exclusive use.

The GPU engines (Graphics/Compute, Video Encode, Video Decode, and Copy Engine) are time sliced (t=1, t=2, ... t=16), and the different engines can execute in parallel. During its time slice, each vGPU has exclusive access to the entire engine (all CUDA cores).
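The engine-level sharing described above can be sketched as a simple round-robin scheduler. This is an illustrative model only: the slice count and the rotation policy here are assumptions for the sketch, not the actual GRID scheduler implementation.

```python
from collections import deque

def round_robin(vgpus, total_slices):
    """Round-robin time slicing: each vGPU in turn gets exclusive use
    of the whole engine (all CUDA cores) for one time slice."""
    schedule = []
    queue = deque(vgpus)
    for t in range(total_slices):
        vgpu = queue[0]
        queue.rotate(-1)          # move the current vGPU to the back
        schedule.append((t, vgpu))
    return schedule

# Three vGPUs sharing 6 slices of the 3D engine.
for t, vgpu in round_robin(["vGPU-1", "vGPU-2", "vGPU-3"], 6):
    print(f"t={t}: {vgpu} owns the engine")
```

Because only one vGPU owns an engine per slice, a vGPU that has no work pending simply wastes its slice in this naive model; the later slides show why real, bursty workloads leave exactly that kind of reusable gap.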
Framebuffer: comparing framebuffer usage (in MB) against benchmark scores, the 2Q, 4Q and 8Q vGPU profiles perform equally when each has sufficient framebuffer and the entire time of the 3D engine.
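Since each vGPU owns a fixed framebuffer range, a larger profile only helps once the workload needs more memory than the smaller one provides. A rough sanity check, assuming the common convention that a Q profile's number is its framebuffer in GB (the exact usable sizes vary by GRID version and reserve some overhead, so verify against your documentation):

```python
# Assumed framebuffer sizes per profile (Q number ~ GB); check your GRID docs.
PROFILE_FB_MB = {"2Q": 2048, "4Q": 4096, "8Q": 8192}

def smallest_sufficient_profile(workload_fb_mb):
    """Return the smallest profile whose framebuffer covers the workload,
    or None if even the largest profile is too small."""
    for profile, fb in sorted(PROFILE_FB_MB.items(), key=lambda kv: kv[1]):
        if workload_fb_mb <= fb:
            return profile
    return None

print(smallest_sufficient_profile(1500))  # a 1.5 GB workload fits in 2Q
```

This mirrors the slide's point: if the workload fits in 2Q, then 4Q and 8Q add framebuffer the workload never touches, so the scores are equal.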
JT2GO – Zooming/Panning/Rotating
Panning and rotating: at this timescale, the GPU seems to be constantly utilized during zooming, panning and rotating.
Zooming in on the GPU and CPU utilization traces, however, there is lots of unused time in between the spikes.
NVIDIA uses ultra high-end GPUs with GRID for maximum available time: the unused time between spikes is time a different process or virtual machine can use the GPU.
A GPU-heavy process (such as zooming) within a human workflow is bursty, while a benchmark such as the VP12 Catia viewset is a synthetic workload that keeps the GPU continuously busy.
Unrealistic users-per-host recommendation using benchmarks: on a performance-over-time chart, every user is Active the entire time, so their demands all overlap.

Realistic users-per-host recommendation using human workflows: each user's timeline is mostly Idle with short Active bursts (e.g. Idle, Idle, Idle, Active, Idle), so the bursts of many users can interleave on the same GPU.
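The difference between the two charts can be put into rough numbers. A hypothetical sizing sketch: if a benchmark keeps its vGPU 100% active while a human workflow is active only a fraction of the time, the GPU can be oversubscribed roughly in proportion to the inverse of that duty cycle (the 25% figure below is an assumption for illustration; framebuffer limits and scheduling overhead cap the real number):

```python
def users_per_gpu(duty_cycle, target_utilization=1.0):
    """Rough estimate: how many users' active bursts fit into one GPU's
    time, given the fraction of time each user keeps the engine busy."""
    return int(target_utilization / duty_cycle)

print(users_per_gpu(1.0))    # benchmark: always active -> 1 user per GPU
print(users_per_gpu(0.25))   # human workflow active 25% of the time -> 4
```

This is why sizing from synthetic benchmarks systematically understates how many real users a host can serve.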
Realistic users per host (UPH) recommendations can only be generated with human workflows. Evaluate GRID by monitoring the human workflows of a small group of real end users on GRID vGPU.
This evaluation is an iterative cycle: Monitor → Configure/Change → Run.
                               “BENCHMARKS”   REAL USERS
Cost per server                $30,000        $40,000
Users per host                 8              16
Software costs                 per user       per user
Cost per user (server HW)      $3,750         $2,500
“Cost per user” drops significantly when evaluating GRID with real users.
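The table's cost-per-user figures follow from simple division over server hardware only; the per-user software costs are identical in both columns and cancel out of the comparison:

```python
def hw_cost_per_user(server_cost, users_per_host):
    # Server hardware cost divided across the users that host serves.
    return server_cost / users_per_host

print(hw_cost_per_user(30_000, 8))   # benchmark-based sizing: 3750.0
print(hw_cost_per_user(40_000, 16))  # real-user sizing:       2500.0
```

Even though the real-user server costs more, doubling the users per host cuts the hardware cost per user by a third.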
Time slicing the 3D engine allows sharing based on actual need, delivering great performance at scale. Leveraging benchmarks results in unrealistic, too low users-per-host recommendations. Too low users-per-host recommendations create unrealistic TCO/ROI assumptions, and too high TCO/ROI assumptions could delay or kill the project.
JOIN THE NVIDIA DEVELOPER PROGRAM AT developer.nvidia.com/join