PROFESSIONAL VR: AN UPDATE. Robert Menzel, Ingo Esser, GTC 2018



SLIDE 1

PROFESSIONAL VR: AN UPDATE

Robert Menzel, Ingo Esser
GTC 2018, March 26 2018

SLIDE 2

NVIDIA VRWORKS

Comprehensive SDK for VR Developers

GRAPHICS | HEADSET | AUDIO | TOUCH & PHYSICS | PROFESSIONAL VIDEO

SLIDE 3

NVIDIA VRWORKS ADOPTION

SOME PRO APPLICATIONS | HEADSETS | ENGINES

SLIDE 4

NVIDIA VRWORKS

Comprehensive SDK for VR Developers

GRAPHICS | HEADSET | AUDIO | TOUCH & PHYSICS | PROFESSIONAL VIDEO

SLIDE 5

CAD DATA IN VR

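SLIDE 6

GRAPHICS PIPELINE

VR Workloads

- Desktop display: 1920 x 1080 at 60 Hz, N vertices, 124M Pix/s
- VR headset: 2 x (1512 x 1680) at 90 Hz, 2N vertices, 457M Pix/s
- VR vs. desktop: ~3.6x pixel throughput, 3x vertex throughput (2N vertices at 90 Hz vs. N vertices at 60 Hz)

(Pipeline diagram, annotated with the 3x and ~3.6x factors: Preprocessing, Geometric Pipeline, Rasterization, Fragment Shader, Postprocessing, Application)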
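
The throughput figures on this slide follow from resolution times refresh rate. A minimal Python sketch, assuming the headset renders two eye buffers of 1512 x 1680 each (the helper name `pixel_rate` is illustrative and not from the talk):

```python
def pixel_rate(width, height, refresh_hz, views=1):
    """Pixels per second needed to fill `views` images on every refresh."""
    return width * height * views * refresh_hz


desktop = pixel_rate(1920, 1080, 60)         # one 1080p monitor at 60 Hz
vr = pixel_rate(1512, 1680, 90, views=2)     # two eye buffers at 90 Hz

print(f"desktop: {desktop / 1e6:.0f}M Pix/s")
print(f"VR:      {vr / 1e6:.0f}M Pix/s")
print(f"ratio:   {vr / desktop:.1f}x")
```

The ratio lands between 3.6 and 3.7, in line with the ~3.6x quoted on the slide.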

SLIDE 7

NVIDIA VRWORKS

Comprehensive SDK for VR Developers

GRAPHICS | HEADSET | AUDIO | TOUCH & PHYSICS | PROFESSIONAL VIDEO

SLIDE 8

SINGLE PASS STEREO

Traditional Rendering

- Render eyes separately
- Doubles CPU and GPU load

SLIDE 9

SINGLE PASS STEREO

Using SPS to improve rendering performance

- Single Pass Stereo uses Simultaneous Multi-Projection architecture
- Draw geometry only once
- Vertex/Geometry stage runs once
- Outputs two positions for left/right
- Only rasterization is performed per-view

SLIDE 10

SINGLE PASS STEREO

OpenGL

- In OpenGL via GL_NV_stereo_view_rendering
- Create texture array for rendering left and right eye simultaneously
- No other changes needed, shaders perform SPS

SLIDE 11

SINGLE PASS STEREO

OpenGL - Vertex Shader

- Calculate projection space position
    proj_pos = proj * view * model * inPosition;
- Output both positions via different builtin variables, only x component may differ
    gl_Position = proj_pos + vec4(offset, 0, 0, 0);
    gl_SecondaryPositionNV = proj_pos - vec4(offset, 0, 0, 0);
- Use declaration and value of gl_Layer to route output to layers 0 and 1 of tex array
    layout(secondary_view_offset=1) out highp int gl_Layer;
    gl_Layer = 0;
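
To make the "transform once, output two positions" idea concrete, here is a small CPU-side Python sketch of the same arithmetic. It is not shader code; `mat_vec` and `stereo_positions` are hypothetical helper names, and the identity matrix and offset value are placeholders chosen only for illustration:

```python
def mat_vec(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def stereo_positions(mvp, in_position, offset):
    """Transform the vertex once, then derive both eye positions.

    Mirrors the slide: the primary position gets +offset in x and the
    secondary position gets -offset in x; y, z and w are shared.
    """
    proj_pos = mat_vec(mvp, in_position)
    primary = [proj_pos[0] + offset] + proj_pos[1:]
    secondary = [proj_pos[0] - offset] + proj_pos[1:]
    return primary, secondary


# Placeholder inputs: identity transform and an arbitrary eye offset.
identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
primary, secondary = stereo_positions(identity, [1.0, 2.0, 3.0, 1.0], 0.25)
print(primary, secondary)
```

The matrix multiply, the expensive part, runs once per vertex; only the x component differs between the two outputs.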

SLIDE 12

SINGLE PASS STEREO

Vulkan

- In Vulkan via VK_NVX_multiview_per_view_attributes
- Create layered texture image and view for rendering left and right simultaneously
- Requires MultiView support
- Update: VK_KHX_multiview ratified as VK_KHR_multiview in Vulkan 1.1

SLIDE 13

SINGLE PASS STEREO

Vulkan - Vertex Shader

- Calculate projection space position
    proj_pos = (proj * view * model * inPosition).xyz;
- Standard MultiView: specify once, may execute shader twice
    gl_Position = proj_pos + UBO.offsets[gl_ViewIndex];
- With per-view attributes: also specify positions explicitly, execute shader only once
    gl_PositionPerViewNV[0] = proj_pos + UBO.offsets[0];
    gl_PositionPerViewNV[1] = proj_pos + UBO.offsets[1];

SLIDE 14

GRAPHICS PIPELINE

Single Pass Stereo Performance Results

- Single Pass Stereo: Benefits in geometry bound scenarios
- Heavy fragment shaders will reduce scaling

NVIDIA Quadro P6000, Scene with 17.6M faces, frame times in ms

Shading                         Traditional   MultiView   MultiView with per-view attributes
Flat shading                    7.1           6.7         3.7
Flat shading + Phong            7.2           6.8         4.5
Flat shading + Phong + Noise    7.2           6.9         4.9

(Pipeline diagram with SPS annotations: Preprocessing, Geometric Pipeline, Rasterization, Fragment Shader, Postprocessing, Application)
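
The effect of fragment load on scaling is easier to see as speedup factors. A short Python sketch; the assignment of the chart values to rows and columns is a reading of the flattened chart data, so treat the mapping as an assumption:

```python
# Frame times in ms (Quadro P6000, 17.6M faces), as read from the slide.
frame_times_ms = {
    "Flat shading": {"traditional": 7.1, "multiview": 6.7, "per_view": 3.7},
    "Flat shading + Phong": {"traditional": 7.2, "multiview": 6.8, "per_view": 4.5},
    "Flat shading + Phong + Noise": {"traditional": 7.2, "multiview": 6.9, "per_view": 4.9},
}


def speedup(baseline_ms, optimized_ms):
    """Speedup factor of the optimized path over the baseline."""
    return baseline_ms / optimized_ms


speedups = {
    shading: speedup(t["traditional"], t["per_view"])
    for shading, t in frame_times_ms.items()
}

for shading, factor in speedups.items():
    print(f"{shading}: {factor:.2f}x with per-view attributes")
```

Under this reading the speedup drops from roughly 1.9x with flat shading to below 1.5x with the heaviest fragment shader, which matches the statement that heavy fragment shaders reduce scaling.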

SLIDE 15

NVIDIA VRWORKS

Comprehensive SDK for VR Developers

GRAPHICS | HEADSET | AUDIO | TOUCH & PHYSICS | PROFESSIONAL VIDEO

SLIDE 16

VR SLI

Crash course

(Diagram labels: Geometry, Materials, Left view data, Right view data, L, R)

SLIDE 17

VR SLI

Scaling 1 vs 2 GPUs

- 1 GPU: App Left, App Right, GPU L, GPU R. Time: GPU L + GPU R
- 2 GPUs: App Both, GPU L, GPU R, Copy. Time: GPU + Copy

$\mathrm{Scaling} = \frac{2 \cdot \mathrm{GPU}}{\mathrm{GPU} + \mathrm{Copy}}$

SLIDE 18

VR SLI

Scaling determined by workload and copy time

$\mathrm{Scaling} = \frac{2 \cdot \mathrm{GPU}}{\mathrm{GPU} + \mathrm{Copy}}$

- Typical render resolution for Vive: 1512 x 1680 (per eye)
- Copy time over PCIe (@6GB/s): 1.5ms
- Max scaling with 11ms frame time: $\frac{2 \cdot 9.5\,\mathrm{ms}}{9.5\,\mathrm{ms} + 1.5\,\mathrm{ms}} = 1.72$

(Chart: scaling factor vs. workload in ms; marked point at 9.5ms, x1.72)
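
The scaling formula is simple enough to evaluate directly. A minimal Python sketch; the function names `scaling` and `max_scaling` are illustrative, and `max_scaling` assumes the whole frame budget is split between per-GPU rendering and the copy:

```python
def scaling(gpu_ms, copy_ms):
    """Two-GPU speedup: 2 * GPU / (GPU + Copy)."""
    return 2.0 * gpu_ms / (gpu_ms + copy_ms)


def max_scaling(frame_budget_ms, copy_ms):
    """Best case when the copy eats into a fixed frame budget."""
    gpu_ms = frame_budget_ms - copy_ms
    return scaling(gpu_ms, copy_ms)


# Vive numbers from the slide: 11 ms budget, 1.5 ms copy.
print(f"{max_scaling(11.0, 1.5):.3f}")

# Scaling as a function of per-GPU workload, for a 1.5 ms copy.
for workload_ms in (1.5, 3.0, 6.0, 9.5):
    print(workload_ms, round(scaling(workload_ms, 1.5), 2))
```

With a 1.5ms copy the factor is about 1.73, which the slide shows as 1.72. When the per-GPU workload is as short as the copy itself, the second GPU gives no speedup at all.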

SLIDE 19

VR SLI

Higher resolutions limit scalability

$\mathrm{Scaling} = \frac{2 \cdot \mathrm{GPU}}{\mathrm{GPU} + \mathrm{Copy}}$

- Vive Pro render resolution: 2016 x 2240
- Copy time over PCIe (@6GB/s): 2.8ms
- Max scaling with 11ms frame time: $\frac{2 \cdot 8.2\,\mathrm{ms}}{8.2\,\mathrm{ms} + 2.8\,\mathrm{ms}} = 1.49$

(Chart: scaling factor vs. copy time in ms)

SLIDE 20

VR SLI

Improve scaling using NVLink

- Copy times can hurt scaling with higher resolutions
- NVLink on dual Quadro GP100: 4x faster than PCIe 3.0
- Copy time for Vive Pro (2016 x 2240): 0.7ms
- Max scaling with 11ms frame time: $\frac{2 \cdot 10.3\,\mathrm{ms}}{10.3\,\mathrm{ms} + 0.7\,\mathrm{ms}} = 1.87$
- NVLink is used automatically if present

NVLink speed measured with 2 bridges, copy via OpenGL multicast, single frame of HTC Vive, on HP z840 workstation
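
The copy times quoted for Vive Pro can be approximated from the image size and link bandwidth. The talk does not state the pixel format or the exact bandwidth unit, so this Python sketch assumes one eye image of 4 bytes per pixel and reads "6GB/s" as 6 GiB/s; both are assumptions, as is the helper name `copy_time_ms`:

```python
BYTES_PER_PIXEL = 4               # assumption: 8-bit RGBA color buffer
PCIE_BYTES_PER_S = 6 * 2**30      # assumption: "6GB/s" read as 6 GiB/s
NVLINK_SPEEDUP = 4.0              # from the slide: 4x faster than PCIe 3.0


def copy_time_ms(width, height, bytes_per_s):
    """Time in ms to move one width x height image across the link."""
    return width * height * BYTES_PER_PIXEL / bytes_per_s * 1e3


vive_pro_pcie = copy_time_ms(2016, 2240, PCIE_BYTES_PER_S)
vive_pro_nvlink = copy_time_ms(2016, 2240, PCIE_BYTES_PER_S * NVLINK_SPEEDUP)

print(f"PCIe:   {vive_pro_pcie:.1f} ms")
print(f"NVLink: {vive_pro_nvlink:.1f} ms")
```

Under these assumptions the estimate comes out at about 2.8ms over PCIe and 0.7ms over NVLink, close to the figures in the talk; the same estimate for the original Vive (1512 x 1680) gives roughly 1.6ms against the quoted 1.5ms.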

SLIDE 21

VR SLI

Upcoming HMDs can improve scaling

- Some upcoming HMDs have one display cable per eye
  - SLI system: Plug each cable into one GPU
  - Eliminate inter-GPU copies by presenting on each GPU
  - Needs support from the VR runtime: near future
- Working on 4-GPU configuration
  - Two pairs of GPUs connected via NVLink
  - One Multicast context spanning the configuration
  - Split Frame Rendering (SFR), NVLink copies

HMD image courtesy of Starbreeze

SLIDE 22

OPENGL & VULKAN

SLIDE 23

OPENGL MULTICAST 2

Feedback on Multicast led to new functionality

Multicast:
- Command & data broadcast
- BufferSubData to specific GPU
- CopyImageSubData & CopyBufferSubData GPU-GPU, Framebuffer Blit
- Global barrier & directed sync functions
- GPU Masks
- Per-GPU sample locations
- Per-GPU queries

New functionality:
- Dynamic Multicast toggle (WGL_NV_multigpu_context)
- GPU_ID built-in in GLSL shader
- Per-GPU viewports & scissors
- Texture & Buffer upload mask
- Asynchronous copies

SLIDE 24

OPENGL MULTICAST 2

Dynamic SLI mode

- New extension WGL_NV_multigpu_context:
  - Request SLI mode per context
  - No need to restart application
  - Possible to share resources between contexts

(Diagram labels: Data Context, Geometry, Materials, Left view, Right view, Multicast Context)

SLIDE 25

OPENGL MULTICAST 2

Dynamic SLI mode

- New extension WGL_NV_multigpu_context:
  - Request SLI mode per context
  - No need to restart application
  - Possible to share resources between contexts
- On toggle:
  - Clean up per-GPU resources
  - Keep scene data
- Alternate Frame Rendering (AFR)

(Diagram labels: Data Context, Geometry, Materials, Left view, Right view, Multicast Context, AFR Context, Frame i, Frame i+1)

SLIDE 26

MULTICAST 2

GPU ID built-in

- Multicast v1 required per-GPU uploads
  - Larger code changes in some renderers
- Add shader built-in
  - Upload all views to all GPUs
  - Use per-GPU data in shaders
- Renderer can remain unchanged
  - Just modify shaders instead

(Diagram labels: Geometry, Materials, Left view, Right view; Geometry, Materials, Views)

SLIDE 27

MULTICAST 2

Per-GPU Viewports & Scissors

- Add new function to set viewports and scissors per GPU
- Per-GPU Lens Matched Shading

SLIDE 28

MULTICAST 2

Per-GPU Viewports & Scissors

- Add new function to set viewports and scissors per GPU
- Per-GPU Lens Matched Shading
- Per-GPU Multi Resolution Shading

SLIDE 29

MULTICAST 2

Per-GPU Viewports & Scissors

- Add new function to set viewports and scissors per GPU
- Per-GPU Lens Matched Shading
- Per-GPU Multi Resolution Shading
- Easily set up Split Frame Rendering (SFR)
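
For Split Frame Rendering, each GPU needs its own scissor rectangle covering a share of the frame. The talk does not name the new function or say how the frame should be divided, so the following Python sketch only computes candidate rectangles; the helper `sfr_scissors` and the equal-height horizontal bands are assumptions for illustration:

```python
def sfr_scissors(width, height, num_gpus):
    """Split a frame into horizontal bands, one (x, y, w, h) rect per GPU.

    Integer arithmetic on the band edges guarantees the bands tile the
    frame exactly, even when height is not divisible by num_gpus.
    """
    rects = []
    for gpu in range(num_gpus):
        y0 = height * gpu // num_gpus
        y1 = height * (gpu + 1) // num_gpus
        rects.append((0, y0, width, y1 - y0))
    return rects


for gpu, rect in enumerate(sfr_scissors(2016, 2240, 2)):
    print(f"GPU {gpu}: scissor {rect}")
```

A real renderer would likely size the bands by measured load rather than by area, since scene complexity is rarely uniform across the frame.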

SLIDE 30

MULTICAST 2

Texture & Buffer Upload Mask

- Multicast provides per-GPU buffer uploads
  - Asymmetrical functionality wrt texture upload functions
- Add new mask function to modify texture & buffer uploads
    glUploadGpuMaskNVX( GLbitfield mask );
- Useful for simpler per-GPU texture streaming
  - Conserve PCIe bandwidth

(Diagram labels: L, R)

SLIDE 31

MULTICAST 2

Asynchronous Copies

- Multicast copies stall source GPU while copy takes place
  - Easy to use because of implicit synchronization
- New copy functions do not stall, but also need more synchronization
    glAsyncCopyBufferSubDataNV(…)
    glAsyncCopyImageSubDataNV(…)
- Copy while both GPUs can continue rendering
- Allows for more complex rendering algorithms

(Diagram labels: Copy, GPU L1, GPU L2, GPU R1, GPU R2)

SLIDE 32

MULTICAST 2

Asynchronous Copies: Use case

1. Render shadow maps (SM)
2. Start async copies of SMs to other GPU
3. Render z-prepass per GPU & eye
4. Wait for copy to finish
5. Render output images

(Diagram labels: SM_0.. SM_N, Z Left, Z Right, SM_0.. SM_i, SM_i+1.. SM_N)
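
The benefit of this ordering is that the copy overlaps with the z-prepass instead of adding to the frame time. A simplified Python model of the two schedules; the stage timings are made-up placeholder values, not measurements from the talk, and the model ignores synchronization overhead:

```python
def frame_time_sync(shadow_ms, copy_ms, zprepass_ms, shading_ms):
    """Blocking copy: every stage runs back to back."""
    return shadow_ms + copy_ms + zprepass_ms + shading_ms


def frame_time_async(shadow_ms, copy_ms, zprepass_ms, shading_ms):
    """Async copy: the copy runs alongside the z-prepass, so only the
    longer of the two contributes before final shading can start."""
    return shadow_ms + max(copy_ms, zprepass_ms) + shading_ms


# Hypothetical per-stage timings in ms.
timings = dict(shadow_ms=2.0, copy_ms=1.0, zprepass_ms=1.5, shading_ms=5.0)

sync_ms = frame_time_sync(**timings)
async_ms = frame_time_async(**timings)
print(f"blocking copy: {sync_ms} ms, async copy: {async_ms} ms")
```

In this model the copy is hidden completely as long as it is shorter than the z-prepass; a longer copy would still leave the GPU waiting at step 4.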

SLIDE 33

VR SLI

Vulkan

- Update: VK_KHX_device_group ratified as VK_KHR_device_group with Vulkan 1.1
- Make sure to use the right extension/Vulkan version combination!
- Usage is the same, so migration is painless

(Diagram labels: Render, Display, L, R, Geometry, Materials, Left view, Right view)

SLIDE 34

VR SLI

Recap

- VR SLI covers a wide variety of workloads
- Almost perfect load balancing between left/right eye and two GPUs
- Copy overhead and view independent workloads limit scaling
- NVLink can help improve scaling
- Dual-input HMDs can eliminate copy overhead

(Pipeline diagram with VR SLI annotation: Preprocessing, Geometric Pipeline, Rasterization, Fragment Shader, Postprocessing, Application)

SLIDE 35

NVIDIA VRWORKS

Comprehensive SDK for VR Developers

GRAPHICS | HEADSET | AUDIO | TOUCH & PHYSICS | PROFESSIONAL VIDEO

SLIDE 36

HMD OPTICS

Countering Lens Distortion

(Diagram labels: User's View, Displayed Image, Optics)

SLIDE 37

HMD RENDERING

Oversampling near the borders

(Diagram labels: Displayed Image, Rendered Image)

SLIDE 38

LENS MATCHED SHADING

Four Viewports

(Diagram labels: Original Image, LMS Image)

SLIDE 39

LENS MATCHED SHADING

OpenGL

- In OpenGL via GL_NV_clip_space_w_scaling extension
- Set up four viewports, rendering full resolution
- Set scissors to each quadrant
    glScissorArrayv(0, 4, scissors);
- W scaling parameters
    glViewportPositionWScaleNV(i, Wx, Wy);

(Diagram labels: Viewport 0, Scissor 0)

SLIDE 40

LENS MATCHED SHADING

Vulkan

- In Vulkan via VK_NV_clip_space_w_scaling extension
- Set up four viewports, rendering full resolution
- Set scissors to each quadrant
- VkPipelineViewportWScalingStateCreateInfoNV
- W scaling parameters:
  - Use the viewport struct / set on creation
  - Dynamic state & vkCmdSetViewportWScalingNV

(Diagram labels: Viewport 0, Scissor 0)

SLIDE 41

LENS MATCHED SHADING

Extreme example, Wx = 2.0, Wy = 2.0
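
To see what these coefficients do, it helps to apply the W scaling by hand. As far as I understand the clip_space_w_scaling extensions, the clip-space w of each vertex is replaced by w + Wx * x + Wy * y before the perspective divide. The Python sketch below assumes that formula and looks at one quadrant only; the helper name `lms_ndc` is hypothetical, and the sign of the coefficients in the other three quadrants is not covered here:

```python
def lms_ndc(x, y, w, wx, wy):
    """Apply clip-space W scaling, then the perspective divide.

    Assumed formula: w' = w + wx * x + wy * y.
    Returns the resulting normalized device coordinates (x, y).
    """
    w_scaled = w + wx * x + wy * y
    return x / w_scaled, y / w_scaled


WX, WY = 2.0, 2.0   # the "extreme example" values from the slide

center = lms_ndc(0.0, 0.0, 1.0, WX, WY)   # view center
edge = lms_ndc(0.5, 0.0, 1.0, WX, WY)     # halfway to the right edge
corner = lms_ndc(1.0, 1.0, 1.0, WX, WY)   # outer corner of the quadrant
print(center, edge, corner)
```

The view center stays where it is, while points toward the border are pulled inward more and more strongly. The periphery therefore covers fewer pixels, which is where the oversampling occurs in plain HMD rendering.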

SLIDE 42

GRAPHICS PIPELINE

Lens Matched Shading Results

- LMS can improve performance of Raster / Fragment stage
- Trade-off between quality and performance

(Pipeline diagram with LMS annotation: Preprocessing, Geometric Pipeline, Rasterization, Fragment Shader, Postprocessing, Application)

SLIDE 43

NVIDIA VRWORKS

Comprehensive SDK for VR Developers

GRAPHICS | HEADSET | AUDIO | TOUCH & PHYSICS | PROFESSIONAL VIDEO

SLIDE 44

KHRONOS

Cross-Platform VR Standard

SLIDE 45

OPEN XR

Version 1.0 vision

(Diagram labels: Application; VR run-time A, VR run-time B, VR run-time C; HMD A, HMD B, Mobile HMD)

SLIDE 46

OPEN XR

Long-term vision

(Diagram labels: Application; VR run-time A, VR run-time B, VR run-time C; AR goggles, HMD, CAVE)

SLIDE 47

MORE INFO @ GTC

- More talks @ GTC 2018
  - https://developer.nvidia.com/vrworks-gtc
  - https://developer.nvidia.com/designworks-gtc
- Connect with the Experts
  - CE8141 - VR: GL, DX & VK - Mon, March 26 2-3PM
  - CE8116 - VRWorks - Tue, March 27 4-5PM

SLIDE 48

TRY IT OUT!

..and more information

- NVIDIA VRWorks SDK provides OpenGL, Direct3D & Vulkan samples
  - developer.nvidia.com/vrworks
- More detail in our previous GTC talks:
  - 2017 - S7191 - Vulkan Technology Update
  - 2016 - S6338 - VR Multi GPU Acceleration Featuring Autodesk VRED
  - 2015 - S5668 - VR Direct: How NVIDIA Technology Is Improving The VR Experience

SLIDE 49

THANK YOU!