UPDATES ON PROFESSIONAL VR & TURING VRWORKS Ingo Esser, Robert - - PowerPoint PPT Presentation




slide-1
SLIDE 1

Ingo Esser, Robert Menzel, 3/20/2019

UPDATES ON PROFESSIONAL VR & TURING VRWORKS

slide-2
SLIDE 2 2

AGENDA

Motivation
VR SLI - Multi-GPU Rendering
Multi-View Rendering (new in Turing)
Variable Rate Shading (new in Turing)

slide-3
SLIDE 3 3

MOTIVATION

slide-7
SLIDE 7 7

GRAPHICS PIPELINE

VR Workloads

4K display: 3840 x 2160 at 30 Hz, 249M Pix/s, N vertices

Vive Pro w/ oversampling: 2 x 2000 x 2200 at 90 Hz, 792M Pix/s, 2N vertices

VR workload relative to the 4K display: Application 6x | Driver 6x | Geometric Pipeline 6x | Rasterization / Fragment Shader 3x

Techniques covered in this talk: VR SLI, VRS, MVR
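The workload figures above can be reproduced with a few lines of arithmetic. This is only a sketch: the function name `pixel_rate` is made up for illustration, and the per-eye size of 2000 x 2200 is read off the slide's figure labels.

```python
def pixel_rate(width, height, views, refresh_hz):
    """Pixels that must be shaded per second for a given target."""
    return width * height * views * refresh_hz

# 4K desktop display: one view at 30 Hz.
desktop = pixel_rate(3840, 2160, views=1, refresh_hz=30)

# Vive Pro with oversampling: two views (eyes) at 90 Hz.
vr = pixel_rate(2000, 2200, views=2, refresh_hz=90)

# Pixel load grows by roughly 3x.
pixel_factor = vr / desktop

# Vertex load: 2N vertices at 90 Hz versus N vertices at 30 Hz.
vertex_factor = (2 * 90) / (1 * 30)

print(f"desktop: {desktop / 1e6:.0f} MPix/s, VR: {vr / 1e6:.0f} MPix/s")
print(f"pixel factor: {pixel_factor:.2f}x, vertex factor: {vertex_factor:.0f}x")
```

The vertex factor is what drives the 6x figure for the application, driver and geometry stages, while the pixel factor explains the 3x figure for rasterization and fragment shading.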

slide-8
SLIDE 8 9

HMD RESOLUTIONS

2013 to 2018

Pimax 8K: 7680x2160
Pimax 5K: 5120x1440
Vive Pro: 2880x1600
Rift / Vive: 2160x1200
DK1: 1280x800 (640x800 per eye)
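To relate these panel sizes to the "display resolution per eye" axis used in the scaling charts later in the deck, the per-eye pixel counts can be derived as below. A minimal sketch, assuming every headset splits its total width evenly between two eyes; the dictionary name is illustrative.

```python
# Total panel resolution per headset, as listed on the slide.
hmd_resolutions = {
    "DK1": (1280, 800),
    "Rift / Vive": (2160, 1200),
    "Vive Pro": (2880, 1600),
    "Pimax 5K": (5120, 1440),
    "Pimax 8K": (7680, 2160),
}

# Assumption: the panel width is shared equally by the two eyes.
per_eye_mpix = {
    name: (width // 2) * height / 1e6
    for name, (width, height) in hmd_resolutions.items()
}

for name, mpix in per_eye_mpix.items():
    print(f"{name}: {mpix:.2f} M pixels per eye")
```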

slide-9
SLIDE 9 10

NVIDIA VRWORKS

Comprehensive SDK for VR Developers

GRAPHICS SIMULATION VIDEO PROFESSIONAL HEADSET

slide-10
SLIDE 10 11

VR SLI SCALING & NVLINK

slide-11
SLIDE 11 12

VR SLI

Crash course

[Diagram: Geometry, Materials, Left view data, Right view data; GPUs L and R]

slide-12
SLIDE 12 13

VR SLI

Scaling 1 vs 2 GPUs

[Timeline diagram: App Left / App Right on one GPU versus App Both on GPU L and GPU R with a copy step; 10 ms per frame]

Scaling factor: $f = \frac{2\,(t - c)}{t}$

Frame time: $t = 10\,\text{ms}$

Copy time: $c = \frac{\text{frame size}}{\text{transfer speed}}$

slide-13
SLIDE 13 14

VR SLI

Max scaling determined by copy time

Scaling: $f = \frac{2\,(t - c)}{t}$

Typical render* resolution for Vive: 1512 x 1680 (per eye)
Copy time over PCIe3 (@10GB/s): ~1 ms
Max scaling with 11ms frame time: $\frac{2\,(10\,\text{ms} - 1\,\text{ms})}{10\,\text{ms}} = 1.8$

* Vive HMD runtime requests 1.4² larger resolution than display resolution

[Timeline: GPU L 10 ms; GPU R 9 ms plus copy (C)]

slide-14
SLIDE 14 15

VR SLI

Max scaling determined by copy time

Scaling: $f = \frac{2\,(t - c)}{t}$

Typical render* resolution for Vive Pro: 2016 x 2240 (per eye)
Copy time over PCIe3 (@10GB/s): ~1.6 ms
Max scaling with 11ms frame time: $\frac{2\,(10\,\text{ms} - 1.6\,\text{ms})}{10\,\text{ms}} = 1.68$

* Vive HMD runtime requests larger resolution than display resolution

[Timeline: GPU L 10 ms; GPU R 8.4 ms plus copy (C)]
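The scaling formula from these two slides is easy to check numerically. The sketch below is illustrative only: the function names are invented here, and the copy-time estimate assumes a 4-byte-per-pixel (RGBA8) image for one eye, which the slides do not state.

```python
def max_scaling(frame_ms, copy_ms):
    """Scaling factor f = 2 * (t - c) / t for two GPUs."""
    return 2.0 * (frame_ms - copy_ms) / frame_ms

def copy_time_ms(width, height, bytes_per_pixel=4, gb_per_s=10.0):
    """Copy time c = frame size / transfer speed, in milliseconds.

    bytes_per_pixel=4 is an assumption (RGBA8), not a value from the slides.
    """
    frame_bytes = width * height * bytes_per_pixel
    return frame_bytes / (gb_per_s * 1e9) * 1e3

# Copy times as quoted on the slides, with t = 10 ms.
vive = max_scaling(10.0, 1.0)
vive_pro = max_scaling(10.0, 1.6)

# Estimated copy time for one Vive eye image over a 10 GB/s link.
vive_copy = copy_time_ms(1512, 1680)

print(f"Vive: {vive:.2f}x, Vive Pro: {vive_pro:.2f}x")
print(f"Estimated Vive copy time: {vive_copy:.2f} ms")
```

Under the RGBA8 assumption the Vive estimate lands close to the slide's "~1 ms"; the Vive Pro figure on the slide (1.6 ms) implies a somewhat different effective bandwidth or format, so the quoted copy times are used directly for the scaling numbers.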

slide-15
SLIDE 15 16

VR SLI

Low-res HMDs show screen door effect
HMDs increase resolutions to improve experience
Vive Pro [Eye]: 1.6ms copy time, 1.68x
Pimax 5K Plus: 2.5ms copy time, 1.5x

Higher resolutions limit scalability

[Chart: max scaling vs. display resolution per eye (M pixels) for PCIe3x16; labelled points: Vive 1.8x, Vive Pro 1.68x, Pimax 5K 1.5x]

slide-16
SLIDE 16 17

Copy times can hurt scaling with higher resolutions
Quadro RTX 6000 NVLINK: 50GB/s (100GB/s full duplex)
Quadro RTX 5000 NVLINK: 25GB/s (50GB/s full duplex)
NVLink is used automatically if present
No bandwidth sharing with other traffic
Independent of underlying hardware

VR SLI

Improve scaling using NVLink

slide-17
SLIDE 17 18

VR SLI

NVLINK allows scaling with Hi-Res HMDs

NVLINK outperforms PCIe easily
Pimax 5K Plus: 2560 x 1440
  PCIe3x16: 2.5ms, 1.5x
  NVLINK 50: 0.7ms, 1.86x
Pimax 8K: 3840 x 2160
  PCIe3x16: 6.1ms, 0.79x
  NVLINK 50: 1.7ms, 1.66x

[Chart: max scaling vs. display resolution per eye (M pixels) for PCIe3x16, NVLINK 25 and NVLINK 50; labels: Pimax 8K 1.66x, Vive Pro 1.86x, Pimax 5K 1.5x]
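Turning the scaling formula around gives the copy-time budget for a desired scaling factor, which helps when judging whether a given link is fast enough. A hedged sketch: the helper `copy_budget_ms` and the table name are illustrative, and t = 10 ms is taken from the earlier slides.

```python
def max_scaling(frame_ms, copy_ms):
    """Scaling factor f = 2 * (t - c) / t."""
    return 2.0 * (frame_ms - copy_ms) / frame_ms

def copy_budget_ms(frame_ms, target_scaling):
    """Invert the formula: c = t * (1 - f / 2)."""
    return frame_ms * (1.0 - target_scaling / 2.0)

FRAME_MS = 10.0

# Copy times quoted on the slide.
quoted_copy_ms = {
    "Pimax 5K Plus, PCIe3x16": 2.5,
    "Pimax 5K Plus, NVLINK 50": 0.7,
    "Pimax 8K, NVLINK 50": 1.7,
}

for setup, copy_ms in quoted_copy_ms.items():
    print(f"{setup}: {max_scaling(FRAME_MS, copy_ms):.2f}x")

# How short must the copy be to reach 1.8x at a 10 ms frame time?
budget = copy_budget_ms(FRAME_MS, 1.8)
print(f"Copy budget for 1.8x: {budget:.2f} ms")
```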

slide-18
SLIDE 18 19

NVLINK

Side note: How to NVLINK

NVLINK is transparent: VR SLI automagically uses NVLINK if present
nvidia-smi can print link information:
nvidia-smi nvlink
  -s : Status
  -sc 0bz : Set counter 0
  -r 0 : Reset counter 0
  -g 0 : Get value
Location: $(ProgramFiles)\NVIDIA Corporation\NVSMI
DCH system: $(windir)\system32

slide-19
SLIDE 19 20

NVLINK

NVML API (installed with CUDA SDK) allows querying NVLINK state & topology

Enumerate devices, get PCI info, get number of links
  nvmlDeviceGetCount (&device_count);
  nvmlDeviceGetHandleByIndex (i, &device);
  nvmlDeviceGetPciInfo (device, &pci);
  getUInt (device, NVML_FI_DEV_NVLINK_LINK_COUNT, &numLinks);

Get link state, speed, remote device PCI info (topology information)
  nvmlDeviceGetNvLinkState (device, j, &isActive);
  getUInt (device, NVML_FI_DEV_NVLINK_SPEED_MBPS_L0 + j, &speed);
  nvmlDeviceGetNvLinkRemotePciInfo (device, j, &pci);

Additional API to query link capabilities, error/data counters, etc.

NVML API support

slide-20
SLIDE 20 21

NVLINK

NVAPI is getting comparable functionality

Enumerate devices
  NvAPI_EnumPhysicalGPUs ( NvPhysicalGpuHandle nvGPUHandle[NVAPI_MAX_PHYSICAL_GPUS], NvU32 *pGpuCount );

Get link number, speed, topology
  NvAPI_GPU_NVLINK_GetStatus ( NvPhysicalGpuHandle hPhysicalGpu, NVLINK_GET_STATUS* statusParams );

NVAPI also allows querying capabilities, error / data counters, etc.

NVAPI access - under development

slide-21
SLIDE 21 22

NVLINK

NVIDIA Quadro Control Panel: Workstation -> View System Topology

NVLINK information

NVIDIA Quadro Control Panel - under development

slide-22
SLIDE 22 23

VR SLI OPENGL MULTICAST 2

slide-23
SLIDE 23 24

OPENGL VR SLI: MULTICAST 2

Command & data broadcast
BufferSubData to specific GPU
CopyImageSubData & CopyBufferSubData GPU-GPU Framebuffer Blit
Global barrier & directed sync functions
GPU Masks
Per-GPU sample locations
Per-GPU queries

Feedback on Multicast led to new functionality

Dynamic Multicast toggle (WGL_NV_multigpu_context)
GPU_ID built-in in GLSL shader
Per-GPU viewports & scissors
Texture & Buffer upload mask
Asynchronous copies

slide-25
SLIDE 25 26

New extension WGL_NV_multigpu_context:
Request SLI mode per context
No need to restart application
Possible to share resources between contexts
On toggle:
Clean up per-GPU resources
Keep scene data
Alternate Frame Rendering (AFR)

MULTICAST 2

Dynamic SLI mode

[Diagram: Data Context (Geometry, Materials); Multicast Context (Left view, Right view); AFR Context (Frame i, Frame i+1)]

slide-26
SLIDE 26 27

MULTICAST 2

Multicast v1 required per-GPU uploads
Larger code changes in some renderers
Add shader built-in: gl_DeviceIndex
Upload all views to all GPUs
Use per-GPU data in shaders
Renderer can remain unchanged
Just modify shaders instead

GPU ID built-in: gl_DeviceIndex

[Diagram: Geometry, Materials, Left view, Right view versus Geometry, Materials, Views]

slide-29
SLIDE 29 30

MULTICAST 2

Add new functions to set viewports and scissors per GPU

glMulticastViewportArrayvNVX( ... );
glMulticastScissorArrayvNVX( ... );

Per-GPU Lens Matched Shading
Per-GPU Multi Resolution Shading
Easily set up Split Frame Rendering (SFR)

Per-GPU Viewports & Scissors

slide-30
SLIDE 30 31

MULTICAST 2

Multicast provides per-GPU buffer uploads
Asymmetrical functionality wrt texture upload functions
Add new mask function to modify texture & buffer uploads
  glUploadGpuMaskNVX( GLbitfield mask );
Useful for simpler per-GPU texture streaming
Conserve PCIe bandwidth

Texture & Buffer Upload Mask

L R

slide-31
SLIDE 31 32

Multicast copies stall source GPU while copy takes place
Easy to use because of implicit synchronization
New copy functions do not stall, but also need more synchronization
  glAsyncCopyBufferSubDataNVX( ... );
  glAsyncCopyImageSubDataNVX( ... );
Copy while both GPUs can continue rendering
Allows for more complex rendering algorithms

MULTICAST 2

Asynchronous Copies

Copy GPU L1 GPU L2 GPU R1 GPU R2

slide-32
SLIDE 32 33

MULTICAST 2

Render shadow maps (SM)
Start async copies of SMs to other GPU
Render z-prepass per GPU & eye
Wait for copy to finish
Render output images

Asynchronous Copies - Use case

[Diagram: SM_0..SM_N split into SM_0..SM_i and SM_i+1..SM_N across the two GPUs, followed by Z Left and Z Right]

slide-33
SLIDE 33 34

VR SLI + QUADRO SYNC

slide-34
SLIDE 34 35

QUADRO SYNC + VR SLI

Support for new hardware configurations

Use case CAVE systems
Each node generates L / R image
Scan out through Quad Buffered Stereo
Perfect for VR SLI
VR SLI + Quadro Sync + Quad Buffered Stereo supported with 418.81 and newer

[Diagram: nodes connected via QSync, each with a stereo output]

slide-35
SLIDE 35 36

QUADRO SYNC + VR SLI + QBS

Synthetic Speed-Of-Light Benchmark

Frame time for 0..800 M triangles
Render stereo, compare VR SLI on/off
System performance nearly doubles:
  16ms: 240M vs 125M triangles
  32ms: 495M vs 250M triangles
Stereo: Rendering scene twice per frame
2x Quadro RTX6000 + NVLINK: 480M triangles in 16ms

[Chart: frame time (ms, 16.66 to 133.28) vs. scene size (100 to 800 M triangles), SLI OFF vs. SLI ON]
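The "nearly doubles" claim can be quantified from the quoted triangle counts. A small sketch; the variable names are illustrative and the numbers are the ones on the slide.

```python
# Frame budget in ms -> (scene size with VR SLI on, with VR SLI off), in M triangles.
measurements = {
    16: (240, 125),
    32: (495, 250),
}

# Speedup at equal frame time is the ratio of scene sizes that fit the budget.
speedups = {budget: on / off for budget, (on, off) in measurements.items()}

for budget, factor in speedups.items():
    print(f"{budget} ms budget: {factor:.2f}x")

# Stereo renders the scene twice, so a 240M-triangle scene
# corresponds to 480M triangles processed per frame.
triangles_per_frame = 2 * measurements[16][0]
print(f"Triangles processed in 16 ms: {triangles_per_frame}M")
```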

slide-36
SLIDE 36 37

VR SLI VULKAN DEVICE GROUPS

slide-37
SLIDE 37 38

VR SLI

Vulkan provides VR SLI through the VK_KHR_device_group extension
Similar per-GPU functionality
  Uploads
  Render commands
  GPU-GPU transfers

Vulkan - subsetAllocation

Geometry Materials Left view data Right view data

slide-38
SLIDE 38 39

VR SLI

Vulkan provides VR SLI through the VK_KHR_device_group extension
Similar per-GPU functionality
  Uploads
  Render commands
  GPU-GPU transfers
Upcoming support: Per-GPU memory allocations

Vulkan - subsetAllocation

Geometries 0 Materials 0 Geometries 1 Materials 1

slide-39
SLIDE 39 40

VR SLI

VR SLI covers a wide variety of workloads
Almost perfect load balancing between left/right eye and two GPUs
Copy overhead and view independent workloads limit scaling
NVLink can help improve scaling
OpenGL: GL_NV_gpu_multicast / GL_NVX_gpu_multicast2
Vulkan: VK_KHR_device_group (core in VK 1.1)
DX11: NVAPI

Recap

[Pipeline diagram labelled VR SLI: Application, Driver, Geometric Pipeline, Rasterization, Fragment Shader]

slide-40
SLIDE 40 41

MULTI-VIEW RENDERING

slide-41
SLIDE 41 42

TWO PASS STEREO RENDERING

2 Full Geometry Passes

Left Eye (Pass 1) Right Eye (Pass 2)

slide-42
SLIDE 42 43

TWO PASS RENDERING

Workload in all steps of the pipeline doubles. Getting CPU bound fast, especially in CAD!

Mono to Stereo

Application 2x | Driver 2x | Geometric Pipeline 2x | Rasterization / Fragment Shader 2x

slide-43
SLIDE 43 44

SINGLE-PASS-STEREO

1 Pass on Pascal

Left Eye Right Eye

slide-44
SLIDE 44 45

SINGLE-PASS-STEREO

Cut CPU time in half
Cut VTG processing (nearly) in half
No change in raster & shading
DX: NVAPI
Vulkan: VK_KHR_multiview (core in VK 1.1) & VK_NVX_multiview_per_view_attributes
OpenGL: GL_NV_stereo_view_rendering

Mono to Stereo

Application 1x | Driver 1x | Geometric Pipeline ~1x | Rasterization / Fragment Shader 2x

slide-45
SLIDE 45 46

SINGLE-PASS-STEREO

Limitations

Two views only

Application 1x | Driver 1x | Geometric Pipeline ~1x | Rasterization / Fragment Shader 2x

[Figure: one display per eye (✓) versus 2 displays per eye]

slide-46
SLIDE 46 47



SINGLE-PASS-STEREO

Limitations

Two views only
Only change X-coordinate

Application 1x | Driver 1x | Geometric Pipeline ~1x | Rasterization / Fragment Shader 2x

[Figure: side-by-side displays (✓) versus canted displays (wide FoV)]

slide-47
SLIDE 47 48

MULTI-VIEW RENDERING

Next Generation Single-Pass-Stereo

Left Eye Right Eye

slide-48
SLIDE 48 49

MULTI-VIEW RENDERING

Up to 4 arbitrary views in hardware.
Up to 32 arbitrary views in software.

Turing

Application 1x | Driver 1x | Geometric Pipeline 1x | Rasterization / Fragment Shader 2x

slide-49
SLIDE 49 50

MULTI-VIEW RENDERING

Up to 32 arbitrary views in software.
Still significant reduction in CPU overhead.
Reduces number of code paths.

Pre-Turing

Application 1x | Driver >1x | Geometric Pipeline 2x | Rasterization / Fragment Shader 2x

slide-50
SLIDE 50 51

MULTI-VIEW RENDERING

DX11: NVAPI
DX12: via View Instancing
Vulkan: VK_KHR_multiview (core in VK 1.1)
OpenGL: GL_OVR_multiview & GL_OVR_multiview2

APIs

Application 1x | Driver >1x | Geometric Pipeline 2x | Rasterization / Fragment Shader 2x

slide-51
SLIDE 51 52

MULTI-VIEW RENDERING

Non-VR Use-cases

Multiple Shadow Maps in one pass (multiple light sources, cascaded shadow maps etc.)

slide-52
SLIDE 52 53

MULTI-VIEW RENDERING

Render to multiple layers (just like Single-Pass-Stereo)
Provide data for all views to Vertex Shader
Handle view dependent operations via new built-in gl_ViewID_OVR
Minimize number of varyings dependent on gl_ViewID_OVR!

Example: OpenGL

slide-53
SLIDE 53 54

MULTI-VIEW RENDERING

mat4 modelViewProjection = viewProjMatrix[gl_ViewID_OVR] * model;
gl_Position = modelViewProjection * vertexPos;

Example: OpenGL

slide-54
SLIDE 54 55

MULTI-VIEW RENDERING

mat4 modelViewProjection = viewProjMatrix[0] * model;
gl_Position = modelViewProjection * vertexPos;
if (gl_ViewID_OVR == 1) {
    mat4 modelViewProjection2 = viewProjMatrix[1] * model;
    vec4 pos = modelViewProjection2 * vertexPos;
    gl_Position.x = pos.x; // hint that only X depends on the viewID to mimic SPS
}

Example: OpenGL
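The shader variant above relies on the fact that, for two eyes that differ only by a horizontal offset, only the clip-space X coordinate changes between views. The following CPU-side sketch illustrates that property with plain Python matrices. All values (the half-IPD offset, field of view, test vertex) and helper names are hypothetical and chosen only for the demonstration; it is not the shader's actual data path.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as row-major nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def mat_vec(m, v):
    """Multiply a 4x4 matrix with a 4-component column vector."""
    return [sum(m[r][k] * v[k] for k in range(4)) for r in range(4)]

def perspective(fov_y_deg, aspect, near, far):
    """Standard OpenGL-style symmetric perspective projection."""
    f = 1.0 / math.tan(math.radians(fov_y_deg) / 2.0)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]

def translate_x(tx):
    """View matrix that shifts the scene along X only."""
    return [
        [1.0, 0.0, 0.0, tx],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

HALF_IPD = 0.032  # hypothetical half eye distance in meters

proj = perspective(90.0, 1.0, 0.1, 100.0)
# The left eye sits at -HALF_IPD, so its view matrix shifts the scene by +HALF_IPD.
view_proj = [
    mat_mul(proj, translate_x(+HALF_IPD)),  # view 0: left eye
    mat_mul(proj, translate_x(-HALF_IPD)),  # view 1: right eye
]

vertex = [0.3, -0.2, -2.0, 1.0]
clip_left = mat_vec(view_proj[0], vertex)
clip_right = mat_vec(view_proj[1], vertex)

print("left :", clip_left)
print("right:", clip_right)
```

In this setup Y, Z and W of the two clip-space positions match and only X differs, which is the condition the `gl_Position.x = pos.x` pattern hints at. Canted displays break this assumption because their view matrices also differ by a rotation.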

slide-55
SLIDE 55 56

MULTI-VIEW RENDERING

Mesh Shaders can be used with Multi-View Rendering!
But: not implicitly like Vertex/Tessellation/Geometry Shaders
but explicitly in the Mesh Shader
max 4 views

Turing Mesh Shaders

slide-56
SLIDE 56 57

MULTI-VIEW RENDERING

Turing Mesh Shaders

Mesh Shader:

out gl_MeshPerVertexNV {
    vec4 gl_Position;
} gl_MeshVerticesNV[];
...
gl_MeshVerticesNV[i].gl_Position = MVP * vertex;

Mesh Shader with explicit Multi-View Rendering:

out gl_MeshPerVertexNV {
    perviewNV vec4 gl_PositionPerViewNV[];
} gl_MeshVerticesNV[];
...
gl_MeshVerticesNV[i].gl_PositionPerViewNV[ v ] = MVP[ v ] * vertex;

slide-57
SLIDE 57 58

MULTI-VIEW RENDERING

Only apply to OpenGL! (Limitations come from GL_OVR_multiview/2)
No multisampling
No Geometry Shader
No Tessellation Shader
We're working on it!

Limitations

[Pipeline diagram labelled MVR: Application, Driver, Geometric Pipeline, Rasterization, Fragment Shader]

slide-58
SLIDE 58 59

MULTI-VIEW RENDERING

Reduces geometric load and CPU overhead
More flexible than SPS
Software fallback for pre-Turing GPUs
Performance boost depends on number of view dependent attributes
DX11: NVAPI | DX12: via View Instancing
Vulkan: VK_KHR_multiview (core in VK 1.1)
OpenGL: GL_OVR_multiview & GL_OVR_multiview2

Recap

[Pipeline diagram labelled MVR: Application, Driver, Geometric Pipeline, Rasterization, Fragment Shader]

slide-59
SLIDE 59 60

VARIABLE RATE SHADING

slide-60
SLIDE 60 61

VARIABLE RATE SHADING

Motivation

[Figure: warped image with High Resolution, Medium Resolution and Low Resolution regions]

Due to the lens distortion the image is warped before sending it to the HMD.
Good opportunity to save unnecessary rendering work.

slide-61
SLIDE 61 62

RECAP: MAXWELL

Multi-Resolution Shading

9 Viewports
9 areas in which the resolution is constant

[Figure legend: High Resolution, Low Resolution]

slide-62
SLIDE 62 63

RECAP: PASCAL

Lens Matched Shading

4 Viewports
4 areas in which the resolution gets reduced towards the corners

[Figure legend: High Resolution, Low Resolution]

slide-63
SLIDE 63 64

NEW: TURING

Variable Rate Shading

1 Viewport
Many small areas in which the shading rate is constant

[Figure legend: High Resolution, Medium Resolution, Low Resolution]

slide-65
SLIDE 65 66

COMPARING MRS, LMS, VRS

From our DX11 VRWorks Samples

MRS: Density 0.25
LMS: Coefficient 2.0
VRS: Shading Rate 4x4

slide-66
SLIDE 66 67

VARIABLE RATE SHADING

Pixel

Rasterization

slide-67
SLIDE 67 68

VARIABLE RATE SHADING

Pixel Sampling position

Rasterization

slide-68
SLIDE 68 69

VARIABLE RATE SHADING

Legend: Pixel, Sampling position

Pixels: 40
Samples covered: 40
F. Shader invocations*: 40

* (not counting helper threads)

Rasterization

slide-69
SLIDE 69 70

VARIABLE RATE SHADING

Legend: Pixel, Sampling position

Pixels: 44
Samples covered: 69
F. Shader invocations: 44

Multi Sampling Rasterization

slide-70
SLIDE 70 71

VARIABLE RATE SHADING

Pixels: 44
Samples covered: 69
F. Shader invocations: 44

Shading result stored for one sampling position
Shading result stored for two sampling positions

Multi Sampling Rasterization

slide-71
SLIDE 71 72

VARIABLE RATE SHADING

Pixel Sampling position Fragment Shader Invocation

slide-72
SLIDE 72 73

VARIABLE RATE SHADING

slide-73
SLIDE 73 74

VARIABLE RATE SHADING

Shading result stored for one pixel
Shading result stored for two pixels
Shading result stored for four pixels

slide-74
SLIDE 74 75

VARIABLE RATE SHADING

Pixels: 40
Samples covered: 40
F. Shader invocations: 14

slide-75
SLIDE 75 76

VARIABLE RATE SHADING

1x1 Shading Rate

Pixels: 477
Samples covered: 477
F. Shader invocations: 477

slide-76
SLIDE 76 77

VARIABLE RATE SHADING

2x2 Shading Rate

Pixels: 477
Samples covered: 477
F. Shader invocations: 128

slide-77
SLIDE 77 78

VARIABLE RATE SHADING

4x4 Shading Rate

Pixels: 477
Samples covered: 477
F. Shader invocations: 42
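The invocation counts on the last three slides can be compared against the ideal reduction for each shading rate. A short sketch using only the slide's numbers; the variable names are illustrative.

```python
PIXELS_COVERED = 477

# Fragment shader invocations per shading rate, as quoted on the slides.
invocations = {"1x1": 477, "2x2": 128, "4x4": 42}

savings = {}
for rate, count in invocations.items():
    width, height = (int(n) for n in rate.split("x"))
    ideal = width * height               # best case: one invocation per full coarse pixel
    actual = PIXELS_COVERED / count      # measured reduction for this triangle
    savings[rate] = actual
    print(f"{rate}: {actual:.2f}x fewer invocations (ideal {ideal}x)")
```

The measured reduction stays below the ideal factor, presumably because coarse pixels that are only partially covered along the triangle edges still cost a full invocation. This is consistent with the recap's remark that the performance boost depends on triangle size.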

slide-78
SLIDE 78 79

VARIABLE RATE SHADING

slide-80
SLIDE 80 81

VARIABLE RATE SHADING

[Figure: framebuffer regions labelled with their shading rates: 1x1, 1x1, 4x4, 2x2, 2x2, 2x2]

Shading Rate Lookup

slide-81
SLIDE 81 82

VARIABLE RATE SHADING

Shading Rate Lookup

Framebuffer -> Shading Rate Image -> Palette -> Shading Rate

[Figure: shading rate image texels hold palette indices (e.g. 1, 2, 2, 2); palette entries shown: 1x1, 4x4, 2x2 and 2x4 Shading Rate]

slide-82
SLIDE 82 83

VARIABLE RATE SHADING

Shading Rate Lookup

Framebuffer -> Shading Rate Image (8 bit integer) -> Palette (16 entries) -> Shading Rate

[Figure: shading rate image texels hold palette indices (e.g. 1, 2, 2, 2); palette entries shown: 1x1, 4x4, 2x2 and 2x4 Shading Rate]

slide-83
SLIDE 83 84

VARIABLE RATE SHADING

Shading Rate Lookup

Layered Framebuffer -> Shading Rate Image Array (8 bit integer) -> Per Viewport Palette (16 entries) -> Shading Rate

[Figure: one palette per viewport, each with entries 1x1, 4x4, 2x2 and 2x4 Shading Rate]
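The two-step lookup (shading rate image texel, then palette entry) can be sketched on the CPU as follows. Everything here is illustrative: the palette contents, the image contents and the 16-pixel texel size are assumptions for the example, not values taken from the slides.

```python
# Assumption: each shading rate image texel covers a 16x16 pixel block.
TEXEL_SIZE = 16

# Hypothetical palette with 16 entries: index -> coarse pixel size (width, height).
palette = [(1, 1)] * 16
palette[1] = (2, 2)
palette[2] = (2, 4)
palette[3] = (4, 4)

# Hypothetical shading rate image: one 8-bit palette index per texel.
shading_rate_image = [
    [3, 2, 2, 3],
    [1, 0, 0, 1],
    [3, 2, 2, 3],
]

def shading_rate_at(x, y):
    """Return the coarse pixel size used for framebuffer pixel (x, y)."""
    index = shading_rate_image[y // TEXEL_SIZE][x // TEXEL_SIZE]
    return palette[index]

print(shading_rate_at(0, 0))    # corner texel
print(shading_rate_at(20, 20))  # texel near the center
```

Because the image only stores small indices, changing what an index means (for example, switching quality presets) only requires updating the palette, not rewriting the image. With per-viewport palettes the same image can map to different rates per viewport.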

slide-84
SLIDE 84 85

VARIABLE RATE SHADING

Shading Modes:

GL_SHADING_RATE_NO_INVOCATIONS_NV
GL_SHADING_RATE_1_INVOCATION_PER_PIXEL_NV
GL_SHADING_RATE_1_INVOCATION_PER_1X2_PIXELS_NV
GL_SHADING_RATE_1_INVOCATION_PER_2X1_PIXELS_NV
GL_SHADING_RATE_1_INVOCATION_PER_2X2_PIXELS_NV
GL_SHADING_RATE_1_INVOCATION_PER_2X4_PIXELS_NV
GL_SHADING_RATE_1_INVOCATION_PER_4X2_PIXELS_NV
GL_SHADING_RATE_1_INVOCATION_PER_4X4_PIXELS_NV

slide-85
SLIDE 85 86

VARIABLE RATE SHADING

Foveated Rendering

slide-86
SLIDE 86 87

VARIABLE RATE SHADING

Foveated Rendering

Foveation pattern in Shading Rate Image
For layered rendering (e.g. Multi-View Rendering): Use texture array for SRI
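A foveation pattern is just a shading rate image whose palette indices grow coarser with distance from the gaze point. The sketch below builds such an image on the CPU. The function name, the radii and the meaning of the indices (0 = finest, 2 = coarsest) are hypothetical choices for illustration.

```python
import math

def foveated_sri(width, height, gaze_x, gaze_y, inner=1.5, outer=3.0):
    """Build a shading rate image (palette indices) centered on a gaze texel.

    Index 0 is meant for the finest rate, 1 for a medium rate and 2 for the
    coarsest rate; the actual rates would come from the palette.
    """
    image = []
    for y in range(height):
        row = []
        for x in range(width):
            distance = math.hypot(x - gaze_x, y - gaze_y)
            if distance <= inner:
                row.append(0)
            elif distance <= outer:
                row.append(1)
            else:
                row.append(2)
        image.append(row)
    return image

sri = foveated_sri(8, 6, gaze_x=4, gaze_y=3)
for row in sri:
    print(" ".join(str(index) for index in row))
```

With eye tracking, the image would be regenerated (or re-centered) whenever the gaze point moves; for multi-view rendering one such image per layer would go into the texture array mentioned above.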

slide-87
SLIDE 87 88

VARIABLE RATE SHADING

Foveated Rendering

Lens Matched With Eye Tracking

slide-88
SLIDE 88 89

VARIABLE RATE SHADING

Content Adaptive Shading Rate

slide-89
SLIDE 89 90

VARIABLE RATE SHADING

Content Adaptive Shading Rate

Two Viewports:
  Both span full framebuffer
  Each has own Shading Rate Palette
  Select matching viewport in VTG Shader

slide-90
SLIDE 90 91

VARIABLE RATE SHADING

Content Adaptive Shading Rate

Legend: Cold → Finer Shading, Hot → Coarse Shading

Content-adaptive Super Sampling for Text

slide-91
SLIDE 91 92

VARIABLE RATE SHADING

So far: reduced shading rate
Also possible: increase shading rate (where needed)

Increased Shading Rate

slide-92
SLIDE 92 93

VARIABLE RATE SHADING

Shading Modes: Multi-Sample Framebuffers

GL_SHADING_RATE_2_INVOCATIONS_PER_PIXEL_NV
GL_SHADING_RATE_4_INVOCATIONS_PER_PIXEL_NV
GL_SHADING_RATE_8_INVOCATIONS_PER_PIXEL_NV

slide-93
SLIDE 93 94

VARIABLE RATE SHADING

Idea:
Render to a MSAA buffer
1x shading for most of the scene (regular MSAA)
GL_SHADING_RATE_X_INVOCATIONS_PER_PIXEL_NV for important objects or materials ( X: 2,4,8 )

Increased Shading Rate

(OpenGL) Sample from VRWorks

slide-94
SLIDE 94 95

VARIABLE RATE SHADING

Increased Shading Rate: Animated Material

VRS

MSAA

slide-95
SLIDE 95 96

VARIABLE RATE SHADING

Increased Shading Rate: Procedural Material

(OpenGL) Sample from VRWorks

slide-96
SLIDE 96 97

VARIABLE RATE SHADING

Increased Shading Rate: Procedural Material

VRS

MSAA

slide-97
SLIDE 97 98

VARIABLE RATE SHADING

Edge quality: MSAA
Shading quality: MSAA or like Super-Sampling (depending on requirement)
Performance: Adjustable between MSAA and Super-Sampling

Increased Shading Rate

slide-98
SLIDE 98 99

VARIABLE RATE SHADING

Varying Extrapolation

slide-99
SLIDE 99 100

VARIABLE RATE SHADING

Varying Extrapolation

slide-100
SLIDE 100 101

VARIABLE RATE SHADING

Varying Extrapolation

slide-101
SLIDE 101 102

VARIABLE RATE SHADING

Varying Extrapolation

Varyings are interpolated in the Pixel center

slide-102
SLIDE 102 103

VARIABLE RATE SHADING

Varying Extrapolation

which means extrapolation for some (but just a small amount)

slide-103
SLIDE 103 104

VARIABLE RATE SHADING

Varying Extrapolation

unless they are defined as centroid

slide-104
SLIDE 104 105

VARIABLE RATE SHADING

Varying Extrapolation

slide-105
SLIDE 105 106

VARIABLE RATE SHADING

Varying Extrapolation

Varyings are interpolated in the coarse pixel center
Significantly more extrapolation compared to MSAA: Use centroid to avoid artifacts!

slide-107
SLIDE 107 108

VARIABLE RATE SHADING

Reduces Fragment load
Allows tailoring the workload to needs
Fine-grained control over shading rate
Performance boost depends on shading complexity and triangle size
DX11: NVAPI
Vulkan: VK_NV_shading_rate_image
OpenGL: GL_NV_shading_rate_image

Recap

[Pipeline diagram labelled VRS: Application, Driver, Geometric Pipeline, Rasterization, Fragment Shader]

slide-108
SLIDE 108 109

VARIABLE RATE SHADING

Recap

Foveated Rendering
Content Adaptive Shading
Lens Optimized Shading

slide-109
SLIDE 109 110

VR VILLAGE

Explore the VR Village to get hands-on with the latest advances in virtual reality

VR THEATER

Go to the VR Theater to see and experience narrated VR demos built by our partners

VR PARTNERS

Explore a great lineup of VR partners around the VR Village showcasing their groundbreaking technology

COME EXPLORE ALL THINGS VR AT GTC 2019

VR VILLAGE HOURS

Wednesday: 12:00pm - 7:00pm
Thursday: 11:00am - 2:00pm

See More VR on the Exhibition Floor
Expo Hall 3, Concourse Level

slide-110
SLIDE 110 111

TRY IT OUT!

NVIDIA VRWorks SDK provides OpenGL, Direct3D & Vulkan samples
developer.nvidia.com/vrworks
Upcoming Zerolight VR talk discussing MVR, VRS and VR SLI
S9209 - Advances in Real-Time Automotive Visualisation - Thu, 11:00 - 11:50, Room 230A
More detail in our previous GTC talks:

2018 - S8695 - NVIDIA VR Update
2017 - S7191 - Vulkan Technology Update
2016 - S6338 - VR Multi GPU Acceleration Featuring Autodesk VRED
2015 - S5668 - VR Direct: How NVIDIA Technology Is Improving The VR Experience

..and more information
