SLIDE 1

gameworks.nvidia.com | GDC 2015

VR Direct: How NVIDIA Technology Is Improving the VR Experience

Nathan Reed, Developer Technology Engineer, NVIDIA
Dario Sancho, Lead Programmer, Crytek

SLIDE 2

Who We Are

Nathan Reed
  • NVIDIA DevTech, 2 yrs
  • Previously: game graphics programmer at Sucker Punch

Dario Sancho
  • Crytek, 2 ½ yrs
  • Previously: academia, system and platform programming

SLIDE 3

Hard Problems of VR

  • Headset design
  • Input
  • Rendering performance
  • Experience design

SLIDE 4

Latency

Scott W. Vincent

Motion to photons in ≤ 20 ms

Franklin Heijnen

SLIDE 5

Stereo Rendering

Two eyes, same scene

SLIDE 6

What Is VR Direct?

Various NV hardware & software technologies
Targeted at VR rendering performance
  • Reduce latency
  • Accelerate stereo rendering

SLIDE 7

VR Direct Components

In This Talk
  • Asynchronous Timewarp
  • VR SLI

SLIDE 8

Latency

  • Frame Queuing
  • Timewarp
  • Late-Latching Constants
  • Asynchronous Timewarp

SLIDE 9

Frame Queuing

[Timeline diagram. Rows: CPU, queue, GPU, Scanout; horizontal axis: Time. Frames N−1, N and N+1 pass through each row in turn.]

SLIDE 10

Frame Queuing

[Timeline diagram. Rows: CPU, GPU, Scanout; horizontal axis: Time. Frames N−1, N and N+1 pass through each row in turn, with no queue row.]
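
The cost of queuing can be put into rough numbers. The sketch below assumes a deliberately simple model that is not from the talk: every queued frame waits exactly one refresh interval. The function names are illustrative only.

```cpp
#include <cassert>

// Duration of one refresh interval in milliseconds.
double frame_interval_ms(double refresh_hz) {
    assert(refresh_hz > 0.0);
    return 1000.0 / refresh_hz;
}

// Extra latency contributed by frames waiting between CPU and GPU, under the
// simplifying assumption that each queued frame waits one refresh interval.
double queue_latency_ms(int queued_frames, double refresh_hz) {
    assert(queued_frames >= 0);
    return queued_frames * frame_interval_ms(refresh_hz);
}
```

Under this model, two queued frames at 90 Hz already add about 22 ms, which is more than the 20 ms motion-to-photon target on its own.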

SLIDE 11

Timewarp

SLIDE 12

Timewarp Pros & Cons

Very effective at reducing latency...of rotation!
  • Fortunately, that’s the most important
Doesn’t help translation!
Doesn’t help other input latency
Doesn’t help if vsync is missed
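
To make the "rotation only" point concrete, here is a minimal sketch of the correction a rotational timewarp works from: the rotation between the head orientation used for rendering and the one sampled just before scanout. The `Quat` type and all function names are hypothetical, and the image resampling a real timewarp pass performs on the GPU is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Unit quaternion describing a head orientation (hypothetical minimal type).
struct Quat { double w, x, y, z; };

Quat conjugate(const Quat& q) { return {q.w, -q.x, -q.y, -q.z}; }

// Hamilton product a * b.
Quat multiply(const Quat& a, const Quat& b) {
    return {a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,
            a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
            a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
            a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w};
}

// Orientation after turning the head by `radians` about the vertical (Y) axis.
Quat yaw(double radians) {
    return {std::cos(radians / 2.0), 0.0, std::sin(radians / 2.0), 0.0};
}

// Rotation that takes the orientation the frame was rendered with to the
// orientation sampled just before scanout. Only rotation is captured here,
// so head translation stays uncorrected.
Quat timewarp_delta(const Quat& rendered, const Quat& latest) {
    return multiply(latest, conjugate(rendered));
}

// Angle of a unit quaternion's rotation, in radians.
double rotation_angle(const Quat& q) {
    double norm = std::sqrt(q.w * q.w + q.x * q.x + q.y * q.y + q.z * q.z);
    assert(std::fabs(norm - 1.0) < 1e-6);  // expects a unit quaternion
    return 2.0 * std::acos(std::min(1.0, std::fabs(q.w)));
}
```

If the head has not moved since rendering, the delta is the identity and the image is shown unchanged. Because the delta carries no positional information, this kind of correction cannot help with translation.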

SLIDE 13

Asynchronous Timewarp

[Timeline diagram. Rows: CPU, GPU, Scanout, Timewarp; horizontal axis: Time. Frames N−1, N and N+1 move through the rows, and Vsync is marked twice.]

SLIDE 14

Space Vs Time

[Diagram of the GPU. Axes: Time and GPU Resources (Space).]

SLIDE 15

Space-Multiplexing

[Diagram. Axes: Time and GPU Resources (Space). Regions: Main Rendering and Timewarp. Vsync is marked twice.]

SLIDE 16

Time-Multiplexing

[Diagram. Axes: Time and GPU Resources (Space). Regions: Main Rendering and two Timewarp blocks. Vsync is marked twice.]

SLIDE 17

High-Priority Context

NV driver supports high-priority graphics context
  • Time-multiplexed: takes over entire GPU
Main rendering → normal context
Timewarp rendering → high-pri context

SLIDE 18

Async Timewarp With High-Pri Context

[Timeline diagram. Rows: Render thread, GPU, Warp thread; horizontal axis: Time. Frames N and N+1 are shown, with a Preempt marked at each of the two Vsyncs.]

SLIDE 19

Preemption

Fermi, Kepler, Maxwell: draw-level preemption
  • Can only switch at draw call boundaries!
  • Long draw will delay context switch
Future GPU: finer-grained preemption
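
The effect of draw-level preemption can be illustrated with a small model. The sketch below assumes draws run back to back from time 0 and that a switch request is honored as soon as the draw in flight finishes. The function name and the model are illustrative, not a description of the driver.

```cpp
#include <cassert>
#include <vector>

// With draw-level preemption, a context switch requested at `request_ms`
// can only happen when the draw executing at that moment finishes.
// `draw_durations_ms` lists back-to-back draw calls starting at time 0.
// Returns how long the high-priority work has to wait, in milliseconds.
double preemption_delay_ms(const std::vector<double>& draw_durations_ms,
                           double request_ms) {
    assert(request_ms >= 0.0);
    double draw_end = 0.0;
    for (double duration : draw_durations_ms) {
        draw_end += duration;
        if (request_ms <= draw_end) {
            return draw_end - request_ms;  // wait for the current draw to end
        }
    }
    return 0.0;  // all draws were finished before the request arrived
}
```

With draws of 0.5, 4.0 and 0.5 ms, a request at 1.0 ms waits 3.5 ms. If the 4.0 ms draw is split into four 1.0 ms draws, the same request waits 0.5 ms.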

SLIDE 20

Direct3D High-Priority Context

NvAPI_D3D1x_HintCreateLowLatencyDevice()
  • Applies to next D3D device created
Fermi, Kepler, Maxwell / Windows Vista+
NDA developer driver available now

SLIDE 21

OpenGL High-Priority Context

EGL_IMG_context_priority
  • Adds priority attribute to eglCreateContext
Available on Tegra K1, X1
  • Including SHIELD console
Only for EGL (Android) at present
  • WGL (Windows), GLX (Linux) to come
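
As a sketch of how the priority attribute might be passed, the helper below builds the attribute list that would go to `eglCreateContext`. The token values are those published for EGL and EGL_IMG_context_priority, and they are redefined locally only so the example compiles without the EGL headers. The `ContextRole` enum and `make_context_attribs` are hypothetical names, and the priority is a hint that an implementation may not honor.

```cpp
#include <cassert>
#include <vector>

// Token values as published in the EGL registry; redefined here only so the
// sketch compiles without <EGL/egl.h> and <EGL/eglext.h>.
constexpr int kEglNone = 0x3038;
constexpr int kEglContextClientVersion = 0x3098;
constexpr int kEglContextPriorityLevelImg = 0x3100;
constexpr int kEglContextPriorityHighImg = 0x3101;
constexpr int kEglContextPriorityMediumImg = 0x3102;

// Hypothetical roles matching the split described in the talk.
enum class ContextRole { MainRendering, Timewarp };

// Builds the attribute list for eglCreateContext: the timewarp context asks
// for high priority, the main rendering context for the default (medium).
std::vector<int> make_context_attribs(ContextRole role, int gles_version) {
    assert(gles_version >= 2);
    int priority = (role == ContextRole::Timewarp)
                       ? kEglContextPriorityHighImg
                       : kEglContextPriorityMediumImg;
    return {kEglContextClientVersion, gles_version,
            kEglContextPriorityLevelImg, priority,
            kEglNone};
}
```

In a real application the returned list would be passed as the last argument of `eglCreateContext`, after checking that the extension string advertises EGL_IMG_context_priority.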

SLIDE 22

Developer Guidance

Still try to render at headset native framerate!
Async timewarp is a safety net
  • Hide occasional hitches / perf drops
  • Not for upsampling framerate

SLIDE 23

Developer Guidance

Avoid long draw calls
  • Current GPUs only preempt at draw call boundaries
  • Async timewarp can get stuck behind long draws
Split up draws that take >1 ms or so
  • E.g. heavy postprocessing
  • Split into screen-space tiles
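
One way to split a full-screen pass is to issue it once per tile, for example with a scissor rectangle per draw. The sketch below only computes the tile rectangles. `Rect` and `split_into_tiles` are hypothetical names, and the tile size that keeps each draw under about 1 ms has to be found by measurement.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Screen-space rectangle in pixels (hypothetical helper type).
struct Rect { int x, y, width, height; };

// Covers a screen_w x screen_h target with tiles of at most tile_w x tile_h
// pixels. Tiles on the right and bottom edges are clipped to the screen.
std::vector<Rect> split_into_tiles(int screen_w, int screen_h,
                                   int tile_w, int tile_h) {
    assert(screen_w > 0 && screen_h > 0 && tile_w > 0 && tile_h > 0);
    std::vector<Rect> tiles;
    for (int y = 0; y < screen_h; y += tile_h) {
        for (int x = 0; x < screen_w; x += tile_w) {
            tiles.push_back({x, y,
                             std::min(tile_w, screen_w - x),
                             std::min(tile_h, screen_h - y)});
        }
    }
    return tiles;
}
```

Each returned rectangle would become one draw of the postprocessing pass, giving the GPU a draw call boundary between tiles where it can switch to the timewarp context.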

SLIDE 24

Latency TL;DR

Reduce queued frames to 1
Timewarp: adjusts rendered image for late head rotation
Async timewarp: safety net for missed vsync
NVIDIA enables async timewarp via high-pri context

SLIDE 25

Stereo Rendering

  • Multiview Rendering
  • VR SLI

SLIDE 26

Frame Pipeline

Which stages must be done twice for stereo?

CPU
  • Find visible objects
  • Submit render commands
  • Driver internal work

GPU
  • Transform geometry
  • Rasterization
  • Shading

SLIDE 27

Flexibility vs Optimizability

More flexible: all stages separate

[Diagram labels: Left, Right]

SLIDE 28

Flexibility vs Optimizability

More optimizable: some stages shared

[Diagram labels: Left, Right, Shared]

SLIDE 29

Stereo Views

Almost the same visible objects
Almost the same render commands
Almost the same driver internal work
Almost the same geometry rendered

SLIDE 30

Other Multi-View Scenarios

Cubemaps: 6 faces
Shadow maps
  • Several lights in one scene
  • Slices of a cascaded shadow map
Light probes for GI
  • Many probe positions in one scene

SLIDE 31

Multiview Rendering

Submit scene render commands once
All draws, states, etc. broadcast to all views
API support for limited per-view state
Saves CPU rendering cost
Maybe GPU too, depending on impl!

SLIDE 32

Shader Multiview

[Pipeline diagram: the API feeds two complete pipelines (VS, Tess & GS, Rast, PS), labelled ViewID = 0 and ViewID = 1.]

SLIDE 33

Hardware Multiview

[Pipeline diagram: the API feeds one VS and Tess & GS stage, followed by two Rast and PS stages, labelled ViewMatrix[0] and ViewMatrix[1].]

SLIDE 34

VR SLI

[Diagram labels: API, Shared command stream, Left, Right]

SLIDE 35

Interlude: AFR SLI

[Timeline diagram. Rows: CPU, GPU0, GPU1, Scanout; horizontal axis: Time. Successive frames (N−2 to N+3) alternate between GPU0 and GPU1.]

SLIDE 36

VR SLI

[Timeline diagram. Rows: CPU, GPU0, GPU1, Scanout; horizontal axis: Time. For each frame, one GPU renders the left view (N left) and the other renders the right view (N right).]

SLIDE 37

VR SLI

[Diagram labels: API, Engine]

Per-GPU state:
  • Constant buffers
  • Viewports

SLIDE 38

VR SLI

Blit GPU1→GPU0 over PCIe bus

SLIDE 39

VR SLI Scaling

View-independent work (e.g. shadow maps) is duplicated
Scaling depends on proportion of view-dependent work
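
The scaling statement can be turned into a back-of-the-envelope estimate. The sketch below assumes a single GPU spends `shared_ms` on view-independent work plus `per_eye_ms` for each eye, while under VR SLI each GPU repeats the shared work and renders one eye. The cost of the blit between GPUs is ignored, and the function name is illustrative.

```cpp
#include <cassert>

// Estimated speedup of VR SLI over one GPU.
//   single GPU: shared + 2 * per_eye
//   VR SLI:     shared + per_eye   (shared work duplicated on both GPUs)
double vr_sli_speedup(double shared_ms, double per_eye_ms) {
    assert(shared_ms >= 0.0 && per_eye_ms > 0.0);
    double single_gpu_ms = shared_ms + 2.0 * per_eye_ms;
    double vr_sli_ms = shared_ms + per_eye_ms;
    return single_gpu_ms / vr_sli_ms;
}
```

With no shared work the estimate is 2x. With 2 ms of shared work and 4 ms per eye it drops to roughly 1.67x, and when shared work equals the per-eye work it is 1.5x.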

SLIDE 40

API Availability

Currently D3D11 only
Fermi, Kepler, Maxwell / Windows 7+
Developer driver available now
OpenGL and other APIs: to come

SLIDE 41

Developer Guidance

Teach your engine the concept of a “multiview set”
  • Related views that will be rendered together

Currently:

    for (each view)
        find_objects();
        for (each object)
            update_constants();
            render();

SLIDE 42

Developer Guidance

Multiview:

    find_objects();
    for (each object)
        for (each view)
            update_constants();
        render();
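
The two loop structures can be compared by counting how often each engine step runs. The sketch below is a stand-in that only increments counters. `SubmitStats`, `submit_per_view` and `submit_multiview` are hypothetical names, and it assumes the multiview path issues one render call per object that the API broadcasts to all views.

```cpp
#include <cassert>

// Counts how many times each engine step runs for one frame.
struct SubmitStats {
    int find_objects = 0;
    int update_constants = 0;
    int render = 0;
};

// Conventional structure: the whole scene traversal is repeated per view.
SubmitStats submit_per_view(int views, int objects) {
    assert(views > 0 && objects >= 0);
    SubmitStats stats;
    for (int v = 0; v < views; ++v) {
        ++stats.find_objects;
        for (int o = 0; o < objects; ++o) {
            ++stats.update_constants;
            ++stats.render;
        }
    }
    return stats;
}

// Multiview structure: visibility runs once, per-view constants are updated
// for each object, and each object is submitted once for all views.
SubmitStats submit_multiview(int views, int objects) {
    assert(views > 0 && objects >= 0);
    SubmitStats stats;
    ++stats.find_objects;
    for (int o = 0; o < objects; ++o) {
        for (int v = 0; v < views; ++v) {
            ++stats.update_constants;
        }
        ++stats.render;
    }
    return stats;
}
```

For 2 views and 100 objects, the per-view loop runs visibility twice and submits 200 draws, while the multiview loop runs visibility once and submits 100. The number of constant updates stays at 200 in both cases.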

SLIDE 43

Developer Guidance

Keep track of which render targets store stereo data
  • May need to be marked or set up specially
  • Or allocated as a texture array, etc.
Keep track of sync points
  • Where you need all views finished before continuing
  • May need to blit between GPUs

SLIDE 44

Stereo Rendering TL;DR

Multiview: submit scene once, save CPU overhead
  • Requires some engine integration
Range of possible implementations
  • Trade off flexibility vs optimizability
VR SLI: a GPU per eye

SLIDE 45

VR Direct Recap

Variety of VR-related APIs coming in near future
Reduce latency
  • Reduced frame queuing
  • Enable async timewarp & other improvements
Accelerate stereo rendering
  • Multiview APIs
  • VR SLI

SLIDE 46

VR Direct API Availability

Fermi, Kepler, Maxwell
D3D11: context priorities and VR SLI
  • NDA developer driver available now
Android: EGL_IMG_context_priority
Other APIs/platforms: to come

SLIDE 47

What Next?

All this stuff is hot out of the oven!
Will need more iterations before it settles
  • See what works, revise APIs as needed
  • Consolidate & standardize across industry

SLIDE 48

How VR Is Shaping CryEngine

SLIDE 49

Our VR Challenges

As Developer
  • Focus on results: best possible VR demo for GDC 2015 (presence, interaction, performance…)
  • Focus on the platform to be shown
  • Short development time

As Technology Provider
  • Solid implementation
  • Multiplatform and support for multiple headset vendors
  • Focus on performance and seamless integration for users

SLIDE 50

Exploring the key aspects of VR

Presence
  • Convincing rich environments, 3D audio, etc.
Interactivity
  • Allow the player to manipulate the world instead of just watching
Input devices
  • Experimenting with traditional and next-gen input devices
Movement
  • Believable, stable, non-sickening

SLIDE 51

Our Rendering Challenges

High & stable frame rate
  • Oculus requires 90+ FPS (drops → physical discomfort)
Resolution: Full HD and beyond
Quality: Bringing our signature visuals to VR
Dual rendering vs Reprojection
Minimum latency

SLIDE 52

VR Demo Rendering Tale

REPROJECTION (R&D)

GPU: Single GPU solution, 90+ FPS at full HD

  • Excellent performance, just a post-process
  • Worked well on 3D TVs
  • Introduces artefacts
  • Reduced depth perception (to minimize artefacts)
  → “Presence” is not fully achieved

[Diagram labels: 3D TV, X VR]

SLIDE 53

VR Demo Rendering Tale

[Diagram of rendering options. REPROJECTION (R&D; labels: 3D TV, X VR). DUAL RENDERING (CPU, GPU: 2x Rendering Effort!!): Single GPU (brute force) and MGPU. MGPU options: AFR, VR SLI (engine pipeline), GPU Masking (good proof of concept). Eye labels: L, R.]

SLIDE 54

VR Demo Rendering Tale

[Diagram of rendering options. REPROJECTION (R&D; labels: 3D TV, X VR). DUAL RENDERING (CPU, GPU: 2x Rendering Effort!!): Single GPU (brute force) and MGPU. Label: No driver support.]

Solution:
  • Use next-gen GPU: Titan X (GM200 GPU)
  • Engine optimization
  • Art optimization
  • 90+ FPS (around 100 FPS avg.)

SLIDE 55

CryEngine VR Integration

[Diagram of rendering options. REPROJECTION (R&D). DUAL RENDERING: Single GPU (+optimizations) and MGPU (AFR, VR SLI, GPU Masking). Label: Engine rework in progress.]

Great potential for some platforms / specs

  • Likely to be integrated first
  • Suboptimal (still DC x 2, etc)
  • Big improvement with small engine changes

Other

SLIDE 56

CryEngine VR Integration

[Diagram of rendering options. REPROJECTION (R&D): Great potential for some platforms / specs. DUAL RENDERING: Single GPU and MGPU.]

Not just a tech demo
  • Community wants to work with these technologies ASAP
  • Licensees and potential partners’ requests are the proof

SLIDE 57

Questions & Comments?

Email us: nreed@nvidia.com, dariop@crytek.com
Slides will be posted: http://gputechconf.com