THE MAGIC BEHIND GAMEWORKS’ HYBRID FRUSTUM TRACED SHADOWS



slide-1
SLIDE 1

Chris Wyman July 28, 2016

NVIDIA RESEARCH TALK:

THE MAGIC BEHIND GAMEWORKS’ HYBRID FRUSTUM TRACED SHADOWS

slide-2
SLIDE 2

2

MARCH 2016: 1ST RAY-TRACED SHADOWS IN GAMES

Now available as GameWorks module; shipped in Tom Clancy’s The Division

Left: Hybrid Frustum Traced Shadows (HFTS)
Right: Percentage Closer Soft Shadows (PCSS)

slide-6
SLIDE 6

6

WHO?

Joint work:

  • Chris Wyman, NVIDIA Research
  • Jon Story, NVIDIA DevTech
  • UbiSoft’s Massive, developers of The Division

An NVIDIA success story of transitioning research to product

May not know:

  • NVIDIA has research division of 100+ researchers
  • Covering graphics, VR, machine learning, AI, compilers, vision, circuits, etc.

(2+ years effort) (6+ months effort)

NVIDIA enables researchers and engineers to spend time addressing important graphics problems

From Tom Clancy’s The Division

slide-7
SLIDE 7

7

BUT THERE’S MORE!

Today, GameWorks supports 1 ray per pixel

The research extends to 32+ rays per pixel (for a 2x increase in cost)

slide-10
SLIDE 10

10

STORY

Today: talk about the road to productization and research tech transfer

Up to 5 billion shadow rays/sec in fully dynamic scenes, incl. data structure build

  • On GeForce GTX Titan X (2015)
  • Specialized algorithm for ray traced hard shadows
  • Fits in raster pipeline; no extra ray tracing library

Builds on an “irregular z-buffer” for ray acceleration

  • Not a traditional BVH or kd-tree
  • Irregular z-buffers regarded as a dead end 3 years ago
slide-12
SLIDE 12

12

WHY IS THIS WORTH INVESTIGATING?

Cause:

  • Precompute shadow map
  • Has fixed resolution
  • Multiple adjacent pixels query same texel, get same answer
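
The cause listed above can be illustrated with a minimal sketch. It assumes a hypothetical 1D orthographic shadow map; the function name `texel_index` and every size below are illustrative choices, not values from the talk. Adjacent pixel samples land in one fixed-resolution texel, so they all read the same precomputed answer.

```python
def texel_index(x, map_min, map_max, resolution):
    """Map a light-space coordinate to a texel of a fixed-resolution shadow map."""
    u = (x - map_min) / (map_max - map_min)
    return min(resolution - 1, max(0, int(u * resolution)))


# Hypothetical setup: a 4-texel shadow map covering light-space x in [0, 8).
resolution = 4

# Eight adjacent screen pixels whose light-space positions are 0.25 apart.
pixel_positions = [3.0 + 0.25 * i for i in range(8)]

# Each texel is 2 units wide, so runs of adjacent pixels share one texel
# and therefore one stored depth: the blocky look of shadow map aliasing.
texels = [texel_index(x, 0.0, 8.0, resolution) for x in pixel_positions]
print(texels)
```

Raising `resolution` shrinks the runs but never removes them for every view, which is the limitation the following slides show.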

slide-13
SLIDE 13

13

ALIASES EVEN WITH HIGH RESOLUTION

slide-18
SLIDE 18

18

FILTERING SHADOW MAPS HELPS

slide-21
SLIDE 21

21

AND BLOCKS STILL VISIBLE AFTER FILTERING!

  • Fewer contact shadows
  • Lose fine geometric details
  • And they move and flicker during animation…

slide-22
SLIDE 22

22

HIGH QUALITY RAY TRACING

slide-23
SLIDE 23

23

HIGH QUALITY SHADOW MAP

1 or 32 samples per pixel

slide-26
SLIDE 26

26

USING RAY TRACING TODAY

Requires separate ray tracing libraries, APIs, and acceleration structures:

  • May need separate geometric representation
  • Data structure rebuild traditionally costly (for dynamic scenes)

Our goals:

  • Specialize ray tracing for hard shadows
  • Build on existing APIs (DirectX, OpenGL, Vulkan) and geometric representations
  • Quickly build a new data structure each frame
slide-29
SLIDE 29

29

WHAT IS RAY TRACING?

Query visibility along arbitrary rays

To shadow each pixel, test ray to light

  • If occluded, pixel shadowed
  • If unoccluded, pixel lit

Avoids problems with shadow maps

  • Light visibility not precomputed
  • Computations exactly match pixel locations
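
The per-pixel test described above can be sketched as a brute-force shadow ray query. This is a minimal illustration, not the GameWorks implementation: it assumes a point light, a plain list of triangles, and the standard Möller-Trumbore intersection test, and the helper names (`segment_hits_triangle`, `is_lit`) are hypothetical.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def segment_hits_triangle(origin, target, tri, eps=1e-6):
    """Moller-Trumbore test restricted to the segment origin -> target."""
    v0, v1, v2 = tri
    d = sub(target, origin)          # unnormalized: t in (0, 1) spans the segment
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(d, e2)
    det = dot(e1, p)
    if abs(det) < eps:               # segment parallel to the triangle plane
        return False
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv
    if u < 0.0 or u > 1.0:
        return False
    q = cross(s, e1)
    v = dot(d, q) * inv
    if v < 0.0 or u + v > 1.0:
        return False
    t = dot(e2, q) * inv
    return eps < t < 1.0 - eps       # hit strictly between the pixel and the light


def is_lit(point, light, triangles):
    """A pixel is lit only if no triangle blocks its ray to the light."""
    return not any(segment_hits_triangle(point, light, tri) for tri in triangles)


light = (0.0, 0.0, 10.0)
occluder = ((-1.0, -1.0, 5.0), (1.0, -1.0, 5.0), (0.0, 1.0, 5.0))
print(is_lit((0.0, 0.0, 0.0), light, [occluder]))  # point directly under the occluder
print(is_lit((5.0, 0.0, 0.0), light, [occluder]))  # point off to the side
```

Because the query starts at the exact position seen through each pixel, there is no fixed-resolution lookup to alias; the cost is that every ray must be tested against the scene, which is what an acceleration structure is for.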
slide-32
SLIDE 32

32

MAKING SHADOW RAY TRACING FAST

Typical ray tracer is extremely general

  • 10s, 100s, or 1000s of rays per pixel
  • Incoherent memory access
  • Unknown reflectance of surfaces in scene

Specializing for shadows helps

  • Only care about binary visibility per ray

Specializing for hard shadows helps even more

  • Know all rays go to same location (i.e., the point light)
  • Starts to look like raster, with irregular samples

From Wikipedia

slide-34
SLIDE 34

34

DATA STRUCTURE: IRREGULAR Z-BUFFER

Accelerates queries emanating from a point

Can efficiently build and traverse in parallel

  • Fully rebuilds in < 1 ms per frame

A type of ray caching

  • Stores ray endpoints rather than triangles
  • Reorders rays; allows ray tracing via raster hardware
  • Leverage shadow map techniques for more perf wins

Figure: shadow map vs. irregular z-buffer
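
A CPU-side sketch of the idea in the bullets above, under loud assumptions: a directional light shining straight down the z axis (so light space is simply the xy plane), a uniform grid of square texels, and Python lists standing in for the per-texel GPU lists. All names (`build_izb`, `shadow_samples`, `cell`) are hypothetical. The structure stores the visible pixel samples (the ray endpoints), and occluder triangles are then "rasterized" over the grid and tested only against the samples cached in the texels they cover.

```python
from collections import defaultdict


def build_izb(samples, cell):
    """Bin visible-pixel sample points into per-texel lists in light space."""
    grid = defaultdict(list)
    for i, (x, y, _z) in enumerate(samples):
        grid[(int(x // cell), int(y // cell))].append(i)
    return grid


def edge(a, b, p):
    """Signed 2D edge function in the light-space xy plane."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def shadow_samples(samples, triangles, cell, eps=1e-6):
    """Return one shadowed flag per sample; the light is assumed to sit at +z."""
    grid = build_izb(samples, cell)
    shadowed = [False] * len(samples)
    for a, b, c in triangles:
        area = edge(a, b, c)
        if abs(area) < eps:
            continue                     # triangle is edge-on to the light
        # Conservative "rasterization": visit every texel in the bounding box.
        x0 = int(min(a[0], b[0], c[0]) // cell)
        x1 = int(max(a[0], b[0], c[0]) // cell)
        y0 = int(min(a[1], b[1], c[1]) // cell)
        y1 = int(max(a[1], b[1], c[1]) // cell)
        for tx in range(x0, x1 + 1):
            for ty in range(y0, y1 + 1):
                for i in grid.get((tx, ty), ()):
                    p = samples[i]
                    w0 = edge(b, c, p) / area
                    w1 = edge(c, a, p) / area
                    w2 = edge(a, b, p) / area
                    if w0 < 0.0 or w1 < 0.0 or w2 < 0.0:
                        continue         # sample lies outside the triangle
                    z = w0 * a[2] + w1 * b[2] + w2 * c[2]
                    if z > p[2] + eps:   # triangle sits between sample and light
                        shadowed[i] = True
    return shadowed


samples = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 0.0, 6.0)]
occluder = ((-1.0, -1.0, 5.0), (1.0, -1.0, 5.0), (0.0, 1.0, 5.0))
print(shadow_samples(samples, [occluder], cell=1.0))
```

Each sample is tested at its exact position rather than at a texel center, so the result matches a ray trace, while the outer loop over triangles and covered texels has the shape of ordinary rasterization.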

slide-37
SLIDE 37

37

WHY HAS NOBODY ELSE DONE THIS?

Irregular z-buffering is hard

  • 3 years ago, was a “dead end” in academic research
  • Our 1st prototype cost >2 sec for this frame (now <5 ms; a 400x speedup)

Bad: Costs increased linearly with # pixels & polygons

Worse: Performance could vary 100:1 between frames

slide-40
SLIDE 40

40

MAKING IRREGULAR Z-BUFFERS USABLE

IZBs eliminate aliasing, converting it to performance variability

  • If shadow maps alias, many pixels correspond to one texel
  • IZBs have to enumerate, cache, and reorder these pixels
  • Converts aliasing into a parallel load balancing problem
  • Poor load balancing = poor GPU performance

Our research:

  • First, identified this problem
  • Second, proposed a simple solution implementable today

Figure: well balanced vs. poorly balanced
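
One way to put a number on "well balanced" versus "poorly balanced" is to compare the longest per-texel list with the average list length. The metric and the name `load_imbalance` are assumptions made for this sketch, not something defined in the talk; the intuition is that parallel work over texels finishes only when the longest list has been walked.

```python
from collections import Counter


def load_imbalance(texel_of_pixel):
    """Ratio of the longest per-texel list to the mean length over used texels."""
    counts = Counter(texel_of_pixel)
    longest = max(counts.values())
    mean = len(texel_of_pixel) / len(counts)
    return longest / mean


# 16 pixels spread evenly over 4 texels.
well_balanced = [0, 1, 2, 3] * 4

# 16 pixels, 13 of which alias to the same texel.
poorly_balanced = [0] * 13 + [1, 2, 3]

print(load_imbalance(well_balanced))
print(load_imbalance(poorly_balanced))
```

A ratio near 1 means every texel does similar work; a large ratio means a few texels dominate the frame time.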

slide-44
SLIDE 44

44

HOW TO LOAD BALANCE

Even well designed shadow map implementations alias badly from some views

  • Nearby texels here 100:1 larger than distant ones
  • Hence the use of cascaded shadow maps
  • Cascades reduce variability in aliasing

IZBs convert aliasing to poor load balancing

  • Some texels cost 100x more than others
  • Cascaded IZBs reduce this variability (to <<2x)
  • Other shadow map techniques apply too (e.g., adaptive, perspective, logarithmic)
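
The effect of cascades on list length can be sketched with a 1D toy model. Everything here is an assumption for illustration: samples are taken uniformly in 1/depth as a stand-in for perspective sampling of a ground plane, the cascade splits are logarithmic, and both layouts get the same total texel count. It is not the partitioning scheme used in the product, and it makes no attempt to reproduce the ratios quoted on the slide.

```python
import math
from collections import Counter


def sample_depths(n, near, far):
    """Depths of n samples, uniform in 1/depth (toy perspective model)."""
    return [1.0 / (1.0 / near + (i + 0.5) / n * (1.0 / far - 1.0 / near))
            for i in range(n)]


def longest_list_single(depths, near, far, texels):
    """Longest per-texel list when one uniform grid covers [near, far)."""
    counts = Counter(
        min(texels - 1, int((d - near) / (far - near) * texels)) for d in depths
    )
    return max(counts.values())


def longest_list_cascaded(depths, near, far, texels, cascades):
    """Longest list when log-spaced cascades each get texels // cascades texels."""
    per_cascade = texels // cascades
    ratio = far / near
    counts = Counter()
    for d in depths:
        k = int(math.log(d / near) / math.log(ratio) * cascades)
        k = min(cascades - 1, max(0, k))
        lo = near * ratio ** (k / cascades)
        hi = near * ratio ** ((k + 1) / cascades)
        j = int((d - lo) / (hi - lo) * per_cascade)
        j = min(per_cascade - 1, max(0, j))
        counts[(k, j)] += 1
    return max(counts.values())


depths = sample_depths(10000, 1.0, 100.0)
single = longest_list_single(depths, 1.0, 100.0, 64)
cascaded = longest_list_cascaded(depths, 1.0, 100.0, 64, 4)
print(single, cascaded)
```

In this model the single grid crowds most samples into its first texel, while the cascaded layout gives the near range its own finer texels, so the worst-case list is shorter for the same memory.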

slide-45
SLIDE 45

45

HOW TO GET SOFT SHADOWS

Figure panels: Ray Traced, PCSS, HFTS

Unlike shadow maps, maintains high quality contact shadows when filtering

slide-46
SLIDE 46

46

USE AS INPUT TO SHADOW FILTER

Figure panels: Irregular Z-buffer, PCSS, Hybrid Frustum Traced Shadows (HFTS)

Unlike shadow maps, maintains high quality contact shadows when filtering

slide-47
SLIDE 47

47

HOW TO COMBINE?

Multiple ways, but the straightforward approach seems to work pretty well

See “Hybrid Ray Traced Shadows” from Jon Story at GDC 2015 for details
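
The slide does not spell out the combination, so the following is only one plausible straightforward option, stated as an assumption: interpolate between the hard ray traced result and the filtered PCSS result, using the distance from the receiver to its blocker as the weight. The function name `hybrid_shadow` and the parameter `full_soft_distance` are hypothetical; the referenced GDC talk is the place to check the actual method.

```python
def hybrid_shadow(hard, soft, blocker_distance, full_soft_distance):
    """Blend hard and soft visibility (1.0 = fully lit, 0.0 = fully shadowed).

    Near the occluder the hard ray traced value dominates, preserving contact
    shadows; farther away the filtered soft value takes over.
    """
    t = min(1.0, max(0.0, blocker_distance / full_soft_distance))
    return hard * (1.0 - t) + soft * t


# At the contact point the result is the hard shadow.
print(hybrid_shadow(hard=0.0, soft=0.4, blocker_distance=0.0, full_soft_distance=2.0))
# Far from the blocker the result is the filtered shadow.
print(hybrid_shadow(hard=0.0, soft=0.4, blocker_distance=5.0, full_soft_distance=2.0))
```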

slide-48
SLIDE 48

48

HFTS PERFORMANCE

Bar chart: cost in ms (axis 1 to 5) for PCSS, IZB, and HFTS

Images from Tom Clancy’s The Division

GeForce GTX Titan X (2015) at Resolution: 1920x1080

slide-49
SLIDE 49

49

HFTS PERFORMANCE

Bar chart: cost in ms (axis 1 to 7) for PCSS, IZB, and HFTS

Images from Tom Clancy’s The Division

GeForce GTX Titan X (2015) at Resolution: 1920x1080

slide-50
SLIDE 50

50

HFTS PERFORMANCE

GeForce GTX Titan X (2015) at Resolution: 1920x1080

Images from Tom Clancy’s The Division

Bar chart: cost in ms (axis 1 to 5) for PCSS, IZB, and HFTS

slide-51
SLIDE 51

51

FURTHER TO GO

GameWorks version limited by in-game feasibility

  • More advanced features available in budgets ~10-30 ms

Research prototype shows

  • 32 samples per pixel ~2x cost of 1 sample
  • Seamless shadows from transparent and alpha tested geometry
  • Possibility of higher quality soft shadows
slide-52
SLIDE 52

52

FURTHER TO GO

16 ms per frame

slide-57
SLIDE 57

57

TAKEAWAYS

  • Can do fast ray traced shadows in games today
  • Eliminates shadow map aliasing larger than a pixel
  • Introduces some variability to perf; cascades & other techniques reduce dramatically
  • Provides high quality input to shadow filters for approximate soft shadows
  • HFTS available in NVIDIA GameWorks; shipped in Tom Clancy’s The Division

slide-58
SLIDE 58

QUESTIONS?

Contact:

  • Chris Wyman: cwyman@nvidia.com, @_cwyman_
  • Jon Story: jons@nvidia.com

More on: Irregular Z-Buffers
More on: Hybrid Frustum Traced Shadows

slide-59
SLIDE 59

COME DO YOUR LIFE’S WORK

JOIN NVIDIA

We are looking for great people at all levels to help us accelerate the next wave of AI-driven computing in Research, Engineering, and Sales and Marketing. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions like artificial intelligence and autonomous cars. Check out our career opportunities:

  • www.nvidia.com/careers
  • Reach out to your NVIDIA social network or NVIDIA recruiter at DeepLearningRecruiting@nvidia.com