slide-1
SLIDE 1

GPU Ray-tracing using Irregular Grids

Arsène Pérard-Gayot, Javor Kalojanov, Philipp Slusallek April 28, 2017

1

slide-2
SLIDE 2

  • Introduction
  • Ray Tracing with Grids
  • Challenges
  • Irregular Grids
  • Construction (Part I)
  • Traversal
  • Construction (Part II)
  • Results

2

slide-3
SLIDE 3

Introduction

slide-4
SLIDE 4

Introduction: Ray Tracing with Grids

Pros

  • Very fast parallel construction
  • Stackless & ordered traversal, early exit

Cons

  • Empty space skipping: Teapot in the Stadium
  • Cannot minimize both intersections and traversal steps

3

slide-5
SLIDE 5

Introduction: Uniform Grid

[Figure: uniform grid, with labels "Redundant intersections" and "Empty space"]

4

slide-15
SLIDE 15

Introduction: Uniform Grid

Increasing resolution

  • Fewer intersections
  • More traversal steps

4

slide-16
SLIDE 16

Introduction: Our solution

Idea: Remove regularity

  • Start with a dense subdivision
  • Optimize cell shape to minimize traversal cost

5

slide-17
SLIDE 17

Introduction: Our solution

Uniform Grid: Low Resolution

[Figure: heat map of traversal steps + intersections; scale label: 200]

5

slide-18
SLIDE 18

Introduction: Our solution

Uniform Grid: Medium Resolution

[Figure: heat map of traversal steps + intersections; scale label: 200]

5

slide-19
SLIDE 19

Introduction: Our solution

Irregular Grid: Low Resolution

[Figure: heat map of traversal steps + intersections; scale label: 200]

5

slide-20
SLIDE 20

Introduction: Our solution

Irregular Grid: Medium Resolution

[Figure: heat map of traversal steps + intersections; scale label: 200]

5

slide-21
SLIDE 21

Introduction: Our solution

Irregular Grid: High Resolution

[Figure: heat map of traversal steps + intersections; scale label: 200]

5

slide-22
SLIDE 22

Irregular Grids

slide-23
SLIDE 23

Data Structure

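Irregular Grid = Voxel Map + Cells + Primitive References

[Figure: a voxel map storing one cell index per voxel (3 3 3 3 1 1 1 1 2 2 2 2 4 4 7 7 5 5 9 8), pointing into the cells (1 2 3 4 5 7 8 9), which in turn point to primitive references]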
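
The three parts of the data structure can be sketched as follows. This is a minimal illustration and not the layout used in the authors' implementation: the names `Cell`, `IrregularGrid`, `first_ref` and `num_refs` are hypothetical, and a Python dictionary stands in for the voxel map.

```python
from dataclasses import dataclass

@dataclass
class Cell:
    lo: tuple        # inclusive lower corner, in voxel coordinates
    hi: tuple        # exclusive upper corner, in voxel coordinates
    first_ref: int   # offset of the cell's first primitive reference
    num_refs: int    # number of primitive references in the cell

@dataclass
class IrregularGrid:
    voxel_map: dict  # voxel coordinate -> index into `cells`
    cells: list      # list of Cell
    refs: list       # flat array of primitive indices

    def cell_at(self, voxel):
        """Look up the cell that covers a voxel."""
        return self.cells[self.voxel_map[voxel]]

    def primitives_at(self, voxel):
        """Return the primitive indices referenced by the cell covering a voxel."""
        c = self.cell_at(voxel)
        return self.refs[c.first_ref:c.first_ref + c.num_refs]

# Toy 4x1 grid: two voxels share cell 0, two voxels share cell 1.
grid = IrregularGrid(
    voxel_map={(0, 0): 0, (1, 0): 0, (2, 0): 1, (3, 0): 1},
    cells=[Cell((0, 0), (2, 1), 0, 1), Cell((2, 0), (4, 1), 1, 2)],
    refs=[7, 7, 9],
)
```

Several voxels map to the same cell, so a large merged cell costs a single entry in `cells` no matter how many voxels it covers.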

6

slide-24
SLIDE 24

Construction (Part I)

Initialization

  • Initial grid
  • Two-level construction:
      1. A coarse uniform grid
      2. An octree in each of the grid cells
  • Adaptive: More effort where the geometry is complex
  • Dense: Up to 2^15 resolution in each second-level cell

7

slide-26
SLIDE 26

Construction (Part I)

Initialization

  • User-defined λ1 controls top-level resolution
  • With scene volume V and number of objects N [Cle+83]:

$$R_{\{x,y,z\}} = d_{\{x,y,z\}} \sqrt[3]{\frac{\lambda_1 N}{V}}$$

  • Tries to make cells cubic
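
As a worked example of the resolution formula, the sketch below evaluates it for a box-shaped scene. It assumes that d is the extent of the scene bounding box along each axis and that V is the volume of that box; the rounding to at least one cell per axis is also an assumption, since the slide does not say how the result is rounded.

```python
def top_level_resolution(extents, num_objects, lam1):
    """Evaluate R = d * cbrt(lam1 * N / V) per axis.

    extents: (dx, dy, dz), assumed to be the scene bounding-box extents.
    Rounding and the minimum of 1 are assumptions of this sketch.
    """
    dx, dy, dz = extents
    volume = dx * dy * dz
    scale = (lam1 * num_objects / volume) ** (1.0 / 3.0)
    return tuple(max(1, round(d * scale)) for d in extents)

# A 2 x 1 x 1 scene with 1000 objects and lam1 = 0.128:
# lam1 * N = 128 cells in total, spread so that the cells are cubes.
example_resolution = top_level_resolution((2.0, 1.0, 1.0), 1000, 0.128)
```

The example yields a resolution of 8 × 4 × 4, that is, 128 = λ1·N cubic cells, which is what "tries to make cells cubic" amounts to.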

7

slide-28
SLIDE 28

Construction (Part I)

Initialization

  • Octree depth computed independently in each cell
  • Same formula, but: λ2, local number of objects & volume
  • Clamp resolution to a power of two:

$$D = \lceil \log_2(\max(R_x, R_y, R_z)) \rceil$$

  • Compact: only log2(log2(Rmax)) bits needed
  • 4 bits = max. resolution of 2^15 × 2^15 × 2^15
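
A small sketch of the depth computation, assuming the per-cell resolution (Rx, Ry, Rz) has already been obtained from the formula with λ2. The clamp to a maximum depth of 15 is an assumption derived from the 4-bit statement above; the function name is hypothetical.

```python
import math

def octree_depth(res, max_depth=15):
    """Return D = ceil(log2(max(Rx, Ry, Rz))), clamped to [0, max_depth]."""
    d = math.ceil(math.log2(max(res)))
    return min(max(d, 0), max_depth)

# Depths 0..15 fit in 4 bits, and depth 15 means 2**15 cells per axis.
bits_needed = (15).bit_length()
```

For instance, a local resolution of (5, 3, 2) is rounded up to the next power of two, 8, which gives a depth of 3.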

7

slide-29
SLIDE 29

Construction (Part I)

Initialization

[Figure: octree depth chosen in each top-level cell (3, 2, 1, 2, 1)]

7

slide-31
SLIDE 31

Construction (Part I): Virtual Grid

Property

Cells are aligned on a virtual grid of resolution $R_{\{x,y,z\}} \cdot 2^D$
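
The alignment property can be illustrated by mapping an octree cell to integer coordinates on the virtual grid. This sketch assumes that D is the maximum octree depth over all top-level cells; `virtual_resolution`, `virtual_box` and their parameters are hypothetical names.

```python
def virtual_resolution(top_res, max_depth):
    """Resolution of the virtual grid: R * 2^D per axis."""
    return tuple(r << max_depth for r in top_res)

def virtual_box(top_index, depth, local_index, max_depth):
    """Box (lo, hi) of an octree cell in virtual-grid coordinates.

    top_index:   index of the top-level cell
    depth:       depth of the octree cell inside that top-level cell
    local_index: index of the octree cell at that depth
    """
    size = 1 << (max_depth - depth)  # edge length in virtual voxels
    lo = tuple((t << max_depth) + l * size
               for t, l in zip(top_index, local_index))
    hi = tuple(c + size for c in lo)
    return lo, hi
```

Every cell corner lands on an integer coordinate, which is why traversal can track its position on the virtual grid with integers.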

8

slide-32
SLIDE 32

Construction (Part I): Voxel Map

Voxel map as a two-level grid

  • Memory efficient
  • Fast lookup

9

slide-33
SLIDE 33

Interlude: Traversal

Traversal

  • The data structure is not optimal
  • But it can already be used for traversal

Ideas

  • Maintain position on the virtual grid
  • Recompute increment along the ray at each step

10

slide-34
SLIDE 34

Interlude: Traversal

  1. Locate ray origin
  2. Loop
     2.1 Intersect primitives
     2.2 Exit if hit is within cell
     2.3 Locate exit point
     2.4 Move to next cell
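
The loop above can be sketched in 2D as follows. This is an illustrative CPU version under several assumptions, not the CUDA kernel from the talk: cells are `(lo, hi, primitives)` tuples in virtual-grid coordinates, the voxel map is a dictionary, the ray origin lies inside the grid, and circles serve as stand-in primitives. All function names are hypothetical.

```python
import math

def hit_circle(circle, origin, direction):
    """Smallest non-negative ray parameter t hitting a circle, or None."""
    (cx, cy), r = circle
    ox, oy = origin[0] - cx, origin[1] - cy
    dx, dy = direction
    a = dx * dx + dy * dy
    b = 2.0 * (ox * dx + oy * dy)
    c = ox * ox + oy * oy - r * r
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return None
    t = (-b - math.sqrt(disc)) / (2.0 * a)
    return t if t >= 0 else None

def exit_of(cell, origin, direction):
    """Return (t_exit, axis): where the ray leaves the cell box, and on which axis."""
    lo, hi, _ = cell
    t_exit, axis = math.inf, None
    for a, (o, d) in enumerate(zip(origin, direction)):
        if d == 0:
            continue
        t = ((hi[a] if d > 0 else lo[a]) - o) / d
        if t < t_exit:
            t_exit, axis = t, a
    return t_exit, axis

def traverse(origin, direction, cells, voxel_map, intersect):
    """Return (t_hit or None, number of visited cells)."""
    voxel = tuple(math.floor(o) for o in origin)       # 1. locate ray origin
    steps = 0
    while voxel in voxel_map:                           # 2. loop
        lo, hi, prims = cell = cells[voxel_map[voxel]]
        steps += 1
        hits = [t for t in (intersect(p, origin, direction) for p in prims)
                if t is not None]                       # 2.1 intersect primitives
        t_exit, axis = exit_of(cell, origin, direction)
        if hits and min(hits) <= t_exit:                # 2.2 exit if hit is within cell
            return min(hits), steps
        if axis is None:                                # degenerate zero direction
            break
        point = [o + t_exit * d for o, d in zip(origin, direction)]  # 2.3 exit point
        nxt = [min(max(math.floor(c), l), h - 1)
               for c, l, h in zip(point, lo, hi)]
        nxt[axis] = hi[axis] if direction[axis] > 0 else lo[axis] - 1
        voxel = tuple(nxt)                              # 2.4 move to next cell
    return None, steps

# Toy scene: an empty 2x2 cell followed by a 2x2 cell holding one circle.
cells = [((0, 0), (2, 2), []), ((2, 0), (4, 2), [((3.0, 1.0), 0.5)])]
voxel_map = {(x, y): (0 if x < 2 else 1) for x in range(4) for y in range(2)}
hit = traverse((0.5, 1.0), (1.0, 0.0), cells, voxel_map, hit_circle)
miss = traverse((0.5, 0.5), (0.0, 1.0), cells, voxel_map, hit_circle)
```

No stack is needed: the only state carried between iterations is the current voxel, and the exit distance is recomputed from the cell box at every step.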

11

slide-52
SLIDE 52

Construction (Part II)

Traversal Performance

  • Poor empty space skipping ⟹ memory latency
  • Redundant intersections ⟹ instruction/memory latency

Cell Merging and Expansion

  • Local (greedy) optimizations
  • Examine cells and their neighborhoods
  • Keep optimizations simple and parallelizable

12

slide-53
SLIDE 53

Construction (Part II): Optimization Passes

Initial Grid → Cell Merging (x, y, z, ..., repeat) → Cell Expansion (x, y, z, ..., repeat) → ...

13

slide-56
SLIDE 56

Construction (Part II): Cell Merging

Cell Merging

  • Merge each cell with its neighbor if the SAH decreases:

$$|R(A)|\,SA(A) + |R(B)|\,SA(B) \ge |R(A \cup B)|\,SA(A \cup B) - C_t$$

  • For empty and non-empty cells
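
The merge test can be written out directly. In this sketch R(X) is taken to be the set of primitive references of a cell, SA the surface area of its box, and C_t a traversal-cost constant whose value the slides do not give, so it is a required parameter here. The `forms_box` helper is an assumed way of checking that the union of two cells is itself a box.

```python
def surface_area(box):
    lo, hi = box
    dx, dy, dz = (h - l for l, h in zip(lo, hi))
    return 2 * (dx * dy + dy * dz + dz * dx)

def volume(box):
    lo, hi = box
    dx, dy, dz = (h - l for l, h in zip(lo, hi))
    return dx * dy * dz

def union_box(a, b):
    lo = tuple(min(x, y) for x, y in zip(a[0], b[0]))
    hi = tuple(max(x, y) for x, y in zip(a[1], b[1]))
    return lo, hi

def forms_box(a, b):
    """True if two non-overlapping boxes share a full face, so their union is a box."""
    return volume(union_box(a, b)) == volume(a) + volume(b)

def should_merge(refs_a, box_a, refs_b, box_b, c_t):
    """|R(A)| SA(A) + |R(B)| SA(B) >= |R(A u B)| SA(A u B) - C_t"""
    ra, rb = set(refs_a), set(refs_b)
    separate = len(ra) * surface_area(box_a) + len(rb) * surface_area(box_b)
    merged = len(ra | rb) * surface_area(union_box(box_a, box_b)) - c_t
    return separate >= merged

# Two adjacent unit cubes.
A = ((0, 0, 0), (1, 1, 1))
B = ((1, 0, 0), (2, 1, 1))
same_refs = should_merge(range(5), A, range(5), B, c_t=1.0)
disjoint_refs = should_merge(range(5), A, range(5, 10), B, c_t=1.0)
both_empty = should_merge([], A, [], B, c_t=1.0)
```

Cells that share their references, or are both empty, merge; cells with disjoint references do not, because the merged cell would test all ten primitives over a larger surface.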

14

slide-57
SLIDE 57

Construction (Part II): Cell Merging

Limitations

  • Only consider the union of 2 aligned cells
  • Union must be a box

14

slide-58
SLIDE 58

Construction (Part II): Cell Merging

Stopping criterion

  • Keep merging until:

$$N_{after} \ge \alpha N_{before}$$

  • N_after / N_before: number of cells after/before merging
  • α = 0.995
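
A sketch of the merging loop with this stopping criterion. It assumes the criterion is checked after each full sweep over the x, y and z axes; the slides do not say whether it is checked per axis or per sweep. `merge_pass` is a hypothetical callback that performs one merging pass along one axis and returns the new list of cells.

```python
def merge_until_converged(cells, merge_pass, alpha=0.995):
    """Repeat x/y/z merging sweeps until a sweep removes fewer than
    (1 - alpha) of the cells, i.e. until N_after >= alpha * N_before."""
    while True:
        n_before = len(cells)
        for axis in range(3):            # x, y, z
            cells = merge_pass(cells, axis)
        if len(cells) >= alpha * n_before:
            return cells

# Toy stand-in for a real merging pass: halves the cell count along x
# until only 4 cells remain.
def toy_pass(cells, axis):
    return cells[::2] if axis == 0 and len(cells) > 4 else cells

merged = merge_until_converged(list(range(64)), toy_pass)
```

With α = 0.995 the loop stops as soon as a sweep removes less than 0.5% of the cells.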

14

slide-61
SLIDE 61

Construction (Part II): Cell Expansion

Cell Expansion

  • Expand the exit boundaries of the cells
  • Must maintain correctness of traversal:

R(B) ⊂ R(A)

16

slide-62
SLIDE 62

Construction (Part II): Cell Expansion

Cell Expansion

  • Expand the exit boundaries of the cells
  • Must maintain correctness of traversal:

R(A) ⊄ R(B)

16

slide-63
SLIDE 63

Construction (Part II): Cell Expansion

Limitations

  • Must examine every neighbor on the box face
  • Binary decision, no partial expansion
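
The expansion test is simple to state in code: a cell A may expand over a face only if every neighbor B on that face satisfies the containment condition on the references. The sketch below reads R(B) ⊂ R(A) as "subset or equal", which is an assumption, and `can_expand` is a hypothetical name.

```python
def can_expand(refs_a, face_neighbor_refs):
    """Binary decision: expand cell A across a face only if the references
    of every neighbor on that face are contained in A's references."""
    a = set(refs_a)
    return all(set(b) <= a for b in face_neighbor_refs)

allowed = can_expand([1, 2, 3], [[1], [2, 3], []])
blocked = can_expand([1, 2], [[1], [3]])
```

In the second call one neighbor references primitive 3, which A does not hold, so a ray skipping that neighbor could miss it; the expansion is rejected as a whole, with no partial expansion.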

16

slide-64
SLIDE 64

Construction (Part II): Cell Expansion

Stopping criterion

  • Fixed number of expansion passes:
      • 3 for static scenes,
      • 1 for dynamic scenes.

16

slide-65
SLIDE 65

Construction (Part II): Impact on Traversal

  1. Locate ray origin
  2. Loop
     2.1 Intersect primitives
     2.2 Exit if hit is within cell
     2.3 Locate exit point
     2.4 Move to next cell

17

slide-80
SLIDE 80

Results

slide-81
SLIDE 81

Results: Source Code

GPU implementation

  • https://github.com/madmann91/hagrid
  • Parallel construction & traversal
  • CUDA implementation
  • MIT license

18

slide-82
SLIDE 82

Results: Static Scenes

Parameters

  • (λ1, λ2) = (0.12, 2.4) for every scene
  • Memory footprint ≈ SBVH [SFD09]
  • Different viewpoints

19

slide-83
SLIDE 83

Results: Static Scenes

Primary and AO: two rows per scene (different viewpoints). Gain is Ours relative to SBVH.

Scene        #Tris   Build (ms)   Primary (MRays/s)      AO (MRays/s)           Random (MRays/s)
                                  SBVH   Ours   Gain     SBVH   Ours   Gain     SBVH   Ours   Gain
Sponza       262K    26           409    653    +60%     270    386    +43%     166    274    +65%
                                  265    473    +78%     187    234    +25%
Conference   283K    22           583    597    +2%      303    332    +10%     295    312    +6%
                                  523    526    +1%      326    338    +4%
Hairball     2.9M    893          100    148    +48%     53     69     +30%     19     26     +37%
                                  79     93     +18%     63     61     -3%
Crown        3.5M    203          232    296    +28%     108    120    +11%     221    238    +8%
                                  181    191    +6%      112    125    +12%
San Miguel   7.9M    492          227    291    +28%     119    119    +0%      119    160    +34%
                                  157    180    +15%     125    115    -8%

19

slide-84
SLIDE 84

Results: Build Times vs. Traversal Performance

[Figure: two heat maps over λ1 and λ2 (both axes from 0.5 to 2.5): Build Times (ms), color scale 50 to 300, and Traversal Times (ms), color scale 2 to 10. Lower = better.]

Varying parameters for Crown

  • No local optimum (≠ two-level grid)
  • Increasing density ⟹ increasing performance

20

slide-85
SLIDE 85

Results: Construction Steps Performance

Initialization Cell Merging Cell Expansion

38.44% 48% 13.46%

Time spent during construction

  • Average over all static scenes
  • Dominated by initialization & merging

21

slide-86
SLIDE 86

Results: Dynamic Scenes

Methodology

  • Comparison with two-level grids [KBS11]
  • Fixed time budget
  • Two-level grids: choose optimal resolution
  • Irregular grid:
      • Fixed ratio: λ1 : λ2 = 1 : 8
      • Range: λ1 ∈ [0.01, 0.3], λ2 ∈ [0.08, 2.4]
      • Start at minimum, increase until T_build = 0.5 T_budget
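
The parameter search for the irregular grid can be sketched as a simple ramp. Everything beyond the ratio, the range and the 0.5 factor is an assumption: the number of steps, the linear ramp, and the `build_time_ms` callback (a hypothetical function that builds the grid for given parameters and reports the time in milliseconds). The sketch keeps the largest density whose build time stays within half the budget.

```python
def choose_density(build_time_ms, budget_ms,
                   lam1_min=0.01, lam1_max=0.3, steps=30, ratio=8.0):
    """Raise lam1 from its minimum (with lam2 = ratio * lam1) while the
    build time stays within 0.5 * budget; return the last accepted pair."""
    target = 0.5 * budget_ms
    best = (lam1_min, ratio * lam1_min)
    for i in range(steps):
        lam1 = lam1_min + (lam1_max - lam1_min) * i / (steps - 1)
        lam2 = ratio * lam1
        if build_time_ms(lam1, lam2) > target:
            break
        best = (lam1, lam2)
    return best

# Hypothetical cost model for illustration only: 100 ms per unit of lam1.
def fake_build_time(lam1, lam2):
    return 100.0 * lam1

tight = choose_density(fake_build_time, budget_ms=33.0)
loose = choose_density(fake_build_time, budget_ms=1000.0)
```

With the toy cost model, a 33 ms budget stops the ramp near λ1 = 0.16, while a generous budget reaches the top of the range, (0.3, 2.4).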

22

slide-87
SLIDE 87

Results: Dynamic Scenes

[Figure: renderings at 1 spp and 8 spp]

Budget          2L Grid (λ1, λ2)   Ours (λ1, λ2)   AO spp (2L Grid / Ours)
10FPS (100ms)   0.2, 2.0           0.3, 2.4        2 / 20
20FPS (50ms)    0.2, 2.0           0.3, 2.4        1 / 8
30FPS (33ms)    0.2, 2.0           0.3, 2.4        3

22

slide-88
SLIDE 88

Results: Dynamic Scenes

[Figure: renderings at 3 spp and 13 spp]

Budget          2L Grid (λ1, λ2)   Ours (λ1, λ2)   AO spp (2L Grid / Ours)
10FPS (100ms)   0.2, 2.0           0.3, 2.4        21 / 57
20FPS (50ms)    0.2, 2.0           0.3, 2.4        8 / 24
30FPS (33ms)    0.2, 2.0           0.3, 2.4        3 / 13

22

slide-89
SLIDE 89

Results: Dynamic Scenes

[Figure: renderings at 8 spp and 1 spp]

Budget          2L Grid (λ1, λ2)   Ours (λ1, λ2)   AO spp (2L Grid / Ours)
10FPS (100ms)   0.03, 0.6          0.3, 2.4        1 / 8
20FPS (50ms)    0.03, 0.6          0.02, 0.16      1
30FPS (33ms)    0.03, 0.6          0.01, 0.08

22

slide-90
SLIDE 90

Results: Conclusion

Irregular grid properties

  • Ordered, stackless traversal
  • Same construction/traversal algorithm for:
      • Static scenes
      • Dynamic scenes
  • Performance similar/superior to state-of-the-art

Future directions

  • Exploring initial subdivision schemes
  • Different voxel map structure
  • More aggressive optimizations

23

slide-91
SLIDE 91

Thank you!

24

slide-92
SLIDE 92

Backup: Related Work

[Figure: macro regions compared with an irregular grid (uniform initialization)]

Macro Regions [Dev89]

  • Limited to empty space
  • Based on uniform grids

25

slide-93
SLIDE 93

Backup: Aggressive Optimizations

Partial expansion

  • Expand cells partially over their neighbors
  • Test primitives inside neighbor for intersection
  • Implemented in GitHub version
  • Additional +10-20% over merge + basic expansion

26

slide-94
SLIDE 94

References

  • [Cle+83] J. G. Cleary et al. “Design and analysis of a parallel ray tracing computer”. In: Graphics Interface ’83. 1983, pp. 33–38.
  • [Dev89] Olivier Devillers. “The Macro-Regions: An Efficient Space Subdivision Structure for Ray Tracing”. In: EG 1989-Technical Papers. Eurographics Association, 1989.
  • [KBS11] Javor Kalojanov, Markus Billeter, and Philipp Slusallek. “Two-Level Grids for Ray Tracing on GPUs”. In: EG 2011 - Full Papers. Ed. by Oliver Deussen and Min Chen. Llandudno, UK: Eurographics Association, 2011, pp. 307–314.
  • [SFD09] Martin Stich, Heiko Friedrich, and Andreas Dietrich. “Spatial splits in bounding volume hierarchies”. In: Proc. of High-Performance Graphics. 2009, pp. 7–13.

27