$I(x, x') = g(x, x') \left[ \epsilon(x, x') + \int_S \rho(x, x', x'') \, I(x', x'') \, dx'' \right]$
INFOMAGR - Advanced Graphics
Jacco Bikker - November 2017 - February 2018
Lecture 4 - Real-time Ray Tracing
Welcome!
Advanced Graphics - Real-time Ray Tracing
Cost Breakdown for Ray Tracing:
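Mind scalability as well as constant cost. Example: a scene consisting of 1k spheres and 4 light sources, diffuse materials, rendered to 1M pixels, requires $1M \times 5 \times 1k = 5 \cdot 10^9$ ray/primitive intersections per frame: each pixel needs 5 rays (1 primary ray plus 4 shadow rays), and each ray is tested against all 1k spheres. (Multiply by the desired frame rate for real-time.)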
Using the BVH:
Example: for the same scene (1k spheres, 4 light sources, diffuse materials, 1M pixels), a ray now needs roughly $\log_2(1k) \approx 10$ tests: $1M \times 5 \times 10 = 5 \cdot 10^7$ ray/(primitive or node) intersections per frame. (Multiply by the desired frame rate for real-time.)
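The two estimates above can be reproduced with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and the BVH figure assumes a balanced tree with about log2(N) tests per ray, which real scenes only approximate.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Rays per frame: one primary ray plus one shadow ray per light, for every pixel.
inline std::uint64_t RaysPerFrame( std::uint64_t pixels, std::uint64_t lights )
{
    return pixels * (1 + lights);
}

// Brute force: every ray is tested against every primitive.
inline std::uint64_t BruteForceTests( std::uint64_t rays, std::uint64_t prims )
{
    return rays * prims;
}

// With a BVH: assume about log2(N) node/primitive tests per ray (balanced-tree estimate).
inline std::uint64_t BVHTests( std::uint64_t rays, std::uint64_t prims )
{
    assert( prims > 0 );
    return rays * (std::uint64_t)std::llround( std::log2( (double)prims ) );
}
```

For 1M pixels, 4 lights and 1k spheres this gives 5M rays per frame, and the gap between the two counts is the factor of 100 seen in the examples.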
Reality Check
Performance is now OK, but we're not quite ready to render a game world.
Cost of ray tracing: dominated by memory access cost.
Primary rays: for a tile of pixels, these are organized in a narrow frustum. All rays share a common origin.
Shadow rays: for point lights, shadow rays also tend to travel close together. When traced from the light source, they too have a common origin.
Secondary rays: reflected and refracted rays tend to diverge significantly.
Coherence
Primary rays and shadow rays for point lights are coherent:
Our problem: Ray tracing cost is dominated by memory latency. Solution: Amortize cost of fetching data over multiple rays.
Coherent Ray Tracing*
SIMD: four rays for the price of one.
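void BVHNode::Traverse( Ray r )
{
    if (!r.Intersects( bounds )) return;    // the ray misses this node: skip the subtree
    if (isleaf())
    {
        IntersectPrimitives();
    }
    else
    {
        pool[left].Traverse( r );           // the two children are adjacent in the node pool
        pool[left + 1].Traverse( r );
    }
}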
*: Interactive Rendering with Coherent Ray Tracing, Wald et al., 2001
For a packet of four rays (Ray4), the traversal code is unchanged apart from the argument type:
void BVHNode::Traverse( Ray4 r4 )
{
    if (!r4.Intersects( bounds )) return;   // none of the four rays hits this node
    if (isleaf())
    {
        IntersectPrimitives();
    }
    else
    {
        pool[left].Traverse( r4 );
        pool[left + 1].Traverse( r4 );
    }
}
Coherent Ray Tracing
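Ray packet traversal: if any ray in the packet intersects a node, the whole packet traverses it; in a leaf, every ray is tested against each primitive in the leaf.
Masking: a per-ray mask keeps track of which rays actually intersect a node.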
Traversal with a ray mask:
void BVHNode::Traverse( Ray4 r4, bool4 mask4 )
{
    bool4 hit4 = r4.Intersects( bounds ) & mask4;   // rays that hit this node and were still active
    if (none( hit4 )) return;                       // no active ray hits this node
    if (isleaf())
    {
        IntersectPrimitives();
    }
    else
    {
        pool[left].Traverse( r4, hit4 );            // children only see the rays that hit the parent
        pool[left + 1].Traverse( r4, hit4 );
    }
}
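To make the masked test concrete, here is a minimal portable sketch of a four-ray slab test that returns a 4-bit hit mask. The `Ray4` layout (structure of arrays, reciprocal directions in `rD`) and the names `Intersects4` and `ActiveHits` are assumptions for illustration, not the lecture's types; a real implementation would replace the lane loop with SSE instructions.

```cpp
#include <algorithm>
#include <cassert>

struct AABB { float bmin[3], bmax[3]; };

// Four rays in structure-of-arrays layout: O = origins, rD = reciprocal directions,
// t = current nearest hit distance per ray. Index order is [axis][lane].
struct Ray4 { float O[3][4], rD[3][4], t[4]; };

// Slab test for all four lanes; bit i of the result is set if ray i hits the box.
inline unsigned Intersects4( const Ray4& r, const AABB& b )
{
    unsigned hit = 0;
    for (int i = 0; i < 4; i++)
    {
        float tmin = 0.0f, tmax = r.t[i];
        for (int a = 0; a < 3; a++)
        {
            float t1 = (b.bmin[a] - r.O[a][i]) * r.rD[a][i];
            float t2 = (b.bmax[a] - r.O[a][i]) * r.rD[a][i];
            tmin = std::max( tmin, std::min( t1, t2 ) );
            tmax = std::min( tmax, std::max( t1, t2 ) );
        }
        if (tmax >= tmin) hit |= 1u << i;
    }
    return hit;
}

// Masking: only rays that were active on entry (mask4) can be active in this node.
inline unsigned ActiveHits( const Ray4& r, const AABB& b, unsigned mask4 )
{
    assert( mask4 < 16u );
    return Intersects4( r, b ) & mask4;
}
```

A node is skipped when `ActiveHits` returns 0, which corresponds to `none( hit4 )` in the traversal code.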
Coherent Ray Tracing
Results:
Overhead: when only some of the rays need to visit a node or test a primitive, all four rays perform this operation.
Large Packets*
Cost of memory access can be amortized over more rays by using larger packets. Note that a naïve approach will lead to significant overhead.
We therefore add a frustum test to rapidly reject BVH nodes: if the packet frustum does not intersect the node AABB, we discard the node. The cost of this operation is independent of the number of rays in the packet.
Likewise, a node is traversed as soon as we find that a ray intersects it. This is also independent of packet size.
*: Large Ray Packets for Real-time Whitted Ray Tracing, Overbeck et al., 2008
Large Packets
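Algorithm:
1. Test the first active ray against the node AABB; if it hits, traverse the node.
2. Otherwise, test the packet frustum against the node AABB; if it misses, discard the node.
3. Otherwise, test the remaining rays until a ray that intersects the node is found. This step yields a new first active ray index.
void BVHNode::Traverse( RayPacket& rp, int first )
{
    if (!Intersects( rp[first] ))                       // 1: the first active ray misses the node
    {
        if (!Intersects( rp.frustum )) return;          // 2: the whole packet misses the node
        first = FindFirstActive( rp, first + 1 );       // 3: scan for the first ray that hits
    }
    if (first < rp.rayCount)
    {
        if (isleaf())
        {
            IntersectPrimitives( rp );
        }
        else
        {
            left.Traverse( rp, first );
            right.Traverse( rp, first );
        }
    }
}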
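The three steps can be isolated from the BVH itself. The sketch below assumes a hypothetical callback `hitsNode( i )` that performs the ray/AABB test for ray `i`, and a precomputed `frustumHitsNode` flag; both stand in for the `Intersects` calls above and are not part of the lecture's code.

```cpp
#include <cassert>
#include <functional>

// Step 3: scan forward from 'first' until a ray hits the node.
// Returns rayCount if no remaining ray hits.
inline int FindFirstActive( int rayCount, int first, const std::function<bool(int)>& hitsNode )
{
    while (first < rayCount && !hitsNode( first )) first++;
    return first;
}

// Steps 1-3 for a single node. Returns the new first active ray index,
// or rayCount if the node can be skipped.
inline int EnterNode( int rayCount, int first, bool frustumHitsNode,
                      const std::function<bool(int)>& hitsNode )
{
    assert( first >= 0 && first < rayCount );
    if (hitsNode( first )) return first;                        // 1: early hit
    if (!frustumHitsNode) return rayCount;                      // 2: early miss
    return FindFirstActive( rayCount, first + 1, hitsNode );    // 3: linear scan
}
```

In the common coherent case step 1 succeeds immediately, so the cost per node does not grow with the packet size.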
Large Packets
Details:
Frustum Construction
Method 1, for primary rays: planes are easily defined using the corner rays:
$N_1 = (p_0 - E) \times (p_1 - p_0)$, $d_1 = N_1 \cdot E$
$N_2 = (p_1 - E) \times (p_2 - p_1)$, $d_2 = N_2 \cdot E$
$N_3 = (p_2 - E) \times (p_3 - p_2)$, $d_3 = N_3 \cdot E$
$N_4 = (p_3 - E) \times (p_0 - p_3)$, $d_4 = N_4 \cdot E$
Here $E$ is the common ray origin and $p_0 \ldots p_3$ are points on the four corner rays.
Note: for secondary rays, we will not have a common origin, nor corner rays.
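A minimal sketch of method 1 in code, assuming a bare-bones `Vec3` type and hypothetical helper names (`BuildFrustum`, `InsideFrustum`). The sign of the plane test depends on the order in which the corners are listed; with the counter-clockwise order used in the comment below, points inside the frustum give non-negative values.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };
inline Vec3 Sub( Vec3 a, Vec3 b ) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
inline Vec3 Cross( Vec3 a, Vec3 b )
{
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}
inline float Dot( Vec3 a, Vec3 b ) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Plane { Vec3 N; float d; };

// planes[i]: N = (p[i] - E) x (p[i+1] - p[i]), d = N . E, with indices wrapping around.
inline void BuildFrustum( Vec3 E, const Vec3 p[4], Plane planes[4] )
{
    for (int i = 0; i < 4; i++)
    {
        Vec3 a = p[i], b = p[(i + 1) & 3];
        planes[i].N = Cross( Sub( a, E ), Sub( b, a ) );
        planes[i].d = Dot( planes[i].N, E );
    }
}

// For corners ordered (-x,-y), (+x,-y), (+x,+y), (-x,+y) as seen from E looking
// along +z, a point is inside when N . x - d >= 0 for all four planes.
inline bool InsideFrustum( const Plane planes[4], Vec3 x )
{
    for (int i = 0; i < 4; i++)
        if (Dot( planes[i].N, x ) - planes[i].d < 0) return false;
    return true;
}
```

Testing a node AABB against the frustum then reduces to testing its corners (or its most positive corner per plane) against these four planes.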
Frustum Construction
Method 2, for shadow rays:
$D_t = \sum_{i=0}^{N} D_i$; choose axis $\hat{a}$ as the largest component of $D_t$; $\hat{u}$ and $\hat{v}$ are the other axes.
Frustum corners: $(u_{min}, v_{min})$, $(u_{max}, v_{min})$, $(u_{max}, v_{max})$, $(u_{min}, v_{max})$.
Note: this still requires a common origin.
Frustum Construction
Method 3, for generic rays:
Define a far plane $P_{far}$, orthogonal to $\hat{a}$, at location $a_{far}$, and a near plane $P_{near}$, orthogonal to $\hat{a}$, at location $a_{near}$, which is obtained from the AABB over the ray origins.
Intersect the rays with $P_{far}$ and $P_{near}$ to obtain the extents $u_{min}^{near}, u_{max}^{near}, v_{min}^{near}, v_{max}^{near}$ and $u_{min}^{far}, u_{max}^{far}, v_{min}^{far}, v_{max}^{far}$.
The four corner rays are then:
$(u_{min}^{far}, v_{min}^{far}) - (u_{min}^{near}, v_{min}^{near})$,
$(u_{max}^{far}, v_{min}^{far}) - (u_{max}^{near}, v_{min}^{near})$,
$(u_{max}^{far}, v_{max}^{far}) - (u_{max}^{near}, v_{max}^{near})$,
$(u_{min}^{far}, v_{max}^{far}) - (u_{min}^{near}, v_{max}^{near})$.
Ray Order
The order of the rays in a packet is important. We keep track of the first active ray: in this case the green dot. We thus enter the node with 61 rays, while only 12 rays actually intersect the node. Keeping track of the last ray helps somewhat.
Ray Order
Overhead can be reduced by numbering rays in each quadrant sequentially.
For the general case, Morton order is optimal.
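Morton order numbers the rays of a tile by interleaving the bits of their pixel coordinates, which keeps each quadrant (and each sub-quadrant) contiguous. A common bit-twiddling sketch for 16-bit coordinates follows; the function names are illustrative.

```cpp
#include <cassert>
#include <cstdint>

// Spread the lower 16 bits of v so that bit i moves to bit 2i.
inline std::uint32_t Part1By1( std::uint32_t v )
{
    v &= 0x0000ffff;
    v = (v | (v << 8)) & 0x00ff00ff;
    v = (v | (v << 4)) & 0x0f0f0f0f;
    v = (v | (v << 2)) & 0x33333333;
    v = (v | (v << 1)) & 0x55555555;
    return v;
}

// Morton index of pixel (x, y): x occupies the even bits, y the odd bits.
inline std::uint32_t MortonIndex( std::uint32_t x, std::uint32_t y )
{
    assert( x < 65536u && y < 65536u );
    return Part1By1( x ) | (Part1By1( y ) << 1);
}
```

For an 8x8 tile, the indices 0..15 cover the first 4x4 quadrant, 16..31 the next one, and so on, so a single first-active-ray index skips whole quadrants at once.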
Divergent Rays: Partition Traversal
int PartitionRays( int ia )
{
    int ie = 0;                                 // number of active rays found so far
    for (int i = 0; i < ia; i++)                // ia: number of rays that were active on entry
        if (ray[idx[i]].IntersectsAABB())
            swap( idx[ie++], idx[i] );          // move the index of the active ray to the front
    return ie;                                  // idx[0..ie-1] now refer to the active rays
}
Figure: 15 rays and the array of ray indices, shown over five steps. Rays 2, 5, 6, 9 and 11 intersect the node; each swap moves the next active index to the front, so that the index array ends up starting with 2 5 6 9 11.
Partition traversal gathers active rays in a contiguous list. This comes at the price of some overhead:
In practice, this method is suitable for ray distributions where large gaps in the ray set are to be expected.
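The partition step can be tried in isolation. In this sketch the ray/AABB test is replaced by a lookup in a hypothetical `hit` table (one flag per ray), so only the index shuffling is shown; `idx`, `ia` and the returned count play the same roles as in the code above.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Moves the indices of all rays that hit the node to the front of idx[0..ia-1].
// hit[r] stands in for the ray/AABB test of ray r. Returns the number of active rays.
inline int PartitionRays( std::vector<int>& idx, int ia, const std::vector<bool>& hit )
{
    assert( ia >= 0 && ia <= (int)idx.size() );
    int ie = 0;
    for (int i = 0; i < ia; i++)
        if (hit[idx[i]])
            std::swap( idx[ie++], idx[i] );
    return ie;
}
```

Because the returned count becomes `ia` for the child nodes, rays that missed a node are never tested again further down that subtree.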
Optimization: Recursion
The recursion can be replaced by a local stack:
struct Stack { BVHNode* node; int first; };
Stack stack[STACKSIZE];
stack[0].node = GetBVHRoot();
stack[0].first = 0;
int stackPtr = 1;
while (stackPtr > 0)
{
    BVHNode* node = stack[--stackPtr].node;     // pop the next node to visit ...
    int first = stack[stackPtr].first;          // ... together with its first active ray index
    ...
}
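A complete, runnable version of the stack loop needs a concrete tree, so the sketch below uses a deliberately simplified stand-in: nodes bound 1D intervals instead of AABBs, and a "ray" is a single position that hits a node when it lies inside the interval. The frustum rejection step is omitted, and `Node`, `StackEntry` and `TraversePacket` are illustrative names.

```cpp
#include <cassert>
#include <vector>

struct Node { float lo, hi; int left, right; };     // left < 0 marks a leaf
struct StackEntry { int node; int first; };

// Returns the leaves reached by at least one ray of the packet, in visiting order.
inline std::vector<int> TraversePacket( const std::vector<Node>& pool, const std::vector<float>& rays )
{
    std::vector<int> visitedLeaves;
    StackEntry stack[64];                           // assumes the tree is at most 64 levels deep
    int stackPtr = 0;
    stack[stackPtr++] = { 0, 0 };                   // root node, first active ray 0
    const int rayCount = (int)rays.size();
    while (stackPtr > 0)
    {
        StackEntry e = stack[--stackPtr];
        const Node& n = pool[e.node];
        int first = e.first;
        // advance to the first ray at or after 'first' that hits this node
        while (first < rayCount && (rays[first] < n.lo || rays[first] > n.hi)) first++;
        if (first == rayCount) continue;            // no ray hits: skip the subtree
        if (n.left < 0) { visitedLeaves.push_back( e.node ); continue; }
        assert( stackPtr + 2 <= 64 );
        stack[stackPtr++] = { n.right, first };     // pushed first, so visited last
        stack[stackPtr++] = { n.left, first };
    }
    return visitedLeaves;
}
```

Each stack entry carries its own `first` value, which is what the recursive version passed as a function argument.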
Optimization: SIMD
We can still use SIMD to test four rays at once.
Note: for AVX, replace "four" by "eight".
Results
Compared to 2x2 SIMD packet traversal, ranged traversal improves primary and shadow rays by ~3.5x. Note that ray divergence has a large impact on performance.
1    16.85 (100%)    25.11 (100%)    18.44 (100%)
2    11.61 (69%)     18.83 (75%)     12.93 (70%)
3     6.98 (41%)     12.56 (50%)      7.48 (41%)
4     3.85 (23%)      7.71 (31%)      3.87 (21%)
Ray Tracing Animated Scenes
Covered so far:
Not covered:
Combining BVHs
Two BVHs can be combined into a single BVH, by simply adding a new root node pointing to the two BVHs.
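Combining two BVHs amounts to one extra node whose bounds enclose both roots. The sketch assumes that both trees live in the same node pool and that nodes store explicit child indices; this layout and the names `Union` and `CombineBVHs` are assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct AABB { float bmin[3], bmax[3]; };
struct BVHNode { AABB bounds; int left, right; };   // child indices into the pool, -1 for none

// Smallest box that encloses both a and b.
inline AABB Union( const AABB& a, const AABB& b )
{
    AABB r;
    for (int i = 0; i < 3; i++)
    {
        r.bmin[i] = std::min( a.bmin[i], b.bmin[i] );
        r.bmax[i] = std::max( a.bmax[i], b.bmax[i] );
    }
    return r;
}

// Appends a new root that points to the two existing roots; returns its index.
inline int CombineBVHs( std::vector<BVHNode>& pool, int rootA, int rootB )
{
    assert( rootA >= 0 && rootA < (int)pool.size() );
    assert( rootB >= 0 && rootB < (int)pool.size() );
    BVHNode root;
    root.bounds = Union( pool[rootA].bounds, pool[rootB].bounds );
    root.left = rootA;
    root.right = rootB;
    pool.push_back( root );
    return (int)pool.size() - 1;
}
```

Neither of the two original trees is modified, which is why this operation is cheap enough to repeat every frame.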
Scene Graph
Figure: example scene graph rooted at "world", with nodes labeled car, wheel, turret, plane, buggy and dude.
If our application uses a scene graph, we can construct a BVH for each scene graph node. The BVH for each node is built using an appropriate construction algorithm:
The extra nodes used to combine these BVHs into a single BVH are known as the Top-level BVH.
Rigid Motion
Applying rigid motion to a BVH:
Rigid motion is achieved by transforming the rays by the inverse transform upon entering the sub-BVH.
(This applies to any rigid transform, not only to translation.)
The Top-level BVH - Construction
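Input: list of axis aligned bounding boxes for transformed scene graph nodes.
Algorithm:
1. Find the two nodes in the list whose combined AABB has the smallest surface area.
2. Replace them by a new parent node that points to both.
3. Repeat until a single node remains.
Note: algorithmic complexity is $O(N^3)$.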
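A brute-force agglomerative build can be sketched as follows: every round scans all pairs for the smallest merged surface area, which gives the cubic cost noted above. The types and the name `BuildTopLevel` are assumptions, and the exact pairing criterion used in the lecture may differ in detail.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct AABB { float bmin[3], bmax[3]; };
struct TLNode { AABB bounds; int left, right; };    // left == right == -1 for a leaf

inline AABB Union( const AABB& a, const AABB& b )
{
    AABB r;
    for (int i = 0; i < 3; i++)
    {
        r.bmin[i] = std::min( a.bmin[i], b.bmin[i] );
        r.bmax[i] = std::max( a.bmax[i], b.bmax[i] );
    }
    return r;
}

inline float SurfaceArea( const AABB& b )
{
    float ex = b.bmax[0] - b.bmin[0], ey = b.bmax[1] - b.bmin[1], ez = b.bmax[2] - b.bmin[2];
    return 2.0f * (ex * ey + ey * ez + ez * ex);
}

// 'nodes' initially holds only leaves; parents are appended. Returns the root index.
inline int BuildTopLevel( std::vector<TLNode>& nodes )
{
    assert( !nodes.empty() );
    std::vector<int> list( nodes.size() );
    for (std::size_t i = 0; i < list.size(); i++) list[i] = (int)i;
    while (list.size() > 1)
    {
        std::size_t bestA = 0, bestB = 1;
        float bestArea = SurfaceArea( Union( nodes[list[0]].bounds, nodes[list[1]].bounds ) );
        for (std::size_t a = 0; a < list.size(); a++)
            for (std::size_t b = a + 1; b < list.size(); b++)
            {
                float area = SurfaceArea( Union( nodes[list[a]].bounds, nodes[list[b]].bounds ) );
                if (area < bestArea) { bestArea = area; bestA = a; bestB = b; }
            }
        TLNode parent;
        parent.bounds = Union( nodes[list[bestA]].bounds, nodes[list[bestB]].bounds );
        parent.left = list[bestA];
        parent.right = list[bestB];
        nodes.push_back( parent );
        list[bestA] = (int)nodes.size() - 1;        // the parent replaces A in the list ...
        list.erase( list.begin() + bestB );         // ... and B is removed (bestB > bestA)
    }
    return list[0];
}
```

Since the input is one box per scene graph node rather than one per triangle, N is usually small enough for this to be practical.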
The Top-level BVH - Faster Construction*
Algorithm:
Node* A = list.GetFirst();
Node* B = list.FindBestMatch( A );
while (list.size() > 1)
{
    Node* C = list.FindBestMatch( B );
    if (A == C)                             // A and B are each other's best match: merge them
    {
        list.Remove( A );
        list.Remove( B );
        A = new Node( A, B );
        list.Add( A );
        B = list.FindBestMatch( A );
    }
    else A = B, B = C;                      // otherwise continue the search from B
}
*: Fast Agglomerative Clustering for Rendering, Walter et al., 2008
The Top-level BVH - Traversal
The leaves of the top-level BVH contain the sub-BVHs. When a ray intersects such a leaf, it is transformed by the inverted transform matrix of the sub-BVH. After this, it traverses the sub-BVH. Once the sub-BVH has been traversed, we transform the ray again, this time by the transform matrix of the sub-BVH. For efficiency, we store the inverted matrix with the sub-BVH root.
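The ray transform on entering a sub-BVH can be sketched with a small row-major 4x4 matrix type. Points use w = 1 and directions use w = 0, so a translation moves the ray origin but leaves its direction alone. `Mat4`, `ToLocal` and `Translation` are hypothetical names used only for this example.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };
struct Mat4 { float m[16]; };                       // row-major 4x4 matrix
struct Ray { Vec3 O, D; };

// Transform a position (w = 1): rotation and translation both apply.
inline Vec3 TransformPoint( const Mat4& M, Vec3 p )
{
    return { M.m[0] * p.x + M.m[1] * p.y + M.m[2]  * p.z + M.m[3],
             M.m[4] * p.x + M.m[5] * p.y + M.m[6]  * p.z + M.m[7],
             M.m[8] * p.x + M.m[9] * p.y + M.m[10] * p.z + M.m[11] };
}

// Transform a direction (w = 0): the translation column is ignored.
inline Vec3 TransformVector( const Mat4& M, Vec3 v )
{
    return { M.m[0] * v.x + M.m[1] * v.y + M.m[2]  * v.z,
             M.m[4] * v.x + M.m[5] * v.y + M.m[6]  * v.z,
             M.m[8] * v.x + M.m[9] * v.y + M.m[10] * v.z };
}

// Bring a world-space ray into the local space of a sub-BVH.
inline Ray ToLocal( const Ray& world, const Mat4& invTransform )
{
    return { TransformPoint( invTransform, world.O ), TransformVector( invTransform, world.D ) };
}

inline Mat4 Translation( float tx, float ty, float tz )
{
    return { { 1, 0, 0, tx,  0, 1, 0, ty,  0, 0, 1, tz,  0, 0, 0, 1 } };
}
```

For an object placed at x = 10, the stored inverse is `Translation( -10, 0, 0 )`; a world-space ray starting at (10, 0, -5) then starts at (0, 0, -5) in the object's own space. Applying the forward matrix to the local ray restores the original.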
The Top-level BVH - Summary
The top-level BVH enables complex animated scenes:
sub-BVHs, with a transform matrix and its inverse;
Combined, this allows for efficient maintenance of a global BVH.
The Quest for Real-time Ray Tracing
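Tracing primary, shadow and reflection rays using packets:
cores    Mrays/s    Mrays/s    Mrays/s
1         35         37         38
2         70         74         75
6        211        221        225
(Intel Xeon, 3.4 GHz)
1024x768 @ 30Hz: 9.3 rays per pixel. (1024 x 768 x 30 is about 23.6M pixels per second; at roughly 220 Mrays/s on six cores this leaves about 9.3 rays per pixel.)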
BVH Construction
Assignment 2: add an acceleration structure to your framework.
Good BVHs:
Traversal:
Programming Language
Deadline: Thursday December 28, 23:59. Use the Submit system.
Deliverables:
How to get a passing grade:
How to score an 8:
How to score a 10: Pick more than one advanced topic.