April 4-7, 2016 | Silicon Valley www.esi-group.com
Andreas Mank (Andreas.Mank@esi-group.com) Team Leader Visualization, ESI Group Markus Tavenrath (mtavenrath@nvidia.com) Senior Developer Technology Engineer, NVIDIA 04/04/2016
ESI RENDERING INNOVATIONS WITH NVIDIA DESIGNWORKS
Dynamic scenes
Intuitive interaction
Immersion
Reliable behavior
High quality
Every time and everywhere
Distributed hardware
On demand
High Performance Rendering
Remote Rendering
Interactive Ray Tracing
Physically-Based Rendering
Abstract Material Definition
Hybrid Rendering
https://developer.nvidia.com/designworks
RAY TRACER RASTERIZER
GLOBAL ILLUMINATION WITH OPTIX
PLATFORM AS A SERVICE WITH GRID
Modern OpenGL features
Modern shader features with GLSL
Not CPU-bound with shaders
Not CPU-bound with complex scene graphs
Efficient updates for dynamic geometries
2 FPS 20 FPS
[Chart: frames per second, SceniX vs. RiX; categories: Dynamic Nodes, Materials, Fixed; data labels: 20, 360, 60, 120]
[Charts: framerate, CPU load, and GPU load for low-end, mid-range, and high-end Quadro]
SceniX 7 used a dirty bit/renderlist cache scheme for rendering
Scene graph: G0 T0 T1 T2 S1 S2 G1 T3 S0
material layer: M0 M1
transform layer: T0 T2 T3
geometry layer: S0 S1 S2
We had a few cases where a rebuild could be avoided
Incremental updates were still not fast enough
Profiling revealed multiple bottlenecks in the renderer

for (auto& material : materials)                  // HashMap -> pointer chasing
  if (cam->isVisible(material)) {                 // virtual function call -> pointer chasing
    processMaterial();                            // virtual function call
    for (auto& transform : material.transforms)   // HashMap
      if (cam->isVisible(transform)) {            // virtual function call
        processTransform();                       // virtual function call
        for (auto& shape : transform.shapes)      // HashMap
          if (cam->isVisible(shape))              // virtual function call (20% time)
            process(shape);                       // switch(OC) -> branch misprediction
      }
  }
SceniX 6 -> SceniX 7 got up to 6x faster each iteration if draw-call limited
Still so many bottlenecks in our SceneGraph rendering
Our partners like ESI needed just a fast renderer, not a SceneGraph
SceneGraph -> SceneGraph -> Rendering mostly worked out
Took a lot of resources and wasted CPU time due to the additional layer
A research platform was required to find out how to resolve all those bottlenecks
NV PRO Pipeline was born
Focus on CPU-efficient rendering without any compatibility restrictions
Developers who want to write an OpenGL renderer face one problem:
OpenGL has a million ways to do the same thing. What's the best way?
Parameters: uniforms, UBOs, SSBOs
Geometry: immediate mode, display list, VBO/IBO, VAO, VAB
Bindless (for parameters and for geometry)
Combinatorial explosion
How to abstract all the differences in an efficient way?
S0 T0 M0 S1 T2 M1 S2 T3 M1 S2 T3 M1 S1 T2' M1
Monitor
S2 S1 S1 S2 S0
render(group of objects)
render(group of objects, order)
RiX API to abstract rendering of groups of objects
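The two render calls above can be sketched roughly as follows. This is only an illustration of the idea, not the dp::rix API: `Object`, `RenderGroup`, `DrawLog`, and both `render` overloads are hypothetical names, and the backend is replaced by a log of object indices so the example stays free of OpenGL calls.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical object record: indices into shape, transform and material tables.
struct Object {
    int shape;      // S0, S1, ...
    int transform;  // T0, T1, ...
    int material;   // M0, M1, ...
};

// Hypothetical flat group of objects, replacing scene graph traversal.
class RenderGroup {
public:
    std::size_t add(const Object& object) {
        objects_.push_back(object);
        return objects_.size() - 1;
    }
    const std::vector<Object>& objects() const { return objects_; }

private:
    std::vector<Object> objects_;
};

// Stand-in for a backend: the sequence of object indices that would be drawn.
using DrawLog = std::vector<std::size_t>;

// render(group of objects): the renderer chooses the order. This sketch sorts
// by material, then by transform, so that state changes are grouped.
DrawLog render(const RenderGroup& group) {
    const std::vector<Object>& objs = group.objects();
    DrawLog order(objs.size());
    for (std::size_t i = 0; i < order.size(); ++i) order[i] = i;
    std::stable_sort(order.begin(), order.end(),
                     [&objs](std::size_t a, std::size_t b) {
                         if (objs[a].material != objs[b].material)
                             return objs[a].material < objs[b].material;
                         return objs[a].transform < objs[b].transform;
                     });
    return order;
}

// render(group of objects, order): the caller dictates the draw order.
DrawLog render(const RenderGroup& group, const std::vector<std::size_t>& order) {
    for (std::size_t index : order) {
        assert(index < group.objects().size());  // every index must be in the group
        (void)index;
    }
    return order;
}
```

Because the group is a flat array, a backend is free to pick any of the parameter or geometry techniques listed earlier without the caller noticing.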
SceneGraph [dp::sg]
RiX [dp::rix]
How to get from SceneGraph to a group with incremental updates?
G0 T0 T1 T2 S1 S2 G1 T3 S0
Referenced twice
S0 T0 M0 S1 T2 M1 S2 T3 M1 S2 T3 M1 S1 T2' M1
SceneGraph [dp::sg] RiX::GL [dp::rix::gl]
G0 T0 T1 T2 S1 S2 G1 T3 S0
SceneTree [dp::sg::xbar] Renderer [dp::sg::rdr::rix::gl]
G0 T0 T1 T2 S1 S2 G1 T3 S0 T2' S1' S2' G1' T3'
Events Translate
S1 T2 S2 T3 S0 T0 S1' T1' S2' T2'
Needs to be done by your application if not using the reference SceneGraph
S1 T2 S2 T3 S0 T0 S1' T1' S2' T2'
Events
Events are fully incremental
Basic pipeline ready: SceneGraph -> group of objects -> RiX
Next step: support for GLSL
Problem: uniforms, UBOs, SSBOs, and different GLSL versions all required a different shader header
We needed a material system, independent of the SceneGraph
Material system [dp::fx] was born
Interface allows enumeration of materials (shader pipelines) and the corresponding set of parameter groups
Allows multiple backends in parallel: XML (public), MDL (on request)
Material system can generate shaders for all parameter techniques: uniforms, UBOs, SSBOs
-> write the shader only once
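To make the "write the shader only once" idea concrete, here is a minimal sketch of generating the GLSL parameter declarations for one parameter group under each technique. It is not the dp::fx interface: `ParameterTechnique`, `Parameter`, `ParameterGroup`, and `generateHeader` are hypothetical names, and the std140/std430 layouts are an assumption made for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of one shader parameter, e.g. {"vec4", "diffuse"}.
struct Parameter {
    std::string type;
    std::string name;
};

// Hypothetical parameter group, e.g. all parameters of one material.
struct ParameterGroup {
    std::string name;
    std::vector<Parameter> parameters;
};

enum class ParameterTechnique { Uniform, UBO, SSBO };

// Emits the GLSL declarations for a group. Blocks are declared without an
// instance name, so the shader body refers to each parameter by its plain
// name regardless of the technique chosen.
std::string generateHeader(const ParameterGroup& group, ParameterTechnique technique) {
    assert(!group.name.empty());  // block-based techniques need a block name
    std::string members;
    for (const Parameter& p : group.parameters)
        members += "  " + p.type + " " + p.name + ";\n";

    switch (technique) {
    case ParameterTechnique::UBO:
        return "layout(std140) uniform " + group.name + " {\n" + members + "};\n";
    case ParameterTechnique::SSBO:
        return "layout(std430) buffer " + group.name + " {\n" + members + "};\n";
    case ParameterTechnique::Uniform:
        break;
    }
    // Plain uniforms: one declaration per parameter, no enclosing block.
    std::string out;
    for (const Parameter& p : group.parameters)
        out += "uniform " + p.type + " " + p.name + ";\n";
    return out;
}
```

The generated header is prepended to a shader body that only ever mentions `diffuse`, so the same body compiles against all three headers.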
Efficient pipeline with another ~6x speedup over SceniX 7 for draw-call limited scenes
Achieves 6-7 million draw calls/s on a 2.4 GHz system when using bindless
Started with new features:
Frustum culling
TransformTree extraction from SceneTree
SceneGraph [dp::sg] RiX::GL [dp::rix::gl] SceneTree [dp::sg::xbar] Renderer [dp::sg::rdr::rix::gl]
Frustum culling is important to reduce the number of draw calls per frame
Don't render hidden objects
NV PRO Pipeline has an efficient frustum culling system (10k objects get culled in ~100 µs)
Works on groups and returns the delta since the last call
[dp::culling] is the module
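A culling pass that works on a group and returns only the delta since the last call might look like the sketch below. It does not reproduce dp::culling: `Plane`, `Sphere`, `CullingDelta`, and `CullingGroup` are hypothetical, bounding spheres are an assumption chosen for brevity, and the code makes no attempt to reach the performance quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plane with unit normal (nx, ny, nz); points with n.p + d >= 0 are inside.
struct Plane { float nx, ny, nz, d; };

// Bounding sphere of one object (an assumption; boxes would work as well).
struct Sphere { float x, y, z, radius; };

// Only the objects whose visibility changed since the previous cull() call.
struct CullingDelta {
    std::vector<std::size_t> becameVisible;
    std::vector<std::size_t> becameHidden;
};

class CullingGroup {
public:
    std::size_t add(const Sphere& bounds) {
        assert(bounds.radius >= 0.0f);
        bounds_.push_back(bounds);
        visible_.push_back(0);  // objects start hidden until the first cull
        return bounds_.size() - 1;
    }

    // Tests every object against the frustum planes and reports the changes.
    CullingDelta cull(const std::vector<Plane>& frustum) {
        CullingDelta delta;
        for (std::size_t i = 0; i < bounds_.size(); ++i) {
            const bool now = isVisible(bounds_[i], frustum);
            const bool before = visible_[i] != 0;
            if (now == before) continue;  // unchanged objects cost no output
            (now ? delta.becameVisible : delta.becameHidden).push_back(i);
            visible_[i] = now ? 1 : 0;
        }
        return delta;
    }

private:
    static bool isVisible(const Sphere& s, const std::vector<Plane>& frustum) {
        for (const Plane& p : frustum) {
            const float distance = p.nx * s.x + p.ny * s.y + p.nz * s.z + p.d;
            if (distance < -s.radius) return false;  // fully behind this plane
        }
        return true;
    }

    std::vector<Sphere> bounds_;
    std::vector<char> visible_;  // char instead of bool to avoid vector<bool>
};
```

Returning a delta means the renderer only touches the objects that entered or left the view, which is cheap when the camera moves little between frames.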
TransformTree is responsible for computing the world transform of each object
Currently tightly bound to xbar, which translates from SceneGraph to Renderer
Working on TransformTree as an independent module
Currently ~15M transforms/s on CPU and up to 300M transforms/s on GPU
For more information visit my talk: S6131 - Nvpro-Pipeline: Handling Massive Transform Updates in a SceneGraph, Tuesday, 14:30 – 14:55
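The world-transform computation can be illustrated with a flat array of nodes that store a parent index. This is a CPU-only sketch under stated assumptions, not the nvpro-pipeline TransformTree: `Mat4`, `TransformNode`, and `computeWorldTransforms` are hypothetical, and nodes are assumed to be stored so that every parent precedes its children.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Mat4 = std::array<float, 16>;  // row-major 4x4 matrix

Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};  // zero-initialized accumulator
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i * 4 + j] += a[i * 4 + k] * b[k * 4 + j];
    return r;
}

Mat4 translation(float x, float y, float z) {
    Mat4 m = {{1.0f, 0.0f, 0.0f, x,
               0.0f, 1.0f, 0.0f, y,
               0.0f, 0.0f, 1.0f, z,
               0.0f, 0.0f, 0.0f, 1.0f}};
    return m;
}

// Hypothetical flattened tree node; parent == -1 marks a root.
struct TransformNode {
    int parent;
    Mat4 local;
};

// world[i] = world[parent] * local[i]. One linear pass is enough because
// parents are stored before their children.
std::vector<Mat4> computeWorldTransforms(const std::vector<TransformNode>& nodes) {
    std::vector<Mat4> world(nodes.size());
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        const int parent = nodes[i].parent;
        assert(parent < static_cast<int>(i));  // enforces the storage order
        world[i] = parent < 0
                       ? nodes[i].local
                       : multiply(world[static_cast<std::size_t>(parent)], nodes[i].local);
    }
    return world;
}
```

Nodes on the same tree level do not depend on each other, which is one reason this kind of update lends itself to parallel execution.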
NV PRO PIPELINE is our open source research rendering pipeline; it's not a product
Demonstrates techniques to reduce the CPU cost of rendering
Shows that big speedups are possible when leaving traditional SceneGraph traversal
ESI is proof that the concepts work in real-world applications
Working on modularization so that even more modules can be used in other projects
Interested? Grab your copy here: https://developer.nvidia.com/nvidia-pro-pipeline
ICIDO
RiX::GL Transform
Viewer
RASTERIZER RAY TRACER
VCA OptiX Culling Multi-Cast
ICIDO VRify
MDL SDK COMPOSITER GRID SDK dp::fx
OPENGL OPTIX VULKAN