Real-Time Rendering (Echtzeitgraphik)
Michael Wimmer, wimmer@cg.tuwien.ac.at
Walking down the graphics pipeline
- Application → Geometry → Rasterizer
What for?
- Understanding the rendering pipeline is the key to real-time rendering!
- Insights into how things work
  - Understanding algorithms
- Insights into how fast things work
  - Performance
Vienna University of Technology 3
Simple Graphics Pipeline
- Often found in textbooks
- Will take a more detailed look into OpenGL
- Application → Geometry → Rasterizer → Display
Graphics Pipeline (pre DX10, OpenGL 2)
- Nowadays, every part of the pipeline is hardware accelerated
- Fragment: “pixel”, but with additional info (alpha, depth, stencil, …)
- Pipeline diagram:
  - CPU: Application, Driver
  - Geometry: Command, Geometry
  - Rasterizer: Rasterization, Texture, Fragment
  - Display
Fixed Function Pipeline – Dataflow View
- Memory types: on-chip cache memory, video memory, system memory
- CPU: commands, geometry
- Geometry path: pre-TnL cache → vertex shading (T&L) → post-TnL cache → triangle setup → rasterization
- Texture path: textures → texture cache
- Fragment shading and raster operations → frame buffer
DirectX 10 / OpenGL 3.2 Evolution
- Previous pipeline:
  - CPU: Application, Driver
  - Geometry: Command, Geometry
  - Rasterizer: Rasterization, Texture, Fragment
  - Display
- New stages: Input Assembler → Vertex Shader → Geometry Shader → Setup/Rasterization → Pixel Shader → Output Merger
- Memory resources: Vertex Buffer, Index Buffer, Texture, Stream Out Buffer, Depth, Color
OpenGL 3.0
- OpenGL 2.x is not as capable as DirectX 10
  - But: New features are vendor-specific extensions (geometry shaders, streams…)
  - GLSL a little more restrictive than HLSL (SM 3.0)
- OpenGL 3.0 did not clean up this mess!
  - OpenGL 2.1 + extensions
  - Geometry shaders are only an extension
  - New: deprecation mechanism
- OpenGL 4.x
  - New extensions
  - OpenGL ES compatibility!
DirectX 11 / OpenGL 4.0 Evolution
- Legend: fixed, programmable, memory
- Stages shown: Input Assembler, Vertex Shader, Control Point Shader, Tessellator, Geometry Shader, Stream Out, Setup, Rasterizer, Pixel Shader, Output Merger
- Shader resources: Constant, Sampler, Texture
- Memory: Vertex Buffer, Index Buffer, Texture, Stream Buffer, Render Target, Depth Stencil
DirectX 11
- Tessellation
  - At unexpected position!
- Compute shaders
- Multithreading
  - To reduce state change overhead
- Dynamic shader linking
- HDR texture compression
- Many other features...
DirectX 11 Pipeline
Application
- Generate database (scene description)
  - Usually only once
  - Load from disk
  - Build acceleration structures (hierarchy, …)
- Simulation (animation, AI, physics)
- Input event handlers
- Modify data structures
- Database traversal
- Primitive generation
- Shaders (vertex, geometry, fragment)
Driver
- Maintain graphics API state
- Command interpretation/translation
  - Host commands → GPU commands
- Handle data transfer
- Memory management
- Emulation of missing hardware features
- Usually huge overhead!
  - Significantly reduced in DX10
Geometry Stage
- Command → Vertex Processing → Tessellation → Primitive Assembly → Geometry Shading → Clipping → Perspective Division → Culling
Command
- Command buffering (!)
- Command interpretation
- Unpack and perform format conversion (“Input Assembler”)
- Example command stream (annotated on the slide: color, transformation matrix T):
    glLoadIdentity();
    glMultMatrixf(T);                 /* T: transformation matrix (16 floats) */
    glBegin(GL_TRIANGLE_STRIP);
    glColor3f(0.0, 0.5, 0.0);  glVertex3f(0.0, 0.0, 0.0);
    glColor3f(0.5, 0.0, 0.0);  glVertex3f(1.0, 0.0, 0.0);
    glColor3f(0.0, 0.5, 0.0);  glVertex3f(0.0, 1.0, 0.0);
    glColor3f(0.5, 0.0, 0.0);  glVertex3f(1.0, 1.0, 0.0);
    glEnd();
Vertex Processing
- Transformation
- Vertex in object coordinates → Modelview Matrix → eye coordinates → Projection Matrix → clip coordinates → Perspective Division → normalized device coordinates → Viewport Transform → window coordinates
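The last two steps of this transformation chain can be sketched in plain C. This is only an illustration, not code from the lecture: the type and function names (`Vec4`, `Vec3`, `perspective_divide`, `viewport_transform`) are made up here, and the viewport parameters are assumed to behave like those of `glViewport` with the default depth range [0, 1].

```c
#include <assert.h>

/* Homogeneous clip-space position, as produced by the projection matrix. */
typedef struct { float x, y, z, w; } Vec4;
/* Position after the divide (NDC) or after the viewport transform (window). */
typedef struct { float x, y, z; } Vec3;

/* Perspective division: clip coordinates -> normalized device coordinates. */
Vec3 perspective_divide(Vec4 clip)
{
    assert(clip.w != 0.0f);  /* assumption: clipping has removed w == 0 */
    Vec3 ndc = { clip.x / clip.w, clip.y / clip.w, clip.z / clip.w };
    return ndc;
}

/* Viewport transform: NDC in [-1, 1] -> window coordinates.
   (vx, vy) is the lower left corner of the viewport in pixels. */
Vec3 viewport_transform(Vec3 ndc, float vx, float vy, float width, float height)
{
    Vec3 win;
    win.x = vx + (ndc.x + 1.0f) * 0.5f * width;
    win.y = vy + (ndc.y + 1.0f) * 0.5f * height;
    win.z = (ndc.z + 1.0f) * 0.5f;  /* depth mapped to [0, 1] */
    return win;
}
```

For example, the clip-space point (2, -4, 0, 4) divides to the NDC point (0.5, -1, 0), which an 800x600 viewport at the origin maps to window position (600, 0) with depth 0.5.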
Vertex Processing
- Fixed function pipeline:
  - User has to provide matrices, the rest happens automatically
- Programmable pipeline:
  - User has to provide matrices/other data to shader
  - Shader code transforms vertex explicitly
  - We can do whatever we want with the vertex!
  - Usually a gl_ModelViewProjectionMatrix is provided
  - In GLSL shader: gl_Position = ftransform();
Vertex Processing
- Lighting
- Texture coordinate generation and/or transformation
- Vertex shading for special effects
- Object-space triangles → Screen-space lit triangles
Tessellation
- If just triangles, nothing needs to be done, otherwise:
  - Evaluation of polynomials for curved surfaces
  - Create vertices (tessellation)
- DirectX 11 specifies this in hardware!
  - 3 new shader stages!!!
  - Still not trivial (special algorithms required)
DirectX 11 Tessellation
Tessellation Example
- Optimally tessellated!
Geometry Shader
- Calculations on a primitive (triangle)
- Access to neighbor triangles
- Limited output (1024 32-bit values)
  - No general tessellation!
- Applications:
  - Render to cubemap
  - Shadow volume generation
  - Triangle extension for ray tracing
  - Extrusion operations (fur rendering)
Rest of Geometry Stage
- Primitive assembly
- Geometry shader
- Clipping (in homogeneous coordinates)
- Perspective division, viewport transform
- Culling
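Clipping in homogeneous coordinates means testing each vertex against -w <= x, y, z <= w before the perspective division. The sketch below shows only the classification step (trivial accept, trivial reject, or "may need clipping") using outcodes. It assumes the OpenGL clip volume convention, all names (`ClipVertex`, `clip_outcode`, `classify_triangle`) are illustrative, and the actual polygon clipping is left out.

```c
typedef struct { float x, y, z, w; } ClipVertex;

/* One bit per clip plane that the vertex lies outside of. */
enum {
    OUT_LEFT = 1, OUT_RIGHT = 2, OUT_BOTTOM = 4,
    OUT_TOP = 8,  OUT_NEAR = 16, OUT_FAR = 32
};

typedef enum { TRI_INSIDE, TRI_OUTSIDE, TRI_MAY_NEED_CLIPPING } ClipResult;

/* Tests the vertex against -w <= x, y, z <= w (OpenGL convention). */
unsigned clip_outcode(ClipVertex v)
{
    unsigned code = 0;
    if (v.x < -v.w) code |= OUT_LEFT;
    if (v.x >  v.w) code |= OUT_RIGHT;
    if (v.y < -v.w) code |= OUT_BOTTOM;
    if (v.y >  v.w) code |= OUT_TOP;
    if (v.z < -v.w) code |= OUT_NEAR;
    if (v.z >  v.w) code |= OUT_FAR;
    return code;
}

/* Conservative classification of a triangle against the view volume. */
ClipResult classify_triangle(ClipVertex a, ClipVertex b, ClipVertex c)
{
    unsigned oa = clip_outcode(a), ob = clip_outcode(b), oc = clip_outcode(c);
    if ((oa | ob | oc) == 0) return TRI_INSIDE;    /* all vertices inside */
    if ((oa & ob & oc) != 0) return TRI_OUTSIDE;   /* all outside one plane */
    return TRI_MAY_NEED_CLIPPING;                  /* straddles a plane */
}
```

The third result is conservative: a triangle whose vertices lie outside different planes may still miss the view volume entirely, which only the full clipping step would reveal.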
Rasterization Stage
- Triangle Setup
- Rasterization
- Fragment Processing
- Texture Processing
- Raster Operations
Rasterization
- Setup (per-triangle)
- Sampling (triangle = {fragments})
- Interpolation (interpolate colors and coordinates)
- Screen-space triangles → Fragments
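The interpolation step can be illustrated with barycentric weights computed from edge functions. This is a minimal sketch under simplifying assumptions: it interpolates linearly in screen space and omits the perspective correction that real hardware applies. The names (`Point2`, `edge_function`, `interpolate_attribute`) are made up for this example.

```c
#include <assert.h>

typedef struct { float x, y; } Point2;

/* Twice the signed area of triangle (a, b, c);
   positive if the vertices are in counter-clockwise order. */
float edge_function(Point2 a, Point2 b, Point2 c)
{
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

/* Interpolates one per-vertex attribute (e.g. a color channel) at sample
   position p, using barycentric weights. Screen-space linear only. */
float interpolate_attribute(Point2 v0, Point2 v1, Point2 v2,
                            float a0, float a1, float a2, Point2 p)
{
    float area = edge_function(v0, v1, v2);
    assert(area != 0.0f);  /* degenerate triangles are assumed to be culled */
    float w0 = edge_function(v1, v2, p) / area;  /* weight of v0 */
    float w1 = edge_function(v2, v0, p) / area;  /* weight of v1 */
    float w2 = edge_function(v0, v1, p) / area;  /* weight of v2 */
    return w0 * a0 + w1 * a1 + w2 * a2;
}
```

At a vertex the corresponding weight is 1 and the other two are 0, so the function reproduces the vertex attribute exactly there.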
Rasterization
- Sampling inclusion determination
- In tile order improves cache coherency
- Tile sizes vendor/generation specific
  - Old graphics cards: 16x64
  - New: 4x4
  - Smaller tile size favors conditionals in shaders
  - All tile fragments calculated in parallel on modern hardware
Rasterization – Coordinates
- Fragments represent “future” pixels
- Pixel (2,1): pixel center at (2.5, 1.5)!
- Origin (0.0, 0.0): lower left corner of the window
- Figure: pixel grid with x and y window coordinate axes, ticks from 0.0 to 3.0
Rasterization – Rules
- Separate rule for each primitive
- Non-ambiguous!
- Polygons:
  - Pixel center contained in polygon
  - On-edge pixels: only one is rasterized
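The polygon rule can be sketched as a pixel-center inclusion test. The code below is an illustration with made-up names (`Vec2`, `edge_value`, `covers_pixel`, `count_fragments`), and it assumes a counter-clockwise triangle. It deliberately treats a center that lies exactly on an edge as not covered, whereas real hardware applies a tie-breaking rule so that exactly one of two triangles sharing the edge produces the fragment.

```c
#include <stdbool.h>

typedef struct { float x, y; } Vec2;

/* Signed area test: positive if p lies to the left of the edge a -> b. */
float edge_value(Vec2 a, Vec2 b, Vec2 p)
{
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

/* True if the center of pixel (px, py), i.e. (px + 0.5, py + 0.5), lies
   strictly inside the counter-clockwise triangle (v0, v1, v2). */
bool covers_pixel(Vec2 v0, Vec2 v1, Vec2 v2, int px, int py)
{
    Vec2 center = { (float)px + 0.5f, (float)py + 0.5f };
    return edge_value(v0, v1, center) > 0.0f &&
           edge_value(v1, v2, center) > 0.0f &&
           edge_value(v2, v0, center) > 0.0f;
}

/* Brute-force sampling: counts the fragments generated for a triangle
   within a width x height window (no tiling, no bounding box). */
int count_fragments(Vec2 v0, Vec2 v1, Vec2 v2, int width, int height)
{
    int count = 0;
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            if (covers_pixel(v0, v1, v2, x, y))
                ++count;
    return count;
}
```

For the triangle (0,0), (5,0), (0,5), pixel (2,1) is covered because its center (2.5, 1.5) lies inside, while the pixels along the diagonal have centers exactly on the edge and are skipped by this simplified rule.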
Texture
- Texture “transformation” and projection
  - E.g., projective textures
- Texture address calculation (programmable in shader)
- Texture filtering
- Fragments → Texture Fragments
Fragment
- Texture operations (combinations, modulations, animations etc.)
- Texture Fragments → Textured Fragments
Raster Tests
- Ownership
  - Is pixel obscured by other window?
- Scissor test
  - Only render to scissor rectangle
- Depth test
  - Test according to z-buffer
- Alpha test
  - Test according to alpha-value
- Stencil test
  - Test according to stencil buffer
- Textured Fragments → Framebuffer Pixels
Raster Operations
- Blending or compositing
- Dithering
- Logical operations
- Textured Fragments → Framebuffer Pixels
Raster Operations
- After fragment color calculation (“Output Merger”)
- Fragment and associated data → Pixel Ownership Test → Scissor Test → Alpha Test → Stencil Test → Depth Test → Blending (RGBA only) → Dithering → Logicop → Frame Buffer
- Stencil Test uses the Stencil Buffer, Depth Test uses the Depth Buffer
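A reduced version of this chain can be written out in C to show the order of operations. The sketch keeps only the scissor test, alpha test, depth test and blending; ownership, stencil, dithering and logic ops are left out. The comparison functions (alpha "greater than reference", depth "less than") and the blend factors (source alpha, one minus source alpha) are assumptions chosen for illustration, and all names are hypothetical.

```c
#include <stdbool.h>

typedef struct { float r, g, b, a; } Color;
typedef struct { Color color; float depth; } Pixel;     /* framebuffer entry */
typedef struct { Color color; float depth; } Fragment;  /* incoming fragment */
typedef struct { int x, y, width, height; } ScissorRect;

/* Runs a fragment at window position (x, y) through scissor test,
   alpha test, depth test and blending, in that order.
   Returns true if the framebuffer pixel was updated. */
bool merge_fragment(Pixel *dst, Fragment src, int x, int y,
                    ScissorRect scissor, float alpha_ref)
{
    /* Scissor test: only render inside the scissor rectangle. */
    if (x < scissor.x || x >= scissor.x + scissor.width ||
        y < scissor.y || y >= scissor.y + scissor.height)
        return false;

    /* Alpha test: pass if alpha is greater than the reference value. */
    if (!(src.color.a > alpha_ref))
        return false;

    /* Depth test: pass if the fragment is closer than the stored depth. */
    if (!(src.depth < dst->depth))
        return false;

    /* Blending with factors (source alpha, one minus source alpha). */
    float a = src.color.a;
    dst->color.r = a * src.color.r + (1.0f - a) * dst->color.r;
    dst->color.g = a * src.color.g + (1.0f - a) * dst->color.g;
    dst->color.b = a * src.color.b + (1.0f - a) * dst->color.b;
    dst->color.a = a * src.color.a + (1.0f - a) * dst->color.a;
    dst->depth = src.depth;  /* depth write */
    return true;
}
```

Because blending reads the old pixel value and writes a new one, framebuffer traffic goes both ways for every fragment that passes the tests.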
Display
- Gamma correction
- Digital to analog conversion if necessary
- Framebuffer Pixels → Light
Display
- Frame buffer pixel format:
  - RGBA vs. index (obsolete)
  - Bits: 16, 32, 128 bit floating point, …
  - Double buffered vs. single buffered
  - Quad-buffered for stereo
  - Overlays (extra bit planes) for GUI
  - Auxiliary buffers: alpha, stencil
Functionality vs. Frequency
- Geometry processing = per-vertex
  - Transformation and Lighting (T&L)
  - Historically floating point, complex operations
  - Today: fully programmable flow control, texture lookup
  - 20-1500 million vertices per second
- Fragment processing = per-fragment
  - Blending and texture combination
  - Historically fixed point and limited operations
  - Up to 50 billion fragments (“Gigatexel”/sec)
  - Floating point, programmable complex operations
Computational Requirements
- Assume typical non-trivial fixed-function rendering task
  - 1 light, texture coordinates, projective texture mapping
  - 7 interpolants (z,r,g,b,s,t,q)
  - Trilinear filtering, texture-, color blending, depth buffering
- Rough estimate (operations per vertex and per fragment):
              ADD  CMP  MUL  DIV
    Vertex    102   30  108    5
    Fragment   66    9   70    1
- Pipeline stages: Application, Command, Geometry, Rasterization, Texture, Fragment, Display
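The per-element operation counts turn into throughput figures once they are multiplied by a processing rate. The following sketch does that arithmetic. The rates used in the example (20 Mvert/s and 1000 Mpix/s) are the ones quoted on the communication slide of this deck, and the struct and function names are hypothetical.

```c
/* Operation counts per processed element (vertex or fragment). */
typedef struct { int n_add, n_cmp, n_mul, n_div; } OpCounts;

/* Total number of operations per element. */
int total_ops(OpCounts c)
{
    return c.n_add + c.n_cmp + c.n_mul + c.n_div;
}

/* Giga-operations per second for a rate given in millions of
   elements per second. */
double gops(OpCounts per_element, double million_elements_per_second)
{
    return total_ops(per_element) * million_elements_per_second / 1000.0;
}
```

With the table values, a vertex costs 102 + 30 + 108 + 5 = 245 operations and a fragment 66 + 9 + 70 + 1 = 146. At 20 Mvert/s this gives about 4.9 Gops for vertices, and at 1000 Mpix/s about 146 Gops for fragments, which is consistent with the rounded figures of 5 Gops and 150 Gops.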
Communication Requirements
- Vertex size:
  - Position x,y,z
  - Normal x,y,z
  - Texture coordinate s,t
  - 8 × 4 = 32 bytes
- Texture:
  - Color r,g,b,a, 4 bytes
- Fragment size (in frame buffer):
  - Color r,g,b,a
  - Depth z (assume 32 bit)
  - 8 bytes, but goes both ways (because of blending!)
- Display:
  - Color r,g,b, 3 bytes
Communication Requirements
- Computation: Vertex 5 Gops, Fragment 150 Gops
- Pipeline stages: Application, Command, Geometry, Rasterization, Texture, Fragment, Display
- Geometry input: 20 Mvert/s, 0.640 GB/s
- Texture memory: 4 GB/s
- Fragments: 1000 Mpix/s
- Framebuffer: 16 GB/s
- Display: 120 Mpix/s
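The bandwidth figures follow from multiplying each rate by the element sizes given on the previous slide. The helper below reproduces that arithmetic. It is a sketch with a hypothetical function name, it takes 1 GB as 10^9 bytes, and it assumes one 4-byte texel per fragment for the texture figure, which the slides do not state explicitly.

```c
#include <assert.h>

/* Bandwidth in GB/s (1 GB = 1e9 bytes) for a rate given in millions of
   elements per second and a size in bytes per element. */
double bandwidth_gb_per_s(double million_elements_per_second,
                          double bytes_per_element)
{
    assert(million_elements_per_second >= 0.0 && bytes_per_element >= 0.0);
    return million_elements_per_second * 1e6 * bytes_per_element / 1e9;
}
```

Under these assumptions:
- Vertices: 20 Mvert/s × 32 bytes = 0.640 GB/s
- Texture: 1000 Mpix/s × 4 bytes = 4 GB/s
- Framebuffer: 1000 Mpix/s × 8 bytes × 2 (read and write because of blending) = 16 GB/s
- Display: 120 Mpix/s × 3 bytes = 0.36 GB/s (derived here, not a figure from the slide)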