
slide-1
SLIDE 1

Real-Time Rendering (Echtzeitgraphik)

Michael Wimmer wimmer@cg.tuwien.ac.at

slide-2
SLIDE 2

Walking down the graphics pipeline

Application → Geometry → Rasterizer

slide-3
SLIDE 3

What for?

Understanding the rendering pipeline is the key to real-time rendering!

  • Insights into how things work: understanding algorithms
  • Insights into how fast things work: performance

Vienna University of Technology 3

slide-4
SLIDE 4

Simple Graphics Pipeline

  • Often found in textbooks
  • We will take a more detailed look at OpenGL

Application → Geometry → Rasterizer → Display

slide-5
SLIDE 5

Graphics Pipeline (pre-DX10, OpenGL 2)

  • Nowadays, every part of the pipeline is hardware accelerated
  • Fragment: “pixel”, but with additional info (alpha, depth, stencil, …)

[Diagram: pipeline stages Application, Driver, Command, Geometry, Rasterization, Texture, Fragment, Display, grouped under the labels CPU, Geometry, and Rasterizer]

slide-6
SLIDE 6

Fixed Function Pipeline – Dataflow View

[Diagram: dataflow between CPU, system memory, video memory, and on-chip cache memory. Processing stages: vertex shading (T&L), triangle setup, rasterization, fragment shading and raster operations. Memory contents: geometry, commands, textures, frame buffer. Caches: pre-TnL cache, post-TnL cache, texture cache.]

slide-7
SLIDE 7

DirectX 10/OpenGL 3.2 Evolution

[Diagram: DirectX 10 pipeline. Stages: Input Assembler, Vertex Shader, Geometry Shader, Stream Out, Setup/Rasterization, Pixel Shader, Output Merger. Memory resources: Vertex Buffer, Index Buffer, Buffer, Texture, Depth, Color. Shown next to the earlier pipeline: Application, Driver (CPU), Command, Geometry, Rasterization, Texture, Fragment, Display.]

slide-8
SLIDE 8

OpenGL 3.0

OpenGL 2.x is not as capable as DirectX 10

  • But: new features are available as vendor-specific extensions (geometry shaders, streams, …)
  • GLSL is a little more restrictive than HLSL (SM 3.0)

OpenGL 3.0 did not clean up this mess!

  • OpenGL 2.1 + extensions
  • Geometry shaders are only an extension
  • New: deprecation mechanism

OpenGL 4.x

  • New extensions
  • OpenGL ES compatibility!

slide-9
SLIDE 9

DirectX 11/OpenGL 4.0 Evolution

[Diagram: DirectX 11 pipeline. Stages: Input Assembler, Vertex Shader, Control Point Shader, Tessellator, Geometry Shader, Stream Out, Setup, Rasterizer, Pixel Shader, Output Merger. Resources: Vertex Buffer, Index Buffer, Texture, Sampler, Constant, Stream Buffer, Render Target, Depth Stencil. Legend: memory, programmable, fixed.]

Annotation in the diagram: Not the final place in the pipeline!!!

slide-10
SLIDE 10

DirectX 11

  • Tessellation: at an unexpected position!
  • Compute shaders
  • Multithreading: to reduce state change overhead
  • Dynamic shader linking
  • HDR texture compression
  • Many other features...

slide-11
SLIDE 11

DirectX 11 Pipeline


slide-12
SLIDE 12

DirectX 12/Vulkan/AMD Mantle/Apple Metal

Reduce driver overhead

  • Indirect drawing
  • Pipeline state objects
  • Command lists/bundles
  • Partly possible already in OpenGL 4.3+

Other features

  • Conservative rasterization (for culling)
  • New blend modes
  • Order-independent transparency

slide-13
SLIDE 13

Application

Generate database (scene description)

  • Usually only once
  • Load from disk
  • Build acceleration structures (hierarchy, …)

Simulation (animation, AI, physics)
Input event handlers
Modify data structures
Database traversal
Shaders (vertex, geometry, fragment)

slide-14
SLIDE 14

Driver

Maintain graphics API state
Command interpretation/translation

  • Host commands → GPU commands

Handle data transfer
Memory management
Emulation of missing hardware features
Usually huge overhead!

  • Significantly reduced in DX10

slide-15
SLIDE 15

Geometry Stage

[Diagram: sub-stages Command, Vertex Processing, Tessellation, Geometry Shading, Primitive Assembly, Clipping, Perspective Division, Culling]

slide-16
SLIDE 16

Command

  • Command buffering (!)
  • Command interpretation
  • Unpack and perform format conversion (“Input Assembler”)

    glLoadIdentity( );
    glMultMatrixf( T );                  /* T: transformation matrix (16 floats) */
    glBegin( GL_TRIANGLE_STRIP );
        glColor3f ( 0.0, 0.5, 0.0 );     /* color of the next vertex */
        glVertex3f( 0.0, 0.0, 0.0 );
        glColor3f ( 0.5, 0.0, 0.0 );
        glVertex3f( 1.0, 0.0, 0.0 );
        glColor3f ( 0.0, 0.5, 0.0 );
        glVertex3f( 0.0, 1.0, 0.0 );
        glColor3f ( 0.5, 0.0, 0.0 );
        glVertex3f( 1.0, 1.0, 0.0 );
    glEnd( );

slide-17
SLIDE 17

Vertex Processing: Transformation

[Diagram: coordinate spaces of a vertex]

object → (Modelview Matrix) → eye → (Projection Matrix) → clip → (Perspective Division) → normalized device → (Viewport Transform) → window
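The chain of coordinate spaces on this slide can be followed with a small sketch. This is only an illustration under stated assumptions: the helper names (`mat_vec`, `translate`, `perspective`, `to_window`) are hypothetical, matrices are stored row-major, and the projection matrix uses the layout commonly associated with `gluPerspective`.

```python
import math

def mat_vec(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def translate(tx, ty, tz):
    """Translation matrix, used here as a simple modelview matrix."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def perspective(fovy_deg, aspect, near, far):
    """OpenGL-style perspective projection matrix (assumed layout)."""
    f = 1.0 / math.tan(math.radians(fovy_deg) / 2.0)
    return [[f / aspect, 0.0, 0.0, 0.0],
            [0.0, f, 0.0, 0.0],
            [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
            [0.0, 0.0, -1.0, 0.0]]

def to_window(obj, modelview, projection, viewport):
    """Walk one vertex through object, eye, clip, NDC and window space."""
    eye = mat_vec(modelview, obj)                 # object -> eye
    clip = mat_vec(projection, eye)               # eye -> clip
    ndc = [clip[i] / clip[3] for i in range(3)]   # perspective division
    x0, y0, w, h = viewport
    win = [x0 + (ndc[0] + 1.0) * w / 2.0,         # viewport transform
           y0 + (ndc[1] + 1.0) * h / 2.0,
           (ndc[2] + 1.0) / 2.0]                  # depth mapped to [0, 1]
    return eye, clip, ndc, win

eye, clip, ndc, win = to_window([0.0, 0.0, 0.0, 1.0],
                                translate(0.0, 0.0, -5.0),
                                perspective(90.0, 1.0, 1.0, 10.0),
                                (0, 0, 640, 480))
print(win)
```

A vertex at the object-space origin, moved 5 units in front of the camera, lands in the middle of the 640x480 viewport with a depth between 0 and 1.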

slide-18
SLIDE 18

Vertex Processing

Fixed function pipeline:

  • User has to provide matrices, the rest happens automatically

Programmable pipeline:

  • User has to provide matrices/other data to shader
  • Shader code transforms vertex explicitly
  • We can do whatever we want with the vertex!
  • Usually a gl_ModelViewProjectionMatrix is provided
  • In GLSL shader: gl_Position = ftransform();

slide-19
SLIDE 19

Vertex Processing

  • Lighting
  • Texture coordinate generation and/or transformation
  • Vertex shading for special effects

[Diagram: Object-space triangles → T → Screen-space lit triangles]

slide-20
SLIDE 20

Tessellation

If just triangles, nothing needs to be done, otherwise:

  • Evaluation of polynomials for curved surfaces
  • Create vertices (tessellation)

DirectX 11 specifies this in hardware!

  • 3 new shader stages!!!
  • Still not trivial (special algorithms required)
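To illustrate "evaluation of polynomials" followed by "create vertices", here is a minimal sketch for the one-dimensional case: a Bézier curve evaluated with de Casteljau's algorithm at uniformly spaced parameters. The slides do not say which surface type or algorithm is used; the curve, the uniform sampling, and the function names are assumptions made for illustration.

```python
def lerp(p, q, t):
    """Linear interpolation between two points given as tuples."""
    return tuple((1.0 - t) * a + t * b for a, b in zip(p, q))

def de_casteljau(control_points, t):
    """Evaluate a Bezier curve at parameter t by repeated interpolation."""
    pts = list(control_points)
    while len(pts) > 1:
        pts = [lerp(pts[i], pts[i + 1], t) for i in range(len(pts) - 1)]
    return pts[0]

def tessellate_curve(control_points, segments):
    """Create segments + 1 vertices along the curve (uniform in t)."""
    return [de_casteljau(control_points, i / segments)
            for i in range(segments + 1)]

control = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
for vertex in tessellate_curve(control, 4):
    print(vertex)
```

A surface patch works the same way with two parameters; choosing how many vertices to create where is the part that remains non-trivial.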

slide-21
SLIDE 21

DirectX 11 Tessellation

[Diagram labels: control shader, evaluation shader]

slide-22
SLIDE 22

Tessellation Example

[Figure: Optimally tessellated!]

slide-23
SLIDE 23

Geometry Shader

  • Calculations on a primitive (triangle)
  • Access to neighbor triangles
  • Limited output (1024 32-bit values)
    → No general tessellation!

Applications:

  • Render to cubemap
  • Shadow volume generation
  • Triangle extension for ray tracing
  • Extrusion operations (fur rendering)

slide-24
SLIDE 24

Rest of Geometry Stage

  • Primitive assembly
  • Geometry shader
  • Clipping (in homogeneous coordinates)
  • Perspective division, viewport transform
  • Culling
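Two of these steps can be sketched in a few lines: the inside test used for clipping in homogeneous coordinates (before the perspective division), and back-face culling via the signed area of the projected triangle. The function names are hypothetical, and the culling test assumes counter-clockwise front faces, which is the OpenGL default.

```python
def inside_clip_volume(clip):
    """True if a clip-space point satisfies -w <= x, y, z <= w.

    Points with negative w (behind the eye) fail automatically,
    which is why the test is done before dividing by w.
    """
    x, y, z, w = clip
    return all(-w <= c <= w for c in (x, y, z))

def signed_area_2d(a, b, c):
    """Signed area of a triangle in window coordinates (positive = CCW)."""
    return 0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))

def is_back_facing(a, b, c):
    """Assumes counter-clockwise triangles are front-facing."""
    return signed_area_2d(a, b, c) < 0.0

print(inside_clip_volume((2.0, 0.0, 0.0, 3.0)))        # inside: |x| <= w
print(is_back_facing((0, 0), (0, 1), (1, 0)))           # clockwise order
```

A real clipper also has to cut triangles that straddle the volume boundary; this sketch only classifies individual points.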

slide-25
SLIDE 25

Rasterization Stage

[Diagram: sub-stages Triangle Setup, Rasterization, Texture Processing, Fragment Processing, Raster Operations]

slide-26
SLIDE 26

Rasterization

  • Setup (per-triangle)
  • Sampling (triangle = {fragments})
  • Interpolation (interpolate colors and coordinates)

[Diagram: Screen-space triangles → Fragments]
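The three steps (setup, sampling, interpolation) can be sketched with edge functions and barycentric weights. This is a simplified illustration, not how any particular GPU implements it: it tests every pixel of a small window, interpolates linearly in screen space (no perspective correction), and treats pixels exactly on an edge as covered. All names are hypothetical.

```python
def edge(a, b, p):
    """Edge function: twice the signed area of triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def rasterize(v0, v1, v2, c0, c1, c2, width, height):
    """Return (x, y, color) fragments for a counter-clockwise triangle."""
    fragments = []
    area = edge(v0, v1, v2)              # setup: done once per triangle
    if area <= 0:
        return fragments                 # degenerate or clockwise
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)       # sampling at the pixel center
            w0 = edge(v1, v2, p)
            w1 = edge(v2, v0, p)
            w2 = edge(v0, v1, p)
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                b0, b1, b2 = w0 / area, w1 / area, w2 / area
                # interpolation: weight the vertex colors
                color = tuple(b0 * a + b1 * b + b2 * c
                              for a, b, c in zip(c0, c1, c2))
                fragments.append((x, y, color))
    return fragments

frags = rasterize((0, 0), (4, 0), (0, 4),
                  (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), 4, 4)
print(len(frags), frags[0])
```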

slide-27
SLIDE 27

Rasterization

  • Sampling: inclusion determination
  • In tile order: improves cache coherency
  • Tile sizes are vendor/generation specific
      Old graphics cards: 16x64
      New: 4x4
  • Smaller tile size favors conditionals in shaders
  • All tile fragments are calculated in parallel on modern hardware

slide-28
SLIDE 28

Rasterization – Coordinates

Fragments represent “future” pixels

[Diagram: pixel grid with x and y window coordinates running from 0.0 to 3.0, origin at the lower left corner of the window. Pixel (2,1) has its pixel center at (2.5, 1.5)!]

slide-29
SLIDE 29

Rasterization – Rules

  • Separate rule for each primitive
  • Non-ambiguous!
  • Polygons:
      Pixel center contained in polygon
      On-edge pixels: only one is rasterized
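The "only one is rasterized" rule for on-edge pixels can be illustrated with a tie-breaking test on the edge direction. The slides do not state which rule a given API or GPU uses; the sketch below assumes a top-left style convention for counter-clockwise triangles in a coordinate system with y pointing up, and all function names are hypothetical.

```python
def edge(a, b, p):
    """Edge function: positive if p lies to the left of the edge a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def owns_boundary(a, b):
    """Tie-break: an edge keeps points lying exactly on it only if it
    runs downward, or is horizontal and runs toward negative x.
    A shared edge has opposite directions in the two triangles,
    so exactly one of them passes this test."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    return dy < 0 or (dy == 0 and dx < 0)

def covers(tri, p):
    """Coverage test for a counter-clockwise triangle."""
    for i in range(3):
        a, b = tri[i], tri[(i + 1) % 3]
        e = edge(a, b, p)
        if e < 0 or (e == 0 and not owns_boundary(a, b)):
            return False
    return True

def coverage_counts(triangles, width, height):
    """Count how many triangles produce a fragment for each pixel."""
    return {(x, y): sum(covers(t, (x + 0.5, y + 0.5)) for t in triangles)
            for y in range(height) for x in range(width)}

# A 4x4 square split along its diagonal; the pixel centers (0.5, 0.5),
# (1.5, 1.5), ... lie exactly on the shared edge.
lower = [(0, 0), (4, 0), (4, 4)]
upper = [(0, 0), (4, 4), (0, 4)]
counts = coverage_counts([lower, upper], 4, 4)
print(set(counts.values()))
```

With the rule in place every pixel of the square is produced exactly once, so blending across the shared edge does not double up.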

slide-30
SLIDE 30

Texture

  • Texture “transformation” and projection
      E.g., projective textures
  • Texture address calculation (programmable in shader)
  • Texture filtering

[Diagram labels: Fragments, Texture, Fragments]
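Address calculation and filtering can be sketched for the bilinear case. This is an illustration under assumptions the slides do not spell out: a single-channel texture stored as a list of rows, texel centers at half-integer positions, and clamp-to-edge addressing. The function name is hypothetical.

```python
import math

def bilinear_sample(texture, s, t):
    """Sample a 2D texture at normalized coordinates (s, t) in [0, 1]."""
    height, width = len(texture), len(texture[0])
    # Address calculation: map (s, t) to texel space, centers at +0.5
    x = s * width - 0.5
    y = t * height - 0.5
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0

    def texel(ix, iy):
        # Clamp-to-edge addressing (assumed wrap mode)
        ix = min(max(ix, 0), width - 1)
        iy = min(max(iy, 0), height - 1)
        return texture[iy][ix]

    # Filtering: blend the four nearest texels
    row0 = (1.0 - fx) * texel(x0, y0) + fx * texel(x0 + 1, y0)
    row1 = (1.0 - fx) * texel(x0, y0 + 1) + fx * texel(x0 + 1, y0 + 1)
    return (1.0 - fy) * row0 + fy * row1

tex = [[0.0, 1.0],
       [2.0, 3.0]]
print(bilinear_sample(tex, 0.5, 0.5))
```

Trilinear filtering, mentioned later in the deck, would repeat this on two mipmap levels and blend the two results.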

slide-31
SLIDE 31

Fragment

  • Texture operations (combinations, modulations, animations etc.)

[Diagram labels: Fragments, Texture Fragments, Textured Fragments]

slide-32
SLIDE 32

Raster Tests

  • Ownership: Is pixel obscured by other window?
  • Scissor test: Only render to scissor rectangle
  • Depth test: Test according to z-buffer
  • Alpha test: Test according to alpha value
  • Stencil test: Test according to stencil buffer

[Diagram: Textured Fragments → Framebuffer Pixels]

slide-33
SLIDE 33

Raster Operations

  • Blending or compositing
  • Dithering
  • Logical operations

[Diagram: Textured Fragments → Framebuffer Pixels]

slide-34
SLIDE 34

Raster Operations

After fragment color calculation (“Output Merger”)

[Diagram: Fragment and associated data → Pixel Ownership Test → Scissor Test → Alpha Test → Stencil Test → Depth Test → Blending (RGBA only) → Dithering → Logicop → Frame Buffer. The Stencil Test and Depth Test access the Stencil Buffer and Depth Buffer.]
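Part of this sequence can be sketched as a single function. It is a deliberately reduced illustration: pixel ownership, stencil test, dithering and logic ops are left out, and the test functions are fixed to common choices (alpha greater than a reference, depth less than stored, source-alpha blending). These choices and all names are assumptions, not something the slides prescribe.

```python
def process_fragment(frag, framebuffer, depthbuffer, scissor, alpha_ref=0.0):
    """Run one fragment through scissor, alpha and depth tests, then blend.

    frag is a dict with keys x, y, z and rgba; returns True if the
    fragment was written to the frame buffer.
    """
    x, y = frag["x"], frag["y"]
    sx, sy, sw, sh = scissor
    # Scissor test: only render inside the scissor rectangle
    if not (sx <= x < sx + sw and sy <= y < sy + sh):
        return False
    r, g, b, a = frag["rgba"]
    # Alpha test (assumed function: pass if alpha > reference)
    if not a > alpha_ref:
        return False
    # Depth test (assumed function: pass if closer than stored depth)
    if not frag["z"] < depthbuffer[y][x]:
        return False
    depthbuffer[y][x] = frag["z"]
    # Blending: this step reads the old color, so data goes both ways
    dr, dg, db = framebuffer[y][x]
    framebuffer[y][x] = (a * r + (1.0 - a) * dr,
                         a * g + (1.0 - a) * dg,
                         a * b + (1.0 - a) * db)
    return True

fb = [[(0.0, 0.0, 0.0)] * 2 for _ in range(2)]
zb = [[1.0] * 2 for _ in range(2)]
frag = {"x": 0, "y": 0, "z": 0.5, "rgba": (1.0, 0.0, 0.0, 0.5)}
print(process_fragment(frag, fb, zb, (0, 0, 2, 2)), fb[0][0])
```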

slide-35
SLIDE 35

Display

  • Gamma correction
  • Digital to analog conversion if necessary

[Diagram: Framebuffer Pixels → Light]
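Gamma correction can be sketched as a simple power law. The exponent 2.2 is an assumption (a common display gamma); the slides give no value, and real displays often use the piecewise sRGB curve instead. The function names are hypothetical.

```python
def gamma_encode(linear, gamma=2.2):
    """Map a linear intensity in [0, 1] to a gamma-corrected value."""
    return linear ** (1.0 / gamma)

def to_8bit(linear, gamma=2.2):
    """Clamp, gamma-correct and quantize to an 8-bit display value."""
    clamped = min(max(linear, 0.0), 1.0)
    return round(gamma_encode(clamped, gamma) * 255)

for value in (0.0, 0.5, 1.0):
    print(value, to_8bit(value))
```

Half the linear intensity ends up well above the middle of the 8-bit range, which is the point of the correction.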

slide-36
SLIDE 36

Display

  • Frame buffer pixel format: RGBA vs. index (obsolete)
  • Bits: 16, 32, 128 bit floating point, …
  • Double buffered vs. single buffered
  • Quad-buffered for stereo
  • Overlays (extra bit planes) for GUI
  • Auxiliary buffers: alpha, stencil

slide-37
SLIDE 37

Functionality vs. Frequency

Geometry processing = per-vertex

  • Transformation and Lighting (T&L)
  • Historically floating point, complex operations
  • Today: fully programmable flow control, texture lookup
  • 20-1500 million vertices per second

Fragment processing = per-fragment

  • Blending and texture combination
  • Historically fixed point and limited operations
  • Up to 50 billion fragments per second (“Gigatexel”/sec)
  • Floating point, programmable complex operations

slide-38
SLIDE 38

Computational Requirements

Assume typical non-trivial fixed-function rendering task

  • 1 light, texture coordinates, projective texture mapping
  • 7 interpolants (z,r,g,b,s,t,q)
  • Trilinear filtering, texture-, color blending, depth buffering

Rough estimate (operations per vertex and per fragment):

              ADD   CMP   MUL   DIV
  Vertex      102    30   108     5
  Fragment     66     9    70     1

[Diagram: pipeline stages Application, Command, Geometry, Rasterization, Texture, Fragment, Display]

slide-39
SLIDE 39

Communication Requirements

Vertex size:

  • Position x,y,z
  • Normal x,y,z
  • Texture coordinate s,t
    → 8 × 4 = 32 bytes

Texture:

  • Color r,g,b,a, 4 bytes

Display:

  • Color r,g,b, 3 bytes

Fragment size (in frame buffer):

  • Color r,g,b,a
  • Depth z (assume 32 bit)
    → 8 bytes, but goes both ways (because of blending!)

slide-40
SLIDE 40

Communication Requirements

[Diagram: pipeline stages Application, Command, Geometry, Rasterization, Texture, Fragment, Display, annotated with rates]

  • Vertex processing: 5 Gops at 20 Mvert/s, vertex data 0.640 GB/s
  • Fragment processing: 150 Gops at 1000 Mpix/s
  • Texture memory: 4 GB/s
  • Framebuffer: 16 GB/s
  • Display: 120 Mpix/s, 0.36 GB/s
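The figures on this slide can be reproduced from the per-vertex and per-fragment operation counts and the data sizes given on the two preceding slides. The sketch below is just that arithmetic; the constant names are hypothetical, and the texture figure assumes one 4-byte texel fetched from memory per fragment, which is consistent with the 4 GB/s shown but is not stated in the slides.

```python
OPS_PER_VERTEX = 102 + 30 + 108 + 5     # ADD + CMP + MUL + DIV = 245
OPS_PER_FRAGMENT = 66 + 9 + 70 + 1      # ADD + CMP + MUL + DIV = 146

VERTEX_RATE = 20e6        # vertices per second
FRAGMENT_RATE = 1000e6    # fragments per second
DISPLAY_RATE = 120e6      # displayed pixels per second

VERTEX_BYTES = 8 * 4      # position + normal + texcoord, 4 bytes each
TEXEL_BYTES = 4           # r,g,b,a
FRAGMENT_BYTES = 8        # color + 32-bit depth
DISPLAY_BYTES = 3         # r,g,b

def estimate():
    """Return compute (Gops) and bandwidth (GB/s) estimates."""
    return {
        "vertex_gops": OPS_PER_VERTEX * VERTEX_RATE / 1e9,
        "fragment_gops": OPS_PER_FRAGMENT * FRAGMENT_RATE / 1e9,
        "vertex_gb_s": VERTEX_BYTES * VERTEX_RATE / 1e9,
        # Assumption: one texel read from memory per fragment
        "texture_gb_s": TEXEL_BYTES * FRAGMENT_RATE / 1e9,
        # Read and write, because blending needs the old value
        "framebuffer_gb_s": 2 * FRAGMENT_BYTES * FRAGMENT_RATE / 1e9,
        "display_gb_s": DISPLAY_BYTES * DISPLAY_RATE / 1e9,
    }

for name, value in estimate().items():
    print(f"{name}: {value:.2f}")
```

The results (4.9 and 146 Gops) round to the 5 and 150 Gops on the slide; the bandwidth values match the slide directly.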