Real-Time Rendering (Echtzeitgraphik)



SLIDE 1

Real-Time Rendering (Echtzeitgraphik)

Michael Wimmer
wimmer@cg.tuwien.ac.at

SLIDE 2

Walking down the graphics pipeline

Application → Geometry → Rasterizer

SLIDE 3

What for?

Understanding the rendering pipeline is the key to real-time rendering!

  • Insights into how things work
      • Understanding algorithms
  • Insights into how fast things work
      • Performance

Vienna University of Technology 3

SLIDE 4

Simple Graphics Pipeline

  • Often found in textbooks
  • Will take a more detailed look into OpenGL

Application → Geometry → Rasterizer → Display

SLIDE 5

Graphics Pipeline (pre DX10, OpenGL 2)

  • Nowadays, every part of the pipeline is hardware accelerated
  • Fragment: “pixel”, but with additional info (alpha, depth, stencil, …)

Pipeline diagram (stages in order, grouped as in the figure):
  • CPU: Application, Driver
  • Geometry: Command, Geometry
  • Rasterizer: Rasterization, Texture, Fragment
  • Display

SLIDE 6

Fixed Function Pipeline – Dataflow View

Dataflow diagram:
  • Memory types: on-chip cache memory, video memory, system memory
  • CPU → commands
  • geometry → pre-TnL cache → vertex shading (T&L) → post-TnL cache → triangle setup → rasterization
  • textures → texture cache → fragment shading and raster operations → frame buffer

SLIDE 7

DirectX 10 / OpenGL 3.2 Evolution

DirectX 10 pipeline stages and the memory resources they access:
  • Input Assembler: Vertex Buffer, Index Buffer
  • Vertex Shader: Texture
  • Geometry Shader: Texture, Stream Out (Stream Buffer)
  • Setup/Rasterization
  • Pixel Shader: Texture
  • Output Merger: Depth, Color

Shown next to the earlier pipeline:
  • CPU: Application, Driver
  • Geometry: Command, Geometry
  • Rasterizer: Rasterization, Texture, Fragment
  • Display

SLIDE 8

OpenGL 3.0

  • OpenGL 2.x is not as capable as DirectX 10
      • But: New features are vendor specific extensions (geometry shaders, streams…)
      • GLSL a little more restrictive than HLSL (SM 3.0)
  • OpenGL 3.0 did not clean up this mess!
      • OpenGL 2.1 + extensions
      • Geometry shaders are only an extension
      • New: deprecation mechanism
  • OpenGL 4.x
      • New extensions
      • OpenGL ES compatibility!

SLIDE 9

DirectX 11/OpenGL 4.0 Evolution

Pipeline diagram (legend: fixed, programmable, memory):
  • Stages: Input Assembler, Vertex Shader, Control Point Shader, Tessellator, Geometry Shader, Stream Out, Setup, Rasterizer, Pixel Shader, Output Merger
  • Inputs to the shader stages: Constant, Sampler, Texture
  • Memory: Vertex Buffer, Index Buffer, Texture, Stream Buffer, Render Target, Depth Stencil

SLIDE 10

DirectX 11

  • Tessellation
      • At unexpected position!
  • Compute Shaders
  • Multithreading
      • To reduce state change overhead
  • Dynamic shader linking
  • HDR texture compression
  • Many other features...

SLIDE 11

DirectX 11 Pipeline


SLIDE 12

Application

  • Generate database (Scene description)
      • Usually only once
      • Load from disk
      • Build acceleration structures (hierarchy, …)
  • Simulation (Animation, AI, Physics)
  • Input event handlers
  • Modify data structures
  • Database traversal
  • Primitive generation
  • Shaders (vertex, geometry, fragment)

SLIDE 13

Driver

  • Maintain graphics API state
  • Command interpretation/translation
      • Host commands → GPU commands
  • Handle data transfer
  • Memory management
  • Emulation of missing hardware features
  • Usually huge overhead!
      • Significantly reduced in DX10

SLIDE 14

Geometry Stage

Command → Vertex Processing → Tessellation → Primitive Assembly → Geometry Shading → Clipping → Perspective Division → Culling

SLIDE 15

Command

  • Command buffering (!)
  • Command interpretation
  • Unpack and perform format conversion (“Input Assembler”)

    glLoadIdentity( );
    glMultMatrix( T );
    glBegin( GL_TRIANGLE_STRIP );
    glColor3f ( 0.0, 0.5, 0.0 );
    glVertex3f( 0.0, 0.0, 0.0 );
    glColor3f ( 0.5, 0.0, 0.0 );
    glVertex3f( 1.0, 0.0, 0.0 );
    glColor3f ( 0.0, 0.5, 0.0 );
    glVertex3f( 0.0, 1.0, 0.0 );
    glColor3f ( 0.5, 0.0, 0.0 );
    glVertex3f( 1.0, 1.0, 0.0 );
    glEnd( );

State labeled in the figure: Color, Transformation matrix T

SLIDE 16

Vertex Processing

Transformation

vertex: object → (Modelview Matrix) → eye → (Projection Matrix) → clip → (Perspective Division) → normalized device → (Viewport Transform) → window
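
The chain of coordinate systems on this slide can be followed numerically. The following is a minimal Python sketch, assuming row-major 4x4 matrices applied to column vectors and a depth range of [0, 1]; the helper names `mat_vec` and `to_window` and the toy projection matrix are hypothetical and not part of the slides.

```python
def mat_vec(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def to_window(v_obj, modelview, projection, viewport):
    """Object -> eye -> clip -> normalized device -> window coordinates."""
    eye = mat_vec(modelview, v_obj)
    clip = mat_vec(projection, eye)
    # Perspective division: divide x, y, z by the clip-space w component.
    ndc = [clip[i] / clip[3] for i in range(3)]
    x0, y0, width, height = viewport
    # Viewport transform: map [-1, 1] to pixel units, depth to [0, 1].
    return (x0 + (ndc[0] + 1.0) * width / 2.0,
            y0 + (ndc[1] + 1.0) * height / 2.0,
            (ndc[2] + 1.0) / 2.0)


IDENTITY = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
# Toy projection (assumption): copies -z_eye into w, so far points shrink.
TOY_PROJECTION = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, -1, 0]]

window = to_window([1.0, 1.0, -2.0, 1.0], IDENTITY, TOY_PROJECTION,
                   (0, 0, 640, 480))
```

With these inputs the clip-space w is 2, so the vertex lands at normalized device coordinates (0.5, 0.5, -1) before the viewport transform scales it to the 640x480 window.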

SLIDE 17

Vertex Processing

  • Fixed function pipeline:
      • User has to provide matrices, the rest happens automatically
  • Programmable pipeline:
      • User has to provide matrices/other data to shader
      • Shader code transforms vertex explicitly
      • We can do whatever we want with the vertex!
      • Usually a gl_ModelViewProjectionMatrix is provided
      • In GLSL shader: gl_Position = ftransform();

SLIDE 18

Vertex Processing

  • Lighting
  • Texture coordinate generation and/or transformation
  • Vertex shading for special effects

Object-space triangles → T → Screen-space lit triangles
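
The slide only names lighting as a per-vertex task. As one possible illustration, the sketch below computes a diffuse (Lambert) term per vertex; the choice of the Lambert model and the function names are assumptions made for this example, not something the slides prescribe.

```python
import math


def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]


def diffuse_vertex_color(normal, to_light, material_rgb, light_rgb):
    """Per-vertex diffuse color: material * light * max(N.L, 0)."""
    n = normalize(normal)
    l = normalize(to_light)
    n_dot_l = max(sum(a * b for a, b in zip(n, l)), 0.0)
    return [m * c * n_dot_l for m, c in zip(material_rgb, light_rgb)]
```

A vertex whose normal points straight at the light keeps its full material color, while a vertex facing away receives no diffuse light.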

SLIDE 19

Tessellation

  • If just triangles, nothing needs to be done, otherwise:
      • Evaluation of polynomials for curved surfaces
      • Create vertices (tessellation)
  • DirectX 11 specifies this in hardware!
      • 3 new shader stages!!!
      • Still not trivial (special algorithms required)
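
To make "evaluation of polynomials" and "create vertices" concrete, here is a small sketch that uniformly samples a quadratic Bézier curve. The curve type, the uniform sampling and the function names are assumptions chosen for brevity; hardware tessellation works on surface patches and is considerably more involved.

```python
def bezier_point(p0, p1, p2, t):
    """Evaluate a quadratic Bezier polynomial at parameter t in [0, 1]."""
    a = (1.0 - t) ** 2
    b = 2.0 * (1.0 - t) * t
    c = t ** 2
    return tuple(a * x0 + b * x1 + c * x2 for x0, x1, x2 in zip(p0, p1, p2))


def tessellate(p0, p1, p2, segments):
    """Create segments + 1 vertices by sampling the curve uniformly in t."""
    return [bezier_point(p0, p1, p2, i / segments) for i in range(segments + 1)]


vertices = tessellate((0.0, 0.0), (1.0, 2.0), (2.0, 0.0), 4)
```

Raising `segments` gives a smoother approximation at the cost of more vertices for the rest of the pipeline to process.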

SLIDE 20

DirectX 11 Tessellation

SLIDE 21

Tessellation Example

Optimally tessellated!

SLIDE 22

Geometry Shader

  • Calculations on a primitive (triangle)
  • Access to neighbor triangles
  • Limited output (1024 32-bit values)
      • → No general tessellation!
  • Applications:
      • Render to cubemap
      • Shadow volume generation
      • Triangle extension for ray tracing
      • Extrusion operations (fur rendering)

SLIDE 23

Rest of Geometry Stage

  • Primitive assembly
  • Geometry shader
  • Clipping (in homogeneous coordinates)
  • Perspective division, viewport transform
  • Culling
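
One common form of the culling step is back-face culling on the projected triangle. The sketch below assumes counter-clockwise winding marks a front face (the OpenGL default) and also drops zero-area triangles; the function names are hypothetical.

```python
def signed_area(a, b, c):
    """Twice the signed area of a 2D triangle; positive if counter-clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])


def is_culled(a, b, c):
    """Cull triangles that are clockwise (back-facing) or degenerate."""
    return signed_area(a, b, c) <= 0
```

Because the test runs on 2D positions, it has to come after perspective division and the viewport transform, which matches the order on the slide.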

SLIDE 24

Rasterization Stage

Triangle Setup → Rasterization → Fragment Processing and Texture Processing → Raster Operations

SLIDE 25

Rasterization

  • Setup (per-triangle)
  • Sampling (triangle = {fragments})
  • Interpolation (interpolate colors and coordinates)

Screen-space triangles → Fragments
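
The interpolation step can be sketched with barycentric weights derived from edge functions. This is an illustrative screen-space (affine) version only; it deliberately ignores perspective correction, and the names `edge` and `interpolate` are assumptions for this example.

```python
def edge(a, b, p):
    """Edge function: twice the signed area of the triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def interpolate(tri, attrs, p):
    """Interpolate three per-vertex attribute lists at point p in the triangle."""
    a, b, c = tri
    area = edge(a, b, c)
    # Each weight is the sub-triangle area opposite the vertex, normalized.
    w_a = edge(b, c, p) / area
    w_b = edge(c, a, p) / area
    w_c = edge(a, b, p) / area
    return [w_a * x + w_b * y + w_c * z for x, y, z in zip(*attrs)]


triangle = ((0, 0), (4, 0), (0, 4))
colors = ([1, 0, 0], [0, 1, 0], [0, 0, 1])   # red, green, blue corners
color = interpolate(triangle, colors, (1, 1))
```

The same weights apply to any per-vertex quantity, such as depth or texture coordinates.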

SLIDE 26

Rasterization

  • Sampling inclusion determination
  • Processing in tile order improves cache coherency
  • Tile sizes vendor/generation specific
      • Old graphics cards: 16x64
      • New: 4x4
      • Smaller tile size favors conditionals in shaders
      • All tile fragments calculated in parallel on modern hardware

SLIDE 27

Rasterization – Coordinates

  • Fragments represent “future” pixels
  • Pixel (2,1) has its pixel center at (2.5, 1.5)!

Figure: pixel grid with x and y window coordinates (axes from 0.0 to 3.0); the origin is the lower left corner of the window

SLIDE 28

Rasterization – Rules

  • Separate rule for each primitive
  • Non-ambiguous!
  • Polygons:
      • Pixel center contained in polygon
      • On-edge pixels: only one is rasterized
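
The polygon rule can be sketched as a coverage test at pixel centers, using the convention that pixel (x, y) has its center at (x + 0.5, y + 0.5). The tie-break in `owns_edge` is an assumption chosen so that exactly one of two triangles sharing an edge claims an on-edge pixel; real APIs define their own rule, and triangles are assumed counter-clockwise.

```python
def edge_value(a, b, p):
    """Positive if p lies to the left of the directed edge a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def owns_edge(a, b):
    """Tie-break (assumption): an edge owns its pixels if it points upward,
    or is horizontal and points right. The reversed edge never also owns them."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    return dy > 0 or (dy == 0 and dx > 0)


def covers(tri, px, py):
    """True if the center of pixel (px, py) is rasterized for a CCW triangle."""
    center = (px + 0.5, py + 0.5)
    for a, b in ((tri[0], tri[1]), (tri[1], tri[2]), (tri[2], tri[0])):
        e = edge_value(a, b, center)
        if e < 0 or (e == 0 and not owns_edge(a, b)):
            return False
    return True


def rasterize(tri, width, height):
    """List the pixels of a width x height window covered by the triangle."""
    return [(x, y) for y in range(height) for x in range(width)
            if covers(tri, x, y)]


# Two triangles splitting a 4x4 square along a diagonal through pixel centers.
LOWER = ((0, 0), (4, 0), (4, 4))
UPPER = ((0, 0), (4, 4), (0, 4))
lower_pixels = rasterize(LOWER, 4, 4)
upper_pixels = rasterize(UPPER, 4, 4)
```

In this example the four pixels whose centers lie exactly on the diagonal are produced by only one of the two triangles, so nothing is drawn twice and nothing is left out.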

SLIDE 29

Texture

  • Texture “transformation” and projection
      • E.g., projective textures
  • Texture address calculation (programmable in shader)
  • Texture filtering

Fragments → Texture Fragments

SLIDE 30

Fragment

  • Texture operations (combinations, modulations, animations etc.)

Texture Fragments → Textured Fragments

SLIDE 31

Raster Tests

  • Ownership
      • Is pixel obscured by other window?
  • Scissor test
      • Only render to scissor rectangle
  • Depth test
      • Test according to z-buffer
  • Alpha test
      • Test according to alpha-value
  • Stencil test
      • Test according to stencil buffer

Textured Fragments → Framebuffer Pixels

SLIDE 32

Raster Operations

  • Blending or compositing
  • Dithering
  • Logical operations

Textured Fragments → Framebuffer Pixels
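
As an illustration of blending, the sketch below combines an incoming fragment with the color already in the framebuffer using the fragment's alpha. This is one common configuration (it corresponds to `glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA)` in OpenGL); the function name `blend` is an assumption for this example.

```python
def blend(src_rgba, dst_rgb):
    """Blend a fragment over the framebuffer color using source alpha."""
    alpha = src_rgba[3]
    return tuple(alpha * s + (1.0 - alpha) * d
                 for s, d in zip(src_rgba[:3], dst_rgb))
```

Because the result depends on the destination color, blending has to read the framebuffer as well as write it.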

SLIDE 33

Raster Operations

  • After fragment color calculation (“Output Merger”)

Fragment and associated data → Pixel Ownership Test → Scissor Test → Alpha Test → Stencil Test (Stencil Buffer) → Depth Test (Depth Buffer) → Blending (RGBA only) → Dithering → Logicop → Frame Buffer
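
The test sequence in this diagram can be sketched as a chain of early exits. The comparison functions used here (alpha greater than a reference, stencil equal to a reference, depth less than the stored value) are assumptions picked for the example, the ownership test is omitted, and all names are hypothetical.

```python
def passes_raster_tests(frag, scissor, alpha_ref, stencil_buf, stencil_ref,
                        depth_buf):
    """Run scissor, alpha, stencil and depth tests in that order.

    Returns True and updates the depth buffer if the fragment survives.
    """
    x, y = frag["x"], frag["y"]
    x0, y0, width, height = scissor
    if not (x0 <= x < x0 + width and y0 <= y < y0 + height):
        return False                      # scissor test
    if frag["alpha"] <= alpha_ref:
        return False                      # alpha test (greater-than)
    if stencil_buf[y][x] != stencil_ref:
        return False                      # stencil test (equal)
    if frag["depth"] >= depth_buf[y][x]:
        return False                      # depth test (less-than)
    depth_buf[y][x] = frag["depth"]       # surviving fragment writes depth
    return True
```

A fragment that fails any test is discarded before blending, so it never touches the color buffer.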

SLIDE 34

Display

  • Gamma correction
  • Digital to analog conversion if necessary

Framebuffer Pixels → Light
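
Gamma correction can be sketched as a power function applied to each linear intensity. The default exponent of 2.2 is a typical display value assumed for this example; the slides do not specify one.

```python
def gamma_correct(linear, gamma=2.2):
    """Encode a linear intensity in [0, 1] for a display with the given gamma."""
    return linear ** (1.0 / gamma)
```

Black and white are unchanged, while mid-tones are brightened to compensate for the display's nonlinear response.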

SLIDE 35

Display

  • Frame buffer pixel format:
      • RGBA vs. index (obsolete)
      • Bits: 16, 32, 128 bit floating point, …
  • Double buffered vs. single buffered
  • Quad-buffered for stereo
  • Overlays (extra bit planes) for GUI
  • Auxiliary buffers: alpha, stencil

slide-36
SLIDE 36

Functionality vs. Frequency

  • Geometry processing = per-vertex
      • Transformation and Lighting (T&L)
      • Historically floating point, complex operations
      • Today: fully programmable flow control, texture lookup
      • 20-1500 million vertices per second
  • Fragment processing = per-fragment
      • Blending and texture combination
      • Historically fixed point and limited operations
      • Up to 50 billion fragments (“Gigatexel”/sec)
      • Floating point, programmable complex operations

SLIDE 37

Computational Requirements

  • Assume typical non-trivial fixed-function rendering task
      • 1 light, texture coordinates, projective texture mapping
      • 7 interpolants (z,r,g,b,s,t,q)
      • Trilinear filtering, texture-, color blending, depth buffering
  • Rough estimate (operations per element):

                ADD   CMP   MUL   DIV
    Vertex      102    30   108     5
    Fragment     66     9    70     1

Pipeline stages shown alongside: Application, Command, Geometry, Rasterization, Texture, Fragment, Display

SLIDE 38

Communication Requirements

  • Vertex size:
      • Position x,y,z
      • Normal x,y,z
      • Texture coordinate s,t
      • → 8 × 4 = 32 bytes
  • Texture:
      • Color r,g,b,a, 4 bytes
  • Fragment size (in frame buffer):
      • Color r,g,b,a
      • Depth z (assume 32 bit)
      • → 8 bytes, but goes both ways (because of blending!)
  • Display:
      • Color r,g,b, 3 bytes

SLIDE 39

Communication Requirements

  • Computation: Vertex 5 Gops, Fragment 150 Gops
  • Application → Command: 20 Mvert/s, 0.640 GB/s
  • Texture Memory → Texture: 4 GB/s
  • Fragment ↔ Framebuffer: 1000 Mpix/s, 16 GB/s
  • Framebuffer → Display: 120 Mpix/s, 0.36 GB/s

Pipeline stages shown alongside: Application, Command, Geometry, Rasterization, Texture, Fragment, Display
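
The figures on this slide appear to follow from the per-element sizes and operation counts given on the two preceding slides. The sketch below redoes the arithmetic under that reading; the assumption of one 4-byte texel fetched per fragment, and all variable names, are mine rather than the slides'.

```python
# Operations per element from the rough estimate (ADD + CMP + MUL + DIV).
VERTEX_OPS = 102 + 30 + 108 + 5       # 245 operations per vertex
FRAGMENT_OPS = 66 + 9 + 70 + 1        # 146 operations per fragment

# Sizes in bytes.
VERTEX_BYTES = (3 + 3 + 2) * 4        # position, normal, texcoord as 4-byte values
TEXEL_BYTES = 4                       # color r,g,b,a
FRAGMENT_BYTES = 4 + 4                # color r,g,b,a plus 32-bit depth
DISPLAY_BYTES = 3                     # color r,g,b

# Rates per second as given on the slide.
VERTEX_RATE = 20e6
FRAGMENT_RATE = 1000e6
DISPLAY_RATE = 120e6


def giga_per_second(rate, amount):
    """Rate times per-element amount, expressed in units of 10^9 per second."""
    return rate * amount / 1e9


requirements = {
    "vertex_gops": giga_per_second(VERTEX_RATE, VERTEX_OPS),
    "fragment_gops": giga_per_second(FRAGMENT_RATE, FRAGMENT_OPS),
    "vertex_gb": giga_per_second(VERTEX_RATE, VERTEX_BYTES),
    # Assumption: one texel fetched from memory per fragment.
    "texture_gb": giga_per_second(FRAGMENT_RATE, TEXEL_BYTES),
    # Read and write, because blending needs the old framebuffer value.
    "framebuffer_gb": giga_per_second(FRAGMENT_RATE, FRAGMENT_BYTES * 2),
    "display_gb": giga_per_second(DISPLAY_RATE, DISPLAY_BYTES),
}
```

Under these assumptions the computed 4.9 and 146 Gops round to the slide's 5 and 150 Gops, and the bandwidth figures match 0.64, 4, 16 and 0.36 GB/s.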