Cg: A system for programming graphics hardware in a C-like language - PowerPoint PPT Presentation

SLIDE 1
SLIDE 2

Cg: A system for programming graphics hardware in a C-like language

  • William R. Mark – University of Texas at Austin
  • R. Steven Glanville – NVIDIA
  • Kurt Akeley – NVIDIA and Stanford University
  • Mark Kilgard – NVIDIA

SLIDE 3

GPUs are now programmable

[Diagram: GPU pipeline. Vertex Processor → Triangle Assembly & Rasterization → Fragment Processor → Framebuffer Operations → Framebuffer. Textures → Texture Decompression & Filtering → Fragment Processor.]

Assembly-level code for the programmable units:

MOV R4.y, R2.y;
ADD R4.x, -R4.y, C[3].w;
MAD R3.xy, R2, R3.xyww, C[2].z;
…

ADD R3.xy, R3.xyww, C11.z;
TEX H5, R3, TEX2, 2D;
TEX H6, R3, TEX2, 2D;
…

SLIDE 4

Programmable units in GPUs are stream processors

[Diagram: Input stream (I1 I2 … In) → Stream Processor (kernel, state) → Output stream (O1 O2 … On)]

  • The programmable unit executes a computational kernel for each input element
  • Streams consist of ordered elements
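A minimal sketch of the stream-processor model described above, written in Python rather than Cg so that it can be run anywhere. The names `run_stream` and `scale_kernel` are hypothetical and do not come from the talk; the sketch only illustrates that one kernel runs once per input element, reads shared state, and produces outputs in input order.

```python
from typing import Callable, Iterable, List, TypeVar

I = TypeVar("I")  # input element type
O = TypeVar("O")  # output element type
S = TypeVar("S")  # kernel state type


def run_stream(kernel: Callable[[I, S], O], state: S, inputs: Iterable[I]) -> List[O]:
    """Apply the kernel to every input element, preserving stream order."""
    return [kernel(element, state) for element in inputs]


def scale_kernel(element: float, state: dict) -> float:
    # Hypothetical kernel: sees one element plus read-only state,
    # and never looks at neighbouring elements of the stream.
    return element * state["scale"]


outputs = run_stream(scale_kernel, {"scale": 2.0}, [1.0, 2.0, 3.0])
```

Because the kernel only sees its own element and the shared state, the per-element calls are independent of each other, which is what lets hardware run many of them in parallel.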
SLIDE 5

Example: Vertex processor

[Diagram: Vertex Array (V1 V2 … Vn) → Vertex Processor (uniform values) → Transformed Vertexes (V1 V2 … Vn)]

Fragment processor can do more: it can read from texture memory

SLIDE 6

Previous work

  • Earlier programmable graphics HW
      – Ikonas [England 86]
      – Pixar FLAP [Levinthal 87]
      – UNC PixelFlow [Olano 98]
      – Multipass SIMD: SGI ISL [Peercy 00]
  • RenderMan [Hanrahan 90]
  • Stanford RTSL [Proudfoot 01]
SLIDE 7

Design goals

  • Easier programming of GPUs
  • Ease of adoption
  • Portability – HW, API, OS
  • Complete support for HW capabilities
  • Performance – similar to assembly
  • Minimal interference with application data
  • Longevity – useful for future hardware
  • Support for non-shading computations

Non-goal: Backward compatibility

SLIDE 8

The design process

Concentrate on:

      – Overall system architecture
      – System interfaces: Cg language and APIs

What principles guided the design?
Why did we make the choices we did?
What alternatives did we consider?

SLIDE 9

Major Design Decisions

SLIDE 10

Modular or monolithic architecture?

  • Cg system is modular
  • Provides access to assembly-level interfaces
  • GLSL made different choices
  • Compile off-line, or at run time

[Diagram: software stack with the components Application, Cg OpenGL runtime, Cg Direct3D runtime, Cg common runtime (compiler), OpenGL API, Direct3D API, and GPU]

The Cg system manages Cg programs AND the flow of data on which they operate

SLIDE 11

Specialize for shading?

  • RenderMan language is domain-specific
      – Domain-specific types (e.g. “color”)
      – Domain-specific constructs (e.g. “illuminance”)
      – Imposes a particular model on the user
  • C is general purpose
      – A HW oriented language
      – Avoids assumptions about problem domain
  • Cg follows C’s philosophy
      – Language follows syntax and philosophy of C
      – Targets GPU HW – performance transparency
      – Some exceptions – we were not dogmatic

SLIDE 12

Cg language example

void simpleTransform(float4 objectPosition : POSITION,
                     float4 color : COLOR,
                     float4 decalCoord : TEXCOORD0,
                     out float4 clipPosition : POSITION,
                     out float4 oColor : COLOR,
                     out float4 oDecalCoord : TEXCOORD0,
                     uniform float brightness,
                     uniform float4x4 modelViewProjection)
{
    clipPosition = mul(modelViewProjection, objectPosition);
    oColor = brightness * color;
    oDecalCoord = decalCoord;
}

SLIDE 13

One program or two?

[Diagram: GPU pipeline with Vertex Processor, Triangle Assembly & Rasterization, Fragment Processor, Framebuffer Operations, Framebuffer, Textures, and Texture Decompression & Filtering]

Vertex program

void vprog(float4 objP : POSITION, float4 color : COLOR, …

Fragment program

void fprog(float4 t0 : TEXCOORD0, float4 t1 : TEXCOORD1, …

SLIDE 14

How should stream HW be exposed in the language?

  • How should stream inputs be accessed?
      – Global variables?
      – Parameters to “main()”? (our choice)
  • How should stream outputs be written?
      – Global variables?
      – Output parameters from “main()”? (our choice)
  • Should we distinguish between stream inputs and kernel state?
      – e.g. vertex position vs. modelview matrix
      – HW makes a distinction; so we expose it
      – State parameters marked with keyword: uniform
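To make these choices concrete, here is a rough Python analogue of the simpleTransform example shown earlier. It is only a sketch under stated assumptions: Python has no `uniform` or `out` keywords, so keyword-only arguments stand in for uniform kernel state and the returned tuple stands in for output parameters. The helper `mul` is written here for illustration and is not the Cg library function.

```python
from typing import Sequence, Tuple

Vec4 = Tuple[float, float, float, float]
Mat4 = Sequence[Sequence[float]]


def mul(m: Mat4, v: Vec4) -> Vec4:
    """4x4 matrix times 4-vector (illustrative stand-in for Cg's mul)."""
    return tuple(sum(m[r][c] * v[c] for c in range(4)) for r in range(4))


def simple_transform(object_position: Vec4, color: Vec4, *,
                     brightness: float, model_view_projection: Mat4):
    # Positional parameters are stream inputs: they differ for every vertex.
    # Keyword-only parameters play the role of `uniform` kernel state:
    # they stay the same for the whole stream.
    clip_position = mul(model_view_projection, object_position)
    out_color = tuple(brightness * c for c in color)
    # The returned values play the role of `out` parameters.
    return clip_position, out_color


IDENTITY = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
vertices = [
    ((1.0, 2.0, 3.0, 1.0), (1.0, 0.5, 0.0, 1.0)),
    ((0.0, 0.0, 0.0, 1.0), (0.0, 1.0, 0.0, 1.0)),
]
results = [
    simple_transform(p, c, brightness=0.5, model_view_projection=IDENTITY)
    for p, c in vertices
]
```

The call site shows the distinction the slide describes: position and color change with each element of the stream, while brightness and the matrix are set once and reused.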

SLIDE 15

How should the system support different levels of HW?

[Diagram: successive GPU generations: NV20, R200, NV30, R300, R400, NV40, NV50, … ?]

  • HW capabilities change each generation
      – Data types
      – Support for branch instructions, …
  • We expect this problem to persist
      – Future GPUs will have new features
  • Mandate exactly one feature set?
      – Must strand older HW or limit newer HW

SLIDE 16

Two options for handling HW differences

  • Emulate missing features?
      – Too slow on GPU
      – Too slow on CPU, especially for fragment HW
  • Expose differences to programmer?
      – We chose this option
      – Differences exposed via subsetting
      – A profile is a named subset
      – Cg supports function overloading by profile
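The idea of overloading a function by profile can be sketched in Python as a small registry. Everything here is hypothetical: the profile names "basic" and "advanced", the `overload` decorator, and the rule that a profile-specific version wins over a generic one are assumptions made for illustration, not a description of how the Cg compiler resolves overloads.

```python
from typing import Callable, Dict, Optional, Tuple

# Registry keyed by (function name, profile); profile None means "any profile".
_overloads: Dict[Tuple[str, Optional[str]], Callable[..., float]] = {}


def overload(name: str, profile: Optional[str] = None):
    """Register an implementation of `name`, optionally tied to one profile."""
    def register(func):
        _overloads[(name, profile)] = func
        return func
    return register


def resolve(name: str, profile: str) -> Callable[..., float]:
    """Assumed rule: prefer the profile-specific overload, else the generic one."""
    specific = _overloads.get((name, profile))
    if specific is not None:
        return specific
    return _overloads[(name, None)]


@overload("clamp01")
def clamp01_generic(x: float) -> float:
    # Generic version, expressed with min/max only.
    return max(0.0, min(1.0, x))


@overload("clamp01", profile="advanced")
def clamp01_advanced(x: float) -> float:
    # Version for a hypothetical profile that supports branch instructions.
    if x < 0.0:
        return 0.0
    if x > 1.0:
        return 1.0
    return x
```

Calling `resolve("clamp01", "basic")` returns the generic version, while `resolve("clamp01", "advanced")` returns the branching one, so the same source can serve both older and newer hardware.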

SLIDE 17

Two detailed design decisions

SLIDE 18

How to specify a returnable function parameter?

  • C uses pointers
      – But no pointers on current GPUs
  • C++ uses pass-by-reference: float &y
      – Preferred implementation still uses pointers
  • Cg uses pass-by-value-result
      – Implementation doesn’t need pointers
  • By using new syntax, we preserve C and C++ syntax for the future

C:

void foobar(float *y) { *y = 2*x; …

Cg:

void foobar(out float y) { y = 2*x; …
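Pass-by-value-result (also called copy-in, copy-out) and pass-by-reference give the same answer in most code; they differ only when two parameters alias the same variable. The Python sketch below simulates both conventions with a small `Cell` wrapper. All names are hypothetical, and the left-to-right copy-out order is an assumption of this sketch, not something stated in the talk.

```python
class Cell:
    """A mutable box standing in for a variable."""

    def __init__(self, value):
        self.value = value


def bump_both(a, b):
    # Writes to both parameters; the result depends on whether a and b alias.
    a.value += 1
    b.value += 10


def call_by_reference(func, *cells):
    # The callee works directly on the caller's variables.
    func(*cells)


def call_by_value_result(func, *cells):
    temps = [Cell(c.value) for c in cells]  # copy in
    func(*temps)                            # callee only touches the copies
    for cell, temp in zip(cells, temps):    # copy out, left to right (assumed)
        cell.value = temp.value


x = Cell(0)
call_by_reference(bump_both, x, x)      # both updates hit the same variable

y = Cell(0)
call_by_value_result(bump_both, y, y)   # each parameter has a private copy
```

After these calls `x.value` is 11 and `y.value` is 10. The value-result version never needs the address of a caller variable, which matches the slide's point that the implementation does not need pointers.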

SLIDE 19

General mechanism for surface/light separability

[Diagram: surfaces (SURFACE 1, SURFACE 2) combined with lights (LIGHT 1, LIGHT 2, LIGHT 3)]

  • Convenient to put surface and light code in different modules…
      – Decouples surface selection from light selection
      – Proven usefulness: Classic OpenGL; RenderMan; etc.
      – We lost it for a while!
  • RenderMan uses a specialized mechanism
  • Cg uses a general-purpose mechanism
      – More flexible
      – Follows C-like philosophy

SLIDE 20

Light/surface example… First: Declare an interface

// Declare interface to lights
interface Light {
    float3 direction(float3 from);
    float3 illuminate(float3 p, out float3 lv);
};

Mechanism adapted from Java, C#

SLIDE 21

Second: Declare a light that implements interface

// Declare object type for point lights
struct PointLight : Light {
    float3 pos, color;
    float3 direction(float3 p) { return pos - p; }
    float3 illuminate(float3 p, out float3 lv) {
        lv = normalize(direction(p));
        return color;
    }
};

Declare an object that implements the interface

SLIDE 22

Third: Define surface that calls interface

// Main program (surface shader)
float4 main(appin IN, out float4 COUT, uniform Light lights[ ]) {
    ...
    for (int i=0; i < lights.Length; i++) {              // for each light
        Cl = lights[i].illuminate(IN.pos, L);            // get dir/color
        color += Cl * Plastic(texcolor, L, Nn, In, 30);  // apply
    }
    COUT = color;
}

Call object(s) via the interface type
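The three steps (declare an interface, implement it, call through it) can be mirrored in Python with an abstract base class. This is a sketch under assumptions, not a translation of the Cg code: the `out` parameter becomes a second return value, and the hypothetical `shade` function uses a plain diffuse term because the `Plastic` function from the slide is not shown.

```python
import math
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def normalize(v: Vec3) -> Vec3:
    length = math.sqrt(sum(c * c for c in v))
    return (v[0] / length, v[1] / length, v[2] / length)


class Light(ABC):
    """Interface to lights: every light must provide both methods."""

    @abstractmethod
    def direction(self, p: Vec3) -> Vec3: ...

    @abstractmethod
    def illuminate(self, p: Vec3) -> Tuple[Vec3, Vec3]:
        """Return (color, light vector); the tuple replaces the out parameter."""


@dataclass
class PointLight(Light):
    pos: Vec3
    color: Vec3

    def direction(self, p: Vec3) -> Vec3:
        return tuple(a - b for a, b in zip(self.pos, p))

    def illuminate(self, p: Vec3) -> Tuple[Vec3, Vec3]:
        return self.color, normalize(self.direction(p))


def shade(p: Vec3, normal: Vec3, lights: List[Light]) -> Vec3:
    """Surface code: knows only the Light interface, not the concrete types."""
    total = [0.0, 0.0, 0.0]
    for light in lights:                       # for each light
        color, lv = light.illuminate(p)        # get direction and color
        diffuse = max(0.0, sum(n * l for n, l in zip(normal, lv)))
        for i in range(3):                     # apply (simple diffuse stand-in)
            total[i] += color[i] * diffuse
    return tuple(total)


lights = [
    PointLight(pos=(0.0, 0.0, 2.0), color=(1.0, 0.5, 0.25)),   # in front
    PointLight(pos=(0.0, 0.0, -2.0), color=(1.0, 1.0, 1.0)),   # behind
]
shaded = shade((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), lights)
```

New light types can be added without touching `shade`, which is the surface/light decoupling the slides describe.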

SLIDE 23

Cg is closely related to other recent languages

  • Microsoft HLSL
      – Largely compatible with Cg
      – NVIDIA and Microsoft collaborated
  • OpenGL ARB shading language
  • All three languages are similar
      – Overlapping development
      – Extensive cross-pollination of ideas
      – Designers mostly agreed on right approach
  • Systems are different
SLIDE 24

Summary

  • Cg system:
      – A system for programming GPUs
  • Cg language:
      – Extends and restricts C as needed for GPUs
      – Expresses stream kernels
      – HW oriented language
  • Designed to age well
      – By reintroducing missing C features

SLIDE 25

Acknowledgements

  • Language co-designers at Microsoft: Craig Peeper and Loren McQuade
  • Interface functionality: Craig Kolb, Matt Pharr
  • Initial language direction: Cass Everitt
  • Standard library: Chris Wynn
  • Design and implementation team: Geoff Berry, Mike Bunnell, Chris Dodd, Wes Hunt, Jayant Kolhe, Rev Lebaredian, Nathan Paymer, Matt Pharr, Doug Rogers
  • Director: Nick Triantos
  • Many others at NVIDIA and Microsoft
  • Code and experience from Stanford RTSL: Kekoa Proudfoot, Pat Hanrahan, and others

SLIDE 26

Demo and Questions