COMP37111: Advanced Computer Graphics
Realistic and Realtime Rendering

Steve Pettifer
September 2013

School of Computer Science
The University of Manchester

Contents

1 How to use these notes
2 What you should know already
3 Improving these notes
4 Solving The Rendering Equation
5 The Bidirectional Reflectance Distribution Function
6 Ray Tracing
7 The mathematics of ray / object intersections
   7.1 Intersection of a ray and a sphere
   7.2 Intersection of a ray and a polygon
   7.3 Questions
   7.4 Ray Tracing Ponderings
8 Radiosity
   8.1 Calculating the Form Factor
   8.2 Issues with the basic Radiosity technique
   8.3 Pros and cons of Radiosity
   8.4 Questions
   8.5 Radiosity Ponderings
9 Volume Rendering
10 Direct Volume Rendering
   10.1 Trilinear Interpolation
   10.2 Computing the colour
   10.3 Indirect Volume Rendering
   10.4 Proxy Geometry
      10.4.1 Splatting
      10.4.2 Shear warp
      10.4.3 Texture mapping
      10.4.4 GPU-accelerated Volume Rendering
   10.5 A comparison of direct and indirect techniques
   10.6 Questions
   10.7 Volume Rendering Ponderings
11 Spatial Enumeration
   11.1 The Gridcell
   11.2 The Octree
   11.3 Hierarchical Bounding Volumes
   11.4 Binary Space Partitioning
   11.5 Generating BSP Trees
   11.6 Traversal
   11.7 Questions
   11.8 Spatial Enumeration Ponderings
12 Culling
   12.1 Detail culling
   12.2 Backface culling
   12.3 Frustum Culling
   12.4 Occlusion Culling
   12.5 Culling Ponderings
13 Acknowledgements
14 License
A Reading material
   A.1 The Rendering Equation
   A.2 Ray Tracing Jell-O Brand Gelatin
   A.3 An Improved Illumination Model for Shaded Display
   A.4 Ray Tracing on Programmable Graphics Hardware
   A.5 Distributed Interactive Ray Tracing of Dynamic Scenes
   A.6 The Hemi-cube: a Radiosity Solution for Complex Environments
   A.7 Perceptually-Driven Radiosity
   A.8 Volume Rendering: Display of Surfaces from Volume Data
   A.9 Multi-GPU Volume Rendering using MapReduce
   A.10 Introduction to bounding volume hierarchies
   A.11 Portals and mirrors: Simple, Fast evaluation of Potentially Visible Sets
   A.12 Hierarchical Z-Buffer Visibility
B Frequently Asked Questions
   B.1 Do I need to read through all of the notes you’ve written?
   B.2 If I read and understand all the notes, will I know everything I need to for the exam?
   B.3 Do I need to remember all the names of people who invented techniques?
   B.4 There’s a lot of maths. Do I need to remember it all?
   B.5 Am I expected to follow up all the Wikipedia links?
   B.6 Am I expected to read all the papers in the Appendixes?
   B.7 Should I read all the references too?
   B.8 The paper about the Jello equation. It’s a spoof, surely?
   B.9 Do I need to know the answers to all the ponderings?
C Example GPU code for Phong Shading
D Background maths
   D.1 Simple vector algebra
      D.1.1 Vector addition and subtraction
      D.1.2 Scalar product
      D.1.3 Lambert’s Cosine Law
      D.1.4 Vector product
   D.2 Integration
1 How to use these notes
These notes are intended to help reinforce (and definitely not replace!) the material that’s presented in the second half of the COMP37111 course unit’s lectures; they’re also a decent starting point for your revision. After each lecture, you should read through the relevant section of these notes to make sure you’ve understood the concepts. Most of the diagrams that are used in the lectures are reproduced here, so you don’t need to copy them down during the lecture; but many of the images, which only make their point when they are large and in colour, aren’t included.

At the end of each section, there is a series of questions and ‘ponderings’. The questions have relatively straightforward and definite answers, which have been covered in some form in the lectures or notes: you should use these to test your understanding of the material. Ponderings, on the other hand, are more open-ended questions about the topic that’s just been covered. In some cases, the answer will already have been hinted at (or even given, obliquely) in the notes or lectures, but in most cases these are designed to provoke you to think a bit more deeply about the concepts you’ve just encountered. There may not even be an answer to some of the ponderings; but if you can come to that conclusion yourself (correctly!), then it means you’ve understood the material pretty well. At the beginning of every lecture, we’ll revisit the ponderings from the previous section for a few minutes, and there will also be an opportunity to ask questions of your own about the previous material.

Each section has some associated research papers in Appendix A. Some of these papers are from way back in the mists of computer graphics history, and are the seminal papers that sparked a particular way of doing things. You should be able to understand most of their content fairly easily, since it will have been explained in these notes. Others are more up-to-date papers about recent developments; for these, don’t worry too much if you don’t understand the details—they are designed to give you a flavour of what has happened in the field since, and a bit of breadth to your knowledge. None of the details from either category of papers are examinable in themselves; but even a superficial understanding of their contents will help you write more mature and authoritative answers in the exam.

Throughout the notes, there are many references to other scholarly papers about computer graphics; apart from those that are given in the Appendix, you don’t need to read these, and the citations are included in case you want to follow something up in a lot more detail.

Finally, there are numerous terms highlighted like thisW, in bold and with a ‘w’ superscript. If you don’t understand what the term means in the context in which it’s written (or you do know what the term means but are curious to broaden your understanding of that topic a bit), then you should look that term up in Wikipedia (the terms are hyperlinked in the PDF version of these notes; if you’re reading them on paper you’ll have to do the lookup yourself). The Wikipedia page often contains more detail than is necessary for this course unit, so the touchstone is whether you understand enough about the term to know why it’s included in these notes. You won’t be examined on the whole content of all the linked Wikipedia pages.
2 What you should know already
These notes assume you are already familiar with the following concepts:
- The basic ‘camera model’, 3D co-ordinate systems, some basic OpenGL (as covered in the 2nd year course), local illumination, and polygons / polygonal meshes and textures.

- Vector and matrix maths, enough to deal with simple 3D transformations: addition, subtraction, normalisation, multiplication of vectors and matrices, and the cross and dot products of vectors. If you’re not sure about any of these, have a look in Appendix D, which contains some refresher notes on these concepts; if that isn’t enough you should probably get hold of a good text book on the subject—personally I like ‘Advanced Engineering Mathematics’ by Kreyszig—though of course other maths text books are also available.

- A bit of basic trigonometry.

- Some very simple calculus; you won’t need to know how to integrate and differentiate equations, but you will need to understand what it means to do so (again, there’s a bit of a refresher in Appendix D, and the Kreyszig book is good for this too).

If your recollection of any of them is rusty, now is a good time to brush up!
3 Improving these notes
If you spot any errors of any kind, from typos to mangled sentences, or find bits of the notes hard to follow, please email me (steve.pettifer@manchester.ac.uk) so that I can correct them for future years. Any feedback, positive or negative, is very welcome.
4 Solving The Rendering Equation
Creating a realistic computer-generated image involves, in one form or another, modelling the physical properties of, and interactions between, light, materials and the human visual/perceptual system. Whichever way we think about it, what we see in the real world is the result of a combination of these things: light from one or more sources interacts with the objects in the environment in complex ways, and eventually some of the light enters our eyes and generates an image on our retinas.
Although coming at the problem from slightly different angles (it’s very hard to write this stuff without geometric puns; please just take them as read from now on), this process was first represented mathematically by Immel et al. (1986) and Kajiya (1986), more or less at the same time (see Appendix A.1). The so-called rendering equationW takes the form:
Lo(x, ω, λ, t) = Le(x, ω, λ, t) + ∫Ω fr(x, ω′, ω, λ, t) Li(x, ω′, λ, t) (−ω′ · n) δω′    (1)

where

- λ is a specific wavelength of light
- t is time
- Lo(x, ω, λ, t) is the total amount of light of wavelength λ directed outward along direction ω at time t from a particular position x
- Le(x, ω, λ, t) is the emitted light
- ∫Ω fr(x, ω′, ω, λ, t) Li(x, ω′, λ, t) (−ω′ · n) δω′ is an integral over a hemisphere of inward directions
- fr(x, ω′, ω, λ, t) is the bidirectional reflectance distribution functionW (the ‘BRDF’—we’ll come back to this in a bit)
- Li(x, ω′, λ, t) is the light of wavelength λ coming inward toward x from direction ω′ at time t
- −ω′ · n is the attenuation of inward light due to the incident angle
The maths of the rendering equation can seem rather daunting at first, but breaking it down into its component parts, it’s quite easy to see what’s going on. Let’s ignore the integral part for now, and just look at some of the individual components. The equation focuses on a point x on a surface in the scene, and we’re trying to work out what ‘colour’ this point is so we can project it onto a viewplane and draw a suitably coloured pixel (Figure 2). There are two vectors involved: one of these, ω, represents the direction from point x towards the viewer’s virtual eyepoint; the other, ω′, represents incoming light along a particular direction (eventually we integrate over all ω′ on one side of the surface, giving a hemisphere called Ω). The other two components are easy: λ represents a particular wavelength of light (we can think of that as representing ‘colour’ for now), and t is time.

If for the moment we skip over the BRDF part, the overall purpose of the rendering equation should be fairly clear: it says something like “the light at a particular point x is a combination of any light emitted directly from that point, combined with the effects of all the light arriving at that point from all possible directions”. The BRDF is what takes into account the fact that light interacts differently with surfaces depending on the angles involved; light that ‘grazes’ a surface at an oblique angle contributes differently to the final result than light that hits a surface head-on (Figure 1); but we’ll come back to that later.

To generate a realistic-looking scene, then, we need to solve the rendering equation for that scene, taking into account the light sources, the different materials involved, and the position of the viewer. The rendering equation has many nice mathematical properties in this regard. First, it is mathematically linear, consisting only of additions and multiplications (there are no ‘to the power of’s involved, which computationally is a good thing). It’s also spatially homogeneous, in that it can be applied to all points in a scene regardless of their position or orientation. This means that you can refactor and rearrange the equation relatively easily to give computationally tractable implementations.

Translating the pure mathematical representation into sensible code, however, isn’t trivial: the equation, after all, includes the integration of an infinite number of incoming rays of light (all the ω′ bits), and describes the effect at an infinitely small point on a surface—and of course computers aren’t inherently very good at dealing with an infinite number of infinitely small or infinitely large things! So all the different approaches that we’ll look at in this course are approximations to ‘solving the rendering equation’, each of which makes its own assumptions about the nature of the scene and the behaviour of light and its interaction with different materials. We’ll explore the pros and cons of these different compromise approaches, but it’s worth keeping in mind the ideal of the rendering equation whilst we look at them.
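To make that gap between the maths and the code concrete before we meet the individual techniques, here is a minimal sketch (in C++) of the most naive possible approximation: replacing the integral with a finite sum over randomly sampled inward directions at a single point. This is not an algorithm from these notes—the Vec3 type, the constant Lambertian BRDF value and the stand-in incomingRadiance() function are all assumptions made purely for illustration.

#include <cmath>
#include <cstdio>
#include <random>

struct Vec3 { double x, y, z; };
double dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

const double PI = 3.14159265358979;

// Stand-in for Li(x, w', lambda, t): a real renderer would trace a ray into
// the scene here to find out how much light arrives from direction d.
double incomingRadiance(const Vec3& d) { return d.z > 0.9 ? 5.0 : 0.2; }

int main() {
    const Vec3 n = {0, 0, 1};        // surface normal at the point x
    const double Le = 0.0;           // emitted term: this point emits nothing
    const double fr = 1.0 / PI;      // constant (ideal diffuse) BRDF value

    std::mt19937 rng(42);
    std::uniform_real_distribution<double> uni(0.0, 1.0);

    const int N = 100000;            // 'lots of directions' stands in for the
    double sum = 0.0;                // integral over the hemisphere
    for (int i = 0; i < N; ++i) {
        // Uniformly sample a direction d on the hemisphere around n.
        double z = uni(rng);                     // cos(theta)
        double phi = 2.0 * PI * uni(rng);
        double r = std::sqrt(1.0 - z * z);
        Vec3 d = {r * std::cos(phi), r * std::sin(phi), z};
        // fr * Li * cos(theta); the (-w' . n) term is this cosine, since w'
        // points towards the surface and d points away from it (d = -w').
        sum += fr * incomingRadiance(d) * dot(d, n);
    }
    // Uniform hemisphere sampling has density 1/(2*pi), hence this factor.
    double Lo = Le + (2.0 * PI / N) * sum;
    std::printf("estimated outgoing 'colour' Lo = %f\n", Lo);
    return 0;
}

Every technique in this course can be read as a cleverer answer than this brute-force loop to the question of where to spend the computation.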
5 The Bidirectional Reflectance Distribution Function
Let’s return to the impressively-named Bidirectional Reflectance Distribution Function, the BRDF. First, we’ll get the maths out of the way.
Figure 1: In a simple case of reflection, the angle between incident light and the surface normal is equal to the angle between the reflected ray and the surface normal. Image by MeekohiW.
The BRDF was first defined by Fred Nicodemus (Nicodemus, 1965). Its modern definition is usually given as

fr(ωi, ωo) = δLr(ωo) / δEi(ωi) = δLr(ωo) / (Ei(ωi) cos θi δωi)    (2)

In this equation, L is the radianceW and E is the irradianceW, which are both properties describing how much light energy comes from a surface (this is a huge over-simplification, but it’ll do for now). The angle between the vector representing the incoming light, ωi, and the surface normal, n, is given by θi. One thing to notice here is the cos θi part of the denominator, which tells us that the BRDF is affected by the angle between the incoming light and the surface normal: when the incoming light and the surface normal are aligned, and the angle between them is therefore zero, this component will be at its maximum value; when they are perpendicular, it’ll be at a minimum.

Without going into the maths or the physics in any more depth, it’s not hard to see that the BRDF encapsulates an effect that we see all around us all the time: objects look different depending on the angle at which we look at them, and the direction from which they are lit. We’ll see this concept approximated many times during this course.

If you can’t convince yourself that this is true by just looking around you, then try this thought experiment. Look at Figure 3—you’ve seen it before in the second year course unit when we discussed local illumination models—and recall that the light you see reflected from a surface depends on how ‘matt’ or ‘shiny’ that surface is. Matt surfaces tend to scatter any incoming light in random directions more-or-less equally; highly polished shiny surfaces tend to ‘bounce’ the light out along a particular direction; and in the real world most surfaces do a bit of both, so you get some scattering with a ‘hotspot’ of bright light (‘specular highlights’). It’s fairly obvious that if you move the light, the hotspot moves accordingly as the angle of ‘bounce’ changes; and it’s not a huge leap of imagination to realise that whether or not you see that hotspot depends on the angle at which you’re looking at the surface. If you’re looking directly into the hotspot, you’ll get a lot of light coming your way; if you’re looking at it from any other angle, you’ll get less of that light directed towards you. So in summary: what you see is a combination of the position of the light, the nature of the material, and the position of the viewer.

Figure 2: The rendering equation states that the outgoing illumination along direction ω from a point x is a function of any illumination coming from point x itself, combined with the effects on point x of all the light arriving at that point from all directions forming a hemisphere centred on that point. Image by TimrbW.

Figure 3: Light reflected from a surface is usually a combination of diffuse, perfect (mirror) reflection and glossy effects: Reflected light = Diffuse + Mirror + Glossy. For most surfaces this means that some light is scattered randomly (the diffuse component), but that there can be a ‘hotspot’ of light pointing in a specific direction relative to the angle of the incoming light.
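To make Figure 3’s decomposition concrete, here is a small sketch combining a diffuse (Lambertian) term with a Phong-style glossy ‘hotspot’ lobe. The weights kd and ks and the shininess exponent are illustrative assumptions, not values defined in these notes.

#include <algorithm>
#include <cmath>
#include <cstdio>

struct Vec3 { double x, y, z; };
double dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Light leaving towards the viewer (wOut) for light arriving along wIn, at a
// surface with normal n; all three are unit vectors, and wIn points from the
// surface towards the light.
double reflected(const Vec3& wIn, const Vec3& wOut, const Vec3& n,
                 double kd, double ks, double shininess) {
    double cosTheta = std::max(0.0, dot(wIn, n));
    // Mirror ('bounce') direction of the incoming light about the normal.
    double d = 2.0 * dot(wIn, n);
    Vec3 mirror = {d * n.x - wIn.x, d * n.y - wIn.y, d * n.z - wIn.z};
    double hotspot = std::pow(std::max(0.0, dot(mirror, wOut)), shininess);
    return kd * cosTheta + ks * hotspot;   // diffuse + glossy 'hotspot'
}

int main() {
    Vec3 n = {0, 0, 1};
    Vec3 light = {0, 0.7071, 0.7071};    // light 45 degrees above the surface
    Vec3 viewA = {0, -0.7071, 0.7071};   // viewer looking into the hotspot
    Vec3 viewB = {0, 0.7071, 0.7071};    // viewer looking back at the light
    std::printf("into the hotspot: %f\n", reflected(light, viewA, n, 0.6, 0.4, 32));
    std::printf("away from it:     %f\n", reflected(light, viewB, n, 0.6, 0.4, 32));
    return 0;
}

Moving the viewer changes only the hotspot term; moving the light changes both—exactly the behaviour described above.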
6 Ray Tracing
The first of the techniques that we’ll look at for creating photo-realistic images is Ray Tracing. Interestingly, the idea of generating scenes by tracing the effect of rays of light predates the rendering equation by some twenty or so years, and was first described by Arthur Appel (1968); but it is still in many ways a computational approximation to solving the rendering equation for a scene.
Figure 4: Examples of Ray Traced images. Note the prevalence of shiny and hard surfaces, and therefore of reflection and transparency effects. The figure is a composite of images by Tim Babb, Gilles Tran and Purpy Pupple.
In nature, a light source emits light which travels through space and eventually interacts with a surface that interrupts its progress. One can think of light as being a ‘ray’—a stream of photons, if you like—travelling along the same path. (Without getting too Brian CoxW about it all, we will play rather fast and loose with the notion of wave/particle dualityW in these notes, sometimes talking about the wavelength of light, and at other times treating it more as a stream of particles. Messy, but necessary for now.) In a perfect vacuum this ray will be a straight line (ignoring relativistic effects). In reality, any combination of four things might happen to this light ray: absorption, reflection, refraction and fluorescence. A surface may absorb part of the light ray, resulting in a loss of intensity of the reflected and/or refracted light. It might also reflect all or part of the light ray, in one or more directions. If the surface has any transparent or translucent properties, it refracts a portion of the light beam into itself in a different direction while absorbing some (or all) of the spectrum (and possibly altering the colour). Less commonly, a surface may absorb some portion of the light and fluorescently re-emit it at a longer wavelength in a random direction, though this is rare enough that it can be discounted from most rendering applications. Between absorption, reflection, refraction and fluorescence, all of the incoming light must be accounted for, and no more: a surface cannot, for instance, reflect 66% of an incoming light ray and refract 50%, since the two would add up to 116%. From here, the reflected and/or refracted rays may strike other surfaces, where their absorptive, refractive, reflective and fluorescent properties again affect the progress of the incoming rays. Some of these rays travel in such a way that they hit our eye, causing us to see the scene, and so contribute to the final rendered image.

Translating this idea into a programmatic implementation would see us casting rays of light from every light source in a scene, and following their effects as they bounce around the environment, being variously redirected, absorbed and so on, until some eventually pass through the virtual viewplane and into our virtual eyepoint. The problem with this approach from a computational point of view is that it’s very wasteful—the vast majority of the rays of light that originate at a source don’t actually end up going through the viewplane at all, and therefore all the maths that went into following their route is wasted. The simple fix to this problem is to reverse the process, and to think of rays as originating at the eye point and then moving through the environment until they eventually reach a light source or leave the scene without ever reaching a light.
Figure 5: Example of rays traced through a scene containing various objects.
A practical recursive algorithm for basic Ray Tracing was described by Turner Whitted in 1979; the paper is included in Appendix A.3. With reference to Figure 5, the algorithm goes like this:
1. Fire a ‘primary’ ray from the eye point, through the viewplane (E) and into the scene. If the ray shoots straight through the scene without hitting any surfaces, then it is lost and does not contribute to the final image.

2. Otherwise, at the point where the ray interacts with a surface, check to see whether any light could have arrived directly at that surface from any of the environment’s light sources. This is achieved by sending out ‘shadow feeler’ rays (Sn) in the direction of each of the light sources (LA and LB)—if a shadow feeler ray can reach a light source without being interrupted by some other object, then light can travel from that light source to the point in question to directly illuminate it; otherwise the point is in a shadow caused by the blocking object.

3. If the surface is transparent, we then need to create a ‘refraction ray’ (Tn) that passes through the surface, taking into account any change in direction caused by the density of the material, and also any change in colour or intensity of the ray depending on the opacity and colour of the material. Recursively follow the path of this ray through the scene, much as if it were a primary ray, but remembering where it came from so that its effect can be contributed back to its originating ray.

4. If the surface is reflective, we need to create a ‘reflection ray’ (Rn) that ‘bounces’ off the surface in an appropriate direction. As with refraction rays, we then trace this ray recursively through the scene.

5. When a ray reaches a light source, runs out of ‘energy’ or exits the scene, the process for that ray stops, and the effect it has had on its originating rays is calculated, resulting in a pixel being plotted on the viewplane.

6. Repeat for every pixel on the viewplane (or, if you want a really good result, several times for every pixel by creating virtual ‘sub-pixels’ and blending the results together).

For any single primary ray, a large number of secondary rays (shadow feeler, reflection and refraction) are generated; and the recursive nature of the algorithm means that the number of rays can grow very quickly even for relatively simple scenes.
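The recursion in steps 1–5 maps quite directly onto code. The sketch below (C++) shows only the structure: the scene-query helpers (nearestIntersection, lightIsVisible, localShade and so on) are assumed names with trivial stub bodies, so this illustrates the shape of Whitted’s algorithm rather than a working renderer.

#include <cstdio>
#include <optional>

struct Vec3 { double x, y, z; };
struct Ray { Vec3 origin, direction; };
struct Hit { Vec3 point, normal; double reflectivity, transparency; };

const int numLights = 1;

// Stub: a real tracer would test the ray against every object (Section 7).
std::optional<Hit> nearestIntersection(const Ray&) { return std::nullopt; }
// Stub 'shadow feeler': can light l see this point without obstruction?
bool lightIsVisible(const Vec3&, int) { return true; }
// Stub local illumination (e.g. Phong) at the hit point.
Vec3 localShade(const Hit&, const Ray&) { return {1, 1, 1}; }
Ray reflectedRay(const Hit&, const Ray& r) { return r; }   // stub
Ray refractedRay(const Hit&, const Ray& r) { return r; }   // stub

Vec3 scaleAdd(Vec3 a, const Vec3& b, double w) {
    return {a.x + w * b.x, a.y + w * b.y, a.z + w * b.z};
}

Vec3 trace(const Ray& ray, int depth) {
    auto hit = nearestIntersection(ray);          // step 1: find a surface
    if (!hit || depth == 0) return {0, 0, 0};     // ray lost, or out of 'energy'

    Vec3 colour = {0, 0, 0};
    for (int l = 0; l < numLights; ++l)           // step 2: shadow feelers
        if (lightIsVisible(hit->point, l))
            colour = scaleAdd(colour, localShade(*hit, ray), 1.0);

    if (hit->transparency > 0)                    // step 3: refraction ray
        colour = scaleAdd(colour, trace(refractedRay(*hit, ray), depth - 1),
                          hit->transparency);
    if (hit->reflectivity > 0)                    // step 4: reflection ray
        colour = scaleAdd(colour, trace(reflectedRay(*hit, ray), depth - 1),
                          hit->reflectivity);
    return colour;                                // steps 5/6: one pixel's colour
}

int main() {
    Vec3 c = trace({{0, 0, 0}, {0, 0, -1}}, 5);   // one primary ray
    std::printf("%f %f %f\n", c.x, c.y, c.z);
    return 0;
}

Note how each trace() call can spawn up to two further trace() calls, which is exactly why the number of rays grows so quickly; the depth parameter is one common way of implementing the ‘runs out of energy’ condition.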
You can see from this algorithm that the Ray Tracing approach inherently takes into account (some) surface properties, the position of light sources and the eye point; and it approximates the ‘integration over the hemisphere’ by using lots and lots of rays. One of the main limitations of this approach as an approximation to the ideal solution is also easy to spot from the algorithm. If you think back to Figure 3 you’ll recall that incident light does one of two basic things when reflected: it either bounces off a surface as a ‘high intensity’ beam in a particular focussed direction for shiny surfaces, or gets scattered in all directions as lots of ‘low intensity’ rays for matt surfaces (or some combination of the two). If we were to allow for the ‘scattering’ of rays due to matt surfaces, every ray/surface interaction would generate an even greater number of secondary rays, and very quickly the number of rays being traced would expand beyond what is computationally tractable. So Ray Tracing typically assumes that all surfaces are essentially specular.

Since Ray Tracing is based on a simplified model of physics, it naturally reproduces several important visual effects, including umbra shadowsW, reflections and refraction, and deals easily with textured objects and ‘participating media’ such as fog or smoke. Relatively minor modifications to the algorithm allow for optical ‘camera effects’ including depth of fieldW and shape of apertureW. On the down side, its inability to model diffuse interactions means that soft lighting and matt surfaces are not rendered convincingly, and shadows tend to appear harsher than they would in reality (Ray Tracing doesn’t recreate penumbra shadowsW very well), though various techniques that try to identify regions of the image that could be improved by selectively firing extra rays into ‘areas of interest’ are common (e.g. finding a ‘shadow boundary’ and arranging for extra rays to be traced in its vicinity results in more realistic, softer shadows).

The computational complexity of Ray Tracing is highly sensitive to both ‘image space’ and ‘scene space’: the size of the viewplane, and thus the resolution of the resulting image, determines how many primary rays are fired, while the number, position, size and material type of the objects in the scene determine how many secondary rays are generated. Additionally, since Ray Tracing deals well with reflections and specular highlights, the resulting images are viewpoint dependent, and so not ideally suited to interactive applications such as computer games or virtual environments. On the plus side, the calculations associated with every primary ray are independent of every other ray, so Ray Tracing is embarrassingly parallelW, and there are numerous implementations that use multiple CPUs, GPUs or cores to improve performance.
7 The mathematics of ray / object intersections
Underpinning Ray Tracing, and indeed many other techniques in computer graphics, is the notion of calculating whether a ray has intersected with an object; you’ll encounter this issue again in Volume Rendering in Section 9, as well as in the section on real-time rendering techniques. It’s important to understand the basic principles here (though you won’t be expected to reproduce all the maths in the exam).

First let’s look at the representation of a ray. All we need for this is a way of expressing the ray’s origin as a point in space, and its direction. We can do this with two 3D vectors, so we can say

Rorigin = Ro = (X0, Y0, Z0)    (3)

and

Rdirection = Rd = (Xd, Yd, Zd)    (4)

where Xd² + Yd² + Zd² = 1, i.e. the direction is normalised. The explicit equation of the ray is then given by

R(t) = R0 + Rd ∗ t, where t > 0    (5)

The parameter t here represents the ‘distance along the ray’: if you plug in the value t = 0 you get the ray’s origin, and if you put in t = 1.5 you get the point given by adding 1.5 units’ worth of the direction vector to the ray’s origin.
7.1 Intersection of a ray and a sphere
Now let’s look at a simple mathematical 3D object—a sphere. Conveniently, a sphere can be represented by its centre point (Sc) and its radius (Sr). The surface of the sphere is then all points in 3D space that satisfy the equation

(Xs − Xc)² + (Ys − Yc)² + (Zs − Zc)² = Sr²    (6)

This is the implicit equationW of the sphere, and you should be able to see that it’s a simple application of Pythagoras’ Theorem. To calculate the intersection of a ray and a sphere in 3D space, we need to pair up the two equations and solve them simultaneously; the solution will give us any points in space where intersection occurs (and if the simultaneous equations have no solution, then we know the ray and the sphere don’t intersect at all, so that’s fine too).

Substituting the explicit representationW of the ray (5) into the implicit representation of the sphere (6) gives us

(X0 + Xd ∗ t − Xc)² + (Y0 + Yd ∗ t − Yc)² + (Z0 + Zd ∗ t − Zc)² = Sr²    (7)

which in terms of t simplifies to:

A ∗ t² + B ∗ t + C = 0    (8)

where

A = Xd² + Yd² + Zd² = 1
B = 2 ∗ (Xd ∗ (X0 − Xc) + Yd ∗ (Y0 − Yc) + Zd ∗ (Z0 − Zc))
C = (X0 − Xc)² + (Y0 − Yc)² + (Z0 − Zc)² − Sr²

The value of A will always be 1, since we’ve started with a normalised vector for the direction of the ray. The solution is then a straightforward quadratic:

t = (−B ± √(B² − 4C)) / 2

where A has been left out of the familiar formula since it’s always going to be 1. The ± part of the quadratic solution means of course that we can have two separate answers for t: one if we take the plus, and the other if we take the minus. This is exactly what we would expect to happen when a ray passes through a sphere: one intersection at the point the ray enters the sphere, and another as it leaves, giving two distinct solutions. If the two solutions are the same because B² − 4C is 0, the ray grazes the edge of the sphere, forming a perfect tangent. And if B² − 4C is negative, then the equation has no real solution, meaning that the ray and the sphere do not intersect. This last property is especially useful, since B² − 4C is relatively simple to compute: it means we can determine whether a ray has hit or missed a sphere ‘on the cheap’ before deciding whether to do the more computationally costly square-root calculation to work out exactly where the intersection occurs.
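Transcribed into code, the test looks like the sketch below; the names deliberately mirror the equations above, and the ray direction is assumed to be normalised so that A = 1.

#include <cmath>
#include <cstdio>
#include <optional>

struct Vec3 { double x, y, z; };

// Returns the smallest positive t at which the ray hits the sphere, if any.
std::optional<double> raySphere(const Vec3& R0, const Vec3& Rd,
                                const Vec3& Sc, double Sr) {
    Vec3 oc = {R0.x - Sc.x, R0.y - Sc.y, R0.z - Sc.z};
    double B = 2.0 * (Rd.x * oc.x + Rd.y * oc.y + Rd.z * oc.z);
    double C = oc.x * oc.x + oc.y * oc.y + oc.z * oc.z - Sr * Sr;

    double disc = B * B - 4.0 * C;       // the cheap hit/miss test
    if (disc < 0) return std::nullopt;   // negative: ray misses the sphere

    double s = std::sqrt(disc);          // only now pay for the square root
    double t = (-B - s) / 2.0;           // nearer of the two solutions
    if (t < 0) t = (-B + s) / 2.0;       // origin inside sphere: try far hit
    if (t < 0) return std::nullopt;      // sphere entirely behind the ray
    return t;
}

int main() {
    // Unit-direction ray along +z, sphere of radius 1 centred at (0,0,5).
    auto t = raySphere({0, 0, 0}, {0, 0, 1}, {0, 0, 5}, 1.0);
    if (t) std::printf("hit at t = %f\n", *t);   // expect t = 4
    return 0;
}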
7.2 Intersection of a ray and a polygon
Ray/sphere intersections are all very well and good—they are cheap and easy to understand—but they only work for rays and spheres... and not many real scenes consist only of spheres! (Though you’ll notice that many ray traced scenes do indeed primarily consist of spheres, and now you know why!) Other regular shapes can be represented mathematically, but once you move past simple solids such as cones and cylinders, the maths gets prohibitively messy, and by the time you reach a realistic object it’s far too complex. We’re familiar already with the technique of representing 3D objects as meshes of polygons, so let’s investigate how we’d do an intersection between a ray and a polygon, since this can then be used to build many different 3D shapes.

The approach here is to first see if the ray intersects with the plane of the polygon (ignoring for now the boundary of the polygon). A three-dimensional plane can be represented as

A ∗ x + B ∗ y + C ∗ z + D = 0    (9)

where A² + B² + C² = 1. The normal to the plane is Pn = (A, B, C), and the distance of the plane from the co-ordinate system’s origin is D. Using the same kind of approach as we did for the sphere, we substitute the equation of the ray (5) into that of the plane to give:

A ∗ (X0 + Xd ∗ t) + B ∗ (Y0 + Yd ∗ t) + C ∗ (Z0 + Zd ∗ t) + D = 0    (10)

If we solve this for t we get

t = −(A ∗ X0 + B ∗ Y0 + C ∗ Z0 + D) / (A ∗ Xd + B ∗ Yd + C ∗ Zd)

and representing this in vector notation gives the rather cleaner

t = −(Pn · R0 + D) / (Pn · Rd)

If we look at this vector equation we can see that if the denominator is non-zero, then we have a solution that gives us a point where the ray intersects the plane; if the denominator is exactly zero, then the ray either misses the plane entirely, or hits it ‘edge on’ (which has no effect). If we end up with a value for t that is negative, then the plane is ‘behind’ the origin of the ray, so we wouldn’t count that as an intersection anyway. If we get a positive value for t, then the position of the intersection is given by

ri = (xi, yi, zi) = (X0 + Xd ∗ t, Y0 + Yd ∗ t, Z0 + Zd ∗ t)

or, put another way, it is the point t units along the original ray.
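In code the ray/plane test is shorter still. This sketch assumes the plane normal and ray direction are normalised, and returns nothing when the denominator is zero or t is negative, exactly as described above.

#include <cstdio>
#include <optional>

struct Vec3 { double x, y, z; };
double dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Plane: Pn . p + D = 0. Returns t where the ray meets the plane, if it does.
std::optional<double> rayPlane(const Vec3& R0, const Vec3& Rd,
                               const Vec3& Pn, double D) {
    double denom = dot(Pn, Rd);
    if (denom == 0.0) return std::nullopt;   // parallel: miss, or edge-on
    double t = -(dot(Pn, R0) + D) / denom;
    if (t < 0) return std::nullopt;          // plane is behind the ray's origin
    return t;                                // intersection point is R0 + Rd*t
}

int main() {
    // Ground plane z = 0 (normal (0,0,1), D = 0); ray from above, pointing down.
    auto t = rayPlane({0, 0, 10}, {0, 0, -1}, {0, 0, 1}, 0.0);
    if (t) std::printf("hit at t = %f\n", *t);   // expect t = 10
    return 0;
}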
But finding the intersection between the ray and a 2D plane is only half the story: the polygon isn’t the whole of the plane, it’s just a smaller enclosed part of it. So next we need to find out whether the point we’ve found on the plane is inside or outside our polygon’s edge boundary.

There are two common ways of doing this. The first, which relies on our polygon having a consistent ‘winding’ and which operates in object space, is to check whether the potential intersection point is ‘to one particular side’ of each of the polygon’s edges (whether this is ‘to the left of’ or ‘to the right of’ depends on the winding of the polygon: if it’s wound clockwise, we test ‘to the right’; if it’s anticlockwise, then ‘to the left’). So for a clockwise-wound polygon, if the point is ‘to the right’ of all the edges, it is inside the polygon... otherwise it’s outside. See Figure 6. It may take you a while to convince yourself this is true—but draw a couple of points and polygons on paper and it’ll soon become clear.

Figure 6: Using the winding of a polygon to determine whether a point is inside or out. Taking the edges in order from a to f, relative to the direction of each individual edge p1 is ‘to the right’ of them all, so is inside the polygon’s boundary. The point p2, however, is to the left of b, so fails the test and must be outside the polygon’s boundary.
The alternative way of achieving the same thing is to use an interesting property in image space arising from the Jordan Curve TheoremW. This involves shooting a ray in an arbitrary direction from the intersection point and counting the number of times it crosses the polygon’s edges. If the number of crossings is odd, then the point is inside the polygon; if it’s even, the point is outside. You can probably convince yourself that this is true in the following way: if you start at a point inside the polygon, then the first time you cross an edge takes you outside the polygon; if it’s a concave polygon, then if you keep on going you may end up back inside again (your second crossing), in which case a third crossing takes you ‘outside’ again; and so on. See Figure 7. Convincing yourself that this is absolutely true for all possible shaped polygons, though, is much harder, and involves maths far beyond the scope of this course (or indeed, the understanding of this lecturer!).

Figure 7: One of the curious consequences that follows from the Jordan Curve Theorem is that for any polygon, starting from any point and drawing a long enough line in any direction, if you cross the polygon’s boundary an odd number of times then the point is inside, and for an even number of crossings it’s outside. Strange but true.
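A minimal sketch of the crossing-count test, working in the 2D plane of the polygon (a real implementation would first project the 3D intersection point and the polygon’s vertices into 2D): shoot a ‘ray’ along +x from the point and flip an inside/outside flag at every edge crossing—an odd total means inside.

#include <cstdio>
#include <vector>

struct Point2 { double x, y; };

bool insidePolygon(const Point2& p, const std::vector<Point2>& poly) {
    bool inside = false;              // flips once per crossing: odd = inside
    size_t n = poly.size();
    for (size_t i = 0, j = n - 1; i < n; j = i++) {
        const Point2& a = poly[i];
        const Point2& b = poly[j];
        // Does edge a-b straddle the horizontal line through p, and does it
        // cross that line to the right of p (i.e. along our +x 'ray')?
        if ((a.y > p.y) != (b.y > p.y) &&
            p.x < a.x + (b.x - a.x) * (p.y - a.y) / (b.y - a.y))
            inside = !inside;
    }
    return inside;
}

int main() {
    std::vector<Point2> square = {{0, 0}, {4, 0}, {4, 4}, {0, 4}};
    std::printf("%d %d\n", insidePolygon({2, 2}, square),    // 1: inside
                           insidePolygon({5, 2}, square));   // 0: outside
    return 0;
}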
7.3 Questions
1. Which of the following effects can be achieved by Ray Tracing?

(a) Specular highlights
(b) Colour Bleed
(c) Refraction / transparency
(d) Reflection
(e) Caustics
(f) Depth of Field
(g) Motion Blur
(h) Umbra Shadows
(i) Penumbra Shadows
(j) Participating media

2. Which of the following influence the final image produced by Ray Tracing:

(a) The position of the viewpoint
(b) The position of light sources
(c) The position of objects in the scene
(d) The surface properties of objects in the scene
(e) The colour of the scene’s light sources

3. Which of the following influence the performance of a ray tracing algorithm:

(a) Changing the size of the viewplane
(b) Changing the number of light sources
(c) Moving the viewpoint
(d) Changing the colour / intensity of a light source
(e) Changing the transparency of an already transparent surface
(f) Changing the complexity of the polygonal models
(g) Changing polygonal models for procedural / mathematical ones

4. Most local illumination models include an ‘ambient light’ term. How are local illumination models used in Ray Tracing, and what real-world phenomenon does ‘ambient light’ approximate?

5. How does the Ray Tracing technique determine if a surface is directly illuminated by a particular light source? What two reasons cause a light source not to directly contribute to the illumination of a point in the scene?
7.4 Ray Tracing Ponderings
1. What would you need to do to extend the Ray Tracing technique to better represent the effects of diffuse surfaces? What are the practical implications of this for the algorithm’s performance?

2. How might you use the simple geometry of a sphere to reduce the number of ray/polygon intersections?

3. Ray Tracing results in pixels being drawn on a viewplane, so it’s viewpoint dependent. What would be necessary in order to create textures for the objects in the scene that could then be used for interactive rendering? What would the limitations of this approach be?

4. Which parts of the Ray Tracing process are embarrassingly parallel? Which bits aren’t?

5. How could human perception, vision and/or features of the technology used to display the Ray Traced scenes influence the algorithm?
8 Radiosity
Whereas Ray Tracing treats light transport as though it consists of ‘infinitely thin rays’, Radiosity goes to the other abstract extreme and treats it as a general exchange of energy between parts of the scene. As you might expect, where Ray Tracing excels at dealing with transparency and reflection of light but does rather badly at the ‘softer’ effects (such as matt/diffuse surfaces or subtle shadowing), Radiosity has the opposite qualities: it is based on the assumption that all surfaces are perfect diffusers (opaque, ideal Lambertian surfacesW; see Figure 9). It generates results like those in Figure 8.
Figure 8: Three scenes rendered using the Radiosity technique. Left, by John Wallace and John Lin; centre by D. Lischinski, F. Tampieri, D. P. Greenberg; and right by Simon Gibson.
Figure 9: In the real world, the diffuse reflection of light from a surface results in a complex scattering pattern (‘general diffuse reflection’); in computer graphics it’s common to assume instead that surfaces are ideal Lambertian diffuse reflectors, where light is scattered equally in all directions. Figure by Roger Hubbold.
Radiosity methods were first developed in about 1950 in the engineering field of heat transferW. They are based on the idea of the conservation of energy (or ‘energy equilibrium’), and were later refined specifically for application to the problem of rendering computer graphics in 1984 by researchers at Cornell University (Goral et al., 1984). The term Radiosity in Computer Science usually refers to the technique for computing diffusely reflected illumination that we’ll be exploring here; but the term originally comes from the field of thermodynamics, where thermodynamic radiosityW is the fluxW leaving a surface at a point x, the counterpart of irradianceW, which is the flux arriving at that point. Here, ‘flux’ is just a term that means ‘a flow of particles’ (where in this case the particles are photons).

Figure 10: To the left, a scene rendered using the Radiosity approach, and to the right the ‘patches’ used to calculate the solution. Note that regions that in the original model were probably modelled as a single large polygon (such as the whiteboard or the floor) have been broken up into many much smaller patches. Images by Simon Gibson.

The surfaces of the scene to be rendered are each divided up into one or more smaller surfaces called patches, since a polygonal mesh that’s ideal for modelling a scene isn’t necessarily good for Radiosity (see Figure 10; we’ll explore this in more detail later). A ‘form factor’ (also known as a ‘view factor’) is computed for each pair of patches; this value is a measure of how well the patches are visible to one another. Patches that are far away from each other, or oriented at oblique angles relative to one another, will have smaller form factors. If other patches are in the way, the form factor will be reduced or zero, depending on whether the occlusion is partial or total. A procedural version of the Radiosity process can be thought of as working as follows (see Figure 11):
1. Identify patches that are associated with light sources.

2. Shoot light energy into the scene from the sources, and consider the diffuse-diffuse effect on any patches that are visible to the light source. These target patches accumulate light energy.

3. Repeat the process, starting with the patches that have the most unshot energy.

4. Stop when a high percentage of the initial energy is used up.

Figure 11: The process starts by considering the light energy sources, which transfer energy directly to other patches. Those other patches in turn then have energy to share with the rest of the scene. The process continues until equilibrium is reached.
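A sketch of how that four-step ‘shooting’ process might look in code is given below. The Patch fields, the form-factor matrix and the 1% stopping threshold are all illustrative assumptions, and the form factors are simply taken as given—computing them is the hard part, as we’ll see in Section 8.1.

#include <cstdio>
#include <vector>

struct Patch {
    double reflectivity;   // rho: fraction of received energy re-emitted
    double radiosity;      // accumulated energy (what we eventually display)
    double unshot;         // energy received but not yet shot back out
};

void shoot(std::vector<Patch>& patches,
           const std::vector<std::vector<double>>& formFactor,
           double totalEmitted) {
    double remaining = totalEmitted;
    while (remaining > 0.01 * totalEmitted) {       // step 4: stop near equilibrium
        // Step 3: pick the patch with the most unshot energy.
        size_t s = 0;
        for (size_t i = 1; i < patches.size(); ++i)
            if (patches[i].unshot > patches[s].unshot) s = i;

        double e = patches[s].unshot;               // step 2: shoot its energy...
        patches[s].unshot = 0;
        remaining -= e;
        for (size_t j = 0; j < patches.size(); ++j) {
            if (j == s) continue;
            double received = e * formFactor[s][j]; // ...weighted by visibility
            patches[j].radiosity += received;
            double re = received * patches[j].reflectivity;
            patches[j].unshot += re;                // to be re-shot later
            remaining += re;
        }
    }
}

int main() {
    // Step 1: patch 0 is a light source (its emission is its initial unshot
    // energy); patch 1 is a grey surface that only reflects.
    std::vector<Patch> p = {{0.0, 100.0, 100.0}, {0.5, 0.0, 0.0}};
    std::vector<std::vector<double>> F = {{0.0, 0.8}, {0.8, 0.0}};
    shoot(p, F, 100.0);
    std::printf("radiosity: %f %f\n", p[0].radiosity, p[1].radiosity);
    return 0;
}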
In practice, the energy values and form factors are used as values in a linearised form of the rendering equation, which yields a linear system of equationsW. Solving this set of equations gives the Radiosity, or brightness, of each patch, taking into account diffuse interreflections and soft shadows. In ‘classic Radiosity’ we notionally solve all the equations simultaneously (which is a lot of computation, and for a complex scene is quite slow even on modern hardware—remember we’ve taken the original scene’s mesh and split it up into even smaller polygons) and end up with a final solution. Progressive Radiosity (e.g. Yu, 1996, and see Figure 12) solves the system iteratively, in such a way that after each iteration we end up with intermediate Radiosity values for the patches. These intermediate values correspond to bounce levels: after one iteration, we know how the scene looks after one light bounce; after two passes, two bounces; and so forth. Progressive Radiosity is useful for getting an interactive preview of the scene. Also, the user can stop the iterations once the image looks good enough, rather than wait for the computation to numerically converge (we’ll look at some of the things that can go wrong with Radiosity in Section 8.2).
Figure 12: Progressive Radiosity stages. As the algorithm iterates, light can be seen to flow into the scene, as multiple bounces are computed. Individual patches are visible as squares on the walls and floor. Image modified from an original by Hugo Elias.
Figure 13: Two patches, Ax and Ax′, and their respective ‘elemental areas’ δAx and δAx′. Mathematically, to calculate the effect of the whole of patch Ax on patch Ax′ we would integrate over all elemental areas. In practice, computationally we do this by treating the elemental areas as just being lots of ‘tiny areas’.
Another common method for solving the Radiosity equation is ‘shooting Radiosity’, which iteratively solves the Radiosity equation by ‘shooting’ light from the patch with the most energy at each step. After the first pass, only those patches which are in direct line of sight of a light-emitting patch will be illuminated. After the second pass, more patches become illuminated as the light begins to bounce around the scene. The scene continues to grow brighter and eventually reaches a steady state (this is the ‘procedural’ version of Radiosity described earlier).

Let’s look at the process in a bit more detail, including some of the underlying maths (again, don’t worry—you won’t need to reproduce this in the exam; you just need to understand the principles and the role of the various components). Radiosity B is the energy per unit area leaving a patch surface per discrete time interval. It is the combination of emitted and reflected energy, and can be described by the following equation:

B(x)δA = E(x)δA + ρ(x)δA ∫S B(x′) · F(x, x′) δA′    (11)
Figure 14: The (1/πr²) cos θx cos θx′ component of the Radiosity equation arises from the geometric relationship between the two elemental patches under consideration. We need to take into account the falloff in energy transfer due to the distance between them (the ‘inverse square law’ 1/πr² part) as well as their orientation with respect to one another (the cos θx cos θx′ part).
where

- B(x)δA is the total energy leaving a small area δA around the point x (see Figure 13).
- E(x)δA is the emitted energy.
- ρ(x) is the reflectivity of the point, giving reflected energy per unit area when multiplied by the incident energy per unit area (the total energy which arrives from other patches).
- S denotes that the integration variable x′ runs over all the surfaces in the scene.
- F(x, x′) is the form factor of x to x′, defined to be 1 if the two points x and x′ are completely visible to each other, and 0 if they are not.

Now of course, as with the rendering equation, we can’t deal sensibly with mathematical integration, so we’ll need to convert the ‘continuous’ integral part of the equation into something more discrete that we can represent in code. If the surfaces are approximated by a finite number of planar patches, each of which is taken to have a constant Radiosity Bi and reflectivity ρi, the above equation gives the discrete Radiosity equation:

Bi = Ei + ρi ∑j=1..n Bj Fij    (12)

where Fij is the form factor for the radiation leaving j and hitting i. This equation can then be applied to each patch in the scene, which means that for n patches we end up with n simultaneous equations to solve. This can be represented as a matrix equation, and solved using one of a variety of techniques for dealing with matrix solutions, such as Jacobi iterationW or the Gauss–Seidel methodW; you don’t need to know the details of these methods for this course unit, beyond the fact that the result is a set of Radiosity values, one for each patch.
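As an illustration (not code from these notes), a Jacobi-style iteration over equation (12) might look like the sketch below. Conveniently, each pass corresponds to gathering one more ‘bounce’ of light, which is exactly the behaviour Progressive Radiosity exploits. E, rho and the form-factor matrix F are assumed inputs.

#include <cstdio>
#include <vector>

std::vector<double> solveRadiosity(const std::vector<double>& E,
                                   const std::vector<double>& rho,
                                   const std::vector<std::vector<double>>& F,
                                   int passes) {
    size_t n = E.size();
    std::vector<double> B = E;                   // first guess: emission only
    for (int pass = 0; pass < passes; ++pass) {  // each pass ~ one light bounce
        std::vector<double> next(n);
        for (size_t i = 0; i < n; ++i) {
            double gathered = 0;
            for (size_t j = 0; j < n; ++j)       // Bi = Ei + rho_i * sum_j Fij*Bj
                gathered += F[i][j] * B[j];
            next[i] = E[i] + rho[i] * gathered;
        }
        B = next;
    }
    return B;
}

int main() {
    // Toy scene: patch 0 emits, patches 1 and 2 only reflect.
    std::vector<double> E = {1.0, 0.0, 0.0}, rho = {0.0, 0.5, 0.5};
    std::vector<std::vector<double>> F = {
        {0.0, 0.5, 0.5}, {0.5, 0.0, 0.5}, {0.5, 0.5, 0.0}};
    auto B = solveRadiosity(E, rho, F, 20);
    std::printf("B = %f %f %f\n", B[0], B[1], B[2]);
    return 0;
}

Stopping after a small number of passes gives exactly the intermediate, per-bounce previews described above.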
8.1 Calculating the Form Factor
The form factor Fij between patches i and j needs to take into account three main things:

1. The distance between the two patches, and their orientation with respect to one another.

2. The ‘shape’ of the two patches when projected onto one another.

3. The effect of any objects that occlude the transfer of light between the two patches (i.e. objects that get in the way and may cause shadows or a reduction in the energy transfer).
The first and second of these are purely geometric, and rely only on properties of the two patches. For the first, we need to know the relationship between the surface normals of the patches and their relative orientations. This can be represented by the expression

(1/πr²) cos θx cos θx′

which is illustrated in Figure 14.

The second, ‘projected shape’ aspect can be calculated in a number of ways, but perhaps the most intuitive is the one called the Nusselt analogueW (shown in Figure 15). The form factor between a differential element δAi and the element Aj can be obtained by projecting the element Aj first onto the surface of a unit hemisphere, and then projecting that in turn onto a unit circle around the point of interest in the plane of Ai. The value we need is then equal to the differential area δAi times the proportion of the unit circle covered by this projection. Rather conveniently, the projection onto the hemisphere takes care of the cos θx′ and 1/r² factors for us, and the projection onto the circle and the division by its area then takes care of the cos θx and the normalisation by π, making the previous step essentially redundant.
Figure 15: The projected solid angleW between patches i and j can be obtained by projecting the element Aj onto the surface of a unit hemisphere, and then projecting that in turn onto a unit circle around the point of interest in the plane of Ai. The form factor (excluding the ‘occlusion’ part) is then equal to the proportion of the unit circle covered by this projection. Form factors obey the reciprocity relation AiFij = AjFji. Image by JhealdW.
The third and final component, the ‘occlusion’ part, is more complex and mathematically less elegant, since it needs to take into account not just the geometry of the two patches but that of every other object/patch in the scene (Figure 16). Early methods of calculating this computationally used a hemicube: an imaginary cube centred upon the first surface, onto which the second surface is projected (devised by Cohen and Greenberg in 1985, and shown in Figure 17). The surface of the hemicube was divided into pixel-like squares, for each of which a form factor can be readily calculated analytically. The full form factor could then be approximated by adding up the contribution from each of the pixel-like squares. The projection onto the hemicube, which could be adapted from standard methods for determining the visibility of polygons, also solved the problem of intervening patches partially obscuring those behind (akin to the z-buffer technique that you’ll have encountered in the second year course unit; or you could imagine using ‘ray casting’ techniques to work out whether occlusion has occurred, by shooting a ray from the source patch towards one of the hemicube’s ‘pixels’ and seeing whether it intersects any other polygons on the way).

However, all this was quite computationally expensive, because ideally form factors must be derived for every possible pair of patches, leading to a quadratic increase in computation as the number of patches increases. This can be reduced somewhat by using a technique called a binary space partitioning tree (we’ll see this in more detail in Section 11.4) to reduce the amount of time spent determining which patches are completely hidden from others in complex scenes; but even so, the time spent determining form factors still typically scales as O(n log(n)). Methods of adaptive integration (adaptive) have been used to improve this, but are outside the scope of this course unit.

Figure 16: The form factor needs to take into account the effect of occluding objects, since in reality some high proportion (probably over 90%) of patches won’t be visible to one another because of occlusion. We need to calculate what percentage of the energy is absorbed by the occluder and does not reach the receiving patch.
8.2 Issues with the basic Radiosity technique
The hemicube approach is of course a discrete approximation to what in the ‘real world’ is a continuous phenomenon, and like all approximations it has limitations. One of the biggest is the regular division of the pixels on the hemicube’s surface, and the assumption that patches will project onto an integer number of pixels. In Figure 18, although regions a and b of patch i are of equal width, they project onto very different sized areas of the hemicube on patch j; region b is likely to ‘hit’ a greater number of pixels than region a, giving inaccurate results. The problem gets worse the closer the two surfaces are (e.g. a sheet of paper resting on a table). Of course increasing the resolution of the hemicube helps, but at a computational and memory cost; and like most aliasing problems, these brute-force techniques cannot ever guarantee to give perfect results.

Figure 17: A hemicube placed on a surface; each face of the hemicube is covered with pixels that are treated rather like a z-buffer. Projecting the patch onto the hemicube’s pixels gives an indication of the visibility of the patch to the surface, rather like a crude approximation to the Nusselt hemisphere.

Figure 18: Issues with hemicube aliasing; polygons projected onto the flat surface of the hemicube interact with differing numbers of pixels, depending on the angle of projection.
Another issue is that of generating the patches in the first place. Because we don’t know in advance where shadows and patches of brightness are going to end up, it’s impossible using raw Radiosity techniques to make sensible up-front decisions about how to split the scene’s original polygonal mesh into smaller patches. If we create too many, we slow down the solution; if we create too few, we miss subtle lighting effects. One solution is to use a ray-casting approach early on to work out where interesting effects are likely to occur: shooting rays from light sources past the edges of potential occluders and onto large surfaces gives us a hint as to where ‘discontinuities’ in the final result are likely to happen, allowing us to create a greater number of patches in those areas whilst leaving ‘boring’ areas relatively untouched.

Finally, since basic Radiosity rendering assumes energy equilibrium in a closed environment, tiny flaws in the original polygonal mesh can create ‘holes’ in the environment that allow energy to leak out, giving darker results than one would expect.
8.3 Pros and cons of Radiosity
Radiosity solutions produce excellent results for scenes containing primarily diffuse surfaces (which is a large proportion of real-world scenes!). Its shadows are generally softer and more subtle than those generated by Ray Tracing, and can often look more realistic than the crisper umbra shadows of basic Ray Tracing; and it deals with other visual effects such as colour bleedW. It cannot, however, cope with transparent or shiny objects.

Unlike Ray Tracing, which results in a view-dependent image-space result (i.e. an image projected onto a viewplane as though seen from a particular position in the scene), Radiosity produces view-independent world-space results (i.e. a mesh of polygons, each with its own colour, which have that diffuse colour independent of where they are viewed from). This makes it more applicable to interactive applications (games etc.). It’s also straightforward to map the colours from the resulting patch mesh onto textures which can then be mapped back onto the model’s original, and usually simpler, polygonal mesh to improve realtime rendering performance.
8.4 Questions
- 1. Which of the following effects can be achieved by Radiosity?
(a) Specular highlights (b) Colour Bleed (c) Refraction / transparency (d) Reflection (e) Caustics (f) Depth of Field (g) Motion Blur (h) Umbra Shadows (i) Penumbra Shadows (j) Participating media
- 2. What is the role of a form factor?
3. Patches created by Radiosity are often rendered using Gouraud shading [W]. What visual effect does this have on shadow boundaries?
8.5 Radiosity Ponderings
1. How could you include specular effects in a scene rendered using Radiosity?
2. What factors influence how the original polygonal mesh should be sub-divided into patches?
3. What is it about Radiosity that makes it easier to generate textures for objects than it is with Ray Tracing?
4. In Ray Tracing, we terminate the tracing of a ray when it leaves a scene without returning to a light source, or when it runs out of 'energy'. What conditions might be used in a Radiosity solution to help improve the compute time? (Hint: see Appendix A.7, though there are simpler optimisations that aren't mentioned in these notes as well.)
5. How would you extend Radiosity to deal with 'participating media' in the air such as fog / dust?
6. What issues might arise when trying to create Radiosity solutions for outdoor scenes?
9 Volume Rendering
Almost every form of rendering we've looked at so far in this course unit and its second-year predecessor has revolved around drawing an object's outer surface. This isn't surprising, since most things that we'd want to render can be represented nicely this way, whether that surface is generated parametrically or as a mesh of connected polygons. But a special class of scenes exists where we are less interested in the outer surfaces of things, and more concerned about what's 'inside'—and this is where Volume Rendering (levoy88) (Appendix A.8) is useful.
Figure 19: Three examples of Volume Rendering from different kinds of data: (a) a volume rendered cadaver head using view-aligned texture mapping and diffuse reflection; (b) a mummified crocodile re-created from CT data and rendered using volume ray casting; and (c) properties of the world's oceans rendered from a simulation. Composite of images created by Sjschen [W], stefanbanev [W] and James Marsh.
Volume Rendering usually starts with a 3D data-set. This could be generated synthetically as the output of a simulation (say of fluid flow, or climate behaviour), but is perhaps more commonly acquired by scanning or measuring some real-world phenomenon in a regular grid pattern. In the case of medical applications, the kind of data we're talking about here is generated from CT [W], MRI [W] or MicroCT [W] scanners; in a geographical context, the data could come from the use of sonar [W] or ground penetrating radar [W]. But anything that generates regular 3D grids of data can be used as a basis for volume rendered images. The important point here is that it doesn't really matter what properties the data set contains, what they represent in the real world, what units they are in, or what the resolution of the samples is: as long as they form a regular 3D array of data points, the techniques we'll explore here will be able to render them.

We're going to look at two different approaches to creating images from volume sets, called 'Direct' and 'Indirect' Volume Rendering (it'll become obvious why they're called this later)—but both start with these 'volume sets', which are just 3D arrays of data. The 3D cells of such data sets are referred to as 'voxels' (the volumetric equivalent of pixels).
10 Direct Volume Rendering
Figure 20: Simple Volume Rendering
Direct Volume Rendering creates images by casting rays into a volume set. Rather like Ray Tracing, we shoot these rays out from the eyepoint through a viewplane and into the scene (Figure 20). Unlike Ray Tracing, where we typically spawn new secondary rays as the primary rays intersect with object surfaces, in Volume Rendering we allow the original primary ray to penetrate through the scene 'into' the data, and we accumulate the effects of colour and transparency along the ray to work out what coloured pixel to plot on our viewplane, as shown in Figure 21 (we'll look at an algorithm that combines the effects of all the opacity/colours visited in Section 10.2).
But where do we get colour and transparency values from if our volume set is made of arbitrary data? In the case of a Functional Magnetic Resonance [W] scanner, for example, our data points are some measure of the strength of the magnetic signal from hydrogen nuclei in water at various points in the subject being scanned; in the case of sonar, the values are representations of the time taken between sending out a sonar 'ping' and it being reflected back to the sensor. Whatever source they came from, they don't have any kind of colour or opacity associated with them, so we need to make that association ourselves. Generally speaking, Volume Rendering isn't used to create photorealistic images anyway, but rather to see the 'inside' of something that you can't see under normal conditions, like the inside of a brain or a mountain, and it's usually because you want to identify some special phenomenon that would otherwise be hidden (like a tumour in tissue or a fissure in rock). So making up colour schemes is usually fine.
Figure 21: A ray cast from the eyepoint through a 2D representation of a voxel set; as the ray passes through the grid it accumulates colours from the different cells, depending on their opacity. If sampling along the ray is done once per visited cell, there is a danger of aliasing occurring, so it is common to sample along the ray at sub-voxel intervals.
The first step in generating colour and opacity values for our voxels is to classify the different 'materials' present in our data. The term 'materials' is used loosely here—it may of course be that there aren't any distinct materials involved, if your data represents something continuous like the temperature of a liquid or something amorphous like water itself. But we'll use the example of an MR scan of a human body, so we'd expect to find resonance values that correspond to things such as bone, skin, muscle tissue, fat and air. The question is: how do we determine which is which?

For now we'll assume that each voxel contains only one value. We begin by calculating a histogram of voxel frequency, i.e. for each data value, how many voxels have that value? We then produce a probability distribution to represent the likelihood that a given voxel value corresponds to a specific material or mixture of materials (imagine for a moment that the thing you've scanned consists of, say, just bone and air; we'd expect to see a lot of data points that have 'boney' values, a lot that have 'airy' values, and some with intermediate values, where we happen to have sampled a region of space that has a bit of both). A typical probability distribution function is shown in Figure 22. The peaks in this graph are likely to correspond to different types of materials commonly found in our data sets—the troughs between them are going to be where we've sampled transitions between one material and another (or could potentially be some other infrequently occurring material). For now let's assume that our scene consists only of air, fat, muscle and bone. By looking at the peaks in our graph, we could decide where to draw the distinction between one material type and another (and we'd count the tail of the graph as being anything that's not air, fat or muscle – in this made-up example, that would be bone).
Figure 22: A plot of voxel value against frequency; peaks naturally occur around values that represent different ‘materials’ in the underlying data.
another (and we’d count the tail of the graph as being anything that’s not air, fat or muscle – in this made-up example that would be bone). Figure 23 shows a possible classification of the different voxel values. For each material type, we now want to assign a different colour, as shown in Figure 24. Then we can set opacity values depending on what materials we want to be visible or invisible. For example, suppose we want to view only bone, without any of the clutter of muscle or fat. In this case we’d set the α-value of all voxels that have been classed as being bone to 1, and we’d want to make all other materials totally transparent (i.e. set their α-value to 0). If we want instead to see muscle displayed partially trans- parently and overlaid on top of the bone, we’d just set the corresponding α-values as shown in Figure 25.
10.1 Trilinear Interpolation
When calculating the value at a sample point along the ray, we could just use the value of the voxel we're passing through. Though this is quick, even with sub-voxel sampling distances along the ray it can still lead to aliasing effects, so it's usually better to interpolate values from neighbouring voxels in 3D to get a more representative sample value. This process is called Trilinear interpolation [W], and is shown in Figure 26. It sounds a bit grand, but the idea is quite simple: all you're doing here is 'averaging out' the values around the data point, biased in the three different directions.
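As a sketch of how this works in code (in C, assuming a helper voxel(x, y, z) that returns the stored value at integer grid coordinates), trilinear interpolation is just three rounds of linear interpolation, one per axis:

    extern float voxel(int x, int y, int z);  /* assumed accessor for the 3D array */

    float sample_trilinear(float fx, float fy, float fz)
    {
        int   x  = (int)fx,  y  = (int)fy,  z  = (int)fz;  /* cell origin       */
        float tx = fx - x,   ty = fy - y,   tz = fz - z;   /* fractions in 0..1 */

        /* interpolate along x on the four x-aligned edges of the cell... */
        float c00 = voxel(x, y,   z  ) * (1 - tx) + voxel(x+1, y,   z  ) * tx;
        float c10 = voxel(x, y+1, z  ) * (1 - tx) + voxel(x+1, y+1, z  ) * tx;
        float c01 = voxel(x, y,   z+1) * (1 - tx) + voxel(x+1, y,   z+1) * tx;
        float c11 = voxel(x, y+1, z+1) * (1 - tx) + voxel(x+1, y+1, z+1) * tx;

        /* ...then along y... */
        float c0 = c00 * (1 - ty) + c10 * ty;
        float c1 = c01 * (1 - ty) + c11 * ty;

        /* ...and finally along z: seven linear interpolations in total */
        return c0 * (1 - tz) + c1 * tz;
    }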
Figure 23: Voxels can be classified as representing different materials based on the peaks in the voxel/frequency graph.
Figure 24: A colour table mapping the data value at a sample point (0–255) to a colour. Once different voxel ranges have been associated with particular materials (air, fat, muscle, bone), the colour of those voxels can be set to represent the different materials.
Figure 25: An opacity table mapping the data value at a sample point (0–255) to an opacity between 0.0 and 1.0. Once different voxel ranges have been associated with particular materials, the opacity of voxels in those distinct ranges can be set to give different rendering effects. In this example, all voxels in the ranges representing air and fat have been made transparent, muscle is set to partially transparent, and bone to nearly opaque.
10.2 Computing the colour
With the colour and opacity values calculated, we can begin the process of casting rays into the voxel set and accumulating the pixel values on the viewplane by compositing along the ray. To calculate the effect a particular sample point will have on our final pixel's value, we need to take into account the effect of the other values along the ray that leads to it (see Figure 27). The general idea here is that you keep track of an accumulated colour and an accumulated opacity value. Every time you step along the ray you take into account how opaque a voxel is, and its colour, and add this effect to the accumulated values, until you reach a certain 'terminal opacity' (i.e. the ray has passed through so much material that it cannot penetrate any further). This can be achieved by the simple pseudo-code shown in Figure 28.

If you render scenes using this approach with suitable colour and opacity values, the results appear rather 'flat', somewhat resembling X-ray photography (albeit a strange kind of coloured X-ray, e.g. Figure 29). For some applications this is fine, but the absence of local surface shading loses some of the 3D nature of the scene. The problem of course is that we don't have any 'surfaces' in our data to shade: all we've got are the underlying data points, and some colour and opacity values that we've added ourselves.

In DVR, it's possible to create 'fake' surface normals from the data by interpreting changes in voxel values over the three different dimensions as
giving the x, y and z components of a vector that can be used as a kind of surface normal. Even though the vector's direction doesn't mean much in any physical or absolute sense, if the underlying data vary smoothly according to 'surfaces' that exist in the real world, then this vector will vary smoothly too, in some sense representing that surface. And that's all we care about in most cases (since inside a body or in the middle of a lump of rock, any notion of 'realistic lighting' is moot anyway!). By looking at these value gradients in three dimensions (Figure 30 shows this approach in 2D), we can create a faux surface normal vector that can be plugged into a Local Illumination calculation to give plausible surface shading, without actually having to pin down a definite surface.

Figure 26: Trilinear interpolation around a sample point.

Figure 27: Accumulated values along a ray.
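A sketch of computing such a faux normal using central differences, reusing the assumed voxel(x, y, z) accessor from the trilinear interpolation sketch; the gradient of the scalar field is simply normalised and used in place of a surface normal:

    #include <math.h>

    extern float voxel(int x, int y, int z);

    void faux_normal(int x, int y, int z, float n[3])
    {
        /* central differences in each of the three dimensions */
        n[0] = voxel(x+1, y, z) - voxel(x-1, y, z);
        n[1] = voxel(x, y+1, z) - voxel(x, y-1, z);
        n[2] = voxel(x, y, z+1) - voxel(x, y, z-1);

        /* normalise, leaving a zero vector in perfectly uniform regions */
        float len = sqrtf(n[0]*n[0] + n[1]*n[1] + n[2]*n[2]);
        if (len > 0.0f) { n[0] /= len; n[1] /= len; n[2] /= len; }
    }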
10.3 Indirect Volume Rendering
In Direct Volume Rendering, any 'surfaces' that appear in the rendered image are implicit—they are effectively visual artefacts of the process and exist only in the mind of the viewer, rather than being something that's been calculated specifically or represented explicitly in the algorithm. As humans, we can see them, but the computer doesn't really know that it has drawn surfaces.
accum_colour  = 0;
accum_opacity = 0;
/* find first sample point inside the volume */
while ((accum_opacity < terminal_opacity)
       && (sample_point_inside_volume)) {
    /* eval colour and opacity at sample point */
    accum_colour  += (1 - accum_opacity) * opacity * colour;
    accum_opacity += (1 - accum_opacity) * opacity;
    /* step to next sample point */
}
Figure 28: Pseudo-code for colour and opacity compositing along a ray.

Figure 29: Rendering volume data without any kind of surface reconstruction leads to 'flat' images resembling X-ray photographs; without surfaces there are no specular highlights, for example, which would give visual cues as to the curvature of the object in 3D. Image by Sjschen [W].
Figure 30: Estimating a value gradient in 2D from the neighbouring values at (i-1, j), (i+1, j), (i, j-1) and (i, j+1).
An alternative approach to creating images from volume data—called Indirect Volume Rendering—explicitly identifies surfaces in the volume set, converts these into polygonal meshes, and then uses 'traditional' polygonal rendering and local illumination techniques to draw these on screen.

You're no doubt familiar with the use of contour lines [W] on maps to identify changes in height; for example in Figure 31, every time you cross one of the wiggly lines, you will have gone up 20 feet in height. Similar techniques can be used on maps to represent many other values (temperature, strength of geomagnetic fields and so forth), and the idea can be extended further to represent any of the kinds of properties we'd be storing in a volume set. The important concept is that at any point along the line, the 'value' (whatever it represents) is the same. So, these are called 'isolines' (from the Greek 'isos' meaning 'equal'). Figure 32 shows an isoline for the value 55, drawn through a 2D grid of some arbitrary data. Notice that in Figure 32 the line has been drawn as a series of straight segments joining the specific points where 55 would appear. Remember that the data are usually going to be sampled from a continuous medium in the real world, so it's likely that drawing a smooth curve that approximately follows the value through the grid would give better results.

The process of identifying an isoline on a 2D grid is fairly easy to understand (Figure 32 probably tells you all there is to know about it!). Applying the same idea to 3D data is conceptually straightforward; instead of finding an isoline, we now need to find an isosurface, i.e. the surface that represents a particular value in our 3D data. It turns out this is actually quite hard to achieve in practice.

Look again at Figure 32. The 'algorithm' that was followed to generate the dotted isoline here is quite trivial: step through the vertical grid lines one by one, from left to right, for each find the point at which the value '55' appears, and join all those points with a line.
Figure 31: Topographic map of Stowe, Vermont. The brown contour lines represent the elevation. The contour interval is 20 feet.
On each of the vertical gridlines, the isoline represents the point where the values change from being 'greater than 55' to 'smaller than 55'. Let's try to apply the same thinking in three dimensions: we're looking for a surface that lies in 3D space wherever the value 55 appears in our data; 'outside' that surface the values will be greater than 55, and 'inside' the surface they will be smaller than 55.³ Figure 33 shows one possible configuration. The points on the right-most face of the cube we'll pretend are greater than our isovalue; so let's mark them as being 'outside' our surface. The points on the left-most face are smaller than our isovalue; so let's mark those as being on the 'inside' of our surface. That means that for this particular set of 8 voxels, the change from 'insideness' to 'outsideness' happens on the four horizontal edges of the cube formed by our data. Exactly where on these edges the isovalue lies will vary, depending on the data points; but we know that for each edge, the change takes place somewhere. We can therefore mark each of these points, and try to create a little patch of our surface. In this case the patch we'd need to create is easy; it's just a 4-sided polygon that joins the four points.
3the surface doesn’t have to be an enclosed surface of course, so ‘inside’ and ‘outside’
aren’t quite the right words, but they are less clumsy than ‘to one side of’ and to ‘to the
- ther side of’. So for now, if it helps, think of the surface we’re talking about as forming a
closed shape.
Figure 32: An isoline (red dotted line) drawn from left to right, linearly interpolating the point where the value 55 would be on each of the vertical lines (e.g. between 53 and 56 on the first vertical line, and between 54 and 62 on the second).
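The linear interpolation described in the caption is a one-liner; here is a sketch (the function name is mine, and it assumes v0 and v1 genuinely straddle the isovalue):

    /* Given values v0 and v1 at two adjacent grid points a distance
       'spacing' apart, return the offset from the first point at which
       the isovalue is crossed. */
    float iso_crossing(float v0, float v1, float isovalue, float spacing)
    {
        float t = (isovalue - v0) / (v1 - v0);  /* fraction along the edge, 0..1 */
        return t * spacing;
    }

For the first vertical line in Figure 32, iso_crossing(53, 56, 55, 1.0) places the crossing two-thirds of the way between the two grid points.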
Figure 33 is one of the simplest cases, however, where all the 'inside' values are on one side, and all the 'outside' values on the other. For this case it's obvious where the bit of surface should lie. But things needn't be as clean as this, and each of the 8 data points that form the cube's vertices can in reality be 'inside' or 'outside'. So: 8 points, each of which can be in one of two different states, gives us 2⁸, or 256, different combinations of in- and outsideness. In two of these 256 cases—where all the points are either inside or outside—the cube doesn't play any part in the surface, since the isosurface doesn't pass through it. If we take into account symmetry, of the remaining 254 combinations there are 14 distinct variants. These are shown in Figure 34. In each case between 1 and 4 triangles get added to the overall isosurface.

So to a first approximation we can imagine going through all the combinations of 8 data points that form the cubes we've been looking at, and accumulating a set of triangles to form the isosurface. We'd need to shade the triangles to give decent results, and this would mean finding surface normals for them. But which way does each triangle face? Although we've used the terms 'inside' and 'outside' for convenience, remember that this really just means 'to one side' or 'to the other' of our surface in some sense, and in any case whether we want the normals to face 'in' or 'out' would depend on whether we are looking at an artefact from the outside, or from within. And with Volume Rendering, both are possible and sensible! In any case, even though we could potentially calculate normals from the triangles we've generated, it almost certainly makes sense instead to create 'faux normals' as described in Section 10.2, and then to linearly interpolate these to get normals at the triangle vertices.
Figure 33: The values from 8 voxels creating a cube, the left-hand side of which is 'inside' a surface, and the right-hand side of which is 'outside'. A single quadrilateral drawn on the horizontal edges at the appropriate iso-value creates part of the surface.
these as ‘faux normals’ as described in Section 10.2 and then to linearly interpolate these to get normals at the triangle vertices. But the problem is worse than this. Look at Figure 35, in which two cube vertices (marked with dots) are inside, and the others are outside our
- isosurface. Which of the two sets of triangles should we choose? Both
are perfectly valid configurations that mark the two points as being on
- ne side, and the remainder as appearing on the other. These ambigu-
- us cases occur when adjacent vertices have different states, and diagonal
vertices have the same state; and there are 5 other cases like this where we don’t know whether to connect adjacent edges, or to connect accross to the other side of the cube. We can only resolve this ambiguity by looking at neighbouring cubes, and deciding which of the two options best forms a plausible isosurface. But what if one of the neighbours is also an am- biguous case? (see Figure 36 for an example of a mis-classification) The
- riginal algorithm for solving this problem was called ‘Marching Cubes’
(Lorensen:1987:MCH:37402.37422) and though the details of the algorithm contained several errors and inconsistencies that have been corrected by later authors, the name stuck—informally at least—for the whole class of algorithms. 45
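As a sketch of the classification step at the heart of a Marching Cubes style algorithm: the 'insideness' of the 8 corners is packed into an 8-bit index, which would then select one of the 256 entries in a precomputed triangle table (the table itself, and the edge interpolation, are not shown here):

    int cube_index(const float corner[8], float isovalue)
    {
        int index = 0;
        for (int i = 0; i < 8; i++)
            if (corner[i] < isovalue)  /* this corner is 'inside' the surface */
                index |= (1 << i);
        /* index 0 or 255: all corners on one side, so the isosurface
           does not pass through this cube and no triangles are emitted */
        return index;
    }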
Figure 34: The 14 distinct variations of 'inside' vertices (marked with a blue dot) and 'outside' vertices (without a dot) that are used in the Marching Cubes algorithm.
Figure 35: Two valid sets of polygons that would separate out the 'inside' vertices (marked with dots) from the 'outside'.

Figure 36: A discontinuous mesh created by the ambiguity shown in Figure 35.
10.4 Proxy Geometry
Figure 37: An example of axis-aligned proxy geometry (left), and volume data projected onto the proxy planes. Courtesy of Christof Rezk Salama, Klaus Engel and Markus Hadwiger.
An increasingly popular alternative to finding and rendering isosurfaces is to use 'proxy geometry' to recreate a volumetric image. This is a kind of halfway house between rendering the image onto a viewplane (as with Direct Volume Rendering) and creating a fully-blown polygonal mesh of a surface. The idea is to put into the scene a series of essentially transparent polygonal planes, and then to project layers of the volume data onto these, so that when you look 'through' the planes, the various different layers line up to give the impression of a solid object (a bit like looking through several equally spaced panes of glass, with different bits of a picture painted on them to give the impression of depth). The basic idea is shown in Figure 37.

The first requirement here is to generate the proxy planes onto which the different parts of the volume will be drawn. This could be done easily by drawing quadrilaterals that are lined up with the world axes (as happens to be the case in Figure 37). This is fine if you know you are always going to be looking into the world perpendicular to the planes (i.e. through the 'front' of the world); but it's dreadful if you happen to position your viewpoint so that it looks 'side on' at all the proxy planes. Under these circumstances you'd not really see the composite volume image at all, since you'd be looking along the edges of the planes rather than directly into them.

The solution is to recreate the volume planes so that they are perpendicular to the view vector—or, put another way, parallel to the image plane, which means recalculating them every time the eyepoint moves. Figure 38 shows this effect; in each case the triangles forming the proxy geometry layers are
Figure 38: Three examples of viewplane-aligned proxy geometries. Courtesy of Christof Rezk Salama, Klaus Engel and Markus Hadwiger.
arranged so as to be parallel to the image plane, regardless of the orientation of the cube. You can see that in the middle cube this creates nearly equilateral-looking triangles, whereas in the left and right cubes the triangles are more distorted to keep the plane of the triangles aligned correctly.

Having created the proxy geometry (which is a fairly painless calculation really), it then remains to decide how many layers are necessary to give a good effect. This relies rather on the nature of the underlying data; the more rapidly or unpredictably the values change, the more layers will be needed to avoid the chance that important visual artefacts get lost in the gaps between proxy layers. Once suitable layers have been drawn, a variety of approaches can be used to 'paint' onto them.

10.4.1 Splatting

(This section and the two that follow are taken from the Wikipedia entry on Volume Rendering [W], and are modified only slightly.)
One of the early techniques in this area is called splatting [W]. This is a technique which trades quality for speed. Here, every volume element is splatted, as Lee Westover said, like a snowball, onto the viewing surface in back-to-front order (Westover:1992:SPF:150893). These splats are rendered as disks whose properties (color and transparency) vary diametrically in a normal (Gaussian) manner. Flat disks, and those with other kinds of property distribution, are also used depending on the application.

10.4.2 Shear warp

The shear warp approach to Volume Rendering was developed by Cameron and Undrill, and popularized by Philippe Lacroute and Marc Levoy (Lacroute:1994:FVR:192161.192283). In this technique, the viewing transformation is transformed such that the nearest face of the volume becomes axis-aligned with an off-screen image
buffer with a fixed scale of voxels to pixels. The volume is then rendered into this buffer using the far more favorable memory alignment and fixed scaling and blending factors. Once all slices of the volume have been rendered, the buffer is then warped into the desired orientation and scaled in the displayed image. This technique is relatively fast in software, at the cost of less accurate sampling and potentially worse image quality compared to ray casting. There is memory overhead for storing multiple copies of the volume, for the ability to have near axis-aligned volumes. This overhead can be mitigated using run-length encoding.

10.4.3 Texture mapping

Many 3D graphics systems use texture mapping to apply images, or textures, to geometric objects. Commodity PC graphics cards are fast at texturing and can efficiently render slices of a 3D volume, with real time interaction capabilities (see Figure 39). Workstation GPUs are even faster, and are the basis for much of the production volume visualization used in medical imaging, oil and gas, and other markets. In earlier years, dedicated 3D texture mapping systems were used on graphics systems such as the Silicon Graphics InfiniteReality, the HP Visualize FX graphics accelerator, and others. This technique was first described by Bill Hibbard and Dave Santek (Hibbard:1989:IK:329129.329356). These slices can either be aligned with the volume and rendered at an angle to the viewer, or aligned with the viewing plane and sampled from unaligned slices through the volume. Graphics hardware support for 3D textures is needed for the second technique. Volume-aligned texturing produces images of reasonable quality, though there is often a noticeable transition when the volume is rotated.

10.4.4 GPU-accelerated Volume Rendering

A recently exploited technique to accelerate traditional Volume Rendering algorithms such as ray-casting is the use of modern graphics cards (as an example see (Engel:2001:HPV:383507.383515), though there are many others). Starting with the programmable pixel shaders, people recognized the power of parallel operations on multiple pixels and began to perform general-purpose computing on (the) graphics processing units (GPGPU). The pixel shaders are able to read and write randomly from video memory and perform some basic mathematical and logical calculations. These SIMD processors were used to perform general calculations such as rendering polygons and signal processing. In recent GPU generations, the pixel shaders are now able to function as MIMD processors (now able to independently branch), utilizing up to 1 GB of texture memory with floating-point formats.
Figure 39: 2D and 3D texture mapping. Courtesy of Christof Rezk Salama, Klaus Engel and Markus Hadwiger.
With such power, virtually any algorithm with steps that can be performed in parallel, such as volume ray casting or tomographic reconstruction, can be performed with tremendous acceleration. The programmable pixel shaders can be used to simulate variations in the characteristics of lighting, shadow, reflection, emissive color and so forth. Such simulations can be written using high-level shading languages, an example of which is shown in Appendix C.
10.5 A comparison of direct and indirect techniques
As should be obvious, the Direct Volume Rendering approach, rather like basic Ray Tracing, creates a rendered image on the viewplane, which must be recalculated if the position of the viewpoint changes, or if you decide to change the opacity/colour mappings; it's therefore not a hugely attractive approach for interactive applications (though, again like Ray Tracing, the value for each pixel is independent of every other, so it's once more an embarrassingly parallel problem, amenable to concurrent computation solutions). However, one of the significant advantages of DVR is that it does not make any 'hard and fast' decisions about the underlying data—there are no 'hard' cutoffs involved, so there is less chance of visual artefacts being created or missed.

Indirect techniques, on the other hand, end up creating polygonal representations of the data (whether as an isosurface mesh, or by projection onto proxy geometry). Although these create representations that are viewpoint-independent (so they can be rendered in realtime using normal OpenGL-like techniques), there is a much greater chance of false positive or false negative artefacts (e.g. if you choose the wrong value to create an isosurface from, or have an insufficient number of proxy planes, then important features of the data can be missed in the rendering).
10.6 Questions
1. Which parts of the volume rendering process can be parallelised?
2. Why is Direct Volume Rendering not typically appropriate for interactive applications?
3. Where does the colour of materials come from in Volume Rendering?
4. Why might Direct Volume Rendering be preferable for a medical application, such as identifying a malignant tumour in otherwise healthy tissue?
5. What is the spatial complexity of a basic voxel grid structure?
6. Explain why there are 14 cube configurations in the Marching Cubes algorithm.
7. What is the purpose of sub-voxel sampling?
8. Why is it sometimes useful to invent normals in DVR? Why is it not necessary to create normals in the same way for IVR?
10.7 Volume Rendering Ponderings
1. Although the images created from Direct Volume Rendering are viewpoint-dependent, not all parts of the process take the viewpoint into account. If you were moving the viewpoint around, which calculations would be valid regardless of the viewpoint's position?
2. How does the Nyquist rate [W] relate to Volume Rendering?
3. What unwanted visual artefacts might be caused by trilinear interpolation? How could this be improved?
11 Spatial Enumeration
The final sections of this course unit deal with techniques for improving rendering performance so that things can be drawn in real-time. These techniques are a fundamental part of the way in which modern computer games and other interactive 'virtual environments' work.

We'll look at a series of techniques that all fall into the category of 'spatial enumeration' (also called 'spatial indexing'). They all address the same problem of forming an efficient link between the data structures used by a particular application (say, a world full of monsters, if it's a computer game) and the data structures used to draw the 3D scenes (polygons, meshes, objects, textures and so forth). The basic issue is that the kind of data structures that are sensible for representing the application data are almost certainly not the same ones that make for efficient rendering. You can think of spatial enumeration as playing a similar role to 'indexing' in a database: in a database you want to represent your data in a way that is semantically correct, to avoid redundancy and inconsistency – but this isn't necessarily the optimal way of storing it for fast querying. So you build indexes that map between the shape of data expected back from common queries, and the more 'truth and beauty' version of the data held in the normalised schema. The same kind of effect happens in computer graphics: there is a set of common 'queries', such as 'which object intersects first with this ray', or 'give me all the objects that are visible from this eyepoint, sorted in order of distance from the eyepoint', and we need these to be fast in order to create interactive 3D scenes. The same spatial enumeration structures that we'll be looking at for these interactive purposes are also useful in speeding up the performance of 'offline' rendering algorithms such as Ray Tracing, Radiosity and Volume Rendering. In fact, the first of these techniques is one that we've already explored; we just didn't give it a name at the time.
11.1 The Gridcell
The idea of a gridcell structure is simple: you subdivide your 3D space into a large number of small cubes. Then you place your 3D objects into this space. Every cell that is intersected by a part of an object gets a pointer to the underlying data structure that represents the object, so that if you choose a point in 3D space, you can easily determine which cell contains it, and then follow any of the references from that cell to find out which objects you may have intersected with.
A simplified 2D version of this is shown in Figure 40. Creating an empty gridcell structure is trivial; it's just a 3D array where each cell contains a list of pointers to objects. Populating it is a bit more tricky, since as you position an object in 3D space, you have to work out which of the gridcell's cells the object touches, and make sure that each of them gets a pointer back to that object.
Figure 40: A 2D version of the 'gridcell' structure, showing pointers from the objects in 'world space' back down into the underlying objects in the application itself. Of course, in reality this structure is a regular 3D array rather than a 2D one, so the cells are cubes – but the principle is identical.
to work out which of the gridcell’s cells the object touches, and make sure that each of them gets a pointer back to that object. The decision as to whether you only do this for cells that intersect with the surface of the
- bject, or for all cells including those ‘inside’ the object will depend on
the application. So the cost of populating the gridcell is fairly high, as is the spatial complexity (which is O(n3), where n is the number of cells in each dimension). But what is the cost of ‘querying’ the gridcell? Let’s say we want to work out which objects are intersected if we fire a ray into a particular cell (x, y, z); answering this is extremely cheap! We can index into our 3D array to immediately find the appropriate cell, and then follow any pointers in that cell to tell us which objects we’ve ‘hit’. So ‘querying’ the gridcell is extremely cheap – O(1) in fact. With the gridcell we have excellent query time; but very poor use of memory (if we make an accurate, fine grained gridcell, it’ll cost lots; if we make a course grained one with few cells, then we’ll get a lot of false positives when we query). You’ll probably have spotted by now that the ‘voxel structure’ we talked about in the Volume Rendering section is basically a gridcell, where the underlying
- bjects are values from our original sample data.
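A minimal sketch of the structure in C (the Object type, the grid resolution N and the cell size are all illustrative, and bounds checking is omitted):

    #define N 64                        /* cells per dimension */

    typedef struct ObjectList {
        struct Object     *object;
        struct ObjectList *next;
    } ObjectList;

    ObjectList *grid[N][N][N];          /* O(n³) space, initially all NULL */

    /* O(1) query: which objects might something at (x, y, z) touch?
       cell_size maps world coordinates to cell indices. */
    ObjectList *objects_at(float x, float y, float z, float cell_size)
    {
        int i = (int)(x / cell_size);
        int j = (int)(y / cell_size);
        int k = (int)(z / cell_size);
        return grid[i][j][k];           /* caller walks the pointer list */
    }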
11.2 The Octree
The Octree is a relatively simple evolution of the gridcell idea, which exploits what's known as 'spatial coherence': the property that in most scenes, 'stuff tends to cluster together'. Look at the world around you; it typically consists of big empty spaces, and then bunches of things near one another. This works at multiple levels: a galaxy is mostly sparse, with things clustered into solar systems and lots of space between; a city has lots of empty space above street level, with things clustered into buildings; chairs cluster around tables; things on the desk around you end up in piles. And so on. The gridcell approach doesn't take this into account, so you typically end up with large proportions of the cells being 'empty'. This isn't so much the case with Volume Rendering, because you're probably focussing your visualisation on a blob of data that already contains 'interesting stuff'; but for a scene in a computer game (which usually looks a bit like the real world), you can see that there are likely to be fairly big empty spaces. The idea of the Octree is to start off with one large cuboid space that encompasses the whole scene, and then to break that down into 8 smaller spaces, which will contain smaller subsets of the scene... and to keep on doing that until each cell (called an 'octant') contains fewer than a predetermined number of objects. This is an 'adaptive' technique, meaning that if one of the top-level cells happens to fall in a fairly empty part of the scene, you don't subdivide it any further, and instead focus the use of CPU and memory on bits of the world that need it. The 2D analogue of this approach is called a 'quadtree', and is shown in Figure 41.
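A sketch of the adaptive subdivision in C; the Box and ObjectSet types and the helpers declared below are assumed, and the termination threshold is illustrative:

    #define MAX_OBJECTS 8

    typedef struct OctreeNode {
        struct Box        *bounds;      /* the cuboid this octant covers   */
        struct ObjectSet  *objects;     /* objects intersecting the octant */
        struct OctreeNode *child[8];    /* all NULL for a leaf             */
    } OctreeNode;

    /* assumed helpers: count the objects in a set, make a node for the
       i-th octant of a box, and select the objects that touch a box */
    int               count(struct ObjectSet *s);
    OctreeNode       *make_node(struct Box *octant);
    struct Box       *child_box(struct Box *parent, int i);
    struct ObjectSet *objects_in_box(struct ObjectSet *s, struct Box *b);

    void subdivide(OctreeNode *node)
    {
        if (count(node->objects) <= MAX_OBJECTS)
            return;                     /* sparse enough: leave as a leaf */

        for (int i = 0; i < 8; i++) {
            node->child[i] = make_node(child_box(node->bounds, i));
            node->child[i]->objects =
                objects_in_box(node->objects, node->child[i]->bounds);
            subdivide(node->child[i]);  /* recurse into each octant */
        }
    }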
11.3 Hierarchical Bounding Volumes
The problem that we’re faced with over and over in graphics is that of deal- ing with lots of polygons. Although, for example, interesecting a ray with a polygon is relatively painless (recall in Section 7.2 we went through this in a fair amount of detail), most scenes consist of a very large number of poly- gons, so repeating all the calcualtions becomes quite painful. On the other hand, the ray/sphere intersection calculation is very cheap—but not very many things, apart from spheres, are spherical. The idea behind bounding volumes is to use ‘cheap’ shapes (like spheres) to approximate more com- putationally expensive ones (such as polygonal meshes), in an attempt to eliminate some parts of the scene from our enquiries early on. Imagine you have a polygonal object—say, for the sake of argument—a cow, and we put a notional sphere around the cow (we don’t draw the sphere, it’s just there in the algorithm). Now if we try to do a ray/sphere intersection and miss the sphere, we can immediately be sure that we don’t need to bother wast- ing time testing the individual cow polygons to see if any of those have been hit: we know they haven’t because the ray didn’t hit the surrounding 56
Figure 41: A ‘quadtree’ is the two dimensional version of a 3D octree. In this figure, the overall space is broken into four square quadrants (‘octants’ in 3D, which are cubes), which are then sub-divided further until only one polygon exists in each.
If we do hit the sphere, though, we then need to test the individual cow polygons for intersections. If we hit a cow polygon, then we've 'wasted' one extra calculation (the intersection with the sphere), but that's no big deal since we've got a 'hit' result. If we don't hit a cow polygon, then we've wasted much more time, which is a bad thing. On average, though, if we're casting loads of rays into the scene, and if we assume that our scene is reasonably sparse, so that more rays miss the cow-in-a-sphere than hit it, then we gain an enormous win, since most of our rays can tell that they can't possibly have hit the cow just by testing against the very simple sphere bounding volume.

If we want to optimise things a bit more, though, we should look at the relationship between the shape of our arbitrary object and a sphere. We can see there's a lot of 'wasted space' between the two shapes, since they are not a good fit for one another. The more wasted space, the more 'false positives' we'll get which lead us to test against the cow's polygons, so ideally we'd like a shape that minimises this. We could try a bounding cube; the cost of intersecting with a cube is the same as that of intersecting with 6 polygons (i.e. the quads forming the faces of the cube), which compared to the cow is going to be cheap, but compared with the sphere a bit more expensive. The problem here is that a cube is generally not a much better fit to most objects than a sphere, so a balance between 'cost of intersection' and 'false positives caused by wasted space' needs to be drawn again. In the extreme, of course, the best-fitting bounding volume for a cow would be, well, a bounding cow. But the cost of testing against this fairly specialist bounding volume is the same as the underlying cow, which would just be silly.
Figure 42: Three different bounding volumes: a cuboid, a cube and a sphere.
It turns out that a reasonable compromise in many cases is to use a bounding cuboid, i.e. rather than a regular cube, we just allow the pairs of opposite faces to be different sizes. Here the cost is almost identical to that of a regular cube, but because we can vary the size in three dimensions, we can often get a better fit to the objects we're enclosing. Figure 42 shows the cow enclosed in the three bounding shapes we've discussed.

The obvious next step is to allow bounding volumes to contain other bounding volumes, again exploiting the spatial coherence of most scenes.
The process is conceptually simple: enclose complex polygonal objects in bounding shapes (let's say cuboids), then enclose any bounding shapes that end up being near one another in a bigger bounding shape, and repeat the process until you have a suitable hierarchy of objects. You can think of this as being a bit like the octree process in reverse, but this time you end up with the nodes in the tree being spatially scattered around the scene wherever clusters of objects appear, rather than being arranged regularly. Hierarchical Bounding Volumes (HBV) is an example of an 'irregular' spatial enumeration technique; by contrast, the gridcell and octree techniques are considered 'regular' because they split space up into equally sized regions. A partitioning of an environment using bounding 'spheres' (circles in this 2D illustration) is shown in Figure 43.
Figure 43: A hierarchy of bounding circles (these would be spheres in 3D) surrounding clusters of objects in a scene. Note that this is an irregular partitioning of space; although all the bounding objects are circles/spheres, they are positioned based on the location of objects, and not according to a predetermined regular pattern.
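A sketch of how such a hierarchy is queried during ray intersection (here with binary branching for brevity; real hierarchies may have more children per node). The Ray, Hit, Sphere and Mesh types are assumed, as are the cheap ray/sphere test from Section 7.1 and the expensive per-polygon mesh test:

    typedef struct BVNode {
        struct Sphere *sphere;          /* bounds everything below this node */
        int            is_leaf;
        struct Mesh   *mesh;            /* valid only for leaves             */
        struct BVNode *child[2];        /* valid only for inner nodes        */
    } BVNode;

    int ray_hits_sphere(const struct Ray *r, const struct Sphere *s);
    int intersect_mesh(const struct Ray *r, const struct Mesh *m, struct Hit *h);

    int intersect_hbv(const struct Ray *ray, const BVNode *node, struct Hit *hit)
    {
        /* cheap test first: a miss here prunes the entire subtree */
        if (!ray_hits_sphere(ray, node->sphere))
            return 0;

        /* only pay for the expensive per-polygon tests at the leaves */
        if (node->is_leaf)
            return intersect_mesh(ray, node->mesh, hit);

        int any = 0;
        for (int i = 0; i < 2; i++)
            if (node->child[i])
                any |= intersect_hbv(ray, node->child[i], hit);
        return any;
    }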
11.4 Binary Space Partitioning
If we think a little more carefully about the queries we are really asking of our spatial enumeration techniques, we realise that they often boil down to one specific kind of question: 'what is the closest object that intersects with this ray/vector/line?'. This is true whether you are ray tracing, or firing a laser from your virtual gun at a virtual bad guy; or even if you are just selecting an object with a mouse. The regular spatial enumeration techniques (gridcell and octree) can answer this question fairly easily, because they are tightly coupled to the co-ordinates of the 3D environment and so preserve the concepts of in-front/behind, above/below and right-of/left-of fairly well;
but the more anarchic hierarchical bounding volume mechanism, while being more efficient in its use of space, loses this relationship; it simply represents 'collections of things near each other'.

Binary Space Partitioning is an approach that attempts to retain the 'relative positioning' property, whilst also being efficient in terms of space and compute complexity. There are two main variants, one of which is easy to understand but not hugely efficient, and another which is a bit more tricky to make sense of on first contact, but which has better performance in most cases. We'll look at the former first, as a way of leading up to the latter.

Axis-Aligned Binary Space Partitioning attempts to split the world recursively into two sections, where the boundary is drawn such that it aligns with one of the world axes. The process is shown in Figure 44. It starts by dividing the space vertically into two sections, shown by line 0. The next step then divides both these spaces horizontally in two (shown by lines 1a and 1b). These in turn are divided vertically, then horizontally, and so on, until some termination condition is reached (say, that no more than 2 objects are left in any one region). The advantage of this approach is that it preserves the relative position of the objects in the different regions, i.e. to the left and right of vertical divisions, and in front of/behind the horizontal ones. This means that by traversing the tree that's created, and knowing what 'decision' was made at each sub-division, you can determine the relative position of any object with respect to any viewpoint (which means you can work out an ordering of things, say to select the 'nearest' intersection straight off, rather than having to find all possible intersecting objects and then sort them into order along the ray).
Figure 44: An example of an axis-aligned BSP containing 5 objects.
A relatively minor tweak that makes BSP more efficient is to allow the partitioning to take place along arbitrary axes, rather than being constrained to the world axes; in Polygon-Aligned Binary Space Partitioning, the planes are frequently (but not always) chosen to coincide with the planes defined by polygons in the scene.
11.5 Generating BSP Trees
(The following text is taken from the Wikipedia entry on Binary Space Partitioning [W], and is modified only slightly. The diagrams are by Chrisjohnson [W].)
The canonical use of a BSP tree is for rendering polygons (that are double-sided, that is, without back-face culling) with the Painter's algorithm [W]. Such a tree is constructed from an unsorted list of all the polygons in a scene. The recursive algorithm for construction of a BSP tree from that list of polygons is as follows (a code sketch appears after the list):
1. Choose a polygon P from the list.
2. Make a node N in the BSP tree, and add P to the list of polygons at that node.
3. For each other polygon in the list:
(a) If that polygon is wholly in front of the plane containing P, move that polygon to the list of nodes in front of P.
(b) If that polygon is wholly behind the plane containing P, move that polygon to the list of nodes behind P.
(c) If that polygon is intersected by the plane containing P, split it into two polygons and move them to the respective lists of polygons behind and in front of P.
(d) If that polygon lies in the plane containing P, add it to the list of polygons at node N.
4. Apply this algorithm to the list of polygons in front of P.
5. Apply this algorithm to the list of polygons behind P.
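A sketch of this construction in C. The polygon list representation, the classify() helper (which reports where a polygon lies relative to the plane of P) and split_polygon() are all assumed:

    enum Side { FRONT, BEHIND, COPLANAR, SPANNING };

    typedef struct Polygon  Polygon;
    typedef struct PolyList { Polygon *p; struct PolyList *next; } PolyList;

    typedef struct BSPNode {
        PolyList       *coplanar;       /* the polygons at this node */
        struct BSPNode *front, *behind;
    } BSPNode;

    enum Side classify(const Polygon *q, const Polygon *p);
    void      split_polygon(Polygon *q, const Polygon *p, Polygon **qf, Polygon **qb);
    Polygon  *pop(PolyList **list);
    void      push(PolyList **list, Polygon *p);
    BSPNode  *make_bsp_node(void);

    BSPNode *build_bsp(PolyList *polygons)
    {
        if (polygons == NULL) return NULL;

        BSPNode *node = make_bsp_node();
        Polygon *P = pop(&polygons);                 /* step 1 */
        push(&node->coplanar, P);                    /* step 2 */

        PolyList *front = NULL, *behind = NULL;
        while (polygons != NULL) {                   /* step 3 */
            Polygon *q = pop(&polygons);
            switch (classify(q, P)) {
            case FRONT:    push(&front, q);          break;  /* 3(a) */
            case BEHIND:   push(&behind, q);         break;  /* 3(b) */
            case SPANNING: {                                 /* 3(c) */
                Polygon *qf, *qb;
                split_polygon(q, P, &qf, &qb);
                push(&front, qf);
                push(&behind, qb);
                break;
            }
            case COPLANAR: push(&node->coplanar, q); break;  /* 3(d) */
            }
        }
        node->front  = build_bsp(front);             /* step 4 */
        node->behind = build_bsp(behind);            /* step 5 */
        return node;
    }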
The following sequence illustrates the use of this algorithm in converting a list of lines or polygons into a BSP tree. At each of the eight steps, the algorithm above is applied to a list of lines, and one new node is added to the tree.
0. Start with a list of lines (or in 3D, polygons) making up the scene. In the tree diagrams, lists are denoted by rounded rectangles and nodes in the BSP tree by circles. In the spatial diagram of the lines, the direction chosen to be the 'front' of a line is denoted by an arrow.
1. Following the steps of the algorithm above: i) choose a line, A, from the list; ii) add it to a node; iii) split the remaining lines in the list into those which lie in front of A (i.e. B2, C2, D2), and those which lie behind (B1, C1, D1); iv) process first the lines lying in front of A (in steps 2–5), followed by those behind it (in steps 6–8).
2. Apply the algorithm to the list of lines in front of A (containing B2, C2, D2). We choose a line, B2, add it to a node and split the rest of the list into those lines that are in front of B2 (D2), and those that are behind it (C2, D3).
3. Choose a line, D2, from the list of lines in front of B2. It is the only line in the list, so after adding it to a node, nothing further needs to be done.
4. We are done with the lines in front of B2, so consider the lines behind B2 (C2 and D3). Choose one of these (C2), add it to a node, and put the other line in the list (D3) into the list of lines in front of C2.
5. Now look at the list of lines in front of C2. There is only one line (D3), so add this to a node and continue.
6. We have now added all of the lines in front of A to the BSP tree, so we now start on the list of lines behind A. Choosing a line (B1) from this list, we add B1 to a node and split the remainder of the list into lines in front of B1 (i.e. D1), and lines behind B1 (i.e. C1).
7. Processing first the list of lines in front of B1: D1 is the only line in this list, so add this to a node and continue.
8. Looking next at the list of lines behind B1, the only line in this list is C1, so add this to a node, and the BSP tree is complete.
The final number of polygons or lines in a tree will often be larger (sometimes much larger) than that in the original list, since lines or polygons that cross the partitioning plane must be split into two. It is desirable that this increase is minimised, but also that the final tree remains reasonably balanced. The choice of which polygon or line is used as a partitioning plane (in step 1 of the algorithm) is therefore important in creating an efficient BSP tree.
11.6 Traversal
A BSP tree is traversed in linear time, in an order determined by the particular function of the tree. Again using the example of rendering double-sided polygons using the painter's algorithm: for a polygon P to be drawn correctly, all the polygons which are behind the plane in which P lies must be drawn first, then polygon P must be drawn, then finally the polygons in front of P must be drawn. If this drawing order is satisfied for all polygons in a scene, then the entire scene is rendered in the correct order. This procedure can be implemented by recursively traversing a BSP tree using the following algorithm (a code sketch follows the list). From a given viewing location V, to render a BSP tree:
1. If the current node is a leaf node, render the polygons at the current node.
2. Otherwise, if the viewing location V is in front of the current node:
(a) Render the child BSP tree containing polygons behind the current node
(b) Render the polygons at the current node
(c) Render the child BSP tree containing polygons in front of the current node
3. Otherwise, if the viewing location V is behind the current node:
(a) Render the child BSP tree containing polygons in front of the current node
(b) Render the polygons at the current node
(c) Render the child BSP tree containing polygons behind the current node
4. Otherwise, the viewing location V must be exactly on the plane associated with the current node. Then:
(a) Render the child BSP tree containing polygons in front of the current node
(b) Render the child BSP tree containing polygons behind the current node
Figure 45: An example BSP tree constructed for a simple scene containing four polygons.
Applying this algorithm recursively to the BSP tree generated above results in the following steps:
- The algorithm is first applied to the root node of the tree, node A. V is in front of node A, so we apply the algorithm first to the child BSP tree containing polygons behind A.
  – This tree has root node B1. V is behind B1, so first we apply the algorithm to the child BSP tree containing polygons in front of B1:
    * This tree is just the leaf node D1, so the polygon D1 is rendered.
  – We then render the polygon B1.
  – We then apply the algorithm to the child BSP tree containing polygons behind B1:
    * This tree is just the leaf node C1, so the polygon C1 is rendered.
- We then draw the polygons of A.
- We then apply the algorithm to the child BSP tree containing polygons in front of A.
  – This tree has root node B2. V is behind B2, so first we apply the algorithm to the child BSP tree containing polygons in front of B2:
    * This tree is just the leaf node D2, so the polygon D2 is rendered.
  – We then render the polygon B2.
  – We then apply the algorithm to the child BSP tree containing polygons behind B2:
    * This tree has root node C2. V is in front of C2, so first we would apply the algorithm to the child BSP tree containing polygons behind C2. There is no such tree, however, so we continue.
    * We render the polygon C2.
    * We apply the algorithm to the child BSP tree containing polygons in front of C2:
      · This tree is just the leaf node D3, so the polygon D3 is rendered.

The tree is traversed in linear time and renders the polygons in a far-to-near ordering (D1, B1, C1, A, D2, B2, C2, D3) suitable for the painter's algorithm.
11.7 Questions
1. Why is the BSP Tree approach inappropriate for Volume Rendering?
2. What are the space and time complexities of the various spatial enumeration techniques?
3. How do the various techniques perform if the contents of the scene are moving?
11.8 Spatial Enumeration Ponderings
1. How might you optimally combine the techniques for scenes with static and moving content?
2. What would be the implications of applying the octree approach to Volume Rendering?
3. Which of the techniques might be helpful in improving the performance of Ray Tracing? Or for Radiosity?
12 Culling
For most graphical scenes it is likely that there's much more detail or content in the underlying model than is visible in any one rendered view; this is particularly true for interactive applications with dynamic content, where objects and the viewpoint vary. There's an obvious performance advantage to be had if you can work out which bits of the scene aren't going to contribute to the rendered view (i.e. if you can't see them, don't waste time drawing them!). This might seem like an obvious thing to aim for, but it's not trivial to achieve, since you have to make sure that the cost of excluding something from being rendered doesn't exceed the cost of just rendering it. We'll look at a number of 'culling' techniques designed to achieve this.
12.1 Detail culling
The first, and simplest, of these techniques is called 'detail culling'. The idea here is to not bother drawing things that are simply too small to have a perceivable effect on the final scene. This could be a small object reasonably close up, or a huge one in the far distance. All we need to determine here is the size of the object when projected onto the view plane – this could just be a simple bit of trigonometry based on the bounding cuboid of the object (and culling out large objects in the distance is particularly satisfying, since spending a lot of effort drawing thousands of textured polygons, only to find out that the result is a single pixel in the final image, is obviously wasteful). This approach is particularly effective for moving scenes, since the viewer is unlikely to notice tiny bits of lost detail.
12.2 Backface culling
It’s fairly obvious that for most objects you can’t see all of them all at the same time; the chances are that some bits of the object are facing you and are therefore visible, and that some parts are on the ‘other side’, hidden from view. This is certainly true for opaque concave objects, where the front facing surface hides the back surface; it’s a bit more complex for ‘open’ or concave objects (for example, you can see the back face of an open box). What this does mean though is that for a lot of things we could save time by simply not rendering the polygons that are facing away from the viewer. Figure 46 shows a simple model, first rendered with filled polygons, then with all its polygons visible, and finally with only those poylgons that have a surface normal facing the viewplane drawn. You can see there is a consid- erable saving in terms of the number of polygons shown; not surprisingly, somewhere in the region of 50% in this case. But how do you determine whether a polygon is facing towards or away from the viewer? There are 66
COMP37111 Realistic and Realtime Rendering two considerations to take into account here; first, it has to be a cheap cal- culation (otherwise we might as well just draw all the polygons and let the z-buffer take care of hiding the ones behind), and second it has to ideally be something we can do early on in the rendering pipeline (otherwise, if we’ve done most of the hard work transforming a polygon, we might as well just draw it anyway and again let the z-buffer do the occlusion work for us).
Figure 46: A model of a sofa rendered (a) with filled polygons, (b) showing the wireframe of its polygonal mesh, and (c) showing only those polygons that face the viewer.
We can do a simple test on polygons in object-space by calculating the dot product of the view vector and the polygon's surface normal, and looking at the sign of the result. If it's negative, the polygon faces away from the viewer and can be discarded; if it's positive, it's facing towards us and should be drawn (we are essentially testing whether the angle between the polygon's normal and the view vector is more than 90 degrees). The calculation is cheap: although the dot product is related to the cosine of the angle between the two vectors, computing it from their components needs only multiplications and additions, and modern hardware often has it available as a GPU instruction, so this isn't too painful (Figure 47). There is one alternative to doing the calculation in object-space, which is to look at the winding of the polygon as it is rendered onto the view plane; assuming that our model has a consistent polygon winding, polygons wound one way when projected on to the viewplane will be facing towards the viewer, and those wound the other way will be facing away from the viewer.

Figure 47: Backface culling in object-space. Polygons whose normals are within 90 degrees either side of the view direction are facing towards the viewer.
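A minimal version of the object-space test might look like the following C sketch. The Vec3 type, the function names, and the sign convention (normals pointing out of the visible surface, view vector running from the eye to the polygon) are assumptions for illustration:

typedef struct { double x, y, z; } Vec3;

static double dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

/* Returns nonzero if the polygon faces away from the eye. With the view
   vector taken from the eye towards the polygon, a positive dot product
   means the normal points in roughly the same direction as our line of
   sight, i.e. we are looking at the polygon's back. No trigonometry is
   needed: just three multiplies and two adds. */
int is_backfacing(Vec3 eye, Vec3 point_on_poly, Vec3 normal)
{
    Vec3 view = { point_on_poly.x - eye.x,
                  point_on_poly.y - eye.y,
                  point_on_poly.z - eye.z };
    return dot(view, normal) > 0.0;
}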
12.3 Frustum Culling
The viewing frustum is a geometric representation of the volume visible to the virtual camera. Naturally, objects outside this volume will not be visible in the final image, so they are discarded. Often, objects lie on the boundary of the viewing frustum. These objects are cut into pieces along this boundary in a process called clipping, and the pieces that lie outside the frustum are discarded as there is no place to draw them. The various spatial enumeration techniques described in Section 11 make it possible to quickly determine which objects are candidates for culling.
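A common way to make the per-object test cheap is to check a bounding sphere against the six frustum planes. The sketch below is illustrative only: it assumes the planes have already been extracted from the view-projection matrix elsewhere, stored with their normals pointing into the frustum:

typedef struct { double a, b, c, d; } Plane;   /* a*x + b*y + c*z + d >= 0 inside */

/* Conservative test: returns nonzero only when the sphere is wholly outside
   at least one plane and can safely be culled. */
int sphere_outside_frustum(const Plane planes[6],
                           double cx, double cy, double cz, double radius)
{
    for (int i = 0; i < 6; i++) {
        double dist = planes[i].a * cx + planes[i].b * cy
                    + planes[i].c * cz + planes[i].d;
        if (dist < -radius)
            return 1;   /* wholly outside this plane: cull            */
    }
    return 0;           /* inside or intersecting: keep (maybe clip)  */
}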
12.4 Occlusion Culling
The Z-Buffer technique takes care of the problem of objects near to the viewplane being 'drawn over' those further away; but it does rely on all the objects being drawn, and happens very late on in the rendering pipeline, so doesn't help performance. Ideally we'd like to arrange for objects that are entirely behind other opaque objects to be excluded from the list of things to be drawn early on, but this isn't trivial to arrange (imagine a very complicated polygonal object hidden by a simple single quadrilateral wall: it would be enormously wasteful to draw all the polygons only to later find that they are overwritten in the final image by pixels contributed by the single polygon wall). Approaches have been proposed that use techniques based on the Z-Buffer approach to solve the general problem, e.g. Greene et al. (1993) (see Appendix A.12 if you're interested, but don't worry about the detail!).

At one (absurd!) extreme, you could imagine pre-computing all possible configurations of a scene, and remembering which objects are occluded from various viewpoints, but this of course isn't practical, and is especially problematic for scenes with moving content. One compromise is to create Potentially Visible Sets of objects; essentially dividing the scene into regions and pre-computing their visibility (again, only really plausible for scenes with static or pre-determined movement).

Another approach to solving this problem is to exploit specific features of the scene being rendered. It's common, for example, for indoor scenes to consist of enclosed spaces of one kind or another, with portals (doors and windows) through which other enclosed spaces are potentially visible (Figure 48). From any given viewpoint, then, it's likely that you can potentially see things in the current room, and in any room to which there is a direct line of sight through 'portals' (Luebke and Georges, 1995). The 'portal culling' approach works roughly like this:
1. From the current viewpoint, identify any portals that are partly or wholly visible in the current view frustum.

2. For each portal, cast rays against the perimeter of the portal into other rooms.

3. For each room that a ray enters, identify portals that are visible in the view frustum, and repeat the ray casting process.

4. Each room that a ray passes through has contents that are potentially visible from the original viewpoint; create a new viewing 'frustum' (the portal need not be square, so it's not always technically a frustum) from the eyepoint that takes into account the frame of the portal, and repeat the process from Step 1 until all visible portals have been included. A sketch of this recursion is given after Figure 48.
Figure 48: (a) A plan view of a series of rooms and their interconnecting por- tals, and (b) the view from one of the rooms, showing the outlines of the por- tals and the content seen through them.
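The recursion might be sketched in C as follows. Every type and helper here is an illustrative assumption rather than a real API, and the 'visited' flag is a simplification (a real system re-derives visibility per portal, and clips the view volume exactly to the portal's polygon):

#define MAX_PORTALS 16

typedef struct { double planes[6][4]; } Frustum;   /* bounding planes      */

struct Room;

typedef struct {
    struct Room *leads_to;          /* the room seen through this portal   */
    /* ... the portal's polygon would live here ... */
} Portal;

typedef struct Room {
    int    visited;                 /* guard so cyclic room graphs terminate */
    int    n_portals;
    Portal portals[MAX_PORTALS];
} Room;

/* Assumed helpers: visibility test of a portal against the current view
   volume, shrinking the view volume to a portal's frame, and marking a
   room's contents as potentially visible. */
extern int     portal_visible(const Portal *p, const Frustum *f);
extern Frustum narrow_frustum(const Frustum *f, const Portal *p);
extern void    mark_contents_visible(Room *r);

void portal_cull(Room *room, const Frustum *frustum)
{
    if (room->visited) return;
    room->visited = 1;
    mark_contents_visible(room);    /* everything here is potentially visible */

    for (int i = 0; i < room->n_portals; i++) {
        Portal *p = &room->portals[i];
        if (portal_visible(p, frustum)) {
            /* Shrink the view volume to the portal's frame and recurse. */
            Frustum narrowed = narrow_frustum(frustum, p);
            portal_cull(p->leads_to, &narrowed);
        }
    }
}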
12.5 Culling Ponderings
- Why is the paper by Luebke and Georges (1995) called 'Portals and Mirrors'?
- Why is backface culling of polygons in screen-space (i.e. based on the winding of the resultant polygon) of little value in terms of performance?
- How could the portal-culling technique be modified to speed up the rendering of a city-like environment, consisting of densely built tall buildings viewed from street level?
- Why might a model of an industrial plant consisting of lots of extremely long pipes pose problems for culling algorithms?
13 Acknowledgements
This document is largely, but not entirely, the original work of the author. However, in some places fragments of text from Wikipedia pages associated with the concepts have been taken and re-worked to suit the prose style of these notes; for small pieces of text (a sentence or two) links to the relevant pages are given nearby using the W notation, and I am grateful to all the contributors to those pages for creating them and allowing their content to be reworked straightforwardly. Some of the images used in figures were created by colleagues; others are taken from Wikimedia pages and used in accordance with their licences. In all cases I have tried to credit the images' creators appropriately in the figure captions.
14 License
The text of this document is licensed under the terms of the Creative Commons Attribution 3.0 Unported (CC BY 3.0) License.
A Reading material
This appendix contains numerous original papers published in the field of computer graphics.
A.1 The Rendering Equation
This paper by James T. Kajiya brings together a variety of older formulae that had been proposed and used in computer graphics to form what has since been known as 'the rendering equation'. Essentially the same equation was proposed by Immel, Cohen and Greenberg (1986) in their paper about Radiosity; but Kajiya had the foresight to give his article a more memorable name.

This article mentions Monte Carlo Techniques W, which are an important part of modern computer graphics: a category of algorithms that rely on repeated random sampling (often steered by particular aspects of what's being rendered) to get high quality results without having to use a full-blown deterministic approach. We've not got time in this course to explore these techniques in depth, but you may want to think about which parts of the various algorithms we have covered could be improved by sensible random sampling.
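As a flavour of the sampling idea (and nothing more: this is a toy illustration invented for these notes, not anything from Kajiya's paper), the C fragment below estimates the average of a function over [0, 1] by repeated random sampling; the estimate converges as the number of samples grows:

#include <stdlib.h>

/* Monte Carlo estimate of the mean of f over [0, 1]. */
double monte_carlo_mean(double (*f)(double), int samples)
{
    double sum = 0.0;
    for (int i = 0; i < samples; i++) {
        double x = rand() / (double)RAND_MAX;   /* uniform sample in [0,1] */
        sum += f(x);
    }
    return sum / samples;   /* converges to the true mean as samples grows */
}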
A.2 Ray Tracing Jell-O Brand Gelatin
In this 1988 paper, inspired by the Kajiya Rendering Equation, Paul S. Heckbert introduces the concept of the Jell-O Equation, targeted specifically at producing results for 'a restricted class of dessert foods'.
A.3 An Improved Illumination Model for Shaded Display
Although the idea of creating computer generated images by tracing a ray into a scene was described by Arthur Appel in 1968 (Appel, 1968), his technique only used what we'd now call a 'primary' ray, and produced relatively crude results without the reflections and transparency effects we're used to seeing in Ray Traced images. This paper by Turner Whitted introduced the ideas of secondary rays and 'recursive' Ray Tracing, and is usually cited as the seminal paper on the topic.
A.4 Ray Tracing on Programmable Graphics Hardware
This 2002 paper by Purcell and colleagues requires a fair understanding of the behaviour of GPUs, beyond what is needed for this course, so don't worry if you get lost in the detail: it's interesting to get even a rough idea of how Whitted's original ideas are applied on modern hardware.
A.5 Distributed Interactive Ray Tracing of Dynamic Scenes
Originally, because each image generated by Ray Tracing is dependent on the position of the eye point (and of course the position of every object and light in the scene), it was primarily used to create static images and individual pre-rendered scenes for CGI animations. This paper describes techniques on ‘modern’ hardware (2003, ahem) that extend Ray Tracing to interactive applications with moving objects.
A.6 The Hemi-cube: a Radiosity Solution for Complex Environ- ments
Although the title of this paper implies that it’s about the Hemi-cube tech- nique, it is in fact a fairly detailed description of the entire Radiosity process when applied to computer graphics.
A.7 Perceptually-Driven Radiosity
This work was published by Simon Gibson as a result of his PhD research here at the School of Computer Science. It was done under the supervision of Professor Roger Hubbold, who taught on this course unit before his retirement. The interesting thing about this work is that it uses properties of the human perceptual system (i.e. how our eyes and brains interpret images seen on screen) to reduce the cost of calculating Radiosity solutions without reducing the perceived quality of the result.
A.8 Volume Rendering: Display of Surfaces from Volume Data
This paper by (now) Professor Marc Levoy was the first to report on volume rendering techniques, and is, according to Google Scholar, one of the most cited papers in computer graphics. It was published when he was a PhD student studying at the University of North Carolina, Chapel Hill. Levoy has since posted an erratum to this paper on his Stanford web page, which reads as follows:
There is an error in this paper. Figure 1 suggests that voxel colors and opacities should be interpolated separately during ray tracing/resampling. This only works correctly if the colors have been premultiplied by the opacities, as suggested by Porter and Duff [1], before interpolation. If this is not done, then low-opacity colors may mix on equal terms with high-opacity colors, leading to color bleeding artifacts at the boundaries between differently colored regions of the volume. The necessity to premultiply colors by opacities was not made clear in the paper.

This error, and the visual artifacts it may cause, is nicely described in a paper by Wittenbrink, Malzbender, and Goss [2]. However, contrary to the impression one might get from reading their paper, the error in my 1988 paper is only in the exposition, not in the implementation. As I say in a letter to the editor [3] of IEEE Computer Graphics and Applications, the images in my 1988 paper, and in my later papers, are correct. The code used to produce these images, incorporated in 1994 into Lacroute and Levoy's [4] free VolPack software package, is also correct. Despite this, there is a U.S. patent [5] covering the "improved" volume rendering algorithm described in [2]. I hope no company is paying a licensing fee on this invalid patent.

[1] Porter, T., Duff, T., Compositing digital images, Proc. SIGGRAPH '84, ACM, 1984, pp. 253-259.
[2] Wittenbrink, C., Malzbender, T., Goss, M., Opacity-weighted color interpolation for volume sampling, Proc. 1998 Symposium on Volume Visualization, ACM, October 1998, pp. 135-142.
[3] Levoy, M., Error in volume rendering paper was in exposition only, IEEE Computer Graphics and Applications, Vol. 20, No. 4, July/August 2000, p. 6.
[4] Lacroute, P. and Levoy, M., Fast Volume Rendering Using a Shear-Warp Factorization of the Viewing Transformation, Proc. SIGGRAPH '94, ACM, 1994, pp. 451-458.
[5] Malzbender, T., Goss, M.E., Opacity-weighted color interpolation for volume sampling, U.S. Patent No. 6,278,459, filed August 20, 1997, issued August 21, 2001.
A.9 Multi-GPU Volume Rendering using MapReduce
This paper describes the application of the Map/Reduce technique to the parallelisation of volume rendering on GPUs.
A.10 Introduction to bounding volume hierarchies
Not an officially published article as such, but rather a chapter from Herman J. Haverkort's PhD thesis that serves as a useful introduction to many of the issues in creating hierarchical bounding volumes.
A.11 Portals and mirrors: Simple, Fast evaluation of Potentially Visible Sets
This paper introduces the idea of Portal Culling. The model used here is a radiosity solution of the home of Professor Fred Brooks W, one of the pioneers of computer graphics and author of The Mythical Man Month W.
A.12 Hierarchical Z-Buffer Visibility
An algorithm that uses a Z-Buffer technique to provide generic occlusion culling. Quite a complex algorithm: read it just to get a sense of the approach rather than worrying about the detail.
B Frequently Asked Questions
B.1 Do I need to read through all of the notes you’ve written?
If you want to be sure you've covered all the examinable material, then yes you do. These notes cover, at a guess, about 70% of what's examinable for my part of the course. If you learn everything that's in them and understand it thoroughly, you should be able to get a good strong 2:1, possibly even a first.
B.2 If I read and understand all the notes, will I know everything I need to for the exam?
Nope. The lectures expand, illustrate and exemplify what's written here, and in the lectures I allow myself to include material that is relevant, up-to-date, or just interesting, in a much more flexible way than I do here; that material too is examinable. There's a significant overlap between the slides I'll be using in lectures and these notes, but they are not the same.
B.3 Do I need to remember all the names of people who invented techniques?
No, that’s just included in case you’re interested, and out of respect for those of have pioneered techniques in computer graphics. Credit where credit is due.
B.4 There’s a lot of maths. Do I need to remember it all?
- No. You don’t need to remember any of it in fact—but you do need to un-
derstand it. There will never be a question in the exam that says anything like ‘What is the rendering equation; write it down and explain all the com- ponents’. But there may be a question that shows the rendering equation (or any of the other maths) and asks you to explain what role (for exam- ple) the
1 πr2 component plays. Similarly you won’t be asked to derive the
solution for the intersection between a ray and a sphere–but you might be shown the form of the quadratic equation and asked about the performance implications of the square root component.
B.5 Am I expected to follow up all the Wikipedia links?
Definitely not. If you understand the meaning of a term I've highlighted as a Wikipedia concept in the context in which I've written it, then you understand enough about that term to use these notes. If you don't understand what the term means, you should probably read the Wikipedia page it links to, but only enough to give you an overview of the term's meaning. If I really care about a more detailed definition of a term, I will have written about it in these notes, or explained it in more depth in the lectures. Of course, if a term looks interesting, by all means follow it up!
B.6 Am I expected to read all the papers in the Appendixes?
Yes. But they are there to deepen your understanding of concepts, not to explain things in detail. You should read them, and try to understand the general principles being discussed, but you don't need to study them in depth. In the case of the historical papers, they will give you a bit more detail about what's explained in these notes, and for the most part should be reasonably straightforward to follow. The more recent ones are there to give you a flavour of a particular application of a technique, or just to show you how that technique has evolved over the years; for these, a fairly superficial reading should give you enough of a taste of progress to help you put things in context (and to write more interesting exam answers!).
B.7 Should I read all the references too?
Absolutely not. The references are there in case you are really really really keen, and, apart from those papers that are also in the Appendixes, are just included for interest and completeness.
B.8 The paper about the Jello equation. It’s a spoof, surely?
Yes.
B.9 Do I need to know the answers to all the ponderings?
Yes and no. Some of the ponderings have answers in this text or in the lectures; some of them I'll answer in the reflection sections at the start of the lecture following that topic; and others are really very open-ended questions to which there aren't obvious or definite answers. They are there to provoke you to think. If you can answer most of them, or at least understand the issues that make them hard or interesting to answer, then you're doing okay.
C Example GPU code for Phong Shading
Included just as a simple example of vertex shading on a GPU, suitable for working with 'faux' normals; it is here for reference only, and there is no need to learn the details.

void main(float4 position : TEXCOORD0,
          float3 normal   : TEXCOORD1,
      out float4 oColor   : COLOR,
  uniform float3 ambientCol,
  uniform float3 lightCol,
  uniform float3 lightPos,
  uniform float3 eyePos,
  uniform float3 Ka,
  uniform float3 Kd,
  uniform float3 Ks,
  uniform float  shiny)
{
    float3 P = position.xyz;
    float3 N = normalize(normal);

    // Directions from the surface point to the light and to the eye;
    // both must be computed before the half-angle vector that uses them.
    float3 L = normalize(lightPos - P);
    float3 V = normalize(eyePos - P);

    // Half-angle vector used by the specular term
    float3 H = normalize(L + V);

    float3 ambient = Ka * ambientCol;

    float  diffLight = max(dot(L, N), 0);
    float3 diffuse   = Kd * lightCol * diffLight;

    float  specLight = pow(max(dot(H, N), 0), shiny);
    float3 specular  = Ks * lightCol * specLight;

    oColor.xyz = ambient + diffuse + specular;
    oColor.w   = 1;
}
D Background maths
This appendix, modified from notes written by Professor Roger Hubbold, who previously taught this course, provides a brief aide-memoire for some of the maths used in this material.

Why is this mathematics necessary? The short answer is that almost all work on graphics uses mathematical notation to give a precise description of calculations. These notes, and any papers dealing with this topic, require you to be comfortable with the notation in order to understand their content. So, to be 'qualified' in this subject area you need to know the maths. Although it may look daunting, it is not complicated (maybe just unfamiliar) and, like most notations or jargon, understanding it just needs a bit of practice. Most of it should be familiar to you from earlier maths and graphics courses.
D.1 Simple vector algebra
For 3D graphics vectors are used extensively to represent points and direc- tions in space. A vector has a magnitude (its length) and a direction. In 3D, a vector has x, y, and z components as in Figure 49.
Figure 49: A vector v in 3D space.
In graphics, vectors are commonly written in column notation such as

$$\mathbf{v} = \begin{pmatrix} v_x \\ v_y \\ v_z \end{pmatrix}$$

but they can actually be written in a variety of ways, which can be a bit confusing, so you'll have to work out which notation is being used in each case. Sometimes you'll see vectors written as a letter with an arrow over it, such as $\vec{v}$; this simply means that the symbol (in this case v) is a vector. In other cases (most of the cases in these notes), vectors are written as a letter in bold, e.g. v. You may also see a column vector (like the one shown above) written as a transposed ('rotated') row vector, using the superscript T, as in [v_x, v_y, v_z]^T (this is usually just to allow the vector to be written 'in line' with text). To confuse things further, vectors are also written using wiggly rather than square brackets, or even regular parentheses, so {v_x, v_y, v_z}^T, $\vec{v}$, [v_x, v_y, v_z]^T, (v_x, v_y, v_z)^T and v all mean the same thing!

The magnitude of a vector (often written as |v|) is computed from its components using Pythagoras' theorem:

$$|\mathbf{v}| = \sqrt{v_x^2 + v_y^2 + v_z^2}$$

and we can normalise a vector (to form a unit vector) by dividing each component by the magnitude m = |v|:

$$\hat{\mathbf{v}} = (v_x/m, \; v_y/m, \; v_z/m)^T$$

Sometimes the hat symbol (ˆ) is used to signify a unit vector. However, in this course, and in graphics in general, it's often assumed that any vector signifying a direction (e.g. the direction of a light ray, or a surface normal) is normalised, so it's quite common not to bother with the hat.

D.1.1 Vector addition and subtraction

Vector addition and subtraction are most easily drawn in two dimensions, but the extension to 3D is trivial. These are shown visually in Figure 50.
Figure 50: Addition and subtraction of vectors.
D.1.2 Scalar product

The scalar (or dot) product of two vectors is the sum of the products of their components. It is a scalar value (a number). It can also be used to compute the cosine of the angle between two unit vectors (Figure 51), which is very widely used in computer graphics:

$$\mathbf{a} \cdot \mathbf{b} = a_x b_x + a_y b_y + a_z b_z = |\mathbf{a}|\,|\mathbf{b}|\cos(\theta)$$
Figure 51: The dot product of two vectors a and b can be used to compute the cosine of the angle θ between them.
Uses of the dot product include backface culling, and computing the light energy incident on a surface with Lambert's Cosine Law.

D.1.3 Lambert's Cosine Law

Figure 52: Lambert's Cosine Law.

This states that the amount of light falling upon a surface (the energy per unit area of the surface) is proportional to the cosine of the angle between the incident light direction and the surface normal vector. We can see from Figure 52 that this is so, because the area of the surface presented to (facing) the light source depends on this cosine:

$$L_i(x, \omega) \propto \cos(\theta) = \mathbf{n} \cdot \omega$$
D.1.4 Vector product

The vector (or cross) product of two vectors (a and b) yields another vector (c) normal (at 90°) to the plane containing the two vectors (see Figure 53). The product of two vectors v₁ = (x_1, y_1, z_1)^T and v₂ = (x_2, y_2, z_2)^T is

$$\mathbf{v}_1 \times \mathbf{v}_2 = ((y_1 z_2 - y_2 z_1), \; (z_1 x_2 - z_2 x_1), \; (x_1 y_2 - x_2 y_1))^T$$
Figure 53: The vector (or cross) product c is a vector orthogonal to the two vectors a and b.
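For reference, the operations above translate directly into code. The following C helpers are illustrative only (the Vec3 type and function names are inventions for these notes); they compute the dot product, cross product, magnitude, normalisation, and the Lambert cosine term:

#include <math.h>

typedef struct { double x, y, z; } Vec3;

/* Dot product: sum of the products of the components. */
double v_dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

/* Cross product: a vector orthogonal to both inputs. */
Vec3 v_cross(Vec3 a, Vec3 b)
{
    Vec3 c = { a.y*b.z - b.y*a.z,      /* (y1 z2 - y2 z1) */
               a.z*b.x - b.z*a.x,      /* (z1 x2 - z2 x1) */
               a.x*b.y - b.x*a.y };    /* (x1 y2 - x2 y1) */
    return c;
}

/* Magnitude via Pythagoras' theorem. */
double v_length(Vec3 v) { return sqrt(v_dot(v, v)); }

/* Unit vector: each component divided by the magnitude. */
Vec3 v_normalise(Vec3 v)
{
    double m = v_length(v);
    Vec3 u = { v.x / m, v.y / m, v.z / m };
    return u;
}

/* Lambert's cosine term for unit vectors n and omega, clamped at zero. */
double lambert(Vec3 n, Vec3 omega)
{
    double c = v_dot(n, omega);
    return c > 0.0 ? c : 0.0;
}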
D.2 Integration
An integral evaluates a function over a given interval. In the example below, we evaluate the area under a graph (the function f(x)) over the interval x_1 to x_2, as shown in Figure 54. We do this by taking a series of very thin strips of width δx and multiplying each of them by the height of the strip, which is f(x). As δx is made smaller and smaller ('tends to zero') the answer becomes closer and closer to the true value of the integral. So whilst mathematically we might write

$$A = \int_{x_1}^{x_2} f(x)\,dx$$

what we almost always end up doing programmatically is the summation

$$A = \sum_{x = x_1}^{x_2} f(x)\,\delta x$$
Figure 54: Integration of the 'area under a graph' for f(x) between x_1 and x_2.
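Programmatically, that summation is just a loop. A minimal sketch in C (the function name and the midpoint sampling are illustrative choices, not a prescribed method):

/* Approximate the integral of f over [x1, x2] by summing thin strips of
   width dx; the smaller dx is, the closer the result is to the true value. */
double integrate(double (*f)(double), double x1, double x2, double dx)
{
    double area = 0.0;
    for (double x = x1; x < x2; x += dx)
        area += f(x + 0.5 * dx) * dx;   /* height at the strip's midpoint */
    return area;
}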
Integration is easily extended to cover more dimensions. A double integral evaluates a function of two variables over two ranges. It could be used, for example, to integrate the volume under a surface defined as a function of x and y:

$$\mathrm{Vol} = \int_{x_1}^{x_2} \int_{y_1}^{y_2} f(x, y)\,dy\,dx$$

In this case, instead of a thin strip we have a small area of dimensions δx by δy, centred on the point (x, y), where the value of the function (the height of the surface) is f(x, y). Again, this can be evaluated by summation, using small values for δx and δy to get an accurate result:

$$\mathrm{Vol} = \sum_{x = x_1}^{x_2} \sum_{y = y_1}^{y_2} f(x, y)\,\delta x\,\delta y$$

To further simplify this, we often use a single integral over an area (instead of the individual dimensions δx and δy) and write it like this:

$$\mathrm{Vol} = \int_A f(x, y)\,\delta A$$

As we shall see, the domain (the range of values) over which we integrate can be pretty much any space over which we can define our function. When computing light arriving at a point on a surface we generally integrate over a hemisphere of visible directions centred at the point we are interested in and occupying the half-space centred on the surface normal vector.
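The corresponding loop for the double integral simply nests the summation; again, this is just an illustrative sketch:

/* Approximate the volume under f(x, y) over [x1, x2] x [y1, y2] by summing
   small patches of area dx by dy, mirroring the Vol summation above. */
double integrate2d(double (*f)(double, double),
                   double x1, double x2, double y1, double y2,
                   double dx, double dy)
{
    double vol = 0.0;
    for (double x = x1; x < x2; x += dx)
        for (double y = y1; y < y2; y += dy)
            vol += f(x + 0.5 * dx, y + 0.5 * dy) * dx * dy;
    return vol;
}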
Figure 55: Sweeping out a hemisphere with polar co-ordinates θ and φ.
One way to sweep out a hemisphere (as in Figure 55) is to use spherical (or polar W) coordinates:

$$\int_\Omega f(\theta, \phi)\,\delta A = \int_{\phi=0}^{2\pi} \int_{\theta=0}^{\pi/2} f(\theta, \phi)\sin(\theta)\,\delta\theta\,\delta\phi \qquad (13)$$

Here, our function depends on two angles, φ and θ, which can be thought of as the 'longitude' and 'latitude' of a globe. By varying φ from 0 to 2π we sweep out a circle right around the globe. As θ varies from 0 to π/2 we trace out the upper half of a sphere (a hemisphere). In this case, our elemental (small) area (δA in the earlier example) is of dimension δφ by δθ. But, as θ tends towards zero (the 'pole'), our value of δφ reduces towards zero; in fact the scaling of δφ as θ reduces is sin(θ), so in the general case our area δA is actually of dimensions δθ by sin(θ)δφ, hence the expression in the integral in Equation 13.

Instead of explicitly writing the double integral, we can, as previously, use a single integral over the hemisphere, $\int_\Omega$, where Ω represents the area of the hemisphere. This is simply notation: there's nothing mystical about it. When you see integrals like this in the course you can think of them as