Metropolis Sampling
Matt Pharr, cs348b, May 20, 2003


SLIDE 1

Metropolis Sampling Matt Pharr cs348b

May 20, 2003

SLIDE 2

Introduction

  • Unbiased MC method for sampling from functions’ distributions

  • Robustness in the face of difficult problems
  • Application to a wide variety of problems
  • Flexibility in choosing how to sample
  • Introduced to CG by Veach and Guibas
SLIDE 3

(a) Bidirectional path tracing with 40 samples per pixel.

Image credit: Eric Veach

SLIDE 4

(b) Metropolis light transport with an average of 250 mutations per pixel [the same computation time as (a)].

Image credit: Eric Veach

SLIDE 5

(a) Path tracing with 210 samples per pixel.

Image credit: Eric Veach

SLIDE 6

(b) Metropolis light transport with an average of 100 mutations per pixel [the same computation time as (a)].

Image credit: Eric Veach

SLIDE 7

Overview

  • For an arbitrary f : Ω → R, x ∈ Ω
SLIDE 8

Overview

  • For an arbitrary f : Ω → R, x ∈ Ω
  • Define I(f) = ∫Ω f(x) dΩ, so fpdf = f / I(f)
SLIDE 9

Overview

  • For an arbitrary f : Ω → R, x ∈ Ω
  • Define I(f) = ∫Ω f(x) dΩ, so fpdf = f / I(f)
  • Generates samples X = {xi}, xi ∼ fpdf
  • Without needing to compute fpdf or I(f)
SLIDE 10

Overview

  • Introduction to Metropolis sampling
  • Examples with 1D problems
  • Extension to 3D, motion blur
  • Overview of Metropolis Light Transport
SLIDE 11

Basic Algorithm

  • Function f(x) over state space Ω, f : Ω → R
  • Markov Chain: new sample xi using xi−1
SLIDE 12

Basic Algorithm

  • Function f(x) over state space Ω, f : Ω → R
  • Markov chain: generate new sample xi using xi−1
  • New samples come from mutating xi−1 → x′
  • Mutation accepted or rejected so that xi ∼ fpdf
  • If rejected, xi = xi−1
  • Acceptance guarantees the distribution of xi is the stationary distribution

SLIDE 13

Pseudo-code

x = x0
for i = 1 to n
    x' = mutate(x)
    a = accept(x, x')
    if (random() < a)
        x = x'
    record(x)
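As a concrete illustration, the pseudocode above can be sketched in Python. The target f1(x) = (x − .5)² and the uniform mutation come from the 1D example later in the deck; the function names themselves are illustrative, not from the slides.

```python
import random

def metropolis(f, mutate, x0, n, seed=0):
    """Basic Metropolis sampler. The mutation used here is symmetric
    (T(x -> x') = T(x' -> x)), so the acceptance probability reduces
    to min(1, f(x') / f(x))."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n):
        xp = mutate(rng, x)                      # x' = mutate(x)
        fx = f(x)
        a = min(1.0, f(xp) / fx) if fx > 0 else 1.0
        if rng.random() < a:                     # accept?
            x = xp
        samples.append(x)                        # record(x)
    return samples

# 1D target from later in the deck: f1(x) = (x - .5)^2 on [0, 1]
def f1(x):
    return (x - 0.5) ** 2 if 0.0 <= x <= 1.0 else 0.0

# mutate1: propose a completely new uniform value (T1 = 1, symmetric)
xs = metropolis(f1, lambda rng, x: rng.random(), x0=0.25, n=10_000)
mean = sum(xs) / len(xs)
```

Since fpdf ∝ (x − .5)² is symmetric about 0.5, the sample mean should land near 0.5 even though the samples themselves concentrate near 0 and 1.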

SLIDE 14

Expected Values

  • Metropolis avoids parts of Ω where f(x) is small
  • But e.g. dim parts of an image need samples
  • Record samples at both x and x′
  • Samples are weighted based on a(x → x′)
  • Same result in the limit
SLIDE 15

Expected Values – Pseudo-code

x = x0
for i = 1 to n
    x' = mutate(x)
    a = accept(x, x')
    record(x, (1-a) * weight)
    record(x', a * weight)
    if (random() < a)
        x = x'
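A sketch of the expected-values variant in Python (names illustrative): both the current and proposed states are recorded each iteration, weighted by 1 − a and a.

```python
import random

def f1(x):
    return (x - 0.5) ** 2 if 0.0 <= x <= 1.0 else 0.0

def metropolis_ev(f, mutate, x0, n, seed=0):
    """Expected-values Metropolis: record (sample, weight) pairs for both
    x and x', so that low-f regions still receive (small) weight."""
    rng = random.Random(seed)
    x = x0
    records = []
    for _ in range(n):
        xp = mutate(rng, x)
        fx = f(x)
        a = min(1.0, f(xp) / fx) if fx > 0 else 1.0
        records.append((x, 1.0 - a))    # record(x, (1-a) * weight)
        records.append((xp, a))         # record(x', a * weight)
        if rng.random() < a:
            x = xp
    return records

recs = metropolis_ev(f1, lambda rng, x: rng.random(), 0.25, 10_000)
total_weight = sum(w for _, w in recs)   # one unit of weight per iteration
wmean = sum(x * w for x, w in recs) / total_weight
```

Each iteration distributes exactly one unit of weight, so the totals match the plain sampler in the limit.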

SLIDE 16

Mutations, Transitions, Acceptance

  • Mutations propose x′ given xi
  • T(x → x′) is the probability density of proposing x′ from x
  • a(x → x′) is the probability of accepting the transition

SLIDE 17

Detailed Balance – The Key

  • By defining a(x → x′) carefully, we can ensure xi ∼ fpdf via the detailed balance condition

    f(x) T(x → x′) a(x → x′) = f(x′) T(x′ → x) a(x′ → x)

  • Since f and T are given, this gives conditions on the acceptance probability
  • (Will not show derivation here)
SLIDE 18

Acceptance Probability

  • Efficient choice:

a(x → x′) = min( 1, [f(x′) T(x′ → x)] / [f(x) T(x → x′)] )
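One can check numerically that this choice satisfies detailed balance; a small sketch follows (the particular f and T below are arbitrary illustrations, not from the slides):

```python
def accept(f, T, x, xp):
    """a(x -> x') = min(1, f(x') T(x' -> x) / (f(x) T(x -> x')))."""
    return min(1.0, (f(xp) * T(xp, x)) / (f(x) * T(x, xp)))

# Detailed balance: f(x) T(x->x') a(x->x') = f(x') T(x'->x) a(x'->x)
f = lambda x: (x - 0.5) ** 2      # toy target
T = lambda a, b: 0.5 + b          # an arbitrary asymmetric stand-in for a density
x, xp = 0.1, 0.8
lhs = f(x) * T(x, xp) * accept(f, T, x, xp)
rhs = f(xp) * T(xp, x) * accept(f, T, xp, x)
```

Whichever direction has acceptance ratio below 1 picks up exactly the factor needed to balance the other side, so lhs and rhs agree.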

SLIDE 19

Acceptance Probability – Goals

  • Doesn’t affect unbiasedness; just variance
  • Maximize the acceptance probability →
    – Explore state space better
    – Reduce correlation (image artifacts...)
  • Want transitions that are likely to be accepted
    – i.e. transitions that head where f(x) is large

SLIDE 20

Mutations: Metropolis

  • If T(a → b) = T(b → a) for all a, b, then

    a(x → x′) = min( 1, f(x′) / f(x) )

  • Random walk Metropolis:

    T(x → x′) = T(|x − x′|)

SLIDE 21

Mutations: Independence Sampler

  • If we have some pdf p, can sample x′ ∼ p
  • Straightforward transition function:

    T(x → x′) = p(x′)

  • If p(x) = fpdf, we wouldn’t need Metropolis
  • But can use pdfs that approximate parts of f...
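A sketch of an independence sampler in Python, using the deck’s 1D target and an arbitrarily chosen proposal p(x) = 2x on [0, 1] (sampled as √u); all names and the choice of p are illustrative:

```python
import math, random

def f1(x):
    return (x - 0.5) ** 2 if 0.0 <= x <= 1.0 else 0.0

def independence_sampler(n, seed=0):
    """Propose x' ~ p independently of x, so T(x -> x') = p(x') and
    a = min(1, f(x') p(x) / (f(x) p(x')))."""
    rng = random.Random(seed)
    p = lambda x: 2.0 * x            # pdf of x = sqrt(u), u uniform on [0, 1)
    x = 0.75
    samples = []
    for _ in range(n):
        xp = math.sqrt(rng.random())
        den = f1(x) * p(xp)
        a = 1.0 if den == 0.0 else min(1.0, f1(xp) * p(x) / den)
        if rng.random() < a:
            x = xp
        samples.append(x)
    return samples

xs = independence_sampler(20_000)
```

This p only roughly matches the right half of f1; the chain still converges to fpdf, it just mixes more slowly where p underestimates f.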
SLIDE 22

Mutation Strategies: General

  • Adaptive methods: vary transitions based on experience
  • Flexibility: base on value of f(x), etc. pretty freely
  • Remember: just need to be able to compute transition densities for the mutation
  • The more mutations, the merrier
  • Relative frequency of them not so important
SLIDE 23

1D Example

  • Consider the function

    f1(x) = (x − .5)²  for 0 ≤ x ≤ 1,  0 otherwise

  • Want to generate samples from f1(x)
SLIDE 24

1D Mutation #1

mutate1(x) → ξ
T1(x → x′) = 1

  • Simplest mutation possible
  • Random walk Metropolis
SLIDE 25

1D Mutation #2

mutate2(x) → x + .1 · (ξ − .5)
T2(x → x′) = 1/0.1 = 10 if |x − x′| ≤ .05, 0 otherwise

  • Also random walk Metropolis
SLIDE 26

1D Mutation #2

  • mutate2 increases the acceptance probability

    a(x → x′) = min( 1, [f(x′) T(x′ → x)] / [f(x) T(x → x′)] )

  • When f(x) is large, we will tend to reject x′ with f(x′) < f(x)
  • Should try to avoid proposing mutations to such x′

SLIDE 27

1D Results - pdf graphs

[Plots: empirical pdfs of the generated samples over x ∈ [0, 1]]

  • Left: mutate1 only
  • Right: a mix of the two (10%/90%)
  • 10,000 mutations total
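The 10%/90% mix can be sketched as follows (a mixture of symmetric kernels is itself symmetric, so the simple acceptance rule still applies; names are illustrative):

```python
import random

def f1(x):
    return (x - 0.5) ** 2 if 0.0 <= x <= 1.0 else 0.0

def metropolis_mix(n, p_restart=0.1, seed=0):
    """Metropolis with a mixture of mutate1 (uniform restart, prob 10%)
    and mutate2 (+-0.05 random-walk step, prob 90%)."""
    rng = random.Random(seed)
    x = 0.25
    samples = []
    for _ in range(n):
        if rng.random() < p_restart:
            xp = rng.random()                     # mutate1
        else:
            xp = x + 0.1 * (rng.random() - 0.5)   # mutate2
        fx = f1(x)
        a = min(1.0, f1(xp) / fx) if fx > 0 else 1.0
        if rng.random() < a:
            x = xp
        samples.append(x)
    return samples

xs = metropolis_mix(10_000)
frac_left = sum(1 for x in xs if x < 0.5) / len(xs)
```

Steps outside [0, 1] have f1 = 0 and are always rejected, so the chain stays in the domain; the occasional restarts let it cross the low-f region around x = 0.5, so both modes of f1 get visited.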
SLIDE 28

Why bother with mutate1, then?

  • If we just use the second mutation (±.05)...

[Plot: resulting empirical pdf over x ∈ [0, 1]]

SLIDE 29

Ergodicity

  • Need a nonzero probability of sampling any x where f(x) > 0
  • This is true with mutate2, but it is inefficient:

[Plot: empirical pdf over x ∈ [0, 1]]

  • Still unbiased in the limit...
SLIDE 30

Ergodicity – Easy Solution

  • Periodically pick an entirely new x
  • e.g. sample uniformly over Ω...
SLIDE 31

Application to Integration

  • Given the integral

    ∫Ω f(x) g(x) dΩ

  • Standard Monte Carlo estimator:

    ∫Ω f(x) g(x) dΩ ≈ (1/N) Σ_{i=1}^{N} f(xi) g(xi) / p(xi)

  • where xi ∼ p(x), an arbitrary pdf
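A quick numeric check of the standard estimator (the particular f, g, and p below are illustrative): with f(x) = (x − .5)², g(x) = x, and p uniform on [0, 1], the exact integral is 1/24.

```python
import random

def mc_estimate(f, g, sample_p, pdf_p, n, seed=0):
    """Standard MC: integral of f*g over [0,1] ≈ (1/N) sum f(xi) g(xi) / p(xi),
    with xi ~ p."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = sample_p(rng)
        total += f(x) * g(x) / pdf_p(x)
    return total / n

f = lambda x: (x - 0.5) ** 2
g = lambda x: x
# p(x) = 1 on [0, 1] (uniform)
est = mc_estimate(f, g, lambda rng: rng.random(), lambda x: 1.0, 100_000)
# Exact: integral of (x-.5)^2 * x over [0,1] = 1/4 - 1/3 + 1/8 = 1/24
```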
SLIDE 32

Application to Integration

∫Ω f(x) g(x) dΩ ≈ (1/N) Σ_{i=1}^{N} f(xi) g(xi) / p(xi)
SLIDE 33

Application to Integration

∫Ω f(x) g(x) dΩ ≈ (1/N) Σ_{i=1}^{N} f(xi) g(xi) / p(xi)

  • Metropolis gives x1, . . . , xN, xi ∼ fpdf(x)

    ∫Ω f(x) g(x) dΩ ≈ [ (1/N) Σ_{i=1}^{N} g(xi) ] · I(f)

  • (Recall I(f) = ∫Ω f(x) dΩ)
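Putting the pieces together in a sketch (names illustrative; the mutation mix follows the earlier 1D example): with f(x) = (x − .5)², g(x) = x, and I(f) = ∫₀¹ f dx = 1/12 computed separately, the Metropolis estimator should again approach 1/24.

```python
import random

def f1(x):
    return (x - 0.5) ** 2 if 0.0 <= x <= 1.0 else 0.0

def metropolis_samples(n, seed=0):
    """Samples xi ~ fpdf via a 10%/90% mix of restart and small-step mutations."""
    rng = random.Random(seed)
    x = 0.25
    out = []
    for _ in range(n):
        xp = rng.random() if rng.random() < 0.1 else x + 0.1 * (rng.random() - 0.5)
        fx = f1(x)
        a = min(1.0, f1(xp) / fx) if fx > 0 else 1.0
        if rng.random() < a:
            x = xp
        out.append(x)
    return out

g = lambda x: x
I_f = 1.0 / 12.0        # integral of (x-.5)^2 over [0,1], computed analytically
xs = metropolis_samples(50_000)
est = sum(g(x) for x in xs) / len(xs) * I_f
# Exact: integral of f1 * g over [0,1] = 1/24
```

Note that I(f) is needed as a separate input; Metropolis itself never computes it.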
SLIDE 34

Start-Up Bias

  • The distribution of xi converges to fpdf, but never actually reaches it
  • Especially in early iterations, the distribution of xi can be quite different from fpdf
  • This problem is called start-up bias
SLIDE 35

Eliminating Start-Up Bias

  • Start from any x0, make a couple of “dummy” iterations without recording the results
    – How many “dummy” iterations?
    – xi is only proportional to fpdf for i → ∞ anyway
    – Not a good solution

SLIDE 36

Eliminating Start-Up Bias

  • Generate x0 proportional to any suitable pdf p(x)
  • Weight all contributions by w = f(x0) / p(x0)
    – Unbiased on average (over many runs)
    – Each run may be completely “off” (even a black image, if f(x0) was zero)
    – Not a good solution

SLIDE 37

Eliminating Start-Up Bias

  • Generate n initial samples x0,1, . . . , x0,n proportional to any suitable pdf p(x)
  • Compute weights wi = f(x0,i) / p(x0,i)
  • Pick the initial state proportional to wi
  • Weight all contributions by

    w̄ = (1/n) Σ_{i=1}^{n} wi

  • Note that

    E[w̄] = (1/n) Σ_{i=1}^{n} E[ f(x0,i) / p(x0,i) ] = ∫Ω f(x) dx
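A sketch of this initialization in Python, using the 1D target and a uniform p (names illustrative):

```python
import random

def f1(x):
    return (x - 0.5) ** 2 if 0.0 <= x <= 1.0 else 0.0

def weighted_start(n, seed=0):
    """Draw n candidates from p(x) = 1 on [0, 1], weight each by
    wi = f(x)/p(x), pick the initial state with probability proportional
    to wi, and return it with w_bar = (1/n) * sum(wi), the weight applied
    to all contributions. E[w_bar] = integral of f, so this is unbiased."""
    rng = random.Random(seed)
    cands = [rng.random() for _ in range(n)]
    ws = [f1(x) / 1.0 for x in cands]
    w_bar = sum(ws) / n
    # select one candidate with probability proportional to its weight
    r = rng.random() * sum(ws)
    acc = 0.0
    for x, w in zip(cands, ws):
        acc += w
        if r <= acc:
            return x, w_bar
    return cands[-1], w_bar

x0, w_bar = weighted_start(1000)
# E[w_bar] = integral of (x-.5)^2 over [0,1] = 1/12
```

Unlike the single-sample scheme on the previous slide, averaging n weights keeps any one run from being wildly “off”.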

SLIDE 38

Motion Blur

  • Onward to a 3D problem
  • Scene radiance function L(u, v, t) (e.g. evaluated with ray tracing)
  • L = 0 outside the image boundary
  • Ω is (u, v, t) ∈ [0, umax] × [0, vmax] × [0, 1]
SLIDE 39

Image Contribution Function

  • The key to applying Metropolis to image synthesis:

    Ij = ∫ hj(u, v) L(u, v, t) du dv dt

  • Ij is the value of the j’th pixel
  • hj is the pixel reconstruction filter
SLIDE 40

Image Contribution Function

  • So if we sample xi ∼ Lpdf,

    Ij ≈ (1/N) Σ_{i=1}^{N} hj(xi) · ∫Ω L(x) dΩ

  • The distribution of xi on the image plane forms the image
  • Estimate ∫Ω L(x) dΩ with standard MC
SLIDE 41

Two Basic Mutations

  • Pick a completely new (u, v, t) value
  • Perturb u and v by ±8 pixels, time by ±.01
  • Both are symmetric: random-walk Metropolis
SLIDE 42

Motion Blur – Result

  • Left: Distribution RT, stratified sampling
  • Right: Metropolis sampling
  • Same total number of samples
SLIDE 43

Motion Blur – Parameter Studies

  • Left: ±80 pixels, ±.5 time. Many rejections.
  • Right: ±0.5 pixels, ±.001 time. Didn’t explore Ω well.

SLIDE 44

Exponential Distribution

  • Vary the scale of proposed mutations:

    r = rmax e^(−log(rmax/rmin) ξ),  θ = 2πξ
    (du, dv) = (r sin θ, r cos θ)
    dt = tmax e^(−log(tmax/tmin) ξ)

  • Will reject when too big, but still tries a wide variety of scales
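The scale sampling above can be sketched as follows (the r_min/r_max and t_min/t_max defaults below are illustrative, not from the slides, and the sign choice for dt is an assumption):

```python
import math, random

def exp_scale(rng, s_min, s_max):
    """Sample s in [s_min, s_max] via s = s_max * exp(-log(s_max/s_min) * xi):
    xi = 0 gives s_max, xi = 1 gives s_min, with pdf proportional to 1/s."""
    xi = rng.random()
    return s_max * math.exp(-math.log(s_max / s_min) * xi)

def perturb(rng, u, v, t, r_min=0.5, r_max=8.0, t_min=1e-3, t_max=1e-2):
    """Offset (u, v) by a vector of exponentially distributed length and
    uniform direction; offset t similarly (random sign is an assumption)."""
    r = exp_scale(rng, r_min, r_max)
    theta = 2.0 * math.pi * rng.random()
    du, dv = r * math.sin(theta), r * math.cos(theta)
    dt = exp_scale(rng, t_min, t_max)
    if rng.random() < 0.5:
        dt = -dt
    return u + du, v + dv, t + dt

rng = random.Random(0)
radii = [exp_scale(rng, 0.5, 8.0) for _ in range(1000)]
```

The 1/s density spends comparable effort on each octave of scale between s_min and s_max, which is what lets the sampler try both tiny and large perturbations.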
SLIDE 45

Exponential distribution results

SLIDE 46

Light Transport

  • Image contribution function was key
  • f(x) over an infinite space of paths
  • State space is light-carrying paths through the scene, from light source to sensor
  • Robustness is particularly nice: solve difficult transport problems efficiently
  • Few specialized parameters to set
SLIDE 47

Light Transport – Setting

  • Samples x from Ω are sequences v0 v1 . . . vk, k ≥ 1, of vertices on scene surfaces

[Diagram: an example path through the scene]

  • f(x) is the product of emitted light, BRDF values, cosines, etc.

SLIDE 48

Light Transport – Strategy

  • Explore the infinite-dimensional path space
  • Metropolis’s natural focus on areas of high

contribution makes it efficient

  • New issues:

– Stratifying over pixels – Perceptual issues – Spectral issues – Direct lighting

SLIDE 49

MLT Pseudocode

Legend:
  z     sampling state (light path)
  I(z)  “scalar contribution function” (i.e. the target function for Metropolis mutations; was f(z) earlier)
  F(z)  integrand (was h(z) f(z) earlier)

SLIDE 50

Bidirectional Mutation

  • Delete a subpath from the current path
  • Generate a new one
  • Connect things with shadow rays

[Diagram: path v0 . . . v6; an interior subpath is deleted, new vertices v2′ v3′ v4′ v5′ are generated, and the pieces are reconnected]

  • If occluded, then just reject
SLIDE 51

Bidirectional Mutation

  • Very flexible path re-use
  • Ensures ergodicity: may discard the entire path
  • Inefficient when a very small part of path space is important
  • Transition densities are tricky: need to consider all possible ways of sampling the path

SLIDE 52

Caustic Perturbation

  • Caustic path: one or more specular surface hits before the diffuse surface and the eye

[Diagram: specular vs. non-specular bounces]

  • Slightly shift the outgoing direction from the light source, regenerate the path

SLIDE 53

Lens Perturbation

  • Similarly perturb outgoing ray from camera
  • Keeps image samples from clumping together
SLIDE 54

Why It Works Well

  • Path Reuse
    – Efficiency: paths are built from pieces of old ones
    – (Could be used in stuff like path tracing...)
  • Local Exploration
    – Given an important path, incrementally sample close to it in Ω
    – When f is small over much of Ω, this is extra helpful

SLIDE 55

MLT in Primary Sample Space

  • [Kelemen et al. 2002]
  • Light path is uniquely determined by the random numbers used to generate it.

SLIDE 56

MLT in Primary Sample Space

  • Formulate MLT as an integration problem in the space of uniformly distributed random numbers (the “primary space”)
  • New integrand:
  • New scalar contrib. function:
SLIDE 57

MLT in Primary Sample Space

  • Mutations in primary space
  • Small steps
    – Perturb all the ui’s using an exponential distribution
  • Large steps
    – Regenerate the path from scratch
  • Ergodicity, symmetry (no need to evaluate T)
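A sketch of the two mutation types in primary sample space (the step-size bounds below are illustrative, not from the slides):

```python
import math, random

def small_step(rng, us, s_min=1.0 / 1024.0, s_max=1.0 / 64.0):
    """Perturb every primary-space coordinate by an exponentially
    distributed step with random sign, wrapping to stay in [0, 1)."""
    out = []
    for u in us:
        s = s_max * math.exp(-math.log(s_max / s_min) * rng.random())
        if rng.random() < 0.5:
            s = -s
        out.append((u + s) % 1.0)
    return out

def large_step(rng, k):
    """Regenerate the whole vector of uniforms from scratch."""
    return [rng.random() for _ in range(k)]

rng = random.Random(0)
us = [0.5] * 8
vs = small_step(rng, us)
ws = large_step(rng, 8)
```

Both proposals are symmetric in primary space, which is why no transition densities T need to be evaluated; the mapping from the uniforms back to a light path is left to the renderer.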
SLIDE 58

(a) Bidirectional path tracing with 40 samples per pixel.

Image credit: Eric Veach

SLIDE 59

(b) Metropolis light transport with an average of 250 mutations per pixel [the same computation time as (a)].

Image credit: Eric Veach

SLIDE 60

(a) Path tracing with 210 samples per pixel.

Image credit: Eric Veach

SLIDE 61

(b) Metropolis light transport with an average of 100 mutations per pixel [the same computation time as (a)].

Image credit: Eric Veach

SLIDE 62
SLIDE 63
SLIDE 64


Image credit: Toshiya Hachisuka

SLIDE 65


Image credit: Toshiya Hachisuka

SLIDE 66

Other Applications of Metropolis Sampling in Rendering

  • Segovia et al. 2007: Metropolis Instant Radiosity
  • Ghosh & Heidrich 2006: Metropolis sampling of environment maps
  • Metropolis photon sampling
    – Fan et al. 2005
    – Hachisuka and Jensen 2011

SLIDE 67

Fan et al. 2005

SLIDE 68

Hachisuka and Jensen 2011

  • Simplifying trick: the target function is the binary photon visibility V(x)

SLIDE 69

Hachisuka and Jensen 2011

  • Other important trick: adaptive mutation size
    – Adapt the mutation size to the number of accepted mutations

SLIDE 70

Hachisuka and Jensen 2011

SLIDE 71

Hachisuka and Jensen 2011

SLIDE 72

Conclusion

  • A very different way of thinking about integration

  • Robustness is highly attractive
  • Implementation can be tricky