  1. Metropolis Sampling Matt Pharr cs348b May 20, 2003

  2. Introduction • Unbiased Monte Carlo method for sampling from a function's distribution • Robust in the face of difficult problems • Applicable to a wide variety of problems • Flexible in choosing how to sample • Introduced to computer graphics by Veach and Guibas

  3. (a) Bidirectional path tracing with 40 samples per pixel. Image credit: Eric Veach

  4. (b) Metropolis light transport with an average of 250 mutations per pixel [the same computation time as (a)]. Image credit: Eric Veach

  5. (a) Path tracing with 210 samples per pixel. Image credit: Eric Veach

  6. (b) Metropolis light transport with an average of 100 mutations per pixel [the same computation time as (a)]. Image credit: Eric Veach

  7-9. Overview • For arbitrary f(x) → R, x ∈ Ω • Define I(f) = ∫_Ω f(x) dΩ, so f_pdf = f / I(f) • Generates samples X = {x_i} with x_i ∼ f_pdf • Without needing to compute f_pdf or I(f)

  10. Overview • Introduction to Metropolis sampling • Examples with 1D problems • Extension to 3D, motion blur • Overview of Metropolis Light Transport

  11-12. Basic Algorithm • Function f(x) over state space Ω, f : Ω → R • Markov chain: generate a new sample x_i using x_{i-1} • New samples come from a mutation of x_{i-1} into a proposal x' • Mutations are accepted or rejected such that x_i ∼ f_pdf • If rejected, x_i = x_{i-1} • The acceptance rule guarantees that the stationary distribution of the x_i is f_pdf

  13. Pseudo-code

      x = x0
      for i = 1 to n
          x' = mutate(x)
          a = accept(x, x')
          if (random() < a)
              x = x'
          record(x)
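
A minimal runnable sketch of this loop in Python, using the 1D example f1(x) = (x - 0.5)^2 and mutation #1 that appear later in the deck; the starting state and sample count are illustrative:

    import random

    def f(x):
        # Target from the 1D example below: f1(x) = (x - 0.5)^2 on [0, 1].
        return (x - 0.5) ** 2 if 0.0 <= x <= 1.0 else 0.0

    def mutate1(x):
        # Mutation #1 from the 1D example: propose a uniform random point.
        return random.random()

    def accept(x, x_new):
        # Symmetric T, so a(x -> x') = min(1, f(x') / f(x)).
        return 1.0 if f(x) == 0.0 else min(1.0, f(x_new) / f(x))

    samples = []
    x = 0.25  # x0, an arbitrary starting state
    for i in range(10000):
        x_new = mutate1(x)
        if random.random() < accept(x, x_new):
            x = x_new
        samples.append(x)  # record(x)

A histogram of samples approaches f_pdf as the number of iterations grows.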

  14. Expected Values • Metropolis avoids parts of Ω where f(x) is small • But, e.g., dim parts of an image still need samples • Solution: record samples at both x and x' • Samples are weighted based on a(x → x') • Gives the same result in the limit

  15. Expected Values – Pseudo-code

      x = x0
      for i = 1 to n
          x' = mutate(x)
          a = accept(x, x')
          record(x, (1-a) * weight)
          record(x', a * weight)
          if (random() < a)
              x = x'
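
A sketch of the expected-values variant, reusing f, mutate1, and accept from the earlier sketch; weight is taken as 1 here, so rejected proposals still contribute:

    weighted_samples = []  # (sample, weight) pairs
    x = 0.25
    for i in range(10000):
        x_new = mutate1(x)
        a = accept(x, x_new)
        weighted_samples.append((x, 1.0 - a))  # record(x, (1-a) * weight)
        weighted_samples.append((x_new, a))    # record(x', a * weight)
        if random.random() < a:
            x = x_new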

  16. Mutations, Transitions, Acceptance • Mutations propose a new state x' given x_i • T(x → x') is the probability density of proposing x' from x • a(x → x') is the probability of accepting the transition

  17. Detailed Balance – The Key • By defining a(x → x') carefully, we can ensure x_i ∼ f_pdf • Detailed balance condition: f(x) T(x → x') a(x → x') = f(x') T(x' → x) a(x' → x) • Since f and T are given, this constrains the acceptance probability • (Derivation not shown here)

  18. Acceptance Probability • Efficient choice: a(x → x') = min{ 1, [f(x') T(x' → x)] / [f(x) T(x → x')] }
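
This rule transcribes directly to code; a sketch assuming f and a transition density T(x, x_new) are available as Python functions:

    def accept_general(x, x_new, f, T):
        # a(x -> x') = min(1, f(x') T(x' -> x) / (f(x) T(x -> x')))
        denom = f(x) * T(x, x_new)
        if denom == 0.0:
            return 1.0  # always move off a zero-density state
        return min(1.0, f(x_new) * T(x_new, x) / denom)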

  19. Acceptance Probability – Goals • The choice of a doesn't affect unbiasedness, just variance • Maximizing the acceptance probability: – explores the state space better – reduces correlation between samples (image artifacts...) • Want transitions that are likely to be accepted – i.e. transitions that head where f(x) is large

  20. Mutations: Metropolis • If T(a → b) = T(b → a) for all a, b, the acceptance probability simplifies: a(x → x') = min{ 1, f(x') / f(x) } • Random walk Metropolis: T(x → x') = T(|x − x'|)

  21. Mutations: Independence Sampler • If we have some pdf p, we can sample x' ∼ p • Straightforward transition function: T(x → x') = p(x') • If p(x) = f_pdf, we wouldn't need Metropolis at all • But we can use pdfs that approximate parts of f...
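
A sketch of an independence sampler, reusing f from the earlier sketch; the approximating pdf p(x) = 2x is a hypothetical stand-in:

    import random

    def p_pdf(x):
        # Hypothetical approximating pdf: p(x) = 2x on [0, 1].
        return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

    def sample_p():
        # Inverse-CDF sampling: the CDF is x^2, so x = sqrt(xi).
        return random.random() ** 0.5

    def accept_independence(x, x_new):
        # T(x -> x') = p(x'), so a = min(1, f(x') p(x) / (f(x) p(x'))).
        denom = f(x) * p_pdf(x_new)
        if denom == 0.0:
            return 1.0
        return min(1.0, f(x_new) * p_pdf(x) / denom)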

  22. Mutation Strategies: General • Adaptive methods: vary the transition based on experience so far • Flexibility: mutations can be based on the value of f(x), etc., pretty freely • Remember: we just need to be able to compute the transition densities for each mutation • The more mutation strategies, the merrier • Their relative frequency is not so important

  23. 1D Example • Consider the function f1(x) = (x − 0.5)² for 0 ≤ x ≤ 1, and f1(x) = 0 otherwise • Want to generate samples from f1(x)

  24. 1D Mutation #1 • mutate1(x) → ξ (a uniform random number in [0, 1)), so T1(x → x') = 1 • Simplest mutation possible • Random walk Metropolis

  25. 1D Mutation #2 • mutate2(x) → x + 0.1 · (ξ − 0.5), so T2(x → x') = 1/0.1 if |x − x'| ≤ 0.05, and 0 otherwise • Also random walk Metropolis
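
A one-line sketch of mutate2 in Python, matching the definition above:

    def mutate2(x):
        # Uniform offset of total width 0.1 centered on x; T2 is symmetric,
        # so the simple min(1, f(x')/f(x)) acceptance still applies.
        return x + 0.1 * (random.random() - 0.5)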

  26. 1D Mutation #2 • mutate2 increases the acceptance probability a(x → x') = min{ 1, [f(x') T(x' → x)] / [f(x) T(x → x')] } • When f(x) is large, mutations to x' with f(x') < f(x) tend to be rejected • Small perturbations avoid proposing such x'

  27. 1D Results – pdf graphs • [Two plots of the empirical pdf over x ∈ [0, 1]] • Left: mutate1 only • Right: a mix of the two (10% / 90%) • 10,000 mutations total

  28. Why bother with mutate1, then? • If we just use the second mutation (±0.05)... • [Plot of the resulting empirical pdf over x ∈ [0, 1]]

  29. Ergodicity • Need a non-zero probability of sampling any x with f(x) > 0 • This is true with mutate2, but is inefficient: • [Plot of the resulting empirical pdf over x ∈ [0, 1]] • Still unbiased in the limit...

  30. Ergodicity – Easy Solution • Periodically pick an entirely new x • e.g. sample uniformly over Ω ...
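
A sketch of this fix, combining the two mutations from the 1D example with the 10%/90% split used in the earlier results slide:

    def mutate_mixed(x):
        # 10% uniform restarts (mutate1) keep the chain ergodic and let it
        # escape one lobe; 90% local perturbations (mutate2) keep the
        # acceptance probability high.
        return mutate1(x) if random.random() < 0.1 else mutate2(x)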

  31-33. Application to Integration • Given an integral ∫_Ω f(x) g(x) dΩ • Standard Monte Carlo estimator: ∫_Ω f(x) g(x) dΩ ≈ (1/N) Σ_{i=1..N} f(x_i) g(x_i) / p(x_i), where x_i ∼ p(x), an arbitrary pdf • Metropolis gives x_1, ..., x_N with x_i ∼ f_pdf(x), so ∫_Ω f(x) g(x) dΩ ≈ [ (1/N) Σ_{i=1..N} g(x_i) ] · I(f) • (Recall I(f) = ∫_Ω f(x) dΩ)
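
A sketch of this estimator, reusing the samples list from the first sketch; g is an illustrative second factor, and I(f) is known in closed form for the 1D example:

    import math

    def g(x):
        # Illustrative second factor in the integrand.
        return math.sin(math.pi * x)

    # For f(x) = (x - 0.5)^2 on [0, 1], I(f) = 1/12 in closed form.
    I_f = 1.0 / 12.0
    estimate = I_f * sum(g(xi) for xi in samples) / len(samples)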

  34. Start-Up Bias • The distribution of x_i converges to f_pdf, but never actually reaches it • Especially in early iterations, the distribution of x_i can be quite different from f_pdf • This problem is called start-up bias

  35. Eliminating Start-Up Bias • Start from any x0 and run a number of “dummy” iterations without recording the results – But how many “dummy” iterations are enough? – x_i is only distributed as f_pdf as i → ∞ anyway – Not a good solution

  36. Eliminating Start-Up Bias • Generate x0 from any suitable pdf p(x) • Weight all contributions by w = f(x0) / p(x0) – Unbiased on average (over many runs) – But each individual run may be completely “off” (even a black image, if f(x0) was zero) – Not a good solution

  37. Eliminating Start-Up Bias • Generate n initial samples x_{0,1}, ..., x_{0,n} from any suitable pdf p(x) • Compute weights w_i = f(x_{0,i}) / p(x_{0,i}) • Pick the initial state with probability proportional to w_i • Weight all contributions by w̄ = (1/n) Σ_{i=1..n} w_i • Note that E[w̄] = E[ (1/n) Σ_i f(x_{0,i}) / p(x_{0,i}) ] = ∫_Ω f(x) dx
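
A sketch of this initialization for the earlier 1D example, taking p(x) = 1 (uniform on [0, 1]); the candidate count n = 64 is illustrative:

    import random

    n = 64
    candidates = [random.random() for _ in range(n)]     # x_{0,i} ~ p
    weights = [f(xc) / 1.0 for xc in candidates]         # w_i = f(x_{0,i}) / p(x_{0,i})
    w_bar = sum(weights) / n                             # scales all later contributions
    x0 = random.choices(candidates, weights=weights)[0]  # pick start proportional to w_i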

  38. Motion Blur • Onward to a 3D problem • Scene radiance function L(u, v, t) (e.g. evaluated with ray tracing) • L = 0 outside the image boundary • Ω is (u, v, t) ∈ [0, u_max] × [0, v_max] × [0, 1]

  39. Image Contribution Function • The key to applying Metropolis to image synthesis: I_j = ∫_Ω h_j(u, v) L(u, v, t) du dv dt • I_j is the value of the j'th pixel • h_j is the pixel reconstruction filter

  40. Image Contribution Function • So if we sample x_i ∼ L_pdf: I_j ≈ [ (1/N) Σ_{i=1..N} h_j(x_i) ] · ∫_Ω L(x) dΩ • The distribution of the x_i on the image plane forms the image • Estimate ∫_Ω L(x) dΩ with standard MC

  41. Two Basic Mutations • Pick a completely new (u, v, t) value • Perturb u and v by ±8 pixels, and t by ±0.01 • Both are symmetric: random walk Metropolis
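
A sketch of these two mutations, with hypothetical image dimensions u_max and v_max; the radiance L itself would come from a ray tracer:

    import random

    u_max, v_max = 640.0, 480.0  # hypothetical image dimensions in pixels

    def mutate_restart():
        # Mutation 1: pick a completely new (u, v, t).
        return (random.uniform(0.0, u_max),
                random.uniform(0.0, v_max),
                random.random())

    def mutate_perturb(u, v, t):
        # Mutation 2: perturb u, v by up to +/-8 pixels, t by up to +/-0.01.
        return (u + random.uniform(-8.0, 8.0),
                v + random.uniform(-8.0, 8.0),
                t + random.uniform(-0.01, 0.01))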

  42. Motion Blur – Result • Left: distribution ray tracing, stratified sampling • Right: Metropolis sampling • Same total number of samples

  43. Motion Blur – Parameter Studies • Left: ±80 pixels, ±0.5 time. Many rejections. • Right: ±0.5 pixels, ±0.001 time. Didn't explore Ω well.

  44. Exponential Distribution • Vary the scale of proposed mutations: r = r_max e^{−log(r_max / r_min) ξ}, θ = 2πξ, (du, dv) = (r sin θ, r cos θ), dt = t_max e^{−log(t_max / t_min) ξ} • Proposals that are too big will be rejected, but a wide variety of scales is still tried
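
A sketch of these proposals; the (r_min, r_max) and (t_min, t_max) bounds are hypothetical, and each ξ is a fresh uniform random number:

    import math, random

    r_min, r_max = 0.5, 80.0     # pixel-offset bounds (hypothetical)
    t_min, t_max = 0.001, 0.5    # time-offset bounds (hypothetical)

    def exponential_offsets():
        # r falls in [r_min, r_max], with small radii exponentially favored.
        r = r_max * math.exp(-math.log(r_max / r_min) * random.random())
        theta = 2.0 * math.pi * random.random()
        du, dv = r * math.sin(theta), r * math.cos(theta)
        dt = t_max * math.exp(-math.log(t_max / t_min) * random.random())
        return du, dv, dt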

  45. Exponential distribution results

  46. Light Transport • The image contribution function was the key • f(x) is defined over an infinite space of paths • The state space is light-carrying paths through the scene, from light source to sensor • Robustness is particularly nice: difficult transport problems are solved efficiently • Few specialized parameters to set

  47. Light Transport – Setting • Samples x from Ω are sequences v0 v1 ... vk, k ≥ 1, of vertices on scene surfaces • [Figure: an example path with vertices x0, x1, x2, x3] • f(x) is the product of emitted light, BRDF values, cosines, etc.
