Exact Bayesian inference by symbolic disintegration

Exact Bayesian inference by symbolic disintegration, by Chung-chieh Shan - PowerPoint PPT Presentation

Exact Bayesian inference by symbolic disintegration. Chung-chieh Shan (Indiana University) and Norman Ramsey (Tufts University). POPL, 18 January 2017.

Overview: 1. Probabilistic programs denote distributions. 2. Exact inference by transforming terms.


1-3. Observation, inference, and query in core Hakaru

The prior over the unit square, and the same model with an observation added:

    m0 = do { x ← uniform 0 1;
              y ← uniform 0 1;
              return (x, y) }

    m2 = do { x ← uniform 0 1;
              y ← uniform 0 1;
              observe y = 2·x;
              return (x, y) }

The observation y = 2·x picks out a line in the unit square. Written as a ratio of integrals over the prior m0, the query for the expected x is

    E_m2 (λ(x, y). x) = ∫_m2 x d(x, y) / ∫_m2 1 d(x, y)
                      = ∫_m0 [y = 2·x] · x d(x, y) / ∫_m0 [y = 2·x] · 1 d(x, y) = 0 / 0

which is ambiguous: the line has measure zero, so both integrals vanish.
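To see the ambiguity concretely: an exact-equality observation is never satisfied on a numeric grid, so both integrals in the ratio come out zero. A tiny illustration in Haskell (my sketch, not the talk's code; midpoint rule, and the grid size n is an arbitrary choice):

    module Main where

    -- grid resolution for the midpoint rule
    n :: Int
    n = 1000

    -- midpoint-rule double integral of f over the unit square
    integ2 :: (Double -> Double -> Double) -> Double
    integ2 f = h * h * sum [ f (mid i) (mid j) | i <- [0 .. n-1], j <- [0 .. n-1] ]
      where
        h = 1 / fromIntegral n
        mid k = (fromIntegral k + 0.5) * h

    indicator :: Bool -> Double
    indicator b = if b then 1 else 0

    main :: IO ()
    main = do
      -- the grid never lands exactly on the line y = 2*x, mirroring
      -- the fact that the event has probability zero:
      print (integ2 (\x y -> indicator (y == 2*x) * x))  -- numerator:   0.0
      print (integ2 (\x y -> indicator (y == 2*x)))      -- denominator: 0.0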

4-9. Observation of measure-zero sets is paradoxical

[Figure: the unit square with the line y = 2·x. Conditioning on the same measure-zero event in two different forms yields two different answers: E(x) = 1/4 in one reading and E(x) = 1/3 in the other.]
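The two expectations can be reproduced by thickening the observation to a band of width ε and conditioning on the band, one thickening per reading of the event. This sketch (mine, in Haskell; the helpers are repeated from the previous sketch so this block runs on its own) prints roughly 1/4 and 1/3:

    module Main where

    n :: Int
    n = 2000

    -- midpoint-rule double integral over the unit square
    integ2 :: (Double -> Double -> Double) -> Double
    integ2 f = h * h * sum [ f (mid i) (mid j) | i <- [0 .. n-1], j <- [0 .. n-1] ]
      where
        h = 1 / fromIntegral n
        mid k = (fromIntegral k + 0.5) * h

    indicator :: Bool -> Double
    indicator b = if b then 1 else 0

    -- conditional expectation of x given an event, as a ratio of integrals
    condExp :: (Double -> Double -> Bool) -> Double
    condExp ev = integ2 (\x y -> indicator (ev x y) * x)
               / integ2 (\x y -> indicator (ev x y))

    main :: IO ()
    main = do
      let eps = 0.01
      -- two thickenings of the same line y = 2*x:
      print (condExp (\x y -> abs (y - 2*x) < eps))    -- approx 1/4
      print (condExp (\x y -> abs (y / x - 2) < eps))  -- approx 1/3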

10-11. Resolving the paradox via disintegration

The two answers correspond to two different ways of observing the same line: observing y − 2·x at the value 0 yields E(x) = 1/4, while observing y / x at the value 2 yields E(x) = 1/3.

12-15. Resolving the paradox via disintegration

[Diagram: disintegrate maps a prior to a posterior.]

Soundness: if the disintegrator succeeds, then the result is correct. Plan: 1. Motivate by puzzle. 2. Specify by semantics. 3. Implement by derivation.

16-25. Specifying disintegration by semantics

The input is a joint measure ξ : M (α × β). The disintegrator produces a measure μ : M α together with a kernel κ : α → M β such that ξ = μ ⊗ κ; equivalently,

    ξ = do { a ← μ;
             b ← κ a;
             return (a, b) }

Applying the kernel to an observed value a gives the conditional measure κ a : M β.
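The product ⊗ is the usual one from measure theory; spelled out as an equality of integrators (standard definition, notation mine, in LaTeX):

    % xi = mu \otimes kappa, stated as an equality of integrators:
    % for every integrand f : alpha x beta -> R,
    \[
      \int f \,\mathrm{d}\xi
      \;=\;
      \int_{\alpha} \left( \int_{\beta} f(a,b)\, (\kappa\,a)(\mathrm{d}b) \right) \mu(\mathrm{d}a).
    \]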

26-32. A worked disintegration

Instantiate α = R and β = R × R, with the prior

    prior : M β
    prior = do { x ← uniform 0 1;
                 y ← uniform 0 1;
                 return (x, y) }

For the observation y − 2·x : α, pair the observed value with the outcome to form the joint measure

    ξ : M (α × β)
    ξ = do { x ← uniform 0 1;
             y ← uniform 0 1;
             let a = y − 2·x;
             return (a, (x, y)) }

Disintegration yields μ = lebesgue and the kernel

    κ a = do { x ← uniform 0 1;
               observe 0 < a + 2·x < 1;
               return (x, a + 2·x) }

so the posterior at the observed value 0 is

    κ 0 = do { x ← uniform 0 1;
               observe 0 < 2·x < 1;
               return (x, 2·x) }

For the observation y / x, the same construction (let a = y / x) yields

    κ a = do { x ← uniform 0 1;
               observe 0 < a·x < 1;
               factor x;
               return (x, a·x) }

whose instance at the observed value 2 is

    κ 2 = do { x ← uniform 0 1;
               observe 0 < 2·x < 1;
               factor x;
               return (x, 2·x) }

The extra factor x is what makes E(x) come out 1/3 rather than 1/4.
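A numeric sanity check of ξ = μ ⊗ κ for the y − 2·x example, in Haskell (my construction, not the talk's code): Lebesgue measure is truncated to (−2, 1), which loses nothing here because a = y − 2·x always lies in that interval. Both integrators should agree on any integrand, up to grid error:

    module Main where

    n :: Int
    n = 2000

    -- midpoint-rule integral of g over the interval (lo, hi)
    integ :: Double -> Double -> (Double -> Double) -> Double
    integ lo hi g =
      h * sum [ g (lo + (fromIntegral i + 0.5) * h) | i <- [0 .. n-1] ]
      where h = (hi - lo) / fromIntegral n

    -- the joint: do { x <- uniform 0 1; y <- uniform 0 1;
    --                 let a = y - 2*x; return (a, (x, y)) }
    xi :: (Double -> (Double, Double) -> Double) -> Double
    xi f = integ 0 1 (\x -> integ 0 1 (\y -> f (y - 2*x) (x, y)))

    -- mu = lebesgue (truncated to (-2,1)); kappa a = do { x <- uniform 0 1;
    --   observe (0 < a + 2*x < 1); return (x, a + 2*x) }
    muKappa :: (Double -> (Double, Double) -> Double) -> Double
    muKappa f =
      integ (-2) 1 (\a ->
        integ 0 1 (\x ->
          let y = a + 2*x
          in if 0 < y && y < 1 then f a (x, y) else 0))

    main :: IO ()
    main = do
      -- test integrand f(a, (x, y)) = x * |a|; the two printed numbers
      -- should match up to discretization error
      print (xi      (\a (x, _) -> x * abs a))
      print (muKappa (\a (x, _) -> x * abs a))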

33. Measure semantics

★ Compositional denotation! ★ Equational reasoning! ★ Integrator formulation!

34-36. Integrator semantics

A measure is denoted by its integrator, a functional that consumes an integrand f:

    ⟦M α⟧ = (⟦α⟧ → R) → R

    ⟦uniform 0 1⟧     = λf. ∫₀¹ f(x) dx
    ⟦lebesgue⟧        = λf. ∫_−∞^∞ f(x) dx
    ⟦return (x, y)⟧   = λf. f(x, y)
    ⟦do { x ← m; M }⟧ = λf. ⟦m⟧ (λx. ⟦M⟧ f)

For example,

    ⟦do { x ← uniform 0 1; y ← uniform 0 1; return (x, y) }⟧ = λf. ∫₀¹ ∫₀¹ f(x, y) dy dx
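These equations transcribe almost directly into code. A minimal Haskell sketch (my transcription; integrals are approximated by a midpoint rule, whereas Hakaru itself manipulates the integrals symbolically):

    module Main where

    -- a measure is its integrator: it consumes an integrand, returns a number
    newtype Measure a = Measure { integrate :: (a -> Double) -> Double }

    -- [[ return e ]] = \f -> f e
    ret :: a -> Measure a
    ret x = Measure (\f -> f x)

    -- [[ do { x <- m; M } ]] = \f -> [[m]] (\x -> [[M]] f)
    bind :: Measure a -> (a -> Measure b) -> Measure b
    bind m k = Measure (\f -> integrate m (\x -> integrate (k x) f))

    -- [[ uniform 0 1 ]] = \f -> integral over (0,1) of f, via midpoint rule
    uniform01 :: Measure Double
    uniform01 = Measure (\f -> h * sum [ f ((fromIntegral i + 0.5) * h)
                                       | i <- [0 .. n-1] ])
      where
        n = 1000 :: Int
        h = 1 / fromIntegral n

    -- the prior from the talk: two independent uniforms on (0,1)
    prior :: Measure (Double, Double)
    prior = uniform01 `bind` \x ->
            uniform01 `bind` \y ->
            ret (x, y)

    main :: IO ()
    main = print (integrate prior fst)  -- E(x) under the prior: approx 0.5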

37-38. Reviewer comments

★ “fantastic introduction!” ★ “a pleasure to read!” ★ “very polished!” ★ “loved reading!” ★ “best written of the last 30 papers I have read!” ★ “deft!” ★ “self contained!” ★ “gentle!” ★ “easy to follow!” ★ “beautifully explained!”

“PLDI readers without lots of background in probability theory should be able to follow; this is impressive.”

39-43. Recap

The disintegrator works by transforming terms into the shape

    do { a ← μ;
         b ← κ a;
         return (a, b) }

1. Probabilistic programs denote distributions. 2. Exact inference by transforming terms. Plan: 1. Motivate by puzzle. 2. Specify by semantics. 3. Implement by derivation.

44-45. When it works

The disintegrator succeeds on programs of the shape

    do { x ← ···;
         y ← ···;
         z ← ···;
         return (f(x, y, z), ...) }

where f is invertible (a change-of-variables derivation is sketched after this list):

◮ observed expressions such as y − 2·x, y / x, max(x, y), ...
◮ multivariate Gaussian distributions (for regression and dynamics)
◮ mixtures of distributions (for classifying points and documents)
◮ seismic event detection (Arora et al.)
◮ point masses’ total momentum (Afshar et al.)
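Invertibility is what lets the disintegrator solve the observed expression for one random variable and pay the change-of-variables price. For the observation t = y / x from the worked example, a short derivation in LaTeX (mine, standard calculus) of where factor x comes from:

    % Holding x fixed, substitute y = t * x in the integral over y:
    \[
      \int_0^1 f(y)\,\mathrm{d}y
      \;=\;
      \int_{0 < t x < 1} f(t\,x)\,
        \underbrace{\left|\tfrac{\partial y}{\partial t}\right|}_{=\,x}
      \,\mathrm{d}t ,
    \]
    % so each x is reweighted by the Jacobian x, which is exactly
    % the "factor x" in the kernel kappa.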

46-51. Where it helps

disintegrate maps a prior to a posterior. The posterior can then be handed to an inference procedure: maximum likelihood, Markov chain Monte Carlo, ... Disintegration can also be applied again along the way (with μ ≠ lebesgue, with arrays, ...).

52. Summary

1. Probabilistic programs denote distributions. 2. Exact inference by transforming terms.

The observed value of type α may be the dependent variable of a regression, a noisy measurement of a location, the total momentum of point masses, the detected amplitude of a seismic event, or some other condition; given an observed value (71.4 in the slide's example), disintegrate turns a distribution into a conditional distribution.

1. Motivate by puzzle. 2. Specify by semantics. 3. Implement by derivation.
