

Slide 1

Exact Bayesian inference by symbolic disintegration

Chung-chieh Shan (Indiana University)    Norman Ramsey (Tufts University)
POPL, 18 January 2017

Slide 2
  • 1. Probabilistic programs denote distributions
  • 2. Exact inference by transforming terms
Slide 5
  • 1. Probabilistic programs denote distributions
  • 2. Exact inference by transforming terms
Hillary Clinton 71.4% Donald Trump 28.6%
Slide 13

  • 1. Probabilistic programs denote distributions
  • 2. Exact inference by transforming terms

disintegrate: distribution ↦ conditional distribution, given a condition : Bool

Slide 18

  • 1. Probabilistic programs denote distributions
  • 2. Exact inference by transforming terms

disintegrate: distribution ↦ conditional distribution, given a condition : α
Example conditions: dependent variable of a regression; noisy measurement of a location; total momentum of point masses; detected amplitude of a seismic event; …

  • 1. Motivate by puzzle
  • 2. Specify by semantics
  • 3. Implement by derivation
Slide 21

Bayesian probabilistic inference

prior + observation ⟶ posterior

Slide 30

Bayesian probabilistic inference

x := ...; y := ...;    — generative model

Queries: E(x), P(A)

Slide 35

Observation, inference, and query in core Hakaru

m0 = do {x ← uniform 0 1; y ← uniform 0 1; return (x, y)}

Em0(λ(x,y). x) = ∫m0 x d(x, y) / ∫m0 1 d(x, y) = 1/2
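This expectation is easy to spot-check outside Hakaru; a minimal sketch in plain Python (Monte Carlo with a fixed seed, mimicking m0 by sampling the unit square):

```python
import random

random.seed(0)
n = 200_000
# mimic m0: draw (x, y) uniformly from the unit square
draws = [(random.random(), random.random()) for _ in range(n)]
# E_m0(lambda (x, y). x): average the first component
e_x = sum(x for x, _ in draws) / n
print(abs(e_x - 0.5) < 0.01)  # True: close to 1/2
```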

Slide 39

P(A) = E([A])    — probability is the expectation of an indicator

Slide 43

Observation, inference, and query in core Hakaru

m0 = do {x ← uniform 0 1; y ← uniform 0 1; return (x, y)}    — prior
observation: y < 2 · x

m1 = do {x ← uniform 0 1; y ← uniform 0 1;
         observe y < 2 · x;
         return (x, y)}    — posterior

Em1(λ(x,y). x) = ∫m1 x d(x, y) / ∫m1 1 d(x, y)
              = ∫m0 [y < 2 · x] · x d(x, y) / ∫m0 [y < 2 · x] · 1 d(x, y)
              = (11/24) / (3/4) = 11/18
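The value 11/18 can be reproduced by rejection sampling (a plain-Python sketch, not Hakaru; fixed seed):

```python
import random

random.seed(1)
draws = [(random.random(), random.random()) for _ in range(400_000)]
# condition on the observation y < 2*x by rejecting all other samples
accepted = [x for x, y in draws if y < 2 * x]
e_x = sum(accepted) / len(accepted)
print(abs(e_x - 11 / 18) < 0.01)  # True: close to 11/18 ≈ 0.611
```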

Slide 51

Observation, inference, and query in core Hakaru

m0 = do {x ← uniform 0 1; y ← uniform 0 1; return (x, y)}
observation: y = 2 · x    — ambiguous!

m2 = do {x ← uniform 0 1; y ← uniform 0 1;
         observe y = 2 · x;
         return (x, y)}

Em2(λ(x,y). x) = ∫m2 x d(x, y) / ∫m2 1 d(x, y)
              = ∫m0 [y = 2 · x] · x d(x, y) / ∫m0 [y = 2 · x] · 1 d(x, y)
              = 0 / 0
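Rejection sampling shows why this observation is problematic: the event y = 2·x has probability zero, so (with these pseudo-random draws) no sample is ever accepted. A plain-Python sketch:

```python
import random

random.seed(5)
hits = 0
for _ in range(1_000_000):
    x, y = random.random(), random.random()
    if y == 2 * x:   # exact equality: a measure-zero event
        hits += 1
print(hits)  # 0 accepted samples out of a million
```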

Slide 56

Observation of measure-zero sets is paradoxical

Conditioning on the line y = 2 · x in two seemingly equivalent ways gives
E(x) = 1/4 in one case and E(x) = 1/3 in the other.
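The two answers come from two different ways of thickening the line y = 2·x into a small band and letting it shrink — the Borel–Kolmogorov phenomenon. A Monte Carlo sketch (fixed seed; the band width eps is an arbitrary small choice):

```python
import random

random.seed(2)
draws = [(random.random(), random.random()) for _ in range(1_000_000)]
eps = 0.01

# thicken the line as |y - 2x| < eps: E(x) tends to 1/4
band1 = [x for x, y in draws if abs(y - 2 * x) < eps]
# thicken the same line as |y/x - 2| < eps: E(x) tends to 1/3
band2 = [x for x, y in draws if x > 0 and abs(y / x - 2) < eps]

print(abs(sum(band1) / len(band1) - 1 / 4) < 0.03)  # True
print(abs(sum(band2) / len(band2) - 1 / 3) < 0.03)  # True
```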

slide-57
SLIDE 57 6

Observation of measure-zero sets is paradoxical

1 1 1 1 y = 2 · x y = 2 · x 1 1 1 1 E(x) = 1/4 E(x) = 1/3

Slide 58

Resolving the paradox via disintegration

Write the observation as an expression to disintegrate along:
observing y − 2 · x @ 0 yields E(x) = 1/4; observing y/x @ 2 yields E(x) = 1/3.

Slide 63

Resolving the paradox via disintegration

disintegrate: prior ↦ posterior
Soundness: if the disintegrator succeeds, then the result is correct.

  • 1. Motivate by puzzle
  • 2. Specify by semantics
  • 3. Implement by derivation
Slide 72

Specifying disintegration by semantics

ξ = do {a ← µ;      — µ : M α
        b ← κ a;    — κ a : M β
        return (a, b)}    : M (α × β)

ξ = µ ⊗ κ

Slide 79

α = R    β = R × R

ξ = do {x ← uniform 0 1; y ← uniform 0 1;
        let a = y − 2 · x;
        return (a, (x, y))}    : M (α × β)

prior = do {x ← uniform 0 1; y ← uniform 0 1; return (x, y)}    : M β
observation = y − 2 · x    : α

µ = lebesgue    : M α
κ a = do {x ← uniform 0 1;
          observe 0 < a + 2 · x < 1;
          return (x, a + 2 · x)}    : M β

κ 0 = do {x ← uniform 0 1;
          observe 0 < 0 + 2 · x < 1;
          return (x, 0 + 2 · x)}    : M β
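As a sanity check (plain Python mimicking κ 0 by rejection, not Hakaru itself): the observe succeeds exactly when x < 1/2, so κ 0 reproduces E(x) = 1/4, one of the two answers from the paradox.

```python
import random

random.seed(3)
xs = []
while len(xs) < 100_000:
    x = random.random()
    if 0 < 0 + 2 * x < 1:   # the observe in kappa 0 succeeds iff x < 1/2
        xs.append(x)        # the returned pair would be (x, 0 + 2*x)
e_x = sum(xs) / len(xs)
print(abs(e_x - 0.25) < 0.01)  # True: close to 1/4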

Slide 80

α = R    β = R × R

ξ = do {x ← uniform 0 1; y ← uniform 0 1;
        let a = y/x;
        return (a, (x, y))}    : M (α × β)

prior = do {x ← uniform 0 1; y ← uniform 0 1; return (x, y)}    : M β
observation = y/x    : α

µ = lebesgue    : M α
κ a = do {x ← uniform 0 1;
          observe 0 < a · x < 1;
          factor x;
          return (x, a · x)}    : M β

κ 2 = do {x ← uniform 0 1;
          observe 0 < 2 · x < 1;
          factor x;
          return (x, 2 · x)}    : M β
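Here the factor x is essential (it is the Jacobian weight for observing y/x rather than y − 2·x). A plain-Python sketch using importance weights shows κ 2 yields E(x) = 1/3, the other answer from the paradox:

```python
import random

random.seed(4)
num = den = 0.0
for _ in range(200_000):
    x = random.random()
    if 0 < 2 * x < 1:   # the observe in kappa 2 succeeds iff x < 1/2
        w = x            # factor x: weight this trace by x
        num += w * x     # weighted sum for E(x)
        den += w         # total weight
e_x = num / den
print(abs(e_x - 1 / 3) < 0.01)  # True: close to 1/3
```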

Slide 81

Measure semantics of do {a ← …; b ← …; return (a, b)}

⋆ Compositional denotation! ⋆    ⋆ Equational reasoning! ⋆    ⋆ Integrator formulation! ⋆

Slide 82

Integrator semantics

M α = (α → R) → R    — a measure is an integrator, applied to an integrand f

uniform 0 1   = λf. ∫₀¹ f(x) dx
lebesgue      = λf. ∫₋∞^∞ f(x) dx
return (x, y) = λf. f(x, y)
do {x ← m; M} = λf. m(λx. M f)

do {x ← uniform 0 1; y ← uniform 0 1; return (x, y)}
  = λf. ∫₀¹ ∫₀¹ f(x, y) dy dx
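The integrator semantics runs as ordinary higher-order code. A minimal Python sketch (uniform 0 1 is approximated by a midpoint Riemann sum; the grid size is an arbitrary choice):

```python
def uniform01(f, n=200):
    # integrator for uniform 0 1: midpoint rule on [0, 1]
    return sum(f((i + 0.5) / n) for i in range(n)) / n

def ret(v):
    # [[return v]] = lambda f. f(v)
    return lambda f: f(v)

def bind(m, k):
    # [[do {x <- m; M}]] = lambda f. m(lambda x. [[M]] f)
    return lambda f: m(lambda x: k(x)(f))

# m0 = do {x <- uniform 0 1; y <- uniform 0 1; return (x, y)}
m0 = bind(uniform01, lambda x: bind(uniform01, lambda y: ret((x, y))))

print(round(m0(lambda p: p[0]), 6))  # 0.5: the double integral of x over the unit square
print(round(m0(lambda p: 1.0), 6))   # 1.0: total mass
```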

Slide 86

Reviewer quotes: ★“fantastic introduction!”★ ★“self contained!”★ ★“a pleasure to read!”★ ★“gentle!”★ ★“loved reading!”★ ★“beautifully explained!”★ ★“very polished!”★ ★“easy to follow!”★ ★“deft!”★ ★“best written of the last 30 papers I have read!”★

“PLDI readers without lots of background in probability theory should be able to follow; this is impressive”

Slide 91

  • 1. Probabilistic programs denote distributions
  • 2. Exact inference by transforming terms

do {a ← …; b ← …; return (a, b)}

  • 1. Motivate by puzzle
  • 2. Specify by semantics
  • 3. Implement by derivation
Slide 93

When it works

◮ y − 2 · x, y/x, max(x, y), …
◮ multivariate Gaussian distributions (for regression and dynamics)
◮ mixtures of distributions (for classifying points and documents)
◮ seismic event detection (Arora et al.)
◮ point masses’ total momentum (Afshar et al.)

do {x ← · · · ; y ← · · · ; z ← · · · ; return (f(x, y, z), . . . )}    — with f invertible

Slide 99

Where it helps

prior —disintegrate→ posterior —inference procedure→ …
inference procedures: maximum likelihood, Markov chain Monte Carlo, …

disintegrate (µ = lebesgue, arrays, …)

Slide 111

Induction hypothesis for automatic disintegrator

Specialize: α = R, µ = lebesgue.
Generalize: observable action m, heap h, final action M.
Define the continuation M̄ = λh′. do {h′; M}.

do {h; a ← m; M}  =  do {a ← lebesgue; ⊳ m a M̄ h}

Implement ⊳ by equational reasoning from this specification.
Case analysis on m:

Slide 112

Goal: do {h; a ← m; M} = do {a ← lebesgue; ⊳ m a M̄ h}

Case m = uniform 0 1:

  do {h; a ← uniform 0 1; M}
=   { probability density of m (Bhat et al.) }
  do {h; a ← lebesgue; factor [0 < a < 1]; M}
=   { exchange integrals using Tonelli’s theorem }
  do {a ← lebesgue; factor [0 < a < 1]; h; M}
=   { beta; recall M̄ = λh′. do {h′; M} }
  do {a ← lebesgue; factor [0 < a < 1]; M̄ h}

So define  ⊳ (uniform 0 1) a c h = do {factor [0 < a < 1]; c h}

Similarly for other primitive continuous distributions. The disintegration fused into most inference methods ends here.
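The first rewrite — uniform 0 1 equals lebesgue weighted by its density — can be checked in the integrator reading (a plain-Python sketch; lebesgue is truncated to a finite window covering the support, an approximation):

```python
def uniform01(f, n=4000):
    # integrator for uniform 0 1 (midpoint rule)
    return sum(f((i + 0.5) / n) for i in range(n)) / n

def lebesgue_with_factor(f, lo=-2.0, hi=3.0, n=20_000):
    # do {a <- lebesgue; factor [0 < a < 1]; return a} as an integrator:
    # integrate f times the indicator over a window containing [0, 1]
    w = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * w) * w
               for i in range(n) if 0 < lo + (i + 0.5) * w < 1)

checks = [lambda a: a, lambda a: a * a, lambda a: 1.0]
print(all(abs(uniform01(f) - lebesgue_with_factor(f)) < 1e-3 for f in checks))  # True
```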

Slide 116

Goal: do {h; a ← m; M} = do {a ← lebesgue; ⊳ m a M̄ h}

Case m = return x:  look up x in h = (h1; x ← m; h2)

  do {h1; x ← m; h2; a ← return x; M}
=   { monad laws, beta, alpha }
  do {h1; a ← m; let x = a; h2; M}
=   { induction hypothesis }
  do {a ← lebesgue; ⊳ m a (λh′. do {h′; let x = a; h2; M}) h1}
=   { beta; recall M̄ = λh′. do {h′; M} }
  do {a ← lebesgue; ⊳ m a (λh′. M̄ (h′; let x = a; h2)) h1}

So define  ⊳ (return x) a c (h1; x ← m; h2) = ⊳ m a (λh′. c (h′; let x = a; h2)) h1

The continuation memoizes.

Slide 119

Goal: do {h; a ← m; M} = do {a ← lebesgue; ⊳ m a M̄ h}

Case m = return (−e):

  do {h; a ← return (−e); M}
=   { monad laws, beta }
  do {h; b ← return e; let a = −b; M}
=   { induction hypothesis … }
  do {b ← lebesgue; let a = −b; ⊳ (return e) b M̄ h}
=   { change integration variable from b to a }
  do {a ← lebesgue; let b = −a; ⊳ (return e) b M̄ h}
=   { “parametricity” of ⊳ … }
  do {a ← lebesgue; ⊳ (return e) (−a) M̄ h}

So define  ⊳ (return (−e)) a c h = ⊳ (return e) (−a) c h

Similarly for other invertible functions: log x, y − 2 · x.
