Exact Bayesian inference by symbolic disintegration
Chung-chieh Shan Indiana University Norman Ramsey Tufts University POPL, 18 January 2017
1. Probabilistic programs denote distributions
2. Exact inference by transforming terms
disintegrate : distribution → conditional distribution, given a condition : α

The condition may be Boolean or real-valued (α = Bool, α = R, …). Examples of conditions:
the dependent variable of a regression, a noisy measurement of a location,
the total momentum, the detected amplitude of a seismic event, …
Pairing the condition with an observed value (such as 71.4) selects one conditional distribution.
Bayesian probabilistic inference

A generative model (the prior) is a program: x := ...; y := ...
Conditioning it on observations yields the posterior, which we then query,
for example for an expectation E(x) or a probability P(A).
Observation, inference, and query in core Hakaru

m0 = do {x ← uniform 0 1; y ← uniform 0 1; return (x, y)}

E_m0(λ(x,y). x) = ∫₀¹ ∫₀¹ x dy dx = 1/2

P(A) = E([A])    (probabilities are expectations of indicator functions)
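As a quick numerical sanity check (a Monte Carlo sketch in plain Python, not core Hakaru), sampling m0 and averaging the first component approximates E_m0(λ(x,y). x) = 1/2:

```python
import random

def m0():
    # Sample the prior: x and y independent uniform on [0, 1].
    x = random.uniform(0.0, 1.0)
    y = random.uniform(0.0, 1.0)
    return (x, y)

def expectation(m, f, n=200_000):
    # Monte Carlo estimate of E_m(f).
    return sum(f(*m()) for _ in range(n)) / n

random.seed(0)
est = expectation(m0, lambda x, y: x)
print(est)  # ≈ 0.5
```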
Observation, inference, and query in core Hakaru

m0 = do {x ← uniform 0 1; y ← uniform 0 1; return (x, y)}    (prior)

Observe the condition y < 2 · x:

m1 = do {x ← uniform 0 1; y ← uniform 0 1;
         factor [y < 2 · x];
         return (x, y)}                                      (posterior)

E_m1(λ(x,y). x) = ∫ [y < 2 · x] · x d(x, y)  /  ∫ [y < 2 · x] · 1 d(x, y)
                = (11/24) / (3/4) = 11/18
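Conditioning on a measure-positive event like y < 2 · x is unproblematic; a rejection-sampling sketch in plain Python (not Hakaru) recovers E(x) ≈ 11/18:

```python
import random

def sample_m1():
    # Draw from the prior until the observation y < 2·x holds.
    while True:
        x = random.uniform(0.0, 1.0)
        y = random.uniform(0.0, 1.0)
        if y < 2 * x:
            return (x, y)

random.seed(0)
n = 200_000
est = sum(sample_m1()[0] for _ in range(n)) / n
print(est)  # ≈ 11/18 ≈ 0.611
```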
Observation, inference, and query in core Hakaru

Now observe the measure-zero condition y = 2 · x:

m2 = do {x ← uniform 0 1; y ← uniform 0 1;
         factor [y = 2 · x];
         return (x, y)}

E_m2(λ(x,y). x) = ∫ [y = 2 · x] · x d(x, y)  /  ∫ [y = 2 · x] · 1 d(x, y) = 0 / 0

Both integrals are 0, so the query is ambiguous.
Observation of measure-zero sets is paradoxical

Conditioning the same prior on the same line y = 2 · x can yield
E(x) = 1/4 or E(x) = 1/3, depending on how the observation is interpreted.
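The two answers can be exhibited numerically (an illustrative Python sketch, not Hakaru): thicken the measure-zero event y = 2 · x in two different ways before shrinking the thickness eps, and the conditional mean of x heads toward two different limits:

```python
import random

def cond_mean(event, n=1_000_000):
    # E(x | event) by rejection sampling from the uniform prior on the square.
    total = count = 0.0
    for _ in range(n):
        x = random.uniform(0.0, 1.0)
        y = random.uniform(0.0, 1.0)
        if event(x, y):
            total += x
            count += 1
    return total / count

random.seed(0)
eps = 0.01
e1 = cond_mean(lambda x, y: abs(y - 2 * x) < eps)            # strip of constant width
e2 = cond_mean(lambda x, y: x > 0 and abs(y / x - 2) < eps)  # strip of width ∝ x
print(e1, e2)  # ≈ 1/4 and ≈ 1/3
```

The first thickening weights all points of the line equally; the second weights them in proportion to x, which is exactly the discrepancy disintegration makes explicit.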
Resolving the paradox via disintegration

Name the condition as an expression paired with an observed value:
y − 2 · x @ 0 yields E(x) = 1/4, while y/x @ 2 yields E(x) = 1/3.

disintegrate : prior → posterior

Soundness: If the disintegrator succeeds then the result is correct.
Specifying disintegration by semantics

Given ξ : M (α × β), to disintegrate ξ is to find µ : M α and κ : α → M β such that

ξ = µ ⊗ κ = do {a ← µ;        µ : M α
                b ← κ a;      κ a : M β
                return (a, b)}
Instantiate the specification at α = R, β = R × R.

Pair the condition y − 2 · x : α with the prior
prior = do {x ← uniform 0 1; y ← uniform 0 1; return (x, y)} : M β
to obtain
ξ = do {x ← uniform 0 1; y ← uniform 0 1;
        let a = y − 2 · x;
        return (a, (x, y))} : M (α × β)

Disintegrating ξ gives
µ = lebesgue : M α
κ a = do {x ← uniform 0 1; return (x, a + 2 · x)} : M β
so the observed value 0 selects the posterior
κ 0 = do {x ← uniform 0 1; return (x, 0 + 2 · x)} : M β

With the condition y/x : α instead,
ξ = do {x ← uniform 0 1; y ← uniform 0 1; let a = y/x; return (a, (x, y))} : M (α × β)
disintegrates to
µ = lebesgue : M α
κ a = do {x ← uniform 0 1; factor x; return (x, a · x)} : M β
and the observed value 2 selects
κ 2 = do {x ← uniform 0 1; factor x; return (x, 2 · x)} : M β
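A weighted-sampling sketch (plain Python, not Hakaru) confirms that the two kernels resolve the paradox as claimed. One assumption is made explicit here: each kernel is taken together with the range check keeping y = a + 2 · x (respectively y = a · x) inside the prior's support [0, 1], which the programs above leave implicit:

```python
import random

def post_mean(weight, n=500_000):
    # E(x) for do {x ← uniform 0 1; factor weight(x); ...}, by importance weighting.
    num = den = 0.0
    for _ in range(n):
        x = random.uniform(0.0, 1.0)
        w = weight(x)
        num += w * x
        den += w
    return num / den

random.seed(0)
# κ 0 for condition y − 2·x: unit weight where y = 0 + 2·x lands in [0, 1].
e1 = post_mean(lambda x: 1.0 if 2 * x <= 1.0 else 0.0)
# κ 2 for condition y/x: factor x, with the same range check on y = 2·x.
e2 = post_mean(lambda x: x if 2 * x <= 1.0 else 0.0)
print(e1, e2)  # ≈ 1/4 and ≈ 1/3
```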
do {a ← µ; b ← κ a; return (a, b)}

⋆ Compositional denotation! ⋆ Equational reasoning! ⋆ Integrator formulation! ⋆
Integrator semantics

⟦·⟧ : M α → (α → R) → R    (a measure denotes an integrator over integrands)

⟦uniform 0 1⟧   = λf. ∫₀¹ f(x) dx
⟦lebesgue⟧      = λf. ∫₋∞^∞ f(x) dx
⟦return (x, y)⟧ = λf. f(x, y)
⟦do {x ← m; M}⟧ = λf. ⟦m⟧(λx. ⟦M⟧ f)

⟦do {x ← uniform 0 1; y ← uniform 0 1; return (x, y)}⟧ = λf. ∫₀¹ ∫₀¹ f(x, y) dy dx
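The integrator semantics can be executed directly (an illustrative Python sketch): represent a measure as its integrator, a function from an integrand f to a number. Here uniform 0 1 uses a crude midpoint rule; lebesgue is omitted, since it admits no finite numerical quadrature in general:

```python
def uniform01(f, steps=400):
    # ⟦uniform 0 1⟧ = λf. ∫₀¹ f(x) dx, by the midpoint rule.
    h = 1.0 / steps
    return sum(f((i + 0.5) * h) for i in range(steps)) * h

def ret(v):
    # ⟦return v⟧ = λf. f(v)
    return lambda f: f(v)

def bind(m, k):
    # ⟦do {x ← m; M}⟧ = λf. ⟦m⟧(λx. ⟦M⟧ f)
    return lambda f: m(lambda x: k(x)(f))

# ⟦do {x ← uniform 0 1; y ← uniform 0 1; return (x, y)}⟧
m0 = bind(uniform01, lambda x: bind(uniform01, lambda y: ret((x, y))))

print(m0(lambda xy: xy[0]))  # ≈ 0.5
```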
Reviewer reactions:
★ “fantastic introduction!” ★ “a pleasure to read!” ★ “gentle!” ★
★ “loved reading!” ★ “beautifully explained!” ★ “very polished!” ★
★ “easy to follow!” ★ “deft!” ★ “best written…” ★
“PLDI readers without lots of background in probability theory should be able to follow; this is impressive.”
When it works

◮ conditions such as y − 2 · x, y/x, max(x, y), …
◮ multivariate Gaussian distributions (for regression and dynamics)
◮ mixtures of distributions (for classifying points and documents)
◮ seismic event detection (Arora et al.)
◮ point masses’ total momentum (Afshar et al.)

do {x ← · · · ; y ← · · · ; z ← · · · ; return (f(x, y, z), . . . )}   where f is invertible
Where it helps

prior → disintegrate → posterior → inference procedure
(maximum likelihood, Markov chain Monte Carlo, …)

disintegrate (µ = lebesgue, arrays, …) is also used inside these inference procedures.
Induction hypothesis for automatic disintegrator
Specialize: α = R, µ = lebesgue.
Generalize: over a heap h of bindings and a final action M.
Define the continuation ⟨M⟩ = λh′. do {h′; M}.

Start from the running example
ξ = do {x ← uniform 0 1; y ← uniform 0 1; let a = y − 2 · x; return (a, (x, y))} : M (R × β)
and generalize it step by step: replace the binding of a by an arbitrary a ← m,
the bindings before it by a heap h, and the final return by M.

Induction hypothesis (the specification of ⊳):
do {h; a ← m; M} = do {a ← lebesgue; ⊳ m a ⟨M⟩ h}
Implement ⊳ by equational reasoning from this specification. Case analysis on m.

Goal: do {h; a ← m; M} = do {a ← lebesgue; ⊳ m a ⟨M⟩ h}

Case m = uniform 0 1:
  do {h; a ← uniform 0 1; M}
= { probability density of m (Bhat et al.) }
  do {h; a ← lebesgue; factor [0 < a < 1]; M}
= { exchange integrals using Tonelli’s theorem }
  do {a ← lebesgue; factor [0 < a < 1]; h; M}
= { beta; recall ⟨M⟩ = λh′. do {h′; M} }
  do {a ← lebesgue; factor [0 < a < 1]; ⟨M⟩ h}

So define  ⊳ (uniform 0 1) a c h = do {factor [0 < a < 1]; c h}

Similarly for other primitive continuous distributions.
The disintegration fused into most inference methods ends here.
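The first rewrite in this derivation, replacing uniform 0 1 by lebesgue weighted by factor [0 < a < 1], can be checked numerically (an illustrative Python sketch, not Hakaru). Since lebesgue cannot be sampled directly, the sketch restricts it to the window [-2, 3]; the factor vanishes outside [0, 1], so the window loses nothing, and the proposal density 1/5 cancels in the normalized ratio:

```python
import random

def e_uniform(n=200_000):
    # E(a·a) under do {a ← uniform 0 1; return a}
    return sum(random.uniform(0.0, 1.0) ** 2 for _ in range(n)) / n

def e_factored(n=200_000):
    # The same expectation via lebesgue restricted to [-2, 3] plus
    # factor [0 < a < 1], by importance weighting; the proposal
    # density 1/5 cancels between numerator and denominator.
    num = den = 0.0
    for _ in range(n):
        a = random.uniform(-2.0, 3.0)
        w = 1.0 if 0.0 < a < 1.0 else 0.0  # the factor [0 < a < 1]
        num += w * a * a
        den += w
    return num / den

random.seed(0)
lhs = e_uniform()
rhs = e_factored()
print(lhs, rhs)  # both ≈ 1/3
```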
Goal: do {h; a ← m; M} = do {a ← lebesgue; ⊳ m a ⟨M⟩ h}

Case m = return x:  look up x in the heap, h = (h1; x ← m; h2).
  do {h1; x ← m; h2; a ← return x; M}
= { monad laws, beta, alpha }
  do {h1; a ← m; let x = a; h2; M}
= { induction hypothesis }
  do {a ← lebesgue; ⊳ m a ⟨let x = a; h2; M⟩ h1}
= { beta; recall ⟨M⟩ = λh′. do {h′; M} }
  do {a ← lebesgue; ⊳ m a (λh′. ⟨M⟩ (h′; let x = a; h2)) h1}

So define  ⊳ x a c (h1; x ← m; h2) = ⊳ m a (λh′. c (h′; let x = a; h2)) h1

The continuation memoizes.
Goal: do {h; a ← m; M} = do {a ← lebesgue; ⊳ m a ⟨M⟩ h}

Case m = return (−e):
  do {h; a ← return (−e); M}
= { monad laws, beta }
  do {h; b ← return e; let a = −b; M}
= { induction hypothesis … }
  do {b ← lebesgue; let a = −b; ⊳ (return e) b ⟨M⟩ h}
= { change integration variable from b to a }
  do {a ← lebesgue; let b = −a; ⊳ (return e) b ⟨M⟩ h}
= { “parametricity” of ⊳ … }
  do {a ← lebesgue; ⊳ (return e) (−a) ⟨M⟩ h}

So define  ⊳ (return (−e)) a c h = ⊳ (return e) (−a) c h

Similarly for other invertible functions: log x, y − 2 · x.
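The flavor of these ⊳ cases can be captured in a hypothetical miniature (plain Python), stripped of heaps and continuations: density(m, a) returns the weight of observing value a as the result of expression m. The tuple constructors "uniform01", "neg", and "sub_const" are illustrative names, not core Hakaru syntax:

```python
def density(m, a):
    # Weight of observing value a from the expression m.
    kind = m[0]
    if kind == "uniform01":
        # ⊳ (uniform 0 1) a c h = do {factor [0 < a < 1]; c h}
        return 1.0 if 0.0 < a < 1.0 else 0.0
    if kind == "neg":
        # ⊳ (return (−e)) a c h = ⊳ (return e) (−a) c h
        (_, e) = m
        return density(e, -a)
    if kind == "sub_const":
        # e − c is invertible the same way: observing a means e produced a + c.
        (_, e, c) = m
        return density(e, a + c)
    raise ValueError(kind)

u = ("uniform01",)
print(density(("neg", u), -0.3))             # 1.0: −u can produce −0.3
print(density(("sub_const", u, 2.0), -1.5))  # 1.0: u − 2 produces −1.5 at u = 0.5
print(density(("neg", u), 0.3))              # 0.0: −u is never positive
```

Each clause simply inverts one function symbol and recurses, which is why the method extends to any composition of invertible functions.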