SLIDE 1

Probabilistic Programming in Birch

www.birch-lang.org
Lawrence Murray
Department of Information Technology, Uppsala University

Outline

  • 1. Graphical models → probabilistic programs.
  • 2. Birch: motivation and design.
  • 3. Birch: language features.
SLIDE 2

1 Graphical models → probabilistic programs

Lawrence Murray 1 / 30

SLIDE 3

Graphical models

[Figure: (a) a directed and (b) an undirected graphical model over the variables A, B, C, D, E.]

Lawrence Murray 2 / 30

SLIDE 4

Graphical models

[Figure: the same model shown (a) without plate notation, over x and y1, y2, y3, and (b) with plate notation, over x and yn for n = 1, ..., 3.]

Lawrence Murray 3 / 30

SLIDE 5

Graphical models

Figure: S. Höhna, M. J. Landis, T. A. Heath, B. Boussau, N. Lartillot, B. R. Moore, J. P. Huelsenbeck, and F. Ronquist. RevBayes: Bayesian phylogenetic inference using graphical models and an interactive model-specification language. Systematic Biology, 65(4):726–736, 2016. doi: 10.1093/sysbio/syw021

Lawrence Murray 4 / 30

SLIDE 6

Graphical models

Figure: Benwing, https://commons.wikimedia.org/wiki/File:Bayesian-gaussian-mixture.svg

Lawrence Murray 5 / 30


SLIDE 11

Graphical models → probabilistic programs

The most expressive languages are known as universal (also known as Turing complete). Models written in such languages are universal probabilistic programs. These are the most expressive languages for model specification, but also the most difficult for which to do inference.

Lawrence Murray 6 / 30
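To make this concrete, here is a minimal illustrative sketch, not taken from the slides, of the kind of model that needs a universal language: the number of latent draws is itself random, so the model has no fixed-size graphical structure. The syntax imitates the Birch examples later in the deck; the choice of Poisson and Gaussian distributions and the variable names are assumptions for illustration.

  // Illustrative sketch only: stochastic control flow.
  n:Integer;
  n <~ Poisson(5.0);          // the number of draws is itself random
  for (i in 1..n) {
    z:Real;
    z <~ Gaussian(0.0, 1.0);  // a random number of latent draws, unknown until runtime
  }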


SLIDE 13

Graphical models → probabilistic programs

An alternative perspective on probabilistic programming is that it is a programming paradigm for probabilistic modelling and inference.

▶ Other programming paradigms include object-oriented programming, generic programming, procedural programming, functional programming, etc.

▶ From this perspective, probabilistic programming languages merely emphasise this particular programming paradigm, providing ergonomic features for writing probabilistic models and probabilistic inference methods.

Lawrence Murray 7 / 30

SLIDE 14

2 Birch: motivation and design

Lawrence Murray 8 / 30

SLIDE 15

Birch

▶ Universal probabilistic programming language (PPL).

▶ Supports procedural, generic, object-oriented, and (of course) probabilistic programming paradigms.

▶ Both models and methods are written in the Birch language itself.

▶ Draws inspiration from many places, including existing PPLs such as LibBi (www.libbi.org), and modern object-oriented languages such as Swift.

▶ Free and open source, under the Apache 2.0 license.

▶ See birch-lang.org

Lawrence Murray 9 / 30

SLIDE 16

Technical details

▶ Dynamic memory management with reference-counted garbage collection.

▶ Compiles to C++14, then to native binaries.

▶ Uses standard C/C++ libraries for numerical computing, e.g. STL, Boost, Eigen.

▶ C/C++ code can be nested in Birch code to allow tight integration.

Lawrence Murray 10 / 30

SLIDE 17

Birch → C++14

(a) C++14 provides a lot of things we would like to quarantine.
(b) Most Birch code translates directly to C++14, e.g. object model, higher-order functions, user-defined conversions.
(c) Some Birch code translates to verbose or intrusive C++14 that one would not want to code by hand, e.g. probabilistic operators, fibers, copy-on-write.

[Figure: diagram relating Birch and C++14, with regions labelled (a), (b) and (c).]

Lawrence Murray 11 / 30



SLIDE 22

Models in Birch

In Birch, a model is specified by writing a program that simulates from the joint distribution.

▶ In many other PPLs, there is a distinction between which variables are observed and which are latent within the program.

▶ i.e. the program already factors the joint distribution into likelihood and prior.

▶ In Birch, the preference is to distinguish which variables are observed and which are latent at runtime.

▶ i.e. at runtime, the user, or the inference method, chooses which conditionals or marginals of the joint distribution are of interest.

▶ (Ideally, at least, as this is not always possible.)

Lawrence Murray 12 / 30
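As a minimal sketch of what this means in practice (this fragment is illustrative and not taken from the slides; μ and σ2 are assumed to be defined elsewhere), the same statement can serve either role depending on whether the variable already holds a value when it is reached:

  y:Random<Real>;
  // If y was given a value before this point (e.g. loaded from data),
  // the next line observes that value; if y is still empty, it is
  // treated as latent and a value can be simulated for it instead.
  y ~ Gaussian(μ, σ2);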

SLIDE 23

Example: Bayesian linear regression model

class LinearRegressionModel < Model {
  X:Real[_,_];
  σ2:Random<Real>;
  β:Random<Real[_]>;
  y:Random<Real[_]>;

  fiber simulate() -> Real {
    N:Integer <- rows(X);
    P:Integer <- columns(X);
    if (N > 0 && P > 0) {
      σ2 ~ InverseGamma(3.0, 0.4);
      β ~ Gaussian(vector(0.0, P), identity(P)*σ2);
      y ~ Gaussian(X*β, σ2);
    }
  }
}

Lawrence Murray 13 / 30

SLIDE 24

Example: linear-Gaussian state-space model

class LinearGaussianSSM = MarkovModel<LinearGaussianSSMState, LinearGaussianSSMParameter>;

class LinearGaussianSSMParameter < Parameter {
  a:Real <- 0.8;
  σ2_x:Real <- 1.0;
  σ2_y:Real <- 0.1;
}

class LinearGaussianSSMState < State {
  x:Random<Real>;
  y:Random<Real>;

  fiber initial(θ:LinearGaussianSSMParameter) -> Real {
    x ~ Gaussian(0.0, θ.σ2_x);
    y ~ Gaussian(x, θ.σ2_y);
  }

Lawrence Murray 14 / 30

SLIDE 25

Example: linear-Gaussian state-space model

  fiber transition(z:LinearGaussianSSMState, θ:LinearGaussianSSMParameter) -> Real {
    x ~ Gaussian(θ.a*z.x, θ.σ2_x);
    y ~ Gaussian(x, θ.σ2_y);
  }
}

Lawrence Murray 15 / 30

SLIDE 26

Example: nonlinear state-space model

class SIRModel = MarkovModel<SIRState,SIRParameter>;

class SIRParameter < Parameter {
  λ:Random<Real>;
  δ:Random<Real>;
  γ:Random<Real>;

  fiber parameter() -> Real {
    λ <- 10.0;
    δ ~ Beta(2.0, 2.0);
    γ ~ Beta(2.0, 2.0);
  }
}

class SIRState < State {
  τ:Random<Integer>;
  Δi:Random<Integer>;
  Δr:Random<Integer>;

Lawrence Murray 16 / 30

SLIDE 27

Example: nonlinear state-space model

  s:Random<Integer>;
  i:Random<Integer>;
  r:Random<Integer>;

  fiber transition(x:SIRState, θ:SIRParameter) -> Real {
    τ ~ Binomial(x.s, 1.0 - exp(-θ.λ*x.i/(x.s + x.i + x.r)));
    Δi ~ Binomial(τ, θ.δ);
    Δr ~ Binomial(x.i, θ.γ);
    s ~ Delta(x.s - Δi);
    i ~ Delta(x.i + Δi - Δr);
    r ~ Delta(x.r + Δr);
  }
}

Lawrence Murray 17 / 30

SLIDE 28

Models in Birch

▶ Knowing something about the structure of a model may help tailor the inference algorithm, so it will be useful if programs reveal something of this.

▶ One option is static analysis, but this is hard.

▶ The approach at this stage is for it to be the programmer’s responsibility to reveal this by construction, e.g. using the MarkovModel class.

▶ Details are still developing.

Lawrence Murray 18 / 30

SLIDE 29

Methods in Birch

Inference methods are also written in the Birch language.

▶ Currently available are:
  ▶ Analytical solutions
  ▶ Importance sampling
  ▶ Bootstrap particle filter
  ▶ Alive particle filter
  ▶ Auxiliary particle filter (automated)
  ▶ Rao–Blackwellized particle filter (automated)

▶ Not far off are:
  ▶ Particle MCMC methods
  ▶ Other MCMC methods.

Lawrence Murray 19 / 30

SLIDE 30

3 Birch: language features

Lawrence Murray 20 / 30

SLIDE 31

Optionals

Optionals allow variables to have a value of a particular type, or no value at all.

▶ They are used in other programming languages (e.g. Swift) to eliminate boilerplate that checks for null values, e.g. a function checking its arguments.

▶ In Birch, they are used for the same purpose, but also in a second role: to represent missing values.

Lawrence Murray 21 / 30
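A minimal sketch of how an optional might look, assuming Birch follows Swift-style optional syntax (a ? suffix on the type, x? to test for a value, x! to unwrap); the variable name is illustrative:

  x:Real?;              // an optional Real with no value yet (e.g. a missing observation)
  if (x?) {             // does x currently have a value?
    stdout.print(x!);   // unwrap and use it
  } else {
    // handle the missing value
  }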

SLIDE 32

Randoms

Randoms are optionals to which a probability distribution can be attached.

▶ When they don’t have a value, the probability distribution can be used to automatically simulate a value.

▶ Once a random has a value, that value is final; it cannot be overwritten.

Lawrence Murray 22 / 30
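For instance, a minimal sketch mirroring the delayed sampling example later in the deck:

  x:Random<Real>;
  x ~ Gaussian(0.0, 1.0);  // attach a distribution; x has no value yet
  stdout.print(x);         // a value is needed here, so one is simulated;
                           // from this point on the value of x is final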

SLIDE 33

Delayed sampling

▶ Randoms are essential for the delayed sampling mechanism within Birch.

▶ This is a heuristic algorithm for performing analytical optimizations at runtime.

▶ It automatically yields optimizations such as variable elimination/collapsing, Rao–Blackwellization and locally-optimal proposals. See:

L. M. Murray, D. Lundén, J. Kudlicka, D. Broman, and T. B. Schön. Delayed sampling and automatic Rao–Blackwellization of probabilistic programs. Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS), 2018. URL https://arxiv.org/abs/1708.07787

Lawrence Murray 23 / 30

SLIDES 34–55

Delayed sampling example

Code:

  x ~ Gaussian(0.0, 1.0);
  for (n in 1..N) {
    y[n] ~ Gaussian(x, 1.0);
  }
  stdout.print(x);

Checkpoints stepped through in the animation: assume x; then observe y[n] for each n = 1, ..., 5; finally value x at the print statement.

[Figure: graph over x and y[1], ..., y[5], updated at each checkpoint; the observations are incorporated into the distribution of x one by one, and x is only simulated at the final value checkpoint.]

Lawrence Murray 24 / 30

SLIDES 56–87

Delayed sampling: Kalman Filter

Code:

  x[1] ~ Gaussian(0.0, 1.0);
  y[1] ~ Gaussian(x[1], 1.0);
  for (t in 2..T) {
    x[t] ~ Gaussian(a*x[t - 1], 1.0);
    y[t] ~ Gaussian(x[t], 1.0);
  }
  stdout.print(x[1]);

Checkpoints stepped through in the animation: assume x[1]; observe y[1]; then, for each t, assume x[t] and observe y[t]; finally value x[1] at the print statement.

[Figure: graph over x[1], ..., x[5] and y[1], ..., y[5], updated at each checkpoint. The forward sweep of assume/observe checkpoints corresponds to a Kalman filter; the final value checkpoint on x[1] then triggers simulation back along the chain so that x[1] can be printed.]

Lawrence Murray 25 / 30

SLIDES 88–144

Delayed sampling: Rao–Blackwellized Particle Filter

[Figure: graph for a state-space model with a nonlinear substate x_n[t] and a linear substate x_l[t], with observations y_n[t] and y_l[t], for t = 1, ..., 5. The animation steps through the same assume/observe/value checkpoints as before: the nonlinear substates are simulated while the linear substates and their observations are handled analytically, so that delayed sampling yields a Rao–Blackwellized particle filter.]

Lawrence Murray 26 / 30

SLIDE 145

Fibers

Fibers (also known as coroutines elsewhere) are like functions, but their execution can be paused and resumed.

▶ A function, when called, executes to completion and returns a value to the caller.

▶ A fiber, when called, executes to its first pause point and yields a value to the caller. The caller can then proceed with some other computation. Later, the caller may resume the fiber; it will execute to its next pause point and yield another value to the caller, and so on.

Lawrence Murray 27 / 30
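A minimal sketch of the idea, assuming an explicit yield statement is available for pausing a fiber (the fiber name and body are illustrative, not taken from the slides):

  fiber count(N:Integer) -> Integer {
    for (n in 1..N) {
      yield n;  // pause here, handing n back to the caller;
                // the caller may later resume the fiber to receive the next value
    }
  }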

SLIDE 146

Fibers

▶ In Birch, fibers are used to simulate a probabilistic model. Each time an observation is encountered, the fiber pauses and yields a weight.

▶ This is a key ingredient for many inference methods (e.g. Sequential Monte Carlo).

▶ Fibers can be replicated. When resumed, replicated fibers proceed independently.

▶ A copy-on-write mechanism is used to minimise copying when replicating fibers.

▶ Can also be useful for prospective computation, e.g. anything with an accept/reject step.

Lawrence Murray 28 / 30
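A sketch of how an inference method might drive such a fiber. This driver is hypothetical (the m.simulate() call, the f?/f! query syntax and the auto declaration are assumptions, not taken from the slides); it only illustrates resuming a model fiber and accumulating the log-weights it yields at each observation:

  w:Real <- 0.0;            // accumulated log-weight
  auto f <- m.simulate();   // hypothetical: obtain the model's simulate fiber
  while (f?) {              // hypothetical: resume; true while the fiber yields
    w <- w + f!;            // add the log-weight yielded at the latest observation
  }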

SLIDE 147

Probabilistic operators

Optionals, randoms and fibers come together in the probabilistic operators of Birch. These are:

a <~ b   simulate the distribution b and assign the value to a,
a ~> b   observe the value a with distribution b and yield its log-likelihood from the current fiber,
a ~ b    if a has a value then observe it, otherwise simulate it (perhaps lazily).

Lawrence Murray 29 / 30
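A small sketch putting the three operators side by side (the variable names are illustrative, and it is assumed that b already holds a value at the point where it is observed):

  a:Random<Real>;
  b:Random<Real>;
  c:Random<Real>;

  a <~ Gaussian(0.0, 1.0);  // simulate: draw a value from the distribution and assign it to a
  b ~> Gaussian(0.0, 1.0);  // observe: yield the log-likelihood of b's existing value from the current fiber
  c ~ Gaussian(0.0, 1.0);   // if c already has a value this observes it;
                            // otherwise a value is simulated (perhaps lazily)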

SLIDE 148

Looking ahead

▶ Current focus is pilot applications.

▶ Near ahead is adding new inference methods.

▶ Further ahead is performance tuning and parallelism.

Getting started guide and tutorial available on the website: birch-lang.org.

Lawrence Murray 30 / 30