Automated learning with a probabilistic programming language: Birch - PowerPoint PPT Presentation



SLIDE 1

Automated learning with a probabilistic programming language: Birch

Lawrence Murray

Department of Information Technology, Uppsala University

Outline

  1. The Birch probabilistic programming language.
  2. The delayed sampling heuristic.
  3. A probabilistic definition of probabilistic programs.
  4. Example: multiple object tracking.
SLIDE 2

Birch (birch-lang.org)

▶ Object-oriented and probabilistic programming paradigms.

▶ Draws inspiration from many places, including LibBi, for which it is something of a successor, but also modern object-oriented languages such as Swift.

▶ Maintains the C/C++ basis of LibBi: compiles to C++14, uses standard C/C++ libraries for numerical computing such as Boost and Eigen.

▶ Multithreaded using OpenMP.

▶ Dynamic memory management, with reference counting.

▶ Free and open source, under the Apache 2.0 license.

Lawrence Murray 1 / 24

SLIDE 3

Birch

▶ Preferably, models are implemented by defining the joint distribution.

▶ That is, program code does not distinguish between latent and observed variables; this distinction is made at runtime according to value assignment.

▶ Inference methods are also implemented in the Birch language.

▶ A particular feature of Birch is delayed sampling, a dynamic mechanism for full and partial analytical solutions.

SLIDE 4

Delayed sampling

▶ Automatically yields optimizations such as variable elimination, Rao–Blackwellization, and locally-optimal proposals.

▶ Is a heuristic: we have only partial knowledge of the whole model structure during execution and must make myopic decisions.

▶ Usually, as a probabilistic program executes, we eagerly sample each latent variable and update a weight for each observed variable.

▶ Instead, we delay the sampling of each latent variable in order to analytically condition on each observed variable where possible.

▶ In Birch, this is implemented via computational graphs and implicit type conversion.

L. M. Murray, D. Lundén, J. Kudlicka, D. Broman, and T. B. Schön. Delayed sampling and automatic Rao–Blackwellization of probabilistic programs. In Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS), Lanzarote, Spain, 2018. URL arxiv.org/abs/1708.07787
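The analytical conditioning that delayed sampling exploits can be made concrete for the Gaussian case. The sketch below is hypothetical Python, not Birch's implementation; the function name and scalar parameterization are illustrative assumptions. It computes the closed-form posterior used in place of eager sampling.

```python
def gaussian_observe(m, s2, y, r2):
    """Condition a Gaussian prior N(m, s2) on an observation y ~ N(x, r2).

    Returns the posterior mean and variance of x, computed analytically:
    this is the update applied instead of eagerly sampling x.
    """
    precision = 1.0 / s2 + 1.0 / r2
    post_var = 1.0 / precision
    post_mean = post_var * (m / s2 + y / r2)
    return post_mean, post_var

# Prior N(0, 1), observe y = 2 with unit noise: posterior N(1, 0.5).
mean, var = gaussian_observe(0.0, 1.0, 2.0, 1.0)
```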

SLIDE 5–26

Example #1

Code:

  x ~ Gaussian(0.0, 1.0);
  for (n in 1..N) {
    y[n] ~ Gaussian(x, 1.0);
  }
  stdout.print(x);

[Animation over the factor graph of x and y[1], ..., y[5]: the checkpoint "assume x" adds x to the graph in marginalized form; each checkpoint "observe y[n]" in turn conditions x on y[n] analytically; only the final checkpoint "value x", triggered by stdout.print(x), draws x from its posterior.]

Lawrence Murray 4 / 24
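The node states stepped through in Example #1 can be sketched in a few lines. This is hypothetical Python; the class and method names are illustrative, not Birch's actual API.

```python
import random

class LazyGaussian:
    """Minimal sketch of a delayed-sampling node (hypothetical, not Birch's API).

    'assume' puts the node in a marginalized state N(mean, var); each
    'observe' of a Gaussian child conditions it analytically; 'value'
    (triggered e.g. by printing) finally draws a sample from the posterior.
    """
    def __init__(self, mean, var):       # assume x
        self.mean, self.var = mean, var
        self.sampled = None

    def observe(self, y, noise_var):     # observe y[n] ~ N(x, noise_var)
        old_mean, old_var = self.mean, self.var
        self.var = 1.0 / (1.0 / old_var + 1.0 / noise_var)
        self.mean = self.var * (old_mean / old_var + y / noise_var)

    def value(self):                     # value x
        if self.sampled is None:
            self.sampled = random.gauss(self.mean, self.var ** 0.5)
        return self.sampled

# Example #1: x ~ N(0, 1); y[n] ~ N(x, 1) for n = 1..5; print(x).
x = LazyGaussian(0.0, 1.0)
for y_n in [1.2, 0.8, 1.0, 1.1, 0.9]:
    x.observe(y_n, 1.0)
# Posterior before sampling: mean = sum(y)/(N+1), var = 1/(N+1).
```

Only the call to value() involves randomness; everything before it is exact conjugate updating.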

SLIDE 27–58

Example #2: Kalman Filter

Code:

  x[1] ~ Gaussian(0.0, 1.0);
  y[1] ~ Gaussian(x[1], 1.0);
  for (t in 2..T) {
    x[t] ~ Gaussian(a*x[t - 1], 1.0);
    y[t] ~ Gaussian(x[t], 1.0);
  }
  stdout.print(x[1]);

[Animation over the factor graph of x[1], ..., x[5] and y[1], ..., y[5]: each "assume x[t]" extends the chain of marginalized states, and each "observe y[t]" conditions analytically, so the forward sweep realizes a Kalman filter; the final checkpoint "value x[1]", triggered by stdout.print(x[1]), propagates back along the chain before sampling x[1].]

Lawrence Murray 5 / 24
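The forward recursion that the delayed-sampling graph performs for this model can be sketched as a scalar Kalman filter. This is hypothetical Python; parameter names and defaults are illustrative.

```python
def kalman_filter(ys, a, q=1.0, r=1.0, m0=0.0, p0=1.0):
    """Forward Kalman filter for x[t] ~ N(a*x[t-1], q), y[t] ~ N(x[t], r),
    with x[1] ~ N(m0, p0). Returns filtered means and variances: the
    recursion that delayed sampling recovers automatically for this model.
    """
    m, p = m0, p0
    means, vars_ = [], []
    for t, y in enumerate(ys):
        if t > 0:                 # predict step (none before the first update)
            m, p = a * m, a * a * p + q
        k = p / (p + r)           # Kalman gain
        m = m + k * (y - m)       # update with observation y[t]
        p = (1.0 - k) * p
        means.append(m)
        vars_.append(p)
    return means, vars_

# First step from prior N(0, 1) with y[1] = 1.0: filtered N(0.5, 0.5).
means, vars_ = kalman_filter([1.0, 0.5, 0.2], a=0.9)
```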

SLIDE 59–115

Example #3: Rao–Blackwellized Particle Filter

[Animation over the factor graph of a state-space model with nonlinear states x_n[1], ..., x_n[5], conditionally linear states x_l[1], ..., x_l[5], and observations y_n[1], ..., y_n[5] and y_l[1], ..., y_l[5]: the nonlinear states are sampled as they are encountered, while the linear states remain marginalized and are conditioned analytically on each observation, so that delayed sampling yields a Rao–Blackwellized particle filter.]

Lawrence Murray 6 / 24
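One step of such a filter can be sketched as follows. This is hypothetical Python on a simplified stand-in model (not the slides' exact model): each particle samples its nonlinear state x_n and carries Kalman statistics for the marginalized linear state x_l.

```python
import math, random

def rbpf_step(particles, y_l, y_n):
    """One step of a Rao-Blackwellized particle filter for an illustrative
    model: x_n[t] ~ N(x_n[t-1], 1) sampled per particle; x_l[t] ~
    N(x_l[t-1] + x_n[t], 1) marginalized; y_l[t] ~ N(x_l[t], 1) and
    y_n[t] ~ N(x_n[t], 1) observed. Each particle is (x_n, mean, var, logw).
    """
    new = []
    for x_n, m, p, logw in particles:
        x_n2 = random.gauss(x_n, 1.0)          # sample nonlinear state
        m, p = m + x_n2, p + 1.0               # predict marginalized x_l
        # weight by the marginal likelihoods of both observations
        logw += -0.5 * (y_l - m) ** 2 / (p + 1.0) \
                - 0.5 * math.log(2 * math.pi * (p + 1.0))
        logw += -0.5 * (y_n - x_n2) ** 2 - 0.5 * math.log(2 * math.pi)
        k = p / (p + 1.0)                      # Kalman update of x_l on y_l
        m, p = m + k * (y_l - m), (1.0 - k) * p
        new.append((x_n2, m, p, logw))
    return new

particles = [(0.0, 0.0, 1.0, 0.0) for _ in range(100)]
particles = rbpf_step(particles, y_l=0.3, y_n=-0.1)
```

A full filter would resample on the weights between steps; the point here is only the split between sampled and marginalized state.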

SLIDE 116

Programmatic models

  1. Generative model of a joint distribution expressed in a universal programming language.
  2. Countably infinite set of random variables {V_k}_{k=1}^∞.
  3. As the model executes we encounter a finite subset of {V_k}_{k=1}^∞ in some order σ, producing a sequence (V_{σ[k]})_{k=1}^{|σ|}, where σ[k] is given by a deterministic function Ne of the random variates so far: σ[k] = Ne(v_{σ[1]}, ..., v_{σ[k−1]}).
  4. When V_{σ[k]} is encountered, it is associated with a distribution V_{σ[k]} ∼ p_{σ[k]}(dv_{σ[k]} | Pa(v_{σ[1]}, ..., v_{σ[k−1]})), with Pa a deterministic function selecting a subset of its arguments.

Lawrence Murray 7 / 24
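A minimal illustration of points 3–4 in hypothetical Python: which variable is encountered next is a deterministic function of the variates drawn so far (Ne in the slides' notation), so different executions can encounter different variable sets.

```python
import random

def programmatic_model():
    """A model whose set of random variables depends on earlier draws:
    the first variate decides how many further variables exist.
    Hypothetical illustration, not Birch code."""
    trace = {}
    trace['k'] = random.choice([1, 2, 3])             # first variable encountered
    for i in range(trace['k']):                       # Ne: next variable depends on k
        trace['x%d' % i] = random.gauss(0.0, 1.0)
    return trace

t = programmatic_model()
# Different executions can encounter different variable sets:
# e.g. {'k', 'x0'} versus {'k', 'x0', 'x1', 'x2'}.
```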

SLIDE 117–118

Programmatic models

At some point execution terminates, having simulated

  p_σ(dv_{σ[1]}, ..., dv_{σ[|σ|]}) = ∏_{k=1}^{|σ|} p_{σ[k]}(dv_{σ[k]} | Pa(v_{σ[1]}, ..., v_{σ[k−1]})).

We will be interested in executing the code several times. The nth execution will be associated with the distribution p_{σ_n}, given by

  p_{σ_n}(dv_{σ_n[1]}, ..., dv_{σ_n[|σ_n|]}) = ∏_{k=1}^{|σ_n|} p_{σ_n[k]}(dv_{σ_n[k]} | Pa(v_{σ_n[1]}, ..., v_{σ_n[k−1]})),

with σ_n[k] = Ne(v_{σ_n[1]}, ..., v_{σ_n[k−1]}). Subscript n is used to denote execution-dependent variables.

Lawrence Murray 8 / 24

SLIDE 119–121

Programmatic models

For different executions n and m, it is possible for:

▶ the number of random variables encountered, |σ_n| and |σ_m|, to differ,
▶ the sequences (V_{σ_n[k]})_{k=1}^{|σ_n|} and (V_{σ_m[k]})_{k=1}^{|σ_m|} to differ, and
▶ the two subsets {V_{σ_n[k]}}_{k=2}^{|σ_n|} and {V_{σ_m[k]}}_{k=2}^{|σ_m|} to be different (and even disjoint).

In general, p_{σ_n} and p_{σ_m} are not the same, but rather components of a mixture. Models that fall in the programmatic class but not the graphical class occur in e.g. nonparametrics, phylogenetics, computational linguistics, multiple object tracking.

Lawrence Murray 9 / 24

SLIDE 122

Multiple object tracking

[Figure: example tracking scene with numbered object tracks, including tracks 13, 17, 18, 28, 34, 37, 41, 42, 45, 50, 56, 59, 76, 79, 81, 91, 95, and 97.]

Lawrence Murray 10 / 24

slide-123
SLIDE 123

Multiple object tracking

[Graphical model: parameter Θ; states X0 → X1 → X2 → … → XT; observations Yt depending on Xt.]

p(dy1:T, dx0:T, dθ)                               (joint)
  = p(dθ)                                         (parameter)
    × p(dx0 | θ)                                  (initial)
    × ∏_{t=1}^{T} p(dxt | xt−1, θ)                (transition)
                  × p(dyt | xt, θ)                (observation)

But random state size, random observation size, random association between them.
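For intuition, the factorization above can be evaluated numerically. Below is a minimal Python sketch (not Birch) of the log joint density for a one-dimensional linear-Gaussian instance; the constants a, q, b, r and the standard-Gaussian prior on θ are hypothetical:

```python
import math

def log_normal_pdf(x, mean, var):
    """Log density of a univariate Gaussian N(mean, var)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def log_joint(theta, x, y, a=0.9, q=1.0, b=1.0, r=0.1):
    """Accumulate log p(θ) + log p(x0|θ) + Σ_t [log p(xt|xt−1,θ) + log p(yt|xt,θ)].
    θ is given a hypothetical standard-Gaussian prior; a, q, b, r are
    hypothetical model constants."""
    lp = log_normal_pdf(theta, 0.0, 1.0)             # parameter
    lp += log_normal_pdf(x[0], theta, 1.0)           # initial
    for t in range(1, len(x)):
        lp += log_normal_pdf(x[t], a * x[t - 1], q)  # transition
        lp += log_normal_pdf(y[t - 1], b * x[t], r)  # observation
    return lp

lp = log_joint(0.0, [0.0, 0.5, 0.3], [0.4, 0.2])
```

Each term in the loop corresponds to one factor of the product above; whether a variable is latent or observed changes only whether it is sampled or conditioned on, not this density.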

Lawrence Murray 11 / 24

slide-125
SLIDE 125

Single object model

The single object model is a linear-Gaussian state-space model. For the ith object:

X^i_0 ∼ N(μ^i_0, M)
X^i_t ∼ N(A x^i_{t−1}, Q)
D^i_t ∼ Bernoulli(ρ)     (is the object detected?)
Y^i_t ∼ N(B x^i_t, R)    (observation if detected)
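A minimal Python sketch of this generative process, for a scalar state and hypothetical parameter values (the Birch version works with vectors and a Global parameter object):

```python
import math
import random

def simulate_track(T, mu0=0.0, M=1.0, A=0.9, Q=0.1, B=1.0, R=0.05,
                   rho=0.9, rng=random):
    """Simulate one object: linear-Gaussian dynamics with Bernoulli-gated
    observations. All parameter values are hypothetical; an observation is
    None at time steps where the object is not detected."""
    x = rng.gauss(mu0, math.sqrt(M))            # X0 ~ N(mu0, M)
    xs, ys = [x], []
    for t in range(1, T + 1):
        x = rng.gauss(A * x, math.sqrt(Q))      # Xt ~ N(A x_{t-1}, Q)
        xs.append(x)
        detected = rng.random() < rho           # Dt ~ Bernoulli(rho)
        ys.append(rng.gauss(B * x, math.sqrt(R)) if detected else None)
    return xs, ys

xs, ys = simulate_track(20)
```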

Lawrence Murray 12 / 24

slide-126
SLIDE 126

Single object model

class Track < StateSpaceModel<Global,Random<Real[_]>,Random<Real[_]>> {
  t:Integer;  // starting time of the track

  fiber initial(x:Random<Real[_]>, θ:Global) -> Real {
    auto μ <- vector(0.0, 3*length(θ.l));
    μ[1..2] <~ Uniform(θ.l, θ.u);
    x ~ Gaussian(μ, θ.M);
  }

  fiber transition(x':Random<Real[_]>, x:Random<Real[_]>, θ:Global) -> Real {
    x' ~ Gaussian(θ.A*x, θ.Q);
  }

  fiber observation(y:Random<Real[_]>, x:Random<Real[_]>, θ:Global) -> Real {
    d:Boolean;
    d <~ Bernoulli(θ.d);  // is the track detected?
    if d {
      y ~ Gaussian(θ.B*x, θ.R);
    }
  }
}

Lawrence Murray 13 / 24

slide-127
SLIDE 127

Multiple object model

At time t:

▶ the number of new objects is Bt ∼ Poisson(λ),

▶ the lifetime of each new object i is Si ∼ Poisson(τ), so that the probability of an object disappearing, given that it has been present for si time steps, is Pr[Si = si] / Pr[Si ≥ si],

▶ the number of spurious observations (clutter) Ct satisfies Ct − 1 ∼ Poisson(µ), with these observations uniformly distributed on the domain [l, u].
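The disappearance probability Pr[Si = si] / Pr[Si ≥ si] can be computed directly from the Poisson pmf; this mirrors the ρ/R computation in the Multi class's transition code on a later slide. A hand-rolled Python sketch (pmf and survival computed from scratch, not Birch's built-ins):

```python
import math

def poisson_pmf(k, lam):
    """Pr[S = k] for S ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def disappear_prob(s, tau):
    """Pr[S = s] / Pr[S >= s]: the probability an object disappears, given
    that it has been present for s time steps, for lifetime S ~ Poisson(tau)."""
    num = poisson_pmf(s, tau)
    denom = 1.0 - sum(poisson_pmf(k, tau) for k in range(s))  # Pr[S >= s]
    return num / denom

p = disappear_prob(2, 5.0)
```

The Birch code samples survival as Bernoulli(1 − ρ/R), i.e. the complement of this hazard.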

Lawrence Murray 14 / 24

slide-128
SLIDE 128

Multiple object model

class Multi < StateSpaceModel<Global,List<Track>,List<Random<Real[_]>>> {
  t:Integer <- 0;  // current time

  fiber transition(x':List<Track>, x:List<Track>, θ:Global) -> Real {
    t <- t + 1;

    /* move current objects */
    auto track <- x.walk();
    while track? {
      ρ:Real <- pmf_poisson(t - track!.t - 1, θ.τ);
      R:Real <- 1.0 - cdf_poisson(t - track!.t - 1, θ.τ) + ρ;
      s:Boolean;
      s <~ Bernoulli(1.0 - ρ/R);  // does the object survive?
      if s {
        track!.step();
        x'.pushBack(track!);
      }
    }
    ...

Lawrence Murray 15 / 24

slide-129
SLIDE 129

Multiple object model

    ...
    /* birth new objects */
    N:Integer;
    N <~ Poisson(θ.λ);
    for n:Integer in 1..N {
      track:Track;
      track.t <- t;
      track.θ <- θ;
      track.start();
      x'.pushBack(track);
    }
  }
  ...

Lawrence Murray 16 / 24

slide-130
SLIDE 130

Multiple object model

  ...
  fiber observation(y:List<Random<Real[_]>>, x:List<Track>, θ:Global) -> Real {
    if !y.empty() {
      // observations given, use data association
      association(y, x, θ);
    } else {
      N:Integer;
      N <~ Poisson(θ.μ);
      for n:Integer in 1..(N + 1) {
        clutter:Random<Real[_]>;
        clutter <~ Uniform(θ.l, θ.u);
        y.pushBack(clutter);
      }
    }
  }
}

Lawrence Murray 17 / 24

slide-131
SLIDE 131

Inference

▶ There is structure to leverage: the whole model consists of multiple state-space models for single objects within a larger state-space model for managing multiple objects.

▶ There is form to leverage: the inner state-space models are linear-Gaussian. If we run a particle filter on the outer state-space model, delayed sampling gives us, within each particle, a separate Kalman filter on the inner state-space models.
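The inner Kalman filters can be sketched in Python for a scalar, hypothetical model. Each predict/update step also returns the log marginal likelihood log N(y; b·m_pred, b²·P_pred + r), which is exactly the quantity that would increment the enclosing particle's log-weight. This is a hand-written sketch of the idea, not the mechanism that delayed sampling generates:

```python
import math

def kalman_step(m, P, y, a=0.9, q=0.1, b=1.0, r=0.05):
    """One predict/update step of a scalar Kalman filter, with hypothetical
    constants a, q, b, r. Returns the posterior mean and variance and the
    log marginal likelihood of y (the particle's log-weight increment)."""
    m_pred = a * m
    P_pred = a * a * P + q
    S = b * b * P_pred + r                         # innovation variance
    ll = -0.5 * (math.log(2.0 * math.pi * S) + (y - b * m_pred) ** 2 / S)
    K = P_pred * b / S                             # Kalman gain
    m_new = m_pred + K * (y - b * m_pred)
    P_new = (1.0 - K * b) * P_pred
    return m_new, P_new, ll

# accumulate one particle's log-weight over a short observation sequence
m, P, logw = 0.0, 1.0, 0.0
for y in [0.3, 0.1, -0.2]:
    m, P, ll = kalman_step(m, P, y)
    logw += ll
```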

Lawrence Murray 18 / 24

slide-133
SLIDE 133

Multiple object tracking

[Figure: simulated multiple object tracking scenario.]

Lawrence Murray 19 / 24

slide-134
SLIDE 134

Multiple object tracking

[Figure: multiple object tracking results.]

Lawrence Murray 20 / 24

slide-139
SLIDE 139

Data association

fiber association(y:List<Random<Real[_]>>, x:List<Track>, θ:Global) -> Real {
  K:Integer <- 0;  // number of detections
  auto track <- x.walk();
  while track? {
    if track!.y.back().hasDistribution() {
      /* object is detected, compute proposal */
      K <- K + 1;
      q:Real[y.size()];
      n:Integer <- 1;
      auto detection <- y.walk();
      while detection? {
        q[n] <- track!.y.back().pdf(detection!);
        n <- n + 1;
      }
      Q:Real <- sum(q);
      ...

Lawrence Murray 21 / 24

slide-140
SLIDE 140

Multiple object model

      ...
      /* propose an association */
      if Q > 0.0 {
        q <- q/Q;
        n <~ Categorical(q);  // choose an observation
        yield track!.y.back().realize(y.get(n));  // likelihood
        yield -log(q[n]);  // proposal correction
        y.erase(n);  // remove the observation for future associations
      } else {
        yield -inf;  // detected, but all likelihoods (numerically) zero
      }
    }

    /* factor in prior probability of hypothesis */
    yield -lrising(y.size() + 1, K);  // prior correction
  }
  ...
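The proposal logic can be sketched in Python with a hypothetical scalar Gaussian predictive density per track: an observation index is drawn with probability proportional to its predictive density, and the returned log-weight combines the likelihood log q[n] with the proposal correction −log(q[n]/Q), so the net contribution is log Q regardless of which observation was drawn:

```python
import math
import random

def normal_pdf(y, mean, var):
    """Density of a univariate Gaussian N(mean, var)."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def propose_association(pred_mean, pred_var, observations, rng=random):
    """For one detected track with a hypothetical scalar Gaussian predictive
    density, choose an observation proportionally to that density. Returns
    (index, log-weight contribution), where the contribution is
    log q[n] - log(q[n]/Q) = log Q."""
    q = [normal_pdf(y, pred_mean, pred_var) for y in observations]
    Q = sum(q)
    if Q == 0.0:
        return None, float('-inf')   # detected, but all likelihoods zero
    probs = [qi / Q for qi in q]
    n = rng.choices(range(len(observations)), weights=q)[0]
    logw = math.log(q[n]) - math.log(probs[n])   # likelihood + proposal correction
    return n, logw

n, logw = propose_association(0.0, 1.0, [0.1, 2.5, -3.0])
```

Because the proposal correction exactly cancels the randomness of the draw, the weight depends only on the total predictive mass over the remaining observations, which keeps the importance weights well behaved.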

Lawrence Murray 22 / 24

slide-141
SLIDE 141

Multiple object model

  ...
  /* clutter */
  y.size() - 1 ~> Poisson(θ.μ);
  auto clutter <- y.walk();
  while clutter? {
    clutter! ~> Uniform(θ.l, θ.u);
  }
}

Lawrence Murray 23 / 24

slide-142
SLIDE 142

Summary

Getting started guide and tutorial available on the website:

birch-lang.org.

Papers

▶ L. M. Murray and T. B. Schön. Automated learning with a probabilistic programming language: Birch. Annual Reviews in Control, to appear, 2018. URL arxiv.org/abs/1810.01539

▶ L. M. Murray, D. Lundén, J. Kudlicka, D. Broman, and T. B. Schön. Delayed sampling and automatic Rao–Blackwellization of probabilistic programs. In Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS), Lanzarote, Spain, 2018. URL arxiv.org/abs/1708.07787

Lawrence Murray 24 / 24