SLIDE 1

Probabilistic Programming in Birch

www.birch-lang.org
Lawrence Murray
Department of Information Technology, Uppsala University

Outline

  • 1. Graphical models → probabilistic programs.
  • 2. Birch: motivation and design.
  • 3. Birch: language features.
SLIDE 2

1 Graphical models → probabilistic programs

Lawrence Murray 1 / 30

SLIDE 3

Graphical models

[Figure: (a) a directed and (b) an undirected graphical model over the variables A, B, C, D, E.]

Lawrence Murray 2 / 30

SLIDE 4

Graphical models

[Figure: the same model shown (a) without plate notation, over x and y1, y2, y3, and (b) with plate notation, over x and yn for n = 1, ..., 3.]

Lawrence Murray 3 / 30

SLIDE 5

Graphical models

Figure: S. Höhna, M. J. Landis, T. A. Heath, B. Boussau, N. Lartillot, B. R. Moore, J. P. Huelsenbeck, and F. Ronquist. RevBayes: Bayesian phylogenetic inference using graphical models and an interactive model-specification language. Systematic Biology, 65(4):726–736, 2016. doi: 10.1093/sysbio/syw021

Lawrence Murray 4 / 30

SLIDE 6

Graphical models

Figure: Benwing, https://commons.wikimedia.org/wiki/File:Bayesian-gaussian-mixture.svg

Lawrence Murray 5 / 30


SLIDE 11

Graphical models → probabilistic programs

The most expressive languages are known as universal (also known as Turing complete). Models written in such languages are universal probabilistic programs. These are the most expressive languages for model specification, but also the most difficult for which to do inference.

Lawrence Murray 6 / 30
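To make this concrete, here is a minimal illustrative sketch, not taken from the slides, of the kind of model that needs a universal language: the number of latent draws is itself random, so the model has no fixed-size graphical structure. The syntax imitates the Birch examples later in the deck; the choice of Poisson and Gaussian distributions and the variable names are assumptions for illustration.

  // Illustrative sketch only: stochastic control flow.
  n:Integer;
  n <~ Poisson(5.0);          // the number of draws is itself random
  for (i in 1..n) {
    z:Real;
    z <~ Gaussian(0.0, 1.0);  // a random number of latent draws, unknown until runtime
  }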


SLIDE 13

Graphical models → probabilistic programs

An alternative perspective on probabilistic programming is that it is a programming paradigm for probabilistic modelling and inference.

▶ Other programming paradigms include object-oriented programming, generic programming, procedural programming, functional programming, etc.

▶ From this perspective, probabilistic programming languages merely emphasise this particular programming paradigm, providing ergonomic features for writing probabilistic models and probabilistic inference methods.

Lawrence Murray 7 / 30

SLIDE 14

2 Birch: motivation and design

Lawrence Murray 8 / 30

SLIDE 15

Birch

▶ Universal probabilistic programming language (PPL).

▶ Supports procedural, generic, object-oriented, and (of course) probabilistic programming paradigms.

▶ Both models and methods are written in the Birch language itself.

▶ Draws inspiration from many places, including existing PPLs such as LibBi (www.libbi.org), and modern object-oriented languages such as Swift.

▶ Free and open source, under the Apache 2.0 license.

▶ See birch-lang.org

Lawrence Murray 9 / 30

SLIDE 16

Technical details

▶ Dynamic memory management with reference-counted garbage collection.

▶ Compiles to C++14, then to native binaries.

▶ Uses standard C/C++ libraries for numerical computing, e.g. STL, Boost, Eigen.

▶ C/C++ code can be nested in Birch code to allow tight integration.

Lawrence Murray 10 / 30

SLIDE 17

Birch → C++14

(a) C++14 provides a lot of things we would like to quarantine.
(b) Most Birch code translates directly to C++14, e.g. object model, higher-order functions, user-defined conversions.
(c) Some Birch code translates to verbose or intrusive C++14 that one would not want to code by hand, e.g. probabilistic operators, fibers, copy-on-write.

[Figure: diagram relating Birch and C++14, with regions labelled (a), (b) and (c).]

Lawrence Murray 11 / 30



SLIDE 22

Models in Birch

In Birch, a model is specified by writing a program that simulates from the joint distribution.

▶ In many other PPLs, there is a distinction between which variables are observed and which are latent within the program.

▶ i.e. the program already factors the joint distribution into likelihood and prior.

▶ In Birch, the preference is to distinguish which variables are observed and which are latent at runtime.

▶ i.e. at runtime, the user, or the inference method, chooses which conditionals or marginals of the joint distribution are of interest.

▶ (Ideally, at least, as this is not always possible.)

Lawrence Murray 12 / 30
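As a minimal sketch of what this means in practice (this fragment is illustrative and not taken from the slides; μ and σ2 are assumed to be defined elsewhere), the same statement can serve either role depending on whether the variable already holds a value when it is reached:

  y:Random<Real>;
  // If y was given a value before this point (e.g. loaded from data),
  // the next line observes that value; if y is still empty, it is
  // treated as latent and a value can be simulated for it instead.
  y ~ Gaussian(μ, σ2);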

SLIDE 23

Example: Bayesian linear regression model

class LinearRegressionModel < Model {
  X:Real[_,_];
  σ2:Random<Real>;
  β:Random<Real[_]>;
  y:Random<Real[_]>;

  fiber simulate() -> Real {
    N:Integer <- rows(X);
    P:Integer <- columns(X);
    if (N > 0 && P > 0) {
      σ2 ~ InverseGamma(3.0, 0.4);
      β ~ Gaussian(vector(0.0, P), identity(P)*σ2);
      y ~ Gaussian(X*β, σ2);
    }
  }
}

Lawrence Murray 13 / 30

SLIDE 24

Example: linear-Gaussian state-space model

class LinearGaussianSSM = MarkovModel<LinearGaussianSSMState, LinearGaussianSSMParameter>;

class LinearGaussianSSMParameter < Parameter {
  a:Real <- 0.8;
  σ2_x:Real <- 1.0;
  σ2_y:Real <- 0.1;
}

class LinearGaussianSSMState < State {
  x:Random<Real>;
  y:Random<Real>;

  fiber initial(θ:LinearGaussianSSMParameter) -> Real {
    x ~ Gaussian(0.0, θ.σ2_x);
    y ~ Gaussian(x, θ.σ2_y);
  }

Lawrence Murray 14 / 30

SLIDE 25

Example: linear-Gaussian state-space model

  fiber transition(z:LinearGaussianSSMState, θ:LinearGaussianSSMParameter) -> Real {
    x ~ Gaussian(θ.a*z.x, θ.σ2_x);
    y ~ Gaussian(x, θ.σ2_y);
  }
}

Lawrence Murray 15 / 30

SLIDE 26

Example: nonlinear state-space model

class SIRModel = MarkovModel<SIRState,SIRParameter>;

class SIRParameter < Parameter {
  λ:Random<Real>;
  δ:Random<Real>;
  γ:Random<Real>;

  fiber parameter() -> Real {
    λ <- 10.0;
    δ ~ Beta(2.0, 2.0);
    γ ~ Beta(2.0, 2.0);
  }
}

class SIRState < State {
  τ:Random<Integer>;
  Δi:Random<Integer>;
  Δr:Random<Integer>;

Lawrence Murray 16 / 30

SLIDE 27

Example: nonlinear state-space model

  s:Random<Integer>;
  i:Random<Integer>;
  r:Random<Integer>;

  fiber transition(x:SIRState, θ:SIRParameter) -> Real {
    τ ~ Binomial(x.s, 1.0 - exp(-θ.λ*x.i/(x.s + x.i + x.r)));
    Δi ~ Binomial(τ, θ.δ);
    Δr ~ Binomial(x.i, θ.γ);
    s ~ Delta(x.s - Δi);
    i ~ Delta(x.i + Δi - Δr);
    r ~ Delta(x.r + Δr);
  }
}

Lawrence Murray 17 / 30

SLIDE 28

Models in Birch

▶ Knowing something about the structure of a model may help tailor the inference algorithm, so it will be useful if programs reveal something of this.

▶ One option is static analysis, but this is hard.

▶ The approach at this stage is for it to be the programmer’s responsibility to reveal this by construction, e.g. using the MarkovModel class.

▶ Details are still developing.

Lawrence Murray 18 / 30

SLIDE 29

Methods in Birch

Inference methods are also written in the Birch language.

▶ Currently available are:
  ▶ Analytical solutions
  ▶ Importance sampling
  ▶ Bootstrap particle filter
  ▶ Alive particle filter
  ▶ Auxiliary particle filter (automated)
  ▶ Rao–Blackwellized particle filter (automated)

▶ Not far off are:
  ▶ Particle MCMC methods
  ▶ Other MCMC methods.

Lawrence Murray 19 / 30

SLIDE 30

3 Birch: language features

Lawrence Murray 20 / 30

SLIDE 31

Optionals

Optionals allow variables to have a value of a particular type, or no value at all.

▶ They are used in other programming languages (e.g. Swift) to eliminate boilerplate that checks for null values, e.g. a function checking its arguments.

▶ In Birch, they are used for the same purpose, but also in a second role: to represent missing values.

Lawrence Murray 21 / 30
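A minimal sketch of how an optional might look, assuming Birch follows Swift-style optional syntax (a ? suffix on the type, x? to test for a value, x! to unwrap); the variable name is illustrative:

  x:Real?;              // an optional Real with no value yet (e.g. a missing observation)
  if (x?) {             // does x currently have a value?
    stdout.print(x!);   // unwrap and use it
  } else {
    // handle the missing value
  }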

SLIDE 32

Randoms

Randoms are optionals to which a probability distribution can be attached.

▶ When they don’t have a value, the probability distribution can be used to automatically simulate a value.

▶ Once a random has a value, that value is final; it cannot be overwritten.

Lawrence Murray 22 / 30
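For instance, a minimal sketch mirroring the delayed sampling example later in the deck:

  x:Random<Real>;
  x ~ Gaussian(0.0, 1.0);  // attach a distribution; x has no value yet
  stdout.print(x);         // a value is needed here, so one is simulated;
                           // from this point on the value of x is final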

SLIDE 33

Delayed sampling

▶ Randoms are essential for the delayed sampling mechanism within Birch.

▶ This is a heuristic algorithm for performing analytical optimizations at runtime.

▶ It automatically yields optimizations such as variable elimination/collapsing, Rao–Blackwellization and locally-optimal proposals. See:

L. M. Murray, D. Lundén, J. Kudlicka, D. Broman, and T. B. Schön. Delayed sampling and automatic Rao–Blackwellization of probabilistic programs. Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS), 2018. URL https://arxiv.org/abs/1708.07787

Lawrence Murray 23 / 30

SLIDES 34–55

Delayed sampling example

Code:

  x ~ Gaussian(0.0, 1.0);
  for (n in 1..N) {
    y[n] ~ Gaussian(x, 1.0);
  }
  stdout.print(x);

Checkpoints stepped through in the animation: assume x; then observe y[n] for each n = 1, ..., 5; finally value x at the print statement.

[Figure: graph over x and y[1], ..., y[5], updated at each checkpoint; the observations are incorporated into the distribution of x one by one, and x is only simulated at the final value checkpoint.]

Lawrence Murray 24 / 30

SLIDES 56–87

Delayed sampling: Kalman Filter

Code:

  x[1] ~ Gaussian(0.0, 1.0);
  y[1] ~ Gaussian(x[1], 1.0);
  for (t in 2..T) {
    x[t] ~ Gaussian(a*x[t - 1], 1.0);
    y[t] ~ Gaussian(x[t], 1.0);
  }
  stdout.print(x[1]);

Checkpoints stepped through in the animation: assume x[1]; observe y[1]; then, for each t, assume x[t] and observe y[t]; finally value x[1] at the print statement.

[Figure: graph over x[1], ..., x[5] and y[1], ..., y[5], updated at each checkpoint. The forward sweep of assume/observe checkpoints corresponds to a Kalman filter; the final value checkpoint on x[1] then triggers simulation back along the chain so that x[1] can be printed.]

Lawrence Murray 25 / 30

SLIDES 88–144

Delayed sampling: Rao–Blackwellized Particle Filter

[Figure: graph for a state-space model with a nonlinear substate x_n[t] and a linear substate x_l[t], with observations y_n[t] and y_l[t], for t = 1, ..., 5. The animation steps through the same assume/observe/value checkpoints as before: the nonlinear substates are simulated while the linear substates and their observations are handled analytically, so that delayed sampling yields a Rao–Blackwellized particle filter.]

Lawrence Murray 26 / 30

SLIDE 145

Fibers

Fibers (also known as coroutines elsewhere) are like functions, but their execution can be paused and resumed.

▶ A function, when called, executes to completion and returns a value to the caller.

▶ A fiber, when called, executes to its first pause point and yields a value to the caller. The caller can then proceed with some other computation. Later, the caller may resume the fiber; it will execute to its next pause point and yield another value to the caller, and so on.

Lawrence Murray 27 / 30
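A minimal sketch of the idea, assuming an explicit yield statement is available for pausing a fiber (the fiber name and body are illustrative, not taken from the slides):

  fiber count(N:Integer) -> Integer {
    for (n in 1..N) {
      yield n;  // pause here, handing n back to the caller;
                // the caller may later resume the fiber to receive the next value
    }
  }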

SLIDE 146

Fibers

▶ In Birch, fibers are used to simulate a probabilistic model. Each time an observation is encountered, the fiber pauses and yields a weight.

▶ This is a key ingredient for many inference methods (e.g. Sequential Monte Carlo).

▶ Fibers can be replicated. When resumed, replicated fibers proceed independently.

▶ A copy-on-write mechanism is used to minimise copying when replicating fibers.

▶ Can also be useful for prospective computation, e.g. anything with an accept/reject step.

Lawrence Murray 28 / 30
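A sketch of how an inference method might drive such a fiber. This driver is hypothetical (the m.simulate() call, the f?/f! query syntax and the auto declaration are assumptions, not taken from the slides); it only illustrates resuming a model fiber and accumulating the log-weights it yields at each observation:

  w:Real <- 0.0;            // accumulated log-weight
  auto f <- m.simulate();   // hypothetical: obtain the model's simulate fiber
  while (f?) {              // hypothetical: resume; true while the fiber yields
    w <- w + f!;            // add the log-weight yielded at the latest observation
  }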

SLIDE 147

Probabilistic operators

Optionals, randoms and fibers come together in the probabilistic operators of Birch. These are:

a <~ b   simulate the distribution b and assign the value to a,
a ~> b   observe the value a with distribution b and yield its log-likelihood from the current fiber,
a ~ b    if a has a value then observe it, otherwise simulate it (perhaps lazily).

Lawrence Murray 29 / 30
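A small sketch putting the three operators side by side (the variable names are illustrative, and it is assumed that b already holds a value at the point where it is observed):

  a:Random<Real>;
  b:Random<Real>;
  c:Random<Real>;

  a <~ Gaussian(0.0, 1.0);  // simulate: draw a value from the distribution and assign it to a
  b ~> Gaussian(0.0, 1.0);  // observe: yield the log-likelihood of b's existing value from the current fiber
  c ~ Gaussian(0.0, 1.0);   // if c already has a value this observes it;
                            // otherwise a value is simulated (perhaps lazily)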

SLIDE 148

Looking ahead

▶ Current focus is pilot applications.

▶ Near ahead is adding new inference methods.

▶ Further ahead is performance tuning and parallelism.

Getting started guide and tutorial available on the website: birch-lang.org.

Lawrence Murray 30 / 30