SLIDE 1

Monolingual probabilistic programming using generalized coroutines

Oleg Kiselyov                      Chung-chieh Shan
FNMOC                              Rutgers University
oleg@pobox.com                     ccshan@cs.rutgers.edu

19 June 2009

SLIDE 2

This session . . .

programming formalism

SLIDES 3-5

This talk . . . is about knowledge representation

  Modular programming – Factored representation
  Expressive formalism – Informative prior
  Efficient implementation – Custom inference

SLIDES 6-10

Declarative probabilistic inference

Model (what) vs. inference (how):

Toolkit (BNT, PFP): the model invokes the toolkit's distributions, conditionalization, . . .
  + use existing libraries, types, debugger
  + easy to add custom inference

Language (BLOG, IBAL, Church): random choice, observation, . . . ; inference interprets the model
  + random variables are ordinary variables
  + compile models for faster inference

Today: best of both (invoke and interpret). Express models and inference as
interacting programs in the same general-purpose language.

Payoff: expressive model
  + models of inference: bounded-rational theory of mind

Payoff: fast inference
  + deterministic parts of models run at full speed
  + importance sampling

SLIDE 11

Outline

◮ Expressivity
    Memoization
    Nested inference
  Implementation
    Reifying a model into a search tree
    Importance sampling with look-ahead
  Performance

SLIDES 12-17

Grass model

[Bayes-net diagram: cloudy, rain, sprinkler, wet roof, wet grass]

let flip = fun p -> dist [(p, true); (1.-.p, false)]

let grass_model = fun () ->
  let cloudy    = flip 0.5 in
  let rain      = flip (if cloudy then 0.8 else 0.2) in
  let sprinkler = flip (if cloudy then 0.1 else 0.5) in
  let wet_roof  = flip 0.7 && rain in
  let wet_grass = flip 0.9 && rain || flip 0.9 && sprinkler in
  if wet_grass then rain else fail ()

normalize (exact_reify grass_model)

Models are ordinary code (in OCaml) using a library function dist.
Random variables are ordinary variables.
Inference applies to thunks and returns a distribution.
Deterministic parts of models run at full speed.
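
The slides use dist, fail, exact_reify, and normalize without showing how they are implemented; that machinery (coroutines) is the subject of the later slides. Purely as a standalone illustration of what the query above computes, here is the same grass model written against a small explicit weighted-list monad. The type wlist and the helpers return and bind are ours, not the paper's API, and wet_roof is left out only because nothing observes it.

type 'a wlist = (float * 'a) list          (* a distribution as a weighted list *)

let return x = [(1.0, x)]

let bind (d : 'a wlist) (k : 'a -> 'b wlist) : 'b wlist =
  List.concat_map (fun (p, x) -> List.map (fun (q, y) -> (p *. q, y)) (k x)) d

let flip p : bool wlist = [(p, true); (1. -. p, false)]
let fail () : 'a wlist = []                (* a failed observation: the branch vanishes *)

(* The grass model again, with the choice plumbing made explicit. *)
let grass_model : bool wlist =
  bind (flip 0.5)                            (fun cloudy ->
  bind (flip (if cloudy then 0.8 else 0.2))  (fun rain ->
  bind (flip (if cloudy then 0.1 else 0.5))  (fun sprinkler ->
  bind (flip 0.9)                            (fun w1 ->
  bind (flip 0.9)                            (fun w2 ->
  if (w1 && rain) || (w2 && sprinkler) then return rain else fail ())))))

(* Normalizing the surviving probability mass gives Pr(rain | wet grass). *)
let () =
  let mass   = List.fold_left (fun s (p, _) -> s +. p) 0. grass_model in
  let p_rain = List.fold_left (fun s (p, v) -> if v then s +. p else s) 0. grass_model in
  Printf.printf "Pr(rain | wet grass) = %.4f\n" (p_rain /. mass)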

SLIDES 18-20

Models as programs in a general-purpose language

Reuse existing infrastructure!

◮ Rich libraries: lists, arrays, database access, I/O, . . .
◮ Type inference
◮ Functions as first-class values
◮ Compiler
◮ Debugger
◮ Memoization

Express Dirichlet processes, etc. (Goodman et al. 2008)
Speed up inference using lazy evaluation: bucket elimination, sampling w/ memoization (Pfeffer 2007)
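
The memoization combinator itself is not shown on the slides. As a rough standalone illustration of the idea, delaying a random choice until it is first needed and then remembering it, something like the following would do; the name letlazy and its details are our assumption, and the paper's version must additionally survive the thread cloning described on slide 34 (e.g. by keeping the cell in thread-local storage).

let letlazy (thunk : unit -> 'a) : unit -> 'a =
  let cell = ref None in
  fun () ->
    match !cell with
    | Some v -> v                                        (* already forced: reuse it *)
    | None   -> let v = thunk () in cell := Some v; v    (* first use: run and remember *)

(* e.g.  let coin = letlazy (fun () -> flip 0.5)  draws the flip only when
   coin () is first called, and every later call sees the same value. *)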

SLIDES 21-26

Nested inference

Choose a coin that is either fair or completely biased for true.

Let p be the probability that flipping the coin yields true.
What is the probability that p is at least 0.3?  Answer: 1.

exact_reify (fun () ->
  let biased = flip 0.5 in
  let coin = fun () -> flip 0.5 || biased in
  at_least 0.3 true (exact_reify coin))

Now estimate p by flipping the coin twice instead.  What is the probability
that our estimate of p is at least 0.3?  Answer: 7/8.

exact_reify (fun () ->
  let biased = flip 0.5 in
  let coin = fun () -> flip 0.5 || biased in
  at_least 0.3 true (sample 2 coin))

Returns a distribution, not just a nested query (Goodman et al. 2008).
Inference procedures are OCaml code using dist, like models.
Works with observation, recursion, memoization.
Bounded-rational theory of mind without interpretive overhead.
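
To make the arithmetic behind the two answers concrete, here is the same nested query spelled out in the standalone weighted-list style introduced after the grass model (again an illustration, not the paper's exact_reify/sample/at_least API): the outer model draws biased, computes the chosen coin's distribution as an inner inference, and asks whether the inner result clears 0.3. The helpers are re-declared so the snippet stands alone.

type 'a wlist = (float * 'a) list
let return x = [(1.0, x)]
let bind d k =
  List.concat_map (fun (p, x) -> List.map (fun (q, y) -> (p *. q, y)) (k x)) d
let flip p : bool wlist = [(p, true); (1. -. p, false)]
let prob_true d = List.fold_left (fun s (p, x) -> if x then s +. p else s) 0. d

(* Question 1: exact inner inference.  Whichever coin was chosen, its chance
   of true is 0.5 or 1.0, so the outer probability is 1. *)
let exact_version : bool wlist =
  bind (flip 0.5) (fun biased ->
    let coin = bind (flip 0.5) (fun h -> return (h || biased)) in
    return (prob_true coin >= 0.3))

(* Question 2: the inner estimate comes from two flips of the chosen coin.
   The slide samples those flips; enumerating them exactly gives the same 7/8. *)
let estimate_version : bool wlist =
  bind (flip 0.5) (fun biased ->
    let coin = bind (flip 0.5) (fun h -> return (h || biased)) in
    bind coin (fun c1 ->
    bind coin (fun c2 ->
      let estimate = (if c1 then 0.5 else 0.0) +. (if c2 then 0.5 else 0.0) in
      return (estimate >= 0.3))))

let () =
  Printf.printf "exact: %.3f   two flips: %.3f\n"
    (prob_true exact_version) (prob_true estimate_version)
  (* prints 1.000 and 0.875, matching the slides' answers of 1 and 7/8 *)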

SLIDE 27

Outline

  Expressivity
    Memoization
    Nested inference
◮ Implementation
    Reifying a model into a search tree
    Importance sampling with look-ahead
  Performance

SLIDES 28-33

Reifying a model into a search tree

[Search-tree diagram: the model's choices form a tree of weighted branches
(probabilities such as .8, .2, .3, .5, .6) whose leaves are true and false;
subtrees are marked "open" until explored and "closed" afterwards.]

Exact inference by depth-first brute-force enumeration.
Rejection sampling by top-down random traversal.

reify / reflect: reify turns a model (a thunk of type unit -> bool) into such
a tree; reflect turns a tree back into a model.

Inference procedures cannot access models' source code.  Reify then reflect:

◮ Brute-force enumeration becomes bucket elimination
◮ Sampling becomes particle filtering
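
One plausible way to write down such a search tree as an OCaml datatype (a sketch for illustration; the paper's actual type and constructor names may differ): a node is a weighted list of branches, each branch either a finished leaf or a still-open subtree hidden behind a thunk. Depth-first exact enumeration is then a short recursion.

type 'a tree = (float * 'a branch) list
and 'a branch =
  | Leaf of 'a                     (* a closed branch: the model returned this value *)
  | Open of (unit -> 'a tree)      (* an open branch: a subtree not yet explored *)

(* Depth-first brute-force enumeration (the slide's exact inference):
   force every open branch and multiply probabilities along each path. *)
let rec explore (t : 'a tree) : (float * 'a) list =
  List.concat_map
    (fun (p, b) ->
       match b with
       | Leaf v  -> [(p, v)]
       | Open th -> List.map (fun (q, v) -> (p *. q, v)) (explore (th ())))
    t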

SLIDE 34

Reifying a model into a search tree

[Search-tree diagram as on the previous slides.]

Implementation: represent a probability and state monad
(Giry 1982, Moggi 1990, Filinski 1994)
using first-class delimited continuations
(Strachey & Wadsworth 1974, Felleisen et al. 1987, Danvy & Filinski 1989).

Implementation: using clonable user-level threads

◮ Model runs inside a thread.
◮ dist clones the thread.
◮ fail kills the thread.
◮ Memoization mutates thread-local storage.
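
The paper's dist has to split one run of the model into several weighted branches, which is why it reaches for delimited continuations or clonable threads. Purely as an illustration of the direct style, and emphatically not the paper's implementation, the one-branch-at-a-time case (rejection sampling by top-down random traversal) can be sketched with OCaml 5 effect handlers; their continuations are one-shot, so they cannot express the full reification above, but a sampler only ever resumes each choice once.

open Effect
open Effect.Deep

type _ Effect.t += Dist : (float * 'a) list -> 'a Effect.t
exception Fail

let dist choices = perform (Dist choices)
let fail () = raise Fail
let flip p = dist [(p, true); (1. -. p, false)]

(* Run the model once, answering each dist with a weighted random choice;
   an observation failure rejects the whole run. *)
let sample_once (model : unit -> 'a) : 'a option =
  let pick choices =
    let total = List.fold_left (fun s (p, _) -> s +. p) 0. choices in
    let r = Random.float total in
    let rec go acc = function
      | [(_, v)] -> v
      | (p, v) :: rest -> if r < acc +. p then v else go (acc +. p) rest
      | [] -> assert false
    in
    go 0. choices
  in
  match_with (fun () -> Some (model ())) ()
    { retc = (fun v -> v);
      exnc = (function Fail -> None | e -> raise e);
      effc = (fun (type a) (eff : a Effect.t) ->
        match eff with
        | Dist choices ->
            Some (fun (k : (a, _) continuation) -> continue k (pick choices))
        | _ -> None) }

let () =
  Random.self_init ();
  let model () =
    let rain = flip 0.3 in
    let sprinkler = flip 0.5 in
    if rain || sprinkler then rain else fail () in
  for _ = 1 to 5 do
    match sample_once model with
    | Some b -> Printf.printf "accepted: rain = %b\n" b
    | None   -> print_endline "rejected"
  done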

SLIDES 35-42

Importance sampling with look-ahead

[Search-tree diagram, explored step by step: as branches are expanded and
closed, the remaining open probability mass pc drops from 1 to .75 to 0, and
the weighted samples (.2, false) and (.6, true) are reported along the way.]

  1. Expand one level.
  2. Report shallow successes.
  3. Expand one more level and tally open probability.
  4. Randomly choose a branch and go back to 2.
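
A rough rendering of these four steps over the Leaf/Open tree type sketched earlier, purely for illustration: the slides do not give the algorithm's stopping rule or weighting, so the particular weight rescaling below is our own choice, made so that the emitted samples stay unbiased.

type 'a tree = (float * 'a branch) list
and 'a branch = Leaf of 'a | Open of (unit -> 'a tree)

let rec lookahead_sample (w : float) (t : 'a tree)
                         (emit : float -> 'a -> unit) : unit =
  (* Step 2: report shallow successes as weighted samples. *)
  List.iter
    (fun (p, b) -> match b with Leaf v -> emit (w *. p) v | Open _ -> ()) t;
  (* Step 3: expand each open branch one level and tally how much probability
     mass survives there (failed branches simply vanish). *)
  let expanded =
    List.filter_map
      (fun (p, b) ->
         match b with
         | Leaf _ -> None
         | Open th ->
             let sub = th () in
             let m = List.fold_left (fun s (q, _) -> s +. q) 0. sub in
             if m > 0. then Some (p *. m, m, sub) else None)
      t
  in
  let total = List.fold_left (fun s (pm, _, _) -> s +. pm) 0. expanded in
  if total > 0. then begin
    (* Step 4: choose one open branch in proportion to its tallied mass,
       rescale the importance weight, and go back to step 2. *)
    let r = Random.float total in
    let rec pick acc = function
      | [(_, m, sub)] -> (m, sub)
      | (pm, m, sub) :: rest ->
          if r < acc +. pm then (m, sub) else pick (acc +. pm) rest
      | [] -> assert false
    in
    let m, sub = pick 0. expanded in
    lookahead_sample (w *. total /. m) sub emit
  end

Averaging the emitted (weight, value) pairs over many calls such as lookahead_sample 1.0 reified_model emit then estimates the model's distribution; the paper's precise look-ahead and termination details are not spelled out on the slides.
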
SLIDE 43

Outline

  Expressivity
    Memoization
    Nested inference
  Implementation
    Reifying a model into a search tree
    Importance sampling with look-ahead
◮ Performance

SLIDES 44-49

Motivic development in Beethoven sonatas (Pfeffer 2007)

[Musical notation: a source motif and a destination motif; an arrow labelled
"infer" relates the two.]

Implemented using lazy stochastic lists.

% correct using importance sampling:

  Motif pair              1     2     3     4     5     6     7
  Pfeffer 2007 (30 sec)   93   100    28    80    98   100    63
  This paper   (90 sec)   98   100    29    87    94   100    77
  This paper   (30 sec)   92    99    25    46    72    95    61
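
The slide only names the data structure. As a purely hypothetical illustration of what a "lazy stochastic list" could look like (our guess, not the paper's definition): a list whose tail is produced on demand, so a model over motifs of unbounded length only materializes the prefix that inference inspects. A real model would make its choices with dist/flip rather than Random.

type 'a slist = Nil | Cons of 'a * 'a slist Lazy.t

(* Example: a random walk over pitches that continues with probability 0.9. *)
let rec walk (pitch : int) : int slist =
  Cons (pitch,
        lazy (if Random.float 1.0 < 0.9
              then walk (pitch + Random.int 3 - 1)
              else Nil))

(* Force only the first n notes. *)
let rec take n = function
  | Nil -> []
  | Cons (x, rest) -> if n <= 0 then [] else x :: take (n - 1) (Lazy.force rest)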

SLIDE 50

Motivic development in Beethoven sonatas (Pfeffer 2007)

[Histogram: frequency in 100 trials of the estimated ln Pr(D = 1 | S = 1),
roughly -19 to -13, comparing IBAL with this paper's sampler given 90 seconds
and 30 seconds.]

SLIDES 51-54

Noisy radar blips for aircraft tracking (Milch et al. 2007)

[Radar display: blips present and absent; the inferred distribution over the
number of planes, initially spread over 1 to 7, narrows as observations at
t = 1, t = 2, t = 3 accumulate.]

Particle filter.  Implemented using lazy stochastic coordinates.

SLIDE 55

Summary

Model (what) vs. inference (how):

Toolkit
  + use existing libraries, types, debugger
  + easy to add custom inference
Language
  + random variables are ordinary variables
  + compile models for faster inference

Today: best of both. Express models and inference as interacting programs
in the same general-purpose language.

Payoff: expressive model
  + models of inference: bounded-rational theory of mind

Payoff: fast inference
  + deterministic parts of models run at full speed
  + importance sampling