Truncated Random Measures
Jonathan Huggins
MIT CSAIL and Dept. of EECS with: T. Campbell, J. How, T. Broderick
What leads to a statistical method being used for science?
Need models that can extract new, useful information from infinite streams of data
e.g. keep learning new topics from a stream of documents
Bayesian nonparametrics: achieves growing model size via infinite parameters
[Gopalan 2014] movies
[Teh 2006] text
[Huang 2014] medicine
[Michini 2015] robotics
[Lennox 2010] genetics
[Prunster 2014] finance
[Yang 2015] astronomy
[Yu 2012] traffic
[Ozaki 2008] agriculture
[Kottas 2008] pathology
hard work! automate inference with probabilistic programming
issues: we care about the parameters; we use approximations (HMC/VB); distributed computation
with e.g. variational inference, HMC [Blei 06; Neal 10]
Problem: Wide variety of priors in BNP with no finite approximation
Contributions:
[Venn diagram: previously studied priors with finite approx (past work) ⊂ priors with finite approx (new) ⊂ all BNP priors]
[Table: for each prior — DP, BP, BPP, 𝚫P, and general (N)CRMs — we provide a finite approximation, approximation error bounds, and computational complexity. Prior work: [Sethuraman 94], [Bondesson 82], [Ishwaran 01], [Teh 07], [Thibaux 07], [Doshi-Velez 09], [Paisley 12], [Broderick 14], [Roychowdhury 15].]
Tractable models in BNP · two forms for sequential representations · truncation and error analysis
[Figure: topic modeling example. Doc 1 (532 words), Doc 2 (210 words), Doc 3 (854 words), Doc 4 (926 words) are each a mix of topics; topic space (e.g. "sports", "politics") is paired with frequency space (e.g. 0.7, 0.5, 0.2, …).]
ϴ is a random discrete measure
[Figure: the general view. Topics become "traits" (ψ1, ψ2, ψ3, … in trait space) and frequencies become "rates" (θ1, θ2, θ3, … in rate space); Obs 1–4 each draw on the traits.]
How do we generate infinitely many trait/rate points (𝜔, 𝜄)?
Poisson point process with measure 𝜉(d𝜄 × d𝜔) [Kingman 93]
→ completely random measure (CRM) (e.g. BP, 𝚫P)
Normalize rates → normalized CRM (NCRM) (e.g. DP)
Captures a large class of useful priors in BNP
How do we pick a finite subset of the points?
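On a bounded window of trait space × rate space, the Poisson point process above is easy to simulate directly (the full CRM has infinitely many atoms only in the limit, and a general measure 𝜉 would replace the uniform draws here). A minimal homogeneous sketch; the window and all parameter names are illustrative assumptions:

```python
import numpy as np

def poisson_point_process(intensity, trait_width=1.0, rate_window=(0.1, 5.0), rng=None):
    # Homogeneous Poisson point process on a bounded window of
    # trait space x rate space (illustrative window, not from the slides).
    rng = np.random.default_rng(rng)
    lo, hi = rate_window
    area = trait_width * (hi - lo)
    n = rng.poisson(intensity * area)                # point count ~ Poisson(intensity * area)
    traits = rng.uniform(0.0, trait_width, size=n)   # given n, points are i.i.d. uniform
    rates = rng.uniform(lo, hi, size=n)
    return traits, rates

traits, rates = poisson_point_process(intensity=10.0, rng=0)
```

The key Poisson-process structure is that the point count is Poisson-distributed and, conditionally on the count, the points are i.i.d. from the (normalized) intensity on the window.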
Tractable models in BNP · two forms for sequential representations · truncation and error analysis
We pick a finite subset of atoms (𝜔,𝜄) by:
1) ordering the atoms (sequential representation)
2) removing any atoms beyond the K-th (truncation)
[Figure: atoms in trait space × rate space, labeled 1, 2, 3, 4, …, K in order; atoms beyond the K-th are dropped.]
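A familiar concrete instance of "order, then truncate" is the stick-breaking representation of the DP [Sethuraman 94]: atom weights are generated in sequence, and truncation keeps the first K. A minimal sketch (variable names are mine):

```python
import numpy as np

def truncated_dp_weights(alpha, K, rng=None):
    # Stick-breaking: v_k ~ Beta(1, alpha); weight_k = v_k * prod_{j<k} (1 - v_j).
    # Generating atoms in this order and stopping after K atoms is exactly
    # "sequential representation + truncation".
    rng = np.random.default_rng(rng)
    v = rng.beta(1.0, alpha, size=K)
    leftover = np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))  # stick remaining before step k
    return v * leftover

w = truncated_dp_weights(alpha=2.0, K=50, rng=0)
# w sums to less than 1: the missing mass is what truncation discarded
```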
We describe 2 forms for sequential representations
Series representation function of a homogenous Poisson point process (4 versions) We describe 2 forms for sequential representations
Superposition representation infinite sum of homogenous CRMs, each with finite # of atoms (3 versions)
Series representation function of a homogenous Poisson point process (4 versions) We describe 2 forms for sequential representations
Superposition representation infinite sum of homogenous CRMs, each with finite # of atoms (3 versions)
Series representation function of a homogenous Poisson point process (4 versions) We describe 2 forms for sequential representations
Theorem (H., Campbell, How, Broderick). Can generate (N)CRMs using all 7 sequential representations
Why so many representations?
They’re all useful in different circumstances
[Table: comparison of the seven representations — series reps B-Rep, IL-Rep, R-Rep, T-Rep and superposition reps DB-Rep, PL-Rep, SB-Rep — on error bound decay (exponential for four of them), ease of analysis, generality, and known # of atoms.]
Worked example, given a Gamma process:
Step 1: compute [equation on slide]
Step 2: compute [equation on slide] — an Exponential(𝜇) density!
Step 3: plug in!
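A sequential representation of the Gamma process can also be sketched in code. One Bondesson-type series representation (my assumed parameterization, not taken from the slides: rate measure ν(dθ) = γ θ⁻¹ e^{−λθ} dθ) draws weights θ_k = V_k·exp(−Γ_k/γ) with V_k ~ Exp(λ) i.i.d. and Γ_k the arrival times of a unit-rate Poisson process:

```python
import numpy as np

def gamma_process_bondesson(gamma, lam, K, rng=None):
    # First K atom weights of a Gamma process via a Bondesson-type series
    # representation (assumed parameterization; see lead-in above).
    rng = np.random.default_rng(rng)
    arrivals = np.cumsum(rng.exponential(1.0, size=K))  # Gamma_1 < Gamma_2 < ... (Poisson arrivals)
    v = rng.exponential(scale=1.0 / lam, size=K)        # V_k ~ Exp(rate = lam)
    return v * np.exp(-arrivals / gamma)                # weights decay stochastically in k

theta = gamma_process_bondesson(gamma=3.0, lam=1.0, K=2000, rng=0)
# For large K, theta.sum() approximates one draw of the process's total mass
```

Because the arrival times Γ_k grow linearly in k, the weights decay geometrically on average, which is what makes truncating at a moderate K reasonable.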
Tractable models in BNP · two forms for sequential representations · truncation and error analysis
How close is our finite approximation?
Truncation error ε: compare the distribution of the data generated under the full infinite ϴ vs. under the truncated ϴK
Depends on the number of observations N and the truncation level K:
as N gets larger, the error increases; as K gets larger, the error decreases
Cannot evaluate exactly, so we develop new upper bounds
Lemma (H., Campbell, How, Broderick). The truncation error $\tfrac{1}{2}\|p_{N,\infty} - p_{N,K}\|_1$ is bounded by the probability that the truncated model fails to generate the data — i.e. P( whoops! ). This lemma leads to all the other truncation error bounds in this work.
Theorem (HCHB). The series rep error is bounded by
$\tfrac{1}{2}\|p_{N,\infty} - p_{N,K}\|_1 \le 1 - \exp\left(-\int_0^\infty \mathbb{E}\left[\bar\pi(\tau(V, u+G_K))^N\right] du\right)$
Theorem (HCHB). The superposition rep error is bounded by
$\tfrac{1}{2}\|p_{N,\infty} - p_{N,K}\|_1 \le 1 - \exp\left(-\int \bar\pi(\theta)^N \, \nu_K^+(d\theta)\right)$
Worked example, given a Gamma-Poisson process:
Step 1: bound the integral, where $G_K \sim \mathrm{Gamma}(K, c)$ (integration by parts, then a Gamma expectation)
Step 2: plug in!
$\tfrac{1}{2}\|p_{N,\infty} - p_{N,K}\|_1 \le 1 - \exp\left\{-N\gamma \left(\frac{\gamma\lambda}{1+\gamma\lambda}\right)^K\right\} \sim N\gamma \left(\frac{\gamma\lambda}{1+\gamma\lambda}\right)^K, \quad K \to \infty$
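A closed-form bound like this makes choosing K mechanical: since the bound decays geometrically in K, one can scan for the smallest truncation level meeting an error budget. A sketch using the Gamma-Poisson bound 1 − exp{−Nγ (γλ/(1+γλ))^K} (constants as I read them off the slide; treat them as assumptions):

```python
import math

def gp_truncation_bound(N, gamma, lam, K):
    # 1/2 * ||p_{N,inf} - p_{N,K}||_1 <= 1 - exp(-N * gamma * r**K),
    # with r = gamma*lam / (1 + gamma*lam) < 1, so the bound
    # decays geometrically in the truncation level K.
    r = gamma * lam / (1.0 + gamma * lam)
    return 1.0 - math.exp(-N * gamma * r**K)

def smallest_K(N, gamma, lam, eps):
    # Geometric decay guarantees this loop terminates.
    K = 0
    while gp_truncation_bound(N, gamma, lam, K) > eps:
        K += 1
    return K

K = smallest_K(N=100, gamma=2.0, lam=1.0, eps=0.01)
```

This also makes the N-vs-K trade-off from the previous slide concrete: the budgeted K grows (logarithmically) with the number of observations N.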
Tractable models in BNP · two forms for sequential representations · truncation and error analysis
[Table recap: finite approximations, approximation error bounds, and computational complexity — now covered for the DP, BP, BPP, 𝚫P, and general (N)CRMs.]
The sequential representations and truncation error bounds we develop enable approximate inference (e.g. with HMC and VB) in BNP models via the truncated model.
Truncated Random Measures. Submitted, 2016. Available online: https://arxiv.org/abs/1603.00861