Automatic code rewriting in probabilistic programming (internship presentation)

SLIDE 1

Background The transformation in theory Technical aspects Experimental performance

Automatic code rewriting in probabilistic programming

Internship supervised by Hongseok Yang at the University of Oxford
Diane Gallois-Wong
September 11, 2015

Diane Gallois-Wong Automatic code rewriting in probabilistic programming 1 / 13

SLIDE 2


Introduction

Probabilistic programming languages: short, intuitive code to describe probabilistic models, with built-in inference algorithms

LDA model:

  • straightforward, naive implementation: easy to write but poor performance

  • collapsed implementation: complex, better performance

Automatic transformation of the naive code into the collapsed version

Formalism with a lambda calculus, program analysis techniques

Anglican: a probabilistic programming language (Oxford)

SLIDE 3

Background: Bayesian statistics, The probabilistic programming language Anglican, Latent Dirichlet Allocation

Bayesian statistics

Bayes’ Theorem: p(H|X) = p(X|H) p(H) / p(X)

H: hypothesis, X: observation (events)

p(H): prior probability, p(H|X): posterior probability, p(X|H): likelihood function

SLIDE 4


Anglican

Probabilistic programming language

Integrated in the functional language Clojure (a dialect of Lisp)

    (let [more-heads (sample (flip (/ 1 2)))
          coin (flip (if more-heads (/ 2 3) (/ 1 3)))]
      (observe coin true)
      (observe coin true)
      (observe coin true)
      (predict more-heads))
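For this small model the posterior that inference should approximate can be computed by hand with Bayes' theorem. The following is a minimal Python sketch (not Anglican code) that enumerates both hypotheses exactly; the function name `posterior_more_heads` is made up for this illustration.

```python
from fractions import Fraction

def posterior_more_heads(num_heads_observed):
    """Exact posterior P(more-heads = true | every observed flip came up heads)."""
    # Prior over the hypothesis, as in (sample (flip (/ 1 2))).
    prior = {True: Fraction(1, 2), False: Fraction(1, 2)}
    # Probability of heads under each hypothesis, as in the coin definition.
    p_heads = {True: Fraction(2, 3), False: Fraction(1, 3)}
    # Unnormalised posterior: prior times likelihood of the observations.
    joint = {h: prior[h] * p_heads[h] ** num_heads_observed for h in prior}
    evidence = sum(joint.values())  # p(X), the normalising constant
    return joint[True] / evidence

print(posterior_more_heads(3))  # 8/9
```

With three observed heads, the likelihoods are 8/27 and 1/27, so the posterior probability that the coin favours heads is 8/9.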

SLIDE 5


Latent Dirichlet Allocation (LDA)

Topic model

Input: a collection of documents, each of which is a collection of words

Aim: classify the documents using topics

LDA: a generative topic model

θ_d ∼ Dirichlet(α)

  • prob. vector over topics for each document d

ϕ_k ∼ Dirichlet(β)

  • prob. vector over words for each topic k

z_{d,n} ∼ Discrete(θ_d)

w_{d,n} ∼ Discrete(ϕ_{z_{d,n}})

  • for each position n in document d, choose a topic z_{d,n} according to θ_d, then a word w_{d,n} according to ϕ_{z_{d,n}}
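The generative process above can be sketched in plain Python as a forward sampler. This is only an illustration under assumptions that are not in the slides: symmetric hyperparameters (a single scalar for α and for β), words and topics encoded as integer indices, and hypothetical helper names such as `generate_corpus`.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalising independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [g / total for g in draws]

def sample_discrete(probs, rng):
    """Draw an index i with probability probs[i]."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_corpus(doc_lengths, num_topics, vocab_size, alpha, beta, rng):
    """Sample (theta, phi, z, w) following the LDA generative process."""
    # One topic distribution per document, one word distribution per topic.
    theta = [sample_dirichlet([alpha] * num_topics, rng) for _ in doc_lengths]
    phi = [sample_dirichlet([beta] * vocab_size, rng) for _ in range(num_topics)]
    z, w = [], []
    for d, length in enumerate(doc_lengths):
        z_d = [sample_discrete(theta[d], rng) for _ in range(length)]  # topic per position
        w_d = [sample_discrete(phi[k], rng) for k in z_d]              # word given its topic
        z.append(z_d)
        w.append(w_d)
    return theta, phi, z, w

rng = random.Random(0)
theta, phi, z, w = generate_corpus([5, 8], num_topics=2, vocab_size=3,
                                   alpha=1.0, beta=1.0, rng=rng)
```

In the inference setting only `w` is observed; `theta`, `phi` and `z` are the latent variables.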

SLIDE 6

The transformation in theory: Highly expressive latent variables, Conjugate Prior and Dirichlet Process

Highly expressive latent variables

Naive implementation of LDA:

Inputs: a corpus of documents w, where w_{d,n} is the word at position n in document d, and hyperparameters α and β

    for each d: θ_d = sample Dirichlet(α)
    for each k: ϕ_k = sample Dirichlet(β)
    for each d and each n:
        z_{d,n} = sample Discrete(θ_d)
        observe Discrete(ϕ_{z_{d,n}}) w_{d,n}
    predict z

SLIDE 7


Conjugate Prior

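Prior and likelihood:

θ ∼ Dirichlet(α)

x | θ ∼ Discrete(θ)

Equivalent marginal and posterior:

x ∼ Discrete(α), with the weights α normalised so that p(x = i) = α_i / Σ_j α_j

θ | x ∼ Dirichlet(f(α, x))

where f(α, x) = α with the component of index x incremented by 1.

Dirichlet is a conjugate prior for a Discrete likelihood.

Interpretation of Dirichlet(α): class i has been observed α_i − 1 times.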
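The two operations that conjugacy provides, the marginal distribution of x and the parameter update f, are small enough to write out. A minimal Python sketch follows; the name `predictive` is an assumption for this illustration, while `f` follows the slide.

```python
from fractions import Fraction

def predictive(alpha):
    """Marginal of x when theta ~ Dirichlet(alpha): P(x = i) = alpha_i / sum(alpha)."""
    total = sum(alpha)
    return [Fraction(a) / total for a in alpha]

def f(alpha, x):
    """Posterior parameter: a copy of alpha with the component at index x incremented by one."""
    return [a + 1 if i == x else a for i, a in enumerate(alpha)]

alpha = [1, 2, 1]
marginal = predictive(alpha)  # probabilities 1/4, 1/2, 1/4
print(f(alpha, 0))            # [2, 2, 1]
```

Note that `f` returns a new list rather than mutating `alpha`, which keeps the prior available after the update.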

SLIDE 8


Conjugate Prior and Dirichlet Process

Conjugacy rule:

θ ∼ Dirichlet(α), x | θ ∼ Discrete(θ) ⟹ x ∼ Discrete(α), θ | x ∼ Dirichlet(f(α, x))

Original program:

θ ∼ Dirichlet(α)

x0 ∼ Discrete(θ)

x1 ∼ Discrete(θ)

x2 ∼ Discrete(θ)

Collapsed program:

x0 ∼ Discrete(α) ; α0 = f(α, x0)

x1 ∼ Discrete(α0) ; α1 = f(α0, x1)

x2 ∼ Discrete(α1) ; α2 = f(α1, x2)

(θ ∼ Dirichlet(α2))

θ is marginalised.
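The chain of updates α, α0, α1, α2 can be sketched in Python as follows. This is an illustration only: it assumes integer-valued α so that exact fractions can be used, and the function names are hypothetical.

```python
from fractions import Fraction
import random

def collapsed_sample(alpha, num_draws, rng):
    """Draw x0, x1, ... without ever sampling theta, updating the counts after each draw."""
    counts = list(alpha)
    xs = []
    for _ in range(num_draws):
        x = rng.choices(range(len(counts)), weights=counts, k=1)[0]
        xs.append(x)
        counts[x] += 1  # next parameter = f(current parameter, x)
    return xs, counts

def collapsed_probability(alpha, xs):
    """Exact probability of the sequence xs under the collapsed program (integer alpha)."""
    counts = list(alpha)
    prob = Fraction(1)
    for x in xs:
        prob *= Fraction(counts[x], sum(counts))
        counts[x] += 1
    return prob

print(collapsed_probability([1, 1], [0, 0, 1]))  # 1/12
print(collapsed_probability([1, 1], [1, 0, 0]))  # 1/12
```

Both orderings of the same observations get the same probability, as expected when the draws are conditionally independent given the marginalised θ.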

SLIDE 9

Technical aspects: Necessary conditions on variables to marginalise, State and modular functions

Necessary conditions on variables to marginalise

Conditions are imposed on the variables to marginalise in order to keep the transformation simple.

Main condition to marginalise θ: θ may only appear as the argument of a Discrete distribution.

The conditions are checked automatically through multiple steps of program analysis, including a type and effect system.
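To give a feel for the main condition, here is a deliberately simplified Python sketch of a purely syntactic check. It is not the type and effect system used in the internship: programs are assumed to be nested lists of the form `[head, arg, ...]`, only the body in which the variable is in scope is inspected, and the function name is hypothetical.

```python
def only_as_discrete_arg(expr, var):
    """Return True if every occurrence of var in expr is the direct argument of Discrete."""
    if expr == var:
        # A bare occurrence that is not wrapped in Discrete(...).
        return False
    if not isinstance(expr, list):
        # Any other symbol or constant is harmless.
        return True
    if len(expr) == 2 and expr[0] == "Discrete" and expr[1] == var:
        # The one allowed use: Discrete(var).
        return True
    return all(only_as_discrete_arg(sub, var) for sub in expr)

ok_body = ["observe", ["Discrete", "theta"], "w"]
bad_body = ["+", ["sample", ["Discrete", "theta"]], ["first", "theta"]]
print(only_as_discrete_arg(ok_body, "theta"))   # True
print(only_as_discrete_arg(bad_body, "theta"))  # False
```

A check this naive ignores aliasing (for example binding θ to another name), which is one reason a real analysis needs more than a syntactic pass.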

SLIDE 10


State and modular functions

x1 ∼ Discrete(θ) ⟶ x1 ∼ Discrete(α0) ; α1 = f(α0, x1)

x1 = sample Discrete(θ) ⟶ [x1; S] = S-sample S-Discrete(θ) S
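The state-passing idea can be illustrated with a small Python sketch in which the state S maps each marginalised variable to its current Dirichlet parameter. The representation of S as a dictionary, the explicit `rng` argument, and the Python names `s_sample` and `s_discrete` are assumptions made for this illustration; the slides only give the shape `[x1; S] = S-sample S-Discrete(θ) S`.

```python
import random

def s_discrete(name):
    """Stand-in for S-Discrete(theta): records which marginalised variable is used."""
    return ("s-discrete", name)

def s_sample(dist, state, rng):
    """Stand-in for S-sample: returns the drawn value together with the updated state."""
    _, name = dist
    alpha = state[name]
    # Sample from the marginal Discrete(alpha) instead of Discrete(theta).
    x = rng.choices(range(len(alpha)), weights=alpha, k=1)[0]
    # Apply f(alpha, x) and store the result in a fresh state (no mutation).
    new_alpha = [a + 1 if i == x else a for i, a in enumerate(alpha)]
    new_state = {**state, name: new_alpha}
    return x, new_state

state0 = {"theta": [1, 1, 1]}
x1, state1 = s_sample(s_discrete("theta"), state0, random.Random(0))
```

Threading the state through every call keeps each function modular: a function never relies on hidden global counts, only on the state it receives and returns.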

SLIDE 11


An example of code

SLIDE 12

Experimental performance

Vocabulary: the words 0, 1, 2.

The input is a corpus generated according to two topics: [0.5 ; 0 ; 0.5] and [0 ; 0.5 ; 0.5].

The plots show the estimates of one topic.

[Figure: plots of the topic estimates. Panel titles: Naive implementation, Transformed code, Naive implementation]

SLIDE 13


Conclusion

  • Formal definition of the automatic transformation and a working implementation (for a small subset of Anglican)

  • Potential improvements: extend the subset, weaken the necessary conditions, improve memory management of the state, extend to other conjugate priors

  • Will be used as part of a much more general automatic code transformation developed by my supervisor, to be eventually added to Anglican

  • High-level transformation: affects the model itself
