SLIDE 1

Efficient MCMC for Continuous Time Discrete State Systems

Vinayak Rao and Yee Whye Teh

Gatsby Computational Neuroscience Unit, University College London

SLIDE 2

Overview

Continuous time discrete space systems: applications in physics, chemistry, genetics, ecology, neuroscience etc. The simplest example: the Poisson process on the real line. Generalizations: renewal processes, Markov jump processes, continuous time Bayesian networks etc. These relate back to the basic Poisson process via the idea of uniformization. We use this connection to develop tractable models and efficient MCMC sampling algorithms.

SLIDE 3

Thinning

Uniformization generalizes the idea of ‘thinning’. Thinning: to sample from a Poisson process with rate λ(t), sample events from a homogeneous Poisson process with rate Ω > λ(t) ∀t, then thin (reject) each event independently with probability 1 − λ(t)/Ω.

That this works follows from the complete randomness of the Poisson process.

Markov jump processes and renewal processes are not completely random; uniformization instead thins points by running a Markov chain over them.
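A minimal Python sketch of the thinning recipe above (our own illustrative code; `thin_poisson` and the example intensity are assumptions, not from the talk):

```python
import numpy as np

def thin_poisson(lam, Omega, t_end, rng=np.random.default_rng()):
    """Sample an inhomogeneous Poisson process on [0, t_end] by thinning.

    lam   : intensity function, assumed to satisfy lam(t) <= Omega on [0, t_end]
    Omega : dominating constant rate
    """
    # Candidate events from the dominating homogeneous rate-Omega process
    n = rng.poisson(Omega * t_end)
    candidates = np.sort(rng.uniform(0.0, t_end, size=n))
    # Keep each candidate independently with probability lam(t) / Omega,
    # i.e. reject with probability 1 - lam(t) / Omega
    keep = rng.uniform(size=n) < np.array([lam(t) for t in candidates]) / Omega
    return candidates[keep]

# Example: a sinusoidal intensity bounded by Omega = 2
events = thin_poisson(lambda t: 1.0 + np.sin(t), Omega=2.0, t_end=10.0)
```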

SLIDE 4

Uniformization (at a high level)

Draw from a Poisson process with rate Ω. Ω is larger than the fastest rate at which ‘events occur’. Construct a Markov chain with transition times given by the drawn event times. The Markov chain is subordinated to the Poisson process. Keep a point t with probability λ(t|state)/Ω.

SLIDE 5

Markov jump processes (MJPs)

An MJP S(t), t ∈ R⁺, is a right-continuous, piecewise-constant stochastic process taking values in some finite space S = {1, 2, ..., n}. It is parametrized by an initial distribution π and an n × n rate matrix A = (A_{ij}), where

A_{ij} (i ≠ j): the rate of leaving state i for state j,
A_{ii} = −∑_{j≠i} A_{ij}, so that |A_{ii}| is the total rate of leaving state i.
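For concreteness, a minimal two-state example (the numbers are ours, purely illustrative): from state 1 the process waits an Exp(1) time and jumps to state 2; from state 2 it waits an Exp(2) time and jumps back.

```latex
% A two-state rate matrix (values illustrative): each row sums to zero.
A = \begin{pmatrix} -1 & \phantom{-}1 \\ \phantom{-}2 & -2 \end{pmatrix},
\qquad |A_{11}| = 1, \quad |A_{22}| = 2
```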

SLIDE 6

Uniformization for MJPs

An alternative to Gillespie’s algorithm. Sample a set of event times from a Poisson process with rate Ω ≥ max_i |A_{ii}| on the interval [t_start, t_end]. Run a discrete time Markov chain with initial distribution π and transition probability matrix B = I + (1/Ω) A on these event times.

The matrix B allows self-transitions. [Jensen, 1953]
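A sketch of this construction in Python, under the assumptions above (the function name and the two-state example are ours, not from the talk):

```python
import numpy as np

def sample_mjp_uniformization(pi0, A, t_start, t_end, Omega,
                              rng=np.random.default_rng()):
    """Sample an MJP path on [t_start, t_end] via uniformization.

    pi0   : initial distribution over the n states
    A     : n x n rate matrix, rows summing to zero
    Omega : uniformization rate, must satisfy Omega >= max_i |A_ii|
    """
    n = A.shape[0]
    B = np.eye(n) + A / Omega        # transition matrix; allows self-transitions
    # Candidate transition times from a rate-Omega Poisson process
    k = rng.poisson(Omega * (t_end - t_start))
    times = np.sort(rng.uniform(t_start, t_end, size=k))
    # Discrete-time Markov chain subordinated to those times
    states = [rng.choice(n, p=pi0)]
    for _ in times:
        states.append(rng.choice(n, p=B[states[-1]]))
    # Self-transitions leave the path unchanged; they act as 'rejected' times
    return np.concatenate(([t_start], times)), np.array(states)

# Example with the two-state rate matrix from earlier (illustrative values)
A = np.array([[-1.0, 1.0], [2.0, -2.0]])
jump_times, states = sample_mjp_uniformization(np.array([0.5, 0.5]),
                                               A, 0.0, 10.0, Omega=3.0)
```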

SLIDE 7

Uniformization for MJPs

Lemma

For any Ω ≥ max_i |A_{ii}|, the (continuous time) sequence of states obtained by the uniformized process is a sample from an MJP with initial distribution π and rate matrix A.

SLIDE 8

Auxiliary variable Gibbs sampler

Given noisy observations of an MJP, we want samples from the posterior over trajectories. Observations can include: state values at the end points of an interval; observations x(t) ∼ F(S(t)) at a finite set of times t; or more complicated likelihood functions that depend on the entire trajectory, e.g. Markov modulated Poisson processes and continuous time Bayesian networks (see later). The state space of the Gibbs sampler consists of: the trajectory of the MJP S(t), and an auxiliary set of event times rejected via self-transitions. [Rao and Teh, 2011a]

SLIDE 10

Auxiliary variable Gibbs sampler

Given the current MJP path, we need to resample the set of rejected events. Conditioned on the path, these are:

◮ independent of the observations,
◮ produced by ‘thinning’ a rate Ω Poisson process, keeping each point with probability 1 + A_{S(t)S(t)}/Ω,
◮ thus distributed according to an inhomogeneous Poisson process with piecewise-constant rate Ω + A_{S(t)S(t)}.
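A sketch of this conditional draw, reusing the path representation from the earlier MJP sketch (names are ours):

```python
import numpy as np

def resample_rejected(jump_times, states, A, Omega, t_end,
                      rng=np.random.default_rng()):
    """Sample the auxiliary self-transition times given the current MJP path.

    Conditioned on the path, the rejected events form a Poisson process with
    piecewise-constant rate Omega + A[s, s] on each constant segment.
    """
    rejected = []
    seg_ends = np.append(jump_times[1:], t_end)
    for t0, t1, s in zip(jump_times, seg_ends, states):
        rate = Omega + A[s, s]            # A[s, s] <= 0, so 0 <= rate <= Omega
        k = rng.poisson(rate * (t1 - t0))
        rejected.extend(rng.uniform(t0, t1, size=k))
    return np.sort(np.array(rejected))
```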

SLIDE 11

Auxiliary variable Gibbs sampler

Given all potential transition event times, the MJP trajectory is resampled using the forward-filtering backward-sampling algorithm. The likelihood of the state between two successive points must include all observations in that interval.
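A generic log-space forward-filtering backward-sampling sketch (our own illustrative interface: `log_lik[t, s]` aggregates the log-likelihoods of all observations in the t-th inter-event interval under state s; B is assumed strictly positive for simplicity):

```python
import numpy as np

def ffbs(pi0, B, log_lik, rng=np.random.default_rng()):
    """Forward-filtering backward-sampling over T candidate times.

    pi0     : initial distribution (n,), assumed positive
    B       : transition matrix (n, n), here B = I + A / Omega (assumed > 0)
    log_lik : (T, n) array; log_lik[t, s] sums the log-likelihoods of all
              observations in the t-th inter-event interval under state s
    """
    T, n = log_lik.shape
    alpha = np.zeros((T, n))                    # forward filtering, in log space
    alpha[0] = np.log(pi0) + log_lik[0]
    for t in range(1, T):
        m = alpha[t - 1].max()
        alpha[t] = np.log(np.exp(alpha[t - 1] - m) @ B) + m + log_lik[t]
    states = np.empty(T, dtype=int)             # backward sampling
    p = np.exp(alpha[-1] - alpha[-1].max())
    states[-1] = rng.choice(n, p=p / p.sum())
    for t in range(T - 2, -1, -1):
        logp = alpha[t] + np.log(B[:, states[t + 1]])
        p = np.exp(logp - logp.max())
        states[t] = rng.choice(n, p=p / p.sum())
    return states
```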

SLIDE 12

Comments

Complexity: O(n²P), where P is the (random) number of points. The sampler can take advantage of sparsity in the transition rate matrix A. The only dependence between successive samples is via the transition times of the trajectory; increasing Ω reduces this dependence, but increases computational cost. The sampler is ergodic for any Ω > max_i |A_{ii}|.

SLIDE 13

Existing approaches to sampling

[Fearnhead and Sherlock, 2006, Hobolth and Stone, 2009] produce independent posterior samples, marginalizing over the infinitely many MJP paths using matrix exponentiation. These approaches: scale as O(n³ + n²P); cannot exploit structure in the rate matrix A (e.g. sparsity), which matrix exponentiation destroys; and cannot easily be extended to complicated likelihood functions (e.g. Markov modulated Poisson processes, continuous time Bayesian networks).

SLIDE 14

Continuous-time Bayesian networks (CTBNs)

Compact representations of large state space MJPs with structured rate matrices. Applications include ecology, chemistry, network intrusion detection, human-computer interaction etc. The rate matrix of a node at any time is determined by the configuration of its parents at that time. [Nodelman et al., 2002]

SLIDE 15

Gibbs sampling CTBNs via uniformization

Figure: CTBN fragment with node N, its parent P, and its child C.

The trajectories of all nodes are piecewise constant. In a segment of constant parent (P) values, the dynamics of N are controlled by a fixed rate matrix A^P. Each child (C) trajectory is effectively a continuous-time observation.

SLIDE 16

Gibbs sampling CTBNs via uniformization

Figure: CTBN fragment with node N, its parent P, and its child C.

Sample candidate transition times from a Poisson process with rate Ω > max_i |A^P_{ii}|. Between two successive proposed transition events, N remains in a constant state.

◮ This state must account for the likelihood of the children nodes’ states.
◮ The state must also explain relevant observations.

With the resulting ‘likelihood’ function and transition matrix B = I + (1/Ω) A^P, sample a new trajectory using forward-filtering backward-sampling.

SLIDE 17

Existing approaches to inference

[El-Hay et al., 2008] describe a Gibbs sampler involving time discretization, which is expensive and approximate. [Fan and Shelton, 2008] use particle filtering, which can be inaccurate if the posterior is far from the prior. [Nodelman et al., 2002, Nodelman et al., 2005, Opper and Sanguinetti, 2007, Cohn et al., 2010] use deterministic approximations (mean-field and expectation propagation), which are biased and can be inaccurate.

SLIDE 18

Experiments

We compare our uniformization-based sampler with the state-of-the-art CTBN Gibbs sampler of [El-Hay et al., 2008], which samples each transition time via an iterative search on the time interval. When comparing running times, we measured the times required to produce the same effective sample sizes.

SLIDE 19

Experiments

Figure: CPU time vs length of CTBN chain (log-log; Uniformization, El Hay et al., El Hay et al. (Matrix exp.)).

Figure: CPU time vs number of states of CTBN nodes (same three samplers).

The plots above were produced for a CTBN with a chain topology, increasing the number of nodes in the chain (left) and the number of states of each node (right).

SLIDE 20

Experiments

Figure: CPU time vs time interval of CTBN paths (log-log; Uniformization, El Hay et al., El Hay et al. (Matrix exp.)).

Figure: Average relative error vs number of samples (Uniformization, El Hay et al.).

Produced for the standard ‘drug network’. Left: required CPU time as the length of the time interval increases. Right: (normalized) absolute error in the estimated parameters of the network as the number of samples increases.

SLIDE 21

Experiments

Compared against the mean-field approximation of [Opper and Sanguinetti, 2007], for the predator-prey model, a CTBN describing the Lotka-Volterra equations.

Figure: Posterior (mean and 90% confidence intervals) over predator paths, comparing the true path, the mean-field approximation, and the MCMC approximation; observations (circles) available only until time 1500.

SLIDE 22

Renewal processes

Renewal processes: point processes on the real line (‘time’) with inter-event times drawn i.i.d. from some renewal density. The homogeneous Poisson process is the simplest example, with an exponential renewal density. Renewal densities can capture burstiness or refractoriness. Our contribution, modulated renewal processes: external time-varying factors are allowed to modulate the inter-event distribution (nonstationarity), with a (transformed) Gaussian process prior on the modulating intensity function. [Rao and Teh, 2011b]

SLIDE 23

Modulated renewal processes

Associated with the renewal density g is a hazard function h. For an infinitesimal ∆, h(τ)∆ is the probability that the inter-event interval lies in [τ, τ + ∆], conditioned on it being at least τ:

h(τ) = g(τ) / (1 − ∫₀^τ g(u) du)

Modulate the hazard function by some time-varying intensity function λ(t):

h(τ, t) ≡ m(h(τ), λ(t))

where m(·, ·) is some interaction function. We use the multiplicative interaction h(τ, t) = h(τ)λ(t); another interaction function is the additive one, h(τ, t) = h(τ) + λ(t).
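A quick sanity check of these definitions (ours): the exponential renewal density yields a constant hazard, recovering the homogeneous Poisson process.

```latex
% Exponential renewal density => constant hazard (memorylessness).
g(\tau) = \lambda e^{-\lambda \tau}
\;\Rightarrow\;
h(\tau) = \frac{\lambda e^{-\lambda \tau}}{1 - \int_0^\tau \lambda e^{-\lambda u}\, du}
        = \frac{\lambda e^{-\lambda \tau}}{e^{-\lambda \tau}} = \lambda
```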

SLIDE 24

Modulated renewal processes (continued)

We place a Gaussian process prior on the intensity function λ(t), transformed via a sigmoidal link function, and use a gamma family for the hazard function:

h(τ) = τ^(γ−1) e^(−τ) / ∫_τ^∞ u^(γ−1) e^(−u) du

where γ is the shape parameter. The generative process is:

l(·) ∼ GP(µ, K)
λ(·) = λ̂ σ(l(·))
G ∼ R(λ(·), h(·))

We place hyperpriors on λ̂, γ and the GP hyperparameters.
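Numerically, this hazard is just the ratio of the gamma density to its survival function, since the Γ(γ) normalizers cancel; a small sketch using scipy (our own illustrative code):

```python
import numpy as np
from scipy.stats import gamma

def gamma_hazard(tau, shape):
    """h(tau) = g(tau) / (1 - G(tau)) for the unit-scale gamma renewal density."""
    return gamma.pdf(tau, shape) / gamma.sf(tau, shape)

tau = np.linspace(0.01, 5.0, 100)
h_refractory = gamma_hazard(tau, shape=3.0)  # shape > 1: hazard rises from 0 (refractoriness)
h_constant = gamma_hazard(tau, shape=1.0)    # shape = 1: constant hazard (Poisson process)
```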

SLIDE 25

Direct sampling from prior

The modulated renewal density is:

g(τ|t_prev) = λ(t_prev + τ) h(τ) exp(−∫₀^τ λ(t_prev + u) h(u) du)

where t_prev is the previous event time. Naïvely, one needs to numerically evaluate these integrals to generate samples, which can be time consuming and introduces approximation errors.

SLIDE 26

Sampling via uniformization

Assume the intensity function λ(t) and the hazard function h(τ) are bounded: ∃ Ω ≥ max_{t,τ} h(τ)λ(t).

Sample E = {E₀ = 0, E₁, E₂, ...} from a Poisson process with rate Ω. Let {Y₀ = 0, Y₁, Y₂, ...} be an integer-valued Markov chain on the times in E, where each Y_i either equals Y_{i−1} or i:

◮ Y_i = Y_{i−1} → reject E_i,
◮ Y_i = i → keep E_i.

E_i − E_{Y_i} is the time since the last accepted event. For i > j ≥ 0, define

p(Y_i = i | Y_{i−1} = j) = h(E_i − E_j) λ(E_i) / Ω

Define G = {E_i ∈ E s.t. Y_i = i}.
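A Python sketch of this uniformization sampler (names are ours; h and lam are the hazard and modulating intensity, assumed to satisfy the bound above):

```python
import numpy as np

def sample_modulated_renewal(h, lam, Omega, t_end, rng=np.random.default_rng()):
    """Sample a modulated renewal process on [0, t_end] via uniformization.

    h, lam : hazard and modulating intensity, with h(tau) * lam(t) <= Omega
    """
    k = rng.poisson(Omega * t_end)
    E = np.sort(rng.uniform(0.0, t_end, size=k))   # candidate times
    G, last = [], 0.0                              # E_0 = 0 is the starting event
    for e in E:
        # Keep E_i with probability h(E_i - E_j) * lam(E_i) / Omega,
        # where E_j is the last accepted event
        if rng.uniform() < h(e - last) * lam(e) / Omega:
            G.append(e)
            last = e
    return np.array(G)
```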

SLIDE 27

Sampling via uniformization

Lemma

For any Ω ≥ max_{t,τ} h(τ)λ(t), G is a sample from a modulated renewal process with hazard h(·) and modulating intensity λ(·).

SLIDE 28

Sampling via uniformization

Figure: Sample from a Gamma(3) modulated renewal process via uniformization (red: accepted events, green: rejected events).

SLIDE 29

Reduction to thinning of Poisson processes

For a Poisson process, the hazard function is a constant, h(τ) = h. The transition probabilities of the Markov chain then become p(Y_i = i | Y_{i−1} = j) = h λ(E_i) / Ω, independent of j. This reduces to the independent thinning used for GP Cox processes [Adams et al., 2009].

SLIDE 30

Inference

Given a set of event times G, obtain samples of the modulating function λ(·) (and hyperparameters). As before, directly sampling from the GP posterior is impossible. We introduce the rejected events as auxiliary variables and alternate between sampling the rejected events given G and the intensity function, and sampling the intensity function given G and the rejected events.

SLIDE 31

Inference

Assume first that the modulating function λ(t) is known for all t. In the interval (G_{i−1}, G_i), events from a rate Ω Poisson process were rejected with probability

1 − λ(t) h(t − G_{i−1}) / Ω

Under the conditional, these rejected events are therefore distributed as an inhomogeneous Poisson process with rate

Ω − λ(t) h(t − G_{i−1})

Catch: we know λ(t) only at a discrete set of times. We therefore use the thinning method of GP Cox processes [Adams et al., 2009]. Finally, we resample λ(·) on G and the rejected events using elliptical slice sampling [Murray et al., 2010].

SLIDE 34

Computational considerations

Complexity: O(N³), where N = |G| + 2|E|, |G| is the number of observations and |E| is the number of rejected points.

For large G, we must resort to approximate inference for Gaussian processes [Rasmussen and Williams, 2006]. Question: how do these approximations compare with time-discretized approximations like [Cunningham et al., 2008]?

SLIDE 35

Experiments

Three synthetic datasets generated by modulating a Gamma(3) renewal process:

λ1(t) = 2 exp(−t/5) + exp(−((t − 25)/10)²), t ∈ [0, 50]: 44 events
λ2(t) = 5 sin(t²) + 6, t ∈ [0, 5]: 12 events
λ3(t): a piecewise linear function, t ∈ [0, 100]: 153 events

Three settings of our model and a strawman:

MRP Exp: shape parameter fixed to 1
MRP Gam3: shape parameter fixed to 3
MRP Full: a hyperprior on the shape parameter
Disc: an approximate discrete-time sampler on a regular grid covering the interval, with all intractable integrals approximated numerically

SLIDE 36

Experiments

Figure: Synthetic datasets 1 to 3: posterior mean intensities (top; Truth, MRP Exp, MRP Gam3, MRP Full, Disc100) and gamma shape posteriors (bottom). Results from 5000 MCMC samples after a burn-in of 1000 samples.

SLIDE 37

Experiments

            MRP Exp   MRP Gam3   MRP Full   Disc25    Disc100
Dataset 1
l2 error       7.85       3.19       2.55      4.09      2.43
log pred.    −47.55     −38.07     −37.37    −41.65    −41.02
Dataset 2
l2 error     141.01      56.22      58.44     91.32     57.90
log pred.     −3.70      −2.95      −3.28     −5.25     −3.85
Dataset 3
l2 error      82.03      11.42      13.44    122.34     38.05
log pred.    −89.88     −48.28     −48.57    −87.17    −55.80

Table: l2 distance from the truth and mean log predictive probabilities of test sets for synthetic datasets 1 (top) to 3 (bottom).

SLIDE 38

Experiments

Dataset: the coal mine disaster dataset, recording the dates of a series of 191 coal mining disasters (each of which killed ten or more men [Jarrett, 1979]).

Figure: Left: posterior mean of the intensity function (the posterior for the shape parameter was close to 1). Middle and right: results after deleting every alternate event.

SLIDE 39

Experiments

Dataset: neural spike train recorded from grasshopper auditory receptor cells [Rokem et al., 2006].

Figure: Left: posterior mean intensity for the neural data with one standard deviation error bars; superimposed is the log stimulus (scaled and shifted). Right: posterior over the gamma shape parameter.

SLIDE 40

Experiments

We compare our uniformization-based auxiliary variable Gibbs sampler with the MH sampler of [Adams et al., 2009].

Synthetic dataset 1   Mean ESS        Minimum ESS      Time (sec)
Gibbs                 93.45 ± 6.91    50.94 ± 5.21     77.85
MH                    56.37 ± 10.30   19.34 ± 11.55    345.44

Coalmine dataset      Mean ESS        Minimum ESS      Time (sec)
Gibbs                 53.54 ± 8.15    24.87 ± 7.38     282.72
MH                    47.83 ± 9.18    18.91 ± 6.45     1703

Table: Sampler comparisons. Numbers are per 1000 samples.

Besides mixing faster, our sampler is simpler and more natural to the problem, and does not require any external tuning.

SLIDE 41

Conclusions

The idea of uniformization relates more complicated continuous time discrete state processes to the basic Poisson process. We demonstrated how this connection can be used to develop tractable models and efficient MCMC inference schemes. We can look into extending the models discussed here:

◮ semi-Markov jump processes,
◮ inhomogeneous MJPs, MJPs with infinite state spaces, etc.,
◮ renewal processes with unbounded hazard rates.

There are also other applications we wish to study, such as survival analysis and queuing systems.

SLIDE 42

Bibliography I

Adams, R. P., Murray, I., and MacKay, D. J. C. (2009). Tractable nonparametric Bayesian inference in Poisson processes with Gaussian process intensities. In Proceedings of the 26th International Conference on Machine Learning (ICML), pages 9–16, Montreal. Omnipress.

Cohn, I., El-Hay, T., Friedman, N., and Kupferman, R. (2010). Mean field variational approximation for continuous-time Bayesian networks. J. Mach. Learn. Res., 11:2745–2783.

Cunningham, J. P., Yu, B. M., Shenoy, K. V., and Sahani, M. (2008). Inferring neural firing rates from spike trains using Gaussian processes. In Advances in Neural Information Processing Systems 20.

El-Hay, T., Friedman, N., and Kupferman, R. (2008). Gibbs sampling in factorized continuous-time Markov processes. In UAI, pages 169–178.

Fan, Y. and Shelton, C. R. (2008). Sampling for approximate inference in continuous time Bayesian networks. In Tenth International Symposium on Artificial Intelligence and Mathematics.

Fearnhead, P. and Sherlock, C. (2006). An exact Gibbs sampler for the Markov-modulated Poisson process. Journal of the Royal Statistical Society Series B, 68(5):767–784.

Hobolth, A. and Stone, E. A. (2009). Simulation from endpoint-conditioned, continuous-time Markov chains on a finite state space, with applications to molecular evolution. Ann Appl Stat, 3(3):1204.

SLIDE 43

Bibliography II

Jarrett, R. G. (1979). A note on the intervals between coal-mining disasters. Biometrika, 66(1):191–193.

Jensen, A. (1953). Markoff chains as an aid in the study of Markoff processes. Skand. Aktuarietidskr., 36:87–91.

Murray, I., Adams, R. P., and MacKay, D. J. C. (2010). Elliptical slice sampling. JMLR: W&CP, 9:541–548.

Nodelman, U., Koller, D., and Shelton, C. (2005). Expectation propagation for continuous time Bayesian networks. In Proceedings of the Twenty-first Conference on Uncertainty in AI (UAI), pages 431–440, Edinburgh, Scotland, UK.

Nodelman, U., Shelton, C., and Koller, D. (2002). Continuous time Bayesian networks. In Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI), pages 378–387.

Opper, M. and Sanguinetti, G. (2007). Variational inference for Markov jump processes. In NIPS.

Rao, V. and Teh, Y. W. (2011a). Fast MCMC sampling for Markov jump processes and continuous time Bayesian networks. In Proceedings of the International Conference on Uncertainty in Artificial Intelligence.

SLIDE 44

Bibliography III

Rao, V. and Teh, Y. W. (2011b). Gaussian process modulated renewal processes. In Advances in Neural Information Processing Systems 23.

Rasmussen, C. E. and Williams, C. K. I. (2006). Gaussian Processes for Machine Learning. MIT Press.

Rokem, A., Watzl, S., Gollisch, T., Stemmler, M., and Herz, A. V. M. (2006). Spike-timing precision underlies the coding efficiency of auditory receptor neurons. Journal of Neurophysiology, pages 2541–2552.

SLIDE 45

Algorithm 1: Blocked Gibbs sampler for a GP-modulated renewal process on the interval [0, T]

Input: Set of event times G, set of thinned times G̃_prev, and l instantiated at G ∪ G̃_prev.
Output: A new set of thinned times G̃_new and a new instantiation l_{G ∪ G̃_new} of the GP on G ∪ G̃_new.

1: Sample A ⊂ [0, T] from a Poisson process with rate Ω.
2: Sample l_A | l_{G ∪ G̃_prev}.
3: Thin A, keeping element a ∈ A ∩ [G_{i−1}, G_i] with probability 1 − λ̂ σ(l(a)) h(a − G_{i−1}) / Ω.
4: Let G̃_new be the resulting set and l_{G̃_new} be the restriction of l_A to this set. Discard G̃_prev and l_{G̃_prev}.
5: Resample l_{G ∪ G̃_new} using, for example, elliptical slice sampling.
