SLIDE 1

A reversible infinite HMM using normalised random measures

Konstantina Palla, David A. Knowles, Zoubin Ghahramani 23rd of June 2014

SLIDE 2

MOTIVATION

Assume a Markov chain X1, ..., Xt, ..., XT which is reversible: P(X1, ..., Xt, ..., XT) = P(XT, ..., Xt, ..., X1)

Applications

  • Modelling physical systems, e.g. conformational transitions of a macromolecule at fixed temperature.
  • Chemical dynamics of protein folding.

Tasks

  • Find the transition operator (transition matrix) of the reversible Markov chain
  • Put a prior on the reversible Markov chain

This work proposes a Bayesian non-parametric prior for reversible Markov chains.

SLIDE 3

REVERSIBLE MARKOV CHAINS

Problem: Put a prior on reversible Markov chains. What does that mean?

Reversible chains and random walk on weighted graph

G(V, E, W) weighted undirected graph

  • vertex-set V = {i, r, q, . . . }
  • edge-set E = {eir, eiq, erq, . . . }
  • weight-set W = {Jir, Jrq, Jiq, . . . }

Discrete-time random walk on G → Markov chain with Xt ∈ V and transition matrix P(i, j) := Jij / Σk Jik

Put a prior on the transition matrix P (or on the weights Js).

[Figure: graph on vertices i, r, q with edge weights Jir, Jrq, Jiq]
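A minimal sketch of this construction (illustrative, not from the slides): given symmetric edge weights J on a small graph, form P(i, j) = Jij / Σk Jik and check that π with πi ∝ Σk Jik is invariant and satisfies detailed balance.

```python
import numpy as np

# Symmetric edge weights for a small graph on vertices {i, r, q};
# the numerical values are hypothetical.
J = np.array([[0.0, 2.0, 1.0],
              [2.0, 0.0, 3.0],
              [1.0, 3.0, 0.0]])

row_sums = J.sum(axis=1)
P = J / row_sums[:, None]           # P(i, j) = J_ij / sum_k J_ik
pi = row_sums / row_sums.sum()      # candidate invariant distribution, pi_i ∝ sum_k J_ik

assert np.allclose(pi @ P, pi)      # invariance: pi P = pi
flows = pi[:, None] * P             # flows[i, j] = pi_i P(i, j)
assert np.allclose(flows, flows.T)  # detailed balance: pi_i P(i, j) = pi_j P(j, i)
```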

SLIDE 4

BASIC THEORY

Seminal work by Diaconis, Freedman and Coppersmith.

Markov Exchangeability

A process on a countable space S is Markov exchangeable if the probability of observing a path X1, ..., Xt, ..., XT is only a function of X1 and the transition counts C(i, j) := |{t : Xt = i, Xt+1 = j, 1 ≤ t < T}| for all i, j ∈ S.
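For concreteness, a small sketch (not from the slides) of the transition counts C(i, j) the definition refers to: two paths with the same start state and the same counts receive the same probability under a Markov exchangeable process.

```python
from collections import Counter

def transition_counts(path):
    """C(i, j) = number of steps t with X_t = i and X_{t+1} = j."""
    return Counter(zip(path[:-1], path[1:]))

# Hypothetical example: both paths start at "i" and have identical transition
# counts, so a Markov exchangeable process assigns them equal probability.
print(transition_counts(["i", "r", "i", "q", "i"]))
print(transition_counts(["i", "q", "i", "r", "i"]))
```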

Representation Theorem (Diaconis and Freedman, 1980)

A process is Markov exchangeable and returns to every state visited infinitely often (recurrent) if and only if it is a mixture of recurrent Markov chains:

P(X2, ..., Xt, ..., XT | X1) = ∫_P ∏_{t=1}^{T−1} P(Xt, Xt+1) µ(dP | X1)

where P is the set of stochastic matrices on S × S and the mixing measure µ(·|X1) on P is uniquely determined.

Problem: Determine the prior µ. Not always easy.

SLIDE 5

RELATED WORK

Random walk with reinforcement

  • Idea: Simulate from the prior µ.
  • Increase the edge weight by +1 each time an edge is crossed (a simulation sketch appears at the end of this slide):

(1/T) [Jir, Jrq, Jiq] → [Lir, Lrq, Liq] ∼ µ as T → ∞

where T is the total number of steps and µ is the measure over the edge weights, i.e. the underlying prior.

  • The process is Markov exchangeable and recurrent → mixture of recurrent MCs.

[Figure: graph on vertices i, r, q; every edge starts with weight 1 and is reinforced by +1 each time it is crossed]

Examples

  • Edge Reinforced Random Walk (ERRW), Diaconis and Freedman [1980], Diaconis and Rolles [2006]: conjugate prior for the transition matrix of reversible MCs.
  • The edge-reinforced schema of Bacallado et al. [2013] extends ERRW to a countably infinite space; the process is reversible, but the prior is difficult to characterise.
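A minimal simulation sketch of the edge-reinforcement idea above (the graph, initial weights and step count are hypothetical choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Triangle graph on three vertices (think i, r, q); every edge weight starts at 1.
J = np.ones((3, 3)) - np.eye(3)

state, T = 0, 100_000
for _ in range(T):
    probs = J[state] / J[state].sum()   # step proportionally to the current edge weights
    nxt = rng.choice(3, p=probs)
    J[state, nxt] += 1.0                # reinforce the crossed edge by +1
    J[nxt, state] += 1.0                # (stored symmetrically, since edges are undirected)
    state = nxt

# After many steps, the normalised weights approximate a draw of the edge
# weights from the underlying prior µ.
print(J / J.sum())
```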

SLIDE 6

RELATED WORK

Define a prior over reversible Markov chains:

  • 1. Explicitly characterise the measure µ over the transition matrix
  • 2. Define an edge reinforcement schema

Proposed work: Explicitly construct the prior µ over the weights (or equivalently the transition matrix)

SLIDE 7

A MODEL FOR REVERSIBLE MARKOV CHAINS

General idea: Define the prior over the weights using the Gamma process hierarchically.

Gamma process ΓP(α0H)

Completely random measure on X with Lévy measure ν(dw, dx) = ρ(dw)H(dx) = α0 w^(−1) e^(−α0 w) dw H(dx) on the space X × [0, ∞). H is the base measure and α0 the concentration parameter.

G0 := Σ_{i=1}^∞ wi δ_{Xi} ∼ ΓP(α0H)

Countably infinite collection of pairs {Xi, wi}_{i=1}^∞ sampled from a Poisson process with intensity ν.

SLIDE 8

A MODEL FOR REVERSIBLE MARKOV CHAINS

Define the prior over the weights using the Gamma process hierarchically.

Model

  • 1. First level: ΓP over the space X

G0 = Σ_{i=1}^∞ wi δ_{xi} ∼ ΓP(α0, µ0)

Set of states S := {xi; xi ∈ X, i ∈ N}, countably infinite.

  • 2. Second level: ΓP over the space S × S

G = Σ_{i=1}^∞ Σ_{j=1}^∞ Jij δ_{(xi,xj)} ∼ ΓP(α, µ),   Jij | α, wi, wj ∼ Gamma(αwiwj, α)

Base measure atomic on S × S: µ(xi, xj) = G0(xi)G0(xj)

[Figure: graphical model (µ0, α0 → G0; G0, α → G) next to a directed graph on vertices i, q, r with vertex weights wi, wq, wr and edge weights Jiq, Jqi, Jir, Jri, Jqr, Jrq]

Non-reversible: Directed edges, Jij ≠ Jji
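A minimal sketch of sampling this two-level construction under the finite truncation used later (slide 12); the truncation level and concentration values are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

K, alpha0, alpha = 5, 2.0, 1.0        # truncation level and concentrations (illustrative)
mu0 = np.full(K, 1.0 / K)             # uniform base measure over the K states

# First level: state weights, w_i ~ Gamma(alpha0 * mu0(x_i), alpha0)  (shape, rate)
w = rng.gamma(shape=alpha0 * mu0, scale=1.0 / alpha0)

# Second level: edge weights, J_ij ~ Gamma(alpha * w_i * w_j, alpha).
# As drawn here the edges are directed: J_ij != J_ji in general.
J = rng.gamma(shape=alpha * np.outer(w, w), scale=1.0 / alpha)

# Symmetrising (one draw per unordered pair) gives the reversible SHGP version.
J_sym = np.triu(J) + np.triu(J, k=1).T
print(J_sym)
```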

SLIDE 9

A MODEL FOR REVERSIBLE MARKOV CHAINS

Reversibility

Impose symmetry: Jij = Jji ∼ Gamma(αwiwj, α)

Proof: Sufficient to prove detailed balance πi P(i, j) = πj P(j, i), where πi = Σk Jik / Σj Σk Jjk and 0 < Σj Σk Jjk < ∞.

Corollary: π is the invariant measure of the chain.

[Figure: undirected graph on vertices i, q, r with vertex weights wi, wq, wr and symmetric edge weights Jqi, Jir, Jrq]

We call the model the Symmetric Hierarchical Gamma Process (SHGP)
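For completeness, a sketch of the one-line detailed-balance computation behind the reversibility claim, using the definitions above:

```latex
\pi_i P(i,j)
  = \frac{\sum_k J_{ik}}{\sum_l \sum_k J_{lk}} \cdot \frac{J_{ij}}{\sum_k J_{ik}}
  = \frac{J_{ij}}{\sum_l \sum_k J_{lk}}
  = \frac{J_{ji}}{\sum_l \sum_k J_{lk}}
  = \pi_j P(j,i),
  \qquad \text{since } J_{ij} = J_{ji}.
```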

SLIDE 10

A MODEL FOR REVERSIBLE MARKOV CHAINS

Properties

  • Irreducibility

A MC is irreducible if ∃t ∈ N s.t. P^t(i, j) > 0, ∀i, j ∈ S.

SHGP is irreducible: Jij, Σk Jik ∈ (0, ∞) → Pij = Jij / Σk Jik > 0 a.s. ∀i, j ∈ S.

  • Recurrence

A state i is positive recurrent if E(τii) < ∞, where τij := min{t > 1 : Xt = j | X1 = i}. The SHGP is positive recurrent since the following applies:

Theorem (Levin et al. [2006])

An irreducible Markov chain is positive recurrent iff there exists a probability distribution π such that π = πP.

SLIDE 11

A MODEL FOR REVERSIBLE MARKOV CHAINS

Representation Theorem

A process is Markov exchangeable and returns to every state visited infinitely often (recurrent) if and only if it is a mixture of recurrent Markov chains:

P(X2, ..., Xt, ..., XT | X1) = ∫_P ∏_{t=1}^{T−1} P(Xt, Xt+1) µ(dP | X1)

where P is the set of stochastic matrices on S × S and µ(·|X1) on P is the mixing measure.

SHGP

  • Explicitly defined prior µ; hierarchical construction of weights
  • SHGP is a mixture of recurrent, reversible Markov chains
  • SHGP is recurrent, Markov exchangeable and reversible.

SLIDE 12

THE SHGP HIDDEN MARKOV MODEL

[Figure: graphical model of the SHGP HMM — µ0, α0 → G0; G0, α → G; hidden states X1, X2, X3, ..., XT with emissions Y1, Y2, Y3, ..., YT through the emission matrix E]

Finite number of states K; countably infinite model as K → ∞.

G0 = Σ_{i=1}^K wi δ_{xi},   wi ∼ Gamma(α0 µ0(xi), α0)

G = Σ_{i=1}^K Σ_{j=1}^K Jij δ_{(xi,xj)},   Jij = Jji ∼ Gamma(αwiwj, α)

Xt ∈ {1, ..., K} - hidden state sequence. E - emission matrix. Yt, t = 1, ..., T - observed sequence with observation model F(·|E):

Yt | Xt, E ∼ iid F(·|E_{Xt})

{Ek, k = 1, ..., K} - state emission parameters. F: multinomial, Poisson and Gaussian observation models.
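A minimal generative sketch of this finite-K model, with a Gaussian observation model and hypothetical parameter values (not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(1)

K, T, alpha0, alpha = 4, 200, 2.0, 1.0
mu0 = np.full(K, 1.0 / K)

# SHGP prior over the weights (shape/rate Gammas written as shape/scale for numpy)
w = rng.gamma(alpha0 * mu0, 1.0 / alpha0)              # w_i ~ Gamma(alpha0 mu0(x_i), alpha0)
J = rng.gamma(alpha * np.outer(w, w), 1.0 / alpha)     # J_ij ~ Gamma(alpha w_i w_j, alpha)
J = np.triu(J) + np.triu(J, k=1).T                     # impose symmetry J_ij = J_ji
P = J / J.sum(axis=1, keepdims=True)                   # reversible transition matrix

# Gaussian emissions: state k emits N(E[k, 0], E[k, 1]); E is the K x 2 emission matrix
E = np.column_stack([np.arange(K, dtype=float), np.full(K, 0.3)])

pi = J.sum(axis=1) / J.sum()                           # invariant distribution of the chain
X = np.empty(T, dtype=int)
Y = np.empty(T)
X[0] = rng.choice(K, p=pi)
for t in range(T):
    Y[t] = rng.normal(E[X[t], 0], E[X[t], 1])
    if t + 1 < T:
        X[t + 1] = rng.choice(K, p=P[X[t]])
```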

SLIDE 13

EXPERIMENTS

We ran the SHGP Hidden Markov Model on 2 real-world datasets with reversible underlying systems. Comparison against:

  • the non-reversible SHGP HMM
  • the infinite HMM (HDP)

SLIDE 14

CHIP-SEQ DATA FROM NEURAL STEM CELLS

  • ChIP-seq allows us to measure what proteins, with what chemical modifications, are bound to DNA along the genome.
  • Y is a T × L matrix, T = 2 · 10^4 and L = 6: counts of how many reads for the protein of interest l map to bin t.
  • Poisson (multivariate) likelihood model F.

Figure: ChIP-seq data (read counts against genomic location in 100bp bins) for a small section of length 300 of the whole chromosome region, along with the L = 6 identifiers (proteins of interest): H3K27ac, H3K27me3, H3K4me1, H3K4me3, p300, Pol2

SLIDE 15

CHIP-SEQ DATA FROM NEURAL STEM CELLS

Task: Predict held-out values in Y.

Table: ChIP-seq results for 10 runs using different hold-out patterns (20%), a truncation level of K = 30, 1000 iterations and a burn-in of 700.

Model | Algorithm | Train error | Test error | Train log likelihood | Test log likelihood
Reversible | HMC | 0.9122 ± 0.0032 | 1.1158 ± 0.0097 | −1.0488 ± 0.0009 | −3.2422 ± 0.0023
Non-reversible | HMC | 0.9127 ± 0.0033 | 1.1167 ± 0.0095 | −1.0494 ± 0.0009 | −3.2478 ± 0.0022
iHMM | Beam Sampler | 0.9383 ± 0.0061 | 1.1365 ± 0.0107 | −1.0727 ± 0.0041 | −3.3047 ± 0.0027

SLIDE 16

CHIP-SEQ DATA FROM NEURAL STEM CELLS

SHGP recovers known types of regulatory regions:

  • promoters
  • enhancers

Figure: Learnt emission matrix (L × K) for the ChIP-seq dataset. Element Elk is the Poisson rate parameter for protein l in state k. Brighter indicates higher values.

SLIDE 17

SINGLE ION CHANNEL RECORDINGS DATASET

  • Patch clamp recording is a method for measuring conformational changes in ion channels. These changes are accompanied by changes in electrical potential (the measurements).
  • Y is a 1 × T matrix, T = 10^4: a 10 kHz recording of electrical potential measurements from a single alamethicin channel.
  • Gaussian likelihood model F.

Yt | Xt, E ∼ N(Yt; µ, σ), where µ = E(Xt, 1) and σ = E(Xt, 2), with K × 2 emission matrix E.

Table: Ion channel results across 10 different random hold-out patterns, a truncation of K = 15, 1000 iterations and a burn-in of 700.

Model | Algorithm | Train error | Test error | Train log likelihood | Test log likelihood
Reversible | HMC | 0.023 ± 0.001 | 0.030 ± 0.002 | 2.204 ± 0.055 | 2.034 ± 0.058
Non-reversible | HMC | 0.027 ± 0.007 | 0.033 ± 0.007 | 2.108 ± 0.084 | 1.970 ± 0.078
iHMM | Beam sampler | 0.038 ± 0.005 | 0.045 ± 0.004 | 2.134 ± 0.070 | 2.008 ± 0.058

SLIDE 18

SINGLE ION CHANNEL RECORDINGS DATASET

Figure: Clusters found by the SHGP-HMM for the ion channel dataset (frequency and density against normalised current), shown relative to a histogram of levels across the recording. The smaller clusters at higher currents are often merged in the model.

SLIDE 19

CONCLUSION AND FUTURE WORK

  • Constructed a non-parametric prior for reversible Markov chains
  • Presented a finite approximation
  • Experimental results using the SHGP as part of an HMM
  • Experimental results underline the importance of accounting for reversibility

Future Work

  • Construct a sampler for the infinite case, using the sampling process proposed by Favaro and Teh [2013].
  • Look at the corresponding edge reinforcement schema (?)

SLIDE 20

Thank you!

SLIDE 21

BIBLIOGRAPHY

Sergio Bacallado, Stefano Favaro, and Lorenzo Trippa. Bayesian nonparametric analysis of reversible Markov chains. The Annals of Statistics, 41(2):870–896, 2013.

Persi Diaconis and David Freedman. De Finetti's theorem for Markov chains. The Annals of Probability, 8(1):115–130, 1980.

Persi Diaconis and Silke W. W. Rolles. Bayesian analysis for reversible Markov chains. The Annals of Statistics, 34(3):1270–1292, 2006.

S. Favaro and Y. W. Teh. MCMC for normalized random measure mixture models. Statistical Science, 28(3):335–359, 2013.

David A. Levin, Yuval Peres, and Elizabeth L. Wilmer. Markov Chains and Mixing Times. American Mathematical Society, 2006.

SLIDE 22

APPENDIX A

Gamma process ΓP(α0H)

Completely random measure on X with Lévy measure ν(dw, dx) = ρ(dw)H(dx) = α0 w^(−1) e^(−α0 w) dw H(dx) on the space X × [0, ∞). H is the base measure and α0 the concentration parameter.

G0 := Σ_{i=1}^∞ wi δ_{Xi} ∼ ΓP(α0H)

Countably infinite collection of pairs {Xi, wi}_{i=1}^∞ sampled from a Poisson process with intensity ν.

SLIDE 23

APPENDIX B - RELATION TO HIERARCHICAL DIRICHLET AND HIERARCHICAL GAMMA PROCESS

HDP | HGP | SHGP
G′0 ∼ DP(α0µ0) | G0 ∼ ΓP(α0, µ0) | G0 ∼ ΓP(α0, µ0)
Pj ∼ DP(α′ G′0) | J̃j ∼ ΓP(α̃, G0) | Jj ∼ ΓP(αwj, G0)

Table: HDP, HGP and SHGP. Pj & Jj refer to the jth row of the transition and weight matrix respectively.

  • The HDP puts a prior over the transition matrix. The SHGP puts a prior over the weight matrix, imposes symmetry, and allows reversibility.
  • The SHGP, modulo the symmetrization, is equivalent to the HDP with specific gamma distributions over the concentration parameters between the levels: α′j ∼ Gamma(α0 µ0(X), α0 / (α wj))

SLIDE 24

APPENDIX C

Inference

  • Hybrid Monte Carlo (HMC) to sample the weights Jij
  • Forward filtering, backward sampling to sample the state sequence X1, ..., XT
  • iHMM: Beam sampler
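A minimal sketch of forward filtering, backward sampling for a finite-K HMM (the standard algorithm, not the authors' implementation; Gaussian emissions with a K × 2 matrix E are assumed, as on slide 17):

```python
import numpy as np

def ffbs(Y, P, pi0, E, rng):
    """Sample a state path X_1..X_T given observations Y, transition matrix P,
    initial distribution pi0 and K x 2 Gaussian emission matrix E (mean, sd)."""
    T, K = len(Y), P.shape[0]
    # Unnormalised per-state Gaussian likelihoods of each observation
    lik = np.exp(-0.5 * ((Y[:, None] - E[:, 0]) / E[:, 1]) ** 2) / E[:, 1]

    # Forward filtering: alpha[t, k] ∝ p(X_t = k | Y_1, ..., Y_t)
    alpha = np.zeros((T, K))
    alpha[0] = pi0 * lik[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ P) * lik[t]
        alpha[t] /= alpha[t].sum()

    # Backward sampling: draw X_T from alpha[T-1], then X_t | X_{t+1} ∝ alpha[t, k] P(k, X_{t+1})
    X = np.empty(T, dtype=int)
    X[-1] = rng.choice(K, p=alpha[-1])
    for t in range(T - 2, -1, -1):
        probs = alpha[t] * P[:, X[t + 1]]
        X[t] = rng.choice(K, p=probs / probs.sum())
    return X

# Hypothetical usage: X = ffbs(Y, P, pi, E, np.random.default_rng(0))
```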
