Fast estimation of posterior change-point probabilities for CNV data

The Minh Luong, Yves Rozenholc, Gregory Nuel
MAP5, Université Paris Descartes
July 5, 2012

Introduction
- Change-point methods have applications in econometrics, engineering, network security, signal processing, music classification, and bioinformatics.
- Example: copy number variation (CNV), to identify regions where DNA mutations are related to disease susceptibility.
- High-resolution data, with tens of thousands of clones per chromosome:
  - Array comparative genomic hybridization (aCGH)
  - Single nucleotide polymorphism (SNP) array
Figure: array CGH profile. Source: Redon and Carter, Methods Mol Biol. 2009; 529: 37-49.

Examples of R packages for change-point analysis
- Unsupervised hidden Markov model (HMM) approaches
  - Willenbrock and Fridlyand (2005) - aCGH package
  - Marioni et al (2006) - snapCGH package
- Non-HMM segmentation approaches
  - Venkatraman and Olshen (2004) - DNAcopy package
  - Hupé et al (2004) - GLAD package
- Likelihood-based approaches with penalization criteria
  - Picard et al (2005) - cghseg package
- Change-point uncertainty (MCMC)
  - Erdman et al (2008) - bcp package

Motivation
- Few exact non-MCMC methods exist for assessing the uncertainty of change-point estimates.
- Methods for finding exact posterior probabilities of change-points have $O(n^2)$ complexity:
  - frequentist - Guédon (2007)
  - Bayesian - Rigaill (2011)
- High-resolution data in genomics technologies (> 10,000 observations per chromosome):
  - Smaller inter-segmental differences: need to characterize uncertainty
  - More data: need efficient estimates; $O(n^2)$ is not feasible
- Next-generation sequencing: need methods adaptable to non-normal data

Segmentation approach to change-point detection
- Dataset: $X = (X_1, X_2, \ldots, X_n)$, real-valued observations.
- Hidden state space: $S = (S_1, S_2, \ldots, S_n)$, the corresponding segment indices.
- Distribution: $P(X_i \mid S_i = k, \theta_k) \sim g_{\theta_k}(\cdot)$, where $X_i$ belongs to segment $k$.
- Problem of interest: find $P(S_i \mid X; \theta)$ when the segments are unknown given the data.
Figure: Segment-based change-point detection (K=5); observations plotted with their segment indices S = 1, ..., 5.
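To make the link between change-points and the hidden sequence $S$ concrete, here is a minimal Python sketch. The function name `segment_indices` and the 1-based convention (a change-point at position $i$ means $S_i = k$ and $S_{i+1} = k+1$) are assumptions chosen for illustration; they are not taken from the slides' software.

```python
def segment_indices(n, change_points):
    """Return the labels S_1..S_n (1-based segments) for given change-points.

    Assumed convention: a change-point at position i (1-based) means
    S_i = k and S_{i+1} = k + 1, so valid positions are 1..n-1.
    """
    cps = sorted(change_points)
    if len(set(cps)) != len(cps) or any(cp < 1 or cp >= n for cp in cps):
        raise ValueError("change-points must be distinct and lie in 1..n-1")
    boundaries = set(cps)
    labels = []
    segment = 1
    for i in range(1, n + 1):
        labels.append(segment)   # observation i belongs to the current segment
        if i in boundaries:      # the next observation starts a new segment
            segment += 1
    return labels


# Two change-points split ten observations into K = 3 segments.
print(segment_indices(10, [3, 7]))
```

With $K - 1$ change-points the labels run from 1 to $K$, so $S_1 = 1$ and $S_n = K$ hold automatically.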

Constrained hidden Markov model for segmentation
- Use HMM algorithms to estimate posterior probabilities with linear complexity.
- $S$: Markov chain over $\{1, 2, \ldots, K, K+1\}$; $M_K$: set of possible $S$; $\{S \in M_K\}$: $K$ states in $n$ observations.
- Constraints on the HMM correspond exactly to a segmentation change-point model:
  - Find the best partitioning $S \in M_K$ into $K$ non-overlapping intervals, with a homogeneous distribution within each segment.
  - $S_1 = 1$, $S_n = K$; junk state: $K+1$.
  - Allow only transitions of 0 or +1: $S_i - S_{i-1} \in \{0, 1\}$.
- Transition probabilities:
  $P(S_i = k+1 \mid S_{i-1} = k) = \eta_k(i)$
  $P(S_i = k \mid S_{i-1} = k) = 1 - \eta_k(i)$
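The 0/+1 transition constraint gives a banded transition matrix. The sketch below builds it under two simplifying assumptions that are not in the slides: a single constant `eta` replaces $\eta_k(i)$, and the junk state $K+1$ is treated as absorbing. The function name `constrained_transition_matrix` is hypothetical.

```python
def constrained_transition_matrix(K, eta):
    """Transition matrix over states 1..K+1, stored 0-based as a list of rows.

    Assumptions (illustrative only): eta is constant in i and k, and the
    junk state K+1 is absorbing.
    """
    size = K + 1
    P = [[0.0] * size for _ in range(size)]
    for k in range(K):
        P[k][k] = 1.0 - eta   # stay in the same segment (transition of 0)
        P[k][k + 1] = eta     # move to the next segment (transition of +1)
    P[K][K] = 1.0             # junk state: once entered, never left
    return P


for row in constrained_transition_matrix(3, 0.1):
    print(row)
```

Every entry outside the diagonal and the first superdiagonal is zero, which is why each forward or backward step needs only two terms per state.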

Adapted forward-backward algorithm
Forward and backward quantities, for observation $i$ and state $k$:
  $F_i(k) = P(X_{1:i} = x_{1:i}, S_i = k)$
  $B_i(k) = P(X_{i+1:n} = x_{i+1:n}, S_n = K \mid S_i = k)$
Initialization:
  $F_1(1) = g_{\theta_1}(x_1)$
  $B_{n-1}(K-1) = \eta_K(n) \, g_{\theta_K}(x_n)$, $B_{n-1}(K) = (1 - \eta_K(n)) \, g_{\theta_K}(x_n)$
Recursion:
  $F_i(k) = \left[ F_{i-1}(k) (1 - \eta_k(i)) + 1_{k > 1} F_{i-1}(k-1) \, \eta_k(i) \right] g_{\theta_k}(x_i)$
  $B_{i-1}(k) = (1 - \eta_k(i)) \, g_{\theta_k}(x_i) \, B_i(k) + 1_{k < K} \, \eta_{k+1}(i) \, g_{\theta_{k+1}}(x_i) \, B_i(k+1)$
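The recursions can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation: emissions are Gaussian with known means `mus` and a common standard deviation `sigma`, and a constant `eta` replaces $\eta_k(i)$. The backward pass starts from $B_n(K) = 1$ (and 0 for the other states), which reproduces the initialization for $B_{n-1}$ after one recursion step. Array indices are 0-based.

```python
import math


def gauss_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def forward_backward(x, mus, sigma, eta):
    """Return (F, B) with F[i][k] and B[i][k] for 0-based i and k."""
    n, K = len(x), len(mus)
    g = [[gauss_pdf(xi, mu, sigma) for mu in mus] for xi in x]

    # Forward pass: the chain is forced to start in the first segment.
    F = [[0.0] * K for _ in range(n)]
    F[0][0] = g[0][0]
    for i in range(1, n):
        for k in range(K):
            stay = F[i - 1][k] * (1.0 - eta)
            move = F[i - 1][k - 1] * eta if k > 0 else 0.0
            F[i][k] = (stay + move) * g[i][k]

    # Backward pass: the constraint S_n = K is encoded in the last row.
    B = [[0.0] * K for _ in range(n)]
    B[n - 1][K - 1] = 1.0
    for i in range(n - 1, 0, -1):
        for k in range(K):
            stay = (1.0 - eta) * g[i][k] * B[i][k]
            move = eta * g[i][k + 1] * B[i][k + 1] if k < K - 1 else 0.0
            B[i - 1][k] = stay + move
    return F, B


x = [0.1, -0.2, 0.0, 2.1, 1.9, 2.0]
F, B = forward_backward(x, mus=[0.0, 2.0], sigma=0.5, eta=0.2)
print(F[-1][-1], F[0][0] * B[0][0])  # two routes to P(X = x, S_n = K)
```

Each pass visits every (observation, state) pair once and does constant work there, so the cost is $O(nK)$, linear in $n$ for fixed $K$. For long sequences the raw products underflow, so a practical version would rescale each row or work in log space.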

Posterior probabilities from forward-backward algorithm
Posterior probability of state $k$ for observation $i$:
  $P(S_i = k \mid X_{1:n} = x_{1:n}) = \dfrac{F_i(k) \, B_i(k)}{F_1(1) \, B_1(1)}$
Posterior probability of observation $i$ being the $k$th change-point:
  $P(CP_k = i \mid X_{1:n} = x_{1:n}) = P(S_i = k, S_{i+1} = k+1 \mid X_{1:n} = x_{1:n}) = \dfrac{F_i(k) \, \eta_k(i) \, g_{\theta_{k+1}}(x_{i+1}) \, B_{i+1}(k+1)}{F_1(1) \, B_1(1)}$
Posterior transition probability from the $(k-1)$th to the $k$th state:
  $P(S_i = k \mid S_{i-1} = k-1, X_{1:n} = x_{1:n}) = \dfrac{\eta_{k-1}(i-1) \, g_{\theta_k}(x_i) \, B_i(k)}{B_{i-1}(k-1)}$
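A small sketch of how the first two posterior formulas might be evaluated. It is self-contained and rests on assumptions that are not in the slides: Gaussian emissions with known means `mus` and common `sigma`, and a constant `eta` in place of $\eta_k(i)$. The function name `posteriors` is hypothetical, and indices are 0-based.

```python
import math


def posteriors(x, mus, sigma, eta):
    """Return (state_post, cp_post).

    state_post[i][k] approximates P(S_i = k | X) and cp_post[k][i]
    approximates P(CP_k = i | X), both with 0-based i and k.
    """
    n, K = len(x), len(mus)
    norm = sigma * math.sqrt(2.0 * math.pi)
    g = [[math.exp(-0.5 * ((xi - mu) / sigma) ** 2) / norm for mu in mus]
         for xi in x]

    # Forward quantities F[i][k], starting in the first segment.
    F = [[0.0] * K for _ in range(n)]
    F[0][0] = g[0][0]
    for i in range(1, n):
        for k in range(K):
            prev = F[i - 1][k] * (1.0 - eta)
            if k > 0:
                prev += F[i - 1][k - 1] * eta
            F[i][k] = prev * g[i][k]

    # Backward quantities B[i][k], ending in the last segment.
    B = [[0.0] * K for _ in range(n)]
    B[n - 1][K - 1] = 1.0
    for i in range(n - 1, 0, -1):
        for k in range(K):
            val = (1.0 - eta) * g[i][k] * B[i][k]
            if k < K - 1:
                val += eta * g[i][k + 1] * B[i][k + 1]
            B[i - 1][k] = val

    lik = F[0][0] * B[0][0]  # P(X = x, S_n = K), the common denominator
    state_post = [[F[i][k] * B[i][k] / lik for k in range(K)]
                  for i in range(n)]
    cp_post = [[F[i][k] * eta * g[i + 1][k + 1] * B[i + 1][k + 1] / lik
                for i in range(n - 1)]
               for k in range(K - 1)]
    return state_post, cp_post


x = [0.1, -0.2, 0.0, 2.1, 1.9, 2.0]
state_post, cp_post = posteriors(x, mus=[0.0, 2.0], sigma=0.5, eta=0.2)
print([round(p, 4) for p in cp_post[0]])
```

Because the constrained chain starts in state 1 and must end in state $K$, each of the $K - 1$ change-points occurs exactly once, so each row of `cp_post` sums to one. That gives a quick sanity check on an implementation.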
