Generalizing Tree Probability Estimation via Bayesian Networks

Cheng Zhang and Frederick A. Matsen IV
Fred Hutchinson Cancer Research Center, Seattle, WA
December 1, 2018

Phylogenetic Trees (1/8)

In molecular evolution, phylogenetic trees are used to model the evolutionary relationships among various biological species or other entities.

[Figure: Tree of life, from Darwin’s Notebook]

Probability Estimation of Phylogenetic Trees (2/8)

[Figure: Markov chain Monte Carlo]

What is the best way to use MCMC samples?
• Sample relative frequencies (SRF).
  – Do not generalize!
• Conditional clade distribution (CCD).
  – Not flexible enough for real data!

Current approaches are unsatisfactory.

Our Contribution: Subsplit Bayesian Networks. A general probability estimation framework for phylogenetic trees based on MCMC samples that
• generalizes to unsampled trees.
• provides accurate approximation for real data posteriors.

Key: harness the similarity of trees properly.
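To see why sample relative frequencies do not generalize, here is a minimal sketch of an SRF estimator. It assumes each sampled tree topology has already been reduced to a canonical, hashable key (plain strings here, which is an illustrative choice and not something the slides specify); the function name `srf_estimate` is hypothetical.

```python
from collections import Counter

def srf_estimate(sampled_trees):
    """Estimate tree probabilities as relative frequencies in the sample.

    `sampled_trees` is a list of canonical topology keys (assumed hashable).
    """
    counts = Counter(sampled_trees)
    total = len(sampled_trees)
    return {tree: n / total for tree, n in counts.items()}

# Hypothetical MCMC sample over four-leaf topologies.
samples = ["((A,B),(C,D))"] * 3 + ["((A,C),(B,D))"]
estimate = srf_estimate(samples)

# A topology that never appeared in the sample has no entry at all,
# so its estimated probability is zero regardless of how similar it
# is to the sampled trees.
unsampled = estimate.get("((A,D),(B,C))", 0.0)
```

Any tree missing from the sample gets probability zero under this estimator, which is the limitation the slide points to.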

Problem Setup (3/8)

• Leaf label set $\mathcal{X} = \{O_1, \ldots, O_N\}$, each label represents a species.
• A clade $X$ is a nonempty subset of $\mathcal{X}$.
  Examples: $C_5 = \{O_3, O_4, O_5\}$, $C_7 = \{O_6, O_7\}$.

[Figure: example rooted tree $T$ on leaves $O_1, \ldots, O_8$ with clades $C_1, \ldots, C_7$]

• Clade Decomposition
  $T_C = \{C_2, C_3, C_4, C_5, C_6, C_7\}$

A subsplit of a clade $X$ is an ordered pair of disjoint subclades $(Y, Z)$ such that $Y \cup Z = X$, $Y \succ Z$.
Examples: $C_1 \to (C_2, C_3)$, $C_2 \to (C_4, C_5)$.

• Subsplit Decomposition
  $T_S = \{(C_2, C_3), (C_4, C_5), (\{O_3\}, C_6), (C_7, \{O_8\})\}$

$p(T) = p(C_2, C_3)\, p(C_4, C_5 \mid C_2, C_3)\, p(C_6 \mid C_4, C_5)\, p(C_7 \mid C_2, C_3)$
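The subsplit decomposition can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: trees are written as nested tuples of leaf labels, the order $Y \succ Z$ is taken to mean "the subclade containing the smallest label comes first", and, to match the set $T_S$ on the slide, subsplits of two-leaf clades are skipped because they have only one possible value. The tree shape is inferred from the clades listed above, and the helper names `clade` and `subsplits` are hypothetical.

```python
def clade(tree):
    """Return the set of leaf labels below a node of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    left, right = tree
    return clade(left) | clade(right)

def subsplits(tree):
    """List the subsplit (Y, Z) of every internal node with at least 3 leaves.

    Ordering inside each pair is by smallest leaf label (an assumed
    stand-in for the order relation used on the slide).
    """
    if isinstance(tree, str):
        return []
    left, right = tree
    y, z = sorted((clade(left), clade(right)), key=min)
    own = [(y, z)] if len(y | z) >= 3 else []
    return own + subsplits(left) + subsplits(right)

# Example tree consistent with the clades C_1, ..., C_7 on the slide.
tree = ((("O1", "O2"), ("O3", ("O4", "O5"))), (("O6", "O7"), "O8"))
for y, z in subsplits(tree):
    print(sorted(y), sorted(z))
```

With this input the four pairs correspond to $(C_2, C_3)$, $(C_4, C_5)$, $(\{O_3\}, C_6)$ and $(C_7, \{O_8\})$.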

Subsplit Bayesian Networks (4/8)

A Subsplit Bayesian Network on a leaf set $\mathcal{X}$ of size $N$ is a Bayesian network
• nodes take on subsplit / singleton clade values.
• contains a full and complete binary tree.

[Figure: an SBN with nodes $S_1, \ldots, S_7$ and example rooted trees on leaves A, B, C, D with their node assignments]

SBN probability for rooted trees

$p_{\mathrm{sbn}}(T) = p(S_1) \prod_{i>1} p(S_i \mid S_{\pi_i})$

SBNs provide valid probability distributions and are flexible.
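The SBN probability for a rooted tree, the root subsplit probability times the product of each subsplit's probability given its parent, can be evaluated with a short recursive walk. The sketch below makes several assumptions that are not in the slides: trees are nested tuples of leaf labels, each node's parent $S_{\pi_i}$ is the subsplit directly above it, subsplit pairs are ordered by smallest leaf label, and the probability tables are hand-written toy numbers rather than values learned from MCMC samples. All function and variable names are hypothetical.

```python
def clade(tree):
    """Return the set of leaf labels below a node of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return clade(tree[0]) | clade(tree[1])

def subsplit(tree):
    """Ordered pair of child clades (smallest leaf label first; assumed order)."""
    return tuple(sorted((clade(tree[0]), clade(tree[1])), key=min))

def sbn_probability(tree, root_probs, cond_probs):
    """Evaluate p(S_1) * prod_{i>1} p(S_i | S_parent(i)) for a rooted tree."""
    def walk(node, parent):
        # Leaves and two-leaf clades admit a single value: probability 1.
        if isinstance(node, str) or len(clade(node)) < 3:
            return 1.0
        s = subsplit(node)
        if parent is None:
            p = root_probs.get(s, 0.0)
        else:
            p = cond_probs.get((parent, s), 0.0)
        return p * walk(node[0], s) * walk(node[1], s)
    return walk(tree, None)

# Toy tables on leaves A, B, C, D (illustrative numbers only).
fs = frozenset
root_probs = {
    (fs("AB"), fs("CD")): 0.6,
    (fs("A"), fs("BCD")): 0.4,
}
cond_probs = {
    ((fs("A"), fs("BCD")), (fs("B"), fs("CD"))): 0.7,
    ((fs("A"), fs("BCD")), (fs("BC"), fs("D"))): 0.3,
}

p_balanced = sbn_probability((("A", "B"), ("C", "D")), root_probs, cond_probs)
p_ladder = sbn_probability(("A", ("B", ("C", "D"))), root_probs, cond_probs)
p_other = sbn_probability(("A", (("B", "C"), "D")), root_probs, cond_probs)
```

In this toy setup the three supported trees have probabilities that sum to one, consistent with the slide's statement that SBNs provide valid probability distributions. Because a conditional table entry is shared by every tree containing that parent-child pair, a tree can receive positive probability even if it was never sampled as a whole.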
