Generalizing Tree Probability Estimation via Bayesian Networks

SLIDE 1

Generalizing Tree Probability Estimation via Bayesian Networks

Cheng Zhang and Frederick A. Matsen IV December 1, 2018

Fred Hutchinson Cancer Research Center, Seattle, WA

SLIDE 2

Phylogenetic Trees

Tree of life

In molecular evolution, phylogenetic trees are used to model the evolutionary relationships among biological species or other entities.

from Darwin’s Notebook

1/8

SLIDE 4

Probability Estimation of Phylogenetic Trees

Markov chain Monte Carlo

Current approaches are unsatisfactory.

  • Sample relative frequencies (SRF).
    – Do not generalize!
  • Conditional clade distribution (CCD).
    – Not flexible enough for real data!

What is the best way to use MCMC samples?

Our Contribution: Subsplit Bayesian Networks. A general probability estimation framework for phylogenetic trees based on MCMC samples that

  • generalizes to unsampled trees.
  • provides accurate approximation for real data posteriors.

Key: harness the similarity of trees properly.
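To make the "do not generalize" point about sample relative frequencies concrete, here is a minimal sketch. The tree strings and the sample are hypothetical, and `srf_estimate` is an illustrative helper rather than code from the talk.

```python
from collections import Counter


def srf_estimate(sampled_trees):
    """Estimate tree probabilities by sample relative frequencies (SRF)."""
    counts = Counter(sampled_trees)
    total = sum(counts.values())
    return {tree: n / total for tree, n in counts.items()}


# Hypothetical MCMC sample; each tree is written as a Newick-like string.
samples = ["((A,B),(C,D))"] * 3 + ["((A,C),(B,D))"]
srf = srf_estimate(samples)

# A tree that never appeared in the sample receives probability zero,
# no matter how similar it is to the sampled trees.
unsampled = "((A,D),(B,C))"
print(srf["((A,B),(C,D))"], srf.get(unsampled, 0.0))
```

Any estimator built only from whole-tree counts has this property, which is why sharing information across similar trees matters.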

2/8

SLIDE 10

Problem Setup

[Figure: a rooted tree on leaves O1, ..., O8 with clades labeled C1, ..., C7]

  • Leaf label set 𝒳 = {O1, . . . , ON}; each label represents a species.
  • A clade X is a nonempty subset of 𝒳.
    Examples: C5 = {O3, O4, O5}, C7 = {O6, O7}.
  • Clade Decomposition: TC = {C2, C3, C4, C5, C6, C7}

A subsplit of a clade X is an ordered pair of disjoint subclades (Y, Z) such that Y ∪ Z = X and Y ≻ Z.
Examples: C1 → (C2, C3), C2 → (C4, C5).

Subsplit Decomposition: TS = {(C2, C3), (C4, C5), ({O3}, C6), (C7, {O8})}

p(T) = p(C2, C3) p(C4, C5 | C2, C3) p(C6 | C4, C5) p(C7 | C2, C3)
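The two decompositions can be sketched in a few lines. This is an illustration under stated assumptions: trees are nested Python tuples with string leaves, the order Y ≻ Z is taken to mean "the clade holding the smallest leaf label comes first" (a convention chosen here, not given on the slide), and the example tree shape is the one implied by the clades listed above.

```python
def leaves(tree):
    """Return the clade (frozenset of leaf labels) below a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    left, right = tree
    return leaves(left) | leaves(right)


def decompose(tree, is_root=True):
    """Return (clades, subsplits) of a rooted binary tree.

    clades:    every clade except the full leaf set and the singletons.
    subsplits: the ordered pair (Y, Z) for every clade with at least 3 leaves.
    """
    if isinstance(tree, str):
        return set(), set()
    clade = leaves(tree)
    clades = set() if is_root else {clade}
    subsplits = set()
    if len(clade) >= 3:
        # Assumed ordering: the child clade with the smallest label goes first.
        y, z = sorted((leaves(tree[0]), leaves(tree[1])), key=min)
        subsplits.add((y, z))
    for child in tree:
        child_clades, child_subsplits = decompose(child, is_root=False)
        clades |= child_clades
        subsplits |= child_subsplits
    return clades, subsplits


# Tree with C4 = {O1, O2}, C5 = {O3, O4, O5}, C6 = {O4, O5}, C7 = {O6, O7}.
tree = ((("O1", "O2"), ("O3", ("O4", "O5"))), (("O6", "O7"), "O8"))
clades, subsplits = decompose(tree)
print(len(clades), len(subsplits))
```

For this tree the sketch finds six clades and four subsplits, matching the sizes of TC and TS.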

3/8

SLIDE 18

Subsplit Bayesian Networks

[Figure: an SBN with nodes S1, ..., S7, and example rooted trees on leaves A, B, C, D (with subsplits such as ABC | D, A | BC and AB | CD) together with their assignments to the nodes; some assignments are labeled 1.0]

A Subsplit Bayesian Network (SBN) on a leaf set 𝒳 of size N is a Bayesian network whose

  • nodes take on subsplit or singleton clade values.
  • structure contains a full and complete binary tree.

SBN probability for rooted trees:

psbn(T) = p(S1) ∏_{i>1} p(Si | Sπi)

where πi denotes the index set of the parents of node Si.

SBNs provide valid probability distributions and are flexible.
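The product formula can be evaluated with a short sketch. Everything concrete here is an assumption for illustration: trees are nested tuples, clades with fewer than three leaves are treated as deterministic (factor 1), the ordering inside a subsplit puts the clade with the smallest label first, and the probability tables are made-up numbers rather than values from the talk.

```python
def clade(tree):
    """Leaf set below a nested-tuple tree (leaves are strings)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return clade(tree[0]) | clade(tree[1])


def subsplit(tree):
    """Ordered subsplit at the root of `tree`; None if the clade has < 3 leaves."""
    if isinstance(tree, str) or len(clade(tree)) < 3:
        return None
    return tuple(sorted((clade(tree[0]), clade(tree[1])), key=min))


def sbn_probability(tree, root_probs, cond_probs):
    """p(S1) times the product of p(child subsplit | parent subsplit).

    Assumes the tree has at least 3 leaves.
    """
    def walk(node, parent):
        prob = 1.0
        for child in node:
            s = subsplit(child)
            if s is not None:  # trivial clades contribute a factor of 1
                prob *= cond_probs.get((parent, s), 0.0) * walk(child, s)
        return prob

    s1 = subsplit(tree)
    return root_probs.get(s1, 0.0) * walk(tree, s1)


fs = frozenset
ABC_D, AB_CD = (fs("ABC"), fs("D")), (fs("AB"), fs("CD"))
AB_C, A_BC = (fs("AB"), fs("C")), (fs("A"), fs("BC"))

# Hypothetical tables: a root distribution and one conditional distribution.
root_probs = {ABC_D: 0.6, AB_CD: 0.4}
cond_probs = {(ABC_D, AB_C): 0.7, (ABC_D, A_BC): 0.3}

t1 = ((("A", "B"), "C"), "D")
t2 = (("A", "B"), ("C", "D"))
t3 = (("A", ("B", "C")), "D")
print(sbn_probability(t1, root_probs, cond_probs))
```

With these tables the three supported trees have probabilities that sum to one, which is the sense in which the factorization yields a valid distribution.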

4/8

SLIDE 19

Learning SBNs

Rooted Trees

  • maximum likelihood

Unrooted Trees

  • Expectation Maximization
  • simple averaging lower bound maximization
  • incorporate regularization when necessary

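[Figure: an unrooted tree on leaves A, B, C, D with edges numbered 1 to 5; rooting it at edge 1 or at edge 3 (root/unroot) gives different rooted trees, e.g., with root subsplits A | BCD and AB | CD, each assigned to SBN nodes S1, ..., S7]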
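For rooted trees, maximum likelihood in a fully observed discrete Bayesian network reduces to normalized counts. The sketch below illustrates that idea only; the nested-tuple tree encoding, the subsplit ordering (smallest label first), the way counts are grouped for normalization, and the two sampled trees are assumptions made for this example. The unrooted case (EM or the simple averaging lower bound) is not covered.

```python
from collections import Counter


def clade(tree):
    """Leaf set below a nested-tuple tree (leaves are strings)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return clade(tree[0]) | clade(tree[1])


def subsplit(tree):
    """Ordered subsplit at the root of `tree`; None if the clade has < 3 leaves."""
    if isinstance(tree, str) or len(clade(tree)) < 3:
        return None
    return tuple(sorted((clade(tree[0]), clade(tree[1])), key=min))


def parent_child_pairs(tree):
    """Yield (parent subsplit, child subsplit) pairs found in a rooted tree."""
    parent = subsplit(tree)
    if parent is None:
        return
    for child in tree:
        s = subsplit(child)
        if s is not None:
            yield parent, s
            yield from parent_child_pairs(child)


def fit_sbn(trees):
    """Counting-based maximum likelihood for root and conditional tables."""
    root_counts = Counter(subsplit(t) for t in trees)
    pair_counts = Counter(p for t in trees for p in parent_child_pairs(t))
    # Normalize among subsplits of the same child clade under the same parent.
    totals = Counter()
    for (parent, s), n in pair_counts.items():
        totals[(parent, s[0] | s[1])] += n
    root_probs = {s: n / len(trees) for s, n in root_counts.items()}
    cond_probs = {(parent, s): n / totals[(parent, s[0] | s[1])]
                  for (parent, s), n in pair_counts.items()}
    return root_probs, cond_probs


# Two hypothetical sampled trees sharing the root subsplit ABC | DEF.
sampled = [
    ((("A", "B"), "C"), (("D", "E"), "F")),
    (("A", ("B", "C")), ("D", ("E", "F"))),
]
root_probs, cond_probs = fit_sbn(sampled)

# An unsampled tree that mixes the left side of one sample with the right
# side of the other still receives positive probability.
new_tree = ((("A", "B"), "C"), ("D", ("E", "F")))
root = subsplit(new_tree)
p_new = (root_probs[root]
         * cond_probs[(root, subsplit(new_tree[0]))]
         * cond_probs[(root, subsplit(new_tree[1]))])
print(p_new)
```

Sample relative frequencies would give `new_tree` probability zero; the factorized estimate recombines subsplits seen in different sampled trees.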

5/8

SLIDE 22

Experiments

[Figure, left and middle panels: log(estimated probability) versus log(ground truth) for CCD and for SBN-EM, with trees from peak 1 and peak 2 marked separately]

[Figure, right panel: KL divergence on DS1 versus K (10^4 to 10^5) for ccd, sbn-sa, sbn-em, sbn-em-α and srf]

A real data set with multimodal posterior

6/8

SLIDE 23

Experiments

Data set  (#Taxa, #Sites)  Tree space size  Sampled trees  KL divergence to ground truth
                                                           SRF      CCD      SBN-SA   SBN-EM   SBN-EM-α
DS1       (27, 1949)       5.84×10^32       1228           0.0155   0.6027   0.0687   0.0136   0.0130
DS2       (29, 2520)       1.58×10^35       7              0.0122   0.0218   0.0218   0.0199   0.0128
DS3       (36, 1812)       4.89×10^47       43             0.3539   0.2074   0.1152   0.1243   0.0882
DS4       (41, 1137)       1.01×10^57       828            0.5322   0.1952   0.1021   0.0763   0.0637
DS5       (50, 378)        2.84×10^74       33752          11.5746  1.3272   0.8952   0.8599   0.8218
DS6       (50, 1133)       2.84×10^74       35407          10.0159  0.4526   0.2613   0.3016   0.2786
DS7       (59, 1824)       4.36×10^92       1125           1.2765   0.3292   0.2341   0.0483   0.0399
DS8       (64, 1008)       1.04×10^103      3067           2.1653   0.4149   0.2212   0.1415   0.1236
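As a reading aid for the KL columns, the sketch below computes KL(p ‖ q) between a reference distribution p and an estimate q over trees. The direction of the divergence, the `floor` guard for trees that q misses, and the toy distributions are assumptions for illustration; the slides do not specify how the reported values were computed.

```python
import math


def kl_divergence(p, q, floor=1e-12):
    """KL(p || q) summed over the support of p.

    q values are floored (an assumption) so that trees missing from q give a
    large but finite penalty instead of a division by zero.
    """
    return sum(pt * math.log(pt / max(q.get(tree, 0.0), floor))
               for tree, pt in p.items() if pt > 0)


# Toy distributions over two hypothetical trees.
truth = {"t1": 0.5, "t2": 0.5}
estimate = {"t1": 0.25, "t2": 0.75}

kl_same = kl_divergence(truth, truth)
kl_diff = kl_divergence(truth, estimate)
print(kl_same, kl_diff)
```

The divergence is zero only when the estimate matches the reference on its support, and it grows quickly when the estimate puts almost no mass on trees the reference considers likely.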

7/8

SLIDE 24

Summary

Poster # 123

  • We proposed a general framework for tree probability estimation based on subsplit Bayesian networks.
  • SBNs exploit the similarity among trees to provide flexible probability estimators that generalize to unsampled trees.
  • Future work
    – extend to general trees
    – structure learning of SBNs
    – deeper investigation of the effect of parameter sharing
    – applications in other probabilistic learning problems in tree spaces (e.g., MCMC transition kernel design and variational inference)

8/8