Concatenation: A complicated story

SLIDE 1

Concatenation

A complicated story

SLIDE 2

Concatenated ML

  • Assumes all sequences evolve down 1 tree
  • As mutation rate -> 0
  • Likelihood dominated by single mutations
  • When a character changes, it changes once
  • Maximum Likelihood -> Maximum Parsimony
  • This is Claim 1
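A hedged sketch of the reasoning behind Claim 1, assuming an r-state symmetric model in which every branch length is a fixed multiple of a single rate \mu. The symbols \mu, \ell, c, k and \chi_i are notation introduced here, not taken from the slides. A site pattern \chi that needs at least \ell(\chi, T) changes on tree T has probability of order \mu^{\ell(\chi, T)}:

```latex
\begin{align*}
P(\chi \mid T, \mu) &= c(\chi, T)\,\mu^{\ell(\chi, T)} + O\!\left(\mu^{\ell(\chi, T) + 1}\right),
  \qquad c(\chi, T) > 0, \\
\log L(T) &= \sum_{i=1}^{k} \log P(\chi_i \mid T, \mu)
           = \log \mu \sum_{i=1}^{k} \ell(\chi_i, T) + O(1)
  \quad \text{as } \mu \to 0 .
\end{align*}
```

Because \log \mu \to -\infty, the tree with the smallest total parsimony length has the largest likelihood once \mu is small enough, which is the sense in which maximum likelihood approaches maximum parsimony under these assumptions.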
SLIDE 3

Proposition 1

The six-leaf balanced tree has a lower parsimony score than the unbalanced tree
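To make "parsimony score" concrete, here is a minimal sketch of Fitch's small-parsimony count for a single character on a rooted binary tree. The two topologies, the leaf names A to F, and the toy site pattern are hypothetical illustrations and are not taken from the slides or from the proof.

```python
def fitch_score(tree, states):
    """Minimum number of state changes for one character (Fitch's algorithm).

    tree:   nested 2-tuples with leaf names (str) at the tips.
    states: dict mapping each leaf name to its observed state.
    """
    def visit(node):
        # A leaf contributes its observed state and no changes.
        if isinstance(node, str):
            return {states[node]}, 0
        left, right = node
        left_set, left_cost = visit(left)
        right_set, right_cost = visit(right)
        common = left_set & right_set
        if common:
            # The children can agree on a state: no extra change needed.
            return common, left_cost + right_cost
        # The children disagree: take the union and count one change.
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


# Hypothetical six-leaf topologies (assumed shapes, for illustration only).
balanced = ((("A", "B"), "C"), (("D", "E"), "F"))
caterpillar = ((((("A", "B"), "C"), "D"), "E"), "F")

# Hypothetical site pattern in which only D and E share the derived state.
site = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 0}

balanced_score = fitch_score(balanced, site)
caterpillar_score = fitch_score(caterpillar, site)
```

For this one toy site the balanced tree needs fewer changes than the caterpillar. The score of an alignment is the sum of such per-site scores, so the comparison in Proposition 1 depends on how often each site pattern occurs.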

SLIDE 4
  • As branches shorten, deep coalescence occurs ‘almost always’
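A small sketch of why short branches matter, using the standard three-taxon result under the multispecies coalescent. It assumes the internal branch length t is measured in coalescent units; the function names are hypothetical.

```python
import math


def prob_deep_coalescence(t):
    """P(two lineages fail to coalesce within an internal branch of length t)."""
    return math.exp(-t)


def prob_discordant_gene_tree(t):
    """P(gene tree topology differs from a rooted three-taxon species tree).

    If the two lineages do not coalesce in the internal branch, the three
    lineages coalesce in random order above it, and two of the three
    possible orders disagree with the species tree.
    """
    return (2.0 / 3.0) * math.exp(-t)


# Print how both probabilities change as the internal branch shrinks.
for t in (2.0, 1.0, 0.1, 0.01):
    print(t, prob_deep_coalescence(t), prob_discordant_gene_tree(t))
```

As t approaches 0, the probability of deep coalescence approaches 1 and the probability of a discordant gene tree approaches 2/3.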

SLIDE 5

Conditions

  • Concatenated ML is inconsistent for certain caterpillar trees
  • Our intuition is that this holds for all species trees with n > 6
  • This proof covers caterpillar species trees with n = 6
  • However, the mutation rate must be very low
  • The r-state symmetric and infinite-alleles models are equivalent here
  • The problem gets worse as you add genes/sites
SLIDE 6

Partitioned ML

  • What makes it partitioned?
  • Only assume constant topology
  • Branch lengths + substitution matrices can vary

  • Statistically consistent under ILS?
  • Not affected by this proof, so maybe?
SLIDE 7

PANIC

  • Unpartitioned ML positively misleading
  • All results for statistical consistency of summary methods assume infinite-length genes

  • No gene tree error
  • In the presence of gene tree error: ?!?!
  • In the presence of recombination: ?!?!
  • No method proven consistent on fixed-length genes
SLIDE 8

Keep Calm, Concatenate

  • Short edges -> concatenated alignments fail
  • Extinction -> few short edges deep in tree
  • Failures probably restricted to twigs
  • Real data -> few deep short edges
  • But how would you know?
SLIDE 9

Now what?

  • Good news:
  • The species tree is identifiable
  • A statistically consistent method is possible
  • Bad news:
  • We either don’t have it or can’t prove we have it
  • Good news again:
  • Concatenation works ok, I guess
SLIDE 10

Questions?