
Phylogenetic trees IV: Maximum Likelihood. Gerhard Jäger, ESSLLI 2016.



  1. Phylogenetic trees IV: Maximum Likelihood. Gerhard Jäger, ESSLLI 2016.

  2. Theory

  3. Recap: continuous-time Markov model

     - equilibrium distribution: π = (s, r)
     - transition probabilities after time t:

           P(t) = ( s + r e^(-t)   r - r e^(-t) )
                  ( s - s e^(-t)   r + s e^(-t) )

     [Figure: a tree with branch lengths l_1, ..., l_8]
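A minimal sketch of this two-state model in Python (numpy/scipy; not part of the slides): it implements P(t) as given above and cross-checks it against the matrix exponential of the rate matrix Q = ((-r, r), (s, -s)). The values s = 0.7, r = 0.3 and t = 0.5 are arbitrary illustrations.

    import numpy as np
    from scipy.linalg import expm

    def transition_matrix(t, s, r):
        # P(t) exactly as on the slide; rows and columns are indexed by the states 0 and 1
        e = np.exp(-t)
        return np.array([[s + r * e, r - r * e],
                         [s - s * e, r + s * e]])

    s, r = 0.7, 0.3            # an arbitrary equilibrium distribution with s + r = 1
    t = 0.5                    # elapsed time / branch length

    P = transition_matrix(t, s, r)

    # cross-check against the matrix exponential of the rate matrix Q = [[-r, r], [s, -s]]
    Q = np.array([[-r, r], [s, -s]])
    assert np.allclose(P, expm(Q * t))

    print(P)                                  # each row sums to 1
    print(transition_matrix(50.0, s, r))      # rows converge to the equilibrium (s, r)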

  4. Likelihood of a tree

     - background reading: Ewens and Grant (2005), section 15.7
     - simplifying assumption: evolution on different branches is independent
     - suppose we know the probability distributions v_t and v_b over states at the top and bottom of a branch l_k
     - then L(l_k) = v_t^T P(l_k) v_b
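A small numerical illustration of the branch formula L(l_k) = v_t^T P(l_k) v_b, reusing the two-state P(t) from the previous sketch; the distributions and the branch length are made up.

    import numpy as np

    def transition_matrix(t, s, r):
        e = np.exp(-t)
        return np.array([[s + r * e, r - r * e],
                         [s - s * e, r + s * e]])

    s, r = 0.7, 0.3
    l_k = 0.8                            # branch length (made up)
    v_top = np.array([0.9, 0.1])         # distribution over states at the top of the branch
    v_bottom = np.array([0.2, 0.8])      # distribution over states at the bottom of the branch

    # L(l_k) = v_t^T P(l_k) v_b
    L_branch = v_top @ transition_matrix(l_k, s, r) @ v_bottom
    print(L_branch)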

  5. Likelihood of a tree

     - the likelihoods of states (0, 1) at the root are the componentwise product of v_1^T P(l_1) and v_2^T P(l_2)
     - the corresponding log-likelihoods are log(v_1^T P(l_1)) + log(v_2^T P(l_2))
     - log-likelihood of a larger tree: recursively apply this method from the tips to the root

  6. (Log-)likelihood of a tree

     log L(tips below | mother = s)
         = Σ_{d ∈ daughters} log Σ_{s' ∈ states} P(s → s' | branch length) · L(tips below d | d = s')
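This recursion can be sketched as follows (Felsenstein's pruning algorithm, written in Python); the four-taxon tree, its branch lengths and the observed states are invented for illustration, and conditional_likelihoods is an ad-hoc helper name.

    import numpy as np

    def transition_matrix(t, s, r):
        e = np.exp(-t)
        return np.array([[s + r * e, r - r * e],
                         [s - s * e, r + s * e]])

    def conditional_likelihoods(node, s, r):
        """L(tips below | node = state) for the states 0 and 1.

        A tip is an int (its observed state); an internal node is a tuple of
        (daughter subtree, branch length to that daughter) pairs."""
        if isinstance(node, int):
            v = np.zeros(2)
            v[node] = 1.0
            return v
        result = np.ones(2)
        for daughter, branch_length in node:
            v_d = conditional_likelihoods(daughter, s, r)
            # for each mother state: sum over daughter states of P(s -> s') * L(below daughter | s')
            result *= transition_matrix(branch_length, s, r) @ v_d
        return result

    # invented example: rooted tree ((t1, t2), (t3, t4)) with observed states 0, 0, 1, 1
    left  = ((0, 0.2), (0, 0.3))
    right = ((1, 0.1), (1, 0.4))
    root  = ((left, 0.5), (right, 0.5))

    s, r = 0.6, 0.4
    L_root = conditional_likelihoods(root, s, r)
    print(L_root, np.log(L_root))    # likelihoods / log-likelihoods of states (0, 1) at the root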

  7. (Log-)likelihood of a tree

     - this is for one character; the likelihood for all data is the product of the likelihoods for each character
     - the overall likelihood for the entire tree depends on the probability distribution at the root
     - if we assume that the root node is in equilibrium: L(tree) = (s, r)^T L(root)
     - then the likelihood does not depend on the location of the root (→ time reversibility)
     - this is essentially identical to the Sankoff algorithm for parsimony, with weight(i, j) = log P(l_k)_ij
     - the weight matrix depends on the branch length → needs to be recomputed for each branch

  8. (Log-)likelihood of a tree

     - the likelihood of a tree depends on the branch lengths and on the rates for each character
     - likelihood of a tree topology:

           L(topology) = max_{l_k : k is a branch} L(tree | (l_k))
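A sketch of how L(topology) can be computed in practice, assuming the transition_matrix and conditional_likelihoods helpers from the sketches above: the per-character likelihoods are combined as on the previous slide (sum of logs, with the root weighted by the equilibrium distribution), and the branch lengths are then optimized numerically. The data matrix, the fixed topology and the optimizer settings are illustrative choices, not the method used on the slides.

    import numpy as np
    from scipy.optimize import minimize

    # assumes transition_matrix() and conditional_likelihoods() from the earlier sketches
    s, r = 0.5, 0.5
    pi = np.array([s, r])

    # invented data set: 4 taxa (rows) x 6 binary characters (columns)
    data = np.array([[0, 1, 1, 0, 1, 0],
                     [0, 1, 0, 0, 1, 0],
                     [1, 0, 0, 1, 1, 1],
                     [1, 0, 1, 1, 0, 1]])

    def log_likelihood(bl, grouping):
        """log L(tree | branch lengths bl) for the rooted topology ((a, b), (c, d)):
        sum over characters of log( pi^T L(root) )."""
        (a, b), (c, d) = grouping
        total = 0.0
        for char in data.T:
            left  = ((int(char[a]), bl[0]), (int(char[b]), bl[1]))
            right = ((int(char[c]), bl[2]), (int(char[d]), bl[3]))
            root  = ((left, bl[4]), (right, bl[5]))
            total += np.log(pi @ conditional_likelihoods(root, s, r))
        return total

    def log_likelihood_of_topology(grouping):
        """L(topology) = max over branch lengths of L(tree | branch lengths)."""
        res = minimize(lambda bl: -log_likelihood(bl, grouping),
                       x0=np.full(6, 0.5), bounds=[(1e-6, None)] * 6, method="L-BFGS-B")
        return -res.fun, res.x    # maximized log-likelihood and the ML branch lengths

    best_ll, best_bl = log_likelihood_of_topology(((0, 1), (2, 3)))
    print(best_ll, best_bl)
    # note: under time reversibility only the sum of the two branches at the root is identified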

  9. (Log-)likelihood of a tree

     Where do we get the rates from? Different options, in increasing order of complexity:

     1. s = r = 0.5 for all characters
     2. r = empirical relative frequency of state 1 in the data (identical for all characters)
     3. a certain proportion p_inv (value to be estimated) of characters are invariant
     4. rates are Gamma-distributed

  10. Gamma-distributed rates

      - we want to allow rates to vary, but not too much
      - common method (no real justification except for mathematical convenience): the rate matrix is multiplied by a coefficient λ_i for character i
      - the equilibrium distribution is identical for all characters
      - λ_i is a random variable drawn from a Gamma distribution:

            L(r_i = x) = β^β x^(β-1) e^(-βx) / Γ(β)
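The density above is a Gamma distribution with shape and rate both equal to β, so its mean is 1; a quick check against scipy (β = 2 and x = 1.3 are arbitrary values):

    import numpy as np
    from scipy.stats import gamma
    from scipy.special import gamma as gamma_fn

    beta = 2.0      # shape parameter; illustrative value
    x = 1.3

    # density from the slide: beta^beta * x^(beta-1) * exp(-beta*x) / Gamma(beta)
    pdf_slide = beta**beta * x**(beta - 1) * np.exp(-beta * x) / gamma_fn(beta)

    # same density via scipy: shape a = beta, rate beta (i.e. scale = 1/beta), so the mean is 1
    pdf_scipy = gamma.pdf(x, a=beta, scale=1.0 / beta)

    assert np.isclose(pdf_slide, pdf_scipy)
    print(pdf_slide)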

  11. Gamma-distributed rates

      - overall likelihood of a tree topology: integrate over all λ_i, weighted by the Gamma likelihood
      - computationally impractical
      - in practice: split the Gamma distribution into n discrete bins (usually n = 4) and approximate the integration via a Hidden Markov Model
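One common way to build the n discrete bins (a sketch of the standard discrete-Gamma discretization, not necessarily the exact procedure used on the slides) is to cut the distribution at its 1/n, 2/n, ... quantiles and use the mean rate of each bin as that bin's representative rate:

    import numpy as np
    from scipy.stats import gamma

    def discrete_gamma_rates(beta, n=4):
        """Mean rate of each of n equal-probability bins of a Gamma distribution with
        shape = rate = beta (so the overall mean rate is 1)."""
        # bin boundaries: 0, the 1/n, 2/n, ... quantiles, and infinity
        edges = gamma.ppf(np.linspace(0, 1, n + 1), a=beta, scale=1.0 / beta)
        # mean of Gamma(a, rate b) restricted to a bin, divided by the bin probability 1/n:
        # n * (a/b) * [F_{a+1}(hi) - F_{a+1}(lo)], and here a = b = beta, so a/b = 1
        upper_cdf = gamma.cdf(edges[1:], a=beta + 1, scale=1.0 / beta)
        lower_cdf = gamma.cdf(edges[:-1], a=beta + 1, scale=1.0 / beta)
        return n * (upper_cdf - lower_cdf)

    rates = discrete_gamma_rates(beta=0.5, n=4)
    print(rates)                # four rate multipliers, one per category
    print(rates.mean())         # ~ 1.0 by construction

Each character's likelihood is then averaged over these categories instead of being integrated over the full distribution.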

  12. Modeling decisions to make

      aspect of model            possible choices      number of parameters to estimate
      branch lengths             unconstrained         2n - 3 (n is the number of taxa)
                                 ultrametric           n - 1
      equilibrium probabilities  uniform               0
                                 empirical             0
                                 ML estimate           1
      rate variation             none                  0
                                 Gamma-distributed     1
      invariant characters       none                  0
                                 p_inv                 1

      This could be continued: you can build in rate variation across branches, you can fit the number of Gamma categories, ...
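A small helper that counts the free parameters implied by one combination of choices from this table (a sketch; the option names are ad hoc, and the empirical equilibrium probabilities are counted as zero free parameters, as in the table above):

    def free_parameters(n_taxa, branch_lengths="unconstrained",
                        eq_probs="uniform", rate_variation="none", inv_chars="none"):
        """Count the free parameters of one model specification, following the table above."""
        k = 2 * n_taxa - 3 if branch_lengths == "unconstrained" else n_taxa - 1
        k += {"uniform": 0, "empirical": 0, "ML": 1}[eq_probs]
        k += {"none": 0, "Gamma": 1}[rate_variation]
        k += {"none": 0, "p_inv": 1}[inv_chars]
        return k

    print(free_parameters(25, "unconstrained", "ML", "Gamma", "p_inv"))   # 47 + 3 = 50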

  13. Model selection

      - tradeoff: rich models are better at detecting patterns in the data, but are prone to over-fitting; parsimonious models are less vulnerable to over-fitting but may miss important information
      - this is a standard issue in statistical inference
      - one possible heuristic: the Akaike Information Criterion (AIC)

            AIC = -2 × log-likelihood + 2 × number of free parameters

      - the model minimizing the AIC is to be preferred
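The criterion is easy to compute once the maximized log-likelihood and the parameter count are known; the numbers in this sketch are hypothetical:

    def aic(log_likelihood, n_free_parameters):
        """Akaike Information Criterion as defined on the slide."""
        return -2.0 * log_likelihood + 2.0 * n_free_parameters

    # hypothetical example: the richer model fits slightly better but pays a larger penalty
    models = {"simple (10 params)": aic(-8050.0, 10),
              "rich (14 params)": aic(-8048.5, 14)}
    best = min(models, key=models.get)
    print(models, "->", best)    # the model with the lowest AIC is preferred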

  14. Example: model selection for cognacy data

      [Table: AIC for 24 candidate models, crossing branch lengths (ultrametric on the UPGMA tree vs. unconstrained), equilibrium probabilities (uniform, empirical, ML), rate variation (none, Gamma) and invariant characters (none, p_inv); the AIC values range from roughly 15982 to 17520.]

  15. Tree search

      - the ML computation gives us the likelihood of a tree topology, given data and a model
      - ML tree:
        - heuristic search to find the topology maximizing the likelihood
        - optimize branch lengths to maximize the likelihood for that topology
      - computationally very demanding!
      - for the 25 taxa in our running example, an ML tree search for the full model requires several hours on a single processor; parallelization helps
      - ideally, one would want to do 24 heuristic tree searches, one for each model specification, and pick the tree + model with the lowest AIC
      - in practice one has to make compromises
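For four taxa the topology search can be done exhaustively, which makes the idea easy to sketch (assuming log_likelihood_of_topology from the earlier sketch); realistic data sets need a heuristic search over tree rearrangements instead:

    # the three possible unrooted groupings of four taxa
    groupings = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]
    scores = {g: log_likelihood_of_topology(g)[0] for g in groupings}

    for g, ll in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(g, round(ll, 3))
    # the grouping with the highest maximized log-likelihood is the ML tree for these data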

  16. Running example

  17. Running example: cognacy data

      - ultrametric: AIC = 7972
      - unconstrained branch lengths: AIC = 7929

      [Figure: the ultrametric and unconstrained ML trees over the 25 Indo-European languages]

  18. Running example: WALS data

      - ultrametric: AIC = 2828
      - unconstrained branch lengths: AIC = 2752

      [Figure: the ultrametric and unconstrained ML trees over the 25 Indo-European languages]

  19. Running example: phonetic data

      - ultrametric: AIC = 90575
      - unconstrained branch lengths: AIC = 89871

      [Figure: the ultrametric and unconstrained ML trees over the 25 Indo-European languages]

  20. Wrapping up

      ML is conceptually superior to MP (let alone distance methods):
      - the possibility of multiple mutations is taken into account, depending on branch lengths
      - different mutation rates for different characters are inferred from the data
      - as a side effect of the likelihood computation, a probability distribution over character states at each internal node can be read off

      Disadvantages:
      - computationally more expensive ⇒ not feasible for larger data sets (more than 100–200 taxa)
      - the many parameter settings make model selection difficult
      - the ultrametric constraint makes branch-length optimization computationally demanding (note that the ultrametric trees in our example are sometimes better even though they have a higher AIC)

  21. Cleaning up from yesterday
