Monophyletic concordance between species trees and gene genealogies



  1. Monophyletic concordance between species trees and gene genealogies with multiple mergers. Bjarki Eldon and James Degnan. Phylomania 2010, University of Tasmania, November 4-5, 2010.

  2. Low offspring number models. Kingman (1982) introduced the $n$-coalescent from an exchangeable Cannings offspring model. Let $\nu_i$ denote the number of offspring of individual $i$, with
     $$E[\nu_1^k] < \infty \quad \text{as } N \to \infty, \quad k \ge 1.$$
     Möhle and Sagitov (2001) characterised coalescent processes based on the timescale $c_N$:
     $$c_N = \frac{E[\nu_1(\nu_1 - 1)]}{N - 1}.$$

  3. Conditions for convergence to Kingman's coalescent. Wright-Fisher and Moran models are exchangeable Cannings models with
     $$\lim_{N \to \infty} \frac{E[\nu_1(\nu_1 - 1)(\nu_1 - 2)]}{N^2 c_N} = 0,$$
     implying $c_N \to 0$ and convergence to Kingman's coalescent.
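As a quick check of this condition, the sketch below evaluates $c_N$ and the triple-merger ratio exactly for the Wright-Fisher model. It assumes the standard marginal offspring law $\nu_1 \sim \mathrm{Binomial}(N, 1/N)$ and uses its factorial moments; the function names are illustrative and do not come from the slides.

```python
from fractions import Fraction

def wright_fisher_c_n(N):
    """c_N = E[nu_1 (nu_1 - 1)] / (N - 1), assuming nu_1 ~ Binomial(N, 1/N)."""
    second_factorial_moment = Fraction(N * (N - 1), N**2)
    return second_factorial_moment / (N - 1)

def triple_merger_ratio(N):
    """E[nu_1 (nu_1 - 1)(nu_1 - 2)] / (N^2 c_N) under the same assumption."""
    third_factorial_moment = Fraction(N * (N - 1) * (N - 2), N**3)
    return third_factorial_moment / (N**2 * wright_fisher_c_n(N))

for N in (10, 100, 1000):
    # c_N shrinks like 1/N and the ratio shrinks with it
    print(N, wright_fisher_c_n(N), float(triple_merger_ratio(N)))
```

Under this assumption $c_N = 1/N$ and the ratio equals $(N-1)(N-2)/N^3$, so both tend to zero as $N$ grows.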

  4. High variance in offspring distribution. The ecology, reproductive biology, and genetics of a diverse group of marine organisms suggest that many offspring are contributed by few individuals (Beckenbach 94; Hedgecock 94). Direct genotyping of parents and offspring provides evidence of large families in Pacific oyster (Boudry et al. 2002) and Lion-Paw Scallop (Petersen et al. 2008). Cod, oysters, mussels, barnacles, sea stars, plants?

  5. Evidence for large offspring distribution
     - broadcast spawning and external fertilization
     - high initial mortality
     - very large population sizes
     - low genetic diversity
     - large number of singleton genetic variants

  6. Λ-coalescent allows multiple mergers. Donnelly and Kurtz (1999), Pitman (1999), and Sagitov (1999) independently introduced a multiple-merger coalescent, the Λ-coalescent, with coalescence rate
     $$\lambda_{b,k} = \binom{b}{k} \int_0^1 x^k (1 - x)^{b-k} x^{-2} \, \Lambda(dx).$$
     Kingman's coalescent is obtained if $\Lambda = \delta_0$. For simultaneous multiple-merger coalescent processes, see Schweinsberg (2000) and Möhle and Sagitov (2001).

  7. Schweinsberg's heavy-tail model (Schweinsberg 2003). Each individual produces a random number $X_i$ of potential offspring. With $C > 0$, $a > 0$, and constant population size $N$,
     $$P[X_i \ge k] \sim C / k^a \quad \text{and} \quad E[X_i] > 1.$$
     From the pool of potential offspring, sample without replacement to form the new generation.
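One generation of this model can be sketched as follows. The particular heavy-tailed law used here, the integer part of a Pareto variable so that $P[X_i \ge k] = k^{-a}$ exactly (that is, $C = 1$), is an assumption chosen for illustration, as are the function names. The `counts` argument of `random.sample` requires Python 3.9 or later.

```python
import random

def potential_offspring(a, rng):
    """Draw X with P[X >= k] = k**(-a) for integers k >= 1 (assumed law, C = 1)."""
    u = 1.0 - rng.random()            # uniform on (0, 1]
    return int(u ** (-1.0 / a))       # integer part of a Pareto(a) variable

def next_generation(N, a, rng):
    """Return the parent index of each of the N individuals in the new generation."""
    counts = [potential_offspring(a, rng) for _ in range(N)]
    # Every X is at least 1, so the pool always holds at least N potential offspring.
    # Passing counts samples without replacement from the pool without building it.
    return rng.sample(range(N), k=N, counts=counts)

rng = random.Random(2010)
parents = next_generation(50, 1.5, rng)
family_sizes = sorted((parents.count(p) for p in set(parents)), reverse=True)
print(family_sizes[:5])   # the largest families in this generation
```

Smaller values of `a` make very large families more frequent, which is the mechanism behind the multiple mergers discussed on the following slides.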

  8. Coalescent process depends on $a$. Coalescent timescale in units of $c_N \sim N^{a-1}$ if $1 < a < 2$.

     | case | coalescent | coalescence rate |
     |---|---|---|
     | $a \ge 2$ | Kingman coalescent | $\binom{b}{2}$ |
     | $1 \le a < 2$ | $\Lambda \sim \mathrm{Beta}(2 - a, a)$ | $\binom{b}{k} \dfrac{B(k - a, b - k + a)}{B(2 - a, a)}$ |
     | $0 < a < 1$ | Ξ-coalescent | |
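The Beta-coalescent rate in the middle row can be evaluated directly with log-gamma functions. This is a minimal sketch; the function names are illustrative, and the argument check reflects only the range $1 \le a < 2$ stated in the table.

```python
from math import comb, exp, lgamma

def log_beta(x, y):
    """Logarithm of the beta function B(x, y) for x, y > 0."""
    return lgamma(x) + lgamma(y) - lgamma(x + y)

def beta_coalescent_rate(b, k, a):
    """Rate at which some k of b lines merge when Lambda = Beta(2 - a, a)."""
    if not (1 <= a < 2 and 2 <= k <= b):
        raise ValueError("need 1 <= a < 2 and 2 <= k <= b")
    return comb(b, k) * exp(log_beta(k - a, b - k + a) - log_beta(2 - a, a))

# Rates of 2-, 3-, 4- and 5-mergers among b = 5 lines for a = 1.5
print([round(beta_coalescent_rate(5, k, 1.5), 4) for k in range(2, 6)])
```

For $b = k = 2$ the two beta functions cancel, so a pair of lines always merges at rate 1 regardless of $a$.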

  9. A modified Moran model (Eldon and Wakeley 2006). A modified Moran model, in which the offspring number $U$ is random rather than fixed at one as in the usual Moran model:
     $$P[U = u] = (1 - \varepsilon_N)\,\delta_2 + \varepsilon_N\,\delta_{[\psi N]} \quad \text{and} \quad \varepsilon_N \sim 1 / N^\gamma, \quad \gamma > 0.$$

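  10. Coalescent process depends on $\gamma$. The coalescent timescale is $N_\gamma = \min(N^\gamma, N^2)$, $\gamma > 0$.

      | case | coalescence rate | timescale |
      |---|---|---|
      | $\gamma > 2$ | $\binom{n}{2}$ | $N^2$ |
      | $\gamma = 2$ | $\binom{b}{k} \left( \delta_2 + \psi^k (1 - \psi)^{b-k} \right)$ | $N^2$ |
      | $\gamma < 2$ | $\binom{b}{k} \psi^k (1 - \psi)^{b-k}$ | $N^\gamma$, $1 < \gamma < 2$ |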

  11. Ratios of coalescence times for $\Lambda = K + \Lambda_\psi$. [Figure: ratios $R_1$ (◦), $R_2$ (△), $R_3$ (▽), $R_4$ (⋄), and $R_{n-1}$ (+) plotted against sample size $n$ from 0 to 500.]

  12. Ratios of coalescence times for $\Lambda = \mathrm{Beta}(0.9, 1.1)$. [Figure: ratios $R_1$ (◦), $R_2$ (△), $R_3$ (▽), $R_4$ (⋄), and $R_{n-1}$ (+) plotted against sample size $n$ from 0 to 500.]

  13. Ratios of coalescence times for $\Lambda = \mathrm{Beta}(0.1, 1.9)$. [Figure: ratios $R_1$ (◦), $R_2$ (△), $R_3$ (▽), $R_4$ (⋄), and $R_{n-1}$ (+) plotted against sample size $n$ from 0 to 500.]

  14. Monophyletic concordance for Λ-coalescents. [Figure: tree diagram with species A and B and time $t$.]

  15. Not monophyletic concordance. [Figure: tree diagram with species A and B and time $t$.]

  16. General form for $P[MC]$ for two species.
      $$P[MC] = \sum_{m_A, m_B} P[MC; m_A, m_B] \, P[m_A, m_B]$$
      with $P[m_A, m_B] = G_{n_A, m_A}(t) \, G_{n_B, m_B}(t)$ and
      $$P[MC; m_A, m_B] = \sum_{k=2}^{m_A + m_B} \beta_{m_A + m_B, k} \left( \binom{m_A}{k} P[MC; m_A - k + 1, m_B] + \binom{m_B}{k} P[MC; m_A, m_B - k + 1] \right) \Big/ \binom{m_A + m_B}{k}.$$
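The recursion for $P[MC; m_A, m_B]$ can be sketched with memoisation. Two assumptions are made here and are not stated on the slide: $\beta_{n,k}$ is read as the probability that the next merger among $n$ lines involves exactly $k$ of them, and the recursion stops at $P[MC; 1, 1] = 1$. The helper names are illustrative.

```python
from functools import lru_cache
from math import comb

def make_p_mc(beta):
    """Build P[MC; m_a, m_b] from beta(n, k), the assumed probability that the
    next merger among n lines is a k-merger."""
    @lru_cache(maxsize=None)
    def p_mc(m_a, m_b):
        if m_a == 1 and m_b == 1:        # assumed base case: one line per species
            return 1.0
        total = m_a + m_b
        prob = 0.0
        for k in range(2, total + 1):
            term = 0.0
            if k <= m_a:                 # the k merging lines are all from species A
                term += comb(m_a, k) * p_mc(m_a - k + 1, m_b)
            if k <= m_b:                 # the k merging lines are all from species B
                term += comb(m_b, k) * p_mc(m_a, m_b - k + 1)
            prob += beta(total, k) * term / comb(total, k)
        return prob
    return p_mc

def beta_kingman(n, k):
    """Kingman's coalescent: every merger is binary."""
    return 1.0 if k == 2 else 0.0

def beta_psi(psi):
    """Merger-size probabilities when k-mergers occur at rate C(n,k) psi^k (1-psi)^(n-k)."""
    def beta(n, k):
        weights = [comb(n, j) * psi**j * (1 - psi) ** (n - j) for j in range(2, n + 1)]
        return weights[k - 2] / sum(weights)
    return beta

print(make_p_mc(beta_kingman)(2, 2))     # two lines per species, Kingman
print(make_p_mc(beta_psi(0.5))(2, 2))    # same sample with multiple mergers allowed
```

With binary mergers only, two lines from each species give $1/9$; allowing larger mergers lowers this, because a merger of three or more lines must mix the two species in a sample of this size.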

  17. Computing $G_{i,j}(t)$. $G_{i,j}(t)$ is the probability of $j$ lines at time $t$ when starting from $i$ lines at time zero within one population. A vector $c$ of ordered mergers associated with Kingman's coalescent is simply $\{2, 2, \ldots, 2\}$. By way of example, starting from 10 lines, say, a coalescence sequence could be $\{3, 2, 5, 3\}$ in a Λ-coalescent. We condition on the embedded chain, or the order of mergers, with transition probabilities
      $$\beta_{i,j} = \begin{cases} \dfrac{q_{i,j}}{\sum_{k \ne i} q_{i,k}} & \text{if } i \ne j, \\ 0 & \text{otherwise.} \end{cases}$$

  18. The rate matrix $Q_A$ of $(A_t;\, t \ge 0)$ is
      $$q_{j,i} = \binom{j}{j - i + 1} \int_0^1 x^{j-i-1} (1 - x)^{i-1} \, \Lambda(dx),$$
      $$q_{j,j} = -\sum_{i=1}^{j-1} q_{j,i}, \quad 2 \le j \le n,$$
      $$q_{j,i} = 0 \quad \text{otherwise.}$$

  19. Using eigenvectors and eigenvalues of $Q_A$. The eigenvalues of $Q_A$ are its diagonal entries, $q_{k,k} = -\alpha(k)$. The left eigenvector is $l^{(k)} = (l^{(k)}_1, \ldots, l^{(k)}_n)$ and the right eigenvector is $r^{(k)} = (r^{(k)}_1, \ldots, r^{(k)}_n)$. They are obtained by the recursions
      $$l^{(k)}_j = \frac{q_{j+1,j}\, l^{(k)}_{j+1} + \cdots + q_{k,j}\, l^{(k)}_k}{q_{k,k} - q_{j,j}}, \quad 1 \le j < k,$$
      $$r^{(k)}_j = \frac{q_{j,k}\, r^{(k)}_k + \cdots + q_{j,j-1}\, r^{(k)}_{j-1}}{q_{k,k} - q_{j,j}}, \quad 1 < k < j \le n.$$

  20. The spectral decomposition of $Q_A$ yields the transition probabilities $G_{i,j}(t) \equiv P[A_t = j \mid A_0 = i]$ as
      $$G_{i,j}(t) = \sum_{k=j}^{i} e^{-\alpha(k) t} \, r^{(k)}_i \, l^{(k)}_j.$$
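The eigenvector recursions and the spectral sum can be sketched in a few lines. The rates used are the $\Lambda_\psi$ rates from slide 22; the normalisation $l^{(k)}_k = r^{(k)}_k = 1$ is an assumption (the slides do not state one), and the function names are illustrative. State 0 is unused so that indices match the number of lines.

```python
from math import comb, exp

def rate_matrix(n, psi):
    """Q[i][j] = C(i, i-j+1) psi^(i-j+1) (1-psi)^(j-1) for 1 <= j < i <= n."""
    Q = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(2, n + 1):
        for j in range(1, i):
            Q[i][j] = comb(i, i - j + 1) * psi ** (i - j + 1) * (1 - psi) ** (j - 1)
        Q[i][i] = -sum(Q[i][1:i])        # rows sum to zero
    return Q

def left_eigenvector(Q, k):
    """Left eigenvector for eigenvalue Q[k][k], assuming l[k] = 1."""
    l = [0.0] * len(Q)
    l[k] = 1.0
    for j in range(k - 1, 0, -1):
        numerator = sum(Q[m][j] * l[m] for m in range(j + 1, k + 1))
        l[j] = numerator / (Q[k][k] - Q[j][j])
    return l

def right_eigenvector(Q, k):
    """Right eigenvector for eigenvalue Q[k][k], assuming r[k] = 1."""
    n = len(Q) - 1
    r = [0.0] * len(Q)
    r[k] = 1.0
    for j in range(k + 1, n + 1):
        numerator = sum(Q[j][m] * r[m] for m in range(k, j))
        r[j] = numerator / (Q[k][k] - Q[j][j])
    return r

def transition_probability(Q, i, j, t):
    """G_{i,j}(t) as the sum over k = j..i of exp(q_kk t) r^(k)_i l^(k)_j."""
    return sum(
        exp(Q[k][k] * t) * right_eigenvector(Q, k)[i] * left_eigenvector(Q, k)[j]
        for k in range(j, i + 1)
    )

Q = rate_matrix(5, 0.3)
row = [transition_probability(Q, 5, j, 2.0) for j in range(1, 6)]
print(row, sum(row))   # the probabilities over j should add up to one
```

The recursions divide by differences of diagonal entries, so this sketch relies on the total merger rates being distinct across states, which holds for $0 < \psi < 1$.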

  21. Transition probabilities $G_{i,j}$ for $i = 3$.
      $$G_{3,2}(t) = \frac{q_{3,2}}{q_{3,2} + q_{3,3}} \, P[T_3 \le t,\; T_3 + T_2 > t]$$
      $$G_{3,1}(t) = \frac{q_{3,2}}{q_{3,2} + q_{3,3}} \, P[T_3 + T_2 \le t] + \frac{q_{3,3}}{q_{3,2} + q_{3,3}} \, P[T_3 \le t]$$
      $$G_{3,3}(t) = P[T_3 > t]$$
      and $G_{3,1}(t) + G_{3,2}(t) + G_{3,3}(t) = 1$.

  22. An example with $\Lambda_\psi$. Process with infinitesimal parameters
      $$q_{ij} = \binom{i}{i - j + 1} \psi^{i-j+1} (1 - \psi)^{j-1}.$$
      For $i = 3$ we obtain, with $\alpha(i) \equiv \sum_{k=1}^{i-1} q_{ik}$,
      $$G_{3,2}(t) = \frac{3}{2} \left( e^{-\alpha(2) t} - e^{-\alpha(3) t} \right)$$
      $$G_{3,1}(t) = 1 - \frac{3}{2} e^{-\alpha(2) t} + \frac{1}{2} e^{-\alpha(3) t}$$
      $$G_{3,3}(t) = e^{-\alpha(3) t}$$

  23. In general,
      $$G_{i,j}(t) = \sum_{c \in C_{i,j}} g_c(t), \quad 1 \le j < i,$$
      in which $c$ is a coalescence sequence, that is, a particular order of mergers in going from $i$ to $j$ sequences. The number of possible sequences is $|C_{i,j}| = 2^{i-j-1}$.
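The count $2^{i-j-1}$ can be checked by enumerating coalescence sequences directly. This is a small sketch with an illustrative function name: each sequence is a tuple of merger sizes, and a $k$-merger removes $k - 1$ lines.

```python
def coalescence_sequences(i, j):
    """All ordered tuples of merger sizes that take i lines down to j lines."""
    if i == j:
        return [()]
    sequences = []
    for k in range(2, i - j + 2):          # largest allowed merger leaves exactly j lines
        for rest in coalescence_sequences(i - k + 1, j):
            sequences.append((k,) + rest)
    return sequences

seqs = coalescence_sequences(10, 1)
print(len(seqs))                 # number of sequences from 10 lines to 1
print((3, 2, 5, 3) in seqs)      # the example sequence from slide 17
print((2,) * 9 in seqs)          # the all-binary sequence of Kingman's coalescent
```

The enumeration grows exponentially in $i - j$, so it is only practical for small samples.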

  24. $$g_c(t) = \begin{cases} p(c) \, P[T(c) \le t,\; T(c) + T_j > t] & \text{if } j > 1, \\ p(c) \, P[T(c) \le t] & \text{if } j = 1, \\ P[T_i > t] & \text{if } j = i, \end{cases}$$
      in which
      $$P[T(c) \le t,\; T(c) + T_j > t] = e^{-\alpha(j) t} \sum_{k=1}^{l} \frac{\gamma_k}{\beta(i_k, j)} \left( 1 - e^{-\beta(i_k, j) t} \right)$$
      with $\beta(i_k, j) \equiv \alpha(i_k) - \alpha(j)$; and
      $$P[T(c) \le t] = \sum_{k=1}^{l} \gamma'_k \left( 1 - e^{-\alpha(i_k) t} \right).$$

  25. Example: two species. The probability $P[MC]$ of monophyletic concordance for two lines from each of two species, with $\alpha_X(i) = \sum_{k=1}^{i-1} q_{ik}$ (for species $X$), is
      $$P[MC] = (1 - e^{-\alpha_A(2) t})(1 - e^{-\alpha_B(2) t}) + e^{-\alpha_A(2) t} (1 - e^{-\alpha_B(2) t}) \, \beta_{3,2} / 3 + (1 - e^{-\alpha_A(2) t}) \, e^{-\alpha_B(2) t} \, \beta_{3,2} / 3 + e^{-\alpha_A(2) t} e^{-\alpha_B(2) t} \, \beta_{4,2} \, \beta_{3,2} / 9.$$
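The four terms of this expression can be evaluated directly. In the sketch below the function and argument names are illustrative, and the Kingman values used in the usage lines (pair coalescence rate 1, all mergers binary so that both $\beta$ factors equal 1) are an assumed special case rather than something taken from the slide.

```python
from math import exp

def p_mc_two_lines_each(t, alpha_a2, alpha_b2, beta32, beta42):
    """P[MC] for two lines from each of two species that split a time t ago."""
    ea = exp(-alpha_a2 * t)    # the two A lines have not merged by time t
    eb = exp(-alpha_b2 * t)    # the two B lines have not merged by time t
    return (
        (1 - ea) * (1 - eb)              # both pairs merged within their own species
        + ea * (1 - eb) * beta32 / 3     # only the A pair remains unmerged
        + (1 - ea) * eb * beta32 / 3     # only the B pair remains unmerged
        + ea * eb * beta42 * beta32 / 9  # neither pair merged before time t
    )

# Assumed Kingman special case: alpha(2) = 1 and beta_{3,2} = beta_{4,2} = 1
for t in (0.0, 1.0, 4.0, 8.0):
    print(t, round(p_mc_two_lines_each(t, 1.0, 1.0, 1.0, 1.0), 4))
```

At $t = 0$ only the last term survives, and for large $t$ the first term dominates, so the probability rises towards one as the species split recedes into the past.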

  26. Two species and two lines each. [Figure, two panels with vertical axes from 0 to 1. First panel: ◦ $\Lambda_\psi$, △ $K + \Lambda_\psi$, plotted against psi. Second panel: ◦ $\mathrm{Beta}(2 - a, a)$, plotted against $a$.]

  27. Two species and two lines each. [Figure: ◦ $\Lambda_{0.05}$, △ $K + \Lambda_{0.05}$, ⋄ $\mathrm{Beta}(0.95, 1.05)$, + $K$, plotted against time $t$ from 0 to 8.]

  28. Two species and two lines each. [Figure: ◦ $\Lambda_{0.99}$, △ $K + \Lambda_{0.99}$, ⋄ $\mathrm{Beta}(0.05, 1.95)$, + $K$, plotted against time $t$ from 0 to 8.]

  29. Two species and three lines each. [Figure, two panels with vertical axes from 0 to 1. First panel: ◦ $\Lambda_\psi$, △ $K + \Lambda_\psi$. Second panel: ◦ $\mathrm{Beta}(2 - a, a)$.]

  30. Two species and three lines each. [Figure: ◦ $\Lambda_{0.05}$, △ $K + \Lambda_{0.05}$, ⋄ $\mathrm{Beta}(0.95, 1.05)$, plotted against time $t$ from 0 to 8.]

  31. Two species and three lines each. [Figure: ◦ $\Lambda_{0.95}$, △ $K + \Lambda_{0.95}$, ⋄ $\mathrm{Beta}(0.05, 1.95)$, plotted against time $t$ from 0 to 8.]

  32. Recursive approach for $s$ species. Let $\tilde{n} = n_1 + \cdots + n_s$, in which $n_i$ denotes the number of ancestral lines for species $i$ in a population, and let $\mathbf{n} = (n_1, \ldots, n_s)$. Then
      $$P[MC; \mathbf{n}] = \sum_{r=1}^{s} \sum_{k=2}^{\tilde{n}} \beta_{\tilde{n}, k} \binom{n_r}{k} P[MC; \mathbf{m}] \Big/ \binom{\tilde{n}}{k},$$
      in which $\mathbf{m} = (n_1, n_2, \ldots, n_{r-1}, n_r - k + 1, n_{r+1}, \ldots, n_s)$ and
      $$P[MC; (0, 0, \ldots, 0, 1)] = P[MC; (0, 0, \ldots, 0, 1, 1)] = 1.$$

  33. Three species and two lines each ($t_1 = 1$, $t_2 = 2$). [Figure: ◦ $\Lambda_{0.05}$, △ $K + \Lambda_{0.05}$, ⋄ $\mathrm{Beta}(0.95, 1.05)$, plotted against $\psi = a - 1$.]
