The mathematics of Darwin's theory of evolution: 1859 and 150 years later
Peter Schuster
Institut für Theoretische Chemie, Universität Wien, Austria, and the Santa Fe Institute, Santa Fe, New Mexico, USA
"The Mathematics of Darwin's Legacy", Centro Internacional de Matemática, Lisbon, 23-24 November 2009
Web-Page for further information: http://www.tbi.univie.ac.at/~pks
"La Filosophia è scritta in questo grandissimo libro, che continuamente ci stà aperto innanzi à gli occhi (io dico l'universo) ma non si può intendere se prima non s'impara à intender la lingua, e conoscer i caratteri, nei quali è scritto. Egli è scritto in lingua matematica, e i caratteri son triangoli, cerchi, & altre figure Geometriche ..."
"Philosophy [science] is written in this grand book, the universe ... . It is written in the language of mathematics, and its characters are triangles, circles and other geometric figures; ..."
Galileo Galilei. 1623. Il Saggiatore. Edizione Nazionale, Vol. 6, Florence 1896, p. 232.
Galileo Galilei, 1564 - 1642
1. If Charles Darwin had written the "Origin" in mathematical language, how would he have done it?
2. What did we learn about evolution from in vitro experiments?
3. Quantitative systems biology: a challenge for biologists, chemists, physicists, and mathematicians!
$F_{n+1} = F_n + F_{n-1}; \quad F_0 = 0, \; F_1 = 1$
Leonardo da Pisa ("Fibonacci"), ~1180 – ~1240; Thomas Robert Malthus, 1766 – 1834
1, 2 , 4 , 8 ,16 , 32 , 64, 128 , ... geometric progression exponential growth
The history of exponential growth
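The two growth laws on this slide can be compared numerically. The following is a minimal sketch, assuming the common convention F_0 = 0, F_1 = 1 for the Fibonacci numbers; the function names are illustrative only.

```python
def fibonacci(count):
    """Return the first `count` Fibonacci numbers, assuming F_0 = 0 and F_1 = 1."""
    values = [0, 1]
    while len(values) < count:
        values.append(values[-1] + values[-2])
    return values[:count]


def geometric(count, ratio=2):
    """Return the geometric progression 1, ratio, ratio**2, ... with `count` terms."""
    return [ratio ** k for k in range(count)]


fib = fibonacci(30)
doubling = geometric(8)
# The ratio of successive Fibonacci numbers approaches the golden ratio,
# so the Fibonacci sequence also grows (asymptotically) exponentially.
growth_ratio = fib[-1] / fib[-2]
```

Both sequences grow by an asymptotically constant factor per step: 2 for the doubling series, and the golden ratio (about 1.618) for the Fibonacci numbers.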
Leonhard Euler, 1707 - 1783
$\exp(x) \equiv \lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n$
Exponential function and exponential growth
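Euler's limit definition of the exponential function, exp(x) = lim (1 + x/n)^n for n going to infinity, can be checked for finite n. A small sketch; the chosen values of n are arbitrary.

```python
import math


def exp_limit(x, n):
    """Approximate exp(x) by (1 + x/n)**n for a finite n."""
    return (1.0 + x / n) ** n


# Approximations of e = exp(1) for increasing n.
approximations = {n: exp_limit(1.0, n) for n in (1, 10, 1000, 1_000_000)}
error = abs(approximations[1_000_000] - math.e)
```

The error shrinks roughly in proportion to 1/n, which is why the limit converges slowly.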
Pierre-François Verhulst, 1804-1849
$\frac{dx}{dt} = r\,x \left(1 - \frac{x}{C}\right), \qquad x(t) = \frac{x(0)\,C}{x(0) + \left(C - x(0)\right) e^{-rt}}$
The logistic equation, 1838
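As a consistency check, the closed-form solution of the logistic equation dx/dt = r x (1 - x/C), namely x(t) = x(0) C / (x(0) + (C - x(0)) exp(-r t)), can be compared with a direct numerical integration. The sketch below uses a classical fourth-order Runge-Kutta scheme; the parameter values are illustrative assumptions.

```python
import math


def logistic_exact(t, x0, r, C):
    """Closed-form solution of dx/dt = r*x*(1 - x/C) with x(0) = x0."""
    return x0 * C / (x0 + (C - x0) * math.exp(-r * t))


def logistic_rk4(t_end, x0, r, C, steps=1000):
    """Integrate the logistic equation with the classical Runge-Kutta method."""
    def rhs(x):
        return r * x * (1.0 - x / C)

    h = t_end / steps
    x = x0
    for _ in range(steps):
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * h * k1)
        k3 = rhs(x + 0.5 * h * k2)
        k4 = rhs(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return x


x0, r, C = 0.01, 1.0, 1.0          # illustrative values, not from the slides
exact = logistic_exact(10.0, x0, r, C)
numeric = logistic_rk4(10.0, x0, r, C)
```

Growth is nearly exponential while x is small and saturates at the carrying capacity C for long times.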
$\frac{dx}{dt} = r\,x \left(1 - \frac{x}{C}\right) \;\Rightarrow\; \frac{dx}{dt} = r\,x - \frac{x}{C}\, r\,x; \qquad r\,x \equiv \Phi(t),\; C = 1: \quad \frac{dx}{dt} = x\,(r - \Phi)$
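Darwin
Species $X_1, X_2, \ldots, X_n$ with $[X_i] = x_i$ and $\sum_{i=1}^{n} x_i = C = 1$:
$\frac{dx_j}{dt} = x_j \left( f_j - \sum_{i=1}^{n} f_i x_i \right) = x_j\,(f_j - \Phi); \qquad \Phi = \sum_{i=1}^{n} f_i x_i$
$\frac{d\Phi}{dt} = \langle f^2 \rangle - \langle f \rangle^2 = \mathrm{var}\{f\} \ge 0$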
Generalization of the logistic equation to n variables yields selection
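A minimal numerical sketch of the selection equation dx_j/dt = x_j (f_j - Φ) with Φ = Σ f_i x_i is given below. It uses explicit Euler steps; the fitness values, initial frequencies, and step size are assumptions chosen only for illustration. For this equation, differentiating Φ gives dΦ/dt = Σ f_j x_j (f_j - Φ), which is the variance of the fitness values, and the code checks that numerically.

```python
def selection_step(x, f, dt):
    """One explicit Euler step of dx_j/dt = x_j (f_j - Phi), Phi = sum_i f_i x_i."""
    phi = sum(fi * xi for fi, xi in zip(f, x))
    x_new = [xi + dt * xi * (fi - phi) for fi, xi in zip(f, x)]
    total = sum(x_new)               # renormalize to guard against rounding drift
    return [xi / total for xi in x_new], phi


def fitness_variance(x, f):
    """Variance of the fitness values f weighted by the frequencies x."""
    mean = sum(fi * xi for fi, xi in zip(f, x))
    return sum(xi * (fi - mean) ** 2 for fi, xi in zip(f, x))


f = [1.0, 2.0, 3.0]          # illustrative fitness values
x = [0.98, 0.01, 0.01]       # illustrative initial frequencies
dt = 0.01
var_start = fitness_variance(x, f)
phi_trace = []
for _ in range(5000):        # integrate up to t = 50
    x, phi = selection_step(x, f, dt)
    phi_trace.append(phi)
```

The mean fitness Φ never decreases along the trajectory, and the variant with the largest f_j takes over the population.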
Three necessary conditions for Darwinian evolution are: 1. Multiplication, 2. Variation, and 3. Selection. Darwin discovered the principle of natural selection from empirical observations in nature.
Gregor Mendel (1822-1884)
Gregor Mendel's experiments on plant genetics
Versuche über Pflanzen-Hybriden. Verhandlungen des naturforschenden Vereines in Brünn 4: 3–47, 1866.
Über einige aus künstlicher Befruchtung gewonnenen Hieracium-Bastarde. Verhandlungen des naturforschenden Vereines in Brünn 8: 26–31, 1870.
Ronald Fisher (1890-1962)
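Darwin and Mendel: alleles: $A_1, A_2, \ldots, A_n$; frequencies: $x_i = [A_i]$; genotypes: $A_i \cdot A_j$; fitness values: $a_{ij} = f(A_i \cdot A_j)$, $a_{ij} = a_{ji}$
$\frac{dx_j}{dt} = \sum_{i=1}^{n} a_{ji}\, x_i x_j - x_j \Phi = x_j \left( \sum_{i=1}^{n} a_{ji}\, x_i - \Phi \right), \quad j = 1, 2, \ldots, n$
with $\Phi(t) = \sum_{j=1}^{n} \sum_{i=1}^{n} a_{ji}\, x_i x_j$ and $\sum_{j=1}^{n} x_j = 1$
$\frac{d\Phi}{dt} = 2 \left( \langle \bar{a}^2 \rangle - \langle \bar{a} \rangle^2 \right) = 2\, \mathrm{var}\{\bar{a}\} \ge 0$, where the averages are taken over the marginal fitness values $\bar{a}_j = \sum_{i=1}^{n} a_{ji}\, x_i$
Ronald Fisher's selection equation: The Genetical Theory of Natural Selection. Oxford, UK, Clarendon Press, 1930.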
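Fisher's selection equation can be explored with a small two-allele sketch. The symmetric fitness matrix below (heterozygote advantage) and the step size are assumptions for illustration only. The code tracks the mean fitness Φ = Σ a_ji x_i x_j and the variance of the marginal fitness values Σ_i a_ji x_i.

```python
def fisher_step(x, a, dt):
    """One explicit Euler step of dx_j/dt = x_j (sum_i a_ji x_i - Phi)."""
    n = len(x)
    marginal = [sum(a[j][i] * x[i] for i in range(n)) for j in range(n)]
    phi = sum(x[j] * marginal[j] for j in range(n))
    variance = sum(x[j] * (marginal[j] - phi) ** 2 for j in range(n))
    x_new = [x[j] + dt * x[j] * (marginal[j] - phi) for j in range(n)]
    total = sum(x_new)               # renormalize to guard against rounding drift
    return [v / total for v in x_new], phi, variance


a = [[1.0, 2.0],
     [2.0, 1.5]]                     # illustrative symmetric fitness values a_ij = a_ji
x = [0.9, 0.1]                       # illustrative initial allele frequencies
dt = 0.01
trace = []
for _ in range(10000):               # integrate up to t = 100
    x, phi, variance = fisher_step(x, a, dt)
    trace.append((phi, variance))
phis = [entry[0] for entry in trace]
```

With these values the heterozygote is fittest, so the population settles at a polymorphic state (x_1 = 1/3) instead of fixing one allele, while Φ still increases monotonically at a rate of twice the variance.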
2. What did we learn about evolution from in vitro experiments?
                 Generation time (optimal)   Population size (maximal)   Mutation per replication event
Bacteria         20 min                      10^10                       1/400 – 1/300
Viruses          variable                    10^12                       1
RNA molecules    1 – 10 sec                  10^15                       tunable
The world of in vitro evolution experiments
Richard Lenski, 1956 -
Bacterial evolution under controlled conditions: A twenty-year experiment. Richard Lenski, Michigan State University, East Lansing
Epochal evolution of bacteria in serial transfer experiments under constant conditions (scale bar in the figure: 1 year)
S. F. Elena, V. S. Cooper, R. E. Lenski. Punctuated evolution caused by selection of rare beneficial mutants. Science 272 (1996), 1802-1804
Variation of genotypes in a bacterial serial transfer experiment
D. Papadopoulos, D. Schneider, J. Meier-Eiss, W. Arber, R. E. Lenski, M. Blot. Genomic evolution during a 10,000-generation experiment with bacteria. Proc.Natl.Acad.Sci.USA 96 (1999), 3807-3812
Innovation by mutation in long-term evolution of Escherichia coli in a constant environment
Z.D. Blount, C.Z. Borland, R.E. Lenski. 2008. Proc.Natl.Acad.Sci.USA 105:7899-7906
Three necessary conditions for Darwinian evolution are: 1. Multiplication, 2. Variation, and 3. Selection.
Charles Darwin, 1809-1882
All three conditions are fulfilled not only by cellular organisms but also by nucleic acid molecules – DNA or RNA – in suitable cell-free experimental assays:
Darwinian evolution in the test tube
Taq = Thermus aquaticus
Accuracy of replication: Q = q1 · q2 · q3 · … · qn
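The product formula Q = q1 · q2 · … · qn means that the probability of copying a whole sequence correctly falls off quickly with chain length. A short sketch, assuming independent single-digit accuracies; the numerical values are illustrative.

```python
import math


def replication_accuracy(digit_accuracies):
    """Q = q1 * q2 * ... * qn for independently copied digits."""
    return math.prod(digit_accuracies)


def uniform_accuracy(q, n):
    """Special case of equal single-digit accuracy q for all n positions."""
    return q ** n


# A per-digit accuracy of 99.9 % still yields mostly erroneous copies at n = 1000.
short_chain = replication_accuracy([0.99] * 100)
long_chain = uniform_accuracy(0.999, 1000)
```

Even with 99.9 % accuracy per digit, only about 37 % of the copies of a 1000-digit sequence are error-free.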
The logic of DNA replication
RNA replication by Qβ-replicase
C. Weissmann, The making of a phage. FEBS Letters 40 (1974), S10-S18
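$\frac{dx_1}{dt} = f_2 x_2 \quad \text{and} \quad \frac{dx_2}{dt} = f_1 x_1$
$x_1 = \sqrt{f_2}\,\xi_1, \quad x_2 = \sqrt{f_1}\,\xi_2, \quad \zeta = \xi_1 + \xi_2, \quad \eta = \xi_1 - \xi_2, \quad f = \sqrt{f_1 f_2}$
$\eta(t) = \eta(0)\, e^{-ft}, \qquad \zeta(t) = \zeta(0)\, e^{ft}$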
Complementary replication as the simplest molecular mechanism of reproduction
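Complementary (plus-minus) replication, dx1/dt = f2 x2 and dx2/dt = f1 x1, can be solved in closed form with the scaled variables x1 = √f2 ξ1 and x2 = √f1 ξ2: the sum ζ = ξ1 + ξ2 grows as exp(f t) and the difference η = ξ1 - ξ2 decays as exp(-f t), with f = √(f1 f2). The sketch below compares this closed form with a Runge-Kutta integration; the rate constants and initial values are assumptions for illustration.

```python
import math


def closed_form(t, x1_0, x2_0, f1, f2):
    """Solution via zeta = xi1 + xi2 (growing) and eta = xi1 - xi2 (decaying)."""
    f = math.sqrt(f1 * f2)
    xi1_0 = x1_0 / math.sqrt(f2)
    xi2_0 = x2_0 / math.sqrt(f1)
    zeta = (xi1_0 + xi2_0) * math.exp(f * t)
    eta = (xi1_0 - xi2_0) * math.exp(-f * t)
    return math.sqrt(f2) * (zeta + eta) / 2, math.sqrt(f1) * (zeta - eta) / 2


def integrate_rk4(t_end, x1, x2, f1, f2, steps=2000):
    """Integrate dx1/dt = f2*x2, dx2/dt = f1*x1 with classical Runge-Kutta."""
    def rhs(u, v):
        return f2 * v, f1 * u

    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(x1, x2)
        k2 = rhs(x1 + 0.5 * h * k1[0], x2 + 0.5 * h * k1[1])
        k3 = rhs(x1 + 0.5 * h * k2[0], x2 + 0.5 * h * k2[1])
        k4 = rhs(x1 + h * k3[0], x2 + h * k3[1])
        x1 += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        x2 += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x1, x2


f1, f2 = 1.0, 4.0                     # illustrative rate constants
exact = closed_form(5.0, 1.0, 0.0, f1, f2)
numeric = integrate_rk4(5.0, 1.0, 0.0, f1, f2)
strand_ratio = exact[0] / exact[1]    # approaches sqrt(f2 / f1) once eta has decayed
```

After the transient has decayed, the plus-minus ensemble grows exponentially like a single replicator with rate constant f, at a fixed strand ratio √(f2/f1).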
Christof K. Biebricher, 1941-2009
Kinetics of RNA replication
C.K. Biebricher, M. Eigen, W.C. Gardiner, Jr. Biochemistry 22:2544-2559, 1983
Evolution in the test tube: G.F. Joyce, Angew.Chem.Int.Ed. 46 (2007), 6420-6436
[Figure labels: RNA sample; stock solution: Qβ RNA-replicase, ATP, CTP, GTP and UTP, buffer; time axis; transfers 1, 2, 3, 4, 5, 6, ..., 69, 70]
Application of serial transfer technique to evolution of RNA in the test tube
Decrease in mean fitness due to quasispecies formation
The increase in RNA production rate during a serial transfer experiment
Manfred Eigen 1927 -
$\frac{dx_j}{dt} = \sum_{i=1}^{n} W_{ji}\, x_i - x_j \Phi; \quad j = 1, 2, \ldots, n; \qquad \Phi = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} x_i}$
Mutation and (correct) replication as parallel chemical reactions
M. Eigen. 1971. Naturwissenschaften 58:465
M. Eigen & P. Schuster. 1977. Naturwissenschaften 64:541, 65:7 and 65:341
$\frac{dx_j}{dt} = \sum_{i=1}^{n} W_{ji}\, x_i - x_j \Phi = \sum_{i=1}^{n} Q_{ji} f_i\, x_i - x_j \Phi; \quad j = 1, 2, \ldots, n; \qquad \Phi = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} x_i}$
Factorization of the value matrix W separates mutation and fitness effects.
Mutation-selection equation: $[I_i] = x_i \ge 0$, $f_i > 0$, $Q_{ij} \ge 0$
$\frac{dx_i}{dt} = \sum_{j=1}^{n} Q_{ij} f_j x_j - x_i \phi, \quad i = 1, 2, \ldots, n; \qquad \sum_{i=1}^{n} x_i = 1; \qquad \phi = \sum_{j=1}^{n} f_j x_j = \bar{f}$
Solutions are obtained after integrating factor transformation by means of an eigenvalue problem:
$x_i(t) = \frac{\sum_{k=0}^{n-1} \ell_{ik}\, c_k(0)\, \exp(\lambda_k t)}{\sum_{j=1}^{n} \sum_{k=0}^{n-1} \ell_{jk}\, c_k(0)\, \exp(\lambda_k t)}, \quad i = 1, 2, \ldots, n; \qquad c_k(0) = \sum_{i=1}^{n} h_{ki}\, x_i(0)$
$W = \{ W_{ij} = Q_{ij} f_j;\ i, j = 1, 2, \ldots, n \}; \quad L = \{ \ell_{ij};\ i, j = 1, 2, \ldots, n \}; \quad L^{-1} = H = \{ h_{ij};\ i, j = 1, 2, \ldots, n \}$
$L^{-1} \cdot W \cdot L = \Lambda = \{ \lambda_k;\ k = 0, 1, \ldots, n-1 \}$
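For long times the solution is dominated by the largest eigenvalue of W, so the stationary mutant distribution (the quasispecies) is the corresponding eigenvector normalized to sum 1. The sketch below finds it by power iteration for a small hypothetical example: binary sequences of length 3, a uniform per-digit error rate p, and a single fitter master sequence. All of these choices are assumptions made for illustration.

```python
def hamming(i, j):
    """Hamming distance between two binary sequences encoded as integers."""
    return bin(i ^ j).count("1")


def value_matrix(n, p, f):
    """W_ij = Q_ij * f_j with Q_ij = p**d * (1 - p)**(n - d), d = Hamming distance."""
    size = 2 ** n
    return [[p ** hamming(i, j) * (1 - p) ** (n - hamming(i, j)) * f[j]
             for j in range(size)] for i in range(size)]


def quasispecies(W, iterations=500):
    """Power iteration: dominant eigenvalue and eigenvector (normalized to sum 1)."""
    size = len(W)
    x = [1.0 / size] * size
    lam = 0.0
    for _ in range(iterations):
        y = [sum(W[i][j] * x[j] for j in range(size)) for i in range(size)]
        lam = sum(y)                 # eigenvalue estimate, since sum(x) = 1
        x = [yi / lam for yi in y]
    return lam, x


n, p = 3, 0.01                       # illustrative chain length and error rate
f = [2.0] + [1.0] * (2 ** n - 1)     # illustrative single-peak fitness values
W = value_matrix(n, p, f)
lam0, x_bar = quasispecies(W)
phi = sum(fj * xj for fj, xj in zip(f, x_bar))
# Residual of the stationarity condition sum_j W_ij x_j - x_i * phi = 0.
residual = max(abs(sum(W[i][j] * x_bar[j] for j in range(len(f))) - x_bar[i] * phi)
               for i in range(len(f)))
```

At the stationary state the mean fitness φ equals the largest eigenvalue, and the master sequence is surrounded by a cloud of mutants that is maintained by mutational backflow.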
constant level sets of
Selection of quasispecies with f1 = 1.9, f2 = 2.0, f3 = 2.1, and p = 0.01 , parametric plot on S3
Phenomenon                                        Optimization of fitness   Unique selection outcome
Selection                                         yes                       yes
Recombination and selection (independent genes)   yes                       no
Recombination and selection (interacting genes)   no                        no
Mutation and selection                            no                        yes
The Darwinian mechanism of variation and selection is a very powerful optimization heuristic.
The Darwinian mechanism and optimization of fitness
Chain length and error threshold
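$Q \cdot \sigma = (1-p)^n \cdot \sigma \ge 1 \;\Rightarrow\; n \cdot \ln(1-p) \ge -\ln \sigma$
With $\ln(1-p) \approx -p$ for small $p$:
$p \ldots \text{constant}: \; n_{\max} \approx \frac{\ln \sigma}{p}; \qquad n \ldots \text{constant}: \; p_{\max} \approx \frac{\ln \sigma}{n}$
$Q = (1-p)^n$ ... replication accuracy
$p$ ... error rate
$n$ ... chain length
$\sigma = f_m / \sum_{j \ne m} f_j$ ... superiority of master sequence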
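The threshold relations can be evaluated directly. The sketch below compares the approximation n_max ≈ ln σ / p with the exact bound obtained from (1 - p)^n σ ≥ 1; the values of p and σ are illustrative assumptions, and σ is treated as a given number.

```python
import math


def max_chain_length(p, sigma):
    """Approximate maximal chain length at fixed error rate: n_max = ln(sigma) / p."""
    return math.log(sigma) / p


def max_error_rate(n, sigma):
    """Approximate maximal error rate at fixed chain length: p_max = ln(sigma) / n."""
    return math.log(sigma) / n


def exact_max_chain_length(p, sigma):
    """Largest real n satisfying (1 - p)**n * sigma >= 1, without the small-p expansion."""
    return -math.log(sigma) / math.log(1.0 - p)


p, sigma = 0.001, 10.0               # illustrative values
n_approx = max_chain_length(p, sigma)
n_exact = exact_max_chain_length(p, sigma)
n_limit = math.floor(n_exact)        # longest integer chain length below threshold
```

Because the superiority enters only logarithmically, the maximal chain length is governed mainly by the inverse of the error rate.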
Quasispecies
Driving virus populations through threshold
The error threshold in replication: No mutational backflow approximation
$W = Q \cdot F$; $\lambda_0$, $\ell_0$: largest eigenvalue and eigenvector
[Diagram labels: diagonalization of matrix W, "complicated but not complex"; fitness landscape; mutation matrix, "complex" (complex); sequence → structure, "complex"; mutation; selection]
Complexity in molecular evolution
The single peak fitness landscapes corresponding to a mean field approximation
[Figure: stationary concentrations plotted against the error rate p = 1-q (axis from 0.00 to 0.10); regions labeled "quasispecies" and "uniform distribution".]
Stationary population or quasispecies as a function of the mutation or error rate p
Fitness landscapes and the search for error thresholds
Error threshold on single-peak and hyperbolic landscapes
Error threshold on single-peak, linear, and step-linear landscapes
Fitness landscapes showing error thresholds
Error threshold on a single peak fitness landscape with n = 50 and σ = 10
Error threshold: Individual sequences, n = 10, σ = 2 and d = 0, 1.0, 1.85
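The transition across the error threshold on a single-peak landscape can be reproduced with a reduced model in which sequences are lumped into error classes (Hamming distance 0, 1, ..., n from the master). The sketch below assumes binary sequences and a uniform per-digit error rate, with the master fitness σ and all other fitness values set to 1; it uses n = 50 and σ = 10 and finds the stationary distribution by power iteration.

```python
from math import comb, log


def class_mutation_matrix(n, p):
    """Q[i][j]: probability that copying a class-j sequence yields a class-i sequence."""
    Q = [[0.0] * (n + 1) for _ in range(n + 1)]
    for j in range(n + 1):
        for back in range(j + 1):            # mismatched digits mutated back to master
            for fwd in range(n - j + 1):     # matching digits mutated away from master
                prob = (comb(j, back) * comb(n - j, fwd)
                        * p ** (back + fwd) * (1 - p) ** (n - back - fwd))
                Q[j - back + fwd][j] += prob
    return Q


def master_fraction(n, p, sigma, iterations=400):
    """Stationary frequency of the master sequence on a single-peak landscape."""
    Q = class_mutation_matrix(n, p)
    f = [sigma] + [1.0] * n
    W = [[Q[i][j] * f[j] for j in range(n + 1)] for i in range(n + 1)]
    x = [1.0 / (n + 1)] * (n + 1)
    for _ in range(iterations):
        y = [sum(W[i][j] * x[j] for j in range(n + 1)) for i in range(n + 1)]
        total = sum(y)
        x = [v / total for v in y]
    return x[0]


n, sigma = 50, 10.0
p_threshold = log(sigma) / n                 # approximate threshold, ln(sigma) / n
below = master_fraction(n, 0.01, sigma)      # error rate below the threshold
above = master_fraction(n, 0.10, sigma)      # error rate above the threshold
```

Below the threshold the master sequence makes up more than half of the population; above it, the master frequency collapses to a value close to that of a uniform distribution over all sequences.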
Motoo Kimura's population genetics of neutral evolution. Evolutionary rate at the molecular level. Nature 217: 624-626, 1968. The Neutral Theory of Molecular Evolution. Cambridge University Press. Cambridge, UK, 1983.
Motoo Kimura
Is the Kimura scenario correct for frequent mutations?
$d_H = 1$: $\lim_{p \to 0} x_1(p) = x_2(p) = 0.5$
$d_H = 2$: $\lim_{p \to 0} x_1(p) = a, \quad \lim_{p \to 0} x_2(p) = 1 - a$
$d_H \ge 3$: $\lim_{p \to 0} x_1(p) = 1, \ \lim_{p \to 0} x_2(p) = 0$ or $\lim_{p \to 0} x_1(p) = 0, \ \lim_{p \to 0} x_2(p) = 1$; random fixation in the sense of Motoo Kimura
Pairs of neutral sequences in replication networks
P. Schuster, J. Swetina. 1988. Bull. Math. Biol. 50:635-650
A fitness landscape including neutrality
Neutral network: Individual sequences, n = 10, σ = 1.1, d = 1.0
Consensus sequence of a quasispecies of two strongly coupled sequences of Hamming distance dH(Xi,Xj) = 1.
Neutral network: Individual sequences, n = 10, σ = 1.1, d = 1.0
Consensus sequence of a quasispecies of two strongly coupled sequences of Hamming distance dH(Xi,Xj) = 2.
N = 7. Neutral networks with increasing λ: λ = 0.10, s = 229
Adjacency matrix
Many genotypes ⇒ one phenotype
A mapping $\psi$ and its inversion:
$\psi(I_j) = S_k$
$G_k = \psi^{-1}(S_k) = \bigcup \{ I_j \mid \psi(I_j) = S_k \}$
Space of genotypes: $\mathcal{I} = \{ I_1, I_2, I_3, I_4, \ldots, I_N \}$; Hamming metric
Space of phenotypes: $\mathcal{S} = \{ S_1, S_2, S_3, S_4, \ldots, S_M \}$; metric (not required)
$N \gg M$
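The inversion of a many-to-one genotype-phenotype map collects, for each phenotype, the set of all genotypes that map onto it (its neutral set). The sketch below uses a deliberately artificial map on binary strings of length 4; `toy_phenotype` is a hypothetical stand-in and has nothing to do with real RNA folding.

```python
from collections import defaultdict
from itertools import product


def toy_phenotype(genotype):
    """Hypothetical stand-in for a folding map: number of 1s, capped at 2."""
    return min(genotype.count("1"), 2)


# Space of genotypes: all binary strings of length 4 (N = 16).
genotypes = ["".join(bits) for bits in product("01", repeat=4)]

# Inversion of the map: phenotype -> set of genotypes forming it.
neutral_sets = defaultdict(list)
for genotype in genotypes:
    neutral_sets[toy_phenotype(genotype)].append(genotype)

sizes = {phenotype: len(members) for phenotype, members in neutral_sets.items()}
```

Here 16 genotypes map onto only 3 phenotypes, and the preimages differ greatly in size, a toy analogue of common and rare structures.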
Degree of neutrality of neutral networks and the connectivity threshold
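A multi-component neutral network formed by a rare structure: λ < λcr (degree of neutrality below the connectivity threshold)
A connected neutral network formed by a common structure: λ > λcr (degree of neutrality above the connectivity threshold)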
Evolution of RNA molecules as a Markov process
Replication in the flow reactor as a stochastic process with two absorbing barriers
[Figure: probability to reach the target structure (axis from 0.2 to 1) plotted against population size (axis from 10 to 22); curves labeled AUGC and GC.]
Probability of a single trajectory to reach the target structure
Computer simulation using Gillespie's algorithm:
Replication rate constant: $f_k = \gamma / [\alpha + \Delta d_S^{(k)}]$, with $\Delta d_S^{(k)} = d_H(S_k, S_\tau)$
Selection constraint: population size, N = # RNA molecules, is controlled by the flow: $N(t) \approx \bar{N} \pm \sqrt{\bar{N}}$
Mutation rate: p = 0.001 / site × replication
The flow reactor as a device for studies of evolution in vitro and in silico
Evolution in silico
W. Fontana, P. Schuster, Science 280 (1998), 1451-1455
Phenylalanyl-tRNA as target structure Structure of randomly chosen initial sequence
In silico optimization in the flow reactor: Evolutionary Trajectory
28 neutral point mutations during a long quasi-stationary epoch Transition inducing point mutations change the molecular structure Neutral point mutations leave the molecular structure unchanged
Neutral genotype evolution during phenotypic stasis
Evolutionary trajectory
Spreading of the population on neutral networks
Drift of the population center in sequence space
A sketch of optimization on neutral networks
3. Quantitative systems biology: a challenge for biologists, chemists, physicists, and mathematicians!
Systems biology or quantitative biology is the chemistry of whole cells and organisms.
Challenges for theorists and mathematicians:
1. Very large numbers of variables and parameters in ODE modeling
2. Stochastic effects because of very low particle numbers
3. Complex nonlinear reaction networks
4. Complex spatial structures in specific aggregates and compartments
Three-dimensional structure of the complex between the regulatory protein cro-repressor and the binding site on λ-phage B-DNA
[Figure legend: genes 1 to 12; regulatory protein or RNA; enzyme; metabolite; regulatory gene; structural gene]
A model genome with 12 genes
Sketch of a genetic and metabolic network
Biochemical Pathways
The reaction network of cellular metabolism published by Boehringer-Mannheim.
The bacterial cell as an example for the simplest form of autonomous life
Escherichia coli genome: 4 million nucleotides, 4460 genes
The structure of the bacterium Escherichia coli
Evolution does not design with the eyes of an engineer, evolution works like a tinkerer.
François Jacob. The Possible and the Actual. Pantheon Books, New York, 1982, and Evolution and tinkering. Science 196 (1977), 1161-1166.
D. Duboule, A.S. Wilkins. 1998. The evolution of 'bricolage'. Trends in Genetics 14:54-59.
The difficulty of defining the notion of "gene". Helen Pearson, Nature 441: 399-401, 2006
ENCODE Project Consortium. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447:799-816, 2007
ENCODE stands for ENCyclopedia Of DNA Elements.
Coworkers
Karl Sigmund, Universität Wien, AT
Peter Stadler, Bärbel M. Stadler, Universität Leipzig, GE
Paul E. Phillipson, University of Colorado at Boulder, CO
Heinz Engl, Philipp Kügler, James Lu, Stefan Müller, RICAM Linz, AT
Jord Nagel, Kees Pleij, Universiteit Leiden, NL
Walter Fontana, Harvard Medical School; Martin Nowak, Harvard University, MA
Christian Reidys, Nankai University, Tien Tsin, China
Christian Forst, Los Alamos National Laboratory, NM
Thomas Wiehe, Ulrike Göbel, Walter Grüner, Stefan Kopp, Jaqueline Weber, Institut für Molekulare Biotechnologie, Jena, GE
Ivo L. Hofacker, Christoph Flamm, Andreas Svrček-Seiler, Universität Wien, AT
Kurt Grünberger, Michael Kospach, Andreas Wernitznig, Stefanie Widder, Stefan Wuchty, Jan Cupal, Stefan Bernhart, Lukas Endler, Ulrike Langhammer, Rainer Machne, Ulrike Mückstein, Erich Bornberg-Bauer, Universität Wien, AT
Acknowledgement of support
Fonds zur Förderung der wissenschaftlichen Forschung (FWF): Projects No. 09942, 10578, 11065, 13093, 13887, and 14898
Wiener Wissenschafts-, Forschungs- und Technologiefonds (WWTF): Project No. Mat05
Jubiläumsfonds der Österreichischen Nationalbank: Project No. Nat-7813
European Commission: Contracts No. 98-0189, 12835 (NEST)
Austrian Genome Research Program – GEN-AU: Bioinformatics Network (BIN)
Österreichische Akademie der Wissenschaften
Siemens AG, Austria
Universität Wien and the Santa Fe Institute