Neutrality in Structural Bioinformatics and Molecular Evolution


slide-1
SLIDE 1
slide-2
SLIDE 2

Neutrality in Structural Bioinformatics and Molecular Evolution

Peter Schuster

Institut für Theoretische Chemie, Universität Wien, Austria and The Santa Fe Institute, Santa Fe, New Mexico, USA

Bioinformatics Research and Development 2008 Technische Universität Wien, 07.07.2008

slide-3
SLIDE 3

Web-Page for further information: http://www.tbi.univie.ac.at/~pks

slide-4
SLIDE 4
slide-5
SLIDE 5

Charles Darwin. The Origin of Species. Sixth edition. John Murray. London: 1872

slide-6
SLIDE 6

Motoo Kimura's population genetics of neutral evolution.
Evolutionary rate at the molecular level. Nature 217: 624-626, 1968.
The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge, UK, 1983.

slide-7
SLIDE 7

The average time of replacement of a dominant genotype in a population is the reciprocal of the mutation rate and is therefore independent of population size.

Fixation of mutants in neutral evolution (Motoo Kimura, 1955)
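
The independence from population size can be made explicit with a short worked derivation. The symbols are assumptions of this sketch: ν is the neutral mutation rate per genome and generation, N is the size of a diploid population, and k is the rate at which neutral mutants reach fixation.

```latex
% Assumed notation: \nu = neutral mutation rate, N = diploid population size,
% k = rate of fixation of neutral mutants
\begin{align*}
\text{new neutral mutants per generation} &= 2N\nu \\
\text{fixation probability of one new neutral mutant} &= \frac{1}{2N} \\
k = 2N\nu \cdot \frac{1}{2N} &= \nu \\
\text{mean time between replacements} &= \frac{1}{k} = \frac{1}{\nu}
\end{align*}
```

The population size cancels because a larger population produces more mutants, each of which is proportionally less likely to become fixed.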

slide-8
SLIDE 8

1. Ruggedness of molecular landscapes
2. Replication-mutation dynamics
3. Models of fitness landscapes
4. Ruggedness and error thresholds
5. Stochasticity of replication and mutation
6. Population dynamics on neutral networks

slide-9
SLIDE 9
  • 1. Ruggedness of molecular landscapes
    2. Replication-mutation dynamics
    3. Models of fitness landscapes
    4. Ruggedness and error thresholds
    5. Stochasticity of replication and mutation
    6. Population dynamics on neutral networks

slide-10
SLIDE 10

[Figure: chemical structure of an RNA strand with nucleotides N1 to N4 linked through the sugar-phosphate backbone from the 5'-end to the 3'-end, with Na counterions and N_k ∈ {A, U, G, C}; shown alongside an RNA sequence drawn with its secondary structure]

Definition of RNA structure

slide-11
SLIDE 11

Number of sequences: $N = 4^n$
Number of structures: $N_S < 3^n$
Criterion: minimum free energy (mfe)
Rules: _ ( _ ) _ with base pairs ∈ {AU, CG, GC, GU, UA, UG}

A symbolic notation of RNA secondary structure that is equivalent to the conventional graphs
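
The symbolic notation can be read mechanically. The following is a minimal sketch, assuming the common dot-bracket convention in which "." marks an unpaired nucleotide and matching parentheses mark a base pair. The function names `base_pairs` and `is_compatible` and the example sequence are hypothetical and not taken from the talk.

```python
# Allowed base pairs as listed on the slide.
ALLOWED_PAIRS = {"AU", "CG", "GC", "GU", "UA", "UG"}


def base_pairs(structure):
    """Return the sorted list of (i, j) index pairs encoded by a dot-bracket string."""
    stack = []
    pairs = []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.append((stack.pop(), i))
        elif symbol != ".":
            raise ValueError("unexpected symbol %r" % symbol)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return sorted(pairs)


def is_compatible(sequence, structure):
    """Check that every pair in the structure is one of the allowed base pairs."""
    if len(sequence) != len(structure):
        return False
    return all(sequence[i] + sequence[j] in ALLOWED_PAIRS
               for i, j in base_pairs(structure))


# Hypothetical 9-nucleotide example: a three-pair stem closing a three-base loop.
example_pairs = base_pairs("(((...)))")
example_ok = is_compatible("GGGAAAUCC", "(((...)))")
```

A stack suffices here because secondary structures in this notation contain no crossing pairs, so every closing bracket matches the most recent unmatched opening bracket.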

slide-12
SLIDE 12

many genotypes

one phenotype
slide-13
SLIDE 13

AUCAAUCAG GUCAAUCAC GUCAAUCAU GUCAAUCAA GUCAAUCCG GUCAAUCGG GUCAAUCUG GUCAAUGAG GUCAAUUAG
GUCAAUAAG GUCAACCAG GUCAAGCAG GUCAAACAG GUCACUCAG GUCAGUCAG GUCAUUCAG GUCCAUCAG GUCGAUCAG
GUCUAUCAG GUGAAUCAG GUUAAUCAG GUAAAUCAG GCCAAUCAG GGCAAUCAG GACAAUCAG UUCAAUCAG CUCAAUCAG

GUCAAUCAG

One-error neighborhood

The surrounding of GUCAAUCAG in sequence space
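
The one-error neighborhood can be enumerated directly: each of the n positions is replaced by each of the three other letters, which gives 3n neighbors. A minimal sketch follows; the function name `one_error_neighborhood` is hypothetical.

```python
def one_error_neighborhood(sequence, alphabet="AUGC"):
    """Return all sequences that differ from `sequence` in exactly one position."""
    neighbors = []
    for position, original in enumerate(sequence):
        for base in alphabet:
            if base != original:
                neighbors.append(sequence[:position] + base + sequence[position + 1:])
    return neighbors


# The reference sequence from the slide has chain length 9, hence 27 neighbors.
neighbors = one_error_neighborhood("GUCAAUCAG")
```

For the chain length n = 50 used on the following slides, the same function yields 150 neighbors per sequence.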

slide-14
SLIDE 14

One error neighborhood – Surrounding of an RNA molecule of chain length n=50 in sequence and shape space

slide-23
SLIDE 23

GGCUAUCGUAUGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG
GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUAGACG
GGCUAUCGUACGUUUACUCAAAAGUCUACGUUGGACCCAGGCAUUGGACG
GGCUAUCGUACGCUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG
GGCCAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG
GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG
GGCUAUCGUACGUGUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG
GGCUAACGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG
GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCUGGCAUUGGACG
GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCACUGGACG
GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGUCCCAGGCAUUGGACG
GGCUAGCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG
GGCUAUCGUACGUUUACCCGAAAGUCUACGUUGGACCCAGGCAUUGGACG
GGCUAUCGUACGUUUACCCAAAAGCCUACGUUGGACCCAGGCAUUGGACG

Reference sequence: GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG

One error neighborhood – Surrounding of an RNA molecule of chain length n=50 in sequence and shape space

slide-24
SLIDE 24

Quantity                    Number   Mean value   Variance    Std. dev.
Total Hamming distance      150000   11.647973    23.140715   4.810480
Nonzero Hamming distance     99875   16.949991    30.757651   5.545958
Degree of neutrality         50125    0.334167     0.006961   0.083434
Number of structures          1000   52.31        85.30       9.24

Rank  Structure                                            Number   Frequency
 1    (((((.((((..(((......)))..)))).))).)).............   50125    0.334167
 2    ..(((.((((..(((......)))..)))).)))................    2856    0.019040
 3    ((((((((((..(((......)))..)))))))).)).............    2799    0.018660
 4    (((((.((((..((((....))))..)))).))).)).............    2417    0.016113
 5    (((((.((((.((((......)))).)))).))).)).............    2265    0.015100
 6    (((((.(((((.(((......))).))))).))).)).............    2233    0.014887
 7    (((((..(((..(((......)))..)))..))).)).............    1442    0.009613
 8    (((((.((((..((........))..)))).))).)).............    1081    0.007207
 9    ((((..((((..(((......)))..))))..)).)).............    1025    0.006833
10    (((((.((((..(((......)))..)))).)))))..............    1003    0.006687
11    .((((.((((..(((......)))..)))).))))...............     963    0.006420
12    (((((.(((...(((......)))...))).))).)).............     860    0.005733
13    (((((.((((..(((......)))..)))).)).))).............     800    0.005333
14    (((((.((((...((......))...)))).))).)).............     548    0.003653
15    (((((.((((................)))).))).)).............     362    0.002413
16    ((.((.((((..(((......)))..)))).))..)).............     337    0.002247
17    (.(((.((((..(((......)))..)))).))).)..............     241    0.001607
18    (((((.(((((((((......))))))))).))).)).............     231    0.001540
19    ((((..((((..(((......)))..))))...)))).............     225    0.001500
20    ((....((((..(((......)))..)))).....)).............     202    0.001347

Reference sequence: GGCUAUCGUACGUUUACCCAAAAGUCUACGUUGGACCCAGGCAUUGGACG

Shadow – Surrounding of an RNA structure in shape space: AUGC alphabet, chain length n=50
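
The counts in this table can be cross-checked with simple arithmetic. The sketch below assumes that the statistics were collected over a sample of 1000 sequences, each with its full one-error neighborhood; this reading is inferred from the numbers and is not stated on the slide. All variable names are hypothetical.

```python
chain_length = 50
alphabet_size = 4
sample_size = 1000  # assumption: number of sequences sampled

# Each sequence has (alphabet_size - 1) * chain_length one-error neighbors.
neighbors_per_sequence = (alphabet_size - 1) * chain_length
total_neighbors = sample_size * neighbors_per_sequence

# Neighbors that fold into the reference structure (rank 1 in the table).
neutral_neighbors = 50125
non_neutral_neighbors = total_neighbors - neutral_neighbors

# Degree of neutrality: fraction of one-error neighbors with unchanged structure.
degree_of_neutrality = neutral_neighbors / total_neighbors
```

Under this assumption the totals reproduce the tabulated values: 150000 neighbors in all, 99875 with nonzero structure distance, and a degree of neutrality of about one third.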

slide-25
SLIDE 25

    1. Ruggedness of molecular landscapes
  • 2. Replication-mutation dynamics
    3. Models of fitness landscapes
    4. Ruggedness and error thresholds
    5. Stochasticity of replication and mutation
    6. Population dynamics on neutral networks

slide-26
SLIDE 26

Chemical kinetics of molecular evolution

  • M. Eigen, P. Schuster, 'The Hypercycle', Springer-Verlag, Berlin 1979
slide-27
SLIDE 27

Complementary replication is the simplest copying mechanism of RNA.

Complementarity is determined by Watson-Crick base pairs: G≡C and A=U

slide-28
SLIDE 28

Variation of genotypes through mutation and recombination

slide-29
SLIDE 29

Complementary replication as the simplest molecular mechanism of reproduction

slide-30
SLIDE 30

Kinetics of RNA replication

C.K. Biebricher, M. Eigen, W.C. Gardiner, Jr. Biochemistry 22:2544-2559, 1983

slide-31
SLIDE 31

Stock solution: activated monomers ATP, CTP, GTP, UTP (TTP); a replicase, an enzyme that performs complementary replication; buffer solution

Flow rate: $r = \tau_R^{-1}$

The population size N, the number of polynucleotide molecules, is controlled by the flow r:

$N(t) \approx \bar{N} \pm \sqrt{\bar{N}}$

The flow reactor is a device for studies of evolution in vitro and in silico.

slide-32
SLIDE 32

Chemical kinetics of replication and mutation as parallel reactions

slide-33
SLIDE 33

The replication-mutation equation

$$\frac{dx_j}{dt} = \sum_{i=1}^{n} Q_{ji} f_i x_i - x_j \Phi \quad \text{with} \quad \Phi = \sum_{i=1}^{n} f_i x_i \quad \text{and} \quad \sum_{i=1}^{n} x_i = 1$$

Uniform error rate model

$$Q_{ij} = p^{d_H(X_i, X_j)} (1-p)^{n - d_H(X_i, X_j)}, \qquad \sum_{j=1}^{n} Q_{ji} = 1$$

p: error rate per digit
$d_H(X_i, X_j)$: Hamming distance between $X_i$ and $X_j$
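
The stationary solution of the replication-mutation equation is the dominant eigenvector of the matrix with entries Q_ji f_i, so it can be approximated by repeated multiplication and renormalization. The following is a minimal sketch under stated assumptions: binary sequences, sums running over all 2^n sequences of chain length n, and a single-peak landscape in which one master sequence has fitness `sigma` and all others have fitness 1. The function names, the landscape and the default parameter values are hypothetical.

```python
from itertools import product


def hamming(a, b):
    """Hamming distance between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def quasispecies(n=5, sigma=10.0, p=0.01, iterations=500):
    """Approximate the stationary distribution by power iteration.

    Assumptions: binary alphabet, uniform error rate p per digit,
    single-peak landscape with the all-zero sequence as master.
    """
    sequences = list(product((0, 1), repeat=n))
    master = sequences[0]
    fitness = [sigma if s == master else 1.0 for s in sequences]
    size = len(sequences)
    # q[j][i]: probability that copying sequence i yields sequence j.
    q = [[p ** hamming(sj, si) * (1 - p) ** (n - hamming(sj, si))
          for si in sequences] for sj in sequences]
    x = [1.0 / size] * size
    for _ in range(iterations):
        y = [sum(q[j][i] * fitness[i] * x[i] for i in range(size))
             for j in range(size)]
        total = sum(y)  # equals the mean fitness, since each column of q sums to 1
        x = [value / total for value in y]
    return dict(zip(sequences, x))


distribution = quasispecies(n=5, sigma=10.0, p=0.01)
master_fraction = distribution[(0, 0, 0, 0, 0)]
```

With error-free replication (p = 0) the iteration selects the master sequence alone, and at p = 0.5 every sequence is produced with equal probability, so the result is the uniform distribution.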

slide-34
SLIDE 34

Formation of a quasispecies in sequence space

p = 0

slide-35
SLIDE 35

Formation of a quasispecies in sequence space

$p = 0.25\,p_{cr}$

slide-36
SLIDE 36

Formation of a quasispecies in sequence space

$p = 0.50\,p_{cr}$

slide-37
SLIDE 37

Formation of a quasispecies in sequence space

$p = 0.75\,p_{cr}$

slide-38
SLIDE 38

Uniform distribution in sequence space

$p \geq p_{cr}$

slide-39
SLIDE 39

[Plot: error rate p = 1-q on the horizontal axis, from 0.00 to 0.10; regions labeled "Quasispecies" and "Uniform distribution"]

Stationary population or quasispecies as a function of the mutation or error rate p

slide-40
SLIDE 40

Quasispecies

Driving virus populations through threshold

The error threshold in replication
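
A hedged worked estimate of where the threshold lies, assuming a single-peak landscape in which one master sequence replicates σ times faster than all other sequences, chain length n, uniform error rate p, and no back mutation to the master. The symbol σ and the numerical example are assumptions of this sketch.

```latex
% Assumed notation: \sigma = fitness advantage of the master sequence,
% n = chain length, p = error rate per digit
\begin{align*}
Q &= (1-p)^{n} && \text{probability of an error-free copy} \\
\sigma\, Q &> 1 && \text{condition for the master sequence to persist} \\
p_{cr} &= 1 - \sigma^{-1/n} \approx \frac{\ln \sigma}{n} \\
n = 10,\ \sigma = 2: \quad p_{cr} &= 1 - 2^{-1/10} \approx 0.067
\end{align*}
```

The estimate shows why longer sequences tolerate smaller error rates: for a fixed advantage σ the critical error rate falls roughly as 1/n.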

slide-41
SLIDE 41

    1. Ruggedness of molecular landscapes
    2. Replication-mutation dynamics
  • 3. Models of fitness landscapes
    4. Ruggedness and error thresholds
    5. Stochasticity of replication and mutation
    6. Population dynamics on neutral networks

slide-42
SLIDE 42

Every point in sequence space is equivalent

Sequence space of binary sequences with chain length n = 5
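
The equivalence of all points can be checked by counting, for every sequence, how many other sequences lie at each Hamming distance. A minimal sketch for the binary case with n = 5 follows; the function name `distance_profile` is hypothetical.

```python
from collections import Counter
from itertools import product


def distance_profile(sequence, space):
    """Count how many sequences in `space` lie at each Hamming distance."""
    return Counter(sum(a != b for a, b in zip(sequence, other)) for other in space)


# All 2^5 = 32 binary sequences of chain length 5.
space = ["".join(bits) for bits in product("01", repeat=5)]
profiles = {s: distance_profile(s, space) for s in space}
reference_profile = profiles["00000"]
```

Every sequence sees the same profile, given by the binomial coefficients 1, 5, 10, 10, 5, 1 for distances 0 to 5, which is what makes the sequence space look identical from each of its points.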

slide-43
SLIDE 43

A fitness landscape showing an error threshold

slide-44
SLIDE 44

Fitness landscapes not showing error thresholds

slide-45
SLIDE 45

Error thresholds and gradual transitions n = 20 and σ = 10

slide-46
SLIDE 46

    1. Ruggedness of molecular landscapes
    2. Replication-mutation dynamics
    3. Models of fitness landscapes
  • 4. Ruggedness and error thresholds
    5. Stochasticity of replication and mutation
    6. Population dynamics on neutral networks

slide-47
SLIDE 47

Sources of ruggedness:

1. Variation in fitness values
2. Deviations from uniform error rates
3. Neutrality

slide-48
SLIDE 48

Three sources of ruggedness:

  • 1. Variation in fitness values
    2. Deviations from uniform error rates
    3. Neutrality

slide-49
SLIDE 49

Fitness landscapes showing error thresholds

slide-50
SLIDE 50

Error threshold: Error classes and individual sequences n = 10 and σ = 2

slide-51
SLIDE 51

Error threshold: Individual sequences n = 10, σ = 2 and d = 0, 1.0, 1.85

slide-52
SLIDE 52

Error threshold: Individual sequences n = 10, σ = 1.1, d = 1.95, 1.975, 2.00 and seed = 877

slide-53
SLIDE 53

Three sources of ruggedness:

    1. Variation in fitness values
  • 2. Deviations from uniform error rates
    3. Neutrality

slide-54
SLIDE 54

Local replication accuracy $p_k$: $p_k = p + 4\,\delta\,p(1-p)(X_{rnd} - 0.5)$, $k = 1, 2, \ldots, 2^n$

slide-55
SLIDE 55

Error threshold: Classes n = 10, σ = 1.1, δ = 0, 0.3, 0.5, and seed = 877

slide-56
SLIDE 56

Three sources of ruggedness:

    1. Variation in fitness values
    2. Deviations from uniform error rates
  • 3. Neutrality
slide-57
SLIDE 57
slide-58
SLIDE 58
slide-59
SLIDE 59

$$\lim_{p \to 0} x_1(p) = x_2(p) = 0.5$$

$$\lim_{p \to 0} x_1(p) = a, \qquad \lim_{p \to 0} x_2(p) = 1 - a$$

Elements of neutral replication networks

slide-60
SLIDE 60

Error threshold: Individual sequences n = 10, σ = 1.1, d = 1.0

slide-63
SLIDE 63

Neutral networks with increasing degree of neutrality λ: λ = 0.10, network size N = 7

slide-64
SLIDE 64

Neutral networks with increasing degree of neutrality λ: λ = 0.15, network size N = 24

slide-65
SLIDE 65

Neutral networks with increasing degree of neutrality λ: λ = 0.20, network size N = 70

slide-66
SLIDE 66

Size of selected neutral networks in the limit p → 0 as a function of the degree of neutrality

Degree of       Random number seed
neutrality      229       367    491    673    877
0.005           1         1      1|1    1      1|1
0.01            2         2      2      1      1|1
0.015           2         2      2      2      1|1
0.02            3         2      2      2|2    1|1|1|1
0.025           3         2      2      3      1|1|1|1
0.03            3         3      2      3      3
0.035           3         3      2      3      3
0.04            3         3|3    2      3      3
0.045           3         5      3      3      4
0.05            3         5      3      5      7
0.06            6         5      3      7      7
0.07            6         8      5      7      7
0.08            7         8      5      4      8
0.09            7         8      10     5      9
0.10            7         10     9      5      9
0.11            8         14     22     6      9
0.12            10        17     44     14     9
0.13            11        40     49     43     9
0.14            16        52     70     84     28
0.15            24        72     71     95     12
0.20            70 (69)   180    152    181    151

slide-67
SLIDE 67

    1. Ruggedness of molecular landscapes
    2. Replication-mutation dynamics
    3. Models of fitness landscapes
    4. Ruggedness and error thresholds
  • 5. Stochasticity of replication and mutation
    6. Population dynamics on neutral networks

slide-68
SLIDE 68

Evolution in silico

  • W. Fontana, P. Schuster, Science 280 (1998), 1451-1455

slide-69
SLIDE 69

Phenylalanyl-tRNA as target structure
Structure of randomly chosen initial sequence

slide-70
SLIDE 70

Evolution of RNA molecules as a Markov process and its analysis by means of the relay series

slide-83
SLIDE 83

Replication rate constant (fitness): $f_k = \gamma / [\alpha + \Delta d_S^{(k)}]$, with $\Delta d_S^{(k)} = d_H(S_k, S_\tau)$

Selection pressure: the population size, N = number of RNA molecules, is determined by the flux: $N(t) \approx \bar{N} \pm \sqrt{\bar{N}}$

Mutation rate: p = 0.001 per nucleotide and replication

The flow reactor as a device for studying the evolution of molecules in vitro and in silico.
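
The fitness assignment can be sketched in a few lines: the replication rate is a constant divided by an offset plus the Hamming distance between the structure of a molecule and the target structure, so molecules closer to the target replicate faster. The parameter names `gamma` and `alpha`, their default values and the short example structures are assumptions made for illustration only.

```python
def structure_distance(structure, target):
    """Hamming distance between two dot-bracket strings of equal length."""
    if len(structure) != len(target):
        raise ValueError("structures must have the same length")
    return sum(a != b for a, b in zip(structure, target))


def replication_rate(structure, target, gamma=1.0, alpha=0.01):
    """Replication rate constant: gamma / (alpha + distance to target).

    gamma and alpha are hypothetical values; alpha > 0 keeps the rate
    finite when the target structure is reached.
    """
    return gamma / (alpha + structure_distance(structure, target))


# Hypothetical six-position structures used only to exercise the functions.
distance = structure_distance("((..))", "(....)")
rate_on_target = replication_rate("(....)", "(....)")
rate_off_target = replication_rate("((..))", "(....)")
```

In the simulations described on these slides the structure of each molecule would come from a minimum free energy folding routine, which is not part of this sketch.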

slide-84
SLIDE 84

In silico optimization in the flow reactor: Evolutionary Trajectory

slide-85
SLIDE 85

28 neutral point mutations during a long quasi-stationary epoch
Transition inducing point mutations change the molecular structure
Neutral point mutations leave the molecular structure unchanged

Neutral genotype evolution during phenotypic stasis

slide-86
SLIDE 86

Randomly chosen initial structure
Phenylalanyl-tRNA as target structure

slide-87
SLIDE 87

    1. Ruggedness of molecular landscapes
    2. Replication-mutation dynamics
    3. Models of fitness landscapes
    4. Ruggedness and error thresholds
    5. Stochasticity of replication and mutation
  • 6. Population dynamics on neutral networks
slide-88
SLIDE 88

Evolutionary trajectory

Spreading of the population on neutral networks

Drift of the population center in sequence space

slide-89
SLIDE 89

Spreading and evolution of a population on a neutral network: t = 150

slide-90
SLIDE 90

Spreading and evolution of a population on a neutral network : t = 170

slide-91
SLIDE 91

Spreading and evolution of a population on a neutral network : t = 200

slide-92
SLIDE 92

Spreading and evolution of a population on a neutral network : t = 350

slide-93
SLIDE 93

Spreading and evolution of a population on a neutral network : t = 500

slide-94
SLIDE 94

Spreading and evolution of a population on a neutral network : t = 650

slide-95
SLIDE 95

Spreading and evolution of a population on a neutral network : t = 820

slide-96
SLIDE 96

Spreading and evolution of a population on a neutral network : t = 825

slide-97
SLIDE 97

Spreading and evolution of a population on a neutral network : t = 830

slide-98
SLIDE 98

Spreading and evolution of a population on a neutral network : t = 835

slide-99
SLIDE 99

Spreading and evolution of a population on a neutral network : t = 840

slide-100
SLIDE 100

Spreading and evolution of a population on a neutral network : t = 845

slide-101
SLIDE 101

Spreading and evolution of a population on a neutral network : t = 850

slide-102
SLIDE 102

Spreading and evolution of a population on a neutral network : t = 855

slide-103
SLIDE 103

A sketch of optimization on neutral networks

slide-104
SLIDE 104

Initial state Target Extinction

Replication, mutation and dilution

Replication and mutation as a stochastic process
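
The interplay of replication, mutation and dilution can be illustrated with a toy stochastic simulation. This is a deliberately simplified sketch and not the simulation behind the slides: the population size is held exactly constant (each replication is paired with the removal of one randomly chosen molecule, so extinction cannot occur), and fitness is computed from the Hamming distance between a sequence and a target sequence instead of from folded structures. All function names, the target sequence and the parameter values are hypothetical.

```python
import random


def mutate(sequence, p, rng, alphabet="AUGC"):
    """Copy a sequence, replacing each base with probability p by a different base."""
    copy = []
    for base in sequence:
        if rng.random() < p:
            copy.append(rng.choice([b for b in alphabet if b != base]))
        else:
            copy.append(base)
    return "".join(copy)


def fitness(sequence, target, gamma=1.0, alpha=0.01):
    """Toy fitness: decreases with the Hamming distance to the target sequence."""
    return gamma / (alpha + sum(a != b for a, b in zip(sequence, target)))


def step(population, target, p, rng):
    """One event: fitness-proportional replication with mutation, then dilution."""
    weights = [fitness(s, target) for s in population]
    parent = rng.choices(population, weights=weights, k=1)[0]
    offspring = mutate(parent, p, rng)
    # Dilution: the offspring replaces a randomly chosen molecule.
    population[rng.randrange(len(population))] = offspring


def run(target, size=50, steps=2000, p=0.01, seed=1):
    """Start from a homogeneous random population and apply `steps` events."""
    rng = random.Random(seed)
    start = "".join(rng.choice("AUGC") for _ in target)
    population = [start] * size
    for _ in range(steps):
        step(population, target, p, rng)
    return population


final_population = run("GGGAAAUCC", size=20, steps=200)
```

Because every event is random, repeated runs with different seeds give different trajectories, which is the reason the slides report expectation values over many runs.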

slide-105
SLIDE 105

Expectation values as functions of population size: Extinction probability, average number of replications and run time

slide-106
SLIDE 106

Application of molecular evolution to problems in biotechnology

slide-107
SLIDE 107

Acknowledgement of support

Fonds zur Förderung der wissenschaftlichen Forschung (FWF), Projects No. 09942, 10578, 11065, 13093, 13887, and 14898
Wiener Wissenschafts-, Forschungs- und Technologiefonds (WWTF), Project No. Mat05
Jubiläumsfonds der Österreichischen Nationalbank, Project No. Nat-7813
European Commission: Contracts No. 98-0189, 12835 (NEST)
Austrian Genome Research Program – GEN-AU
Siemens AG, Austria
Universität Wien and the Santa Fe Institute

slide-108
SLIDE 108

Coworkers

Walter Fontana, Harvard Medical School, MA
Christian Forst, Los Alamos National Laboratory, NM
Christian Reidys, Nankai University, Tientsin, China
Peter Stadler, Bärbel Stadler, Universität Leipzig, GE
Christoph Flamm, Ivo L. Hofacker, Andreas Svrček-Seiler, Universität Wien, AT
Kurt Grünberger, Michael Kospach, Andreas Wernitznig, Stefanie Widder, Michael Wolfinger, Stefan Wuchty, Universität Wien, AT
Stefan Bernhart, Jan Cupal, Lukas Endler, Ulrike Langhammer, Rainer Machne, Ulrike Mückstein, Hakim Tafer, Universität Wien, AT
Ulrike Göbel, Walter Grüner, Stefan Kopp, Jaqueline Weber, Institut für Molekulare Biotechnologie, Jena, GE
