How computation has changed research in chemistry and biology Peter - - PowerPoint PPT Presentation


How computation has changed research in chemistry and biology. Peter Schuster, Institut für Theoretische Chemie, Universität Wien, Austria, and The Santa Fe Institute, Santa Fe, New Mexico, USA. IWR - 25 Jahre-Jubiläum, Heidelberg, 21. – 22.02.2013


slide-1
SLIDE 1
slide-2
SLIDE 2

How computation has changed research in chemistry and biology

Peter Schuster

Institut für Theoretische Chemie, Universität Wien, Austria and The Santa Fe Institute, Santa Fe, New Mexico, USA

IWR - 25 Jahre-Jubiläum Heidelberg, 21. – 22.02.2013

slide-3
SLIDE 3

Web-Page for further information: http://www.tbi.univie.ac.at/~pks

slide-4
SLIDE 4

Some technological revolutions in 20th century science: 1. molecular spectroscopy, 2. micro-technology, 3. electronic computation, 4. molecular revolution in biology, 5. computational quantum chemistry, and 6. holistic chemistry of biological entities.

slide-5
SLIDE 5

Electronics 38 (8), 4-7,1965

Gordon E. Moore, 1929 -

Exponential increase in hardware power

slide-6
SLIDE 6

Martin Grötschel, 1948 -

Grötschel, an expert in optimization, observes that a benchmark production planning model solved using linear programming would have taken 82 years to solve in 1988, using the computers and the linear programming algorithms of the day. Fifteen years later, in 2003, the same model could be solved in roughly 1 minute, an improvement by a factor of roughly 43 million. Of this, a factor of roughly 1000 was due to increased processor speed, whereas a factor of roughly 43000 was due to improvements in algorithms! Grötschel also cites an algorithmic improvement of roughly 30000 for mixed integer programming between 1991 and 2008.

J.P. Holdren, E. Lander, H. Varmus. Designing a digital future: Federally funded research and development in networking and information technology. President's Council of Advisors on Science and Technology, Washington, DC, p. 71, 2010

PCAST Report to the President, 2010. Progress in Algorithms Beats Moore's Law.
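The quoted factors can be checked with a few lines of arithmetic. The sketch below only re-derives the numbers from the quotation; the length of a year in minutes (a Julian year) is an assumption made here, not something the report states.

```python
# Back-of-the-envelope check of the numbers in the quotation above.
MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumption: Julian year

time_1988_minutes = 82 * MINUTES_PER_YEAR  # "82 years" in 1988
time_2003_minutes = 1.0                    # "roughly 1 minute" in 2003

total_speedup = time_1988_minutes / time_2003_minutes

hardware_factor = 1_000    # quoted share due to processor speed
algorithm_factor = 43_000  # quoted share due to better algorithms

print(f"total speedup        ~ {total_speedup:.3g}")
print(f"hardware x algorithm = {hardware_factor * algorithm_factor:.3g}")
```

Both numbers agree to within about one percent, which is as close as "roughly" allows.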

slide-7
SLIDE 7

Four selected examples

1. Parameter determination in chemical kinetics 2. Design of ribonucleic acid (RNA) structures 3. Kinetic folding of RNA molecules 4. Modeling evolution

slide-8
SLIDE 8

Four selected examples

  • 1. Parameter determination in chemical kinetics

2. Design of ribonucleic acid (RNA) structures 3. Kinetic folding of RNA molecules 4. Modeling evolution

slide-9
SLIDE 9

Michaelis-Menten mechanism of enzyme reactions

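  • L. Michaelis, M. Menten. Die Kinetik der Invertinwirkung. Biochemische Zeitschrift 49, 333-369, 1913

$$\frac{d[\mathrm{P}]}{dt} = v([\mathrm{S}]) = v_{\max} \cdot \frac{[\mathrm{S}]}{K_M + [\mathrm{S}]}$$

$$K_M = \frac{k_d + k_r}{k_f}, \quad v([\mathrm{S}]) = k_r \cdot [\mathrm{ES}] \quad \text{and} \quad v_{\max} = k_r \cdot [\mathrm{E}]_0$$

basic assumptions: $k_r \ll k_d$ and $[\mathrm{E}]_0 \ll [\mathrm{S}]_0$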
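For orientation, the rate law can be derived in a few lines. The sketch below assumes the usual mechanism with rate constants k_f (formation of ES), k_d (dissociation of ES) and k_r (reaction to product), and it uses the steady-state condition for the complex ES; that condition is an assumption chosen here for brevity and is not necessarily the route taken on the slide.

```latex
\begin{align}
  &\mathrm{E} + \mathrm{S}
     \;\underset{k_d}{\overset{k_f}{\rightleftharpoons}}\;
     \mathrm{ES} \;\overset{k_r}{\longrightarrow}\; \mathrm{E} + \mathrm{P} \\
  &\frac{d[\mathrm{ES}]}{dt}
     = k_f [\mathrm{E}][\mathrm{S}] - (k_d + k_r)[\mathrm{ES}] \approx 0,
     \qquad [\mathrm{E}] = [\mathrm{E}]_0 - [\mathrm{ES}] \\
  &\Rightarrow\; [\mathrm{ES}]
     = \frac{[\mathrm{E}]_0 [\mathrm{S}]}{K_M + [\mathrm{S}]},
     \qquad K_M = \frac{k_d + k_r}{k_f} \\
  &\Rightarrow\; v([\mathrm{S}]) = k_r [\mathrm{ES}]
     = \frac{k_r [\mathrm{E}]_0 [\mathrm{S}]}{K_M + [\mathrm{S}]}
     = v_{\max} \frac{[\mathrm{S}]}{K_M + [\mathrm{S}]}
\end{align}
```

If in addition k_r is much smaller than k_d, K_M reduces to the dissociation constant k_d / k_f of the complex.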

slide-10
SLIDE 10

Linearization of a hyperbola:

$$v([\mathrm{S}]) = v_{\max} \cdot \frac{[\mathrm{S}]}{K_M + [\mathrm{S}]}$$

Lineweaver-Burk: 1/v = f (1/[S])
Eadie-Hofstee: v = f (v/[S])
Scatchard: v/[S] = f (v)
Hanes: [S] / v = f ([S])
Hill: log (v/(vmax – v)) = f (log [S])

slide-11
SLIDE 11

The Lineweaver-Burk plot of Michaelis-Menten kinetics

Source: Wikipedia, “Enzymkinetik”
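As an illustration of such a linearization, the sketch below fits a straight line to 1/v against 1/[S] and reads v_max and K_M off the intercept and slope, using 1/v = 1/v_max + (K_M/v_max)(1/[S]). The data are synthetic and noise-free, and all names and parameter values are hypothetical choices made for this example.

```python
def michaelis_menten(s, v_max, k_m):
    """Rate v([S]) = v_max * [S] / (K_M + [S])."""
    return v_max * s / (k_m + s)


def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lineweaver_burk(substrate, rates):
    """Estimate (v_max, K_M) from the double-reciprocal plot."""
    slope, intercept = linear_fit([1.0 / s for s in substrate],
                                  [1.0 / v for v in rates])
    v_max = 1.0 / intercept   # intercept = 1 / v_max
    k_m = slope * v_max       # slope = K_M / v_max
    return v_max, k_m


# Synthetic, noise-free data (illustrative values only).
TRUE_V_MAX, TRUE_K_M = 2.0, 0.5
substrate = [0.1, 0.2, 0.5, 1.0, 2.0, 5.0]
rates = [michaelis_menten(s, TRUE_V_MAX, TRUE_K_M) for s in substrate]

v_max_est, k_m_est = lineweaver_burk(substrate, rates)
print(v_max_est, k_m_est)
```

With noisy data the reciprocal transform inflates errors at small [S], which is one reason why direct nonlinear fitting is generally preferred when computation is cheap.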

slide-12
SLIDE 12

Validity of the Michaelis-Menten approximation

slide-13
SLIDE 13

The forward problem of chemical reaction kinetics
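A minimal sketch of a forward problem, assuming the mechanism E + S ⇌ ES → E + P with mass-action kinetics: given the rate constants, integrate the rate equations and obtain the concentrations over time. The fourth-order Runge-Kutta integrator, the rate constants and the initial concentrations are all illustrative choices made here, not values from the talk.

```python
def mechanism_rhs(state, k_f, k_d, k_r):
    """Mass-action rate equations for E + S <-> ES -> E + P."""
    e, s, es, p = state
    bind = k_f * e * s
    return (-bind + (k_d + k_r) * es,   # d[E]/dt
            -bind + k_d * es,           # d[S]/dt
            bind - (k_d + k_r) * es,    # d[ES]/dt
            k_r * es)                   # d[P]/dt


def rk4_step(state, dt, rates):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(slope, h):
        return tuple(x + h * k for x, k in zip(state, slope))

    k1 = mechanism_rhs(state, *rates)
    k2 = mechanism_rhs(shifted(k1, dt / 2), *rates)
    k3 = mechanism_rhs(shifted(k2, dt / 2), *rates)
    k4 = mechanism_rhs(shifted(k3, dt), *rates)
    return tuple(x + dt / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))


def simulate(initial, rates, dt, steps):
    """Integrate from the initial state and return the final state."""
    state = initial
    for _ in range(steps):
        state = rk4_step(state, dt, rates)
    return state


# Illustrative parameters: [E]0 << [S]0 and k_r << k_d.
E0, S0 = 0.01, 1.0
RATES = (100.0, 100.0, 1.0)  # (k_f, k_d, k_r)

e, s, es, p = simulate((E0, S0, 0.0, 0.0), RATES, dt=0.001, steps=1000)
print(f"[E]={e:.5f} [S]={s:.5f} [ES]={es:.5f} [P]={p:.5f}")
```

Total enzyme ([E] + [ES]) and total substrate ([S] + [ES] + [P]) are conserved by the equations, which gives a simple sanity check on the integration.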

slide-14
SLIDE 14

The inverse problem of chemical reaction kinetics

Parameter identification and determination is an ill-posed problem

Inverse problem solution techniques

slide-15
SLIDE 15

$$F(q) = y; \quad q \in Q \ \text{parameter vector}, \quad y^{\delta} \in Y \ \text{(noisy) data}$$

$$\| F(q) - y^{\delta} \|_Y^2 \to \min_{q \in Q} \qquad \text{(ill-conditioned problem)}$$

$$\| F(q) - y^{\delta} \|_Y^2 + \alpha \, R(q, q_0) \to \min_{q \in Q} \quad \text{with} \quad R(q, q_0) = \| q - q_0 \|_Q^2$$

regularization term R - here Tikhonov regularization - with q0 being an initial parameter guess and α the regularization parameter

Parameter identification and determination as an inverse problem
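The effect of the regularization term can be seen in a toy case. The sketch below assumes a linear forward operator F(q) = A q with two parameters, so that the Tikhonov minimizer solves the normal equations (AᵀA + αI) q = Aᵀy + α q0 in closed form; real kinetic models are nonlinear, and the matrix, data, q0 and α are hypothetical values chosen only to make the ill-conditioning visible.

```python
def tikhonov_2d(rows, data, q0, alpha):
    """Minimize sum_k (a_k . q - y_k)^2 + alpha * |q - q0|^2 for q in R^2.

    rows  : list of (a, b) pairs, the rows of the forward matrix A
    data  : list of (noisy) observations y_k
    q0    : initial parameter guess
    alpha : regularization parameter (alpha = 0 is plain least squares)
    """
    # Normal equations: (A^T A + alpha I) q = A^T y + alpha q0
    m11 = sum(a * a for a, _ in rows) + alpha
    m12 = sum(a * b for a, b in rows)
    m22 = sum(b * b for _, b in rows) + alpha
    r1 = sum(a * y for (a, _), y in zip(rows, data)) + alpha * q0[0]
    r2 = sum(b * y for (_, b), y in zip(rows, data)) + alpha * q0[1]
    det = m11 * m22 - m12 * m12
    # Cramer's rule for the symmetric 2x2 system.
    return ((r1 * m22 - m12 * r2) / det, (m11 * r2 - m12 * r1) / det)


# Nearly collinear columns make the unregularized problem ill-conditioned.
rows = [(1.0, 1.0), (1.0, 1.001), (1.0, 0.999)]
true_q = (1.0, 2.0)
noisy = [3.01, 2.99, 3.00]  # exact data would be 3.000, 3.002, 2.998

unregularized = tikhonov_2d(rows, noisy, q0=(0.0, 0.0), alpha=0.0)
regularized = tikhonov_2d(rows, noisy, q0=(1.5, 1.5), alpha=0.01)
print("alpha = 0    :", unregularized)
print("alpha = 0.01 :", regularized)
```

Noise of about one percent in the data throws the unregularized estimate far from the true parameters, while the regularized estimate stays close to the initial guess along the poorly determined direction.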

slide-16
SLIDE 16
slide-17
SLIDE 17

Four selected examples

1. Parameter determination in chemical kinetics 2. Design of ribonucleic acid (RNA) structures 3. Kinetic folding of RNA molecules 4. Modeling evolution

slide-18
SLIDE 18

[Figure: chemical formula of an RNA strand from the 5'-end to the 3'-end with nucleobases N1 to N4, Nk = A, U, G, C, and Na counterions, next to the secondary structure of an RNA sequence]

RNA structure: the molecular phenotype

slide-19
SLIDE 19

The notion of structure

slide-20
SLIDE 20

The minimum free energy structures on a discrete space of conformations

[Figure: conformations S0(h) to S9(h) ordered on a free energy axis; labels: minimum of free energy, suboptimal conformations]

slide-21
SLIDE 21

RNA sequence; RNA structure of minimal free energy

RNA folding: structural biology, spectroscopy of biomolecules, understanding molecular function

Empirical parameters

Biophysical chemistry: thermodynamics and kinetics

From RNA sequence to structure

linear programming
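To make the step from sequence to structure concrete, here is a deliberately simplified sketch. It is not the energy-based algorithm of the ViennaRNA Package: as a stand-in it uses the classic Nussinov recursion, which maximizes the number of base pairs instead of minimizing free energy with empirical parameters. The allowed pairs (including G-U) and the minimum hairpin loop size of three are assumptions of this example.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # assumed minimum number of unpaired bases in a hairpin loop


def nussinov(seq):
    """Return (max number of base pairs, one optimal dot-bracket string)."""
    n = len(seq)
    best = [[0] * n for _ in range(n)]  # best[i][j]: max pairs in seq[i..j]
    for span in range(MIN_LOOP + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j - MIN_LOOP):  # case: j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score

    # Traceback: recover one structure that attains the optimum.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= MIN_LOOP:
            continue
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - MIN_LOOP):
            if (seq[k], seq[j]) in PAIRS:
                left = best[i][k - 1] if k > i else 0
                if best[i][j] == left + 1 + best[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return (best[0][n - 1] if n else 0), "".join(structure)


pairs, dot_bracket = nussinov("GGGAAAUCCC")
print(pairs, dot_bracket)
```

The table is filled over all subsequences in cubic time, and each entry reuses the optima of shorter subsequences; energy-based folding follows the same pattern with a more detailed scoring of loops and stacks.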

slide-22
SLIDE 22

RNA sequence; RNA structure of minimal free energy

RNA folding: structural biology, spectroscopy of biomolecules, understanding molecular function

Inverse folding of RNA: biotechnology, design of biomolecules with predefined structures and functions

Inverse folding algorithm: iterative determination of a sequence for the given secondary structure

From RNA structure to sequence

Linear programming

slide-23
SLIDE 23

Ivo L. Hofacker, Walter Fontana, Peter F. Stadler, Sebastian Bonhoeffer, Manfred Tacker, and Peter Schuster. Fast folding and comparison of RNA secondary structures. Mh. Chem. 125:167-188, 1994

Ronny Lorenz, Stephan H. Bernhart, Christian Höner zu Siederdissen, Hakim Tafer, Christoph Flamm, Peter F. Stadler, and Ivo L. Hofacker. ViennaRNA Package 2.0. Algorithms Mol. Biol. 6:26, 2011

ViennaRNA Package:

slide-24
SLIDE 24
slide-25
SLIDE 25

A mapping ψ and its inversion

$$G_k = \psi^{-1}(S_k) = \{\, I_j \mid \psi(I_j) = S_k \,\}$$

$$\psi(I_j) = S_k$$

Space of genotypes: $\mathcal{I} = \{ I_1, I_2, I_3, I_4, \ldots, I_N \}$; Hamming metric

Space of phenotypes: $\mathcal{S} = \{ S_1, S_2, S_3, S_4, \ldots, S_M \}$; metric (not required)

$N \gg M$

slide-26
SLIDE 26

many genotypes → one phenotype

slide-27
SLIDE 27

Four selected examples

1. Parameter determination in chemical kinetics 2. Design of ribonucleic acid (RNA) structures

  • 3. Kinetic folding of RNA molecules

4. Modeling evolution

slide-28
SLIDE 28

Extension of the notion of structure

slide-29
SLIDE 29
slide-30
SLIDE 30

[Figure: free energy along a "reaction coordinate" for two conformations S_k and S_l separated by a saddle point T_kl, and the corresponding "barrier tree"]

Definition of a "barrier tree"

slide-31
SLIDE 31

Interconversion of suboptimal structures

slide-32
SLIDE 32
slide-33
SLIDE 33

Computation of kinetic folding

slide-34
SLIDE 34
slide-35
SLIDE 35

[Figure: the RNA molecule JN1LH with its competing hairpin structures (labels 1D, 2D, R); free energies given in the figure: −28.6, −31.8, −28.2 and −28.6 kcal·mol⁻¹]

J.H.A. Nagel, C. Flamm, I.L. Hofacker, K. Franke, M.H. de Smit, P. Schuster, and C.W.A. Pleij. Structural parameters affecting the kinetic competition of RNA hairpin formation. Nucleic Acids Res. 34:3568-3576 (2006)

An experimental RNA switch

slide-36
SLIDE 36

[Figure: barrier tree with local minima numbered 1 to 50; vertical axis: free energy [kcal/mole], from −50.0 to −26.0; barrier heights annotated on the branches]

J1LH barrier tree

slide-37
SLIDE 37

Four selected examples

1. Parameter determination in chemical kinetics 2. Design of ribonucleic acid (RNA) structures 3. Kinetic folding of RNA molecules

  • 4. Modeling evolution
slide-38
SLIDE 38

Sewall Wright's fitness landscape as a metaphor for Darwinian evolution

Sewall Wright. 1932. The roles of mutation, inbreeding, crossbreeding and selection in evolution. In: D.F.Jones, ed. Int. Proceedings of the Sixth International Congress on Genetics. Vol.1, 356-366. Ithaca, NY.

slide-39
SLIDE 39

The multiplicity of gene replacements with two alleles on each locus

+ …….. wild type
a .......... alternative allele on locus A
: : :
abcde … alternative alleles on all five loci

Sewall Wright. 1988. Surfaces of selective value revisited. American Naturalist 131:115-123
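The combinatorics behind this picture is easy to enumerate. The sketch below lists all genotypes for five loci with two alleles each, using "+" for the wild type and a lowercase letter for the alternative allele as in the legend above; the function names are hypothetical and chosen for this example.

```python
from itertools import product

LOCI = "abcde"    # five loci, named after their alternative alleles
WILD_TYPE = "+"


def all_genotypes(loci=LOCI):
    """All combinations of wild-type and alternative alleles."""
    choices = [(WILD_TYPE, allele) for allele in loci]
    return ["".join(combo) for combo in product(*choices)]


def one_step_neighbors(genotype, loci=LOCI):
    """Genotypes reachable by replacing the allele at exactly one locus."""
    neighbors = []
    for pos, allele in enumerate(loci):
        flipped = allele if genotype[pos] == WILD_TYPE else WILD_TYPE
        neighbors.append(genotype[:pos] + flipped + genotype[pos + 1:])
    return neighbors


genotypes = all_genotypes()
print(len(genotypes), genotypes[0], genotypes[-1])
print(one_step_neighbors("+++++"))
```

Five loci already give 2^5 = 32 genotypes, each with five single-replacement neighbors, so the number of genotypes doubles with every additional locus.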

Sewall Wright, 1889 - 1988

slide-40
SLIDE 40

Evolution is hill climbing of populations or subpopulations Sewall Wright. 1988. Surfaces of selective value revisited. American Naturalist 131:115-123

slide-41
SLIDE 41

The logic of DNA (or RNA) replication

Accuracy of replication: Q = q1 · q2 · q3 · q4 · …
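A short numerical illustration of this product, assuming that copying errors at different positions are independent (which is what the product form implies). The uniform case Q = q^n and the numbers used are illustrative assumptions.

```python
import math


def replication_accuracy(single_digit_accuracies):
    """Q = q1 * q2 * q3 * ... for position-specific accuracies q_i."""
    return math.prod(single_digit_accuracies)


def uniform_accuracy(q, n):
    """Special case of equal accuracy q at all n positions: Q = q**n."""
    return q ** n


# Illustrative numbers: 99.9 % accuracy per position.
for chain_length in (100, 1000, 10000):
    print(chain_length, uniform_accuracy(0.999, chain_length))
```

Even a small per-position error rate makes error-free copies rare once the chain is long enough.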

slide-42
SLIDE 42

Evolution in the test tube: G.F. Joyce, Angew.Chem.Int.Ed. 46 (2007), 6420-6436

Sol Spiegelman, 1914 - 1983

slide-43
SLIDE 43
slide-44
SLIDE 44
slide-45
SLIDE 45

Kinetics of RNA replication

C.K. Biebricher, M. Eigen, W.C. Gardiner, Jr. Biochemistry 22:2544-2559, 1983

Christof K. Biebricher, 1941-2009

slide-46
SLIDE 46

Manfred Eigen 1927 -

$$\frac{dx_j}{dt} = \sum_{i=1}^{n} W_{ji}\, x_i - x_j\, \Phi; \quad j = 1, 2, \ldots, n$$

$$W_{ji} = Q_{ji} \cdot f_i, \qquad \Phi = \frac{\sum_{i=1}^{n} f_i\, x_i}{\sum_{i=1}^{n} x_i}$$

Mutation and (correct) replication as parallel chemical reactions

  • M. Eigen. 1971. Naturwissenschaften 58:465
  • M. Eigen & P. Schuster. 1977. Naturwissenschaften 64:541, 65:7 and 65:341
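The stationary solution of the replication-mutation equation is the dominant eigenvector of the matrix W, normalized to sum to one. The sketch below computes it by power iteration for all binary sequences of a given chain length, assuming a uniform error rate p per position, so that Q_ji = (1 − p)^(L − d) · p^d with d the Hamming distance and L the chain length, and a single-peak fitness landscape. The chain length and fitness values are illustrative assumptions, not the parameters used in the talk.

```python
def hamming(a, b):
    """Hamming distance between two binary sequences encoded as integers."""
    return bin(a ^ b).count("1")


def value_matrix(chain_length, p, fitness):
    """W[j][i] = Q_ji * f_i with uniform per-position error rate p."""
    size = 2 ** chain_length
    q = 1.0 - p
    return [[q ** (chain_length - hamming(j, i)) * p ** hamming(j, i) * fitness[i]
             for i in range(size)]
            for j in range(size)]


def quasispecies(w, iterations=300):
    """Power iteration for the dominant eigenvector, normalized to sum 1."""
    size = len(w)
    x = [1.0 / size] * size
    for _ in range(iterations):
        y = [sum(w[j][i] * x[i] for i in range(size)) for j in range(size)]
        total = sum(y)
        x = [value / total for value in y]
    return x


# Single-peak landscape (illustrative): master sequence 0 has fitness 10,
# all other sequences have fitness 1.
CHAIN_LENGTH = 5
fitness = [10.0] + [1.0] * (2 ** CHAIN_LENGTH - 1)

x_low = quasispecies(value_matrix(CHAIN_LENGTH, 0.01, fitness))
x_half = quasispecies(value_matrix(CHAIN_LENGTH, 0.5, fitness))
print("master fraction at p = 0.01:", x_low[0])
print("master fraction at p = 0.50:", x_half[0])
```

At a low error rate the master sequence dominates the stationary population; at p = 0.5 every sequence is produced with equal probability and the population becomes the uniform distribution.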
slide-47
SLIDE 47

quasispecies

The error threshold in replication and mutation

slide-48
SLIDE 48

The paradigm of structural biology

slide-49
SLIDE 49

The simplified model

slide-50
SLIDE 50
slide-51
SLIDE 51

Model fitness landscapes I

single peak landscape step linear landscape

slide-52
SLIDE 52

Stationary population or quasispecies as a function of the mutation or error rate p

[Figure: horizontal axis: error rate p = 1 − q, from 0.00 to 0.10; regions labelled "quasispecies" and "uniform distribution"]

slide-53
SLIDE 53

Error threshold on the single peak landscape

slide-54
SLIDE 54

Error threshold on the step linear landscape

slide-55
SLIDE 55

Rugged fitness landscapes over individual binary sequences with n = 10

single peak landscape „realistic“ landscape

slide-56
SLIDE 56

Random distribution of fitness values: d = 1.0 and s = 637

slide-57
SLIDE 57

Error threshold on ‚realistic‘ landscapes n = 10, f0 = 1.1, fn = 1.0, d = 0.5

s = 541 s = 637 s = 919

slide-58
SLIDE 58

s = 541 s = 919 s = 637

Error threshold on ‚realistic‘ landscapes n = 10, f0 = 1.1, fn = 1.0, d = 0.995

slide-59
SLIDE 59

s = 919 s = 541 s = 637

Error threshold on ‚realistic‘ landscapes n = 10, f0 = 1.1, fn = 1.0, d = 1.0

slide-60
SLIDE 60

Complexity in molecular evolution

W = Q · F: largest eigenvalue and eigenvector

diagonalization of matrix W: "complicated but not complex"

fitness landscape: "complex"

mutation matrix

sequence → structure: "complex"

mutation, selection

slide-61
SLIDE 61

The new biology provides a hitherto unknown challenge for mathematicians, computer scientists, and theoretical biologists, for mainly two reasons: the enormous amount of data, and the complexity of structure and dynamics.

slide-62
SLIDE 62

I was taught in the pregenomic era to be a hunter. I learnt how to identify the wild beasts and how to go out, hunt them down and kill them. We are now urged to be gatherers, to collect everything lying around and put it into storehouses. Someday, it is assumed, someone will come and sort through the storehouses, discard all the junk, and keep the rare finds. The only difficulty is how to recognize them.

Sydney Brenner, 1927 -

Sydney Brenner. Hunters and gatherers. The Scientist 16(4): 14, 2002

The „big data“ problem in bioinformatics

slide-63
SLIDE 63

Theory – mathematics and computation – cannot remove complexity, but it shows what kind of „regular“ behavior can be expected and what experiments have to be done to get a grasp on the irregularities.

Manfred Eigen, 1927 -

Preface to E. Domingo, C.R. Parrish, J.J. Holland, eds. Origin and Evolution of Viruses. Academic Press 2008

Theory, mathematics and complexity

slide-64
SLIDE 64

Coworkers

Peter Stadler, Bärbel M. Stadler, Universität Leipzig, GE
Paul E. Phillipson, University of Colorado at Boulder, CO
Heinz Engl, Philipp Kügler, James Lu, Stefan Müller, RICAM Linz, AT
Jord Nagel, Kees Pleij, Universiteit Leiden, NL
Walter Fontana, Harvard Medical School, MA
Martin Nowak, Harvard University, MA
Christian Reidys, University of Southern Denmark, Odense, DK
Christian Forst, University of Texas, Southwestern Medical Center, TX
Thomas Wiehe, Ulrike Göbel, Walter Grüner, Stefan Kopp, Jaqueline Weber, Institut für Molekulare Biotechnologie, Jena, GE
Ivo L. Hofacker, Christoph Flamm, Andreas Svrček-Seiler, Universität Wien, AT
Kurt Grünberger, Michael Kospach, Andreas Wernitznig, Stefanie Widder, Stefan Wuchty, Jan Cupal, Stefan Bernhart, Lukas Endler, Ulrike Langhammer, Rainer Machne, Ulrike Mückstein, Erich Bornberg-Bauer, Universität Wien, AT

Universität Wien

slide-65
SLIDE 65

Universität Wien

Acknowledgement of support

Fonds zur Förderung der wissenschaftlichen Forschung (FWF): Projects No. 09942, 10578, 11065, 13093, 13887, and 14898
Wiener Wissenschafts-, Forschungs- und Technologiefonds (WWTF): Project No. Mat05
Jubiläumsfonds der Österreichischen Nationalbank: Project No. Nat-7813
European Commission: Contracts No. 98-0189, 12835 (NEST)
Austrian Genome Research Program – GEN-AU: Bioinformatics Network (BIN)
Österreichische Akademie der Wissenschaften
Siemens AG, Austria
Universität Wien and the Santa Fe Institute

slide-66
SLIDE 66

Thank you for your attention! Happy 25th birthday IWR and ad multos annos.

slide-67
SLIDE 67

Web-Page for further information: http://www.tbi.univie.ac.at/~pks

slide-68
SLIDE 68