CS Seminar. Feb 09. Alexey Onufriev, Dept of Computer Science

The computational Core of Molecular Modeling. (The What? Why? And How? ) CS Seminar. Feb 09. Alexey Onufriev, Dept of Computer Science; Dept. of Physics and the GBCB program.

Thanks to: Faculty co-authors: T.M. Murali, M. Prisant, L. Heath, C. Simmerling. Student co-authors: J. Gordon, J. Myers, A. Fenley, J. Ruscio, R. Anandakrishnan, D. Kumar, M. Shukla, V. Sojia. Sponsors: NIH, VT. Support: System-X team, N. Polys (myoglobin graphics).

Will focus on the computational side of solving problems in the following areas: • a. Rational Drug Design • b. From atomic motion to biological function • c. The “grand challenge” of computational science: the Protein Folding Problem.

The emergence of “in virtuo” science: in vivo, in vitro, in virtuo.

Biological function = f(3D molecular structure). DNA sequence (…A T G C…) -> protein structure -> biological function. How can we predict/understand/modify function if we know the structure? Can we predict the structure? Key challenges: Biomolecular structures are complex (e.g. compared to crystal solids). Biology works on many time scales. Experiments can only go so far. A solution: Computational methods. Model movements of individual atoms.

The paradigm: “All things are made of atoms, and everything that living things do can be understood in terms of the wigglings and jigglings of atoms.” R. Feynman. This suggests the approach: model what nature does, i.e. let the molecule evolve with time according to the underlying laws of physics.

A protein on a surface. Atomic resolution

Theme 1. Rational (structure-based) design of new medicines. Picture: Design of an HIV protease inhibitor. Hornak, V.; Okur, A.; Rizzo, R. and Simmerling, C., “HIV-1 protease flaps spontaneously open and reclose in molecular dynamics simulations”, Proc. Nat. Acad. Sci. USA, 103:915-920 (2006)

Example: rational drug design. If you block the enzyme's function, you kill the virus. Example enzyme: viral protease (chops up proteins). Figure label: drug agent.

Example of successful computer-aided (rational) drug design: One of the drugs that helped slow down the AIDS epidemic (part of the antiretroviral cocktail). The drug blocks the function of a key viral protein. To design the drug, one needs a precise 3D structure of that protein.

A computational challenge: Need high quality, complete protein structures. But the experiment (X-ray) does not “see” hydrogen atoms. These have to be placed computationally.

Combinatorial explosion problem: A molecule with N sites, each of which can be occupied by a proton (H+) or be empty, has $2^N$ possible states. All of the possible charge (protonation) states must be taken into account. For a typical protein with 100 ionizable amino acids this means $2^{100} \sim 10^{30}$ possible variants to consider! And we need to find the minimum energy state among them (for the experts: what we really need is, of course, the partition function Z, given below, which is an even more computationally demanding job).

Free energy of protonation state k (out of $2^N$):

$$\Delta G_k(\mathrm{pH}) = \sum_{i}^{N} x_i^k \left( kT \ln 10 \, \mathrm{pH} - \Delta E_i^{\mathrm{calc}} \right) + \frac{1}{2} \sum_{i}^{N} \sum_{j}^{N} x_i^k W_{ij} x_j^k$$

Here $x_i^k$ = 1 or 0 (site i occupied or empty in state k), and $W_{ij}$ is the matrix of site-site interactions. The double sum is the trouble term.

$$Z = \sum_{\text{all states } k} \exp\left( -\Delta G_k / kT \right)$$
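To make the cost concrete, here is a minimal brute-force sketch of the free energy formula and partition function above. It is an illustration only, not the method used by the speaker's group: the three-site energies are made-up numbers in units of kT, the names `state_free_energy` and `enumerate_states` are hypothetical, and the double sum is assumed to skip the i = j terms.

```python
import itertools
import math

def state_free_energy(x, site_term, W):
    """Free energy of one protonation state x (tuple of 0/1 occupancies).

    site_term[i] stands for (kT ln10 pH - dE_i_calc); W[i][j] is the
    site-site interaction. Assumption: diagonal terms i == j are skipped.
    """
    n = len(x)
    single = sum(x[i] * site_term[i] for i in range(n))
    pair = 0.5 * sum(x[i] * W[i][j] * x[j]
                     for i in range(n) for j in range(n) if i != j)
    return single + pair

def enumerate_states(site_term, W, kT=1.0):
    """Visit all 2^N states; return (lowest-energy state, its energy, Z)."""
    n = len(site_term)
    best_state, best_dg, Z = None, math.inf, 0.0
    for x in itertools.product((0, 1), repeat=n):
        dg = state_free_energy(x, site_term, W)
        Z += math.exp(-dg / kT)
        if dg < best_dg:
            best_state, best_dg = x, dg
    return best_state, best_dg, Z

# Hypothetical 3-site example, energies in units of kT.
site_term = [-1.0, -0.5, 0.2]
W = [[0.0, 0.8, 0.1],
     [0.8, 0.0, 0.3],
     [0.1, 0.3, 0.0]]
best_state, best_dg, Z = enumerate_states(site_term, W)
print(best_state, best_dg, Z)
```

The loop body runs 2^N times, so this approach is only feasible for a few dozen sites at most.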

Beating the combinatorics: the clustering approach. Example: a protein with N = 6 ionizable groups. Total number of protonation states = $2^N$ = 64. Every site interacts with every other site: a complete graph. Solution: neglect the “weak” edges and cluster the strong ones (Cluster 1, N = 3; Cluster 2, N = 3). After clustering: total number of states = $2^3 + 2^3$ = 16 < 64.
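The counting argument can be sketched in a few lines. This is a hedged illustration, assuming that "clustering" means taking connected components of the interaction graph after edges weaker than a cutoff are dropped; the cutoff value, the interaction matrix and the function names are hypothetical.

```python
def clusters_from_couplings(W, threshold):
    """Connected components of the graph that keeps only |W[i][j]| >= threshold."""
    n = len(W)
    seen, clusters = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if j != i and j not in seen and abs(W[i][j]) >= threshold:
                    seen.add(j)
                    stack.append(j)
        clusters.append(sorted(component))
    return clusters

def count_states(clusters):
    """States to enumerate when each cluster is treated independently."""
    return sum(2 ** len(c) for c in clusters)

# Hypothetical 6-site system: strong coupling (1.0) inside sites 0-2 and
# inside sites 3-5, weak coupling (0.01) between the two groups.
N = 6
W = [[0.0 if i == j else (1.0 if i // 3 == j // 3 else 0.01)
      for j in range(N)] for i in range(N)]

clusters = clusters_from_couplings(W, threshold=0.1)
print(clusters, count_states(clusters), 2 ** N)
```

The saving grows quickly with system size: the cost is set by the largest cluster rather than by the total number of sites.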

http://biophysics.cs.vt.edu/H++ A web-server that adds hydrogens to molecular structures. Launched by Onufriev’s group in June 2005. ~1000 registered users since.

Intrigued? Suggested reading: “The Many Roles of Computation in Drug Discovery”, W. Jorgensen, Science 303, 1813 (2004) + references therein.

THEME II. The protein folding challenge. Nature does it all the time. Can we? Amino-acid sequence: translated genetic code. MET-ALA-ALA-ASP-GLU-GLU-… How? Experiment: amino acid sequence uniquely determines the protein's 3D shape (ground state). Why bother: a protein's shape determines its biological function.

Protein Structure in 3 steps. Step 1. Two amino-acids together (di-peptide) Peptide bond Amino-acid #1 Amino-acid #2

Protein Structure in 3 steps. Step 2: Most flexible degrees of freedom:

A protein is simply a chain of amino-acids, with flexible angles φ1, φ2, φ3, φ4, … Each configuration {φ1, φ2, …, φκ} has some energy. The folded (biologically functional) protein has the lowest possible energy: the global minimum. So just find this conformation by some kind of a minimization algorithm… what’s the big deal?

The magnitude of the protein folding challenge: Enormous number of possible conformations of the polypeptide chain. A typical protein is a chain of ~100 amino acids. Assume that each amino acid can take up only 10 conformations (a vast underestimate). Total number of possible conformations: $10^{100}$. Suppose each energy estimate is just 1 floating-point operation. Suppose you have a petaflop ($10^{15}$ operations per second) supercomputer. An exhaustive search for the global minimum would take $10^{85}$ seconds ~ $3 \times 10^{77}$ years. Age of the Universe ~ $1.4 \times 10^{10}$ years.
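The estimate is easy to reproduce. The sketch below just repeats the arithmetic under the slide's stated assumptions (10 conformations per residue, one operation per energy evaluation, a machine doing 10^15 operations per second); the variable names are illustrative.

```python
n_residues = 100
conformations_per_residue = 10
ops_per_second = 1e15                    # one petaflop
seconds_per_year = 365.25 * 24 * 3600

total_conformations = conformations_per_residue ** n_residues  # exact integer
search_seconds = total_conformations / ops_per_second
search_years = search_seconds / seconds_per_year

print(f"conformations: 10^{len(str(total_conformations)) - 1}")
print(f"search time:   {search_seconds:.1e} s = {search_years:.1e} years")
```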

Finding a global minimum in a multidimensional case is easy only when the landscape is smooth. No matter where you start (1, 2 or 3), you quickly end up at the bottom: the Native (N), functional state of the protein. (Figure: free energy vs. folding coordinate, with starting points 1, 2 and 3. Adapted from Ken Dill’s web site at UCSF.)

Realistic landscapes are much more complex, with multiple local minima: folding traps. Proteins “trapped” in those minima may lead to disease, such as Alzheimer’s. (Adapted from Ken Dill’s web site at UCSF.)

Adapted from Dobson, Nature 426, 884 (2003)

Since minimization won’t work, choose an alternative. Do what Nature does: just let it fold on its own, at normal temperature. Method: Molecular Dynamics

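Principles of Molecular Dynamics (MD): Each atom moves by Newton’s 2nd Law, $F = ma$, where the force follows from the system’s energy: $F = -dE/dr$. System’s energy: $E = K r^2 + \ldots + A/r^{12} - B/r^{6} + Q_1 Q_2 / r$. Here $K r^2$ is bond stretching (the bond acts as a spring), $A/r^{12} - B/r^{6}$ is the VDW interaction, and $Q_1 Q_2 / r$ accounts for electrostatic forces.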
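As a small illustration of how a force is obtained from the energy, the sketch below codes the two non-bonded terms for a single pair of atoms and checks the analytic force against a numerical derivative. The parameter values are arbitrary, units are ignored, and the function names are hypothetical; real force fields carry many more terms and carefully fitted constants.

```python
def pair_energy(r, A, B, q1, q2):
    """Non-bonded pair energy: VDW term A/r^12 - B/r^6 plus Coulomb q1*q2/r."""
    return A / r**12 - B / r**6 + q1 * q2 / r

def pair_force(r, A, B, q1, q2):
    """Radial force F = -dE/dr, differentiated by hand from pair_energy."""
    return 12 * A / r**13 - 6 * B / r**7 + q1 * q2 / r**2

# Arbitrary illustrative parameters (no physical units implied).
A, B, q1, q2, r = 1.0, 1.0, 1.0, -1.0, 1.2

# Central finite difference as an independent check of the sign and value.
h = 1e-6
numeric_force = -(pair_energy(r + h, A, B, q1, q2)
                  - pair_energy(r - h, A, B, q1, q2)) / (2 * h)
analytic_force = pair_force(r, A, B, q1, q2)
print(analytic_force, numeric_force)
```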

Now we have positions of all atoms as a function of time. Can compute statistical averages, fluctuations; analyze side chain movements, cavity dynamics, domain motion, etc.

Molecular Dynamics: PRINCIPLE: Given the position x(t) of each atom at time t, its position at the next time-step t + Δt is given by: $x(t + \Delta t) \approx x(t) + v(t)\,\Delta t + \frac{1}{2} (F/m) (\Delta t)^2$, where F is the force on the atom. Key parameter: integration time step Δt. Controls accuracy and speed of numerical integration routines. Smaller Δt is more accurate, but needs more steps. How many are needed to simulate biology? How many can one afford?
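A minimal sketch of one such time step follows. The position update is the formula above; the velocity update is not given on the slide, so the velocity Verlet form (average of old and new forces) is used here as an assumption. The test system is a one-dimensional harmonic spring with unit mass and unit force constant, chosen only because its exact solution, x(t) = cos(t), is known.

```python
import math

def md_step(x, v, force, m, dt):
    """Advance one particle by dt.

    Position: x + v*dt + 0.5*(F/m)*dt^2 (the slide's formula).
    Velocity: velocity Verlet update (an assumption, not from the slide).
    """
    f_old = force(x)
    x_new = x + v * dt + 0.5 * (f_old / m) * dt**2
    f_new = force(x_new)
    v_new = v + 0.5 * (f_old + f_new) / m * dt
    return x_new, v_new

def spring_force(x):
    """Harmonic spring with unit force constant: F = -x."""
    return -x

x, v, m, dt, n_steps = 1.0, 0.0, 1.0, 0.01, 1000
for _ in range(n_steps):
    x, v = md_step(x, v, spring_force, m, dt)

t_final = n_steps * dt
total_energy = 0.5 * m * v**2 + 0.5 * x**2
print(x, math.cos(t_final), total_energy)
```

Rerunning with a larger `dt` shows the trade-off stated above: fewer steps are needed, but the trajectory drifts further from the exact one.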

As a result, we cannot quite get into the “biological” time scales. (Figure: axis of characteristic time scales in seconds, running from $10^{-14}$ (H-C bond vibration) through $10^{-6}$ (protein folding) to $10^{0}$, with regions labelled “currently accessible times” and “biology”.) Time-step, Δt: for stability, Δt must be at least an order of magnitude less than the fastest motion, i.e. Δt ~ $10^{-15}$ s. Example: to simulate folding of the fastest folding protein, at least $10^{-6} / 10^{-15} = 10^{9}$ steps will be needed.

The bottleneck of the methodology: computation of long-range interactions. Electrostatic interactions fall off as the inverse distance between atoms. Too strong to neglect. Need to account for all of them. Very expensive. Up to 99% of total cost for a protein.

Massively parallel machines help. Virginia Tech’s supercomputer, System-X

The “worst” problem for parallel computations: Force acting on each atom depends upon positions of every other atom in the system. Computed coordinates have to be communicated between all processors at each step. (Figure: Processors #1 to #4, each holding its own atoms and forces, e.g. Processor #1: X1, F(on X1); Processor #2: X2, F(on X2).)

Without approximations, computation of long-range electrostatic forces will cost you $O(N^2)$, where the number of atoms N may be as large as N ~ $10^6$. Too expensive for large systems. Every atom interacts with every other atom: a complete graph. A solution: combine charges (vertices) into groups, that is, use coarse-graining (in the figure, a group of charges is replaced by a single +3 charge). After coarse-graining, costs can be as low as $O(N \log N)$. Works because macromolecules are naturally partitioned into hierarchical levels: atoms -> amino-acids -> proteins -> complexes…
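The idea behind grouping can be sketched with the simplest possible approximation: replace a distant group of charges by its net charge placed at the group's geometric center, and compare the electrostatic potential it produces with the exact sum. This is only an illustration of why grouping works at long range, not the hierarchical scheme the talk refers to; the coordinates, charges and function names are hypothetical, and physical constants are omitted.

```python
import math

def potential_exact(probe, positions, charges):
    """Exact potential at `probe`: one 1/r term per charge in the group."""
    return sum(q / math.dist(probe, p) for p, q in zip(positions, charges))

def potential_grouped(probe, positions, charges):
    """Coarse-grained potential: net charge placed at the geometric center.

    Costs one 1/r term per group instead of one per charge.
    """
    n = len(positions)
    center = tuple(sum(p[k] for p in positions) / n for k in range(3))
    return sum(charges) / math.dist(probe, center)

# Hypothetical group of three unit charges (net charge +3) near the origin,
# probed from a point about 50 length units away.
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
charges = [1.0, 1.0, 1.0]
probe = (50.0, 0.0, 0.0)

exact = potential_exact(probe, positions, charges)
grouped = potential_grouped(probe, positions, charges)
relative_error = abs(grouped - exact) / abs(exact)
print(exact, grouped, relative_error)
```

The error shrinks as the probe moves away from the group, which is why nearby atoms are still treated individually while distant ones can be lumped together.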

Simulated refolding pathway of the 46-residue protein. Movie available at: www.scripps.edu/~onufriev/RESEARCH/in_virtuo.html Molecular dynamics based on AMBER-7. NB: due to the absence of viscosity, folding occurs on a much shorter time-scale than in an experiment.

Intrigued? Suggested articles: 1. “Protein Folding and Misfolding”, C. Dobson, Nature 426, 884 (2003). 2. “Design of a Novel Globular Protein Fold with Atomic-level Accuracy”, Kuhlman et al., Science 302, 1364 (2003) + references therein.
