Mauro Boero Institut de Physique et Chimie des Matriaux de - - PowerPoint PPT Presentation

mauro boero
SMART_READER_LITE
LIVE PREVIEW

Mauro Boero Institut de Physique et Chimie des Matriaux de - - PowerPoint PPT Presentation

Hybrid Quantum Mechanics / Molecular Mechanics (QM/MM) Approaches - Practical aspects: Size of the QM region, initial structure (classical equilibration), QM/MM protocol; computational requirements Mauro Boero Institut de Physique et Chimie des


slide-1
SLIDE 1

1

Hybrid Quantum Mechanics / Molecular Mechanics (QM/MM) Approaches - Practical aspects: Size of the QM region, initial structure (classical equilibration), QM/MM protocol; computational requirements

Mauro Boero

Institut de Physique et Chimie des Matériaux de Strasbourg University of Strasbourg - CNRS, F-67034 Strasbourg, France and @Institute of Materials and Systems for Sustainability, Nagoya University - Oshiyama Group, Nagoya Japan

slide-2
SLIDE 2

2

Hy = Ey F = m a

e-(DE/kT) ab initio molecular dynamics coarse grain mesoscale continuum

Time and Length scales in simulations

Length Scale (m) 10-10 10-8 10-6 10-4 10-6 10-8 10-12

slide-3
SLIDE 3

From the crystallographic data to the initial structure: Starting a QM/MM calculation

  • First question to be addressed: Do we really need

QM ? (Sometimes MM is enough !)

  • Second question: Do we have all what we need ?

3

Not always: the structure of proteins and or nucleic acids available via the web-accessible Protein Data Bank or provided by experiments, are generally obtained by X-ray scattering. As it is well known that X-rays are insensitive to H atoms*, thus only the coordinates of heavier atoms are available.

*X-rays are scattered by the electron cloud of the atom and hence the scattering amplitude of X-rays increases with the atomic number

slide-4
SLIDE 4

From the crystallographic data to the initial structure: Starting a QM/MM calculation

4

Trp-cage, the second smallest protein made

  • f 20 residues.

Protein Data Bank www.pdb.org access codes: 2JOF, 1L2Y, 1RIJ C = gray N = blue O = red …and H ?

slide-5
SLIDE 5

5

  • First warning: Hydrogens addition leaves in general a certain

freedom in the selection of the protonation state of the various residues or bases

  • To complement the missing information, a careful analysis a

posteriori of the local hydrogen bond (H-bond) network is used

  • And (or jointly with) theoretical estimations of the pK

Then, before starting any calculation we need to complete the structure…

While the former operation is easily done with most of the available graphic software, the second one is more demanding and generally implies the solution of the Poisson-Boltzmann equation.

slide-6
SLIDE 6

6

  • The method has been extensively tested and assessed over the

years and makes use of the Poisson equation

Poisson-Boltzmann equation (1)

) ( 4 ) (

2

r r     =  V

where the total electrostatic potential V(r) is due to the charge distribution  (r), i.e. all the positive and negative ions of the system, including also the solvent, if present. The dielectric function e is generally assumed to be a constant and typical values e ~ 80 represent the standard assumption in the case of aqueous solutions

(D. Andelman, Electrostatic properties of membranes: The Poisson- Boltzmann Theory in “Handbook of Biological Physics”, vol. 1, p. 603-638, ed. By R. Lipowsky and E. Sackmann, Elsevier Science B.V. 1995).

slide-7
SLIDE 7

7

  • In the limit of a continuum density distribution, the chemical

potential mi of the i-th charge in the system is defined by

Poisson-Boltzmann equation (2)

where e is the electron charge, Zi the valence of the i-th ion, V the electrostatic potential, kB the Boltzmann constant, T the temperature and i the charge density of the i-th ion.

  • The first addend in the right-hand side of Eq. (2) comes from

the pure electrostatic contribution

  • The second one is due to the entropy contribution of the ions

under the hypothesis of weak solution.

) ( ln ) ( ) ( r r r

i B i i

T k V eZ  m  =

slide-8
SLIDE 8

8

  • This leads to a Boltzmann distribution of the charges

Poisson-Boltzmann equation (3)

and the explicit form of the Poisson-Boltzmann equation reads

T k V eZ i i

B i

e

/ ) (

) (

r

r

 =  

 

T k V eZ T k V eZ

B B

Z Z e V

/ ) ( / ) ( 2

e e 4 ) (

r r

r

 

     

  =     

The contributions from positive and negative ions are written separately.

This equation is solved numerically by free computer codes (N. A. Baker et al. (2001) Proc. Natl. Acad. Sci. USA 98, 10037 - http://apbs.sourceforge.net/) or by web-based services (http://bioserv.rpbs.jussieu.fr/PCE http://biophysics.cs.vt.edu/H++ etc.).

slide-9
SLIDE 9

9

  • Once that the Poisson-Boltzmann is solved, one can compute

the electrostatic free energy of the system by integrating over the volume W of the system,

Poisson-Boltzmann equation (4)

and estimate the theoretical theoretical pK is computed by introducing the free energy difference between two (or more) states, A and B, DFel = Fel (B) – Fel (A),

 

   =

W

T k r d V F

B el 3 2

) ( 8 r  

 

r d 3 )] ( 2 ) ( ) ( [ )] ( / ) ( ln[ ) ( )] ( / ) ( ln[ ) (

W      

     r r r r r r r r r         

el B

F T k K D = 10 ln 1 p

slide-10
SLIDE 10

10

  • The Poisson-Boltzmann / H-bond network inspection allows to

recover the missing H coordinates

The complete structure:

slide-11
SLIDE 11

11

  • Once the protonation state of the system has been addressed,

the protein or nucleic acid system has to be solvated in water, since this would correspond to a natural environment of the biomolecule.

…and this is not yet the end of the story:

304 atoms →2306 at.

slide-12
SLIDE 12

How can I add water ?

  • Few O atoms of H2O molecules can be already present in

the crystal structure (crystallographic water)

  • Several software tools are available
  • Xleap or tleap (Amber / glycam/parm99 add force fields)

http://www.chem.hamilton.edu/~kkirschn/Amber_Tutorial/ xleap_glycam_tutorial.pdf

  • MOIL, http://cbsu.tc.cornell.edu/software/moil/moil.html
  • GROMACS, http://www.gromacs.org/
  • PACKMOL, http://code.google.com/p/packmol/
  • Add sufficient amount of solvent water around the

crystallographic structure at standard density (1.0 g/cm3)

12

slide-13
SLIDE 13

13

The next step: MM-equilibrating your system

  • The Cartesian positions (x1,y1,z1),…,(x N,yN,zN) of our whole

system, hereafter identified by vectors RI = (xI,yI,zI), are interacting atoms

  • We have then to select an appropriate classical potential that will

be our MM description of the interactions among atoms (AMBER, CHARMM, GROMOS, GROMACS, OPLS-AA, etc.)

  • These potentials are analytic functions V(RI) of the positions RI
  • The forces fI on each particle are simply the gradients (derivatives)
  • f this potential,

I I I

V R R f    = ) (

and the analytical 3xN dimensional function fI is called force field

slide-14
SLIDE 14

14

MM: what is a Force Field ? Which interactions ?

intermolecular interactions intramolecular nonbonded interactions torsion bond stretch bending

2 2 1

) ( ) , (

IJ IJ IJ IJ IJ

r r k k r v  =

2 2 1

) ( ) (   

 =

IJK IJK

k v

v(fIJKL)

slide-15
SLIDE 15

Files in input (whatever MM code):

  • Basically we need to handle three pieces of information that can

be either contained in a single file (almost never) or separated in three different files (practically always) COORDINATES : RI = (xI,yI,zI), TOPOLOGY : ???? INPUT : list of parameters (integration step, total simulation time, etc.) and keywords activating specific functions, e.g. MD

Files in output:

  • ENERGIES, TRAJECTORY, CHARGES, MULTIPOLES,

etc.

15

slide-16
SLIDE 16

TOPOLOGY ?

Any force field (of course) does not do any electronic structure

  • calculation. So giving the coordinates (x,y,z) is insufficient. We

must supply to the MM code also the chemical species (this we do also in QM approaches) and which type of interaction bewteen atoms (=parameters of the force field). Example: Suppose we have an atom at (x,y,z). The numbers x,y,z are listed in the COORDINATES file. Now we want this atom to be C and we want it in sp3 configuration. Evidently a force field for sp3 is an analytical function different from the one for sp2. This information is given in the file called TOPOLOGY.

 

) , , ( , ,

2 3

z r V r V

sp sp

 f  

16

slide-17
SLIDE 17

TOPOLOGY ?... I do not have it…

  • The first thing to do is complaining with the person(s) who gave

you only the coordinates

  • The second alternative is to re-generate yourself the topology via

Amber_tools, e.g.

  • 1. Write a tentative input file tleap.in as

source leaprc.cell → load glycam and parm99 additional ff cellmod=loadpdb <your-PBD-file>.pdb set cellmod box {110.466 110.466 110.466} → cell size # save coordinates, topology and pdb files ! saveamberparm cellmod cellmod.top cellmod.crd savepdb cellmod cellmod.pdb quit

  • 2. Run tleap/xleap

tleap -f tleap.in

17

slide-18
SLIDE 18

18

Equilibrating your system

  • Once the full system has been obtained, before doing any

quantum calculation one should better do a full classical (=MM) dynamical run …why ? 1) You had just frozen coordinates of heavy atoms only 2) You added H atoms a posteriori 3) You added solvent water molecules 4) So, you have to reach an equilibrium state solute-solvent Monitor your MM dynamics, especially crucial quantities: T (temperature of the system) and partial T solute/solvent Root Mean Square Displacement (RMSD) of heavy atoms

 

=

 =

heavy atoms

N I I I heavy atoms

t N

1 2

) ( ) ( 1 RMSD R R

slide-19
SLIDE 19

19

The RMSD: comparison with pristine crystallographic data

Check if your system is still compatible with the experimental resolution* ! (at very best 1-2 Å. Mostly about 3 Å)

slide-20
SLIDE 20

20

  • Not every atom in a large system generally needs a QM

treatment (the QM/MM ansatz)

  • Anyhow they are there in real systems and participate to many

interactions (H-bonds, van der Waals, etc.)

  • (Thumb) rule of the QM/MM game: Identify a small portion of

your large system that is interesting (localized) for your specific purposes and that is tractable by QM approaches

  • Classical-Quantum interface: be careful about the interaction

between the two worlds (we shall see details later)

  • Remember that any QM/MM dynamics must preserve the

constants of motion

QM/MM selection and protocol

slide-21
SLIDE 21

21

  • 1. The choice of the QM methods follows the same criteria of

any purely QM approach

  • 2. The QM subsystem is always embedded inside the (larger)

MM subsystem

  • 3. The QM and MM subsystems are always in strong interaction

with each other

  • 4. We shall focus on QM = DFT because of its favorable

computational cost / accuracy ratio

QM/MM selection and protocol

slide-22
SLIDE 22

22

  • The choice of the QM subsystem is always somehow

arbitrary, hence it must be carefully checked. e.g. by extending the QM region from QM1 to QM2 (or by reducing from QM2 to QM1) do forces on the atoms change ? QM1 FMM(RI) ≈ FQM(RI) QM2 ??????

QM/MM selection and protocol

QM2 QM1 RI

slide-23
SLIDE 23

An example of hybrid QM/MM-MD:

RNA enzymes =Ribozymes

1) Biological function

Contribution to transfer of genetic information from DNA to protein

2) Evolution of organisms

primordial organism present organism (in the RNA world) (in the RNP world) gene RNA DNA enzyme RNA protein

3) Medical application

Inhibition of expression of genes, such as oncogene (cancer gene therapy)

slide-24
SLIDE 24

24

An example of 3D structure of riboyzme: hammerhead ribozyme

(molecular) scissors = (catalytic) reaction

slide-25
SLIDE 25

25

QM ? QM ?

Hammerhead ribozyme: Just an example

Or ?

slide-26
SLIDE 26

26

QMMM dynamical simulation in the absence of OH-

slide-27
SLIDE 27

27

QMMM dynamical simulation in the presence of OH-

slide-28
SLIDE 28

28

Mg2+

1-Mg2+ 2 distance during the simulation

No OH- with OH-

slide-29
SLIDE 29

29

  • We need to treat as QM at least all the atoms participating to

the active process we are interested in. In many cases also atoms far apart which are expected to approach the active center, and/or first/second solvation shells, etc.

  • This implies at least a rough knowledge of the process under

study

  • Since inside the QM subsystem also the electronic structure is

computed, we must select a QM sub-cell large enough to accommodate wavefunctions and densities at any time step of the simulation (at least to the numerical accuracy ~10-5 e/Å3)

  • If the process under study implies large separations of QM

atoms, these must not escape the QM simulation box…

  • …as well as wavefunctions (electron spill-out problem)

QM/MM selection and protocol