
10/24/19 1

Integrative modeling of biomolecular complexes

Prof. Alexandre M.J.J. Bonvin
Bijvoet Center for Biomolecular Research
Faculty of Science, Utrecht University, the Netherlands
a.m.j.j.bonvin@uu.nl
@amjjbonvin

Overview

  • Introduction
  • Information sources
  • General aspects of docking
  • Information-driven docking with HADDOCK
  • Incorporating biophysical data into docking
  • Assessing the interaction space
  • Multiple choices...
  • Conclusions & perspectives

[Faculty of Science Chemistry]

The social network of proteins

Majority of ‘life’ depends on interactions, particularly protein-protein

The protein-protein interaction Cosmos

1vpn

Structural biology of interactions

High-throughput computation vs. high-resolution experiments

Computational models are often not trusted by the experimental community.

  • Experiment: NMR, MS, cryo-EM, X-ray, SAS, FRET, EPR
  • Computation: docking, molecular dynamics, homology modeling, threading

Structural coverage of interactomes

Unique interactions in interactomes (E. coli, H. sapiens)

  • ~7,500 binary interactions in E. coli
  • ~44,900 binary interactions in H. sapiens

[Figure legend: interactions with complete structures; with partial (domain-domain) or complete models; with structures for the interactors (suitable for docking); without structural data]

Statistics from Interactome3D (2013-01), Mosca et al., Nature Methods 2013

Structural coverage of interactomes

Unique binary interactions

  • H. sapiens: 8,679 (2015_02) → 13,889 (2019_01); total: 118,706
  • E. coli: 1,347 (2015_02) → 1,499 (2019_01); total: 4,217

Statistics taken from the Interactome3D project (2019_01): https://interactome3d.irbbarcelona.org/

Molecular Docking


Methodology: sampling, scoring, data incorporation

[Figure: interaction energy plotted over the conformational landscape]

Data Integration during Sampling

[Figure: interaction energy over the conformational landscape for a global search and for an information-driven search]

What is Integrative Modeling?


Why integrative modelling?

For Experimentalists

  • New hypotheses to drive experiments
  • Speed up structure determination
  • Increase our understanding of function

For Modelers

  • Decrease high false positive rate
  • Ease accuracy assessment

Related reviews

  • Halperin et al. (2002). Principles of docking: an overview of search algorithms and a guide to scoring functions. PROTEINS: Struc. Funct. & Genetics 47, 409-443.
  • Special issues of PROTEINS (2003, 2005, 2007, 2010, 2013 and 2016), which are dedicated to CAPRI.
  • de Vries SJ and Bonvin AMJJ (2008). How proteins get in touch: Interface prediction in the study of biomolecular complexes. Curr. Pept. and Prot. Research 9, 394-406.
  • Melquiond ASJ, Karaca E, Kastritis PL and Bonvin AMJJ (2012). Next challenges in protein-protein docking: From proteome to interactome and beyond. WIREs Computational Molecular Science 2, 642-651.
  • Karaca E and Bonvin AMJJ (2013). Advances in integrated modelling of biomolecular complexes. Methods 59, 372-381.
  • Rodrigues JPGLM and Bonvin AMJJ (2014). Integrative computational modelling of protein interactions. FEBS J. 281, 1988-2003.


Experimental sources: mutagenesis

Advantages/disadvantages
+ Residue level information
- Loss of native structure should be checked

Detection

  • Binding assays
  • Surface plasmon resonance
  • Mass spectrometry
  • Yeast two hybrid
  • Phage display libraries, …

Experimental sources: cross-linking and other chemical modifications

Advantages/disadvantages
+ Distance information between linker residues
- Cross-linking reaction problematic
- Detection difficult

Detection

  • Mass spectrometry

Experimental sources: H/D exchange

Advantages/disadvantages
+ Residue information
- Direct vs indirect effects
- Labeling needed for NMR

Detection

  • Mass spectrometry
  • NMR 15N HSQC

Experimental sources: NMR chemical shift perturbations

Advantages/disadvantages
+ Residue/atomic level
+ No need for assignment if combined with amino-acid selective labeling
- Direct vs indirect effects
- Labeling needed

Detection

  • NMR 15N or 13C HSQC

Experimental sources: NMR orientational data (RDCs, relaxation)

Advantages/disadvantages
+ Atomic level
- Labeling needed

Detection

  • NMR

Other potential experimental sources

  • Paramagnetic probes in combination with NMR
  • Cryo-electron microscopy or tomography and small angle X-ray scattering (SAXS) ==> shape information
  • Fluorescence quenching
  • Fluorescence resonance energy transfer (FRET)
  • Infrared spectroscopy combined with specific labeling

Predicting interaction surfaces

  • In the absence of any experimental information (other than the unbound 3D structures), can we try to predict interfaces from sequence information?
  • WHISCY: WHat Information does Surface Conservation Yield? http://www.nmr.chem.uu.nl/whiscy

[Figure: WHISCY combines a sequence alignment (EFRGSFSHL, EFKGAFQHV, EFKVSWNHM, LFRLTWHHV, IYANKWAHV, EFEPSYPHI) with surface smoothing and interface propensities; predicted vs. true interface]

De Vries, van Dijk & Bonvin, Proteins 2006

Predicting interaction surfaces

  • Several other approaches have been described:

– HSSP (Sander & Schneider, 1993)
– Evolutionary trace (Lichtarge et al., 1996)
– Correlated mutations (Pazos et al., 1996)
– ConSurf (Armon et al., 2001)
– Neural network (Zhou & Shan, 2001) (Fariselli et al., 2002)
– Rate4Site (Pupko et al., 2002)
– ProMate (Neuvirth et al., 2004)
– PPI-PRED (Bradford & Westhead, 2005)
– PPISP (Chen & Zhou, 2005)
– PINUP (Liang et al., 2006)
– PIER (Kufareva et al., 2007)
– SPPIDER (Porollo & Meller, 2007)
– SVM method (Dong et al., 2007)
– ... and many more since then
– Our recent meta-server: CPORT (de Vries & Bonvin, 2011)

See review article (de Vries & Bonvin 2008)

Interface prediction servers

  • PPISP (Zhou & Shan, 2001; Chen & Zhou, 2005): http://pipe.scs.fsu.edu/ppisp.html
  • ProMate (Neuvirth et al., 2004): http://bioportal.weizmann.ac.il/promate
  • WHISCY (De Vries et al., 2005): http://www.nmr.chem.uu.nl/whiscy
  • PINUP (Liang et al., 2006): http://sparks.informatics.iupui.edu/PINUP
  • PIER (Kufareva et al., 2006): http://abagyan.scripps.edu/PIER
  • SPPIDER (Porollo & Meller, 2007): http://sppider.cchmc.org

Consensus interface prediction (CPORT): haddock.science.uu.nl/services/CPORT

CPORT webserver: haddock.science.uu.nl/services/CPORT/

Combining experimental or predicted data with docking

  • a posteriori: data-filtered docking

– Use standard docking approach
– Filter/rescore solutions

  • a priori: data-directed docking

– Include data directly in the docking by adding an additional energy term or limiting the search space

Docking

  • Choices to be made in docking:

– Representation of the system
– Sampling method: 3 rotations and 3 translations; internal degrees of freedom?
– Scoring
– Flexibility, conformational changes?
– Use experimental information?

Systematic search

  • Sample rotations (3) and translations (3)
  • For each orientation calculate a score
  • Can be very time consuming depending on scoring function
  • Translational search often carried out in (2D or 3D) Fourier space by convolution of the grids
  • Examples:

– FFT methods: Z-DOCK, GRAMM, FTDOCK…
– Direct search: Bigger (uses fast boolean operations)
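
As a minimal sketch of the translational part of such a systematic search, the toy below scores every translation of a small 2D ligand on a receptor grid by brute force. The grids, the weights (`CORE`, `SURFACE`) and the function name `translational_scan` are illustrative assumptions, not taken from any of the programs listed. FFT-based methods obtain the same correlation for all translations at once through a convolution in Fourier space; this sketch deliberately uses the slow direct sum to show what is being computed.

```python
from itertools import product

# Illustrative weights: penalise overlap with the receptor core, reward surface contact.
CORE, SURFACE = -15, 1


def translational_scan(receptor, ligand, shifts):
    """Score every translation (dx, dy) of the ligand cells on the receptor grid.

    receptor: dict mapping (x, y) cells to weights; empty space is absent (weight 0).
    ligand:   iterable of occupied (x, y) cells.
    """
    return {
        (dx, dy): sum(receptor.get((x + dx, y + dy), 0) for x, y in ligand)
        for dx, dy in shifts
    }


# Toy receptor: a 3x3 core surrounded by a one-cell-thick surface layer.
receptor = {
    (x, y): CORE if max(abs(x), abs(y)) <= 1 else SURFACE
    for x, y in product(range(-2, 3), repeat=2)
}
ligand = [(0, 0), (1, 0)]                       # two-cell toy ligand
shifts = list(product(range(-4, 5), repeat=2))  # all translations in a 9x9 window

scores = translational_scan(receptor, ligand, shifts)
best_shift = max(scores, key=scores.get)
print(best_shift, scores[best_shift])
```

A full 6D search would repeat this scan for every sampled rotation of the ligand.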


Energy-driven search methods

  • Conformational search techniques aiming at minimizing some kind of energy function (e.g. VdW, electrostatic…):

– Energy minimization
– Molecular dynamics
– Brownian dynamics
– Monte-Carlo methods
– Genetic algorithms
– …

  • Often combined with some simulated annealing scheme
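
To illustrate one of the techniques above, here is a hedged sketch of a Metropolis Monte-Carlo search combined with a simulated annealing schedule. The one-dimensional double-well `energy` function, the geometric cooling schedule and all parameter values are assumptions chosen for the example; a docking program would use a molecular energy function over rotations, translations and internal coordinates.

```python
import math
import random


def energy(x):
    """Toy double-well energy with its global minimum near x = -1."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def anneal(x0, n_steps=2000, t_start=2.0, t_end=0.01, max_step=0.5, seed=0):
    """Metropolis Monte-Carlo with a geometric cooling schedule."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for step in range(n_steps):
        # Temperature decreases geometrically from t_start to t_end.
        temperature = t_start * (t_end / t_start) ** (step / (n_steps - 1))
        trial = x + rng.uniform(-max_step, max_step)
        e_trial = energy(trial)
        # Metropolis criterion: always accept downhill, sometimes accept uphill.
        if e_trial <= e or rng.random() < math.exp(-(e_trial - e) / temperature):
            x, e = trial, e_trial
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


best_x, best_e = anneal(x0=2.0)
print(best_x, best_e)
```

At high temperature uphill moves are accepted often, which lets the search escape the shallower well; as the temperature drops the walk settles into a minimum.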


Dealing with flexibility

  • Flexibility makes the docking problem harder!

– Increased number of degrees of freedom
– Scoring more difficult

  • Difficult to predict conformational changes a priori
  • Current docking methodology can mainly deal with small conformational changes
  • Treatment of flexibility depends on the chosen representation of the system and the search method

Scoring

  • The holy grail in docking!
  • Depends on the representation of the system and treatment of flexibility
  • Depends on the type of complexes

– e.g. antibody-antigen complexes might behave differently from enzyme-inhibitor complexes

Scoring

  • Score is often a combination of various (empirical) terms such as

– Intermolecular van der Waals energy
– Intermolecular electrostatic energy
– Hydrogen bonding
– Buried surface area
– Desolvation energy
– Entropy loss
– Amino-acid interface propensities
– Statistical potentials such as pairwise residue contact matrices
– …

  • Experimental filters sometimes applied a posteriori if data available (e.g. NMR chemical shift perturbations, mutagenesis, ...)

Clustering protein complexes

  • Docking methods often produce thousands of models.
  • Scoring functions do not perfectly describe the energy landscape.
  • Clustering groups similar structures together and allows better analysis.
  • Similarity is defined by a specific measure (e.g. RMSD, interface RMSD, FCC)
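
A minimal sketch of how models could be grouped once a pairwise similarity measure is available. It uses a simple greedy neighbour-count scheme, which is an assumption made for illustration and not necessarily the algorithm used by any particular docking program; the distance matrix `dist` is made-up data standing in for pairwise RMSD values.

```python
def cluster_models(dist, cutoff):
    """Greedy clustering: repeatedly take the model with the most neighbours
    within `cutoff` as a cluster centre and remove its neighbours from the pool."""
    unassigned = set(range(len(dist)))
    clusters = []
    while unassigned:
        # Neighbours of each remaining model (each model is its own neighbour).
        neighbours = {
            i: [j for j in unassigned if dist[i][j] <= cutoff] for i in unassigned
        }
        # Ties are broken in favour of the lowest model index.
        centre = max(sorted(neighbours), key=lambda i: len(neighbours[i]))
        members = sorted(neighbours[centre])
        clusters.append(members)
        unassigned -= set(members)
    return clusters


# Hypothetical pairwise distances (e.g. interface RMSD in Å) between five models.
dist = [
    [0.0, 0.8, 0.9, 9.0, 9.5],
    [0.8, 0.0, 0.7, 9.2, 9.1],
    [0.9, 0.7, 0.0, 8.8, 9.3],
    [9.0, 9.2, 8.8, 0.0, 0.6],
    [9.5, 9.1, 9.3, 0.6, 0.0],
]
clusters = cluster_models(dist, cutoff=1.0)
print(clusters)
```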


HADDOCK: An integrative modeling platform

  • Incorporates ambiguous and low-resolution data to aid the docking
  • Capable of docking up to 20 molecules (new version)
  • Symmetries can be leveraged
  • Allows for flexibility at the interface
  • Final flexible refinement in explicit solvent
  • One of the best performing software packages in CAPRI

http://www.bonvinlab.org/software

Data-driven docking with HADDOCK

[Figure: proteins A and B with labelled interface residues i, j, k and x, y, z]

  • List of interface residues for protein A
  • List of interface residues for protein B

Ambiguous Interaction Restraint: a residue must make contact with any residue from the other list.

A random fraction of the restraints (typically 50%) is deleted for each docking trial to deal with inaccuracies and errors in the information used.

Effective distance $d_{iAB}^{eff}$ calculated as

$$
d_{iAB}^{eff} = \left( \sum_{m_{iA}=1}^{N_{atoms}} \; \sum_{k=1}^{N_{resB}} \; \sum_{n_{kB}=1}^{N_{atoms}} \frac{1}{d_{m_{iA} n_{kB}}^{6}} \right)^{-\frac{1}{6}}
$$

where the sums run over all atoms $m$ of residue $i$ of protein A, all listed residues $k$ of protein B, and all atoms $n$ of residue $k$.

(Nilges & Brunger 1991)
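
The effective distance above can be sketched in a few lines. The function name `effective_distance` and the coordinates are hypothetical and only serve to illustrate the $r^{-6}$ summation; this is not HADDOCK code.

```python
import math


def effective_distance(atoms_i, partner_residues):
    """Effective distance of an ambiguous restraint: (sum of d**-6) ** (-1/6).

    atoms_i:          atom coordinates of residue i on protein A.
    partner_residues: list of residues of protein B, each a list of atom coordinates.
    """
    inv6_sum = 0.0
    for atom_a in atoms_i:
        for residue in partner_residues:
            for atom_b in residue:
                inv6_sum += math.dist(atom_a, atom_b) ** -6
    return inv6_sum ** (-1.0 / 6.0)


# Hypothetical coordinates (Å): one atom on A, two candidate partner residues on B.
residue_i = [(0.0, 0.0, 0.0)]
partners = [[(4.0, 0.0, 0.0)], [(0.0, 12.0, 0.0)]]
d_eff = effective_distance(residue_i, partners)
print(round(d_eff, 3))
```

Because of the $r^{-6}$ weighting the effective distance is dominated by the closest atom pair and is never larger than the shortest individual distance, so the restraint is satisfied as soon as any one of the listed partners is in contact.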


Searching the interaction space in HADDOCK

  • Experimental and/or predicted information is combined with an empirical force field into an energy function whose minimum is searched for
  • $V_{potential} = V_{bonds} + V_{angles} + V_{torsion} + V_{non-bonded} + V_{exp}$, where $V_{non-bonded}$ comprises the van der Waals and electrostatic terms
  • Search is performed by a combination of gradient-driven energy minimization and molecular dynamics simulations

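Classical mechanics

  • Molecular dynamics: generates successive configurations of the system by integrating Newton's second law

$$
\frac{d^{2}\vec{r}_i}{dt^{2}} = \frac{\vec{F}_i}{m_i} \quad \text{with} \quad \vec{F}_i = -\frac{\partial V}{\partial \vec{r}_i}
$$

[Figure: positions $\vec{r}(t_1)$, $\vec{r}(t_2)$, velocities $\vec{v}(t_1)$, $\vec{v}(t_2)$ and force $\vec{F}(t_1)$ along a trajectory at time steps $t_1$, $t_2$, $t_3$]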
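
As a sketch of what integrating Newton's second law means in practice, the example below propagates a single particle in a harmonic potential with the velocity Verlet scheme. The choice of integrator, the potential and all parameter values are assumptions for illustration; the slides do not state which integrator is used in HADDOCK.

```python
def force(x, k=1.0):
    """F = -dV/dx for the harmonic potential V = 0.5 * k * x**2."""
    return -k * x


def velocity_verlet(x, v, mass, dt, n_steps):
    """Generate successive (position, velocity) pairs by numerical integration."""
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
        trajectory.append((x, v))
    return trajectory


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


trajectory = velocity_verlet(x=1.0, v=0.0, mass=1.0, dt=0.01, n_steps=1000)
x_end, v_end = trajectory[-1]
print(x_end, v_end, total_energy(x_end, v_end))
```

The total energy stays close to its initial value of 0.5, which is a quick sanity check for the time step.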


HADDOCK docking protocol

Succession of energy minimization and molecular dynamics protocols reminiscent of NMR structure calculations, in three stages: it0, it1 and itw.

it0: Rigid-body energy minimization

  • Rigid-body energy minimization guided by restraints, for fast sampling. In the absence of data, define restraints between centers of mass.
  • Rotational and translational optimization of the interacting partners, guided by the data-driven energy function.
  • The rigid-body protocol allows generation of several thousand models in a short period of time.
  • Simultaneous docking of max. 6 molecules, resembling in vivo complex assembly (vs. sequential docking).
  • Typically, 10,000 conformations are sampled but only the best 1,000 are written to disk.

HADDOCK docking protocol

it1: Semi-flexible simulated annealing

  • Flexible simulated annealing in torsion angle space at the interface region: thorough optimization that reproduces small conformational changes.
  • 3-step process that increasingly allows more flexibility at the interface: rigid-body, side-chain, backbone + side-chain.
  • Flexibility reproduces conformational changes up to 2 Å, typical of small induced fit.
  • Typically, the 200 best models of it0 undergo refinement.
  • Torsion angle dynamics allows for larger integration time steps, while sampling relevant motions.

HADDOCK docking protocol

itw: Refinement in explicit solvent

  • Refinement in explicit solvent to optimize the contacts at the interface; it can also be used in isolation to refine and score existing models.
  • Short molecular dynamics simulation in explicit solvent to refine residue-residue contacts, mainly electrostatics, at the interface.
  • Position restraints on backbone heavy atoms ensure the conformation remains largely the same.
  • Explicit solvent models include TIP3P water and DMSO (membrane mimic).
  • Typically, all models of it1 are refined, i.e. there is no selection between it1 and itw.

HADDOCK & Flexibility

  • Several levels of flexibility:
  • Implicit:

– Docking from ensembles of structures
– Scaling down of intermolecular interactions

  • Explicit:

– Semi-flexible refinement stage with both side-chain and backbone flexibility during torsion angle dynamics
– Final refinement in explicit solvent

Energetics & Scoring

  • OPLS non-bonded parameters (Jorgensen, JACS 110, 1657 (1988))
  • 8.5 Å non-bonded cutoff, switching function, ε=10
  • Clustering of solutions
  • Ranking based on cluster-based HADDOCK score:

– Eair: ambiguous interaction restraint energy
– Edesolv: desolvation energy using Atomic Solvation Parameters (Fernandez-Recio et al., JMB 335, 843 (2004))
– BSA: buried surface area

Rigid: Score = 0.01 Eair + 0.01 EvdW + 1.0 Eelec + 1.0 Edesolv – 0.01 BSA
Flexible: Score = 0.1 Eair + 1.0 EvdW + 1.0 Eelec + 1.0 Edesolv – 0.01 BSA
Water: Score = 0.1 Eair + 1.0 EvdW + 0.2 Eelec + 1.0 Edesolv
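
The three stage-specific scores can be written as one weighted sum. The weights below are copied from the formulas above; the dictionary layout, the function name `haddock_score` and the energy values in `example_terms` are made up for illustration.

```python
# Weights per stage, as listed above (BSA enters with a negative sign).
WEIGHTS = {
    "rigid":    {"Eair": 0.01, "EvdW": 0.01, "Eelec": 1.0, "Edesolv": 1.0, "BSA": -0.01},
    "flexible": {"Eair": 0.1,  "EvdW": 1.0,  "Eelec": 1.0, "Edesolv": 1.0, "BSA": -0.01},
    "water":    {"Eair": 0.1,  "EvdW": 1.0,  "Eelec": 0.2, "Edesolv": 1.0, "BSA": 0.0},
}


def haddock_score(stage, terms):
    """Weighted sum of the energy terms for the given stage."""
    return sum(weight * terms[name] for name, weight in WEIGHTS[stage].items())


# Hypothetical energy terms for a single model.
example_terms = {"Eair": 10.0, "EvdW": -50.0, "Eelec": -200.0, "Edesolv": 5.0, "BSA": 1500.0}
stage_scores = {stage: haddock_score(stage, example_terms) for stage in WEIGHTS}
print(stage_scores)
```

Note how the same model is scored differently per stage: the rigid-body stage is dominated by electrostatics, while the water stage scales the electrostatic term down to 0.2 and drops the buried surface area.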

Performance of the HADDOCK team in the last CASP/CAPRI

CASP/CAPRI 2014

Slide courtesy of Marc Lensink and Shoshana Wodak

HADDOCK: High-Ambiguity Driven Docking

Data sources: SAXS, XL-MS, bioinformatics, mutagenesis, NMR

HADDOCK web portal

  • > 14,500 registered users
  • > 220,000 served runs since June 2008
  • > 45% on the GRID

Visit bonvinlab.org/software

De Vries et al., Nature Prot. 2010; Van Zundert et al., J. Mol. Biol. 2016

HADDOCK web portal

https://wenmr.science.uu.nl/user_map

HADDOCK development’s highlights

  • Extension to up to 20 molecules

Example of a complex protein structure calculated with the new HADDOCK framework: the box C/D enzyme for RNA methylation.

  • Complete rewrite of the portal (v2.4 to be released soon)
  • Provides support for cryo-EM data, coarse-graining, …

https://haddock.science.uu.nl/services/HADDOCK2.4

Partners and funding

BioExcel Centre of Excellence: Driving and Supporting Computational Biomolecular Research in Europe

HADDOCK forum in BioExcel: ask.bioexcel.eu

Bonvin Lab: Computational Structural Biology @ Utrecht University

  • DisVis: restraints visualization
  • PowerFit: cryo-EM map fitting
  • CPORT: interface prediction
  • Prodigy: affinity prediction
  • 3D-DART: DNA structure modelling
  • SpotON: hot-spot prediction
  • CS-Rosetta: chemical shift-based structure prediction

Iron Piracy: NMR-based modelling of the FusA-ferredoxin complex

  • Iron import machinery in gram-negative bacteria
  • First complete crystal structure of such a receptor

Docking strategy

  • NMR chemical shift perturbation experiments define the binding site on ferredoxin (which carries an iron-sulfur cluster) → active residues in HADDOCK
  • No info for FusA (except that the binding occurs in the extracellular part)

→ Extracellular loops defined as passive (which does not generate an energetic penalty if not contacted)
→ Definition of passive refined in a second docking run

Model of the FusA-ferredoxin complex


NMR Example: CSP-driven docking

  • Ub-cleaving enzyme: Josephin
  • Substrate: ubiquitin (figure labels: Lys-48, Lys-63, C-ter)
  • Which di-Ub linkage type is cleaved, K48 and/or K63 linkage?
  • Collaboration with Annalisa Pastore (London, MRC)

Nicastro et al., PLoS One, 2010

NMR Example: CSP-driven docking

Input for docking:

  • Catalytic triad
  • 2 binding sites on Josephin (binding site 1, binding site 2)

– CSP + mutation

  • FMD protocol

Nicastro et al., PLoS One, 2010

NMR Example: CSP-driven docking

[Figure: Lys48-linkage and Lys63-linkage di-ubiquitin models (Ub1, Ub2, C-ter); Ub reaction products (%) for K48 and K63]

Nicastro et al., PLoS One, 2010

Overview

  • Introduction
  • Information sources
  • General aspects of docking
  • Information-driven docking with HADDOCK
  • Incorporating biophysical data into docking

– Chemical shift perturbation data
– MS data as filters in docking

  • Assessing the interaction space
  • Multiple choices...
  • Conclusions & perspectives

MS-based modelling of a bacterial circadian clock machinery

Adrien Melquiond

Insight into cyanobacterial circadian timing: the KaiB-KaiC interaction

  • Circadian clock controlled by the Kai system consisting of three proteins: KaiA, KaiB and KaiC
  • Interactions define the phosphorylation status of KaiC and control the phase of the cycle

Information from MS:

  • From native MS: stoichiometry of the KaiB-KaiC complex (6:1)
  • From HD exchange: binding interface and allosteric effects upon binding

Snijder et al. PNAS 111, 1379 (2014)

The KaiB-KaiC interaction: HDX

  • HDX-MS data reveal one protected face on KaiB
  • Mutagenesis data show that R22, K67 and R74 abolish or alter the circadian rhythm when mutated

[Figure: KaiC]

The KaiB-KaiC interaction: CCS

  • Collision cross section (CCS) from MS allows filtering of the HADDOCK solutions
  • HADDOCK best-scoring/most populated solution: CII (Snijder et al. PNAS 111, 1379 (2014))
  • Recent cryo-EM model reveals CI as the true structure! (Snijder et al. Science 2017)
  • CCS misled us

Fooled by KaiB!

Recent structure of KaiB reveals a different fold for the low-populated monomeric form

“KaiB belongs to a rare class of so-called metamorphic proteins, which reversibly switch between different folds under native conditions. KaiB transitions from a highly populated, inactive tetrameric ground-state fold (KaiBgs) to a rare, active-state monomeric fold (KaiBfs)”

Tseng et al., Science 355, 2017

INTERMEZZO: coarse-grained (CG) docking with HADDOCK

Jorge Roel-Touris

  • All atom to coarse grain → CG docking → coarse grain to all atom
  • MARTINI 2.2p (De Jong et al., JCTC 2013)

Full 7-body 6:1 KaiB:KaiC docking

  • HDX + mutagenesis
  • C6 symmetry restraints
  • 7-body simultaneous docking with HADDOCK-CG
  • ~7-fold speed-up

HADDOCK score:

  • best CI model: -216 ± 13
  • best CII model: +45 ± 19

Now consistent with cryo-EM.

Independent validation:

  • Fitting in cryo-EM map using Chimera
  • Correlation score: 0.82 (vs 0.84 for EM model, PDB ID 5N8Y)


Distance-based information

Gydo van Zundert

  • Many experimental methods can provide sparse and possibly ambiguous distance information for the modelling of complexes
  • E.g. cross-links detected by MS provide distance restraints with an upper bound

Defining the information content and consistency of distance restraints

Given 2 interacting structures and a set of distance restraints between them, are there any solutions that satisfy N restraints?

  • A complex is a conformation where the subunits are interacting and the subunits are not clashing
  • A solution is a complex that satisfies all N distance restraints
  • The accessible interaction space is the set of all solutions satisfying at least N restraints

DisVis: re-using old tools to solve new problems

[Figure: receptor with core region and interaction region; ligand with core region]

  • Sample many conformations by a systematic, exhaustive 6D search (3 rotations and 3 translations; rigid-body FFT docking)
  • For each conformation, check whether it is a complex (at least one contact), and count them
  • For each complex, check how many and which restraints are obeyed, and count them
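
The counting step can be sketched as follows. This is not DisVis code: the real tool scans rotations and translations with FFTs, whereas this toy assumes the complexes have already been enumerated and that each one is described only by its restraint distances. The function names and the numbers in `distances` are hypothetical.

```python
def satisfied_restraints(distances, upper_bounds):
    """For each complex, the set of restraint indices whose distance is within its bound."""
    return [
        frozenset(i for i, d in enumerate(row) if d <= upper_bounds[i])
        for row in distances
    ]


def accessible_counts(satisfied, n_restraints):
    """Number of complexes consistent with at least N restraints, for N = 0..n_restraints."""
    return [sum(1 for s in satisfied if len(s) >= n) for n in range(n_restraints + 1)]


def violation_fraction(satisfied, n_restraints, at_least):
    """Per restraint, the fraction of complexes consistent with >= at_least restraints
    that violate it; a frequently violated restraint is a false-positive candidate."""
    selected = [s for s in satisfied if len(s) >= at_least]
    if not selected:
        return [0.0] * n_restraints
    return [
        sum(1 for s in selected if i not in s) / len(selected)
        for i in range(n_restraints)
    ]


# Hypothetical restraint distances (Å) for four complexes and three restraints.
upper_bounds = [30.0, 30.0, 30.0]
distances = [
    [10.0, 20.0, 50.0],
    [5.0, 5.0, 5.0],
    [40.0, 40.0, 40.0],
    [29.0, 31.0, 60.0],
]
satisfied = satisfied_restraints(distances, upper_bounds)
counts = accessible_counts(satisfied, n_restraints=3)
fractions = violation_fraction(satisfied, n_restraints=3, at_least=2)
print(counts, fractions)
```

In this made-up data set the third restraint is violated by half of the complexes that satisfy at least two restraints, which is the kind of signal used to flag a possible false positive.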

Visualizing the accessible interaction space

At every grid position, save the maximum number of consistent restraints found during the 6D search

[Figure: accessible interaction space consistent with at least 5 restraints, and with at least 7 restraints]

Case study: RNA-polymerase II

  • Two chains of RNA Polymerase II
  • Crystal structure available
  • 6 BS3 cross-links available
  • Molecular dynamics trajectory analysis: 30 Å max Lys-Lys distance (Cβ–Cβ)
  • Added 2 false-positive restraints

BS3: bis(sulfosuccinimidyl) suberate

RNA-polymerase II: Accessible interaction space

DisVis 6D systematic search with a 1 Å grid size and 5.27° interval

RNA-polymerase II: Detecting false-positive restraints

DisVis 6D systematic search with a 1 Å grid size and 5.27° interval

Interface residues from consistent solutions

Mapping the interface


DISVIS: grid, GPGPU-enabled web portal

http://milou.science.uu.nl/enmr/services/DISVIS/

Because of complex software dependencies we use Docker containers (indigodatacloudapps/disvis):

  • Python 2.7
  • NumPy 1.8+
  • SciPy
  • FFTW3
  • pyFFTW 0.10+
  • OpenCL 1.1+
  • pyopencl
  • clFFT
  • gpyfft

And, to avoid security issues on the grid side, udocker from INDIGO.

Mikael Trellet, Jörg Schaarschmidt

Van Zundert et al., J. Mol. Biol. (2017)

Guided interpretation of results

Not limited to MS cross-links

E2A-HPR mapping from unbound structures using 56 intermolecular NOEs (Wang et al., EMBO J 2000)

Conclusions - DISVIS

  • Visualization of the information content of distance restraints
  • Solely based on geometric considerations
  • Identification of possible false positives
  • Provides information about possible interfaces, valuable information to guide modelling
  • BUT: Does not account for conformational changes and energetics

Multiple choice...

  • Protein-DNA modelling
  • Solvated protein-DNA docking
  • HADDOCK’s adventures in CAPRI
  • Protein-peptide modelling
  • Modelling (N>2) and large conformational changes
  • Limits of homology modelling in docking
  • Solvated protein-protein docking
  • Protein-ligand modelling
  • Binding affinity in protein-protein interactions
  • A specificity riddle
  • Integrative modelling of a peptide nanocarrier
  • Fast and reliable rigid body docking in cryo-EM maps
  • HADDOCKing with cryo-EM data
  • Visualization of the interaction space defined by restraints
  • HADDOCKing with SAXS data
  • HADDOCKing with MS data
  • HADDOCKing with NMR residual dipolar couplings data
  • HADDOCKing with NMR pseudo contact shifts
  • iSee: ∆∆G of mutations
  • Graph-based scoring: iScore
  • HADDOCKing using co-evolution data
  • Antibody-antigen modelling

Multiple ligand HADDOCKing

  • NMR and modelling study of chicken liver bile acid binding protein (cL-BABP)
  • NMR-based HADDOCK modelling of a ternary complex based on:

– CSP data
– Relaxation data for cL-BABP
– Limited proteolysis
– STD data for the chenodeoxycholic acid
– NOE data from 2D (13C-15N) filtered in NOESY-HSQC

[Figure: cL-BABP with labels A to J, Helix I, Helix II, Loop CD, Loop EF, the open end, and the two ligands CDA I and CDA II]

Collaboration with Lucia Zetta (Milano) and Henriette Molinari (Verona)

Tomaselli et al., Proteins, 2007

Can deal with complex molecules

Fully flexible small molecule docking

HADDOCK-modelling of substrate binding in PagL, an outer-membrane enzyme involved in LPS modification

PagL

  • Deacylase (hydrolysis of acyl ester bond)
  • Activity found in S. typhimurium, B. bronchiseptica and P. aeruginosa
  • PagL homologues found in more than 10 bacterial species
  • Crystal structure solved in Utrecht
  • Only three residues conserved (Phe104, His126, Ser128)
  • Site-directed mutagenesis: serine hydrolase

Wietske Lambert, Lucy Vandeputte-Rutten, Piet Gros
Crystal and Structural Chemistry, Bijvoet Center, Utrecht University

PagL: serine hydrolase mechanism

[Figure: LPS (substrate), PagL catalytic triad (Glu/Asp, His, Ser) and PagL oxyanion hole]

Still open questions:

  • catalytic triad:

– His126, Ser128 (conserved)
– Glu140 or Asp106?

  • oxyanion hole:

– backbone nitrogens?
– semi-conserved Asn136?

Substrate recognition by PagL

Lipid X docking onto PagL

  • AIRs for docking:

– Reaction mechanism: carbonyl C of lipid X close to active site Ser of PagL; ester O of lipid X close to active site His of PagL
– Hydrophobicity: acyl chains of lipid X should be in the membrane

HADDOCK best solution

New insights from docking:

  • Lipid X acyl chains bind in well-defined grooves
  • Catalytic triad: Ser-His-Glu
  • Asp involved in specific (OH group) substrate recognition

PagL active site

[Figure: PagL active site with residues Gly, Ala, Asn, Asp, Ser, His, Glu and Phe, annotated with: oxyanion hole; active site; specificity for OH group of substrate; stabilizing acyl chain]

NMR-based ligand docking

Schieborr et al., ChemBioChem (2005)

  • Test on PTP1B

– With an apo form of the protein (1PTY)
– AIRs from Schwalbe’s group (Frankfurt)
– Simulated AIRs (5 Å or 10 Å)

HADDOCKing with real NMR data

  • Comparison with reference structure (1ECV)

– RMSD on ligand (10 best ranked structures): 0.8 ± 0.1 Å (NMR CSP data); 1.0 ± 0.4 Å (10 Å simulated); 0.9 ± 0.2 Å (5 Å simulated)
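
Values of this kind (a mean ligand RMSD with its spread over the best-ranked models) can be computed as in the sketch below. The coordinates are invented, and the RMSD is taken without any superposition, which assumes the models have already been fitted onto the receptor of the reference structure.

```python
import math
import statistics


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate lists."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Hypothetical ligand coordinates (Å) in the reference and in three docking models.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
models = [
    [(x + shift, y, z) for x, y, z in reference] for shift in (0.5, 1.0, 1.5)
]

values = [rmsd(model, reference) for model in models]
mean_rmsd = statistics.mean(values)
sd_rmsd = statistics.stdev(values)
print(f"{mean_rmsd:.1f} ± {sd_rmsd:.1f} Å")
```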

[Faculty of Science Chemistry]

HADDOCK can handle distance restraints (what most small molecule docking software surprisingly can’t)

[Faculty of Science Chemistry]

  • Blind experiments that test the performance and push the boundaries of biomolecular design and simulation in drug development.
  • Flagship experiment -> Grand Challenge 2 & 3
  • Two stages:
    – 1) pose prediction + ranking
    – 2) ranking/affinity prediction

https://drugdesigndata.org/

D3R -> Drug Design Data Resource

Zeynep Kurkcuoglu Panos Koukos

[Faculty of Science Chemistry]

  • Farnesoid X Receptor (FXR), also known as Nuclear Bile Acid Receptor.
  • Involved in bile acid homeostasis and cholesterol and glucose metabolism.
  • Pharmaceutical target for diabetes and dyslipidemia.

PDBID: 3DCT

The target

[Faculty of Science Chemistry]

GC2: The challenges

  • Deep and mostly hydrophobic pocket
  • Rather complex and diverse set of ligands (102 in total)
  • Flexible helices at the receptor proximal end

Zeid et al.

[Faculty of Science Chemistry]

  • Generate 3D conformers from the SMILES strings (using the OpenEye Omega Toolkit)
  • Cluster (hierarchically) the conformers and select representative structures
  • Ensemble docking in HADDOCK
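
The clustering step can be illustrated with a small, dependency-free sketch. It is not the protocol actually used for GC2: the linkage criterion (complete linkage), the 1 Å cutoff, the choice of the medoid as representative and the pairwise RMSD values are all assumptions made for the example.

```python
def complete_linkage_clusters(dist, cutoff):
    """Agglomerative clustering on a distance matrix.

    Repeatedly merges the two closest clusters (complete linkage) until the
    smallest inter-cluster distance exceeds `cutoff`.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        return max(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        pairs = [(linkage(a, b), ia, ib)
                 for ia, a in enumerate(clusters)
                 for ib, b in enumerate(clusters) if ia < ib]
        d, ia, ib = min(pairs)
        if d > cutoff:
            break
        clusters[ia] = clusters[ia] + clusters[ib]
        del clusters[ib]
    return clusters


def medoid(cluster, dist):
    """Cluster member with the smallest summed distance to the other members."""
    return min(cluster, key=lambda i: sum(dist[i][j] for j in cluster))


# Hypothetical pairwise conformer RMSDs (Å) for five conformers.
rmsd = [
    [0.0, 0.4, 0.5, 2.1, 2.3],
    [0.4, 0.0, 0.3, 2.0, 2.2],
    [0.5, 0.3, 0.0, 1.9, 2.4],
    [2.1, 2.0, 1.9, 0.0, 0.6],
    [2.3, 2.2, 2.4, 0.6, 0.0],
]
clusters = complete_linkage_clusters(rmsd, cutoff=1.0)
representatives = [medoid(c, rmsd) for c in clusters]
print(clusters, representatives)
```

The representatives, one per cluster, would then form the ligand ensemble passed to docking.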

GC2: Ligand generation

[Faculty of Science Chemistry]

  • BLAST apo protein sequence against the PDB
  • Remove problematic entries, cluster the proteins based on binding pocket backbone RMSD
  • Mutate as necessary, select representatives (4 for stage 1)
  • Binding pocket defined as the union of all the residues within 5Å of all the ligands
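
The binding pocket definition in the last bullet can be sketched as a set union. The data layout (residue id plus coordinates per protein atom, one coordinate list per ligand) and the toy coordinates are assumptions for illustration.

```python
import math


def pocket_residues(protein_atoms, ligands, cutoff=5.0):
    """Union of residue ids with any atom within `cutoff` Å of any atom of any ligand.

    protein_atoms: list of (residue_id, (x, y, z))
    ligands: list of ligands, each a list of (x, y, z) atom coordinates
    """
    pocket = set()
    for ligand in ligands:
        for resid, coord in protein_atoms:
            if any(math.dist(coord, lig_atom) <= cutoff for lig_atom in ligand):
                pocket.add(resid)
    return pocket


# Toy one-atom residues along the x axis and two one-atom "ligands".
protein_atoms = [(10, (0.0, 0.0, 0.0)), (11, (4.0, 0.0, 0.0)),
                 (12, (12.0, 0.0, 0.0)), (13, (20.0, 0.0, 0.0))]
ligands = [[(1.0, 0.0, 0.0)], [(9.0, 0.0, 0.0)]]
pocket = pocket_residues(protein_atoms, ligands)
print(sorted(pocket))
```

Residue 13 lies more than 5 Å from both ligands and is therefore left out of the pocket.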

GC2: Receptor for docking

[Faculty of Science Chemistry]

Some successful predictions

FXR-27: l-RMSD of 1.17Å
FXR-34: l-RMSD of 1.94Å

[Faculty of Science Chemistry]

Average best RMSD from 5 poses per ligand

Stage1: Pose prediction performance

[Faculty of Science Chemistry]

  • Lesson 1: Smart choice of receptor conformation is crucial
  • Lesson 2: Need to better select ligand conformations for docking
  • Lesson 3: Our docking protocol starts from randomly rotated, separated conformations
    -> need for better starting conformations

GC2: Main lessons

[Faculty of Science Chemistry]

D3R 2017 Grand Challenge 3

  • Protein (cathepsin) against 141 (24) small molecules

[Faculty of Science Chemistry]

Selecting the receptor

  • Identify templates with high sequence identity (>70%) to the target protein sequence with at least one bound ligand.
  • Compare the crystallographic ligand to the Cathepsin set using the Maximum Common Substructure (MCS) (as implemented in ChemmineR)
  • Select the receptor with the highest Tanimoto Coefficient (GC2 lesson 1)
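
A minimal sketch of the receptor choice, assuming the MCS sizes have already been computed elsewhere (the slides use ChemmineR for that step). The Tanimoto coefficient is taken here as shared atoms divided by the atoms in either ligand; the template names and atom counts are hypothetical.

```python
def mcs_tanimoto(n_atoms_a, n_atoms_b, n_atoms_mcs):
    """Tanimoto coefficient from atom counts: shared / (a + b - shared)."""
    return n_atoms_mcs / (n_atoms_a + n_atoms_b - n_atoms_mcs)


def pick_receptor(target_size, templates):
    """Return the template whose bound ligand is most similar to the target ligand.

    templates: {template_id: (template ligand size, MCS size with the target ligand)}
    """
    return max(templates,
               key=lambda t: mcs_tanimoto(target_size, templates[t][0], templates[t][1]))


# Hypothetical template ids and heavy-atom counts.
templates = {"tmplA": (30, 20), "tmplB": (28, 25), "tmplC": (40, 22)}
best = pick_receptor(32, templates)
print(best)
```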

[Faculty of Science Chemistry]

Selecting the receptor

  • For each ligand to dock, select the receptor with the most similar ligand

[Faculty of Science Chemistry]

Ligand conformation

  • Compare similarity of crystallographic template ligands with generated conformers using shape and color Tanimoto (as implemented in OpenEye ROCS)
  • Select 10 conformers with the highest combined score for ensemble docking (GC2 Lesson 2)
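
The conformer selection can be sketched as a simple ranking. The example assumes the combined score is the sum of the shape and color Tanimoto values and that these were computed beforehand; the conformer names and scores are invented, and only the top 2 are kept here instead of 10.

```python
def top_conformers(scores, n=10):
    """Rank conformers by shape + color Tanimoto and keep the best `n`.

    scores: {conformer_id: (shape_tanimoto, color_tanimoto)}
    """
    ranked = sorted(scores, key=lambda c: sum(scores[c]), reverse=True)
    return ranked[:n]


# Hypothetical similarity scores against one template ligand.
scores = {
    "conf1": (0.80, 0.30),
    "conf2": (0.65, 0.60),
    "conf3": (0.90, 0.45),
    "conf4": (0.50, 0.20),
}
selected = top_conformers(scores, n=2)
print(selected)
```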

[Faculty of Science Chemistry]

Starting conformations

  • Superpose the selected conformers on the crystallographic ligands (OpenEye shape-TK)
  • Refine using HADDOCK – only short minimization in 2nd stage and final refinement in explicit solvent (GC2 Lesson 3)
  • For stage 1b, receptor, water molecules and other small molecules kept rigid in place (not much impact on performance)

[Faculty of Science Chemistry]

Docking results

Stage 1 heavy-atom RMSD: 0.91Å, 1.75Å, 9.26Å

[Faculty of Science Chemistry]

Impact of ligand similarity

[Faculty of Science Chemistry]

Lessons learned!

GC3 – Excellent performance

[Faculty of Science Chemistry]

D3R 2018 Grand Challenge 4

  • Protein: beta secretase 1 (BACE) against 20 small molecules
  • Same strategy as GC3

[Faculty of Science Chemistry]

GC4: Excellent results again!

Template-based approach
Koukos et al. J. Comp. Aid. Mol. Des. 2019

Medians:

  • Best: 1.2Å
  • Top1: 1.53Å
  • Top5: 1.69Å
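
How such medians might be computed is sketched below. The slide does not define the three categories, so the reading used here is an assumption: "Best" is the lowest RMSD among the submitted poses of a ligand, "Top1" the RMSD of the first-ranked pose, and "Top5" the mean RMSD of the five top-ranked poses, each then summarized by the median over ligands. The RMSD values are invented.

```python
from statistics import mean, median


def summarize(poses_by_ligand):
    """Median pose-quality statistics over ligands.

    poses_by_ligand: {ligand: [RMSD of pose ranked 1, RMSD of pose ranked 2, ...]}
    """
    best = [min(r) for r in poses_by_ligand.values()]
    top1 = [r[0] for r in poses_by_ligand.values()]
    top5 = [mean(r[:5]) for r in poses_by_ligand.values()]
    return {"Best": median(best), "Top1": median(top1), "Top5": median(top5)}


# Hypothetical RMSDs (Å) for five ranked poses of three ligands.
rmsds = {
    "lig1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "lig2": [2.0, 1.0, 2.0, 1.0, 4.0],
    "lig3": [0.5, 0.5, 0.5, 0.5, 0.5],
}
summary = summarize(rmsds)
print(summary)
```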

[Faculty of Science Chemistry]

GC4: Excellent results again!

[Faculty of Science Chemistry]

  • By analogy to our protein-protein binding affinity (BA) predictor, we developed an atomic contact-based predictor for GC2 stage 2

Binding affinity prediction

5.5 Å

[Faculty of Science Chemistry]

  • An atomic contact-based predictor
  • Trained on the 2P2I dataset
  • HADDOCK refinement to calculate the intermolecular energy terms (elec, vdw)
  • The atomic contacts (ACs) between the ligands and the proteins were calculated using a 10.5Å cutoff
  • $\Delta G_{score} = 0.343794 \cdot E_{elec} - 0.037597 \cdot AC_{CC} + 0.138738 \cdot AC_{NN} + 0.160043 \cdot AC_{OO} - 3.088861 \cdot AC_{XX} + 187.011384$
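
The contact counting and the linear scoring function can be sketched together as follows. The coefficients are transcribed from the slide's formula; everything else is an assumption for illustration: atoms are classed as C, N, O or X (any other element), a contact is any protein-ligand atom pair within the 10.5 Å cutoff, and the coordinates and the electrostatic energy are made-up values rather than HADDOCK output.

```python
import math
from collections import Counter


def element_class(element):
    """Map an element symbol to C, N, O, or X (anything else). Assumed classification."""
    return element if element in ("C", "N", "O") else "X"


def contact_counts(protein_atoms, ligand_atoms, cutoff=10.5):
    """Count protein-ligand atom pairs within `cutoff` Å, keyed by class pair (e.g. "CC").

    Atoms are (element, (x, y, z)) tuples.
    """
    counts = Counter()
    for p_el, p_xyz in protein_atoms:
        for l_el, l_xyz in ligand_atoms:
            if math.dist(p_xyz, l_xyz) <= cutoff:
                counts[element_class(p_el) + element_class(l_el)] += 1
    return counts


def dg_score(e_elec, ac):
    """Linear score combining the electrostatic energy and four contact counts."""
    return (0.343794 * e_elec - 0.037597 * ac["CC"] + 0.138738 * ac["NN"]
            + 0.160043 * ac["OO"] - 3.088861 * ac["XX"] + 187.011384)


# Toy system: three protein atoms, three ligand atoms (coordinates in Å).
protein = [("C", (0.0, 0.0, 0.0)), ("N", (3.0, 0.0, 0.0)), ("O", (30.0, 0.0, 0.0))]
ligand = [("C", (1.0, 0.0, 0.0)), ("N", (2.0, 0.0, 0.0)), ("S", (4.0, 0.0, 0.0))]
counts = contact_counts(protein, ligand)
score = dg_score(-10.0, counts)  # -10.0 is a placeholder electrostatic energy
print(dict(counts), round(score, 6))
```

Mixed pairs such as CN or CX are counted but do not enter the score, since the formula only uses the CC, NN, OO and XX terms.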

Binding affinity prediction

Nevia Citro

[Faculty of Science Chemistry]

Ligand binding affinity/ranking prediction

  • Nothing changed between GC2 and GC3
  • Structure-based prediction using an atomic contact model (see our GC2 paper)

Kurkcuoglu et al. J. Comp. Aid. Mol. Des. 2017
[Figure panels: GC2, GC3]

[Faculty of Science Chemistry]

PRODIGY-Ligand web server

Vangone et al. Bioinformatics 35, 1585–1587 (2019). https://nestor.science.uu.nl/prodigy/

[Faculty of Science Chemistry]

Ligand binding affinity/ranking prediction

  • Ligand-based prediction using target-specific ligand similarity SVR model (see our GC2 paper)

Kurkcuoglu et al. J. Comp. Aid. Mol. Des. 2017
[Figure panels: GC2, GC3]

[Faculty of Science Chemistry]

Conclusions

  • HADDOCK can handle small ligands
  • Especially interesting when experimental information is available

  • D3R as a catalyst for learning from failures
  • Success factors (for us):

    – Smart selection of receptor and ligand conformations
    – Smart positioning of ligand in binding pocket

Kurkcuoglu et al. J. Comp. Aid. Mol. Des. 2017

[Faculty of Science Chemistry]

  • Use docking for fragment-based drug/2P2I design
  • E.g.: neuraminidase

– Ab initio docking with 18 different fragments

Perspectives

Inge Clemens (6-VWO, final-year high-school research project)

[Faculty of Science Chemistry]

  • Use docking for fragment-based drug/2P2I design
  • E.g.: neuraminidase

    – Correctly identifies the binding pocket
    – Flexibility allows the sampling of new pockets

Perspectives

Multiple choice...

  • Protein-DNA modelling
  • Solvated protein-DNA docking
  • HADDOCK’s adventures in CAPRI
  • Protein-peptide modelling
  • Modelling (N>2) and large conformational changes
  • Limits of homology modelling in docking
  • Solvated protein-protein docking
  • Protein-ligand modelling
  • Binding affinity in protein-protein interactions
  • A specificity riddle
  • Integrative modelling of a peptide nanocarrier
  • Fast and reliable rigid body docking in cryo-EM maps
  • HADDOCKing with cryo-EM data
  • Visualization of the interaction space defined by restraints
  • HADDOCKing with SAXS data
  • HADDOCKing with MS data
  • HADDOCKing with NMR residual dipolar couplings data
  • HADDOCKing with NMR pseudo contact shifts
  • iSee: ∆∆G of mutations
  • Graph-based scoring: iScore
  • HADDOCKing using co-evolution data
  • Antibody-antigen modelling

Overview

  • Introduction
  • Information sources
  • General aspects of docking
  • Information-driven docking with HADDOCK
  • Incorporating biophysical data into docking
  • Modelling protein-ligand interactions
  • Modelling from cryo-EM data
  • Assessing the interaction space
  • Multiple choices...
  • Conclusions & perspectives

[Faculty of Science Chemistry]

  • (Information-driven) docking is useful to generate models of biomolecular complexes, even when little information is available
  • While such models may not be fully accurate, they provide working hypotheses and can still be sufficient to explain and drive the molecular biology behind the system under study
  • … and with a little bit of effort they can be validated!
  • Information-driven docking is complementary to classical structural methods

Conclusions

Acknowledgments:

the CSB group@UU

Funding: VICI, TOP-PUNT, WeNMR, West-Life, EGI-Engage, INDIGO-Datacloud, BioExcel CoE, EOSC-Hub

[Faculty of Science Chemistry]

http://www.capri-docking.org

[Faculty of Science Chemistry]

New integrative models PDB repo

  • mmCIF dictionary extended for integrative models:
    https://github.com/ihmwg/IHM-dictionary
  • PDB-Dev web site accepting models:
    https://pdb-dev.wwpdb.org

HADDOCK online:

  • http://haddock.science.uu.nl
  • http://bonvinlab.org/software
  • http://ask.bioexcel.eu

Thank you for your attention!
