PEPSI-DOCK: A Detailed Data-Driven Protein-Protein Interaction Potential Accelerated By Polar Fourier Correlations. MACARON Workshop, March 21st 2017. Emilie Neveu, Dave Ritchie, Petr Popov, Sergei Grudinin, Nano-D & Capsid INRIA Teams.


SLIDE 1

PEPSI-DOCK

A Detailed Data-Driven Protein-Protein Interaction Potential Accelerated By Polar Fourier Correlations

Emilie Neveu, Dave Ritchie, Petr Popov, Sergei Grudinin
Nano-D & Capsid INRIA Teams
MACARON Workshop, March 21st 2017

SLIDE 2

Workshop MACARON March 2017

ABOUT PROTEINS

Definition: a protein is a (long) chain of amino acids (aa).

[Figure: peptide backbone diagram with atoms N, C, H, O; two linked amino acids (aa), each with a side chain]

20 possible aa

SLIDE 3

ABOUT PROTEINS

Definition: a protein is a (long) chain of amino acids (aa).

[Figure: peptide backbone diagram with atoms N, C, H, O; two linked amino acids (aa), each with a side chain]

20 possible aa

Representation: “cartoon”-like 3D structure

  • stable pieces: helices, parallel sheets
  • flexible pieces: structures not well defined

SLIDE 4

ABOUT DOCKING

Structure Prediction

Protein - ligand
Protein - receptor

[Figure: candidate docked structures numbered 1, 2, 3, …, N]

Whitehead, Timothy A., et al. Nature biotechnology 30.6 (2012)

SLIDE 5

ABOUT DOCKING

Structure Prediction: why so important?

Protein - ligand
Protein - receptor

[Figure: candidate docked structures numbered 1, 2, 3, …, N]

[Figure: influenza virus (scale bar 100 nm), hemagglutinin protein, inhibitor]

Whitehead, Timothy A., et al. Nature biotechnology 30.6 (2012)

SLIDE 6

ABOUT DOCKING

Since 2001, a community-wide experiment: CAPRI (Critical Assessment of PRedicted Interactions)

SLIDE 7

ABOUT DOCKING

Since 2001, a community-wide experiment: CAPRI (Critical Assessment of PRedicted Interactions)

  • 1. Interaction energy to score/assess the structures

$\Delta G_{\text{bind}} = \Delta H - T\,\Delta S$

where $\Delta G_{\text{bind}}$ is the interaction energy, $\Delta H$ the enthalpy, $\Delta S$ the entropy and $T$ the temperature.

[Figure: ligand and receptor]
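
As a quick numerical illustration of the formula above, here is a minimal sketch. The function name and the input values are hypothetical and were chosen only to make the arithmetic easy to follow; they are not taken from the talk.

```python
def binding_free_energy(delta_h: float, delta_s: float, temperature: float) -> float:
    """Return dG_bind = dH - T * dS.

    Units are an assumption of this sketch: dH in kJ/mol,
    dS in kJ/(mol*K), temperature in K.
    """
    return delta_h - temperature * delta_s


# Hypothetical values: favourable enthalpy, unfavourable entropy.
dg = binding_free_energy(delta_h=-50.0, delta_s=-0.1, temperature=300.0)
print(f"dG_bind = {dg:.1f} kJ/mol")
```

A negative result means binding is favourable; here the entropy term offsets part of the enthalpy gain.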

SLIDE 8

ABOUT DOCKING

Since 2001, a community-wide experiment: CAPRI (Critical Assessment of PRedicted Interactions)

  • 1. Interaction energy to score/assess the structures
  • 2. Search algorithm + set of parameters

SLIDE 9

ABOUT DOCKING

Since 2001, a community-wide experiment: CAPRI (Critical Assessment of PRedicted Interactions)

  • 1. Interaction energy to score/assess the structures
  • 2. Search algorithm + set of parameters
  • 3. Multilevel approach: selection of top solutions; restart with higher resolution

SLIDE 10

ABOUT DOCKING

Since 2001, a community-wide experiment: CAPRI (Critical Assessment of PRedicted Interactions)

  • 1. Interaction energy to score/assess the structures
  • 2. Search algorithm + set of parameters
  • 3. Multilevel approach: selection of top solutions; restart with higher resolution

Starring:

  • ZDock: zdock.umassmed.edu
  • HexDock: hex.loria.fr/hex.php
  • ClusPro: cluspro.bu.edu
  • AutoDock: autodock.scripps.edu
  • RosettaDock: rosie.rosettacommons.org/ligand_docking
  • DOCK: dock.compbio.ucsf.edu
  • and many others…

SLIDE 11

PEPSI-DOCK

Polynomial Expansions of Protein Structures and Interactions for Docking

GOAL: To improve the first level: large and global search space

SLIDE 12

PEPSI-DOCK

Polynomial Expansions of Protein Structures and Interactions for Docking

GOAL: To improve the first level: large and global search space

Simple but accurate interaction energy approximation

  • SVM-based algorithm to learn the atomistic potentials
  • physically interpretable features: number densities of site-site pairs at a given distance
  • arbitrarily shaped atomistic distance-dependent interaction potentials

Popov, P., & Grudinin, S. (2015). Knowledge of Native Protein–Protein Interfaces Is Sufficient To Construct Predictive Models for the Selection of Binding Candidates. J. Chem. Info. Model.
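
The "number densities of site-site pairs at a given distance" can be pictured as one distance histogram per pair of atom types. The sketch below is only an illustration of that idea: the function name, the cutoff, the bin count and the toy coordinates are all assumptions, and no normalisation is applied, so the output is raw counts rather than true densities.

```python
from itertools import combinations_with_replacement

import numpy as np


def pair_distance_histograms(rec_xyz, rec_types, lig_xyz, lig_types,
                             n_types, r_max=10.0, n_bins=10):
    """Count receptor-ligand atom pairs per (type pair, distance bin)."""
    pairs = list(combinations_with_replacement(range(n_types), 2))
    index = {p: k for k, p in enumerate(pairs)}
    hist = np.zeros((len(pairs), n_bins))
    # All receptor-ligand distances, shape (n_rec, n_lig).
    dist = np.linalg.norm(rec_xyz[:, None, :] - lig_xyz[None, :, :], axis=-1)
    bins = (dist // (r_max / n_bins)).astype(int)
    for i, ti in enumerate(rec_types):
        for j, tj in enumerate(lig_types):
            if bins[i, j] < n_bins:  # ignore pairs beyond the cutoff
                hist[index[tuple(sorted((ti, tj)))], bins[i, j]] += 1
    return hist


# Toy example with 2 hypothetical atom types (0 and 1).
rec_xyz = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
lig_xyz = np.array([[0.0, 0.0, 3.5], [0.0, 4.5, 0.0]])
hist = pair_distance_histograms(rec_xyz, [0, 1], lig_xyz, [1, 1], n_types=2)
print(hist.shape)
```

Each row of `hist` is one type pair, in the order (0, 0), (0, 1), (1, 1); each column is a 1 Å distance bin.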

SLIDE 13

PEPSI-DOCK

Polynomial Expansions of Protein Structures and Interactions for Docking

GOAL: To improve the first level: large and global search space

Simple but accurate interaction energy approximation

Fast exploration

  • rigid bodies assumption
  • spherical Fourier correlation: complexity from O(N^9) to O(N^6 log N)

D.W. Ritchie, D. Kozakov, and S. Vajda, Hex code

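
To see where an FFT saves work, here is a one-dimensional toy analogue. It is not the spherical polar Fourier scheme used by Hex. It only shows the general trick: a correlation score for every shift can be obtained from a product in Fourier space instead of a nested loop. The function names are illustrative.

```python
import numpy as np


def correlate_direct(f, g):
    """Circular cross-correlation by brute force: O(N^2) operations."""
    n = len(f)
    return np.array([sum(f[(k + s) % n] * g[k] for k in range(n))
                     for s in range(n)])


def correlate_fft(f, g):
    """Same scores for all N shifts at once via FFT: O(N log N) operations."""
    return np.fft.ifft(np.fft.fft(f) * np.conj(np.fft.fft(g))).real


rng = np.random.default_rng(0)
f = rng.normal(size=16)
g = rng.normal(size=16)
print(np.allclose(correlate_direct(f, g), correlate_fft(f, g)))
```

In this toy case the saving is one factor of N replaced by log N along a single search dimension.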
SLIDE 14

PEPSI-DOCK

Polynomial Expansions of Protein Structures and Interactions for Docking

GOAL: To improve the first level: large and global search space

  • Simple but accurate interaction energy approximation
  • Fast exploration
  • Sparse representation in Gauss-Laguerre basis

SLIDE 15

  • 1. Features extraction
  • 2. Sparse representation in Gauss-Laguerre basis
  • 3. Optimisation
  • 4. Stored 210 atomistic distance-dependent potentials
  • 5. From 1D to 3D
  • 6. Fast exploration of the search space
  • 7. Ranked docking predictions

SLIDE 16

1 & 2 - Features Extraction / Sparse representation

Detailed description of 1-D interactions at the interface

  • 195 native non-redundant complexes: 1-D native distributions of atom pairs / distance
  • 40 000 generated false complexes: 1-D non-native distributions of atom pairs / distance

from ITScore Training Set [Zou Lab, University of Missouri Columbia]

SLIDE 17

1 & 2 - Features Extraction / Sparse representation

Detailed description of 1-D interactions at the interface

  • 195 native non-redundant complexes: 1-D native distributions of atom pairs / distance
  • 40 000 generated false complexes: 1-D non-native distributions of atom pairs / distance

from ITScore Training Set [Zou Lab, University of Missouri Columbia]

20 different atom types, 210 interactions
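
The figure of 210 interactions is consistent with counting unordered pairs of the 20 atom types, same-type pairs included: 20 × 21 / 2 = 210. That reading is an assumption, since the slide gives only the two numbers. A short check:

```python
from itertools import combinations_with_replacement

n_atom_types = 20
# Unordered pairs (i, j) with i <= j, so (C, N) and (N, C) count once.
type_pairs = list(combinations_with_replacement(range(n_atom_types), 2))
n_interactions = len(type_pairs)
print(n_interactions)  # 210
```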

SLIDE 18

1 & 2 - Features Extraction / Sparse representation

Detailed description of 1-D interactions at the interface

  • 195 native non-redundant complexes: 1-D native distributions of atom pairs / distance
  • 40 000 generated false complexes: 1-D non-native distributions of atom pairs / distance

from ITScore Training Set [Zou Lab, University of Missouri Columbia]

20 different atom types, 210 interactions

Sparse representation

  • in a Gauss-Laguerre polynomial basis scaled to describe distributions up to 30 Å
  • about 6300 geometric features for each native and non-native complex (feature vector $v_c$)
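
A rough picture of what "sparse representation in a Laguerre-type basis" means: a 1-D function is replaced by a short vector of expansion coefficients. The sketch below uses the standard orthonormal Laguerre functions exp(-x/2) L_n(x) on [0, ∞). This is a simplification chosen for illustration, not the scaled basis of PEPSI-Dock, and the function names are hypothetical. As a side note, 6300 features over 210 interactions would correspond to 30 coefficients per interaction; that is arithmetic on the slide's numbers, not a figure stated in the talk.

```python
import numpy as np
from numpy.polynomial.laguerre import laggauss, lagval


def laguerre_function(n, x):
    """Orthonormal Laguerre function phi_n(x) = exp(-x/2) * L_n(x)."""
    coef = np.zeros(n + 1)
    coef[n] = 1.0
    return np.exp(-x / 2.0) * lagval(x, coef)


def laguerre_coefficients(f, n_coef, n_nodes=20):
    """Project f onto phi_0 .. phi_{n_coef-1} by Gauss-Laguerre quadrature."""
    x, w = laggauss(n_nodes)   # nodes and weights for the weight exp(-x)
    fx = f(x) * np.exp(x)      # cancel the built-in quadrature weight
    return np.array([np.sum(w * fx * laguerre_function(n, x))
                     for n in range(n_coef)])


# Test function built from two basis functions, so only two
# coefficients should be non-zero.
f = lambda x: laguerre_function(0, x) + 2.0 * laguerre_function(2, x)
coeffs = laguerre_coefficients(f, n_coef=5)
print(np.round(coeffs, 6))
```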

SLIDE 19

3 - Optimisation

Optimal discrimination between native and non-native interfaces

Convex optimisation problem: Find $w$ and $b_c$ that minimise

$$\min_{w,\,b_c}\; \underbrace{\frac{\lambda}{2}\,\|w\|_2^2}_{\text{prevents overfitting}} \;+\; \underbrace{\gamma \sum_c \log\left(1 + e^{\,y_c\,(w^T v_c + b_c)/\gamma}\right)}_{\text{penalises misclassification}}$$

  • $v_c$, $y_c$: features and classifier labels, known ($y_c = 1$ or $y_c = -1$)
  • $w$, $b_c$: hyperplane separator, estimated
  • normal vector $w$: 1-D interaction potentials

[Figure: native complexes (⦿) and associated false complexes, separated by a hyperplane with a margin]

Knowledge of Native Protein–Protein Interfaces Is Sufficient To Construct Predictive Models for the Selection of Binding Candidates. Popov, Grudinin, 2015, J Chem Info Model.
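
The objective on this slide can be evaluated directly with NumPy. This is a minimal sketch that follows the formula as it appears in the transcript, including the sign inside the exponential. The data are random placeholders, the assignment of labels to native and false complexes is not specified here, and nothing below is taken from the authors' implementation.

```python
import numpy as np


def objective(w, b, V, y, lam, gamma):
    """lam/2 * ||w||^2 + gamma * sum_c log(1 + exp(y_c * (w.v_c + b_c) / gamma)).

    V has one row of features v_c per complex; b holds one offset b_c each.
    """
    z = y * (V @ w + b) / gamma
    # logaddexp(0, z) = log(1 + exp(z)), evaluated without overflow.
    return 0.5 * lam * np.dot(w, w) + gamma * np.sum(np.logaddexp(0.0, z))


rng = np.random.default_rng(0)
V = rng.normal(size=(4, 3))              # 4 hypothetical complexes, 3 features
y = np.array([1.0, -1.0, -1.0, -1.0])    # hypothetical labels
value_at_zero = objective(np.zeros(3), np.zeros(4), V, y, lam=1.0, gamma=1.0)
print(value_at_zero)
```

With `w` and `b` at zero every term equals log 2, so the printed value is 4·log 2, which gives a simple sanity check before handing the function to an optimiser.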

SLIDE 20

4 - 210 atom-atom distance-dependent interaction potentials

[Figure: atom-atom distance-dependent interaction potential $w$ for the pair N+ with O-; axis ticks from 2 to 12]

210 interactions; precision

SLIDE 21

5 - Pre-processing before docking

Linear sum of atom-atom convolution with potentials and densities

$$E = \sum_{\text{pairwise interactions } ij}\; \sum_{R_i} \sum_{L_j} \iiint_V f_{ij}(x - x_{R_i})\, g(x - x_{L_j})\, dV$$

Representation with truncated polynomial expansion

$$\iiint_V f_{ij}(r)\, g(r - x_{L_j})\, dV = \sum_{nlm} \underbrace{(R.T.w)_{nlm}}_{=\, f^{ij}_{nlm}} \cdot\, g_{nlm}$$

  • 1. z-translation
  • 2. θ-rotation
  • 3. φ-rotation

[Figure: two coordinate frames (x, y, z), with receptor atom position $x_{R_i}$ and coefficients $f_{n00}$, $f_{nl0}$, $f_{nlm}$]
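
The second identity on this slide rests on a general property of orthonormal bases: an overlap integral of two functions equals the dot product of their expansion coefficients. The sketch below checks this in one dimension with orthonormal Legendre polynomials on [-1, 1]. That basis is a stand-in chosen for simplicity; the talk uses a 3-D spherical polar basis indexed by n, l, m. The two test functions are arbitrary.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval


def orthonormal_legendre(n, x):
    """Legendre polynomial P_n scaled to unit norm on [-1, 1]."""
    coef = np.zeros(n + 1)
    coef[n] = 1.0
    return np.sqrt((2 * n + 1) / 2.0) * legval(x, coef)


def expansion_coefficients(func, n_coef, n_nodes=16):
    """Coefficients of func in the orthonormal basis, by Gauss-Legendre quadrature."""
    x, w = leggauss(n_nodes)
    return np.array([np.sum(w * func(x) * orthonormal_legendre(n, x))
                     for n in range(n_coef)])


f = lambda x: 1.0 + x        # stands in for a potential
g = lambda x: x + x**2       # stands in for a density
f_n = expansion_coefficients(f, 4)
g_n = expansion_coefficients(g, 4)
energy_from_coefficients = float(np.dot(f_n, g_n))
print(energy_from_coefficients)
```

The exact integral of (1 + x)(x + x²) over [-1, 1] is 4/3, and the coefficient dot product reproduces it, because both functions lie inside the truncated basis. For functions outside it, the truncation introduces an approximation error.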

SLIDE 22

6 - Exploration of the search space: the Hex engine

Rigid body assumption

The energy depends on the rigid positions of the proteins: $E(R, \beta_A, \gamma_A, \beta_B, \gamma_B, \alpha_B)$

  • 1 translation and 5 rotations to adjust
  • discretised to enable exhaustive search

$R \in [0 : 1 : 40\ \text{Å}]$, $\alpha \in [0 : 7.5 : 360^\circ]$, $(\beta, \gamma) \in [0 : 7.5 : 180^\circ]^2$

SLIDE 23

6 - Exploration of the search space: the Hex engine

Rigid body assumption

The energy depends on the rigid positions of the proteins: $E(R, \beta_A, \gamma_A, \beta_B, \gamma_B, \alpha_B)$

  • 1 translation and 5 rotations to adjust
  • discretised to enable exhaustive search

$R \in [0 : 1 : 40\ \text{Å}]$, $\alpha \in [0 : 7.5 : 360^\circ]$, $(\beta, \gamma) \in [0 : 7.5 : 180^\circ]^2$

Fast exhaustive search

Truncated expressions using spherical Fourier correlation: complexity from O(N^9) to O(N^6 log N), 10^9 poses in ~ 10 min

Accelerating and Focusing Protein-Protein Docking Correlations Using Multi-Dimensional Rotational FFT Generating Functions. D.W. Ritchie, D. Kozakov, and S. Vajda (2008). Bioinformatics. 24 1865-1873.
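
The size of the discretised search space follows from the grid steps given on this slide. The count below assumes half-open ranges (the upper bound is excluded), which the slide does not specify, and it assumes that β and γ are sampled on the same grid for both proteins A and B.

```python
import numpy as np

# Half-open grids: an assumption, the slide gives only start : step : stop.
R = np.arange(0.0, 40.0, 1.0)         # translation in Å
alpha = np.arange(0.0, 360.0, 7.5)    # alpha_B in degrees
beta = np.arange(0.0, 180.0, 7.5)     # beta_A and beta_B in degrees
gamma = np.arange(0.0, 180.0, 7.5)    # gamma_A and gamma_B in degrees

# One translation, one alpha, and a (beta, gamma) pair for each protein.
n_poses = len(R) * len(alpha) * (len(beta) * len(gamma)) ** 2
print(f"{n_poses:.2e} poses")
```

Under these assumptions the grid holds about 6.4 × 10^8 poses, the same order of magnitude as the 10^9 poses quoted on the slide.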

SLIDE 24

7 - Ranked predictions

Test on 88 complexes from the Docking Benchmark Set v5.0 for which the separation distance ≤ 30 Å

Docking Benchmark Set = the only existing benchmark to compare different docking algorithms [Hwang, Vreven, Janin, Weng, 2010]

Comparison on v4.0

[Figure: Success Rate, Top 10 for I-RMS ≤ 2.5 Å]
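
One plausible reading of "Top 10 for I-RMS ≤ 2.5 Å" is: a complex counts as a success if at least one of its ten best-ranked poses has an interface RMSD of at most 2.5 Å. The sketch below computes a success rate under that assumption. The function name and all numbers are made up for illustration and are not results from the talk.

```python
def top_n_success_rate(ranked_irmsd, n_top=10, threshold=2.5):
    """Fraction of complexes with a pose within `threshold` Å among the first `n_top`.

    ranked_irmsd: one list per complex, giving the interface RMSD of each
    predicted pose in ranking order (best-scored pose first).
    """
    hits = sum(any(v <= threshold for v in poses[:n_top])
               for poses in ranked_irmsd)
    return hits / len(ranked_irmsd)


# Hypothetical I-RMS values for four complexes.
ranked_irmsd = [
    [8.1, 2.0, 15.3],      # hit at rank 2
    [12.0, 9.4, 7.7],      # no pose under the threshold
    [1.5],                 # hit at rank 1
    [9.0] * 10 + [1.0],    # good pose only at rank 11: not counted
]
rate = top_n_success_rate(ranked_irmsd)
print(rate)  # 0.5
```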

SLIDE 25

7 - Ranked predictions

Running time of PEPSI-Dock measured on a modern laptop

Docking of 10^9 poses in less than 10 min on a laptop, compared with ~ weeks for a 1 μs MD simulation

[Figure: Computational Time (min) versus Nb of atoms in the complex, for stages 5 (preprocessing), 6 (docking), 7 (sorted list) and the total 5 + 6 + 7]

SLIDE 26

PEPSI-DOCK

Polynomial Expansions of Protein Structures and Interactions for Docking

An automatic docking algorithm for the first stage of the docking pipeline

Novelty: arbitrarily shaped, distance-dependent potentials combined with an FFT search sampling technique

SLIDE 27

PEPSI-DOCK

Polynomial Expansions of Protein Structures and Interactions for Docking

An automatic docking algorithm for the first stage of the docking pipeline

Novelty: arbitrarily shaped, distance-dependent potentials combined with an FFT search sampling technique

TO DO

  • 1. Improve unbound predictions: use other training set

SLIDE 28

PEPSI-DOCK

Polynomial Expansions of Protein Structures and Interactions for Docking

An automatic docking algorithm for the first stage of the docking pipeline

Novelty: arbitrarily shaped, distance-dependent potentials combined with an FFT search sampling technique

  • Bound sets: high-rank predictions!
  • Large distances: ⚠ loss of precision
  • Unbound sets: results similar to SwarmDock or ZDOCK
  • Adaptation to other types of interactions

TO DO

  • 1. Improve unbound predictions: use other training set

SLIDE 29

PEPSI-DOCK

Polynomial Expansions of Protein Structures and Interactions for Docking

An automatic docking algorithm for the first stage of the docking pipeline

Novelty: arbitrarily shaped, distance-dependent potentials combined with an FFT search sampling technique

  • Bound sets: high-rank predictions!
  • Large distances: ⚠ loss of precision
  • Unbound sets: results similar to SwarmDock or ZDOCK
  • Adaptation to other types of interactions

TO DO

  • 1. Improve unbound predictions: use other training set
  • 2. Deal with the docking of large proteins: use other sampling

SLIDE 30

PEPSI-DOCK

Polynomial Expansions of Protein Structures and Interactions for Docking

PEPSI-Dock, Neveu et al., Bioinformatics, 2016

SLIDE 31

PEPSI-DOCK

Polynomial Expansions of Protein Structures and Interactions for Docking

PEPSI-Dock, Neveu et al., Bioinformatics, 2016

https://www.samson-connect.net

SLIDE 32

THANKS

  • Sergei Grudinin, Petr Popov: Nano-D team, INRIA Grenoble
  • David Ritchie: Capsid Team, INRIA Nancy

ANR

https://www.samson-connect.net

SLIDE 33

ANY QUESTIONS?