Empirical scoring functions for docking and virtual screening

Strasbourg Summer School on Chemoinformatics
Strasbourg, June 23-27, 2014

Empirical scoring functions for docking and virtual screening
Fundamentals, challenges and trends

Christoph Sotriffer
Institute of Pharmacy and Food Chemistry, University of Würzburg
Am Hubland, D-97074 Würzburg

Key questions in structure-based drug design
• Given a protein: Where is the binding site?
• Given a binding site and a ligand structure: What is the structure of the complex? What is the energy of interaction?
• Given a binding site: What is a suitable, tight-binding ligand?
→ requires some sort of affinity prediction or scoring

Scoring functions: Tasks and types
Application tasks:
A) Determination of the correct binding mode for a given ligand (pose prediction in docking)
B) Identification and ranking of new ligands (virtual screening)
C) Affinity prediction for compound series (ligand design, lead optimization)
Available approaches:
• Force field-based methods
• Knowledge-based scoring functions
• Empirical scoring functions

Force field-based methods
Molecular Mechanics (MM):
• atoms → charged spheres
• bonds → springs
• classical potentials
• no electrons → no bond formation / cleavage
• typically parameterized to reproduce the molecular potential energy surface (→ conformational ΔH in the gas phase!)
Scoring protein-ligand complexes:
+ for pose prediction in docking
– for ligand ranking by affinity
Terms accounting for (de)solvation and entropic factors are required (cf. MM-PBSA).
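The "charged spheres and springs" picture of molecular mechanics can be made concrete with a minimal sketch. It assumes a harmonic bond term, a Coulomb term and a 12-6 Lennard-Jones term; the function names and all parameter values are illustrative and not taken from any specific force field. Note that it contains no solvation or entropy contribution, which is the gap the slide points out.

```python
import math

# Converts q1*q2/r (elementary charges, Angstrom) to kcal/mol.
COULOMB_CONSTANT = 332.0637

def bond_energy(r, r0, k):
    """Harmonic 'spring' energy of a bond of length r around its rest length r0."""
    return k * (r - r0) ** 2

def nonbonded_energy(r, q1, q2, epsilon, r_min, dielectric=1.0):
    """Coulomb plus 12-6 Lennard-Jones energy of one atom pair at distance r."""
    coulomb = COULOMB_CONSTANT * q1 * q2 / (dielectric * r)
    ratio6 = (r_min / r) ** 6
    lennard_jones = epsilon * (ratio6 ** 2 - 2.0 * ratio6)
    return coulomb + lennard_jones

def interaction_energy(protein_atoms, ligand_atoms, epsilon=0.1, r_min=3.5):
    """Sum pair energies over all protein-ligand atom pairs.

    Atoms are (x, y, z, charge) tuples. Using one epsilon and r_min for every
    pair is a simplification made only for this illustration.
    """
    total = 0.0
    for px, py, pz, pq in protein_atoms:
        for lx, ly, lz, lq in ligand_atoms:
            r = math.dist((px, py, pz), (lx, ly, lz))
            total += nonbonded_energy(r, pq, lq, epsilon, r_min)
    return total

# Toy example with two protein atoms and one ligand atom (made-up values).
toy_protein = [(0.0, 0.0, 0.0, -0.5), (4.0, 0.0, 0.0, 0.3)]
toy_ligand = [(0.0, 3.5, 0.0, 0.4)]
toy_energy = interaction_energy(toy_protein, toy_ligand)
```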

Knowledge-based scoring functions
Derivation from crystal-structure data:
$$P_{ij}(r) = -\ln \frac{g_{ij}(r)}{g_{\mathrm{ref}}}$$
• $P_{ij}$: distance-dependent pair potential
• $g_{ij}$: frequency distribution of atom-atom contacts
• $g_{\mathrm{ref}}$: reference distribution
No experimental affinities used!
[Figure: frequency of occurrence g(r) and the resulting statistical potential, each plotted against the distance r [Å] for example atom pairs.]

Empirical scoring functions
Regression-based:
$$\mathrm{p}K_i = \sum_n (\mathrm{p}K_i)_n \, f_n(\text{structure})$$
• $\mathrm{p}K_i$: affinity
• $f_n(\text{structure})$: structure descriptors
• $(\mathrm{p}K_i)_n$: weighting factors, determined via regression analysis (MLR, PLS)
Data: experimental structures and experimental binding affinities
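The knowledge-based pair potential, P_ij(r) = -ln(g_ij(r)/g_ref), can be sketched as a conversion of binned contact counts into a potential. This is a minimal illustration, assuming both distributions are given as histograms over the same distance bins; the pseudocount and the toy counts are assumptions of this sketch, not part of the slides.

```python
import math

def pair_potential(contact_counts, reference_counts, pseudocount=1e-6):
    """Return P_ij(r) = -ln(g_ij(r) / g_ref(r)) for each distance bin.

    Both histograms are normalized to frequencies first. The pseudocount
    only avoids log(0) for empty bins.
    """
    total = sum(contact_counts)
    reference_total = sum(reference_counts)
    potential = []
    for n_ij, n_ref in zip(contact_counts, reference_counts):
        g_ij = n_ij / total + pseudocount
        g_ref = n_ref / reference_total + pseudocount
        potential.append(-math.log(g_ij / g_ref))
    return potential

# Toy histogram: contacts are enriched in the last bin relative to a flat reference,
# so the potential is negative (favourable) there and positive in the depleted bins.
toy_potential = pair_potential([10, 30, 60], [1, 1, 1])
```

Distances that occur more often than in the reference get a negative value, which is how such a potential rewards frequently observed contacts without using any measured affinity.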

The prototype: SCORE1 (Böhm, 1994)

Affinity prediction on generic data sets
Scoring function performance 2004, or: the "large-test-set" shock …
Correlation of the scoring value with affinity for a test set of 800 known complexes: for most functions r < 0.50 (r² < 0.25)
Wang et al., J. Chem. Inf. Comp. Sci. 44 (2004), 2114

Affinity prediction on generic data sets
Scoring function performance 2004, or: the "large-test-set" shock …
Correlation with affinity for a test set of 800 known complexes: for most functions r < 0.50 (r² < 0.25)
• poor correlation for generic data sets
• hardly possible to obtain correct ranking
• of limited use for ligand optimization
Wang et al., J. Chem. Inf. Comp. Sci. 44 (2004), 2114

How to improve empirical scoring functions?
Regression-based:
$$\mathrm{p}K_i = \sum_n (\mathrm{p}K_i)_n \, f_n(\text{structure})$$
with affinity $\mathrm{p}K_i$, structure descriptors $f_n$, and weighting factors $(\mathrm{p}K_i)_n$ determined via regression analysis (MLR, PLS)
Development options:
• training sets
• descriptors
• regression methods
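To illustrate how the weighting factors are obtained by multiple linear regression (MLR), here is a minimal sketch that solves the least-squares normal equations in plain Python. The descriptor values and affinities are invented toy numbers, and the function names are illustrative. A real application would use an established statistics library and handle collinear descriptors, which this sketch does not.

```python
def fit_mlr(descriptors, affinities):
    """Least-squares fit of pKi = w0 + sum_n w_n * f_n via the normal equations.

    Returns [w0, w1, ..., wk]. Assumes the descriptors are not collinear;
    a singular system would cause a division by zero here.
    """
    rows = [[1.0] + list(row) for row in descriptors]  # intercept column first
    size = len(rows[0])
    # Augmented system (X^T X | X^T y)
    system = [
        [sum(r[a] * r[b] for r in rows) for b in range(size)]
        + [sum(r[a] * y for r, y in zip(rows, affinities))]
        for a in range(size)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(size):
        pivot = max(range(col, size), key=lambda i: abs(system[i][col]))
        system[col], system[pivot] = system[pivot], system[col]
        pivot_value = system[col][col]
        system[col] = [v / pivot_value for v in system[col]]
        for i in range(size):
            if i != col:
                factor = system[i][col]
                system[i] = [vi - factor * vc for vi, vc in zip(system[i], system[col])]
    return [row[-1] for row in system]

def predict(weights, descriptor_row):
    """Apply the fitted linear scoring function to one complex."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], descriptor_row))

# Toy data generated from pKi = 2.0 + 0.5*f1 - 0.3*f2 (values are made up).
toy_descriptors = [(1, 0), (0, 1), (2, 3), (4, 1), (3, 5)]
toy_affinities = [2.5, 1.7, 2.1, 3.7, 2.0]
toy_weights = fit_mlr(toy_descriptors, toy_affinities)
```

Because the toy affinities were generated without noise, the fit recovers the generating weights; with experimental data the residual scatter is what the statistical parameters of a scoring function summarize.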

The SFCscore approach
• Training sets: SFC (Scoring Function Consortium), data collection from public and industry sources, up to 855 complexes with affinity data → larger training set
• Descriptors: → additional descriptors
• Regression method: → MLR + PLS
→ SFCscore

Example: SFCscore function "sfc_290m"
$$
\begin{aligned}
\mathrm{p}K_i ={} & -(\mathrm{p}K_i)_1 \cdot \text{n\_rot\_bonds} + (\mathrm{p}K_i)_2 \cdot \text{neutral\_H\_bonds} + (\mathrm{p}K_i)_3 \cdot \text{metal\_interaction} \\
& + (\mathrm{p}K_i)_4 \cdot \text{AHPDI} + (\mathrm{p}K_i)_5 \cdot \text{ring-ring\_interaction} + (\mathrm{p}K_i)_6 \cdot \text{ring-metal\_interaction} \\
& + (\mathrm{p}K_i)_7 \cdot \text{total\_buried\_surface} + (\mathrm{p}K_i)_8
\end{aligned}
$$
Statistical parameters for the training set, in comparison with SCORE1:
Function (training set)    R      R²     s     F     Q²     s_PRESS
sfc_290m (n = 290)         0.843  0.711  1.09  99.2  0.692  1.12
SCORE1 (n = 45)            0.873  0.762  1.40  32.1  0.696  1.67
Sotriffer et al., Proteins 73 (2008), 395
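The reported statistical parameters can be cross-checked against each other. The sketch below uses the standard regression F-statistic, F = (R²/k) / ((1 - R²)/(n - k - 1)), with k descriptors and n complexes. For sfc_290m, k = 7 follows from the seven descriptor terms listed above; k = 4 for SCORE1 is an assumption of this sketch (four descriptor terms plus a constant).

```python
def f_statistic(r_squared, n_complexes, n_descriptors):
    """F-statistic of a linear regression with an intercept."""
    explained = r_squared / n_descriptors
    unexplained = (1.0 - r_squared) / (n_complexes - n_descriptors - 1)
    return explained / unexplained

# R squared should match the square of the reported R values.
r_squared_sfc = round(0.843 ** 2, 3)     # 0.711
r_squared_score1 = round(0.873 ** 2, 3)  # 0.762

f_sfc = f_statistic(0.711, 290, 7)       # about 99.1 (reported: 99.2)
f_score1 = f_statistic(0.762, 45, 4)     # about 32.0 (reported: 32.1), assuming k = 4
```

The small differences from the reported F values are consistent with R² being rounded to three decimals. The comparison also shows why the much larger training set matters: a similar R² on 290 complexes is statistically far better supported than on 45.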

Scoring function performance: 2009 benchmark
Correlation of scores with experimental binding affinities
Test set compiled by Cheng et al., 2009: 195 PDBbind complexes (65 targets)
[Figure: bar chart of the Pearson correlation coefficient R_P for the functions tested by Cheng et al. 2009 (J. Chem. Inf. Model. 49 (2009), 1079) and for the SFCscore functions (Zilian & Sotriffer, J. Chem. Inf. Model. 53 (2013), 1923). Labelled bar values: 0.644 and 0.587.]
Some known limitations of SFCscore:
• data set issues (IC50 etc.)
• implicit model assumptions (i.e., functional form of descriptors, linear regression techniques)

Random Forest for scoring functions
Addressing these limitations …
• Training sets: growth of PDBbind → 1005 complexes with Ki data (not overlapping with the Cheng and CSAR test sets)
• Regression methods: non-parametric machine-learning methods (not imposing any particular functional form), Random Forest in particular

Random Forest for scoring functions
First scoring function trained with Random Forest: RF-Score (Ballester & Mitchell, Bioinformatics 2010)
• Training set: 1105 PDBbind complexes
• Descriptors: counts of protein-ligand atom-type pair contacts within 12 Å
• 9 ligand atom types (C, N, O, S, P, F, Cl, Br, I) combined with 4 protein atom types (C, N, O, S) → 36 pairs
→ each complex is characterised by a vector of 36 contact counts
RF-Score yields a much higher R_P for the Cheng test set!
BUT: Do pure contact counts capture the physicochemical interaction features sufficiently well?

Random Forest for scoring functions: SFCscore RF
Use the SFCscore descriptors to train a Random Forest model!
SFCscore RF
• Training set: 1005 PDBbind complexes
• Descriptors: 63 SFCscore descriptors
Test set (Cheng): R_P = 0.779, RMSE = 1.56
[Figure: relative descriptor importance, measured as the increase of the mean squared error when randomly permuting the descriptor values.]
Zilian & Sotriffer, J. Chem. Inf. Model. 53 (2013), 1923
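The contact-count descriptors of RF-Score can be sketched in a few lines. This is a minimal illustration and not the original implementation; it assumes atoms are given as (element, coordinates) tuples and that a contact is any protein-ligand atom pair closer than the 12 Å cutoff. The toy coordinates are invented.

```python
import math
from itertools import product

PROTEIN_ELEMENTS = ("C", "N", "O", "S")
LIGAND_ELEMENTS = ("C", "N", "O", "S", "P", "F", "Cl", "Br", "I")
PAIRS = list(product(PROTEIN_ELEMENTS, LIGAND_ELEMENTS))  # 4 x 9 = 36 pairs

def contact_counts(protein_atoms, ligand_atoms, cutoff=12.0):
    """Return the 36-element vector of element-pair contact counts.

    Atoms are (element, (x, y, z)) tuples; elements outside the two lists
    above are ignored.
    """
    counts = {pair: 0 for pair in PAIRS}
    for p_element, p_xyz in protein_atoms:
        for l_element, l_xyz in ligand_atoms:
            pair = (p_element, l_element)
            if pair in counts and math.dist(p_xyz, l_xyz) < cutoff:
                counts[pair] += 1
    return [counts[pair] for pair in PAIRS]

# Toy complex: three of the four atom pairs lie within 12 Angstrom.
toy_protein = [("C", (0.0, 0.0, 0.0)), ("N", (20.0, 0.0, 0.0))]
toy_ligand = [("O", (3.0, 0.0, 0.0)), ("Cl", (11.0, 0.0, 0.0))]
toy_features = contact_counts(toy_protein, toy_ligand)
```

The resulting vector carries no explicit chemistry such as hydrogen-bond geometry, which is the point behind the question of whether pure contact counts are sufficient.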

Scoring function performance
Correlation of scores with experimental binding affinities
Test set compiled by Cheng et al., 2009: 195 PDBbind complexes (65 targets)
[Figure: bar chart of the Pearson correlation coefficient R_P for the functions tested by Cheng et al. 2009, the SFCscore functions, and the RF functions. Labelled bar values: 0.644 and 0.587, and for the RF functions 0.776 and 0.779 (SFCscore RF). Zilian & Sotriffer, J. Chem. Inf. Model. 53 (2013), 1923]

Applicability domain of SFCscore RF
Why does SFCscore RF outperform the other SFCscore functions?
[Figure: Cheng test set complexes shown against the SFCscore RF training data and against the sfc_229m training data.]
→ better coverage of training-set region
Knowing in advance the best SFCscore function for each individual complex would lead to R_P = 0.93, RMSE = 1.03
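The two metrics used in these benchmarks, R_P and RMSE, and the "best function per complex" thought experiment can be sketched as follows. The predictions are invented toy numbers and the function names are illustrative. The oracle selection needs the measured affinity, so it gives an upper bound on what per-complex function selection could achieve rather than a usable method.

```python
import math

def pearson_r(predicted, measured):
    """Pearson correlation coefficient (undefined for constant inputs)."""
    n = len(measured)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    covariance = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    spread_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    spread_m = math.sqrt(sum((m - mean_m) ** 2 for m in measured))
    return covariance / (spread_p * spread_m)

def rmse(predicted, measured):
    """Root-mean-square error between predictions and measurements."""
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))

def oracle_predictions(predictions_by_function, measured):
    """For each complex, keep the prediction closest to the measured value."""
    oracle = []
    for index, target in enumerate(measured):
        candidates = [preds[index] for preds in predictions_by_function.values()]
        oracle.append(min(candidates, key=lambda value: abs(value - target)))
    return oracle

# Toy example with two hypothetical scoring functions and four complexes.
toy_measured = [5.0, 6.0, 7.0, 8.0]
toy_predictions = {
    "function_a": [5.5, 7.5, 6.0, 8.2],
    "function_b": [4.0, 6.1, 7.3, 9.5],
}
toy_oracle = oracle_predictions(toy_predictions, toy_measured)
```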

Scoring function performance
One more generic test set: CSAR-NRC HiQ (2010)
Correlation of scores with experimental binding affinities
CSAR-NRC HiQ evaluation set: 332 complexes
Dunbar et al., J. Chem. Inf. Model. 51 (2011), 2036; Smith et al., J. Chem. Inf. Model. 51 (2011), 2115
Performance across 17 core methods:
• R_P in the range 0.35 – 0.76 (only 3 > 0.65)
• RMSE in the range 2.99 – 1.51 (pKd units)
• correlation with heavy atom count: R_P = 0.51
SFCscore RF: R_P = 0.73, RMSE = 1.53 (pKd units)

Where are the limits?
Inherent experimental error limits the possible correlation between scores and measured affinity. R_P is limited to:
• ~0.91 when fitting to the data set without overparameterizing
• ~0.83 when scoring the data set with a method trained on outside data
(estimate based on error with σ = 1.0 log K)
Dunbar et al., J. Chem. Inf. Model. 51 (2011), 2146
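Why experimental error caps the achievable correlation can be seen in a simplified model. This is an illustration only and not the estimation procedure of Dunbar et al. It assumes that each measured affinity equals the true affinity plus independent error with standard deviation σ_err, and that the score predicts the true affinity perfectly. The correlation with the measured values is then at most sqrt(1 - σ_err²/σ_obs²), where σ_obs is the standard deviation of the measured affinities. The σ_obs values below are hypothetical.

```python
import math

def max_pearson(sigma_error, sigma_observed):
    """Upper bound on R_P for a perfect predictor of the true affinity.

    sigma_error: standard deviation of the experimental error (log K units)
    sigma_observed: standard deviation of the measured affinities (log K units)
    """
    if sigma_observed <= sigma_error:
        raise ValueError("observed spread must exceed the experimental error")
    return math.sqrt(1.0 - (sigma_error / sigma_observed) ** 2)

# With sigma_error = 1.0 log K, the ceiling rises as the data set spans a wider range.
ceilings = {spread: max_pearson(1.0, spread) for spread in (1.5, 2.0, 2.5, 3.0)}
```

In this model a data set with a narrow affinity range has a low ceiling regardless of the scoring function, which is one reason correlations on single-target series are harder to achieve than on generic sets.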

Scoring function performance
What about individual targets?
Leave-Cluster-Out (LCO) validation: target-dependent performance
[Figure: RMSE and correlation coefficient R_P per target cluster. Zilian & Sotriffer, J. Chem. Inf. Model. 53 (2013), 1923]
BUT: Somewhat artificial setup …
Out-of-bag (OOB) predictions for the HIV-protease class (n = 97): R_P = 0.60, RMSE = 1.26
[Figure: training set and HIV-protease set.]
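The leave-cluster-out scheme can be sketched as a split generator: all complexes of one target cluster are held out for testing, and the model is trained on the remaining complexes. The cluster labels are assumed to be given (for example from a sequence-similarity clustering); the labels used here are invented.

```python
from collections import defaultdict

def leave_cluster_out_splits(cluster_labels):
    """Yield (cluster, train_indices, test_indices), one split per cluster."""
    members = defaultdict(list)
    for index, label in enumerate(cluster_labels):
        members[label].append(index)
    for label, test_indices in members.items():
        train_indices = [
            index for index, other in enumerate(cluster_labels) if other != label
        ]
        yield label, train_indices, test_indices

# Toy example with four complexes from three hypothetical target clusters.
toy_labels = ["HIV protease", "trypsin", "HIV protease", "kinase"]
toy_splits = list(leave_cluster_out_splits(toy_labels))
```

Holding out a whole target class removes every related complex from training, which is the "somewhat artificial" aspect noted above: in practice some data on the target of interest is often available.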

Scoring function performance
What about individual targets and docked ligands?
The CSAR 2012 challenge
Example: ERK2 test set, ~40 compounds for docking and affinity ranking
Rather poor results for most groups:
• median R_P = 0.37
• best: 0.66
• SFCscore RF: 0.49
Major problem: binding-mode prediction!
[Figure: results based on 12 crystal structures released later.]
Damm-Ganamet et al., J. Chem. Inf. Model. 53 (2013), 1853
