
Incorporating Molecular Flexibility into Three-Dimensional Structural Kernels

Center for Bioinformatics Tübingen. Incorporating Molecular Flexibility into Three-Dimensional Structural Kernels. Andreas Jahn. 4th German Conference on Chemoinformatics, 10.11.2008, Goslar. Computer Science Department, Computer Architecture.


  1. Center for Bioinformatics Tübingen
     Incorporating Molecular Flexibility into Three-Dimensional Structural Kernels
     Andreas Jahn
     4th German Conference on Chemoinformatics, 10.11.2008, Goslar
     Computer Science Department • Computer Architecture • Prof. Zell

  2. Introduction & Motivation
     “… enzyme and substrate must fit each other like a lock and key.” (Emil Fischer, 1894)
     “Form follows function.” (Louis H. Sullivan, 1896)
     ⇒ Activity is a function of the 3D structure.
     • The 3D structure is not unique due to the flexibility of the compounds.
     • What are the possible 3D structures?
     • Possible solution: conformational sampling.
     • But: time-consuming and non-deterministic.
     ⇒ Try to encode the flexibility and possible shape into the data structure.

  3. Basics: Optimal Assignment Kernel
     Source: QSAR Comb. Sci., 2006, 25, 4, 317-323
     • Atom-based similarity measure.
     • RBF kernel calculates the local atom similarity using atom and bond descriptors.
     • Incorporates the local neighbourhood:
       $w(i, j) = K_{RBF}(i, j) + \gamma_1 K_{RBF}(i_1, j_1) + \gamma_1 K_{RBF}(i_2, j_1) + \gamma_2 K_{RBF}(i_3, j_2) + \dots$
     • The atom-wise similarity acts as the weight of an edge in a complete bipartite graph.
     • Choose the edges that maximize the sum of the edge weights.
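The edge-selection step on this slide can be illustrated with a small sketch. It is a brute-force search over all assignments, so it is only practical for tiny matrices, and the function name `optimal_assignment` and the toy weights are assumptions made for illustration; they do not come from the talk.

```python
from itertools import permutations


def optimal_assignment(weights):
    """Return (best_sum, pairs) for the assignment with the largest weight sum.

    weights[i][j] is the similarity between atom i of one molecule and atom j
    of the other. Each atom of the smaller molecule is matched to a distinct
    atom of the larger one. Brute force, for illustration only.
    """
    n_rows, n_cols = len(weights), len(weights[0])
    transposed = n_rows > n_cols
    if transposed:
        # Work on the transpose so that rows always index the smaller side.
        weights = [list(col) for col in zip(*weights)]
        n_rows, n_cols = n_cols, n_rows

    best_sum, best_pairs = float("-inf"), []
    for cols in permutations(range(n_cols), n_rows):
        total = sum(weights[i][j] for i, j in enumerate(cols))
        if total > best_sum:
            best_sum, best_pairs = total, list(enumerate(cols))

    if transposed:
        best_pairs = [(j, i) for i, j in best_pairs]
    return best_sum, best_pairs


# Hypothetical atom-wise similarities: 2 atoms against 3 atoms.
toy = [[0.9, 0.2, 0.1],
       [0.3, 0.8, 0.4]]
print(optimal_assignment(toy))
```

In practice a polynomial-time solver such as the Hungarian method replaces the permutation loop; the brute-force version only makes the objective explicit.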

  4. Basics: Two problems of the Optimal Assignment Kernel
     Source: J.-P. Vert, Technical Report HAL-00218278, 2008
     • The Optimal Assignment Kernel is not a valid kernel function.
       ⇒ Fix the kernel matrix with $K \leftarrow K - \lambda_{min} I$.
     • No consideration of the flexibility and the shape of the structures.
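The matrix fix named on this slide can be sketched as follows. This is a minimal sketch assuming NumPy is available; the function name `shift_to_psd`, the toy matrix, and the choice to shift only when the smallest eigenvalue is negative are assumptions, not details from the talk.

```python
import numpy as np


def shift_to_psd(K):
    """Apply K <- K - lambda_min * I so that no eigenvalue is negative.

    Assumption: the shift is only applied when lambda_min < 0; a matrix that
    is already positive semidefinite is returned unchanged.
    """
    K = np.asarray(K, dtype=float)
    lam_min = np.linalg.eigvalsh(K).min()  # eigvalsh: symmetric matrices
    if lam_min < 0:
        K = K - lam_min * np.eye(K.shape[0])
    return K


# Hypothetical symmetric similarity matrix with one negative eigenvalue.
K = np.array([[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.9],
              [0.1, 0.9, 1.0]])
print(np.linalg.eigvalsh(K).min())                # negative before the shift
print(np.linalg.eigvalsh(shift_to_psd(K)).min())  # approximately zero after
```

Subtracting a negative `lam_min` adds a positive constant to the diagonal, which moves every eigenvalue up by the same amount.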

  5. Methods - Overview
     Two different methods were implemented:
     • OAK FLEX
       • Encode the neighbourhood flexibility space relative to an atom.
       • Determine the similarity of the flexibility spaces.
       • Incorporate the similarity of the flexibility spaces into the Optimal Assignment Kernel.
     • Rigid Superposition
       • Identify rigid scaffolds of the structures.
       • Superpose the rigid fragments and determine a similarity score.
       • Integrate the similarity score into the Optimal Assignment Kernel.

  6. Rigid Superposition
     • A rule-based expert system identifies rigid scaffolds of the structures.
     • Calculate all pairwise similarity values.
     • Calculate the optimal assignment of the fragments.
     Similarity of fragments (rows: molecule a, columns: molecule b):
                 F #1      F #2
       F #1    23.662     8.676
       F #2     6.599    19.262
     Optimal assignment: Fragment #1 ↔ Fragment #1 (23.662), Fragment #2 ↔ Fragment #2 (19.262).
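For the 2×2 fragment matrix on this slide there are only two possible assignments, so the optimum can be checked by hand. The comparison below uses only the four similarity values given on the slide:

```latex
\begin{align*}
\text{F\#1} \leftrightarrow \text{F\#1},\ \text{F\#2} \leftrightarrow \text{F\#2}: &\quad 23.662 + 19.262 = 42.924 \\
\text{F\#1} \leftrightarrow \text{F\#2},\ \text{F\#2} \leftrightarrow \text{F\#1}: &\quad 8.676 + 6.599 = 15.275
\end{align*}
```

Since 42.924 > 15.275, the diagonal assignment is the optimal one, which matches the two values highlighted on the slide.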

  7. Rigid Superposition
     • Superposition of the assigned fragments.
     • Calculate a similarity score based on the overlap volume.
     • Integrate the information into the Optimal Assignment Kernel.

  8. OAK FLEX: Encode the neighbourhood flexibility space relative to an atom
     • The flexibility space results from rotatable bonds.
       ⇒ All single bonds outside of a ring generate flexibility spaces.
     • The flexibility space of the whole molecule is important.
       ⇒ For each atom, the relative flexibility space has to be enumerated.
     • Flexibility spaces have to be comparable.
       ⇒ A unique parameterisation of the space is necessary.

  9. OAK FLEX: Flexibility space and the unique parameterisation
     • The core atom acts as origin.
     • Parameterisation of the flexibility relative to the core atom.
     • The 1st-order rotation is parameterised by d1 and r1.
     [Figure labels: core atom, neighbour n1, rotatable bond, neighbour n2]

  10. OAK FLEX: Enumeration of the 1st-order rotations
     • Depth-limited search with a depth limit of 2.
     • Prune subtrees after a rigid bond.
     [Figure: example molecular graph with atoms numbered 1 to 11]
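One possible reading of the depth-limited search with pruning is sketched below. The slide does not specify the data structure or what exactly is recorded per rotation, so the adjacency format, the function name `enumerate_paths`, the `max_rigid` parameter, and the toy molecule are all assumptions. The sketch records every path from the core atom and stops extending a path once the depth limit is reached or the allowed number of rigid bonds has been traversed.

```python
def enumerate_paths(bonds, core, max_depth=2, max_rigid=1):
    """Depth-limited search from `core` over a molecular graph.

    bonds maps an atom id to a list of (neighbour, is_rotatable) tuples.
    A path is no longer extended once it is `max_depth` bonds long or has
    traversed `max_rigid` rigid bonds (assumed meaning of "prune subtrees
    after a rigid bond").
    """
    paths = []

    def visit(path, rigid_seen):
        if len(path) - 1 >= max_depth or rigid_seen >= max_rigid:
            return  # depth limit reached, or subtree pruned
        for neighbour, rotatable in bonds[path[-1]]:
            if neighbour in path:
                continue  # do not walk back or around a ring
            new_path = path + [neighbour]
            paths.append(new_path)
            visit(new_path, rigid_seen + (0 if rotatable else 1))

    visit([core], 0)
    return paths


# Hypothetical molecule: bonds 1-2, 3-4 and 5-6 rotatable; 2-3 and 1-5 rigid.
toy_bonds = {
    1: [(2, True), (5, False)],
    2: [(1, True), (3, False)],
    3: [(2, False), (4, True)],
    4: [(3, True)],
    5: [(1, False), (6, True)],
    6: [(5, True)],
}
print(enumerate_paths(toy_bonds, core=1))                          # depth 2
print(enumerate_paths(toy_bonds, core=1, max_depth=3, max_rigid=2))
```

The second call uses a depth limit of 3 and allows two rigid bonds, the settings the deck gives for the 2nd-order rotations.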

  11. OAK FLEX: Extension to the 2nd-order rotation
     • Unique parameterisation by M1, M2, r2 and h.
     • An additional flag is necessary for case differentiation.
     [Figure labels: core atom, h]

  12. OAK FLEX: Different cases of the 2nd-order rotation
     • Both cases are special cases of the 1st-order rotation.
       ⇒ Only two parameters and the flag are necessary.
     [Figure labels: core atom, atom n1]

  13. OAK FLEX: Enumeration of the 2nd-order rotations
     • Depth-limited search with a depth limit of 3.
     • Prune subtrees after 2 rigid bonds.

  14. OAK FLEX: Similarity calculation of two flexibility spaces
     • RBF kernel based on the parameters.
     • Individual σ to adjust the weight of each parameter:
       $\text{Similarity} = \exp\left(-\left(\frac{(d_a - d_b)^2}{2\sigma_d} + \frac{(r_a - r_b)^2}{2\sigma_r}\right)\right)$
     [Figure: two core atoms, with parameters $d_a, r_a$ and $d_b, r_b$]
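The RBF similarity of two flexibility spaces can be written as a short function. This is a minimal sketch assuming each flexibility space is given as a pair (d, r) and that each parameter has its own σ in the denominator, as on the slide; the function name, the default σ values, and the example numbers are illustrative assumptions.

```python
import math


def flexibility_similarity(space_a, space_b, sigma_d=1.0, sigma_r=1.0):
    """RBF similarity of two flexibility spaces given as (d, r) pairs.

    sigma_d and sigma_r weight the two parameters individually; a larger
    sigma makes differences in that parameter count less.
    """
    d_a, r_a = space_a
    d_b, r_b = space_b
    exponent = (d_a - d_b) ** 2 / (2 * sigma_d) + (r_a - r_b) ** 2 / (2 * sigma_r)
    return math.exp(-exponent)


# Hypothetical parameter values.
print(flexibility_similarity((1.5, 0.8), (1.5, 0.8)))  # identical spaces
print(flexibility_similarity((1.0, 0.5), (2.0, 0.5)))  # differ only in d
```

Identical parameter pairs give a similarity of 1, and the value decays towards 0 as the parameters diverge.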

  15. OAK FLEX: Comparison of the flexibility spaces of two core atoms
     • Each atom has a list of flexibility spaces.
     • But: only one similarity value is needed.
       ⇒ Calculate the similarity matrix and use the optimal assignment.
     Similarity matrix (rows and columns index the flexibility spaces of the two atoms):
                #1            #2            #3
       #1   RBF(#1,#1)   RBF(#1,#2)   RBF(#1,#3)
       #2   RBF(#2,#1)   RBF(#2,#2)   RBF(#2,#3)
     Normalize the similarity value:
       $k(a, b) \leftarrow \frac{k(a, b)}{\sqrt{k(a, a)\, k(b, b)}}$
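The normalisation step can be sketched independently of how the raw kernel value is obtained. In the sketch below the raw kernel is a placeholder dot product, not the assignment-based similarity from the talk, and the square root in the denominator follows the usual kernel normalisation; both are assumptions for illustration.

```python
import math


def normalise(kernel, a, b):
    """Return k(a, b) / sqrt(k(a, a) * k(b, b)) for a raw kernel function."""
    return kernel(a, b) / math.sqrt(kernel(a, a) * kernel(b, b))


def toy_kernel(a, b):
    # Placeholder raw kernel (dot product), used only to exercise normalise().
    return sum(x * y for x, y in zip(a, b))


print(normalise(toy_kernel, [1.0, 2.0], [2.0, 4.0]))  # parallel vectors
print(normalise(toy_kernel, [1.0, 0.0], [1.0, 1.0]))
```

After normalisation every object has similarity 1 to itself, so atoms with many flexibility spaces do not score higher merely because more terms enter the sum.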

  16. OAK FLEX: Overview of the calculation steps
     • 1st-order rotations of atom A and atom B → RBF kernel → matrix of 1st-order rotations → Hungarian method → 1st-order similarity matrix.
     • 2nd-order rotations of atom A and atom B → RBF kernel → matrix of 2nd-order rotations → Hungarian method → 2nd-order similarity matrix.
     • OAK: local atom similarity matrix.
     • Flex-Matrix → Hungarian method → normalisation → kernel value.

  17. Results
     • Methods evaluated on 8 QSAR datasets compiled by Sutherland et al. (Source: J. Med. Chem., 2004, 47, 22, 5541-5554)
     • Using ε-SVR to build the models.
     • Seeded 10-fold multirun.
     • Equal folds for both methods.
       ⇒ Comparison of the methods is possible.
     • 100 multiruns generate 1000 MSE values.
     • Each value is considered as a sample of a Gaussian distribution.
       ⇒ A paired Wilcoxon signed-rank test determines significant shifts of the mean.
     • Hypotheses for the test:
       $H_0: \mu_{OAK} = \mu_{OAK\,FLEX}$
       $H_1: \mu_{OAK} > \mu_{OAK\,FLEX}$
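The "seeded" and "equal folds" points can be illustrated with a short sketch: if the fold split depends only on a seed, both kernels can be evaluated on identical partitions, and 100 runs of 10 folds give 1000 paired values. The function names, the use of the run index as the seed, and the sample count are assumptions; the talk does not describe how the folds were generated.

```python
import random


def seeded_folds(n_samples, n_folds=10, seed=0):
    """Split sample indices into n_folds folds, reproducibly for a given seed."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # private generator, no global state
    return [indices[i::n_folds] for i in range(n_folds)]


def multirun_folds(n_samples, n_runs=100, n_folds=10):
    """One fold split per run; the run index serves as the seed (assumption)."""
    return [seeded_folds(n_samples, n_folds, seed=run) for run in range(n_runs)]


# Both methods would call this with the same arguments and get the same folds.
folds_for_oak = multirun_folds(n_samples=50)
folds_for_oak_flex = multirun_folds(n_samples=50)
print(folds_for_oak == folds_for_oak_flex)
print(sum(len(run) for run in folds_for_oak))  # number of test folds overall
```

Because the two sets of MSE values come from the same folds, they can be compared pairwise, which is what a paired test requires.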

  18. Results
     Dataset   OAK MSE       OAK Q²        OAK FLEX MSE   OAK FLEX Q²   p-value
     ACE α     1.48 ± 0.61   0.73 ± 0.13   1.52 ± 0.63    0.71 ± 0.13   0.98
     AchE β    0.86 ± 0.36   0.48 ± 0.21   0.80 ± 0.30    0.52 ± 0.19   0.02
     BZR γ     0.67 ± 0.30   0.48 ± 0.19   0.58 ± 0.25    0.54 ± 0.17   < 0.001
     COX2 δ    1.02 ± 0.31   0.51 ± 0.13   0.97 ± 0.27    0.53 ± 0.12   0.001
     DHFR ε    0.64 ± 0.19   0.71 ± 0.08   0.60 ± 0.17    0.73 ± 0.08   0.001
     GPB ζ     0.55 ± 0.33   0.59 ± 0.25   0.53 ± 0.37    0.60 ± 0.29   0.089
     THER η    1.64 ± 0.96   0.64 ± 0.21   1.56 ± 1.00    0.66 ± 0.22   0.149
     THR θ     0.47 ± 0.26   0.57 ± 0.25   0.42 ± 0.24    0.59 ± 0.24   0.022
     α Angiotensin-Converting Enzyme, β Acetylcholinesterase, γ Benzodiazepine Receptor, δ Cyclooxygenase II, ε Dihydrofolate Reductase, ζ Glycogen Phosphorylase B, η Thermolysin, θ Thrombin

  19. Computation time: Comparison of the average runtime
     • Overhead between 16% and 70%.
     • The overhead correlates with the flexibility of the molecules.
     Dataset            ACE    AchE   BZR    COX2   DHFR   GPB    THERM   THR
     Ø OAK (ms)         5.3    6.7    4.8    6.2    6.0    4.7    7.2     11.7
     Ø OAK FLEX (ms)    7.9    8.5    5.6    8.5    8.2    6.9    12.3    18.0
     Factor             1.49   1.26   1.16   1.37   1.36   1.46   1.70    1.53
     Ring atoms         35%    71%    77%    66%    58%    36%    24%     51%

  20. Discussion
     • Interpretation of kernel models is difficult.
     • But: visualization of the mappings discloses differences.
     [Figure: atom mappings for OAK and OAK FLEX]

  21. Conclusion
     • The method incorporates molecular flexibility into similarity calculations.
     • Significant performance gain in 5 of 8 QSAR datasets.
     • This type of flexibility encoding is not suitable for all datasets. (ACE: quality of the model decreased)
     • Publication: Fechner, N.; Jahn, A.; Hinselmann, G.; Zell, A. Journal of Chemical Information and Modeling, in revision.

  22. Acknowledgement
     I thank Nikolas Fechner, Georg Hinselmann and Andreas Zell.

  23. Thank you for your attention

  24. OAK FLEX: Performance tuning
     • Hungarian method: $O((a + b)^3)$
     • Performance problem due to the high number of calculations.
       ⇒ Implementation of a greedy heuristic to reduce the computational cost.
                    1st-order rotations       2nd-order rotations
                    Heuristic   Hungarian     Heuristic   Hungarian
     Ø sum          2.079       2.091         2.709       2.18
     Difference          0.561%                    0.008%
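The slide does not say which greedy heuristic was implemented. A common scheme, shown here purely as an assumed illustration, repeatedly takes the largest remaining similarity whose row and column are both still free. The function name and the toy matrix are hypothetical.

```python
def greedy_assignment(weights):
    """Greedy approximation of the maximum-weight assignment.

    Sort all entries by weight and accept an entry whenever neither its row
    nor its column has been used yet. This avoids the cubic cost of the
    Hungarian method but does not guarantee the optimal sum.
    """
    candidates = sorted(
        ((w, i, j) for i, row in enumerate(weights) for j, w in enumerate(row)),
        reverse=True,
    )
    used_rows, used_cols = set(), set()
    pairs, total = [], 0.0
    for w, i, j in candidates:
        if i not in used_rows and j not in used_cols:
            used_rows.add(i)
            used_cols.add(j)
            pairs.append((i, j))
            total += w
    return total, pairs


# Hypothetical matrix on which the greedy choice is not optimal:
# greedy takes 0.9 first and is then forced to take 0.1,
# while the assignment (0, 1), (1, 0) would sum to 1.6.
toy = [[0.9, 0.8],
       [0.8, 0.1]]
print(greedy_assignment(toy))
```

The toy matrix is deliberately adversarial; the small differences reported in the table suggest that on the real similarity matrices the heuristic and the Hungarian method rarely diverge by much.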
