Systems Biology: Applications in pharma research
20 September 2010, TU München
Andrea Schafferhans @ TU München
Similar proteins have similar interaction partners (?)
20 January 2011 Introduction 2
Applications
- Function prediction
- Drug development
– “Target Class” approach – Side effects – “Polypharmacology” / “Network pharmacology”
Hopkins, A.L. (2008) Network pharmacology: the next paradigm in drug discovery. Nat Chem Biol, 4, 682-690.
Contents
- 1. Introduction
- 2. Protein comparison
– Computational binding site identification – Binding site comparison
- 3. Application examples
Types of protein similarity
- Function
- Sequence
– Paralogs – within species – Orthologs – across species
- Binding sites / interaction patterns
What is a binding site?
- Function
– Binding other proteins (e.g. signal transduction) – Binding substrates (enzymes) – Binding Co-Factors (e.g. Heme) – …
- Form
– Cavity in the protein – Caveat: induced fit / conformational selection is more realistic
- Pragmatic
– Region around all HETATM records in a PDB file (caveat: e.g. metals…)
Binding site characteristics
- Usually a pocket or cleft in the protein
- Less hydrophobic than the interior of a protein
- Specific through complementarity of
– Form – Electrostatic interactions – Hydrogen bonds – Hydrophobic interactions
Henrich S, Salo-Ahen OM, Huang B, et al.: Computational approaches to identifying and characterizing protein binding sites for ligand design. Journal of Molecular Recognition 2010, 23:209-219
Binding site analysis – Applications
- Automated drug target annotation
– E.g. estimation of druggability (binding site size, hydrophobicity, etc.)
- Virtual screening
– Restrict the search space for docking experiments
- Function prediction
- Prediction of drug side effects
Finding binding sites – geometrically
Observation: binding sites are usually the largest pockets, e.g. 83% of enzyme active sites are found in the largest pocket
(Laskowski RA, et al. Protein clefts in molecular recognition and function. Protein Sci. 1996; 5:2438-2452.)
POCKET
- Fill the protein with a grid (3 Å spacing)
- Mark grid points as “protein” (within 3 Å of an atom) or “solvent”
- Go along the grid and mark “solvent” points that lie between “protein” points as potential pocket points
- Find the largest clusters of “pocket” points
Levitt D, Banaszak L. POCKET: a computer graphics method for identifying and displaying protein cavities and their surrounding amino acids. J. Mol. Graph 1992, 10:229-234.
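The grid procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the published POCKET implementation: the function name `pocket_clusters`, the `pad` margin, the use of 6-neighbour connectivity for clustering and the toy coordinates are all assumptions made for the example. Only the 3 Å defaults come from the slide.

```python
from itertools import product

def pocket_clusters(atoms, spacing=3.0, radius=3.0, pad=1):
    """Cluster enclosed solvent grid points, largest cluster first."""
    # Grid that covers all atoms plus `pad` extra points on every side.
    lo = [min(a[k] for a in atoms) - pad * spacing for k in range(3)]
    n = [int((max(a[k] for a in atoms) - lo[k]) / spacing) + pad + 1
         for k in range(3)]

    def coord(idx):
        return tuple(lo[k] + idx[k] * spacing for k in range(3))

    # Step 1: a grid point within `radius` of any atom is "protein".
    protein = set()
    for idx in product(range(n[0]), range(n[1]), range(n[2])):
        p = coord(idx)
        if any(sum((p[k] - a[k]) ** 2 for k in range(3)) <= radius ** 2
               for a in atoms):
            protein.add(idx)

    # Step 2: scan along x, y and z. Solvent points lying between two
    # protein points on the same scan line become "pocket" points.
    pocket = set()
    for axis in range(3):
        u_axis, v_axis = [k for k in range(3) if k != axis]
        for u, v in product(range(n[u_axis]), range(n[v_axis])):
            line = []
            for t in range(n[axis]):
                idx = [0, 0, 0]
                idx[axis], idx[u_axis], idx[v_axis] = t, u, v
                line.append(tuple(idx))
            hits = [t for t, idx in enumerate(line) if idx in protein]
            if len(hits) >= 2:
                pocket.update(i for i in line[hits[0]:hits[-1]]
                              if i not in protein)

    # Step 3: flood-fill neighbouring pocket points into clusters
    # (6-connectivity is an assumption of this sketch).
    clusters, seen = [], set()
    for start in sorted(pocket):
        if start in seen:
            continue
        seen.add(start)
        stack, members = [start], []
        while stack:
            cur = stack.pop()
            members.append(cur)
            for axis, step in product(range(3), (-1, 1)):
                nb = list(cur)
                nb[axis] += step
                nb = tuple(nb)
                if nb in pocket and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        clusters.append(sorted(coord(i) for i in members))
    return sorted(clusters, key=len, reverse=True)
```

With two toy "atoms" at x = 0 and x = 2 on a 1 Å grid, the only enclosed solvent point is the one between them; real input would be the atom coordinates of a protein structure.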
LIGSITE
Differences to POCKET
- More efficient searching for neighbour atoms
- Cubic diagonals also used for finding pockets → less dependent on orientation
- Grid points scored by the number of times they are found (between 0 and 7) → adjustable “buriedness”
- Smaller and adjustable grid spacing (best: 0.5 to 0.75 Å)
Hendlich M, et al.: LIGSITE: automatic and efficient detection of potential small molecule-binding sites in proteins. J. Mol. Graph. Mod. 1997, 15:359-363
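The buriedness score can be illustrated as follows: a solvent grid point is checked along the three axes and the four cubic diagonals, and each direction in which protein is hit on both sides adds one to its score. This is a sketch under assumptions, not the LIGSITE code: the grid is given as a set of integer indices, and the names `buriedness`, `buried_points` and the `min_buriedness` cut-off are hypothetical.

```python
from itertools import product

# Three axes plus the four cubic diagonals: seven scan directions.
DIRECTIONS = [(1, 0, 0), (0, 1, 0), (0, 0, 1),
              (1, 1, 1), (1, 1, -1), (1, -1, 1), (-1, 1, 1)]

def hits_protein(point, step, protein, shape):
    """Walk from `point` along `step`; True if a protein point is reached."""
    cur = tuple(p + s for p, s in zip(point, step))
    while all(0 <= c < n for c, n in zip(cur, shape)):
        if cur in protein:
            return True
        cur = tuple(c + s for c, s in zip(cur, step))
    return False

def buriedness(point, protein, shape):
    """Number of scan directions (0 to 7) in which `point` is enclosed."""
    score = 0
    for d in DIRECTIONS:
        back = tuple(-x for x in d)
        if (hits_protein(point, d, protein, shape)
                and hits_protein(point, back, protein, shape)):
            score += 1
    return score

def buried_points(protein, shape, min_buriedness=2):
    """Solvent grid points whose score reaches the adjustable cut-off."""
    result = {}
    for point in product(*(range(n) for n in shape)):
        if point in protein:
            continue
        score = buriedness(point, protein, shape)
        if score >= min_buriedness:
            result[point] = score
    return result
```

Raising `min_buriedness` keeps only deeply buried points; lowering it also admits shallow grooves, which is the adjustable trade-off the slide refers to.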
Finding binding sites – energetically
Binding sites interact with the bound molecules → find locations of favourable interaction energies
GRID
- Calculates interaction energies of probe molecules
- Uses three terms:
– Lennard-Jones (attraction + repulsion) – electrostatic – directional hydrogen bond
Goodford, P.J. A computational procedure for determining energetically favorable binding sites on biologically important macromolecules. J. Med. Chem. 1985 28:849-857
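To make the three terms concrete, the sketch below sums a Lennard-Jones, an electrostatic and a directional hydrogen-bond contribution for one probe position. It is only an illustration of the idea: the functional forms are generic textbook choices, and every parameter value (`epsilon`, `sigma`, `dielectric`, `hb_c`, `hb_d`, `m`) is a placeholder rather than a GRID parameter. The constant 332.06 converts charge products in elementary charges and distances in Å to kcal/mol.

```python
import math

def lennard_jones(r, epsilon, sigma):
    """12-6 potential: repulsion at short range, attraction further out."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def coulomb(r, q1, q2, dielectric=4.0):
    """Electrostatic term in kcal/mol (charges in e, distance in Angstrom)."""
    return 332.06 * q1 * q2 / (dielectric * r)

def hydrogen_bond(r, cos_theta, c, d, m=2):
    """Radial term scaled by direction; zero when pointing away.

    The form (c/r^6 - d/r^4) * cos^m(theta) is an assumed example."""
    if cos_theta <= 0.0:
        return 0.0
    return (c / r ** 6 - d / r ** 4) * cos_theta ** m

def probe_energy(probe_pos, probe_charge, atoms,
                 epsilon=0.1, sigma=3.5, hb_c=1000.0, hb_d=300.0):
    """Sum the three terms over all atoms for one probe position.

    Each atom is a dict with "pos", "charge" and an optional unit
    vector "hbond_dir" giving its preferred hydrogen-bond direction."""
    total = 0.0
    for atom in atoms:
        vec = [p - a for p, a in zip(probe_pos, atom["pos"])]
        r = math.sqrt(sum(v * v for v in vec))
        total += lennard_jones(r, epsilon, sigma)
        total += coulomb(r, probe_charge, atom["charge"])
        direction = atom.get("hbond_dir")
        if direction is not None:
            cos_theta = sum(v * d for v, d in zip(vec, direction)) / r
            total += hydrogen_bond(r, cos_theta, hb_c, hb_d)
    return total
```

Evaluating `probe_energy` at every grid point around the protein yields the kind of energy map that the binding-site methods on the following slides cluster.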
GRID application
- Cluster energy minima → binding site
- BUT:
– Hard to cluster – Computationally intensive
- Good for binding site characterisation
Picture from: Henrich S, Salo-Ahen OM, Huang B, et al. JMR 2010, 23:209-19.
Q-SiteFinder
- GRID methyl probe (0.9 Å grid)
- Cluster adjacent grid points that meet the energy criterion
→ Success: > 70% for the first predicted binding site, > 90% within the first three
→ 68% average precision (precision: overlap between ligand and predicted binding site)
Laurie AT, Jackson RM: Q-SiteFinder: an energy-based method for the prediction of protein-ligand binding sites. Bioinformatics 2005, 21:1908-16
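The clustering step can be sketched as a flood fill over grid points whose probe energy passes a threshold, with the resulting clusters ranked by their summed energy. The input format (a dict from integer grid index to energy), the 26-neighbour adjacency, the function name `cluster_sites` and the threshold value in the example are assumptions of this sketch, not details taken from the paper.

```python
from itertools import product

def cluster_sites(energies, threshold):
    """Group adjacent favourable grid points; most favourable cluster first."""
    keep = {p for p, e in energies.items() if e <= threshold}
    # All 26 neighbours of a grid point (faces, edges and corners).
    neighbours = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]
    clusters, seen = [], set()
    for start in sorted(keep):
        if start in seen:
            continue
        seen.add(start)
        stack, members = [start], []
        while stack:
            cur = stack.pop()
            members.append(cur)
            for d in neighbours:
                nb = tuple(c + s for c, s in zip(cur, d))
                if nb in keep and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        clusters.append(sorted(members))
    # Rank by total interaction energy (lower is more favourable).
    return sorted(clusters, key=lambda c: sum(energies[p] for p in c))

energies = {(0, 0, 0): -2.0, (1, 0, 0): -1.5, (2, 0, 0): -0.5, (5, 5, 5): -3.0}
ranked = cluster_sites(energies, threshold=-1.0)
```

In this toy input the two adjacent points together outrank the single, deeper minimum, which shows why ranking by summed energy favours larger sites.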
i-Site
Variation of Q-SiteFinder:
- Better probe distribution (denser grid)
- Two energy limits
– low value for cluster seeds – higher value for extension → filtering out meaningful clusters
- AMBER force field
Morita M, Nakamura S, Shimizu K: Highly accurate method for ligand-binding site prediction in unbound state (apo) protein structures. Proteins 2008, 73:468-479
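The two energy limits can be read as a seed-and-extend scheme: only points below the strict limit may start a cluster, while points below the looser limit may join one. The sketch below illustrates that reading; the dict-based grid, the 6-neighbour adjacency, the name `grow_from_seeds` and the numeric limits are assumptions and not taken from the i-Site paper.

```python
from itertools import product

def grow_from_seeds(energies, seed_limit, extend_limit):
    """Start clusters at strongly favourable points, extend over weaker ones.

    Assumes seed_limit <= extend_limit (energies are negative when favourable)."""
    seeds = sorted(p for p, e in energies.items() if e <= seed_limit)
    allowed = {p for p, e in energies.items() if e <= extend_limit}
    clusters, seen = [], set()
    for seed in seeds:
        if seed in seen:
            continue
        seen.add(seed)
        stack, members = [seed], []
        while stack:
            cur = stack.pop()
            members.append(cur)
            for axis, step in product(range(3), (-1, 1)):
                nb = list(cur)
                nb[axis] += step
                nb = tuple(nb)
                if nb in allowed and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        clusters.append(sorted(members))
    return clusters

energies = {(0, 0, 0): -3.0, (1, 0, 0): -1.5, (2, 0, 0): -0.2, (4, 0, 0): -1.5}
sites = grow_from_seeds(energies, seed_limit=-2.5, extend_limit=-1.0)
```

The isolated point at (4, 0, 0) passes the extension limit but has no seed nearby, so it is filtered out; this is how the second limit removes weak, scattered clusters.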
Challenges in binding site identification
- Protein flexibility can “hide” binding sites
→ Use multiple experimental conformations → Use molecular dynamics to generate conformations
- Dimerisation has to be considered
→ Carefully look at PDB unit cell → Carefully look at information about the protein
Characterising binding sites
Properties to characterise:
- Geometry
- Amino acid composition
- Solvation
- Hydrophobicity
- Electrostatics
- Interactions with functional groups
Hydrophobicity
Measured by logP (partitioning between water and octanol)
- Map atom / residue based contributions
- Calculate interaction energies of hydrophobic probes (e.g. GRID)
Electrostatics
- Map electrostatic potential onto surface (e.g. using DelPhi, see http://structure.usc.edu/howto/delphi-surface-pymol.html)
- Caveat: dependence on protonation!
Functional groups
- SuperStar
– Analyse the spatial distribution of functional groups in CSD density maps – Break the protein into fragments found in CSD – Map the observed distribution of interaction partners onto the protein
Verdonk ML, Cole JC, Taylor R: SuperStar: a knowledge-based approach for identifying interaction sites in proteins. Journal of molecular biology 1999, 289:1093-108.
Binding site comparison
- Align structures in 3D
- Analyse differences and similarities of
– Amino acid composition – Local conformation – Pocket size – Presence of interaction partners
- Straightforward in case of
– Sequence similarity or – Structural similarity
RELIBASE
- Stores binding sites from PDB structures
- Allows superposition of related binding sites
- Computes differences between binding sites
Hendlich M, Bergner A, Günther J, Klebe G: Relibase: Design and Development of a Database for Comprehensive Analysis of Protein-Ligand Interactions. Journal of Molecular Biology 2003, 326:607-620. http://relibase.ccdc.cam.ac
Similar but not homologous binding sites
- cAMP-dependent protein kinase (1cdk) with adenyl-imido-triphosphate
- trypanothione reductase (1aog) with flavine-adenine-dinucleotide
Graphics from www.ebi.ac.uk/pdbsum/
Graphics from Schmitt S, Kuhn D, Klebe G. Journal of molecular biology 2002, 323:387-406
Problems in binding site comparison
- Automatically locate binding site
- Capture important features in efficient representation
- Search efficiently across all structures
– Find best superimposition – Score the alignment
Binding site comparison methods
- Representation by
– Coordinate set with physico-chemical or evolutionary properties (atoms, chemical groups or surface points) – 3D shape descriptors
- Superimposition by
– Geometric hashing – Graph theory, clique search
- Similarity measurement by
– RMSD – Residue conservation – Physico-chemical property similarity
CavBase – Structure representation
- Cavity detection with LIGSITE (stored in Relibase)
- Cavity-flanking residues represented as pseudo-centers (several per residue if necessary):
– Donor – Acceptor – Donor-Acceptor – Aliphatic – Pi
- Create graph:
– Nodes: pseudo-centers – Edges: distances between the pseudo-centers
Graphics from Schmitt S, Kuhn D, Klebe G. Journal of molecular biology 2002, 323:387-406
CavBase – Alignment
Create associated graph:
Node: a node from protein A paired with a node from protein B that has similar interaction properties
Edge: the member nodes are connected in both protein A and protein B (member node distance < 12 Å, distance difference < 2 Å)
Find maximal common subgraph (Bron-Kerbosch) → similar arrangement of pseudo-centers in original graphs
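The associated-graph construction and the clique search can be sketched as below. This is an illustration rather than the CavBase implementation: pseudo-centers are given as `(type, (x, y, z))` tuples, "similar interaction properties" is simplified to identical type labels, the 12 Å limit is applied within each site, and Bron-Kerbosch is used in its basic form without pivoting. The function names and the toy coordinates are hypothetical.

```python
import math
from itertools import combinations

def association_graph(site_a, site_b, max_dist=12.0, max_diff=2.0):
    """Nodes pair equally typed pseudo-centers; edges need matching distances."""
    nodes = [(i, j)
             for i, (type_a, _) in enumerate(site_a)
             for j, (type_b, _) in enumerate(site_b)
             if type_a == type_b]
    adj = {node: set() for node in nodes}
    for (i, j), (k, l) in combinations(nodes, 2):
        if i == k or j == l:
            continue  # one pseudo-center cannot be matched twice
        dist_a = math.dist(site_a[i][1], site_a[k][1])
        dist_b = math.dist(site_b[j][1], site_b[l][1])
        if (dist_a < max_dist and dist_b < max_dist
                and abs(dist_a - dist_b) < max_diff):
            adj[(i, j)].add((k, l))
            adj[(k, l)].add((i, j))
    return adj

def bron_kerbosch(r, p, x, adj, out):
    """Collect all maximal cliques (basic Bron-Kerbosch, no pivoting)."""
    if not p and not x:
        out.append(r)
        return
    for v in list(p):
        bron_kerbosch(r | {v}, p & adj[v], x & adj[v], adj, out)
        p = p - {v}
        x = x | {v}

def best_match(site_a, site_b):
    """Largest set of mutually compatible pseudo-center pairs."""
    adj = association_graph(site_a, site_b)
    cliques = []
    bron_kerbosch(set(), set(adj), set(), adj, cliques)
    return max(cliques, key=len, default=set())

site_a = [("donor", (0, 0, 0)), ("acceptor", (3, 0, 0)), ("aliphatic", (0, 4, 0))]
site_b = [("donor", (10, 10, 10)), ("acceptor", (13, 10, 10)),
          ("aliphatic", (10, 14, 10)), ("donor", (50, 50, 50))]
match = best_match(site_a, site_b)
```

Each clique in the associated graph corresponds to a set of pseudo-center pairs whose mutual distances agree in both cavities, so the largest clique is the largest common arrangement.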
CavBase – Scoring
- Scoring based on overlap of similarly typed surface patches
Kuhn D, Weskamp N, Schmitt S, Hüllermeier E, Klebe G: From the Similarity Analysis of Protein Cavities to the Functional Classification of Protein Families Using Cavbase. Journal of Molecular Biology 2006, 359:1023-1044
SOIPPA – Structure representation
- Delaunay tessellation of Cα atoms
- > 1 tetrahedron/Cα
- Environmental boundary (red) and protein boundary (blue)
Xie L, Bourne PE: A robust and efficient algorithm for the shape description of protein structures and its application in predicting ligand binding sites. BMC Bioinformatics 2007, 8:S9.
Xie L, Bourne PE: A unified statistical model to support local sequence order independent similarity searching for ligand-binding sites and its application to genome-based drug discovery. Bioinformatics 2009, 25:i305-312.
SOIPPA – Structure representation (2)
- Each Cα characterized by
– Vector with distance and direction of boundaries – Substitution matrix
- Graph:
Node: Cα
Edge: connection of tetrahedra
Xie L., Bourne PE. Bioinformatics 2009, 25:i305-312.
SOIPPA - Alignment
Create associated graph:
Node: node(A) + node(B) with similar geometric potential; weight: amino acid frequency profile similarity
Edge: member nodes in protein A and B are connected, distance difference < 2 Å, surface normal difference < 30°
Find maximum-weight common subgraph (MWCS)
Xie L., Bourne PE. Bioinformatics 2009, 25:i305-312.
SOIPPA – Scoring
- Sum over aligned residue pairs:
Residue similarity weighted by distance and normal vector angle
$S = \sum_{i,j} \left( M_{ij} \times pa_{ij} \times pd_{ij} \right)$
- Statistical significance of score
Background score distribution: – compare unrelated structures with random sequences – fit resulting score distribution to extreme value distribution function → giving probability of randomness dependent on score
Xie L., Bourne PE. Bioinformatics 2009, 25:i305-312.
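The two scoring steps can be illustrated as follows: the raw score is a sum of weighted residue similarities over aligned pairs, and its significance is read from an extreme value (Gumbel) distribution fitted to background scores. This is a sketch under assumptions: the fit uses the method of moments, which may differ from the procedure in the paper, and the function names and toy numbers are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329

def alignment_score(pairs):
    """Sum of M_ij * pa_ij * pd_ij over aligned residue pairs.

    `pairs` holds one (similarity, angle_weight, distance_weight) tuple
    per aligned pair."""
    return sum(m * pa * pd for m, pa, pd in pairs)

def fit_gumbel(background_scores):
    """Method-of-moments estimate of Gumbel location and scale."""
    mean = statistics.fmean(background_scores)
    sd = statistics.stdev(background_scores)
    scale = sd * math.sqrt(6) / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc, scale

def p_value(score, loc, scale):
    """Probability that a random alignment scores at least `score`."""
    return 1.0 - math.exp(-math.exp(-(score - loc) / scale))

score = alignment_score([(2.0, 0.5, 1.0), (4.0, 0.5, 0.5)])
loc, scale = fit_gumbel([1.0, 2.0, 3.0, 4.0, 5.0])
```

A small `p_value(score, loc, scale)` means the alignment score is unlikely under the background of unrelated structures, which is the "probability of randomness" the slide mentions.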
Isocleft
- Structure representation: Cα / atoms within 5 Å of ligand
- Alignment: Bron-Kerbosch of associated graph
- Scoring: $S = \frac{N_C}{N_A + N_B - N_C}$
Najmanovich R, Kurbatova N, Thornton J: Detection of 3D atomic similarities and their use in the discrimination of small molecule protein-binding sites. Bioinformatics 2008, 24:i105 http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/icfdb/StartPage.pl
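Reading the scoring line as a Tanimoto-style ratio, the score can be computed as below. The slide does not define the symbols, so the sketch assumes that N_A and N_B are the numbers of atoms in the two binding sites and N_C is the number of atoms matched between them; the function name is hypothetical.

```python
def tanimoto_score(n_common, n_a, n_b):
    """S = N_C / (N_A + N_B - N_C); 1.0 means the two sites match fully."""
    return n_common / (n_a + n_b - n_common)

# Example: 5 matched atoms between two sites of 10 atoms each.
example = tanimoto_score(5, 10, 10)
```

The denominator counts every atom of either site once, so the score is the matched fraction of the combined site and lies between 0 and 1.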
Isocleft - innovations
- Two iterations of alignment:
- 1. Nodes: Cα atoms; edges: distance difference < 3.5 Å, minimal residue similarity. Superimpose based on the graph found.
- 2. Nodes: all heavy atoms; edges: distance < 4 Å, similar atom type (hydrophilic, acceptor, donor, hydrophobic, aromatic, neutral, neutral-donor and neutral-acceptor)
- Use first result of Bron-Kerbosch, then terminate
Najmanovich R, Kurbatova N, Thornton J: Detection of 3D atomic similarities and their use in the discrimination of small molecule protein-binding sites. Bioinformatics 2008, 24:i105
Example 1: Explaining side effects
Problem: side effects of ERα modulators (SERMs)
Finding “off target” effects:
- Map sequences to structures (BLAST)
- Limit to “druggable” proteins (?)
- Search with SOIPPA
=> SERCA (Sarcoplasmic Reticulum Ca2+ channel ATPase)
Xie L, Wang J, Bourne PE (2007) In silico elucidation of the molecular mechanism defining the adverse effect of selective estrogen receptor modulators. PLoS Comput Biol 3(11)
Example 1: Validating results
- Inverse search
- Docking
– SERM – similar compounds, correlate (?)
Graphics from Xie L, Wang J, Bourne PE (2007) PLoS Comput Biol 3(11)
Example 2: Repositioning known drug
Problem: new tuberculosis drugs needed, but many parameters to optimise
Finding compound to reuse against InhA:
- Search other structures binding adenine (ATP, ADP, NAD, FAD, ...)
- Compare binding sites with SOIPPA
=> SAM-dependent methyltransferases
Kinnings SL, Liu N, Buchmeier N, Tonge PJ, Xie L, et al. (2009) Drug Discovery Using Chemical Systems Biology: Repositioning the Safe Medicine Comtan to Treat Multi-Drug and Extensively Drug Resistant Tuberculosis. PLoS Comput Biol 5(7)
Example 2: Structure match
Graphics from Kinnings SL et al. (2009) PLoS Comput Biol 5(7)
Example 3: Analysing target relationships
Nodes: proteins
Edges: similar binding (within factor 10^3)
Paolini,G.V. et al. (2006) Global mapping of pharmacological space. Nature biotechnology, 24, 805-15.
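One possible reading of this network construction is sketched below: two targets are linked when at least one compound binds both with affinities that differ by no more than the given factor (taken here as 10^3). The input format, the function name `target_network` and the toy affinities are assumptions; the criteria in the cited study may include further conditions.

```python
from itertools import combinations

def target_network(affinities, max_ratio=1e3):
    """Edges between targets that share a compound with similar affinity.

    `affinities` maps compound -> {target: affinity}, with all affinities
    in the same unit (e.g. nM)."""
    edges = set()
    for by_target in affinities.values():
        for (t1, a1), (t2, a2) in combinations(sorted(by_target.items()), 2):
            if max(a1, a2) / min(a1, a2) <= max_ratio:
                edges.add((t1, t2))
    return edges

affinities = {
    "cpd1": {"A": 10.0, "B": 500.0, "C": 1e6},
    "cpd2": {"C": 5.0, "D": 20.0},
}
edges = target_network(affinities)
```

Targets A and B end up connected through cpd1, while C is linked only to D because cpd1 binds C far more weakly than A or B.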
Summary
Pharma research focus moving from only individual interactions to system-oriented research
Challenges:
- How to compare?
- Computational overhead