SLIDE 1 Multiple alignment of protein structures based on ligand or prosthetic group position
Jean-Christophe Nebel
Faculty of Computing, Information Systems & Mathematics, Kingston University, London
j.nebel@kingston.ac.uk
http://www.kingston.ac.uk/~ku33185/Nestor3D.html
SLIDE 2 Some challenges
- Protein 3D structure prediction
(John Moult, organiser of CASP 6, December 2004)
Proteins with homologues: good models, but fine grain (all atoms) models are still needed
Twilight zone: approximate (and useful) models; further improvements will require full atom description and refinements
- Protein function annotation from 3D structure
Structural genomics projects: high-throughput delivery of protein structures regardless of the state of their functional annotation
High resolution models of active sites are required
SLIDE 3 Principles
- Active sites are key to understanding protein functions
- Most conserved regions of homologous proteins are linked to active sites (e.g. PROSITE patterns)
- Multiple alignment improves pattern recognition (e.g. ClustalW)
Multiple alignment of protein 3D structures based on active site position at atomic level → generation of 3D motifs
SLIDE 4 3D motif generation
- Rigid alignment of protein 3D structures based on local features linked to active sites
prosthetic groups
ligands
PROSITE patterns (under development)…
- Generate consensus pattern based on threshold
atom positions (1-1.4 Å)
chemical group positions, e.g. carboxyl (1.5-3.8 Å)
cavity position, if relevant (1-1.4 Å)
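The consensus step above can be illustrated with a minimal sketch. It assumes the structures have already been rigidly superposed on the ligand or prosthetic group, and that an atom is kept when every other structure has an atom of the same element within the distance threshold. The function name, the data layout and the same-element rule are assumptions made for illustration, not the Nestor3D implementation.

```python
from math import dist


def consensus_atoms(structures, threshold=1.25):
    """Return consensus atoms from structures that are ALREADY superposed.

    structures: list of structures, each a list of (element, (x, y, z)).
    An atom of the first structure is kept only if every other structure
    has an atom of the same element within `threshold` angstroms
    (hypothetical matching rule). The consensus position is the centroid
    of the matched atoms.
    """
    reference, others = structures[0], structures[1:]
    consensus = []
    for element, pos in reference:
        matches = [pos]
        for other in others:
            candidates = [p for e, p in other
                          if e == element and dist(pos, p) <= threshold]
            if not candidates:
                break  # no counterpart in this structure: not a consensus atom
            matches.append(min(candidates, key=lambda p: dist(pos, p)))
        else:
            n = len(matches)
            centroid = tuple(sum(m[i] for m in matches) / n for i in range(3))
            consensus.append((element, centroid))
    return consensus


# Toy coordinates, invented for the example
s1 = [("N", (0.0, 0.0, 0.0)), ("C", (5.0, 0.0, 0.0))]
s2 = [("N", (0.5, 0.0, 0.0)), ("C", (9.0, 0.0, 0.0))]
s3 = [("N", (0.0, 0.4, 0.0)), ("O", (5.0, 0.1, 0.0))]
print(consensus_atoms([s1, s2, s3]))
```

Only the nitrogen survives here: the carbon of the second structure lies 4 Å away, beyond the threshold, and the third structure has no carbon at all. The same loop could be run on chemical group centres with a looser threshold, in line with the 1.5-3.8 Å range quoted above.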
SLIDE 5 Validation: Proteins with porphyrin rings
- The PDB holds 1551 proteins containing these groups
(i.e. 5.3% of PDB entries, 01/02/05)
globins, peroxidases, cytochromes, P450s, chlorophyll proteins…
- All atoms of the porphyrin ring are used to align sets of homologous proteins
SLIDE 6 Validation: Methodology
- Representatives of proteins containing porphyrin rings
PDB50%
Identical chains and chains that are not involved with a prosthetic group were removed.
Structures of 237 chains
- Generation of a set of homologues (within our set) using FASTA
- Generation of a 3D template based on homologues only
- Comparison between template and PDB structure
SLIDE 7 Validation: Results
- 66 patterns were produced
(at least 3 homologues were required, E value of 10e-6, atom distance of 1.25 Å)
- Similar results with other parameters, groups and cavity
- Detection of abnormal structures: 1U5U & 1S05
1S05: structural model validated using a restricted set of NMR experiments
1U5U: protein fragment
[Figures: number of true and false positives; distribution of true positives]
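The validation loop can be sketched as follows. The slides do not define true and false positives precisely, so this sketch assumes that a template atom counts as a true positive when the held-out structure has an atom of the same element within the distance threshold, and as a false positive otherwise. The function names and data layout are hypothetical; the defaults reuse the parameters quoted on the slide.

```python
from math import dist


def select_homologues(hits, max_evalue=10e-6, min_count=3):
    """Keep FASTA hits at or below the E-value cut-off.

    hits: list of (chain_id, e_value) pairs. Returns an empty list when
    fewer than `min_count` homologues pass, i.e. no template is built.
    """
    selected = [chain for chain, e_value in hits if e_value <= max_evalue]
    return selected if len(selected) >= min_count else []


def score_template(template, structure, threshold=1.25):
    """Count template atoms found (TP) or not found (FP) in a structure.

    Both arguments are lists of (element, (x, y, z)) in the same frame.
    The TP/FP definition used here is an assumption, see the text above.
    """
    true_pos = false_pos = 0
    for element, pos in template:
        found = any(e == element and dist(pos, p) <= threshold
                    for e, p in structure)
        if found:
            true_pos += 1
        else:
            false_pos += 1
    return true_pos, false_pos


# Invented hits and coordinates, for illustration only
hits = [("A", 1e-30), ("B", 1e-8), ("C", 1e-6), ("D", 0.1)]
print(select_homologues(hits))

template = [("N", (0.0, 0.0, 0.0)), ("C", (1.0, 0.0, 0.0)), ("O", (10.0, 0.0, 0.0))]
target = [("N", (0.3, 0.0, 0.0)), ("C", (1.0, 1.0, 0.0)), ("O", (0.0, 0.0, 0.0))]
print(score_template(template, target))
```

A structure with an unusually high false positive count would stand out in the same way as the abnormal entries listed above.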
SLIDE 8 Application: Modelling of active site of CYP17
- P450 protein involved in biosynthesis of sex hormones
- Enzyme associated with some forms of cancer (breast & prostate)
- 3D structure is unknown
- P450 active site:
Haem group linked to protein by a cysteine
Ligand (i.e. drug) on the other side of the haem group
SLIDE 9 Homologues of P450 human CYP17
P450s: 3500+ sequences (50+ human genes), 125 structures (5 human)

Hits  Identity%  Length  Protein
18    29.5%      417     1AKD
35    23.9%      417     1IZO
61    26.4%      389     1N97
63    24.0%      455     1H5Z
79    29.3%      455     1BU7
135   28.2%      485     1W0G
136   28.6%      476     1PQ2

No single good candidate! But 5 proteins are close…
SLIDE 10 Consensus structure of CYP17
[Figure labels: haem group, cavity area, consensus atoms, consensus groups]
Is it biologically meaningful?
SLIDE 11 P450 known patterns based on sequences
4 clusters based on structure: the 4 most common patterns!
e.g. blue box, P450 signature
[FW]-[SGNH]-x-[GD]-x-[RKHPT]-x-C-[LIVMFAP]-[GAP]
Modelling of P450 active site based on consensus 3D structures, J.-C. Nebel, International Conference on Biomedical Engineering, BioMed 2005, 16-18 February 2005, Innsbruck, Austria
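As an illustration of how a sequence signature such as the one quoted above can be searched for, the sketch below translates a subset of the PROSITE pattern syntax into a Python regular expression. Only plain residues, `x`, `[...]`, `{...}` and `(n)` / `(n,m)` repeats are handled; anchors are not. The function name and the test sequence are invented for the example.

```python
import re

# One PROSITE element: x, a residue, [allowed] or {forbidden},
# optionally followed by a repeat count such as (3) or (2,4).
_ELEMENT = re.compile(r"^(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?$")


def prosite_to_regex(pattern):
    """Translate a simple PROSITE pattern into a regular expression string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        match = _ELEMENT.match(element)
        if match is None:
            raise ValueError(f"unsupported PROSITE element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            core = "."                       # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"   # any residue except these
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


signature = "[FW]-[SGNH]-x-[GD]-x-[RKHPT]-x-C-[LIVMFAP]-[GAP]"
regex = prosite_to_regex(signature)
print(regex)

# Hypothetical sequence fragment, not taken from a real protein
hit = re.search(regex, "MKFGAGPRNCIGQ")
print(hit.start(), hit.group())
```

The conserved cysteine in the pattern is the residue that links the haem group to the protein, which is why a sequence hit gives a first anchor for locating the active site.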
SLIDE 12 Comparison with CYP17 active site models
Generated independently by Dr S. Ahmed*
(Kingston University - School of Chemical & Pharmaceutical Sciences)
*Ahmed S: The use of the novel substrate-heme complex approach in the derivation of a representation of the active site of the enzyme complex 17alpha-hydroxylase and 17,20-lyase. Biochem Biophys Res Commun 2004, 316(3):595-598.
[Figure label: putative H-bond]
SLIDE 13 Towards function prediction
- Clustering according to active site similarity
(Kulczynski's metric & Neighbour Joining)
[Figure panels: NESTOR3D, ClustalW]
Protein families: P450s, globins, flavocytochromes, peroxidases, catalases, cytochromes B, C, C2, C3 & C'
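A similarity matrix of the kind used for this clustering can be sketched as below. The slides do not say which Kulczynski coefficient is used or how active sites are encoded, so the sketch assumes the second Kulczynski coefficient applied to sets of feature identifiers (for example, indices of consensus atoms present in each structure). All names and data are hypothetical.

```python
def kulczynski(a, b):
    """Second Kulczynski coefficient between two sets.

    Mean of the fraction of `a` shared with `b` and the fraction of `b`
    shared with `a`; returns 0.0 if either set is empty.
    """
    if not a or not b:
        return 0.0
    shared = len(a & b)
    return 0.5 * (shared / len(a) + shared / len(b))


def similarity_matrix(features):
    """Return (names, matrix) of pairwise similarities, names sorted."""
    names = sorted(features)
    matrix = [[kulczynski(features[p], features[q]) for q in names]
              for p in names]
    return names, matrix


# Invented feature sets, for illustration only
features = {
    "P450_a": {1, 2, 3, 4},
    "P450_b": {2, 3, 4, 5},
    "globin": {4, 9},
}
names, matrix = similarity_matrix(features)
for name, row in zip(names, matrix):
    print(name, [round(v, 3) for v in row])
```

Converting each entry to a distance (for instance 1 minus the similarity) would give the input expected by a neighbour joining routine; that tree-building step is not shown here.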
SLIDE 14 Nestor3D: free software (Java)
- Generation of consensus structures
- Generation of cavity descriptions
- Generation of similarity matrices
- Better understanding of active sites (drug design…)
- Function prediction:
detection of putative active sites from a protein 3D structure
generation of phylogenetic tree from similarity matrix
- Homology modelling:
generation of a structure template (constraints…)
validation of predicted protein structure
http://www.kingston.ac.uk/~ku33185/Nestor3D.html
SLIDE 15 Conclusion
- New method to produce high resolution active site models
- Consensus structure elements are biologically meaningful and related to function
- Requires proteins interacting with ligands or heterogeneous groups:
rigid molecules: 6% PDB entries
semi rigid molecules: 20% PDB entries
Future work
- Alignments based on PROSITE patterns
- Generation of a 3D motif database
5 kinases (ATP or ADP)