Prediction of Human Protein Kinase Substrate Specificities


  1. Prediction of Human Protein Kinase Substrate Specificities
     Javad Safaei (1), Jan Manuch (1), Arvind Gupta (1), Ladislav Stacho (2), Steven Pelech (3)
     (1) UBC, Department of Computer Science
     (2) SFU, Department of Mathematics
     (3) UBC, Department of Medicine, and Kinexus Bioinformatics Corporation

  2. Cell Signaling Network
     • The human body consists of different types of cells
     • 23,000 different protein types in cells
     • Cell types differ in the level of each protein type
     • Defects in the cell signaling network lead to 400 diseases (esp. cancer, diabetes, and Alzheimer's)
     • Modeling the network is useful for drug discovery

  3. Major components in cell phosphorylation signaling
     • Each component participates in interactions via its domains (d_1, d_2)
     • Phosphorylation creates dramatic changes in the 3D structure of proteins, leading to inhibition or stimulation of proteins
     • Kinases are S, S-T, or Y specific, based on the residues they phosphorylate

  4. Dynamics of Kinase-Substrate Interaction (Docking)
     • The kinase-phosphosite interaction is a kind of key-and-lock model
     • Active sites should be close to each other for interaction
     • Important factors in the bond:
       • Size and position of amino acids
       • Charge of amino acids
     [Figure: protein substrate docking onto a protein kinase, with residues labeled H, +, and -]

  5. cAMP-dependent Protein Kinase Structure (PKA)
     • Alanine as base 0 position
     • Isoleucine as +1 position
     • Arginine as -2 position
     • Phospho-S peptide: GRTGRRNSIHPDSAC
     • Sub-domains are shown in color
     • SDRs (specificity-determining residues) are the key residues helpful for specificity prediction
     [Figure: PKA structure with peptide residues +1 I and -2 R and kinase residues P169, E170, L198, P202, L205, E230 labeled]

  6. Problem and nature of the dataset
     • Peptides are usually found in vitro by mass spectrometry
     • A peptide is a short sub-sequence of length 15 centered at the phospho-site (S, T, or Y)
     • Kinases fall into three groups:
       • with a lot of peptides
       • with a few peptides
       • with no peptide
     • The problem is to find the PSSM matrix (kinase specificity) of all kinases, having only their primary structure
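The peptide definition on this slide (a window of length 15 centered at the phospho-site) can be illustrated with a minimal sketch. The function name `phospho_window`, the 0-based `site` index, and the use of `-` to pad windows that run past a protein terminus are all assumptions made for illustration; the slides do not say how termini are handled.

```python
def phospho_window(sequence: str, site: int, width: int = 15, pad: str = "-") -> str:
    """Return the `width`-residue peptide centered on the 0-based index `site`.

    Hypothetical helper: padding with `pad` at the protein termini is an
    assumption, not something stated in the slides.
    """
    if width % 2 == 0:
        raise ValueError("width must be odd so the site sits at the center")
    if not 0 <= site < len(sequence):
        raise IndexError("site is outside the sequence")
    if sequence[site] not in "STY":
        raise ValueError(f"residue {sequence[site]!r} at index {site} is not S, T, or Y")

    half = width // 2
    # Residues before and after the site, truncated at the sequence ends.
    left = sequence[max(0, site - half):site]
    right = sequence[site + 1:site + 1 + half]
    # Pad on the outside so the phospho-site always lands in the middle.
    return left.rjust(half, pad) + sequence[site] + right.ljust(half, pad)
```

Applied to the PKA peptide from slide 5, `phospho_window("GRTGRRNSIHPDSAC", 7)` returns the peptide unchanged, with the phospho-serine at the center, I at +1, and R at -2.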

  7. Alignments of catalytic domains
     • Done with the ClustalW tool
     • Refined manually by experts
     • Each column is a random variable (RV)
     • We can now infer what the dynamics of the binding will be

  8. Charge Matrix R(x_i, y_j)
     • Glycine is favoured to be on the peptide
     • Histidine is less positive than the others
     • S, T, and Y are neutral but tend to attract each other
     • Proline is neutral and creates a stair-like structure on the protein

  9. Graphical model of the interaction
     • Mutual Information
     • Charge Dependency (n is the number of training data for each RV)
     • Correlation Charge Dependency
     [Figure: graph with nodes X_1, X_2, ..., X_247 and Y_1, Y_2, ..., Y_15]
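The formulas on this slide did not survive extraction, so the following is only a sketch of the standard empirical mutual information between two discrete random variables, not necessarily the estimator the authors used. It assumes each training example contributes one residue for an alignment column X_i and one residue for a peptide position Y_j; the function name `mutual_information` is hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two paired symbol lists.

    xs[t] and ys[t] are assumed to be the residues observed in the same
    kinase-peptide training pair t. No small-sample correction is applied.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")

    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))

    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts.
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi
```

Ranking the X columns by a score of this kind is one plausible way to arrive at a short list of candidate SDRs, but the ranking criterion actually used is not recoverable from the slide text.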

  10. Graphical model of the interaction
      • Pick the top 7 X variables as SDRs (Z_1, ..., Z_7)
      • Compute the probability of each amino acid on the peptide
      • Having trained the model, for a new kinase's aligned catalytic domain we can predict the specificity matrix, knowing only the SDRs
      [Figure: nodes Z_1, ..., Z_7, each linked to Y_1 by an edge labeled C_c(Z_k, Y_1)]

  11. Compute the profile matrix of a kinase without peptide data
      • Having trained the model, for a new kinase's aligned catalytic domain we can predict the profile matrix, knowing only the SDRs

  12. Data and Process Flow
      [Flowchart; recoverable box labels:]
      • Data sources: Phospho.ELM, literature, PhosphoSite Plus
      • Machine learning: ANN, SVM, HMM
      • 9,125 kinase-phosphopeptide pairs for 309 kinase domains
      • Compute background (surface) frequency of amino acids
      • 550 kinases in human
      • Remove atypical kinases
      • 500 kinase catalytic domain sequences from 488 kinases
      • 229 kinases with consensus
      • Compute specificity (PSSM) matrices of 309 kinase domains
      • Compute profile matrix of 309 kinase domains with data
      • Find SDRs and profile matrix of 500 kinases
      • Kinases with no data
      • Compute PSSM matrices for 500 domains
      • Comparison with experimental sites
      • Comparison with the NetPhorest predictor

  13. Definitions
      • Background frequency $B(i)$: the probability of amino acid $i$ on the surface, computed from the peptide training data
      • Profile matrix of each kinase, $P_k(i,j)$: the frequency of amino acid $i$ at position $j$ of the peptides phosphorylated by kinase $k$
      • The specificity (PSSM) matrix of a kinase is usually the log-odds ratio $M_k(i,j) = \log\left(P_k(i,j) / B(i)\right)$
      • We used the following equation to eliminate $-\infty$ entries in the matrix: $M_k(i,j) = \operatorname{sgn}\left(P_k(i,j) - B(i)\right) \times \left|P_k(i,j) - B(i)\right|^{1.2}$
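The definitions on this slide can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, peptides are assumed to be equal-length strings over the 20 standard amino acids (other characters, such as padding, are ignored), and the background is assumed to be pooled over all positions of all training peptides.

```python
from collections import Counter
from math import copysign, log

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def background(peptides):
    """B(i): frequency of each amino acid over all positions of all peptides."""
    counts = Counter(aa for pep in peptides for aa in pep if aa in AMINO_ACIDS)
    total = sum(counts.values())
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def profile(peptides, width=15):
    """P_k(i, j): per-position amino acid frequencies, indexed as result[j][i]."""
    columns = []
    for j in range(width):
        counts = Counter(pep[j] for pep in peptides if pep[j] in AMINO_ACIDS)
        total = sum(counts.values())
        columns.append({aa: counts[aa] / total if total else 0.0 for aa in AMINO_ACIDS})
    return columns


def log_odds(p, b):
    """Usual PSSM entry log(P/B); it is -inf whenever the amino acid is unseen."""
    return log(p / b) if p > 0 else float("-inf")


def signed_power(p, b, exponent=1.2):
    """Entry from the slide: sgn(P - B) * |P - B|**1.2, which is always finite."""
    diff = p - b
    return copysign(abs(diff) ** exponent, diff)


def pssm(kinase_peptides, all_peptides, width=15, score=signed_power):
    """M_k(i, j) for one kinase, indexed as result[j][i]."""
    b = background(all_peptides)
    p = profile(kinase_peptides, width)
    return [{aa: score(p[j][aa], b[aa]) for aa in AMINO_ACIDS} for j in range(width)]
```

Passing `score=log_odds` reproduces the conventional matrix, including its `-inf` entries for amino acids never observed at a position, while the default `signed_power` keeps every entry finite and preserves the sign of the enrichment or depletion.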

  14. Predicted vs. experimental profile matrices
      • Comparison for the 309 kinases for which we have phospho-peptide data
      • Prediction was 100% correct in recognizing S, S-T, and Y specific kinases, using only their aligned SDRs

  15. Comparison with NetPhorest
      NetPhorest has:
      • 8,746 phosphosite-kinase pairs
      • 169 kinases
      • 50 kinase groups
      • Doesn't work for kinases with no data
      • Keeping the best kinase for each site leads to 6,299 site-kinase pairs for comparison
      Our method:
      • Works for all 500 kinases
      • 500 different profile matrices and specificities
      • Finds SDRs, yielding information about 3D structure

  16. Future work (1)
      • Hybrid recommender systems for prediction
      • The sparse utility matrix should be completed
      • SDRs are therefore important features in the user spec vector
      [Figure: users correspond to kinases (User/Kinase 1 ... N) and movies to peptides (Movie/Peptide 1 ... M), with edges labeled u_1 ... u_5 and one unknown entry (?); similarity matrix S (N×N), utility matrix U (N×M), similarity matrix Q (M×M)]
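To make the recommender analogy concrete, here is a minimal sketch of one way a sparse utility matrix could be completed: a similarity-weighted average over other kinases (plain user-based collaborative filtering). This is a stand-in chosen for illustration, not the hybrid method the slide proposes. The names `complete_utility`, `U`, and `S` follow the figure's matrix labels, but how the kinase similarity matrix S would be built (for example from SDRs) is an assumption, and missing entries are represented by `None`.

```python
def complete_utility(U, S):
    """Fill missing entries of the N x M utility matrix U.

    U[k][p] is the known kinase-peptide utility, or None if unknown.
    S[k][other] is an assumed non-negative kinase-kinase similarity (N x N).
    Entries with no similar kinase that has data stay None.
    """
    n_kinases = len(U)
    n_peptides = len(U[0])
    filled = [row[:] for row in U]  # leave the input untouched

    for k in range(n_kinases):
        for p in range(n_peptides):
            if U[k][p] is not None:
                continue
            weighted_sum = 0.0
            weight_total = 0.0
            for other in range(n_kinases):
                if other != k and U[other][p] is not None and S[k][other] > 0:
                    weighted_sum += S[k][other] * U[other][p]
                    weight_total += S[k][other]
            if weight_total > 0:
                filled[k][p] = weighted_sum / weight_total
    return filled
```

A kinase with no peptide data at all still receives predictions here, because the weights come only from S; that is the property that makes a similarity derived from sequence features attractive for kinases without training peptides.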

  17. Future work (2)
      • Generalize the method to SH2 and PTB domain proteins, to complete our model of the cell signalling pathway
      • Many crystallographic datasets are available from PDB, so computational geometry or vision methods can be applied
      • As in the user-movie problem, there is a signal strength between SH2 domain proteins and receptor (substrate) proteins
