Evolutionary Conservation of Human Phosphorylation Sites



  1. Evolutionary Conservation of Human Phosphorylation Sites
     Javad Safaei¹, Jan Manuch¹, Arvind Gupta¹, Ladislav Stacho², Steven Pelech³
     ¹ UBC, Department of Computer Science
     ² SFU, Department of Mathematics
     ³ UBC, Department of Medicine, and Kinexus Bioinformatics Corporation

  2. Cell Signaling Network
     - The human body consists of different types of cells
     - 23,000 different protein types in cells
     - Different cell types differ in the level of each protein type
     - Defects in the cell signaling network lead to 400 diseases (esp. cancer, diabetes, and Alzheimer's)
     - Modeling the network is useful for drug discovery

  3. Cell Phosphorylation Network
     - Network defects correlate with 400 diseases (e.g. cancer)
     - Phosphorylation is an important post-translational modification (PTM)
       - Protein kinases phosphorylate substrates, protein phosphatases dephosphorylate substrates, and phospho-dependent proteins bind to the phosphate and move around it
       - It can change protein function and 3D structure dramatically
     - Phosphosites
       - Occur only on serine (S), threonine (T), tyrosine (Y), and rarely histidine (H)
       - Fall into two main groups:
         - Inhibitory: inhibit the protein's activity
         - Activatory: activate the protein

  4. Phosphosites Conservation
     - Why?
       - Correlate conservation with inhibition/activation sites
       - Correlate conservation with confirmed disease mutation data
       - Investigate conservation in S, T, and Y sites
       - Measure how often phosphates (a negative moiety) are replaced by negatively charged amino acids: aspartic acid (D) and glutamic acid (E)
     - Conservation of sites requires conservation of proteins, and that requires recognition of human protein orthologs in other species.

  5. Orthologs Recognition
     - The most similar protein in the other species is taken as the ortholog protein
     - A certain threshold of similarity is needed for a protein to count as an ortholog
     - Global sequence alignment is the similarity measure
     - In the figure, protein orthologs are aligned with blue rectangles
     - The number of proteins differs between species

  6. Orthologs Recognition
     - The protein databases are big, so the search needs to be fast and accurate
     - For each species, build the BLAST database from the FASTA sequences:
         Species_DB <= formatdb -i Species_Seqs -p T -o T
       (-p T marks the input as protein sequences, and -o T creates indices.)
     - For each human protein, run a BLAST search on each formatted species database and retrieve the top five candidate proteins:
         Top_5_Orthologs <= blastp -i Input_Seq -d Species_DB -b 5
     - BLAST is an imperfect database search, so for each candidate protein compute the global alignment based on Needleman–Wunsch.
     - The candidate with the highest percent identity is chosen as the ortholog of the human protein in that species.
     - As a sanity check, the procedure correctly finds the protein itself when run against the human protein database.
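
The re-ranking step on slide 6 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the scoring values (`match`, `mismatch`, `gap`), the definition of percent identity as matches over alignment length, and the `min_identity` threshold of 30% are all assumptions, and `best_ortholog` is a hypothetical helper name.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return one optimal global alignment of a and b as two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover the alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of the alignment length (assumed definition)."""
    if not aligned_a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


def best_ortholog(human_seq, candidates, min_identity=30.0):
    """Pick the candidate (name -> sequence) with the highest percent identity.

    Returns (name, identity), or None if no candidate reaches the
    hypothetical min_identity threshold.
    """
    best = None
    for name, seq in candidates.items():
        pid = percent_identity(*needleman_wunsch(human_seq, seq))
        if pid >= min_identity and (best is None or pid > best[1]):
            best = (name, pid)
    return best
```

In the pipeline described on the slide, `candidates` would hold the top five BLAST hits for one human protein in one species.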

  7. Conservation of Phosphosites
     - Phosphosites are analyzed through regions r1, r2, r3 (subsequences) centered at each site (15 residues in our case)
     - This region is well known in biology, and the specificity of kinases and phosphatases is defined using it.
     - Globally aligning human proteins (p_h) with species orthologs (p_s) automatically aligns the phospho-regions, but with a high probability of gaps.
     - We modified Needleman–Wunsch global alignment to place gaps outside of the phospho-regions, and also to predict more sites in the ortholog: constrained global alignment (CGA)
     - Some sites (r3) are aligned with amino acids other than S, T, Y (we don't count those cases in the statistics).
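
As a small illustration of the 15-residue phospho-region, the sketch below extracts the window centered at a site. The function name `phospho_region`, the 0-based indexing, and the choice to truncate the window at the protein termini are assumptions; the slides do not say how sites near the ends are handled.

```python
PHOSPHO_RESIDUES = {"S", "T", "Y"}


def phospho_region(protein_seq, site_index, flank=7):
    """Return the window of up to 2*flank+1 residues centered at site_index.

    site_index is 0-based. Near the termini the window is simply truncated
    (an assumption made for this sketch).
    """
    if protein_seq[site_index] not in PHOSPHO_RESIDUES:
        raise ValueError("site is not S, T, or Y")
    start = max(0, site_index - flank)
    end = min(len(protein_seq), site_index + flank + 1)
    return protein_seq[start:end]
```

With the default `flank=7`, an interior site yields exactly 15 residues with the phosphosite in the middle position.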

  8. Constrained Global Alignment (CGA)
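
The slides do not give the details of CGA, so the following is only one possible reading, sketched under stated assumptions: a Needleman-Wunsch variant in which any gap that would fall inside a human phospho-region gets an infinite penalty, so all gaps are pushed outside the regions. The regions are assumed to be non-overlapping half-open index ranges, the scoring values are arbitrary, and the site-prediction part of CGA is not covered.

```python
NEG_INF = float("-inf")


def constrained_global_alignment(human, ortholog, regions,
                                 match=1, mismatch=-1, gap=-2):
    """Globally align two sequences while forbidding gaps inside `regions`.

    regions: non-overlapping (start, end) half-open ranges in `human`.
    Returns (aligned_human, aligned_ortholog), or None if no gap-free
    placement of the regions exists.
    """
    n, m = len(human), len(ortholog)
    region_of = [None] * n
    for k, (start, end) in enumerate(regions):
        for i in range(start, end):
            region_of[i] = k

    def sub(i, j):
        return match if human[i - 1] == ortholog[j - 1] else mismatch

    def delete_cost(i):
        # human[i-1] aligned to a gap in the ortholog
        return NEG_INF if region_of[i - 1] is not None else gap

    def insert_cost(i):
        # gap opened in human between human[i-1] and human[i]
        inside = (0 < i < n and region_of[i - 1] is not None
                  and region_of[i - 1] == region_of[i])
        return NEG_INF if inside else gap

    score = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    score[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            options = []
            if i > 0 and j > 0:
                options.append(score[i - 1][j - 1] + sub(i, j))
            if i > 0:
                options.append(score[i - 1][j] + delete_cost(i))
            if j > 0:
                options.append(score[i][j - 1] + insert_cost(i))
            score[i][j] = max(options)
    if score[n][m] == NEG_INF:
        return None

    # Trace back; every cell on the optimal path has a finite score.
    out_h, out_o = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_h.append(human[i - 1]); out_o.append(ortholog[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + delete_cost(i):
            out_h.append(human[i - 1]); out_o.append("-"); i -= 1
        else:
            out_h.append("-"); out_o.append(ortholog[j - 1]); j -= 1
    return "".join(reversed(out_h)), "".join(reversed(out_o))
```

A hard constraint is the simplest choice for a sketch; a softer variant would use a large but finite penalty so that an alignment is always returned.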

  9. Phosphosite Prediction in Human

     | Prediction         | Species               | # Proteins | P-Ser  | P-Thr  | P-Tyr  | Total Sites |
     |--------------------|-----------------------|------------|--------|--------|--------|-------------|
     | Yeast to Human     | Yeast                 | 1,542      | 8,184  | 1,855  | 0      | 10,039      |
     |                    | Human                 | 311        | 225    | 126    | 9      | 360         |
     |                    | Ratio (Human/Yeast)   | 20.17%     | 2.75%  | 6.79%  | NA     | 3.59%       |
     | Worm to Human      | Worm                  | 696        | 3,060  | 440    | 114    | 3,614       |
     |                    | Human                 | 369        | 178    | 82     | 27     | 287         |
     |                    | Ratio (Human/Worm)    | 53.02%     | 5.82%  | 18.64% | 23.68% | 7.94%       |
     | Fruit fly to Human | Fruit fly             | 3,956      | 11,556 | 3,495  | 705    | 15,756      |
     |                    | Human                 | 1,676      | 1,666  | 917    | 188    | 2,771       |
     |                    | Ratio (Human/Fruit fly) | 42.37%   | 14.42% | 26.24% | 26.67% | 17.59%      |
     | Total Predicted    | Human                 | 2,356      | 2,069  | 1,125  | 224    | 3,418       |

     - Sites were gathered from PhosphoSitePlus, Phospho.ELM, PHOSIDA, and the literature
     - Over 3,000 phosphosites were predicted by constrained global alignment from about 30,000 sites in 3 different species.
     - T- and Y-sites are more conserved than S-sites.
     - Zero Y-sites in yeast lead to 9 Y-sites in human (i.e. S or T have changed to Y in human)
     - The more similar the species is to human, the more sites are predicted in human.
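
The ratio rows in the table on slide 9 are the predicted human count divided by the source-species count for each column. A minimal sketch using the yeast numbers from the table (the helper name `ratio` is hypothetical):

```python
def ratio(human_count, species_count):
    """Human/species ratio as a percentage string, 'NA' when the species count is zero."""
    if species_count == 0:
        return "NA"
    return f"{100 * human_count / species_count:.2f}%"


# Columns: # Proteins, P-Ser, P-Thr, P-Tyr, Total Sites (values from the slide)
yeast = [1542, 8184, 1855, 0, 10039]
human_from_yeast = [311, 225, 126, 9, 360]

yeast_ratios = [ratio(h, y) for h, y in zip(human_from_yeast, yeast)]
```

The P-Tyr ratio is reported as NA because no phosphotyrosine sites are listed for yeast, even though 9 human Y-sites are predicted from yeast S or T sites.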

  10. Phosphosite Prediction in Species
     - Uses 90K experimentally confirmed phosphosites in human
     - Predicts about 620K sites in 19 species
     - Availability: www.phosphonet.ca includes exact protein and site information
     - The farther the species is from human, the higher the Thr/Ser ratio

     | #  | Species                        | P-Ser   | P-Thr   | P-Tyr   | Thr/Ser | All     |
     |----|--------------------------------|---------|---------|---------|---------|---------|
     |    | Human                          | 53,478  | 16,971  | 18,849  | 32%     | 89,298  |
     | 1  | Mouse                          | 45,096  | 14,344  | 16,598  | 32%     | 76,038  |
     | 2  | Dog                            | 42,479  | 13,605  | 15,830  | 32%     | 71,914  |
     | 3  | Chimpanzee                     | 41,471  | 14,030  | 15,227  | 34%     | 70,728  |
     | 4  | Rhesus macaque                 | 40,163  | 13,228  | 14,735  | 33%     | 68,126  |
     | 5  | Rat                            | 39,733  | 13,437  | 14,672  | 34%     | 67,842  |
     | 6  | Chicken                        | 30,333  | 11,233  | 12,566  | 37%     | 54,132  |
     | 7  | Brachydanio rerio (zebrafish)  | 26,669  | 11,045  | 11,050  | 41%     | 48,764  |
     | 8  | Duckbill platypus              | 24,467  | 9,035   | 10,023  | 37%     | 43,525  |
     | 9  | African clawed frog            | 19,780  | 8,617   | 8,911   | 44%     | 37,308  |
     | 10 | Fruit fly                      | 9,665   | 5,878   | 4,698   | 61%     | 20,241  |
     | 11 | Purple sea urchin              | 8,156   | 4,709   | 3,489   | 58%     | 16,354  |
     | 12 | Honeybee                       | 6,766   | 4,219   | 3,440   | 62%     | 14,425  |
     | 13 | Nematode worm                  | 5,364   | 3,390   | 2,846   | 63%     | 11,600  |
     | 14 | Baker's yeast                  | 3,135   | 2,223   | 1,661   | 71%     | 7,019   |
     | 15 | Mouse-ear cress                | 3,070   | 1,752   | 1,444   | 57%     | 6,266   |
     | 16 | Red bread mold                 | 791     | 671     | 557     | 85%     | 2,019   |
     | 17 | Maize                          | 693     | 419     | 488     | 60%     | 1,600   |
     | 18 | Western balsam poplar          | 747     | 430     | 371     | 58%     | 1,548   |
     | 19 | Tammar wallaby                 | 38      | 31      | 23      | 82%     | 92      |
     |    | Total Predicted Sites          | 348,616 | 132,296 | 138,629 | NA      | 619,541 |
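
The Thr/Ser column on slide 10 is the P-Thr count divided by the P-Ser count, rounded to a whole percent. A short sketch over a few rows of the table (the function name `thr_ser_percent` is hypothetical):

```python
def thr_ser_percent(p_thr, p_ser):
    """P-Thr count as a whole-number percentage of the P-Ser count."""
    return round(100 * p_thr / p_ser)


# (P-Ser, P-Thr) pairs copied from the slide
counts = {
    "Human": (53478, 16971),
    "Fruit fly": (9665, 5878),
    "Baker's yeast": (3135, 2223),
    "Red bread mold": (791, 671),
}
thr_ser = {name: thr_ser_percent(thr, ser) for name, (ser, thr) in counts.items()}
```

The selected rows show the trend stated on the slide: the ratio rises from 32% in human to 85% in red bread mold.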

  11. Human Phosphosites Scores
     - Avg Activation score
       - Checks whether the negatively charged phosphate group is replaced by aspartic acid (D) or glutamic acid (E) in other species to keep the functionality.
     - Avg Conservation score
       - Identity conservation
       - Similarity conservation
     - Each average divides by the number of phospho-regions found (fewer than 20)
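
A minimal sketch of the identity-based average conservation score, assuming each ortholog contributes one gap-free region of the same length as the human region and that species with no region found are represented by `None`. The exact formula behind the slide's scores (and the activation score in particular) is not given, so this covers only the averaging step and the function names are hypothetical.

```python
def region_identity(human_region, ortholog_region):
    """Percentage of positions with identical residues in two equal-length regions."""
    if len(human_region) != len(ortholog_region):
        raise ValueError("regions must have equal length")
    same = sum(h == o for h, o in zip(human_region, ortholog_region))
    return 100.0 * same / len(human_region)


def average_conservation(human_region, ortholog_regions):
    """Average identity over the species in which a phospho-region was found."""
    found = [r for r in ortholog_regions if r is not None]
    if not found:
        return 0.0
    return sum(region_identity(human_region, r) for r in found) / len(found)
```

Dividing by the number of regions found, rather than by the number of species searched, keeps a site from being penalized merely because an ortholog is missing.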

  12. Amino Acids Similarity
     - To compute the percent similarity of phospho-regions, the following graph is suggested by experience.
     - An edge means the two amino acids are similar
     - This differs from the BLOSUM matrix, which is for conservation
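
The similarity graph itself is not reproduced in this transcript, so the sketch below uses a small, purely illustrative edge set (S-T, D-E, K-R, I-L, F-Y) that is an assumption and not the graph from the slide. It shows only how percent similarity could be computed once such a graph is fixed.

```python
# Hypothetical edges for illustration only; not the graph from the slide.
SIMILAR_EDGES = {frozenset(pair) for pair in ("ST", "DE", "KR", "IL", "FY")}


def is_similar(x, y):
    """Two residues are similar if identical or joined by an edge in the graph."""
    return x == y or frozenset((x, y)) in SIMILAR_EDGES


def percent_similarity(region_a, region_b):
    """Percentage of aligned positions whose residues are similar."""
    if len(region_a) != len(region_b):
        raise ValueError("regions must have equal length")
    hits = sum(is_similar(x, y) for x, y in zip(region_a, region_b))
    return 100.0 * hits / len(region_a)
```

Unlike a substitution matrix, the graph gives a yes/no answer per position, so the score is a simple percentage.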

  13. Conclusion

     Results

     | Site group (total)              | Score                         | P-Ser    | P-Thr    | P-Tyr    | Total    |
     |---------------------------------|-------------------------------|----------|----------|----------|----------|
     | All P-Sites (89,298)            | Avg Activation                | 0.00390  | 0.00099  | -0.00048 | 0.00242  |
     |                                 | Avg Conservation              | 25.62    | 27.33    | 30.52    | 26.98    |
     | Functional P-Sites (769)        | Avg Activation - Activating   | 0.006557 | 0.006543 | 0.018085 | 0.009709 |
     |                                 | Avg Conservation - Activating | 35.81    | 39.36    | 34.57    | 36.59    |
     |                                 | Avg Activation - Inhibitory   | 9.26E-05 | -0.00634 | 0.025484 | 0.003    |
     |                                 | Avg Conservation - Inhibitory | 30.56    | 34.34    | 33.34    | 31.90    |
     | Functional Kinase P-Sites (183) | Avg Activation - Activating   | 0.006025 | 0.008227 | 0.016565 | 0.009931 |
     |                                 | Avg Conservation - Activating | 37.48    | 40.51    | 34.86    | 37.67    |
     |                                 | Avg Activation - Inhibitory   | -0.00357 | 0.002857 | 0.025    | 0.005179 |
     |                                 | Avg Conservation - Inhibitory | 30.29    | 36.01    | 33.30    | 32.47    |
     | Kinase P-Sites (7,121)          | Avg Activation                | 0.000331 | 0.000408 | 0.003833 | 0.001276 |
     |                                 | Avg Conservation              | 27.62    | 32.34    | 33.78    | 30.33    |

     - Conservation (similarity is used)
       - Phospho-Thr sites are more conserved than Ser and Tyr sites.
       - The Phospho-Thr/Phospho-Ser ratio increases in species farther from human
       - Kinase sites are more conserved than a random site in a substrate
       - Functional activatory sites are more conserved.
     - Activation scores
       - Activatory sites have a higher avg. activation score than inhibitory sites, as we expected.

  14. Acknowledgement
     - CRD grant from the Natural Sciences and Engineering Research Council (NSERC) of Canada and the MITACS Accelerate Internship Program
     - Kinexus Bioinformatics Corporation, for data preparation

  15. Questions
