
Introduction Applied Bioinformatics Michael Schroeder - PowerPoint PPT Presentation



  1. Introduction Applied Bioinformatics Michael Schroeder Biotechnology Center TU Dresden

  2. DNA – the molecule of life http://www.ornl.gov/hgmis

  3. High-throughput Technology. 1950s: Watson and Crick. 2000s: Sanger Center, Cambridge. 2010s: BGI, Beijing.

  4. Drug Discovery [Figure: new drugs per year and R&D spending plotted against year, 60 to 95]

  5. Genetic Code

  6. Actinidin and Papain. Cysteine proteases in kiwi and papaya, respectively. Tenderise meat and break down casein (milk). 50% sequence ID, same structure.

  7. Hemoglobin and Leghemoglobin. Oxygen transport in red blood cells and legumes, respectively. 11% sequence ID, same structure.

  8. Sequence-Structure Relation

  9. Similar sequences hint at …
     - common ancestry and
     - possibly similar function

  10. Similar sequence, similar function?
      - Monkey v-sis and human PDGF are 85% similar (Doolittle et al., Science, 1983: "Simian sarcoma virus onc gene, v-sis, is derived from the gene (or genes) encoding a platelet-derived growth factor")
      - Hypothesis: Cancer = deregulated growth factor
      Alignment from: http://pdf.aminer.org/000/244/500/design_and_implementation_of_a_dna_sequence_processor.pdf

  11. Similar sequence, common ancestry?

          >sp|P00674|RNP_HORSE Ribonuclease pancreatic Horse
          KESPAMKFERQHMDSGSTSSSNPTYCNQMMKRRNMTQGWCKPVNTFVHEPLADVQAICLQ…
          >sp|P00673|RNP_BALAC Ribonuclease pancreatic Minke whale
          RESPAMKFQRQHMDSGNSPGNNPNYCNQMMMRRKMTQGRCKPVNTFVHESLEDVKAVCSQ…
          >sp|P00686|RNP_MACRU Ribonuclease pancreatic Red kangaroo
          ETPAEKFQRQHMDTEHSTASSSNYCNLMMKARDMTSGRCKPLNTFIHEPKSVVDAVCHQE…

  12. Alignment: CLUSTAL 2.1 multiple sequence alignment (http://www.genome.jp/tools/clustalw)

          sp|P00674|RNP_HORSE   KESPAMKFERQHMDSGSTSSSNPTYCNQMMKRRNMTQGWCKPVNTFVHEPLADVQAICLQ 60
          sp|P00673|RNP_BALAC   RESPAMKFQRQHMDSGNSPGNNPNYCNQMMMRRKMTQGRCKPVNTFVHESLEDVKAVCSQ 60
          sp|P00686|RNP_MACRU   -ETPAEKFQRQHMDTEHSTASSSNYCNLMMKARDMTSGRCKPLNTFIHEPKSVVDAVCHQ 59
                                 *:** **:*****:  :......*** **  *.**.* ***:***:**.   *.*:* *

          sp|P00674|RNP_HORSE   KNITCKNGQSNCYQSSSSMHITDCRLTSGSKYPNCAYQTSQKERHIIVACEGNPYVPVHF 120
          sp|P00673|RNP_BALAC   KNVLCKNGRTNCYESNSTMHITDCRQTGSSKYPNCAYKTSQKEKHIIVACEGNPYVPVHF 120
          sp|P00686|RNP_MACRU   ENVTCKNGRTNCYKSNSRLSITNCRQTGASKYPNCQYETSNLNKQIIVACEG-QYVPVHF 118
                                :*: ****::***:*.* : **:** *..****** *:**: :::*******  ******

          sp|P00674|RNP_HORSE   DASVEVST 128
          sp|P00673|RNP_BALAC   DNSV---- 124
          sp|P00686|RNP_MACRU   DAYV---- 122
                                *  *

      Number of aligned residues
      - Horse and Minke whale: 95
      - Minke whale and Red kangaroo: 82
      - Horse and Red kangaroo: 75
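The pair counts on this slide can be reproduced by counting the alignment columns in which two rows carry the same residue; doing this by hand for horse and minke whale over all three blocks gives 95. A minimal sketch of that count, shown only on the last alignment block to keep it short (the function name `identical_columns` is a hypothetical helper, not part of ClustalW):

```python
from itertools import combinations


def identical_columns(row_a: str, row_b: str) -> int:
    """Count alignment columns where both rows hold the same residue (gaps excluded)."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    return sum(1 for x, y in zip(row_a, row_b) if x == y and x != "-")


# Last block of the alignment above (columns 121 to 128).
rows = {
    "Horse": "DASVEVST",
    "Minke whale": "DNSV----",
    "Red kangaroo": "DAYV----",
}

for (name_a, row_a), (name_b, row_b) in combinations(rows.items(), 2):
    print(f"{name_a} and {name_b}: {identical_columns(row_a, row_b)}")
```

Applying the same function to the concatenated rows of all three blocks yields the totals for the full alignment.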

  13. Similar sequence, common ancestry?

  14. African elephant: sp|O47885|CYB_ELEMA. Mammoth: sp|P92658|CYB_MAMPR. Indian elephant: sp|P24958|CYB_LOXAF.

  15. Elephant and Mammoth. Mammoth-African elephant: 10 mismatches. Mammoth-Indian elephant: 14 mismatches. Significant?

  16. Similarity implies homology. Sequence similarity is not equal to homology.

  17. Similarity usually implies homology
      - Conservation: Sequences similar in many species
      - Convergent evolution
      - Mutation rate varied
      - Horizontal gene transfer

  18. Homologue, Orthologue, Paralogue

  19. Darwin's Tree of Life

  20. Tree of Life with 2.3 million species (opentreeoflife.org)

  21. Sequence Alignments
      - Why compare and align sequences?
      - How to judge an alignment?
      - How to compute an alignment?
      - How to compute an alignment fast?

  22. How to judge an alignment
      - Scoring scheme
        - number of matches, mismatches, gaps
        - substitution matrices
      - Significance
        - E-value, P-value, Z-score
      - Structure
        - Benchmark sequence against structure alignment
      - Function
        - Benchmark sequence alignment implies similar function?
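The simplest scoring scheme listed above counts matches, mismatches, and gaps. A minimal sketch, assuming flat scores of +1 per match, -1 per mismatch, and -2 per gap column; these values and the name `score_alignment` are illustrative assumptions, and a substitution matrix would replace the flat match and mismatch scores with a residue-pair lookup:

```python
def score_alignment(row_a: str, row_b: str,
                    match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Sum a flat per-column score over two aligned rows ('-' marks a gap)."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    score = 0
    for x, y in zip(row_a, row_b):
        if x == "-" or y == "-":
            score += gap        # column with a gap (assumed linear gap cost)
        elif x == y:
            score += match      # identical residues
        else:
            score += mismatch   # substitution
    return score


# Example rows: 8 matches, 2 mismatches, 4 gap columns.
print(score_alignment("RDISLV---KNAGI", "RNI-LVSDAKNVGI"))  # -2
```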

  23. Sequence Alignments
      - Why compare and align sequences?
      - How to judge an alignment?
      - How to compute an alignment?
      - How to compute an alignment fast?

  24. Levenshtein (or Edit) Distance. Minimum number of insertions, deletions, and replacements to convert string a into string b.

  25. Levenshtein (or Edit) Distance. Let $a = a_1 \ldots a_m$ and $b = b_1 \ldots b_n$ be strings. Then $\mathrm{lev}_{a,b} = \mathrm{lev}_{a,b}(m, n)$ is the Levenshtein distance of $a$ and $b$, where

      $$\mathrm{lev}_{a,b}(i, j) = \begin{cases} \max(i, j) & \text{if } \min(i, j) = 0, \\ \min \begin{cases} \mathrm{lev}_{a,b}(i-1, j) + 1 \\ \mathrm{lev}_{a,b}(i, j-1) + 1 \\ \mathrm{lev}_{a,b}(i-1, j-1) + 1_{(a_i \neq b_j)} \end{cases} & \text{otherwise,} \end{cases}$$

      and $0 \le i \le m$ and $0 \le j \le n$ and

      $$1_{(a_i \neq b_j)} := \begin{cases} 1 & \text{if } a_i \neq b_j, \\ 0 & \text{if } a_i = b_j. \end{cases}$$
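The recurrence above can be evaluated bottom-up with dynamic programming. A minimal sketch that keeps only two rows of the table; the function name `levenshtein` and the test strings are illustrative choices, not taken from the slides:

```python
def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance of a and b, filling lev(i, j) row by row."""
    m, n = len(a), len(b)
    prev = list(range(n + 1))          # row i = 0: lev(0, j) = j
    for i in range(1, m + 1):
        curr = [i] + [0] * n           # column j = 0: lev(i, 0) = i
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1   # indicator 1(a_i != b_j)
            curr[j] = min(
                prev[j] + 1,           # deletion:    lev(i-1, j) + 1
                curr[j - 1] + 1,       # insertion:   lev(i, j-1) + 1
                prev[j - 1] + cost,    # replacement: lev(i-1, j-1) + indicator
            )
        prev = curr
    return prev[n]


print(levenshtein("kitten", "sitting"))  # 3
```

The table has (m + 1)(n + 1) cells, so time is proportional to m times n, while memory stays proportional to n because only the previous row is needed.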

  26. From Distance to Alignment. Aligning RDISLVKNAGI and RNILVSDAKNVGI:

          R D I S L V - - - K N A G I
          R N I - L V S D A K N V G I

      From lectures.molgen.mpg.de/Alg/Intro/
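An alignment can be read off the edit-distance table by tracing back from the last cell to the first. A minimal sketch under unit costs; the function name `align` is hypothetical, and because several alignments can be optimal, the rows it prints need not match the slide column for column:

```python
def align(a: str, b: str) -> tuple[str, str]:
    """Return one optimal unit-cost alignment of a and b as two gapped rows."""
    m, n = len(a), len(b)
    # Fill the full edit-distance table d[i][j] = lev(i, j).
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = int(a[i - 1] != b[j - 1])
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)

    # Trace back from (m, n) to (0, 0), emitting one column per step.
    row_a, row_b = [], []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + int(a[i - 1] != b[j - 1]):
            row_a.append(a[i - 1]); row_b.append(b[j - 1])   # match or replacement
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            row_a.append(a[i - 1]); row_b.append("-")        # gap in b
            i -= 1
        else:
            row_a.append("-"); row_b.append(b[j - 1])        # gap in a
            j -= 1
    return "".join(reversed(row_a)), "".join(reversed(row_b))


top, bottom = align("RDISLVKNAGI", "RNILVSDAKNVGI")
print(" ".join(top))
print(" ".join(bottom))
```

The number of columns in which the two rows differ equals the Levenshtein distance of the input strings.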

  27. Sequence Alignments
      - Why compare and align sequences?
      - How to judge an alignment?
      - How to compute an alignment?
      - How to compute an alignment fast?

  28. Computing Alignments fast (compbio.pbworks.com)

  29. Computing multiple sequence alignments

  30. Computing phylogenetic trees
      - Distance-based
        - Neighbour joining
        - Hierarchical clustering
      - Character-based
        - Parsimony method
        - Maximum Likelihood
      Saitou, Kyushu Museum, 2002
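As an illustration of the distance-based route, the sketch below runs average-linkage hierarchical clustering (UPGMA) on the three ribonuclease sequences from the alignment slide. The distances are an assumption made for this example: alignment length (128) minus the pair counts given there (95, 82, 75). The function name `upgma` is hypothetical, and branch lengths are omitted, so only the tree topology is returned.

```python
from itertools import combinations


def upgma(labels, dist):
    """Average-linkage clustering; returns the tree as nested tuples.

    dist maps frozenset({x, y}) to the distance between labels x and y.
    """
    sizes = {label: 1 for label in labels}    # cluster -> number of leaves
    d = dict(dist)
    while len(sizes) > 1:
        # Merge the two closest clusters.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        # Distance to every other cluster: size-weighted average.
        for c in sizes:
            if c not in (a, b):
                d[frozenset((merged, c))] = (
                    sizes[a] * d[frozenset((a, c))] + sizes[b] * d[frozenset((b, c))]
                ) / (sizes[a] + sizes[b])
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes))


# Assumed distances: 128 alignment columns minus the pair counts from the slide.
distances = {
    frozenset(("Horse", "Minke whale")): 128 - 95,
    frozenset(("Minke whale", "Red kangaroo")): 128 - 82,
    frozenset(("Horse", "Red kangaroo")): 128 - 75,
}
tree = upgma(["Horse", "Minke whale", "Red kangaroo"], distances)
print(tree)
```

With these distances, horse and minke whale are joined first and the red kangaroo is attached last. Neighbour joining would use the same distance matrix but a different merge criterion.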
