alignments in practice blast and clustal
play

Alignments in Practice BLAST and CLUSTAL Introduction to - PowerPoint PPT Presentation

Alignments in Practice BLAST and CLUSTAL Introduction to Bioinformatics Dortmund, 16.-20.07.2007 Lectures: Sven Rahmann Exercises: Udo Feldkamp, Michael Wurst 1 Overview Dot Plots Nucleotide BLAST Protein BLAST BLAST


  1. Alignments in Practice BLAST and CLUSTAL Introduction to Bioinformatics Dortmund, 16.-20.07.2007 Lectures: Sven Rahmann Exercises: Udo Feldkamp, Michael Wurst 1

  2. Overview ● Dot Plots ● Nucleotide BLAST ● Protein BLAST ● BLAST Statistics ● BLAT ● CLUSTAL ● JalView 2

  3. Dotter – Tool for Dot Plots ● http://www.cgb.ki.se/cgb/groups/sonnhammer/Dotter.html ● Dotlet: a Java applet for Dot Plots 3

  4. Dot Plots ● Hemoglobin Alpha against Hemoglobin Beta 4

  5. EBI Alignment Service 5

  6. BLAST ● URL: http://www.ncbi.nlm.nih.gov/BLAST/ ● Basic Local Alignment Search Tool 6

  7. Choose the right BLAST 7

  8. Nucleotide BLAST Interface 8

  9. BLAST Parameters ● Expect threshold: low [0.01] = strict high [100] = loose ● Word size: speed vs. sensitivity high = faster low = slower, but more sensitive 9

  10. Protein BLAST 10

  11. Protein BLAST Parameters 11

  12. Translated BLAST ● protein query against nucleotide database – nucleotide sequence not unique – also consider reverse complement ● nucleotide query against protein database – consider all 6 reading frames 12

  13. BLAST Output 13

  14. BLAST Output II Database + Accession Description Bit score E-value Link 14

  15. BLAST Statistics ● How good / reliable is a hit found by BLAST? ● Raw score := score of the alignment according to scoring matrix and gap penalties ● Bit score := score (log2 units), length-normalized ● E-value := Number of hits of such or better score in a hypothetical database of random proteins of the same size 15

  16. More on Statistics ● Null model := random model describing sequences without intentional signal (here: pair of random sequences without intentional similarity) ● (single) p-value for observed score s := Prob(Score >= s) in the null model ● (multiple) p-value := Prob(Score >= s at least once) 16

  17. BLAT ● BLAST-Like Alignment Tool ● index-based ● developed at UC Santa Cruz ● especially for searching in whole genomes ● very fast ● limited to nearly exact matches 17

  18. UCSC Genome Browser + BLAT 18

  19. CLUSTAL 19

  20. What Clustal Did (“Output file”) 20

  21. Clustal Results (pretty) 21

  22. Clustal Results (“alignment file”) CLUSTAL W (1.83) multiple sequence alignment FOS_RAT MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNTQDFCADLSVSSANF 60 FOS_MOUSE MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNTQDFCADLSVSSANF 60 FOS_HUMAN MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNAQDFCTDLAVSSANF 60 FOS_CHICK MMYQGFAGEYEAPSSRCSSASPAGDSLTYYPSPADSFSSMGSPVNSQDFCTDLAVSSANF 60 FOS_ZEBRAFISH MMFTSLNADCDASS-RCSTASPSGDSVGYY------------PLNQTQEFTDLSVSSASF 47 **: .: .: :*.* ***:***:***: ** *:* : :**:****.* FOS_RAT IPTVTAISTSPDLQWLVQPTLVSSVAPSQTRAPHPYGLPTPS-TGAYARAGVVKTMSGGR 119 FOS_MOUSE IPTVTAISTSPDLQWLVQPTLVSSVAPSQTRAPHPYGLPTQS-AGAYARAGMVKTVSGGR 119 FOS_HUMAN IPTVTAISTSPDLQWLVQPALVSSVAPSQTRAPHPFGVPAPS-AGAYSRAGVVKTMTGGR 119 FOS_CHICK VPTVTAISTSPDLQWLVQPTLISSVAPSQNRG-HPYGVPAPAPPAAYSRPAVLKAP-GGR 118 FOS_ZEBRAFISH VPTVTAISSCPDLQWMVQP-MISSAAPS-------NGAAQSYNPSSYPKMRVTGAK---- 95 :*******:.*****:*** ::**.*** * . ..:*.: : : FOS_RAT AQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 179 FOS_MOUSE AQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 179 FOS_HUMAN AQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 179 FOS_CHICK GQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEEEKSAL 178 FOS_ZEBRAFISH --TSNKRSRSEQLSPEEEEKKRVRRERSKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 153 : .:*.: **********:*:****.**************************:***** FOS_RAT QTEIANLLKEKEKLEFILAAHRPACKIPNDLGFPEE----MSVTS-LDLTGGLPEATTPE 234 FOS_MOUSE QTEIANLLKEKEKLEFILAAHRPACKIPDDLGFPEE----MSVAS-LDLTGGLPEASTPE 234 FOS_HUMAN QTEIANLLKEKEKLEFILAAHRPACKIPDDLGFPEE----MSVAS-LDLTGGLPEVATPE 234 FOS_CHICK QAEIANLLKEKEKLEFILAAHRPACKMPEELRFSEE----LAAATALDLG----APSPAA 230 FOS_ZEBRAFISH QNDIANLLKEKERLEFILAAHKPICKIPADASFPEPSSSPMSSISVPEIVTTSVVSSTPN 213 22 * :*********:********:* **:* : *.* :: : :: :..

  23. Clustal Guide Tree 23

  24. Clustal Guide Tree ● Guide Tree is not a phylogenetic tree, just a computational device ● Cladogram: edge lengths have no meaning ● Phylogram: edgle lengths correspond to distances 24

  25. JalView: Alignment Editor (start from the CLUSTAL web site) 25

  26. Simple JalView Window ● Simple alignment editor (Java applet) ● Complex alignment editor (Java application) – Web Start, or – Download installer 26

  27. Starting or Installing JalView www.jalview.org 27

  28. Multiple Alignment @ BiBiServ 28

  29. For Windows/MAC: QAlign2 ● URL: http://gi.cebitec.uni-bielefeld.de/QAlign/ ● Live Demo of QAlign2 29

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend