

SLIDE 1

Alignments in Practice BLAST and CLUSTAL

Introduction to Bioinformatics Dortmund, 16.-20.07.2007 Lectures: Sven Rahmann Exercises: Udo Feldkamp, Michael Wurst

SLIDE 2

Overview

  • Dot Plots
  • Nucleotide BLAST
  • Protein BLAST
  • BLAST Statistics
  • BLAT
  • CLUSTAL
  • JalView
SLIDE 3

Dotter – Tool for Dot Plots

  • http://www.cgb.ki.se/cgb/groups/sonnhammer/Dotter.html
  • Dotlet: a Java applet for Dot Plots
SLIDE 4

Dot Plots

  • Hemoglobin Alpha against Hemoglobin Beta
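
The idea behind such a dot plot can be sketched in a few lines of Python. This is only an illustration and not how Dotter or Dotlet are implemented: the names `dot_plot` and `render`, the window length and the match threshold are assumptions, and the two short strings are toy sequences rather than real hemoglobin data.

```python
def dot_plot(seq_a, seq_b, window=3, min_matches=3):
    """Return the set of (i, j) cells where the windows starting at
    seq_a[i] and seq_b[j] agree in at least min_matches positions."""
    dots = set()
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(
                1 for k in range(window) if seq_a[i + k] == seq_b[j + k]
            )
            if matches >= min_matches:
                dots.add((i, j))
    return dots


def render(seq_a, seq_b, dots):
    """Draw the plot as text: rows follow seq_a, columns follow seq_b."""
    lines = ["  " + seq_b]
    for i, residue in enumerate(seq_a):
        row = "".join("*" if (i, j) in dots else "." for j in range(len(seq_b)))
        lines.append(residue + " " + row)
    return "\n".join(lines)


if __name__ == "__main__":
    a = "GATTACAGATTACA"  # toy sequences (assumption), not hemoglobin
    b = "GATTTACAGAT"
    print(render(a, b, dot_plot(a, b)))
```

Similar regions show up as diagonal runs of stars; a larger window or a higher threshold suppresses isolated chance matches.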
SLIDE 5

EBI Alignment Service

SLIDE 6

BLAST

  • URL: http://www.ncbi.nlm.nih.gov/BLAST/
  • Basic Local Alignment Search Tool
SLIDE 7

Choose the right BLAST

SLIDE 8

Nucleotide BLAST Interface

SLIDE 9

BLAST Parameters

  • Expect threshold:
    – low [0.01] = strict
    – high [100] = loose
  • Word size: speed vs. sensitivity
    – high = faster
    – low = slower, but more sensitive
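
The word-size trade-off can be illustrated with a simplified seeding step. The sketch below only finds exact shared words, roughly as in a nucleotide search; the function name `word_hits` and the example sequences are assumptions, and real BLAST adds extension and scoring on top of this.

```python
def word_hits(query, subject, word_size):
    """Return (query_pos, subject_pos) pairs where an exact word of
    length word_size is shared; these are candidate seeds."""
    index = {}
    for j in range(len(subject) - word_size + 1):
        index.setdefault(subject[j:j + word_size], []).append(j)
    hits = []
    for i in range(len(query) - word_size + 1):
        for j in index.get(query[i:i + word_size], []):
            hits.append((i, j))
    return hits


if __name__ == "__main__":
    query = "ACGTACGTTAGC"    # toy sequences (assumption)
    subject = "ACGTTCGTTAGC"  # differs from the query at one position
    for w in (4, 8):
        print(w, len(word_hits(query, subject, w)))
```

With the single mismatch in the middle, no word of length 8 is shared, so the large word size misses the similarity, while word size 4 still yields seeds at the cost of more lookups and more chance hits.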

SLIDE 10

Protein BLAST

SLIDE 11

Protein BLAST Parameters

SLIDE 12

Translated BLAST

  • protein query against nucleotide database
    – nucleotide sequence not unique (several codons can encode the same amino acid)
    – also consider reverse complement
  • nucleotide query against protein database
    – consider all 6 reading frames
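
The six reading frames are the three codon offsets on the given strand plus the three offsets on its reverse complement. A minimal sketch, assuming the standard genetic code and plain A/C/G/T input; the function names and frame labels (`+1` to `-3`) are illustrative choices.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Complement every base and reverse the strand."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become 'X'."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


def six_frames(dna):
    """Return the translations of all six reading frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frames("ATGGCCTAA").items():
        print(name, protein)
```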

SLIDE 13

BLAST Output

SLIDE 14

BLAST Output II

  • Database + accession link
  • Bit score
  • E-value
  • Description

SLIDE 15

BLAST Statistics

  • How good / reliable is a hit found by BLAST?
  • Raw score := score of the alignment according to scoring matrix and gap penalties
  • Bit score := raw score rescaled to log2 units (bits) using the statistical parameters of the scoring system, so that scores from different scoring systems are comparable
  • E-value := expected number of hits with this score or better in a hypothetical database of random sequences of the same size
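
The relation between raw score, bit score and E-value can be sketched with the Karlin-Altschul formulas S' = (lambda * S - ln K) / ln 2 and E = m * n * 2^(-S'). In the sketch below the values of `lam` and `k` are placeholders of the kind reported for a gapped BLOSUM62 search, and the lengths are made up; real BLAST additionally corrects the lengths for edge effects, which is omitted here.

```python
import math


def bit_score(raw_score, lam, k):
    """Rescale a raw alignment score to bits: (lam * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(bits, query_len, db_len):
    """Expected number of chance hits with at least this bit score
    in a search space of query_len * db_len."""
    return query_len * db_len * 2.0 ** (-bits)


if __name__ == "__main__":
    lam, k = 0.267, 0.041  # assumed, illustrative parameters
    m, n = 300, 50_000_000  # assumed query and database lengths
    for raw in (50, 100, 200):
        bits = bit_score(raw, lam, k)
        print(raw, round(bits, 1), e_value(bits, m, n))
```

Each additional bit halves the E-value, and doubling the database size doubles it, which is why the same alignment can be significant in a small database and insignificant in a large one.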

SLIDE 16

More on Statistics

  • Null model := random model describing sequences without intentional signal (here: a pair of random sequences without intentional similarity)
  • (single) p-value for observed score s := Prob(Score >= s) in the null model
  • (multiple) p-value := Prob(Score >= s at least once) in the null model, over all comparisons made in a search
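
How the two p-values relate can be sketched numerically. The first function assumes that the individual comparisons are independent, and the second uses the common Poisson approximation that links an E-value to the probability of at least one chance hit; both function names are illustrative.

```python
import math


def multiple_p_value(single_p, n_comparisons):
    """Prob(Score >= s at least once) in n independent comparisons
    (independence is an assumption of this sketch)."""
    return 1.0 - (1.0 - single_p) ** n_comparisons


def p_value_from_e_value(e):
    """Poisson approximation: Prob(at least one hit) = 1 - exp(-E)."""
    return -math.expm1(-e)


if __name__ == "__main__":
    print(multiple_p_value(1e-6, 1_000_000))
    print(p_value_from_e_value(0.01), p_value_from_e_value(5.0))
```

For small values the multiple p-value is close to the single p-value times the number of comparisons, and close to the E-value itself; for large E-values it saturates at 1.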

SLIDE 17

BLAT

  • BLAST-Like Alignment Tool
  • index-based
  • developed at UC Santa Cruz
  • especially for searching in whole genomes
  • very fast
  • limited to nearly exact matches
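
What "index-based" means can be sketched as follows: the genome is cut into non-overlapping k-mers that are stored with their positions, and every overlapping k-mer of the query is then looked up. This is a toy illustration in the spirit of BLAT, not its actual implementation; the function names, the tile size and the sequences are assumptions.

```python
from collections import defaultdict


def build_index(genome, k):
    """Index the non-overlapping k-mers of the genome by start position."""
    index = defaultdict(list)
    for pos in range(0, len(genome) - k + 1, k):
        index[genome[pos:pos + k]].append(pos)
    return index


def seed_hits(query, index, k):
    """Look up every overlapping k-mer of the query in the index."""
    hits = []
    for q_pos in range(len(query) - k + 1):
        for g_pos in index.get(query[q_pos:q_pos + k], []):
            hits.append((q_pos, g_pos))
    return hits


if __name__ == "__main__":
    genome = "ACGTACGGTTAGCCAT"  # toy genome (assumption)
    index = build_index(genome, 4)
    print(seed_hits("GGTTAGCC", index, 4))
```

The index is built once and reused for every query, which makes the search very fast, but a seed only exists where a whole k-mer matches exactly, which is why the approach is limited to nearly exact matches.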
SLIDE 18

UCSC Genome Browser + BLAT

SLIDE 19

CLUSTAL

SLIDE 20

What Clustal Did (“Output file”)

SLIDE 21

Clustal Results (pretty)

SLIDE 22

Clustal Results (“alignment file”)

CLUSTAL W (1.83) multiple sequence alignment


FOS_RAT         MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNTQDFCADLSVSSANF 60
FOS_MOUSE       MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNTQDFCADLSVSSANF 60
FOS_HUMAN       MMFSGFNADYEASSSRCSSASPAGDSLSYYHSPADSFSSMGSPVNAQDFCTDLAVSSANF 60
FOS_CHICK       MMYQGFAGEYEAPSSRCSSASPAGDSLTYYPSPADSFSSMGSPVNSQDFCTDLAVSSANF 60
FOS_ZEBRAFISH   MMFTSLNADCDASS-RCSTASPSGDSVGYY------------PLNQTQEFTDLSVSSASF 47
                **: .: .: :*.* ***:***:***: **            *:*  :  :**:****.*

FOS_RAT         IPTVTAISTSPDLQWLVQPTLVSSVAPSQTRAPHPYGLPTPS-TGAYARAGVVKTMSGGR 119
FOS_MOUSE       IPTVTAISTSPDLQWLVQPTLVSSVAPSQTRAPHPYGLPTQS-AGAYARAGMVKTVSGGR 119
FOS_HUMAN       IPTVTAISTSPDLQWLVQPALVSSVAPSQTRAPHPFGVPAPS-AGAYSRAGVVKTMTGGR 119
FOS_CHICK       VPTVTAISTSPDLQWLVQPTLISSVAPSQNRG-HPYGVPAPAPPAAYSRPAVLKAP-GGR 118
FOS_ZEBRAFISH   VPTVTAISSCPDLQWMVQP-MISSAAPS-------NGAAQSYNPSSYPKMRVTGAK---- 95
                :*******:.*****:*** ::**.***        * .    ..:*.:  :  :

FOS_RAT         AQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 179
FOS_MOUSE       AQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 179
FOS_HUMAN       AQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 179
FOS_CHICK       GQSIGRRGKVEQLSPEEEEKRRIRRERNKMAAAKCRNRRRELTDTLQAETDQLEEEKSAL 178
FOS_ZEBRAFISH   --TSNKRSRSEQLSPEEEEKKRVRRERSKMAAAKCRNRRRELTDTLQAETDQLEDEKSAL 153
                  : .:*.: **********:*:****.**************************:*****

FOS_RAT         QTEIANLLKEKEKLEFILAAHRPACKIPNDLGFPEE----MSVTS-LDLTGGLPEATTPE 234
FOS_MOUSE       QTEIANLLKEKEKLEFILAAHRPACKIPDDLGFPEE----MSVAS-LDLTGGLPEASTPE 234
FOS_HUMAN       QTEIANLLKEKEKLEFILAAHRPACKIPDDLGFPEE----MSVAS-LDLTGGLPEVATPE 234
FOS_CHICK       QAEIANLLKEKEKLEFILAAHRPACKMPEELRFSEE----LAAATALDLG----APSPAA 230
FOS_ZEBRAFISH   QNDIANLLKEKERLEFILAAHKPICKIPADASFPEPSSSPMSSISVPEIVTTSVVSSTPN 213
                * :*********:********:* **:* :  *.*     ::  :  ::       :..
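
The symbol line under each alignment block summarizes column conservation: `*` marks a column in which all sequences carry the same residue, while `:` and `.` mark columns restricted to groups of similar residues. The sketch below reproduces only the `*` marks; the function name is an assumption and the residue groups behind `:` and `.` are deliberately left out.

```python
def conserved_columns(rows):
    """Return a marker line with '*' where all rows share one residue
    and ' ' elsewhere (gap columns are never marked)."""
    width = len(rows[0])
    if any(len(r) != width for r in rows):
        raise ValueError("aligned rows must have equal length")
    marks = []
    for column in zip(*rows):
        first = column[0]
        same = first != "-" and all(c == first for c in column)
        marks.append("*" if same else " ")
    return "".join(marks)


if __name__ == "__main__":
    # First 15 columns of the FOS alignment (rat, mouse, human, chick, zebrafish)
    rows = [
        "MMFSGFNADYEASSS",
        "MMFSGFNADYEASSS",
        "MMFSGFNADYEASSS",
        "MMYQGFAGEYEAPSS",
        "MMFTSLNADCDASS-",
    ]
    for row in rows:
        print(row)
    print(conserved_columns(rows))
```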

SLIDE 23

Clustal Guide Tree

SLIDE 24

Clustal Guide Tree

  • Guide Tree is not a phylogenetic tree, just a computational device
  • Cladogram: edge lengths have no meaning
  • Phylogram: edge lengths correspond to distances

SLIDE 25

JalView: Alignment Editor (start from the CLUSTAL web site)

SLIDE 26

Simple JalView Window

  • Simple alignment editor (Java applet)
  • Complex alignment editor (Java application)
    – Web Start, or
    – Download installer

SLIDE 27

Starting or Installing JalView

www.jalview.org

SLIDE 28

Multiple Alignment @ BiBiServ

SLIDE 29

For Windows/Mac: QAlign2

  • URL: http://gi.cebitec.uni-bielefeld.de/QAlign/
  • Live Demo of QAlign2