
CSI5126. Algorithms in bioinformatics: Pairwise Sequence Alignment



  1. CSI5126. Algorithms in bioinformatics. Pairwise Sequence Alignment. Marcel Turcotte, School of Electrical Engineering and Computer Science (EECS), University of Ottawa. Version October 2, 2018. Sections: Preamble, Edit graph, Global, Local, Gaps.

  2. Summary. We now explore important adaptations of the pairwise sequence alignment problem that make it relevant to real-world biology problems. General objective: select the appropriate pairwise alignment algorithm for a given problem. Reading: Bernhard Haubold and Thomas Wiehe (2006). Introduction to computational biology: an evolutionary approach. Birkhäuser Basel. Pages 11-15, 30-33.

  3. Reading.
     Bernhard Haubold and Thomas Wiehe (2006). Introduction to computational biology: an evolutionary approach. Birkhäuser Basel. Pages 11-15, 30-33.
     Wing-Kin Sung (2010). Algorithms in Bioinformatics: A Practical Introduction. Chapman & Hall/CRC. QH 324.2 .S86 2010. Chapter 2.
     Dan Gusfield (1997). Algorithms on strings, trees, and sequences: computer science and computational biology. Cambridge University Press. Chapters 10 and 11.
     Pavel A. Pevzner and Phillip Compeau (2018). Bioinformatics Algorithms: An Active Learning Approach. Active Learning Publishers. http://bioinformaticsalgorithms.com Chapter 5.

  4. Edit Graph. [Figure: edit graph]

  5. Edit Distance. [Table: empty dynamic-programming table with the letters of "compliments" along one axis and the letters of "competent" along the other]

  6. Edit Distance. [Table: completed edit distance table for "compliments" and "competent"; min = 4]

  7. Edit Distance. [Table: completed edit distance table for "compliments" and "competent" with one optimal path highlighted; min = 4] The highlighted path gives the following edit transcript and alignment:
     MMMMDSSMMMD
     compliments
     comp-etent-

  8. Edit Distance. [Table: completed edit distance table for "compliments" and "competent" with a second optimal path highlighted; min = 4] The highlighted path gives the following edit transcript and alignment:
     MMMMSSDMMMD
     compliments
     compet-ent-
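The two transcripts shown on the slides can be recovered by walking back through the completed table. The following is a minimal Python sketch, not the course's own code: the function name `edit_transcript`, the unit costs, the letter I for insertions, and the tie-breaking order (diagonal first, then deletion, then insertion) are all assumptions made for illustration.

```python
def edit_transcript(source, target):
    """Return (distance, transcript) for editing `source` into `target`.

    Transcript letters: M = match, S = substitution,
    D = deletion from source, I = insertion into source.
    Unit costs for S, D and I are an assumption of this sketch.
    """
    n, m = len(source), len(target)
    # d[i][j] = edit distance between source[:i] and target[:j]
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i
    for j in range(1, m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if source[i - 1] == target[j - 1] else 1
            d[i][j] = min(d[i - 1][j - 1] + sub,  # match or substitution
                          d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1)        # insertion

    # Walk back from the last cell, following any predecessor that
    # explains the value of the current cell.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = 0 if source[i - 1] == target[j - 1] else 1
            if d[i][j] == d[i - 1][j - 1] + sub:
                ops.append("M" if sub == 0 else "S")
                i, j = i - 1, j - 1
                continue
        if i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append("D")
            i -= 1
        else:
            ops.append("I")
            j -= 1
    return d[n][m], "".join(reversed(ops))


distance, transcript = edit_transcript("compliments", "competent")
print(distance, transcript)
```

For "compliments" and "competent" the distance is 4. Which of the co-optimal transcripts is returned depends on the tie-breaking order used in the walk back, so a different order of the three checks may produce the other transcript from the slides.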

  9. Remarks. The calculation of each cell necessitates only three look-ups (the algorithm does not reconstruct the partial alignments as we did for the purpose of the example). How many operations are needed then? The order in which we visit the cells during the first pass is not important, as long as the values of the cells (i − 1, j − 1), (i − 1, j) and (i, j − 1) are known when calculating the value of the cell (i, j).
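The remark about visiting order can be checked with a small experiment. The Python sketch below fills the same unit-cost edit distance table twice, once row by row and once along anti-diagonals (cells with the same i + j), and compares the results. All function names are hypothetical and the unit costs are an assumption. Interior cells start as `None`, so reading a cell before it has been computed would raise an error rather than silently give a wrong value.

```python
def new_table(a, b):
    """Table with the first row and column initialised, other cells unset."""
    d = [[None] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    return d


def fill_cell(d, a, b, i, j):
    """Compute cell (i, j) from exactly three look-ups."""
    sub = 0 if a[i - 1] == b[j - 1] else 1
    d[i][j] = min(d[i - 1][j - 1] + sub,
                  d[i - 1][j] + 1,
                  d[i][j - 1] + 1)


def table_row_major(a, b):
    d = new_table(a, b)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            fill_cell(d, a, b, i, j)
    return d


def table_anti_diagonal(a, b):
    d = new_table(a, b)
    n, m = len(a), len(b)
    for k in range(2, n + m + 1):  # k = i + j
        for i in range(max(1, k - m), min(n, k - 1) + 1):
            fill_cell(d, a, b, i, k - i)
    return d


rows = table_row_major("compliments", "competent")
diagonals = table_anti_diagonal("compliments", "competent")
print(rows == diagonals, rows[-1][-1])
```

Each of the n × m interior cells is computed once with a constant amount of work, so the number of operations grows with the product of the two string lengths.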

  10. Sequence alignment. [Table: empty alignment table with − A G C along the columns and − A A A C along the rows]

  11. Sequence alignment.

           −    A    G    C
      −    0   −1   −2   −3
      A   −1    1    0   −1
      A   −2    0    0   −1
      A   −3   −1   −1   −1
      C   −4   −2   −2    0

      ⇒ How many optimal alignments are there?
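The table for AAAC and AGC can be reproduced with a short global alignment sketch in Python. The slide does not state the scoring scheme. The values match = +1, mismatch = −1 and gap = −1 used below are an assumption, chosen because they are consistent with every cell of the table. The function names are hypothetical. The second function counts the optimal alignments by summing, for each cell, the counts of every predecessor that attains the cell's score.

```python
def global_alignment_table(s, t, match=1, mismatch=-1, gap=-1):
    """Score table f where f[i][j] is the best score for s[:i] versus t[:j]."""
    n, m = len(s), len(t)
    f = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        f[i][0] = i * gap
    for j in range(1, m + 1):
        f[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if s[i - 1] == t[j - 1] else mismatch
            f[i][j] = max(f[i - 1][j - 1] + pair,
                          f[i - 1][j] + gap,
                          f[i][j - 1] + gap)
    return f


def count_optimal_alignments(s, t, match=1, mismatch=-1, gap=-1):
    """Number of distinct paths through the table that reach the best score."""
    f = global_alignment_table(s, t, match, mismatch, gap)
    n, m = len(s), len(t)
    c = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        c[i][0] = 1  # only gaps: a single path along the border
    for j in range(m + 1):
        c[0][j] = 1
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if s[i - 1] == t[j - 1] else mismatch
            total = 0
            if f[i][j] == f[i - 1][j - 1] + pair:
                total += c[i - 1][j - 1]
            if f[i][j] == f[i - 1][j] + gap:
                total += c[i - 1][j]
            if f[i][j] == f[i][j - 1] + gap:
                total += c[i][j - 1]
            c[i][j] = total
    return c[n][m]


for row in global_alignment_table("AAAC", "AGC"):
    print(row)
print(count_optimal_alignments("AAAC", "AGC"))
```

Under these assumed scores the sketch reproduces the table above and reports 3 optimal alignments, each with score 0. A different scoring scheme could change both the table and the count.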

  12. Weighted Edit Operations. A first generalisation of the edit distance problem consists of associating weights to the edit operations: for instance, the cost of an insertion/deletion could be 1, the cost of a mismatch could be 2, and the cost of a match 0 (useful weights will be derived in the next lecture). The same algorithm can be used, only this time it finds the edit transcript/alignment which has the minimum overall cost. The terms weight and cost are used interchangeably in the C.S. literature, whilst score is most frequently used in the biological literature.
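As a minimal sketch of this generalisation, the Python function below takes the operation costs as parameters, with the example values from the slide (insertion/deletion 1, mismatch 2, match 0) as defaults. The function and parameter names are hypothetical. The recurrence is the same three-way minimum as for the unit-cost edit distance.

```python
def weighted_edit_distance(a, b, indel=1, mismatch=2, match=0):
    """Minimum total cost of an edit transcript turning `a` into `b`."""
    n, m = len(a), len(b)
    # d[i][j] = minimum cost of editing a[:i] into b[:j]
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * indel
    for j in range(1, m + 1):
        d[0][j] = j * indel
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            d[i][j] = min(d[i - 1][j - 1] + pair,  # match or mismatch
                          d[i - 1][j] + indel,     # deletion
                          d[i][j - 1] + indel)     # insertion
    return d[n][m]


# Example weights from the slide.
print(weighted_edit_distance("compliments", "competent"))
# Unit costs recover the ordinary edit distance.
print(weighted_edit_distance("compliments", "competent", mismatch=1))
```

With the slide's example weights the minimum cost for "compliments" and "competent" is 6, while unit costs give back the edit distance of 4. With a mismatch costing as much as a deletion plus an insertion, an optimal transcript is free to replace every substitution by a deletion and an insertion.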

