Structure-Based Comparison of Biomolecules

Benedikt Christoph Wolters
Seminar Bioinformatics Algorithms, RWTH Aachen, 07/17/2015

Outline: (1) Introduction and Motivation (Protein Structure Hierarchy, Protein Data Bases); (2) Arc-Annotated Sequences


  1. Arc-Preserving Common Subsequences
     Definition (Arc-Preserving Common Subsequence): Let $S = (s_1 s_2 \dots s_m, P_s)$ and $T = (t_1 t_2 \dots t_n, P_t)$ be two arc-annotated sequences over an alphabet $\Sigma$. A string $w$ is called an arc-preserving common subsequence of $S$ and $T$ if $w$ is a common subsequence of $s$ and $t$ and there exists a mapping $\varphi$ consistent with $w$ such that
     (1) $s_i = t_j$ for all $\langle i, j \rangle \in \varphi$, and
     (2) for all pairs of elements $(\langle i_1, j_1 \rangle, \langle i_2, j_2 \rangle)$ from $\varphi$: $(i_1, i_2) \in P_s \iff (j_1, j_2) \in P_t$.

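  2. Example
     $\Sigma = \{A, G, U, C\}$
     $\varphi = \{\langle 1, 4 \rangle, \langle 5, 5 \rangle, \langle 6, 6 \rangle, \langle 9, 8 \rangle, \langle 10, 9 \rangle, \langle 11, 11 \rangle, \langle 12, 12 \rangle\}$
     S: – – – A U C D A G C G A U – C G
     T: G U A A – – – A G A – A U G C G
     (Figure: the arcs of S and T are drawn in the original slide and are not reproduced here.)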
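The two conditions of the definition can be checked mechanically. The following is a minimal sketch, not taken from the slides: the function name `is_arc_preserving`, the 1-based pair representation, and the arc sets used in the calls are assumptions for illustration (the arcs of the example figure are not recoverable). "Consistent with w" is interpreted here as a mapping that is strictly increasing in both coordinates.

```python
from itertools import combinations


def is_arc_preserving(s, arcs_s, t, arcs_t, phi):
    """Check whether phi (a list of 1-based pairs (i, j)) is an
    arc-preserving mapping between (s, arcs_s) and (t, arcs_t).
    Arcs are stored as pairs (left, right) with left < right."""
    pairs = sorted(phi)
    # Assumed meaning of "consistent": strictly increasing in both strings.
    for (i1, j1), (i2, j2) in zip(pairs, pairs[1:]):
        if not (i1 < i2 and j1 < j2):
            return False
    # Condition (1): matched symbols are equal.
    if any(s[i - 1] != t[j - 1] for i, j in pairs):
        return False
    # Condition (2): an arc in S exists iff the corresponding arc in T exists.
    for (i1, j1), (i2, j2) in combinations(pairs, 2):
        if ((i1, i2) in arcs_s) != ((j1, j2) in arcs_t):
            return False
    return True


# Strings and mapping read off the example slide (gaps removed).
s = "AUCDAGCGAUCG"
t = "GUAAAGAAUGCG"
phi = [(1, 4), (5, 5), (6, 6), (9, 8), (10, 9), (11, 11), (12, 12)]

# Hypothetical arcs: (1, 5) in S corresponds to (4, 5) in T under phi.
print(is_arc_preserving(s, {(1, 5)}, t, {(4, 5)}, phi))
print(is_arc_preserving(s, {(1, 5)}, t, set(), phi))
```

With the hypothetical arc present in both strings the mapping passes; with the arc only in S, condition (2) is violated for the pairs (1, 4) and (5, 5).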

  3. Longest Arc-Preserving Common Subsequence (LAPCS)
     Definition (LAPCS(LEVEL1, LEVEL2)): By LAPCS(LEVEL1, LEVEL2) we denote the optimization problem of finding, for two arc-annotated strings $S \in$ LEVEL1 and $T \in$ LEVEL2, a longest arc-preserving common subsequence.

  4. LAPCS(PLAIN, PLAIN)
     Theorem: The optimization problem LAPCS(PLAIN, PLAIN) is computable in $O(m \cdot n)$, where $m$ and $n$ are the lengths of the input strings.
     Proof. Without arcs, this is the longest common subsequence problem, a special case of the global alignment problem discussed in a previous talk. It can be solved with dynamic programming and backtracking.
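The dynamic program mentioned in the proof can be sketched as follows. This is the textbook longest-common-subsequence table with backtracking, written here as an illustration; the function name `lcs_with_mapping` and the sample strings are assumptions, not part of the talk. The backtracking step also returns a mapping of 1-based position pairs, which is the form used for $\varphi$ in the later slides.

```python
def lcs_with_mapping(s, t):
    """Return (w, phi): a longest common subsequence w of s and t and a
    mapping phi of 1-based position pairs (i, j) that realizes it."""
    m, n = len(s), len(t)
    # dp[i][j] = length of an LCS of s[:i] and t[:j]; O(m * n) time and space.
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    # Backtrack from the bottom-right corner to recover one optimal mapping.
    phi = []
    i, j = m, n
    while i > 0 and j > 0:
        if s[i - 1] == t[j - 1]:
            phi.append((i, j))
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    phi.reverse()
    w = "".join(s[i - 1] for i, _ in phi)
    return w, phi


w, phi = lcs_with_mapping("ACGGU", "AGCGU")
print(len(w), phi)
```

Several optimal mappings may exist; which one is returned depends on the tie-breaking rule in the backtracking loop.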

  5. NP-Hardness of LAPCS(CROSSING, CROSSING)
     Theorem: LAPCS(CROSSING, CROSSING) is an NP-hard optimization problem.
     Idea: Consider DECLAPCS, the decision problem corresponding to LAPCS, and reduce an input instance of CLIQUE to an instance of DECLAPCS.

  6. Recap: CLIQUE Problem
     Definition: Let $G = (V, E)$ be an undirected graph. A subset $V' \subseteq V$ is called a clique if every two vertices $v_i, v_j \in V'$ with $v_i \neq v_j$ are connected by an edge, i.e., $\{v_i, v_j\} \in E$.
     Definition (CLIQUE Decision Problem): Input: an undirected graph $G = (V, E)$ and a positive integer $k$. Output: YES if $G$ contains a clique $V'$ of size $k$, NO otherwise.
     CLIQUE is a well-known NP-complete decision problem.

  7. Example: CLIQUE
     Is there a clique for $k = 3$? (Figure: an undirected graph on the vertices $v_1, \dots, v_5$; the vertices $v_1$, $v_2$ and $v_3$ are highlighted as a clique of size 3.)
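For small graphs, the CLIQUE decision problem can be answered by trying all vertex subsets of size $k$. The sketch below is a brute-force illustration only (exponential in general, as expected for an NP-complete problem); the function name `has_clique` and the edge list are assumptions, since the edges of the slide's figure are not recoverable from the transcript.

```python
from itertools import combinations


def has_clique(vertices, edges, k):
    """Return True if the undirected graph (vertices, edges) contains a
    clique of size k. Brute force over all k-subsets of the vertices."""
    edge_set = {frozenset(e) for e in edges}
    for candidate in combinations(vertices, k):
        # A candidate is a clique if every pair of its vertices is an edge.
        if all(frozenset(pair) in edge_set
               for pair in combinations(candidate, 2)):
            return True
    return False


# Hypothetical graph on v1..v5 in which v1, v2, v3 form a triangle.
vertices = [1, 2, 3, 4, 5]
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]
print(has_clique(vertices, edges, 3))
print(has_clique(vertices, edges, 4))
```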

  9. Arc-Annotated String Construction from Input Graph
     For the example graph on $v_1, \dots, v_5$, the string S consists of one block b a a a a a b per vertex:
     S: baaaaab baaaaab baaaaab baaaaab baaaaab
     (blocks for $v_1$, $v_2$, $v_3$, $v_4$, $v_5$, from left to right)

  26. Reduction Construction, Formally
     Definition: An undirected graph $G = (V, E)$ with $|V| = n$ can be encoded as an arc-annotated string $S = (s, P_s)$ with
     $s = (b\,a^n\,b)^n$
     $P_s = \{((i-1)(n+2)+j+1,\ (j-1)(n+2)+i+1) \mid \{v_i, v_j\} \in E\} \cup \{((i-1)(n+2)+1,\ i(n+2)) \mid i \in \{1, \dots, n\}\}$
     The first set contains the arcs encoding edges; the second set contains the arcs between the two b's of each block.
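The encoding can be written down directly from the two sets in the definition. The following sketch assumes vertices are numbered 1 to n, positions in s are 1-based, and each arc is stored as a pair (left, right); the function name `encode_graph` is hypothetical.

```python
def encode_graph(n, edges):
    """Encode an undirected graph on vertices 1..n as an arc-annotated
    string (s, arcs) following s = (b a^n b)^n."""
    s = ("b" + "a" * n + "b") * n
    arcs = set()
    for u, v in edges:
        i, j = min(u, v), max(u, v)
        # Edge {v_i, v_j}: connect the j-th 'a' of block i
        # with the i-th 'a' of block j.
        left = (i - 1) * (n + 2) + j + 1
        right = (j - 1) * (n + 2) + i + 1
        arcs.add((left, right))
    for i in range(1, n + 1):
        # Arc between the two b's that frame block i.
        arcs.add(((i - 1) * (n + 2) + 1, i * (n + 2)))
    return s, arcs


# Complete graph on three vertices (a clique of size 3).
s, arcs = encode_graph(3, [(1, 2), (1, 3), (2, 3)])
print(s, sorted(arcs))
```

For the complete graph on k = 3 vertices, the resulting string has length k(k + 2) = 15, and no position is an endpoint of more than one arc, which is the property required of the CROSSING level.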

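  27. Analogous: Construction of the Clique
     For the clique on $v_1, v_2, v_3$ ($k = 3$), the string T is built in the same way, with one block b a a a b per clique vertex:
     T: baaab baaab baaab
     (blocks for $v_{i_1}$, $v_{i_2}$, $v_{i_3}$, from left to right)
     Note that $|T| = k \cdot (k + 2)$, where $k$ is the size of the clique.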

  28. Input for DECLAPCS(CROSSING, CROSSING)
     Is there an arc-preserving common subsequence of size $|T|$?
     S: baaaaab baaaaab baaaaab baaaaab baaaaab
     T: baaab baaab baaab

  29. Proof (I): Polynomial-Time Reduction
     Lemma: The input $(S, T, |T|)$ to DECLAPCS(CROSSING, CROSSING) can be constructed from $(G, k)$ in polynomial time.
     • $S$ can be constructed directly from $G$; its length $n(n+2)$ is quadratic in the number of vertices $n$.
     • A fully connected graph $G_T$ of size $k$ can be constructed in polynomial time.
     • Analogously to $S$, $T$ and $|T|$ can then be constructed in polynomial time from $G_T$.

  30. Proof (II): Correctness ($\Rightarrow$)
     Lemma: The existence of a clique of size $k$ in $G$ implies the existence of an arc-preserving common subsequence of $S$ and $T$ of size $|T|$.
     • Let $\{v_{i_1}, \dots, v_{i_k}\}$ be a clique of size $k$ in the input graph.
     • We can align the $k$ blocks of $T$ to the blocks $i_1, \dots, i_k$ of $S$.
     • Within each matched block, the $k$ a's of $T$ are matched to the a's at positions $i_1, \dots, i_k$ of the block of $S$.
     • Arcs between two b's are matched, since complete blocks are always mapped to complete blocks.
     • Since $v_{i_1}, \dots, v_{i_k}$ are the vertices of a clique, the matched a's in $S$ are pairwise connected by arcs, exactly like the a's in $T$.

  31. Proof (III): Correctness ($\Leftarrow$)
     Lemma: The existence of an arc-preserving common subsequence of $S$ and $T$ of size $|T|$ implies a clique of size $k$ in $G$.
     • $|T| = k \cdot (k + 2)$, so every symbol of $T$ must be matched.
     • Due to the arcs over the b's framing each block, only complete blocks can be mapped to blocks.
     • $T$ represents a clique of size $k$, and its blocks are constructed in the same way as those of $S$.
     • Thus, if the blocks of $T$ are matched to the blocks $i_1, \dots, i_k$ of $S$, then $\{v_{i_1}, \dots, v_{i_k}\}$ is a clique of size $k$.

  32. NP-Hardness of LAPCS(NESTED, NESTED)
     Theorem: LAPCS(NESTED, NESTED) is an NP-hard optimization problem.
     • The proof [Lin et al., 2002] is not presented here because it requires many preliminaries.
     • Idea: Reduction from a variant of Maximum Independent Set (on cubic planar graphs), using several graph transformations with book embedding.

  33. Complexity Results Overview for LAPCS
     Table: Complexity results for LAPCS(LEVEL1, LEVEL2)
                 PLAIN      CHAIN      NESTED     CROSSING   UNLIMITED
     UNLIMITED   NP-hard    NP-hard    NP-hard    NP-hard    NP-hard
     CROSSING    NP-hard    NP-hard    NP-hard    NP-hard
     NESTED      O(nm^3)    O(nm^3)    NP-hard
     CHAIN       O(nm)      O(nm)
     PLAIN       O(nm)
     Because of these hardness results, approximation algorithms for LAPCS are considered.

  34. 2-Approximation Algorithm for LAPCS(CROSSING, CROSSING)
     Idea: Use a longest common subsequence without arcs as a starting point and successively remove the arc-conflicting parts.
     Input: Two arc-annotated strings $S = (s, P_s)$ and $T = (t, P_t)$ with $S, T \in$ CROSSING.
     (1) Determine a longest common subsequence $w$ of $s$ and $t$. Let $\varphi$ be a mapping consistent with $w$.
     (2) Construct the conflict graph $G_\varphi$ from $\varphi$.
     (3) For each connected component of $G_\varphi$, delete every second vertex.
     (4) From the resulting graph $G'_\varphi$, construct the output string $w'$.

  35. Construction of the Conflict Graph
     Definition (Conflict Graph): Given a mapping $\varphi$ that is consistent with the longest common subsequence $w$ of $s$ and $t$, define $G_\varphi = (V, E)$ with
     • $V = \{\langle i, j \rangle \mid \langle i, j \rangle \in \varphi\}$
     • $E = \{\{\langle i_1, j_1 \rangle, \langle i_2, j_2 \rangle\} \mid$ either $(i_1, i_2) \in P_s$ or $(j_1, j_2) \in P_t$, but not both$\}$
     Note: The edges of $G_\varphi$ describe the position pairs that are not arc-preserving.
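The definition translates into a pairwise check over the mapping. The sketch below reads "either ... or" as an exclusive or, which matches the arc-preserving condition; the function name `conflict_graph` and the sample arcs are assumptions for illustration. Arcs are stored as pairs (left, right), and $\varphi$ is assumed to be increasing in both coordinates.

```python
from itertools import combinations


def conflict_graph(phi, arcs_s, arcs_t):
    """Build the conflict graph of a mapping phi (1-based pairs (i, j)).
    Two vertices are joined when exactly one of the corresponding arcs
    exists, i.e. when the pair of positions is not arc-preserving."""
    vertices = sorted(phi)
    edges = []
    for (i1, j1), (i2, j2) in combinations(vertices, 2):
        in_s = (i1, i2) in arcs_s
        in_t = (j1, j2) in arcs_t
        if in_s != in_t:  # exclusive or: arc in one string only
            edges.append(((i1, j1), (i2, j2)))
    return vertices, edges


# Hypothetical input: the arc (1, 4) is present in both strings,
# the arc (2, 3) only in T, so only the second pair is in conflict.
phi = [(1, 1), (2, 2), (3, 3), (4, 4)]
vertices, edges = conflict_graph(phi, {(1, 4)}, {(1, 4), (2, 3)})
print(edges)
```

This runs in time quadratic in the size of the mapping; iterating over the arcs instead would be faster, at the cost of a slightly longer sketch.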

  36. Conflict Graph: Example
     $\varphi = \{\langle 1, 1 \rangle, \langle 3, 2 \rangle, \langle 4, 3 \rangle, \langle 6, 5 \rangle, \langle 7, 6 \rangle, \langle 8, 7 \rangle, \langle 9, 9 \rangle, \langle 10, 10 \rangle, \langle 11, 11 \rangle, \langle 12, 12 \rangle, \langle 13, 13 \rangle, \langle 14, 14 \rangle, \langle 15, 15 \rangle, \langle 16, 16 \rangle, \langle 17, 18 \rangle, \langle 18, 19 \rangle\}$
     S: A A C G G U A C – G U A C G U A C – G U
     T: A – C G U U A C G G U A C G U A C C G U
     $G_\varphi$ has one vertex for each pair in $\varphi$. (Figure: the arcs of S and T and the edges of $G_\varphi$ are drawn in the original slide and are not reproduced here.)

  60. Conflict-Graph Observation
     (Figure: a vertex $\langle i, j \rangle$ of $G_\varphi$ together with the positions A, B, C in S and T.)
     Lemma: For two arc-annotated strings $S, T \in$ CROSSING, every node of $G_\varphi$ has degree at most two.
     Proof.
     • Since $S, T \in$ CROSSING, no two arcs share a common start or end point.
     • Incoming edge: w.l.o.g. there is at most one arc mismatch that causes an incoming edge.
     • Outgoing edge: analogous.

  61. Approximation Algorithm: Step 3
     For each connected component in $G_\varphi$, delete every second vertex.
     In the example, the vertices $\langle 3, 2 \rangle$, $\langle 6, 5 \rangle$, $\langle 11, 11 \rangle$ and $\langle 18, 19 \rangle$ are deleted. The remaining graph is
     $G'_\varphi$: $\langle 1, 1 \rangle, \langle 4, 3 \rangle, \langle 7, 6 \rangle, \langle 8, 7 \rangle, \langle 9, 9 \rangle, \langle 10, 10 \rangle, \langle 12, 12 \rangle, \langle 13, 13 \rangle, \langle 14, 14 \rangle, \langle 15, 15 \rangle, \langle 16, 16 \rangle, \langle 17, 18 \rangle$
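Because every vertex of the conflict graph has degree at most two, each connected component is a path or a cycle, so step 3 can be carried out by walking each component and keeping only every other vertex. The sketch below is one possible way to do this; the function name `delete_every_second_vertex` is hypothetical, and the edge list is an assumption chosen to be consistent with the deletions shown in the example, since the actual edges of the slide figure are not recoverable.

```python
def delete_every_second_vertex(vertices, edges):
    """Keep every other vertex of each path or cycle component of a graph
    with maximum degree two; returns the sorted list of kept vertices."""
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = set()
    kept = []
    # Visit isolated vertices and path endpoints first, so that every
    # path is walked from one end to the other; cycles come last.
    for start in sorted(vertices, key=lambda v: len(adj[v])):
        if start in seen:
            continue
        prev, cur, index = None, start, 0
        while cur is not None:
            seen.add(cur)
            if index % 2 == 0:  # keep 1st, 3rd, ...; drop 2nd, 4th, ...
                kept.append(cur)
            index += 1
            nxt = [w for w in adj[cur] if w != prev and w not in seen]
            prev, cur = cur, (nxt[0] if nxt else None)
    return sorted(kept)


phi = [(1, 1), (3, 2), (4, 3), (6, 5), (7, 6), (8, 7), (9, 9), (10, 10),
       (11, 11), (12, 12), (13, 13), (14, 14), (15, 15), (16, 16),
       (17, 18), (18, 19)]
# Assumed conflict edges (hypothetical, see the note above).
edges = [((1, 1), (3, 2)), ((3, 2), (4, 3)), ((4, 3), (6, 5)),
         ((7, 6), (11, 11)), ((17, 18), (18, 19))]
print(delete_every_second_vertex(phi, edges))
```

With these assumed edges, the kept vertices coincide with the vertex set of $G'_\varphi$ listed on the slide. At least half of the vertices of every path or even cycle survive, which is where the factor 2 in the approximation guarantee comes from.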

  69. Approximation Algorithm: Final Step
     Reconstruct the corresponding arc-preserving common subsequence $w'$ from $G'_\varphi$.
     $G'_\varphi$: $\langle 1, 1 \rangle, \langle 4, 3 \rangle, \langle 7, 6 \rangle, \langle 8, 7 \rangle, \langle 9, 9 \rangle, \langle 10, 10 \rangle, \langle 12, 12 \rangle, \langle 13, 13 \rangle, \langle 14, 14 \rangle, \langle 15, 15 \rangle, \langle 16, 16 \rangle, \langle 17, 18 \rangle$
     S: A A C G G U A C – G U A C G U A C – G U
     T: A – C G U U A C G G U A C G U A C C G U
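The final step only reads off the symbols at the remaining positions. As a small worked sketch (the function name `reconstruct` is hypothetical; the strings are the example rows with gaps removed), the output can be computed from either string, which also serves as a consistency check:

```python
def reconstruct(s, t, kept):
    """Read the output string w' off the remaining vertices (1-based
    pairs) and check that S and T agree on every kept position."""
    pairs = sorted(kept)
    from_s = "".join(s[i - 1] for i, _ in pairs)
    from_t = "".join(t[j - 1] for _, j in pairs)
    if from_s != from_t:
        raise ValueError("mapping does not match equal symbols")
    return from_s


s = "AACGGUACGUACGUACGU"   # S without gap symbols
t = "ACGUUACGGUACGUACCGU"  # T without gap symbols
kept = [(1, 1), (4, 3), (7, 6), (8, 7), (9, 9), (10, 10),
        (12, 12), (13, 13), (14, 14), (15, 15), (16, 16), (17, 18)]
print(reconstruct(s, t, kept))  # prints AGACGUCGUACG
```

The result has length 12, compared with 16 for the arc-free longest common subsequence the algorithm started from.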

  71. Correctness Proof (I)
     Theorem: The approximation algorithm computes a feasible solution for LAPCS(CROSSING, CROSSING).
     Proof.
     • The string $w'$ results from removing some symbols from $w$ and is thus still a common subsequence.
     • Moreover, $w'$ is arc-preserving:
       • Connected vertices in the conflict graph $G_\varphi$ denote violating position pairs.
       • The algorithm removes all edges from the conflict graph.
