SLIDE 1

Dynamic String Alignment

Panagiotis Charalampopoulos (1, 2), Tomasz Kociumaka (3), and Shay Mozes (4)

  • (1) King’s College London, United Kingdom
  • (2) University of Warsaw, Poland
  • (3) Bar-Ilan University, Israel
  • (4) The Interdisciplinary Center Herzliya, Israel

CPM 2020

June 17-19, 2020

  • P. Charalampopoulos, T. Kociumaka, S. Mozes

Dynamic String Alignment

SLIDE 9

Problem Definition

String Alignment

Input: two strings of total length n and weights
  • w_match: for aligning a pair of matching letters,
  • w_mis: for aligning a pair of mismatching letters,
  • w_gap: for letters that are not aligned.

Goal: compute an alignment with maximum weight.

        1 2 3 4 5 6 7 8 9 10
    S = a − b a a b a c b b
        |   | | |     | · |
    T = a c b a a − − c d b

Alignment’s weight: 6 w_match + w_mis + 3 w_gap.

Generalizes the Longest Common Subsequence (LCS) problem and the Edit Distance problem.
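
To make the weight formula concrete, here is a minimal Python sketch that scores an alignment written as two gapped rows. The function name `alignment_weight`, the ASCII `-` used as the gap symbol, and the constants `ROW_S` and `ROW_T` are illustrative assumptions, not something taken from the talk.

```python
def alignment_weight(row_s, row_t, w_match, w_mis, w_gap, gap="-"):
    """Score an alignment given as two equal-length rows with gap symbols."""
    if len(row_s) != len(row_t):
        raise ValueError("both rows of an alignment must have the same length")
    total = 0
    for a, b in zip(row_s, row_t):
        if a == gap and b == gap:
            raise ValueError("a column cannot consist of two gaps")
        if a == gap or b == gap:
            total += w_gap      # a letter that is not aligned
        elif a == b:
            total += w_match    # a pair of matching letters
        else:
            total += w_mis      # a pair of mismatching letters
    return total


# The alignment shown on the slide, with '-' standing for a gap.
ROW_S = "a-baabacbb"
ROW_T = "acbaa--cdb"
```

Calling the function with weights (1, 0, 0), (0, 1, 0), and (0, 0, 1) counts the matching pairs, mismatching pairs, and unaligned letters separately, which is one way to check the coefficients 6, 1, and 3 in the formula above.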

SLIDE 13

Related Work

There is a textbook O(n^2)-time dynamic programming algorithm.
  • [Vintsyuk; Cybernetics 1968]
  • [Needleman-Wunsch; Journal of Molecular Biology 1970]
  • [Wagner-Fischer; Journal of the ACM 1974]

Several works improved the complexity by polylogarithmic factors.
  • [Masek-Paterson; Journal of Computer and System Sciences 1980]
  • [Crochemore-Landau-Ziv-Ukelson; SIAM Journal on Computing 2003]
  • [Grabowski; Discrete Applied Mathematics 2016]

A strongly subquadratic-time algorithm would refute the Strong Exponential Time Hypothesis (SETH).
  • [Backurs-Indyk; SIAM Journal on Computing 2018]
  • [Bringmann-Künnemann; FOCS 2015]
  • [Abboud-Hansen-Vassilevska Williams-Williams; STOC 2016]
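
The textbook quadratic-time dynamic program mentioned at the top of this slide can be sketched as follows. This is a generic formulation written for illustration; the function name `max_alignment_weight` and the table layout are assumptions and are not taken from any of the cited papers.

```python
def max_alignment_weight(s, t, w_match, w_mis, w_gap):
    """Quadratic-time DP for the maximum alignment weight of s and t."""
    m, n = len(s), len(t)
    # dp[i][j] = maximum weight of an alignment of s[:i] and t[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = i * w_gap            # s[:i] against the empty string
    for j in range(1, n + 1):
        dp[0][j] = j * w_gap            # the empty string against t[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            pair = w_match if s[i - 1] == t[j - 1] else w_mis
            dp[i][j] = max(dp[i - 1][j - 1] + pair,   # align s[i-1] with t[j-1]
                           dp[i - 1][j] + w_gap,      # leave s[i-1] unaligned
                           dp[i][j - 1] + w_gap)      # leave t[j-1] unaligned
    return dp[m][n]
```

With weights (1, 0, 0) the result is the LCS length, and with weights (0, −1, −1) it is the negated unit-cost edit distance, which is how the two special cases fit into the weighted formulation.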

SLIDE 15

Related Work

The DP algorithm is online: it can handle appending a letter to either of the strings in O(n) time. It can also handle deleting the last letter of either of the strings.

Several works considered prepending letters or deleting the first letter in either of the strings, culminating in an O(n)-time algorithm.
  • [Landau-Myers-Schmidt; SIAM Journal on Computing 1998]
  • [Kim-Park; Journal of Discrete Algorithms 2004]
  • [Ishida-Inenaga-Shinohara-Takeda; FCT 2005]
  • [Tiskin; arXiv 2007]
  • [Hyyrö-Narisawa-Inenaga; JDA 2015]
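
The online behaviour of the plain DP (appending or deleting at the end of either string) can be sketched by keeping the whole table and adding or dropping one row or column per operation. The class name `OnlineAlignment` and its method names are hypothetical; the sketch does not cover the prepend and delete-first operations studied in the cited works.

```python
class OnlineAlignment:
    """Maintain the DP table of two strings under appends and end deletions."""

    def __init__(self, w_match, w_mis, w_gap):
        self.w_match, self.w_mis, self.w_gap = w_match, w_mis, w_gap
        self.s, self.t = [], []
        self.dp = [[0]]  # dp[i][j]: best weight for s[:i] versus t[:j]

    def _pair(self, a, b):
        return self.w_match if a == b else self.w_mis

    def append_to_s(self, letter):
        """Add one row to the table in O(|t|) time."""
        prev = self.dp[-1]
        row = [prev[0] + self.w_gap]
        for j, b in enumerate(self.t, start=1):
            row.append(max(prev[j - 1] + self._pair(letter, b),
                           prev[j] + self.w_gap,
                           row[j - 1] + self.w_gap))
        self.s.append(letter)
        self.dp.append(row)

    def append_to_t(self, letter):
        """Add one column to the table in O(|s|) time."""
        self.dp[0].append(self.dp[0][-1] + self.w_gap)
        for i, a in enumerate(self.s, start=1):
            row, prev = self.dp[i], self.dp[i - 1]
            # prev already contains the new column; row[-1] is the old last column.
            row.append(max(prev[-2] + self._pair(a, letter),
                           prev[-1] + self.w_gap,
                           row[-1] + self.w_gap))
        self.t.append(letter)

    def pop_from_s(self):
        """Delete the last letter of s by dropping the last row."""
        self.s.pop()
        self.dp.pop()

    def pop_from_t(self):
        """Delete the last letter of t by dropping the last column."""
        self.t.pop()
        for row in self.dp:
            row.pop()

    def weight(self):
        return self.dp[-1][-1]
```

Keeping the full table costs quadratic space; that choice is made here only so that end deletions are trivial to express.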
SLIDE 22

Our Results

We consider a dynamic setting, where letter updates (insertions, deletions, substitutions) are allowed anywhere in the strings.

[Hyyrö-Narisawa-Inenaga; JDA 2015]: practical, O(n^2) worst-case time.

The lower bound for the static version of the problem means that we cannot hope for O(n^{1−ε}) update time for any constant ε > 0 (otherwise, building the strings by n insertions would solve the static problem in O(n^{2−ε}) time).

  • Integer alignment weights ≤ w: update time Õ(nw). Based on Tiskin’s algorithm for efficient distance multiplication of simple unit-Monge matrices.
  • Alignment weights of size n^{O(1)}: update time Õ(n√n). Based on black-boxes from planar graphs.

SLIDE 28

Preliminaries

[Figure: the alignment graph of two example strings, with edge weights 1, 1, 1, 2 and endpoints u and v; next to it, its distance matrix with four entries labeled a, b, c, d.]

w_match = 1, w_mis = 2, and w_gap = 1.

LCS(S, T) = |S| + |T| − d(u, v).

A matrix M is Monge if M[i, j] + M[i′, j′] ≤ M[i′, j] + M[i, j′] for all i < i′ and j < j′.

This distance matrix is Monge, and, in fact, unit-Monge.
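
The identity LCS(S, T) = |S| + |T| − d(u, v) can be checked numerically. The sketch below assumes that u and v are the top-left and bottom-right corners of the grid, that horizontal and vertical (gap) edges cost 1, and that diagonal edges cost 1 for matching letters and 2 for mismatching letters; the function names are illustrative.

```python
def grid_distance(s, t):
    """Shortest-path length from the top-left corner u to the bottom-right corner v."""
    m, n = len(s), len(t)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        for j in range(n + 1):
            if i == 0 and j == 0:
                continue
            best = float("inf")
            if i > 0:
                best = min(best, d[i - 1][j] + 1)          # gap edge
            if j > 0:
                best = min(best, d[i][j - 1] + 1)          # gap edge
            if i > 0 and j > 0:
                diag = 1 if s[i - 1] == t[j - 1] else 2    # match or mismatch edge
                best = min(best, d[i - 1][j - 1] + diag)
            d[i][j] = best
    return d[m][n]


def lcs_length(s, t):
    """Plain LCS dynamic program, used only to cross-check the identity."""
    m, n = len(s), len(t)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]
```

Intuitively, every path uses |S| + |T| units of gap cost unless it takes a matching diagonal, and each matching diagonal saves exactly one unit.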

SLIDE 31

Unit-Monge Matrices

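[Figure: a matrix M and the matrix M′ derived from it. Each entry of M′ is computed from four adjacent entries a, b, c, d of M as b + c − a − d.]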

SLIDE 33

Unit-Monge Matrices

[Figure: a matrix M and the derived matrix M′.]

M is unit-Monge if and only if M′ is a permutation matrix.

We can represent an n × n unit-Monge matrix M in Õ(n) space so that each entry can be retrieved in Õ(1) time.
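
The correspondence between M and M′ can be illustrated with a small Python sketch. It assumes one common convention, which the transcript does not spell out: M is an (n+1) × (n+1) matrix of dominance counts of a permutation, and M′[i][j] = M[i+1][j] + M[i][j+1] − M[i][j] − M[i+1][j+1]. All function names are hypothetical.

```python
def distribution_matrix(perm):
    """M[i][j] = number of points (r, perm[r]) with r >= i and perm[r] < j."""
    n = len(perm)
    M = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(n + 1):
            M[i][j] = M[i + 1][j] + (1 if perm[i] < j else 0)
    return M


def density_matrix(M):
    """M'[i][j] = M[i+1][j] + M[i][j+1] - M[i][j] - M[i+1][j+1]."""
    n = len(M) - 1
    return [[M[i + 1][j] + M[i][j + 1] - M[i][j] - M[i + 1][j + 1]
             for j in range(n)] for i in range(n)]


def is_monge(M):
    """Check the Monge inequality on adjacent rows and columns, which suffices."""
    return all(M[i][j] + M[i + 1][j + 1] <= M[i + 1][j] + M[i][j + 1]
               for i in range(len(M) - 1) for j in range(len(M[0]) - 1))


def is_permutation_matrix(P):
    """Every entry is 0 or 1, with exactly one 1 per row and per column."""
    return (all(x in (0, 1) for row in P for x in row)
            and all(sum(row) == 1 for row in P)
            and all(sum(col) == 1 for col in zip(*P)))
```

Under this convention the permutation alone (O(n) numbers) determines M, which is the idea behind the compact representation; a range-counting structure over the n points would then answer single-entry queries quickly, but that part is not shown here.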

SLIDE 37

Distance Product of Unit-Monge Matrices

Distance Product
The (min, +) product, or distance product, of an m × k matrix A and a k × n matrix B, denoted by A ⊙ B, is an m × n matrix C such that

    C[i, j] = min_{1 ≤ r ≤ k} { A[i, r] + B[r, j] }.

[Figure: three distance matrices A, B, and C, with C = A ⊙ B.]

Efficient (min, +) multiplication [Tiskin; Algorithmica 2015]
The distance product of two n × n simple unit-Monge matrices can be computed in time O(n log n).
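
For reference, the definition of the distance product translates directly into the naive cubic-time routine below. This is only the definition in code, not Tiskin's O(n log n) algorithm for simple unit-Monge matrices; the function name `distance_product` is an illustrative choice, and the inputs are assumed to be non-empty lists of lists.

```python
def distance_product(A, B):
    """Naive (min, +) product of an m x k matrix A and a k x n matrix B."""
    m, k, n = len(A), len(B), len(B[0])
    if any(len(row) != k for row in A):
        raise ValueError("inner dimensions of A and B must agree")
    # C[i][j] = min over r of A[i][r] + B[r][j]
    return [[min(A[i][r] + B[r][j] for r in range(k)) for j in range(n)]
            for i in range(m)]
```

When A holds distances from one boundary to a middle boundary and B holds distances from the middle boundary onward, each entry of the product is the length of a shortest path through the best middle vertex r.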

SLIDE 48

Algorithm for substitutions

Goal: Maintain the distance matrix of the alignment graph.

We maintain a hierarchy of decompositions of the alignment graph into 2^i × 2^i blocks. For each block we maintain a distance matrix.

[Figure: the distance matrix of a block obtained from the distance matrices of smaller blocks via ⊙.]

  • O(n/2^i) distance matrices change at level i.
  • Each of them is recomputed from four distance matrices of the previous level in O(2^i log(2^i)) time using distance multiplication.
  • The total update time is thus O(n log^2 n).
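
The total follows by summing the per-level cost over all levels of the hierarchy. The worked sum below assumes levels i = 0, …, ⌈log n⌉:

```latex
\[
\sum_{i=0}^{\lceil \log n \rceil}
  \underbrace{O\!\left(\frac{n}{2^{i}}\right)}_{\text{blocks changed at level } i}
  \cdot
  \underbrace{O\!\left(2^{i}\log 2^{i}\right)}_{\text{cost per block}}
  \;=\; \sum_{i=0}^{\lceil \log n \rceil} O(n \cdot i)
  \;=\; O(n \log^{2} n).
\]
```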

SLIDE 54

Remarks

  • Insertions and deletions can be handled by carefully resizing blocks.
  • The actual LCS can be retrieved within the same time complexity by tracing back the computations.
  • Fragment-to-fragment LCS queries can be answered in time O(n log^2 n).
  • String alignment with integer weights ≤ w can be reduced to LCS by replacing each letter by a string of size O(w), as shown by Tiskin. Update time: Õ(nw).

Next: An Õ(n√n)-time algorithm for integer weights of size n^{O(1)} using techniques for computing shortest paths in planar graphs.

SLIDE 55

MSSP

Multiple Source Shortest Paths (MSSP) [Klein; SODA 2005]
We can construct in nearly linear time (in the size of the graph) a data structure that can report in logarithmic time the distance between any node on the infinite face and any node in the graph.

SLIDE 58

FR-Dijkstra

Dense Distance Graphs (DDGs)
The distance matrix capturing pairwise distances between the vertices of a set ∂H of vertices of a planar graph H, lying on a single face, can be computed in Õ(|H| + |∂H|^2) time using MSSP.

FR-Dijkstra [Fakcharoenphol-Rao; JCSS 2006]
We can compute shortest paths from a single source in a collection of DDGs with N vertices in total (with multiplicities) in Õ(N) time.

Before: distance product. Now: SSSP computations, many DDGs.

SLIDE 65

FR-Dijkstra

[Figure: vertices u, v, x, y and an intermediate vertex z.]

Monge Property: d(u, y) + d(v, x) ≤ d(u, x) + d(v, y).
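
The inequality follows from a short crossing argument. The sketch below rests on assumptions that the transcript does not state explicitly: u, v, x, y lie on a single face in an order that forces every u-to-x path to intersect every v-to-y path, and z is a vertex shared by a shortest u-to-x path and a shortest v-to-y path.

```latex
\begin{align*}
d(u,y) &\le d(u,z) + d(z,y), \\
d(v,x) &\le d(v,z) + d(z,x), \\
d(u,y) + d(v,x) &\le \bigl(d(u,z) + d(z,x)\bigr) + \bigl(d(v,z) + d(z,y)\bigr)
                 = d(u,x) + d(v,y).
\end{align*}
```

The last equality holds because z lies on shortest paths from u to x and from v to y.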

SLIDE 67

Algorithm for Large Weights

[Figure: the alignment graph of two example strings, partitioned into pieces of size Θ(√n) × Θ(√n).]

SLIDE 68

Algorithm for Large Weights

We maintain a DDG for each piece P with the set of “boundary” vertices as ∂P.

|P| = Θ(n), |∂P| = Θ(√n).

SLIDE 69

Algorithm for Large Weights

Each update in one of the strings affects O(√n) pieces. The DDG information for each piece is recomputed in Õ(n) time using MSSP.

SLIDE 70

Algorithm for Large Weights

We run FR-Dijkstra on the union of O(√n · √n) = O(n) DDGs. The runtime is Õ(n√n), since each DDG has O(√n) vertices.
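
The two contributions to the update time can be tallied as follows. The accounting assumes that the alignment graph is a grid with Θ(n) vertices per side, cut into pieces with Θ(√n) vertices per side, so that each piece has Θ(n) vertices and Θ(√n) boundary vertices:

```latex
\begin{align*}
\text{number of pieces} &= \Theta(\sqrt{n}) \cdot \Theta(\sqrt{n}) = \Theta(n), \\
\text{recomputing the affected DDGs} &= O(\sqrt{n}) \cdot \tilde{O}(n) = \tilde{O}(n\sqrt{n}), \\
\text{FR-Dijkstra on all DDGs} &= \tilde{O}(N), \qquad N = O(n) \cdot O(\sqrt{n}) = O(n\sqrt{n}).
\end{align*}
```

Both terms are Õ(n√n), which gives the stated update time.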

SLIDE 77

Final Remarks

Extension: We can in fact also handle copy-paste operations.

Open problems:
  • Can we do better than Õ(n√n) for large weights?
  • What if one string is given as a straight-line program (SLP)?
    [Tiskin; arXiv 2007]: The LCS of a standard string of length n and a string given by an SLP of size N can be computed in Õ(n · N) time.
  • How about maintaining an approximation of the edit distance/LCS in the dynamic setting?
    [Andoni-Nosatzki; arXiv 2020]: The edit distance can be O(1)-approximated in O(n^{1+ε}) time for any ε > 0.
    [Mitzenmacher-Seddighin; STOC 2020]: Dynamic LIS and distance to monotonicity.

SLIDE 78

Thank You

Thank you for your attention!
