slide-1
SLIDE 1

Algorithms for Computing the Longest Parameterized Common Subsequence

Costas S. Iliopoulos (1), Marcin Kubica (2), M. Sohel Rahman (1) and Tomasz Waleń (2)

(1) Algorithm Design Group, Department of Computer Science, King's College London

(2) Faculty of Mathematics, Informatics and Applied Mathematics, Warsaw University, Poland

CPM, 2007-07-11

slide-2
SLIDE 2

The LPCS Problem

The LPCS (longest parameterized common subsequence) problem is a generalization of the well-known LCS problem with added gap constraints.

Definition

In LPCS(X, Y, K1, K2, D) we look for the longest increasing sequences of indices P[1..l] and Q[1..l] such that:

  • X[P[i]] = Y[Q[i]]: the indices describe a common subsequence.
  • K1 ≤ P[i + 1] − P[i] ≤ K2 and K1 ≤ Q[i + 1] − Q[i] ≤ K2: gaps between consecutive matches are not shorter than K1 and not longer than K2.
  • |(P[i + 1] − P[i]) − (Q[i + 1] − Q[i])| ≤ D: the corresponding gaps in the two sequences cannot differ by more than D.

slide-6
SLIDE 6

The LCS and LPCS Problems

LCS

X: a x x c b d e
Y: a b d c o o e

LPCS, K1 = 1, K2 = 3, D = 1

X: a x x c b d e
Y: a b d c o o e

slide-7
SLIDE 7

The LCS and LPCS Problems

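[Figure: dynamic-programming table over X and Y with cell (i, j) marked]

LCS(i, j) = 1 + max{LCS(x, y) : 1 ≤ x < i, 1 ≤ y < j}

The recurrence applies to cells with X[i] = Y[j]; the maximum of an empty set is taken as 0.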

slide-8
SLIDE 8

The LCS and LPCS Problems

[Figure: dynamic-programming table over X and Y with cell (i, j) and the parameters K1, K2, D marked]

LPCS(i, j) = 1 + max{LPCS(x, y) : K1 ≤ i − x ≤ K2, K1 ≤ j − y ≤ K2, |(i − x) − (j − y)| ≤ D}
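The recurrence above can be evaluated directly. The following is a minimal reference sketch, not the algorithm presented in these slides: it scans every admissible predecessor of each matching cell, so it is far slower than O(n^2). The function name `lpcs_length` and the example strings are assumptions made for illustration, as is the requirement K1 ≥ 1 (so that the index sequences are strictly increasing).

```python
def lpcs_length(x: str, y: str, k1: int, k2: int, d: int) -> int:
    """Length of the longest parameterized common subsequence (naive DP).

    Assumes 1 <= k1 <= k2 and d >= 0. Indices are 0-based here.
    """
    n, m = len(x), len(y)
    # best[i][j] = length of the longest valid chain ending with the match
    # (i, j); it stays 0 when x[i] != y[j].
    best = [[0] * m for _ in range(n)]
    answer = 0
    for i in range(n):
        for j in range(m):
            if x[i] != y[j]:
                continue
            longest_prev = 0
            for gap_x in range(k1, k2 + 1):          # gap_x = i - px
                px = i - gap_x
                if px < 0:
                    break
                # gap_y must lie in [k1, k2] and within d of gap_x.
                lo = max(k1, gap_x - d)
                hi = min(k2, gap_x + d)
                for gap_y in range(lo, hi + 1):      # gap_y = j - py
                    py = j - gap_y
                    if py < 0:
                        break
                    longest_prev = max(longest_prev, best[px][py])
            best[i][j] = 1 + longest_prev
            answer = max(answer, best[i][j])
    return answer


# Hypothetical example strings (an assumed reading of the LCS/LPCS example).
x, y = "axxcbde", "abdcooe"
print(lpcs_length(x, y, 1, len(x), len(x)))  # unconstrained gaps (plain LCS): 4
print(lpcs_length(x, y, 1, 3, 1))            # K1 = 1, K2 = 3, D = 1: 3
```

With unconstrained gaps the function reduces to plain LCS; tightening K2 and D rules out the chain a, b, d, e and leaves a, c, e.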

slide-9
SLIDE 9

The FIG, ELAG, RIFIG and RELAG problems

The LPCS problem is a generalization of four problems introduced by C. S. Iliopoulos and M. S. Rahman (ISAAC 2006):

Definition

  • FIG(X, Y, K) = LPCS(X, Y, 1, K, n): LCS problem with fixed gaps.
  • ELAG(X, Y, K1, K2) = LPCS(X, Y, K1, K2, n): LCS problem with elastic gaps.
  • RIFIG(X, Y, K) = LPCS(X, Y, 1, K, 0): LCS problem with rigid fixed gaps.
  • RELAG(X, Y, K1, K2) = LPCS(X, Y, K1, K2, 0): LCS problem with rigid elastic gaps.

slide-14
SLIDE 14

The LPCS Problem

[Figure: dynamic-programming table over X and Y with cell (i, j) and the parameter K marked]

FIG(i, j) = 1 + max{FIG(x, y) : i − x, j − y ≤ K}

slide-15
SLIDE 15

The LPCS Problem

[Figure: dynamic-programming table over X and Y with cell (i, j) and the parameters K1, K2 marked]

ELAG(i, j) = 1 + max{ELAG(x, y) : K1 ≤ i − x, j − y ≤ K2}

slide-16
SLIDE 16

The LPCS Problem

[Figure: dynamic-programming table over X and Y with cell (i, j) and the parameter K marked]

RIFIG(i, j) = 1 + max{RIFIG(x, y) : i − x = j − y ≤ K}

slide-17
SLIDE 17

The LPCS Problem

[Figure: dynamic-programming table over X and Y with cell (i, j) and the parameters K1, K2 marked]

RELAG(i, j) = 1 + max{RELAG(x, y) : K1 ≤ i − x = j − y ≤ K2}

slide-18
SLIDE 18

Previous Results

Summary of previously known results

PROBLEM   Previous Results         Our Results
LPCS      −                        O(min(n^2, n + R log n))
FIG       O(n^2 + R log log n)     O(min(n^2, n + R log n))
ELAG      O(n^2 + R log log n)     O(min(n^2, n + R log n))
RIFIG     O(n^2)                   O(n + R)
RELAG     O(n^2 + R(K2 − K1))      O(n + R)

Where R is the total number of matches.
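To make the parameter R concrete, here is a small sketch that counts matches, assuming the usual meaning of the term: the number of index pairs (i, j) with X[i] = Y[j]. The function name `count_matches` and the example strings are hypothetical.

```python
from collections import Counter


def count_matches(x: str, y: str) -> int:
    """Number of index pairs (i, j) with x[i] == y[j]."""
    in_x, in_y = Counter(x), Counter(y)
    # Each symbol contributes (occurrences in x) * (occurrences in y) pairs;
    # a Counter returns 0 for symbols it has not seen.
    return sum(count * in_y[symbol] for symbol, count in in_x.items())


print(count_matches("axxcbde", "abdcooe"))  # 5
```

R ranges from 0 to n^2, so bounds stated in terms of R pay off when the strings share few matching positions.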

slide-19
SLIDE 19

Max-queue data structure

The max-queue data structure maintains the maximum of the last L elements inserted into the queue.

Operations

  • init(Q, L): initialize the queue and set the history length.
  • insert(Q, x): insert a new element.
  • max(Q): return the maximum of the last L inserted elements.

All operations run in O(1) amortized time.
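One common way to obtain these bounds is a monotonic deque. The sketch below takes that approach; the slides do not say how the authors implement the structure, so the class layout, the `snapshot` helper and the method names are illustrative assumptions.

```python
from collections import deque


class MaxQueue:
    """Maximum of the last `history_length` inserted values, O(1) amortized."""

    def __init__(self, history_length: int) -> None:
        self.history_length = history_length
        self.inserted = 0           # number of insert() calls so far
        self.candidates = deque()   # (insertion index, value), values decreasing

    def insert(self, value) -> None:
        # An older value that is not larger than the new one can never be
        # the maximum again, so drop it.
        while self.candidates and self.candidates[-1][1] <= value:
            self.candidates.pop()
        self.candidates.append((self.inserted, value))
        self.inserted += 1
        # Drop the front candidate once it falls out of the last L insertions.
        while self.candidates[0][0] < self.inserted - self.history_length:
            self.candidates.popleft()

    def max(self):
        # Raises IndexError if nothing has been inserted yet.
        return self.candidates[0][1]

    def snapshot(self) -> tuple:
        """Current candidate values, front (the maximum) first."""
        return tuple(value for _, value in self.candidates)


queue = MaxQueue(4)
for value in (1, 7, 5, 2, 6):
    queue.insert(value)
print(queue.max())       # 7
print(queue.snapshot())  # (7, 6)
```

Each value is appended once and removed at most once, which is where the amortized O(1) cost per operation comes from.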

slide-20
SLIDE 20

Max-queue data structure example

Example

For the sequence (1, 7, 5, 2, 6, 3, 1, 5) and L = 4, the queue contents after each insertion are:

Inserted   Last L elements   MaxQueue
1          1                 (1)
7          1 7               (7)
5          1 7 5             (7, 5)
2          1 7 5 2           (7, 5, 2)
6          7 5 2 6           (7, 6)
3          5 2 6 3           (6, 3)
1          2 6 3 1           (6, 3, 1)
5          6 3 1 5           (6, 5)

The first element of MaxQueue is always the current maximum.

slide-28
SLIDE 28

The Algorithm for LPCS

Algorithm

[Figure: dynamic-programming table over X and Y with cell (i, j) marked]

  • Dynamic programming.
  • Three-level max-queue.
  • Time complexity: O(n^2).

slide-29
SLIDE 29

The Algorithm for FIG and ELAG

Algorithm

For R = o(n^2 / log n):

  • Dynamic programming.
  • Uses a dictionary data structure providing insertion, removal and max-range queries.
  • O(R) steps (for matches only), each in O(log n) time.
  • Time complexity: O(n + R log n).
  • Can be extended to solve LPCS in O(n + R log n) time.

slide-30
SLIDE 30

The Algorithm for RIFIG and RELAG

Algorithm

  • Dynamic programming.
  • Each diagonal is processed separately.
  • O(R) steps (for matches only).
  • Each step in O(1) amortized time (using a max-queue).
  • Time complexity: O(n + R).
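The diagonal-by-diagonal idea can be sketched as follows. With rigid gaps, a match (i, j) can only extend matches on its own diagonal, so each diagonal becomes a one-dimensional sliding-window maximum. This is a simplified illustration rather than the algorithm from the slides: it visits every cell of every diagonal, so it runs in O(n · m) time instead of O(n + R), and the function names are hypothetical.

```python
from collections import deque


def relag_length(x: str, y: str, k1: int, k2: int) -> int:
    """Longest common subsequence with rigid elastic gaps, 1 <= k1 <= k2."""
    n, m = len(x), len(y)
    answer = 0
    for offset in range(-(n - 1), m):        # cells with j - i == offset
        i0, j0 = max(0, -offset), max(0, offset)
        length = min(n - i0, m - j0)
        values = [0] * length                # chain length ending at each cell
        window = deque()                     # positions, values decreasing
        for t in range(length):
            entering = t - k1                # newest position a chain may come from
            if entering >= 0:
                while window and values[window[-1]] <= values[entering]:
                    window.pop()
                window.append(entering)
            while window and window[0] < t - k2:
                window.popleft()             # farther back than k2: expired
            if x[i0 + t] == y[j0 + t]:
                values[t] = 1 + (values[window[0]] if window else 0)
                answer = max(answer, values[t])
    return answer


def rifig_length(x: str, y: str, k: int) -> int:
    """RIFIG is the special case K1 = 1, K2 = K."""
    return relag_length(x, y, 1, k)


print(rifig_length("axxcbde", "abdcooe", 3))     # 3
print(relag_length("axxcbde", "abdcooe", 1, 2))  # 2
```

The deque plays the role of the max-queue: positions enter the window K1 steps late and leave once they are more than K2 steps behind.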

slide-31
SLIDE 31

Conclusions

Conclusions

  • A new problem, LPCS, which generalizes FIG, ELAG, RIFIG and RELAG.
  • A simplified and faster O(n^2) algorithm for the LPCS problem.
  • An O(n + R log n) algorithm for the ELAG and LPCS problems.
  • An O(n + R) algorithm for RIFIG and RELAG.

slide-32
SLIDE 32

The End

Thank you for your attention!