Computing longest common square subsequences
Takafumi Inoue¹, Shunsuke Inenaga¹, Heikki Hyyrö², Hideo Bannai¹, Masayuki Takeda¹
¹Kyushu University ²University of Tampere
CPM 2018
Longest Common Subsequence
LCS is a classical measure for string comparison. Standard DP solves this in O(n^2) time.
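The standard DP mentioned above can be sketched as follows. This is a minimal textbook version, not code from the talk; the function name `lcs_length` is chosen here for illustration only.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of a longest common subsequence of a and b, in O(|a| * |b|) time."""
    # dp[i][j] = LCS length of the prefixes a[:i] and b[:j]
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # The last characters match: extend the LCS of the shorter prefixes.
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                # Otherwise drop the last character of one of the prefixes.
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]
```

For two strings of length n the table has (n + 1)^2 cells and each cell takes constant time, which gives the O(n^2) bound.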
n is the length of the input strings. M is the number of matching points,
σ is the alphabet size.
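To make the parameter M concrete, here is a small sketch that lists the matching points of two strings. It assumes the usual meaning of the term, a pair (i, j) with a[i] == b[j], and uses 0-based positions; the name `matching_points` is hypothetical.

```python
from collections import defaultdict


def matching_points(a, b):
    """Return all pairs (i, j), 0-based, with a[i] == b[j]."""
    # Group the positions of b by character so each a[i] is looked up once.
    positions_in_b = defaultdict(list)
    for j, ch in enumerate(b):
        positions_in_b[ch].append(j)
    return [(i, j) for i, ch in enumerate(a) for j in positions_in_b.get(ch, [])]
```

M is then `len(matching_points(a, b))`. It ranges from 0 (no shared characters) to n^2 (both strings consist of one repeated character).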
A tuple r = (i, j, k, l) is called a matching rectangle (the four positions it marks all hold the same character c).
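A brute-force enumeration of matching rectangles might look like the sketch below. The slide does not spell out the full definition, so this code assumes that i < k are positions in the first string, j < l are positions in the second string, and all four positions hold the same character. Positions are 0-based and the name `matching_rectangles` is hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def matching_rectangles(a, b):
    """Return tuples (i, j, k, l) with i < k, j < l and a[i] == a[k] == b[j] == b[l].

    Assumption: this ordering and index convention is not stated in the slides.
    """
    pos_a = defaultdict(list)
    pos_b = defaultdict(list)
    for i, ch in enumerate(a):
        pos_a[ch].append(i)
    for j, ch in enumerate(b):
        pos_b[ch].append(j)

    rects = []
    for ch, in_a in pos_a.items():
        in_b = pos_b.get(ch, [])
        # combinations() on an increasing list yields pairs in increasing order.
        for i, k in combinations(in_a, 2):
            for j, l in combinations(in_b, 2):
                rects.append((i, j, k, l))
    return rects
```

Under this assumption every rectangle is built from two matching points with the same character, which is consistent with the bound R = O(M^2).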
For matching rectangles r = (i, j, k, l) and r’ = (i’, j’, k’, l’),
…
For each matching rectangle r, maintain a DP table D_r of size M^2.
For each character c, find the “closest” matching rectangle r_c.
Let R be the number of matching rectangles (R = O(M^2)). We compute D_r[r’] for R^2 = O(M^4) pairs of matching rectangles (r, r’).
We test σ characters to extend the current sequence
Each extension can be obtained in O(1) time
Always better to use a start matching rectangle that
We compute D_m[r’] for MR = O(M^3) pairs of a matching point m and a matching rectangle r’.
We test σ characters to extend the current sequence
Each extension can be obtained in O(1) time
For random text, M ≈ n^2/σ and R ≈ M^2/σ ≈ n^4/σ^3.
|A| = |B| = |C| = |D| = n
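The random-text estimate M ≈ n^2/σ can be checked empirically with a short sketch. It assumes two strings of length n drawn uniformly at random over an alphabet of size σ; the helper names and the fixed seed are choices made here, not part of the talk.

```python
import random
from collections import Counter


def count_matching_points(a, b):
    """Number of pairs (i, j) with a[i] == b[j], computed from character counts."""
    count_a, count_b = Counter(a), Counter(b)
    # A character occurring x times in a and y times in b contributes x * y pairs.
    return sum(count_a[ch] * count_b[ch] for ch in count_a)


def compare_with_estimate(n, sigma, seed=0):
    """Return (measured M, n^2 / sigma) for two uniformly random strings."""
    rng = random.Random(seed)
    a = [rng.randrange(sigma) for _ in range(n)]
    b = [rng.randrange(sigma) for _ in range(n)]
    return count_matching_points(a, b), n * n / sigma
```

Each of the n^2 position pairs matches with probability 1/σ, so the expected value of M is exactly n^2/σ. Substituting this into R ≈ M^2/σ gives (n^2/σ)^2/σ = n^4/σ^3.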