


SLIDE 1

Efficient algorithms for two extensions of LPF table: the power of suffix arrays

M.Crochemore C.S.Iliopoulos M.Kubica W.Rytter T.Waleń SOFSEM 2010

M.Crochemore, C.S.Iliopoulos, M.Kubica, W.Rytter, T.Waleń Efficient algorithms for two extensions of LPF table

SLIDE 2

Introduction

Preliminaries
Input: a string y[0 . . n − 1].
Auxiliary data structures: the suffix array (SUF), the longest common prefix array (LCP), and range minimum/maximum queries (RMQ) over SUF and LCP.
All of these can be built in O(n) time.
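As a concrete picture of these auxiliary structures, here is a minimal Python sketch. It is not the linear-time construction the slide refers to: SUF is built by a plain comparison sort, and the RMQ is a linear scan standing in for a constant-time structure. The function names `suffix_array`, `lcp_array` and `suffix_lcp` are illustrative and do not come from the talk.

```python
def suffix_array(y):
    """SUF: start positions of the suffixes of y in lexicographic order.

    Naive comparison sort for illustration only; it is NOT linear time.
    """
    return sorted(range(len(y)), key=lambda i: y[i:])


def lcp_array(y, suf):
    """LCP[r] = longest common prefix of suffixes suf[r-1] and suf[r]; LCP[0] = 0.

    Kasai-style scan: h drops by at most 1 per step, so the scan is O(n).
    Returns the LCP array together with the rank (inverse of SUF) array.
    """
    n = len(y)
    rank = [0] * n
    for r, i in enumerate(suf):
        rank[i] = r
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] == 0:
            h = 0
            continue
        j = suf[rank[i] - 1]
        while i + h < n and j + h < n and y[i + h] == y[j + h]:
            h += 1
        lcp[rank[i]] = h
        h = max(h - 1, 0)
    return lcp, rank


def suffix_lcp(lcp, rank, i, j):
    """lcp of the suffixes starting at i != j: a range minimum over LCP.

    A linear scan stands in for the O(1) RMQ structure assumed on the slide.
    """
    lo, hi = sorted((rank[i], rank[j]))
    return min(lcp[lo + 1:hi + 1])
```

For y = abbabbaba this gives SUF = [8, 6, 3, 0, 7, 5, 2, 4, 1] and LCP = [0, 1, 2, 5, 0, 2, 3, 1, 4].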

SLIDE 4

Introduction

We consider two variants of the classical problem:
The Longest Previous Factor Problem (LPF)
LPF[i] = the largest k such that y[i . . i + k − 1] also occurs starting at some position j < i (the two occurrences may overlap).
Well studied; the LPF table can be computed in O(n) time.

SLIDE 6

Introduction

The Longest Previous Reversed Factor Problem (LPrF)
LPrF[i] = the largest k such that rev(y[i . . i + k − 1]) occurs in y entirely before position i (without overlapping).
Generalises a factorization of strings used to extract certain types of palindromes [Kolpakov, Kucherov, 2008].
Applications in compression of genetic sequences (in combination with LPF) [Grumbach, Tahi, 1993].

SLIDE 8

Introduction

The Longest Previous Non-Overlapping Factor Problem (LPnF)
LPnF[i] = the largest k such that y[i . . i + k − 1] also occurs in y entirely before position i (without overlapping).
Emerged from a version of Ziv-Lempel factorization: a decomposition of a string into already processed factors.
Applications in algorithms computing repetitions in strings [Crochemore, 1986], [Main, 1989], [Kolpakov, Kucherov, 1999].

SLIDE 9

Introduction

Example

position i   0 1 2 3 4 5 6 7 8
y[i]         a b b a b b a b a
LPF[i]       0 0 1 5 4 3 2 2 1
LPrF[i]      0 0 2 1 3 3 2 2 1
LPnF[i]      0 0 1 3 3 3 2 2 1

[Figure: three copies of y with the matching factors for i = 4 highlighted.]
LPnF[4] = 3, LPrF[4] = 3, LPF[4] = 4
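The three definitions can be checked against the example with a brute-force sketch that follows them literally. It is quadratic or worse and only meant as a reference implementation; the function name `lpf_tables` is illustrative and not from the talk.

```python
def lpf_tables(y):
    """Return (LPF, LPrF, LPnF) for y by direct application of the definitions."""
    n = len(y)
    lpf, lprf, lpnf = [0] * n, [0] * n, [0] * n
    for i in range(n):
        for k in range(1, n - i + 1):
            w = y[i:i + k]
            # LPF: w also starts somewhere before i (it may run past i).
            if y.find(w) < i:
                lpf[i] = k
            # LPnF: w occurs entirely inside y[0 .. i-1].
            if w in y[:i]:
                lpnf[i] = k
            # LPrF: the reverse of w occurs entirely inside y[0 .. i-1].
            if w[::-1] in y[:i]:
                lprf[i] = k
    return lpf, lprf, lpnf
```

For y = abbabbaba the function returns LPF = [0, 0, 1, 5, 4, 3, 2, 2, 1], LPrF = [0, 0, 2, 1, 3, 3, 2, 2, 1] and LPnF = [0, 0, 1, 3, 3, 3, 2, 2, 1], with positions counted from 0.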

SLIDE 10

The Alternating Search Technique

Assumptions
We assume that the following operations are given and that each takes O(1) time:
Val(k): an integer function, non-increasing for i ≤ k ≤ j,
Candidate(k): a predicate,
FirstMin(i, j): the first position k ∈ [i . . j] with the minimum value of Val(k),
NextCand(i, j): any candidate k ∈ [i . . j).

SLIDE 12

The Alternating Search Technique

Goal
For a given range [i . . j], find a candidate k maximizing Val(k).

Alternating-Search(i, j)
[Figure: positions i, kopt, k1 and k0 = j on a line, with alternating FirstMin and NextCand steps.]

Running time: O(Val(kopt) − Val(j) + 1)
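The slide gives only the picture, so the following Python sketch is a reconstruction from the stated assumptions rather than the authors' pseudocode. Since Val is non-increasing, FirstMin(i, k) returns the leftmost position whose value equals Val(k); any candidate strictly to its left has a strictly larger value, so every round raises Val by at least 1. The toy oracles (`val`, `cands`, and the linear-time `first_min` and `next_cand`) are invented for the demonstration and stand in for the O(1) operations assumed above.

```python
def alternating_search(i, j, candidate, first_min, next_cand):
    """Return a candidate k in [i .. j] maximizing Val, or None if there is none."""
    # Start from j itself if it is a candidate, else from any candidate left of j.
    k = j if candidate(j) else next_cand(i, j)
    best = None
    while k is not None:
        best = k
        f = first_min(i, k)      # leftmost position with Val equal to Val(k)
        k = next_cand(i, f)      # any candidate in [i .. f): strictly larger Val
    return best


# Toy instance (hypothetical data): Val is non-increasing, candidates are 3 and 5.
val = [5, 5, 4, 4, 3, 3, 1]
cands = {3, 5}


def candidate(k):
    return k in cands


def first_min(i, j):
    m = min(val[i:j + 1])
    return next(k for k in range(i, j + 1) if val[k] == m)


def next_cand(i, j):
    # Any candidate in the half-open range [i .. j); here the rightmost one.
    return next((k for k in range(j - 1, i - 1, -1) if k in cands), None)
```

On this instance `alternating_search(0, 6, candidate, first_min, next_cand)` visits k = 5 and then k = 3 and returns 3, the candidate with the largest value Val(3) = 4.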

SLIDE 14

Computation of the LPrF table

Calculate SUF and LCP for x = y#rev(y).
LPrF[i] = max{RMQ(LCP[i . . j]) : j > 2n − i}
Here RMQ(LCP[i . . j]) stands for the longest common prefix of the suffixes of x starting at positions i and j, obtained as a range minimum over LCP between their ranks in SUF. The condition j > 2n − i selects the suffixes of rev(y) that correspond to factors of y ending before position i.

Example
[Figure: sorted suffixes of x = y # rev(y), with position i, the threshold 2n − i, and the values LPrF<[i] and LPrF>[i] marked.]
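The reduction to x = y#rev(y) can be tried out with a small Python sketch. It compares suffixes letter by letter instead of using SUF, LCP and RMQ, so it illustrates only the index condition j > 2n − i, not the linear running time. The names `common_prefix` and `lprf_via_concatenation` are illustrative, and the separator is assumed not to occur in y.

```python
def common_prefix(x, p, q):
    """Length of the longest common prefix of x[p:] and x[q:] (naive scan)."""
    k = 0
    while p + k < len(x) and q + k < len(x) and x[p + k] == x[q + k]:
        k += 1
    return k


def lprf_via_concatenation(y, sep="#"):
    """LPrF[i] = max lcp(x[i:], x[j:]) over j > 2n - i, where x = y + sep + rev(y)."""
    n = len(y)
    x = y + sep + y[::-1]
    table = []
    for i in range(n):
        # Suffixes of x starting at j > 2n - i spell reversed prefixes of y[0 .. i-1].
        js = range(2 * n - i + 1, 2 * n + 1)
        table.append(max((common_prefix(x, i, j) for j in js), default=0))
    return table
```

For y = abbabbaba this yields [0, 0, 2, 1, 3, 3, 2, 2, 1], the LPrF row of the example.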

SLIDE 16

Computation of the LPrF table

LPrF[i + 1] ≥ LPrF[i] − 1
[Figure: example string with position i and the value LPrF[i] marked.]

An instance of the alternating search (using SUF and LCP for x, and RMQ).
O(n) running time.
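The inequality follows in one step from the definition. The short derivation below is a sketch of that argument, with j denoting the start of a witness occurrence (a symbol not used on the slide):

```latex
\begin{aligned}
&\text{Let } k = \mathrm{LPrF}[i] \ge 1 \text{ with } y[i \,..\, i+k-1] = \mathrm{rev}\bigl(y[j \,..\, j+k-1]\bigr),\quad j + k \le i. \\
&\text{Dropping the first letter on the left drops the last letter of the reversed factor:} \\
&\qquad y[i+1 \,..\, i+k-1] = \mathrm{rev}\bigl(y[j \,..\, j+k-2]\bigr). \\
&\text{This factor ends at } j+k-2 < i+1, \text{ so } \mathrm{LPrF}[i+1] \ge k-1 = \mathrm{LPrF}[i]-1.
\end{aligned}
```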

SLIDE 18

Computation of the LPnF table

LPnF[i + 1] ≥ LPnF[i] − 1
[Figure: example string with position i and the value LPnF[i] marked.]

Boundary case (squares): handled using runs [Kolpakov, Kucherov, 1999].
General case: the alternating search (using SUF and LCP for y, and RMQ).
O(n) running time.
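One way to see why squares form the boundary case: for an earlier position j < i, the usable match length is the common prefix of the two suffixes truncated at i − j. The truncation only takes effect when the common prefix is at least i − j, that is, when y[j . . 2i − j − 1] is a square. The quadratic Python sketch below checks this formulation against the example. It is not the linear algorithm of the talk, and the name `lpnf_via_truncated_lcp` is illustrative.

```python
def lpnf_via_truncated_lcp(y):
    """LPnF[i] = max over j < i of min(lcp(y[i:], y[j:]), i - j)."""
    n = len(y)

    def lcp(p, q):
        # Naive common-prefix length of y[p:] and y[q:].
        k = 0
        while p + k < n and q + k < n and y[p + k] == y[q + k]:
            k += 1
        return k

    table = []
    for i in range(n):
        # Truncating at i - j keeps the earlier occurrence entirely before i.
        table.append(max((min(lcp(i, j), i - j) for j in range(i)), default=0))
    return table
```

For y = abbabbaba the result is [0, 0, 1, 3, 3, 3, 2, 2, 1]. At i = 4 the best j is 1, where the common prefix 4 is cut to i − j = 3 because bbabba contains the square (bba)(bba).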

SLIDE 19

Summary

Our results
The LPrF and LPnF tables can be computed in O(n) time.
The optimal parsing of a text using factors and/or reversed factors can be computed in O(n) time.
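The slides do not spell out the parsing rule. As one plausible reading, the sketch below assumes a left-to-right greedy parse in which each phrase is the longest previous factor, the longest previous reversed factor, or a single fresh letter. It takes the LPF and LPrF tables as given inputs. The function name `greedy_parse` and the phrase labels are hypothetical.

```python
def greedy_parse(y, lpf, lprf):
    """Greedy left-to-right parse of y from precomputed LPF and LPrF tables.

    Assumption: each phrase is the longer of the previous factor and the
    previous reversed factor at the current position, or one new letter.
    """
    phrases = []
    i = 0
    while i < len(y):
        k = max(lpf[i], lprf[i])
        if k == 0:
            phrases.append(("letter", y[i]))
            i += 1
        elif lpf[i] >= lprf[i]:
            phrases.append(("factor", y[i:i + k]))
            i += k
        else:
            phrases.append(("reversed", y[i:i + k]))
            i += k
    return phrases
```

With the tables of the example (y = abbabbaba), the parse is a, b, ba (reversed), bbab (factor), a (factor): five phrases.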

SLIDE 20

Thank you for your attention!
