An Adaptive Hybrid Pattern-Matching Algorithm on Indeterminate Strings

SLIDE 1

An Adaptive Hybrid Pattern-Matching Algorithm on Indeterminate Strings

W. F. Smyth¹, Shu Wang¹ and Mao Yu¹

¹ Algorithms Research Group, Department of Computing & Software, McMaster University, Canada
email: smyth,shuw@mcmaster.ca

The Prague Stringology Conference 2008

Smyth, Wang, Yu Adaptive Indeterminate Pattern-Matching Algorithm

SLIDE 2

Outline

1. Introduction
2. Fundamental Algorithms
  • The Knuth-Morris-Pratt Algorithm
  • The Sunday Adaption of Boyer-Moore Algorithm
  • The Shift-And Algorithm
  • The Franek-Jennings-Smyth Algorithm
3. Special Properties of Indeterminate Borders
4. The New Hybrid Algorithm
  • Outline of the New Algorithm
  • Shift-And Matching
  • Sunday-Shift
  • Examples
5. Experimental Results
6. Conclusion

SLIDE 9

Regular Pattern Matching Algorithms

Over the last several decades, dozens of regular pattern-matching algorithms have been proposed.

  • Window shifting: KMP [KMP77], BM [BM77], FJS [FJS06], etc.
  • Bit-parallel: Shift-Or [Döm68, WM92, BYG92], BNDM [NR98], etc.

SLIDE 11

Indeterminate Pattern-Matching Algorithms

Intuitive approach: modify existing regular pattern-matching algorithms to perform indeterminate pattern-matching.

  • Shift-Or: can easily be modified for indeterminate pattern-matching, at the same speed as regular pattern-matching.
  • iBMS: a very fast indeterminate pattern-matching algorithm based on BMS, proposed in [HSW06b].
  • iFJS: an indeterminate pattern-matching algorithm based on a modified FJS (cut-off border array), proposed in [HSW06a].

SLIDE 14

Indeterminate String

  • A (regular) string x on Σ is a finite sequence of letters drawn from Σ.
  • Two letters λ, µ ∈ Σ are said to match (λ ≈ µ) iff λ = µ.
  • Consider any specified subset S = {λ1, λ2, ..., λj} of Σ, j ≥ 2. We introduce the idea of an indeterminate letter λ = λS with the property that it matches every element of S (but no other letter of Σ); we write λ ≈ λ1, λ ≈ λ2, ..., λ ≈ λj.
  • Given two subsets S, T of Σ, |S| ≥ 2, |T| ≥ 2, and indeterminate letters λ, µ associated with S, T respectively, λ ≈ µ ⇔ S ∩ T ≠ ∅.
  • Given two indeterminate strings x and y, x ≈ y ⇔ (|x| = |y|) ∧ (∀i ∈ 1..|x|, x[i] ≈ y[i]).
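The matching rules above can be sketched in a few lines of Python. This is only an illustration under an assumed representation, not the authors' implementation: each letter is written as a Python string listing the symbols it may stand for (a regular letter has one symbol, an indeterminate letter has two or more), and the function names are hypothetical.

```python
def letters_match(s, t):
    """Two letters match iff their symbol sets intersect (S ∩ T ≠ ∅).

    Assumption: a letter is a string of its allowed symbols, e.g. "a" is a
    regular letter and "ac" is an indeterminate letter matching a or c.
    """
    return bool(set(s) & set(t))


def strings_match(x, y):
    """x ≈ y iff the lengths agree and every pair of aligned letters matches."""
    return len(x) == len(y) and all(letters_match(a, b) for a, b in zip(x, y))


# "ac" matches "cg" because both can stand for c; "ac" and "gt" share nothing.
print(letters_match("ac", "cg"), letters_match("ac", "gt"))
print(strings_match(["a", "ct"], ["ag", "t"]))
```

With one-symbol letters the test reduces to ordinary equality, so regular strings are a special case of this representation.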

SLIDE 16

The Knuth-Morris-Pratt Algorithm

  • A well-known linear-time pattern-matching algorithm.
  • Based on border array calculation.
  • However, not very fast in practice.
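As a reminder of how the border array drives KMP on regular strings, here is a minimal Python sketch. It is the textbook formulation with 0-based indices, not code from the talk, and the function names are chosen only for illustration.

```python
def border_array(p):
    """beta[i] = length of the longest border of p[0..i] (regular strings only)."""
    beta = [0] * len(p)
    k = 0  # length of the border currently being extended
    for i in range(1, len(p)):
        while k > 0 and p[i] != p[k]:
            k = beta[k - 1]  # fall back to the next shorter border
        if p[i] == p[k]:
            k += 1
        beta[i] = k
    return beta


def kmp_search(p, x):
    """Return the 0-based start positions of all occurrences of p in x."""
    if not p:
        return []
    beta = border_array(p)
    out, k = [], 0  # k = number of pattern letters currently matched
    for i, c in enumerate(x):
        while k > 0 and c != p[k]:
            k = beta[k - 1]
        if c == p[k]:
            k += 1
        if k == len(p):
            out.append(i - k + 1)
            k = beta[k - 1]  # continue with the longest border of p
    return out


print(border_array("abaab"))
print(kmp_search("aba", "ababa"))
```

The fallback step `k = beta[k - 1]` relies on every border of a border being a border of the whole prefix, which holds for regular strings.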

SLIDE 17

The KMP Algorithm (slides 17-19)

[Figure: KMP on text x[1..n] and pattern p[1..m], shown in three steps. Step 1: the pattern matches the text up to a mismatch between letters a and b (positions i and j). Step 2: the longest border of the matched prefix is marked; the two marked regions are equal. Step 3: the pattern is shifted and comparison resumes.]

SLIDE 20

The Sunday Adaption of Boyer-Moore Algorithm

  • A simplified version of the Boyer-Moore algorithm.
  • Worst-case time complexity O(mn).
  • However, very fast in practice.
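For reference, a minimal Python sketch of Sunday's variant on regular strings follows. It uses the usual textbook rule (shift by the distance from the rightmost occurrence of the letter just past the window to the end of the pattern, or by m + 1 if the letter does not occur); the function name and 0-based indexing are choices made for this sketch.

```python
def sunday_search(pattern, text):
    """Return the 0-based start positions of all occurrences of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # shift[c] = m - (rightmost 0-based position of c in pattern);
    # later positions overwrite earlier ones, so the rightmost one wins.
    shift = {c: m - i for i, c in enumerate(pattern)}
    out = []
    start = 0
    while start + m <= n:
        if text[start:start + m] == pattern:
            out.append(start)
        if start + m >= n:
            break  # no letter follows the window, so no further shift is possible
        # Look at the letter just past the window; unseen letters shift by m + 1.
        start += shift.get(text[start + m], m + 1)
    return out


print(sunday_search("ab", "xxabxxab"))
```

Each window is compared naively, which is where the O(mn) worst case comes from, while the shift table lets most windows be skipped on typical text.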

SLIDE 21

The BMS Algorithm (slides 21-26)

[Figure: BMS on text x[1..n] and pattern p[1..m], shown in six steps with the current window x[i−m+1..i]. The steps show a mismatch in the window, the letter x[i+1] that determines the shift, and the labels "not match" and "compare".]

SLIDE 27

The Shift-And Algorithm

  • Makes use of the bit-parallel nature of computer words.
  • Time complexity O(mn/w), where w is the machine word size.
  • Can be easily modified for indeterminate pattern-matching.

SLIDE 28

The Shift-And Algorithm

m\Σ  A  C  G  T
A    1  0  0  0
A    1  0  0  0
T    0  0  0  1
C    0  1  0  0
G    0  0  1  0

Preprocessing();
BitArray: D[1..n]
D[1] ← S_x[1]
for i = 2 to n do
    D[i] ← (Shift(D[i − 1]) & S_x[i]);
    if D[i] & 10^(m−1) then output i − m + 1;
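The same idea can be written as a short Python sketch for regular strings. It assumes bit i of a mask corresponds to pattern position i (0-based) and that the shift step feeds a 1 into the lowest bit, which is the standard Shift-And formulation; the names are illustrative only.

```python
def shift_and_search(pattern, text):
    """Return the 0-based start positions of all occurrences of pattern in text."""
    m = len(pattern)
    if m == 0:
        return []
    # masks[c] has bit i set iff pattern[i] == c (one column of the table above).
    masks = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, 0) | (1 << i)
    accept = 1 << (m - 1)  # bit set when the whole pattern has been matched
    d = 0                  # bit j set iff pattern[0..j] matches the text ending here
    out = []
    for i, c in enumerate(text):
        # Shift in a 1 so that a new match may start at every text position.
        d = ((d << 1) | 1) & masks.get(c, 0)
        if d & accept:
            out.append(i - m + 1)
    return out


print(shift_and_search("AATCG", "GAATCGAATCG"))
```

Python integers are unbounded, so this sketch ignores the word size w; a fixed-width implementation would process the pattern in blocks of w bits.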

SLIDE 29

The Franek-Jennings-Smyth Algorithm

  • A hybrid algorithm that combines the KMP and BMS algorithms.
  • Inherits the merits of both: very fast both asymptotically (O(n)) and in practice.

SLIDE 30

Outline of the FJS Algorithm

  1. Perform Sunday shift along the text.
  2. When a match of letters is found at the end of the pattern, switch to KMP matching.
  3. Continue KMP matching until no border can be used, then switch back to Sunday shift.

SLIDE 32

Example of Non-transitivity Effect

Suppose we are performing KMP matching along the text.

Index              1 2 3 4 5 6 7
x          ......  a a b b a b b  ......
p                  a ∗ ∗ b a ∗ a
1st Shift              a ∗ ∗ b a  ......
2nd Shift                  a ∗ ∗  ......
3rd Shift                    a ∗  ......

Table: First example of the non-transitivity effect

SLIDE 33

Proposition: Shifting the pattern to the right according to the longest border cannot guarantee a prefix match.

Proposition: A border of a border of x is not necessarily a border of x.
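A small Python check makes the second proposition concrete. It is a sketch under one assumption not spelled out on the slides: the symbol `*` is treated as a letter that matches every letter. The helper names are hypothetical.

```python
WILDCARD = "*"  # assumption: '*' matches any letter


def letters_match(a, b):
    return a == b or a == WILDCARD or b == WILDCARD


def strings_match(u, v):
    return len(u) == len(v) and all(letters_match(a, b) for a, b in zip(u, v))


def border_lengths(s):
    """All k with 0 < k < len(s) such that the prefix and suffix of length k match."""
    n = len(s)
    return [k for k in range(1, n) if strings_match(s[:k], s[n - k:])]


# Matching is not transitive: a ≈ * and * ≈ b, yet a does not match b.
print(letters_match("a", "*"), letters_match("*", "b"), letters_match("a", "b"))

# "a*" is a border of "a*b", and "a" is a border of "a*",
# but "a" is not a border of "a*b".
print(border_lengths("a*b"), border_lengths("a*"))
```

Because the border lengths of a prefix are no longer a subset of the border lengths of the whole string, a KMP-style chain of fallbacks through the border array is not sound for indeterminate strings.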

SLIDE 34

Index                  1 2 3 4 5 6 7
x              ......  a b a ∗ a ∗ a  ......
p                      a b a a a b b
Wrong Shift                    a b a  ......
Correct Shift              a b a a a  ......

Table: Second example of the non-transitivity effect

Proposition: Shifting the pattern to the right according to the longest border can miss some occurrences in between.

SLIDE 35

Impact

  • Because of these properties, transforming regular pattern-matching algorithms that use border arrays (KMP, FJS, etc.) into indeterminate pattern-matching algorithms is non-trivial.
  • However, since some of these regular algorithms are very fast in practice and have nice properties, we are motivated to invent indeterminate versions of them that avoid using border arrays.

SLIDE 37

A New Hybrid Algorithm

We propose a new hybrid algorithm that uses Shift-And and BMS as complementary shift engines.

  1. Perform Sunday shift along the text.
  2. When a match of letters is found at the end of the pattern, switch to Shift-And matching.
  3. Continue Shift-And matching until no match can be found at the current position (D = 0), then skip to the next possible position and switch back to Sunday shift.

SLIDE 38

Shift-And Preprocessing

The usual Shift-And preprocessing is modified as follows:

for i = 1 to m
    for j = 1 to |Σ|
        if MATCH(p[i], Σ[j]) then S[i, j] = 1
        else S[i, j] = 0
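In Python, this preprocessing might look as follows. The sketch assumes each pattern position is given as a string of the symbols it may stand for (so MATCH becomes a membership test) and packs column j of S into one integer per alphabet letter; `build_masks` is a hypothetical name.

```python
def build_masks(pattern, alphabet):
    """masks[c] has bit i set iff pattern position i (0-based) matches letter c.

    Assumption: pattern is a list of strings, each listing the allowed symbols
    of one position, e.g. ["b", "abc"] for b followed by any of a, b, c.
    """
    masks = {c: 0 for c in alphabet}
    for i, allowed in enumerate(pattern):
        for c in alphabet:
            if c in allowed:  # plays the role of MATCH(p[i], Σ[j])
                masks[c] |= 1 << i
    return masks


print(build_masks(["b", "abc"], "abc"))
```

The only change from the regular case is that one pattern position may set its bit in several masks; the search loop itself is untouched, which is why Shift-And carries over to indeterminate patterns so easily.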

SLIDE 39

Properties of Shift-And

Notice some important properties of Shift-And.

Proposition: D[j] = 1 ⇔ p[1..j] ≈ x[i − j + 1..i].

Proposition: D = 0 if and only if there does not exist any j ∈ 1..m such that p[1..j] ≈ x[i − j + 1..i].

These properties enable us to move the pattern beyond x[i] when we finish ShiftAnd-Match.

SLIDE 40

ShiftAnd-MATCH

D ← 0
do
    D ← ((D ≪ 1) | 1) & S_x[i]
    if D & 10^(m−1) ≠ 0 then output i
    i ← i + 1
    // If D = 0, terminate the loop according to the previous proposition
while (i ≤ n and D ≠ 0)

SLIDE 41

BMS Preprocessing

The usual BMS preprocessing is modified as follows:

for i = 1 to |∆|
    ∆[i] = m + 1
for i = 1 to m
    for j = 1 to |Σ|
        if MATCH(p[i], Σ[j]) then ∆[Σ[j]] = m − i + 1
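A Python sketch of this table construction is given below. It assumes the table stores shift distances (so that it can be added directly to the window position in Sunday-Shift), that pattern positions are strings of allowed symbols, and 0-based indexing; `build_sunday_shifts` is a hypothetical name.

```python
def build_sunday_shifts(pattern, alphabet):
    """shifts[c] = distance to move the window when letter c follows it.

    Assumption: pattern is a list of strings, each listing the allowed symbols
    of one position. With 0-based i, the shift m - i aligns pattern position i
    with the letter just past the current window.
    """
    m = len(pattern)
    shifts = {c: m + 1 for c in alphabet}  # default: jump past the letter
    for i, allowed in enumerate(pattern):
        for c in alphabet:
            if c in allowed:
                shifts[c] = m - i  # later (rightmost) positions overwrite earlier ones
    return shifts


print(build_sunday_shifts(["a", "b"], "abc"))
print(build_sunday_shifts(["b", "abc"], "abc"))
```

An indeterminate position near the end of the pattern shortens the shift for every letter it matches, so highly indeterminate patterns give the Sunday engine less room to skip.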

SLIDE 42

Sunday-Shift

while not MATCH(p[m], x[i′]) do
    i′ ← i′ + ∆[x[i′ + 1]]
    if i′ > n then return

SLIDE 43

Algorithm Shift-And/Sunday

i′ ← m; m′ ← m − 1;
while i′ ≤ n do
    Sunday-Shift();
    i ← i′ − m′;
    // After Sunday-Shift stops, perform ShiftAnd-Match
    ShiftAnd-Match();
    // After ShiftAnd-Match stops, shift the pattern to the right
    i′ ← i + m′;
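The control flow above can be sketched end to end in Python. This is an illustrative reading of the slides rather than the authors' code, and it makes several assumptions: the text is a regular string, each pattern position is a string of its allowed symbols, indices are 0-based, match start positions are reported, and the Shift-And step feeds a 1 into the lowest bit. The name `hybrid_search` is hypothetical.

```python
def hybrid_search(pattern, text):
    """Return 0-based start positions where the indeterminate pattern matches text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    masks, shifts = {}, {}
    for i, allowed in enumerate(pattern):
        for c in allowed:
            masks[c] = masks.get(c, 0) | (1 << i)  # Shift-And column for letter c
            shifts[c] = m - i                      # Sunday shift, rightmost position wins
    accept = 1 << (m - 1)
    out = []
    end = m - 1  # text index aligned with the last pattern letter (i′ on the slides)
    while end < n:
        # Sunday-Shift: skip windows whose last letter cannot match.
        while text[end] not in pattern[-1]:
            if end + 1 >= n:
                return out
            end += shifts.get(text[end + 1], m + 1)
            if end >= n:
                return out
        # ShiftAnd-Match: scan from the start of the window until D = 0.
        i = end - (m - 1)
        d = 0
        while True:
            d = ((d << 1) | 1) & masks.get(text[i], 0)
            if d & accept:
                out.append(i - m + 1)
            i += 1
            if i >= n or d == 0:
                break
        # D = 0 means no pattern prefix ends at text[i - 1], so the next
        # candidate window starts at i.
        end = i + (m - 1)
    return out


# Pattern "b*" (b followed by any of a, b, c) on the text "babc".
print(hybrid_search(["b", "abc"], "babc"))
```

On text where the last pattern letter rarely matches, the loop spends its time in the Sunday phase; where partial matches are frequent, it stays in the Shift-And phase, which is the adaptivity the talk describes.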

SLIDE 44

Example of The New Hybrid Algorithm (slides 44-50)

[Figure: step-by-step run of the hybrid algorithm with a pattern p beginning "b ∗ ..." on a text x containing "... b a b c ...". Each Shift-And step shows the state D, the shifted state, the mask for the current text letter (Sa, Sb, Sc) and the resulting D. The final steps are labelled "Begin Sunday-Shift".]

SLIDE 52

[Figure: execution time (seconds) versus text length (million letters) for BMS, ShiftAnd and Hybrid.]

SLIDE 56

  • In all of these tests, the hybrid algorithm's behaviour is very close to that of the better of BMS and Shift-And.
  • The new algorithm's total running time is very competitive among the three algorithms tested.

SLIDE 58

  • A new algorithm that performs fast pattern-matching on both regular and indeterminate strings.
  • Strong ability to adapt to the nature of the text/pattern and to achieve faster performance over cases that arise in practice.
  • This dynamic adaptivity is useful when we do not know the type of text or pattern: we don't need to decide ahead of time which algorithm to use.
  • Future work: indeterminate pattern-matching algorithms based on variants of Shift-And such as BNDM and [Fre07], as well as on new convolution techniques [AAR07].

SLIDE 59

[AAR07] Amihood Amir, Avivit Levy, and Liron Reuveni. The practical efficiency of convolutions in pattern matching. Fundamenta Informaticae, to appear, 2007.

[BM77] Robert S. Boyer and J Strother Moore. A fast string searching algorithm. Communications of the ACM, 20(10):762–772, 1977.

[BYG92] R. A. Baeza-Yates and G. H. Gonnet. A new approach to text searching. Communications of the ACM, 35(10):74–82, 1992.

[Döm68] Bálint Dömölki. A universal compiler system based on production rules. BIT, 8:262–275, 1968.

SLIDE 60

[FJS06] Frantisek Franek, Christopher G. Jennings, and W. F. Smyth. A simple fast hybrid pattern-matching algorithm. Journal of Discrete Algorithms, to appear, 2006.

[Fre07] Kimmo Fredriksson. Linear worst case time BNDM. Information Processing Letters, to appear, 2007.

[HSW06a] Jan Holub, W. F. Smyth, and Shu Wang. Hybrid pattern-matching algorithms on indeterminate strings. London Algorithmics and Stringology 2006, J. Daykin, M. Mohamed and K. Steinhoefel (eds.), King's College London Series Texts in Algorithmics, pages 115–133, 2006.

SLIDE 61

[HSW06b] Jan Holub, W. F. Smyth, and Shu Wang. Fast pattern-matching on indeterminate strings. Journal of Discrete Algorithms, to appear, 2006.

[KMP77] D. E. Knuth, J. H. Morris, and V. R. Pratt. Fast pattern matching in strings. SIAM Journal on Computing, 6(2):323–350, 1977.

[NR98] G. Navarro and M. Raffinot. A bit-parallel approach to suffix automata: Fast extended string matching. In M. Farach-Colton, editor, Proceedings of the 9th Annual Symposium on Combinatorial Pattern Matching, number 1448, pages 14–33, Piscataway, NJ, 1998. Springer-Verlag, Berlin.

SLIDE 62

[WM92] S. Wu and U. Manber. Fast text searching with errors. Communications of the ACM, 35(10):83–91, 1992.
