Efficiently Testing Simon’s Congruence

slide-1
SLIDE 1

Efficiently Testing Simon’s Congruence

Paweł Gawrychowski (1), Maria Kosche (2), Tore Koß (2), Florin Manea (2), Stefan Siemer (2)

(1) University of Wrocław

(2) Göttingen University

September 16, 2020

slide-2
SLIDE 2

Subsequences

w = bacbaabada

◮ Subsequences of w: a, b, c, d, ba, bc, bb, baab, bbb, aaaaa, ababa
◮ Not a subsequence of w: abc (the only c is at position 3, and no a followed by b occurs before it)

slide-6
SLIDE 6

Subsequences

[Figure: a word w with marked positions i_1 < i_2 < i_3 < … < i_k]

Subsequence

We call w′ a subsequence of length k of a word w, where |w| = n, if there exist positions 1 ≤ i_1 < i_2 < … < i_k ≤ n such that w′ = w[i_1]w[i_2]⋯w[i_k].

Set of Subsequences of length at most k

Let SF_{≤k}(i, w) denote the set of subsequences of length at most k of w[i : n], and SF_k(i, w) the set of those of length exactly k. Accordingly, the set of subsequences of length at most k of the entire word w is denoted by SF_{≤k}(1, w).

Example:
SF_2(1, abaca) = {aa, ab, ac, ba, bc, ca}
SF_{≤2}(1, abaca) = {a, b, c, aa, ab, ac, ba, bc, ca}
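The definitions above can be tried out with a small Python sketch. It assumes 1-based positions as on the slide and, like the slide's example, leaves out the empty word. The helper names `is_subsequence`, `sf_exact`, and `sf_at_most` are illustrative and do not come from the talk. The enumeration is brute force, so it is only meant for short words.

```python
from itertools import combinations


def is_subsequence(u, w):
    """Return True if u can be obtained from w by deleting letters."""
    it = iter(w)
    # `ch in it` advances the iterator, so letters are matched left to right.
    return all(ch in it for ch in u)


def sf_exact(w, k, i=1):
    """Subsequences of length exactly k of w[i : n] (i is 1-based)."""
    suffix = w[i - 1:]
    return {"".join(c) for c in combinations(suffix, k)}


def sf_at_most(w, k, i=1):
    """Non-empty subsequences of length at most k of w[i : n]."""
    result = set()
    for m in range(1, k + 1):
        result |= sf_exact(w, m, i)
    return result


if __name__ == "__main__":
    print(sorted(sf_exact("abaca", 2)))
    print(sorted(sf_at_most("abaca", 2)))
    print(is_subsequence("ababa", "bacbaabada"))
```

The brute-force sets grow exponentially with k; they serve here only to make the notation concrete.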

slide-7
SLIDE 7

Simon’s Congruence

Simon’s Congruence

(i) Let w, w′ ∈ Σ*. We say that w and w′ are equivalent under Simon’s congruence ∼_k if SF_{≤k}(1, w) = SF_{≤k}(1, w′).
(ii) Let i and j be positions of w, with |w| = n. We define i ∼_k j (w.r.t. w) if w[i : n] ∼_k w[j : n], and we say that the positions i and j are k-equivalent.
(iii) A word u of length k distinguishes w and w′ w.r.t. ∼_k if u occurs in exactly one of the sets SF_{≤k}(1, w) and SF_{≤k}(1, w′).

Example: w = abacab, w′ = baacabba

◮ SF_2(1, w) = {aa, ab, ac, ba, bb, bc, ca, cb}
◮ SF_2(1, w′) = {aa, ab, ac, ba, bb, bc, ca, cb}
◮ SF_2(1, w) = SF_2(1, w′), and both words use exactly the letters a, b, c ⇒ w ∼_2 w′
◮ bbb ∉ SF_3(1, w), bbb ∈ SF_3(1, w′)
◮ SF_3(1, w) ≠ SF_3(1, w′) ⇒ w ≁_3 w′
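The example can be checked mechanically. The following Python sketch tests Simon's congruence by comparing the subsequence sets directly and lists distinguishing words; the function names `sim_k` and `distinguishing_words` are assumptions made for illustration. This is a brute-force check of the definition, not the efficient algorithm presented in the talk.

```python
from itertools import combinations


def sf_at_most(w, k):
    """Non-empty subsequences of w of length at most k (brute force)."""
    return {"".join(c) for m in range(1, k + 1) for c in combinations(w, m)}


def sim_k(w, w2, k):
    """True if w and w2 have the same subsequences of length at most k."""
    return sf_at_most(w, k) == sf_at_most(w2, k)


def distinguishing_words(w, w2, k):
    """Words of length exactly k that occur in exactly one of the two sets."""
    difference = sf_at_most(w, k) ^ sf_at_most(w2, k)
    return sorted(u for u in difference if len(u) == k)


if __name__ == "__main__":
    w, w2 = "abacab", "baacabba"
    print(sim_k(w, w2, 2))                      # the words agree up to length 2
    print(sim_k(w, w2, 3))                      # but not up to length 3
    print("bbb" in distinguishing_words(w, w2, 3))
```

For the two example words, `bbb` is one of the distinguishing words of length 3, since w contains only two occurrences of b.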

slide-17
SLIDE 17

Problem Definition

SimK

Given two words s and t over an alphabet Σ, with |s| = n, |t| = n′, and n ≥ n′, and a natural number k, decide whether s ∼_k t.

MaxSimK

Given two words s and t over an alphabet Σ, with |s| = n, |t| = n′, and n ≥ n′, find the maximum k for which s ∼_k t.
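To make the two problem statements concrete, here is a naive Python baseline for MaxSimK that simply increases k until the subsequence sets differ. The name `max_sim_k` and the convention of returning `None` for identical words (which are equivalent for every k) are assumptions of this sketch. Its running time is exponential, in contrast to the linear-time algorithm the talk is about.

```python
from itertools import combinations


def sf_at_most(w, k):
    """Non-empty subsequences of w of length at most k (brute force)."""
    return {"".join(c) for m in range(1, k + 1) for c in combinations(w, m)}


def max_sim_k(s, t):
    """Largest k with s ~_k t, or None if s == t (equivalent for all k)."""
    if s == t:
        return None
    k = 0
    # For s != t the loop stops at the latest when k + 1 reaches the length
    # of the longer word, because that word then distinguishes the two.
    while sf_at_most(s, k + 1) == sf_at_most(t, k + 1):
        k += 1
    return k


if __name__ == "__main__":
    print(max_sim_k("abacab", "baacabba"))
```

SimK for a given k then amounts to checking whether k is at most the value returned here.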

slide-18
SLIDE 18

History

◮ Line of research originating in the PhD thesis of Imre Simon from 1972.
◮ Long history of algorithm designs and improvements for associated problems. State of the art:
    SimK: optimal linear time [DLT 2020]
    MaxSimK: O(n log n) time [DLT 2020]
◮ Today: an optimal linear-time algorithm for the MaxSimK problem.

slide-19
SLIDE 19

Simon-tree

Equivalence Classes

[Figure: positions i < l < j marked in w] SF_k(i, w) ⊇ SF_k(l, w) ⊇ SF_k(j, w)

◮ Splitting a word suffixwise into blocks of equivalence classes w.r.t. ∼_k
◮ If i ∼_k j, then SF_k(i, w) = SF_k(l, w) = SF_k(j, w), and we say that i, l, and j are in the same k-block
◮ ∼_{k+1} is a refinement of ∼_k
◮ Index i is a (k + 1)-splitting position if i ∼_k i + 1 but not i ∼_{k+1} i + 1

slide-20
SLIDE 20

Equivalence Classes

Use these properties to build a block structure for a word:

  1. i ∼_1 j iff alph(w[i : n]) = alph(w[j : n]) for any positions i, j of w.
     → We can go from right to left through the word and determine the 1-splitting positions.
  2. Split a k-block into (k + 1)-blocks by going from right to left through the block (without its last letter) and determining the (k + 1)-splitting positions exactly as for 1-splitting positions.

Example: w = bacbaabada

1-blocks: bac | baab | ad | a
2-blocks: b | a | c | b | aa | b | a | d | a
3-blocks: b | a | c | b | a | a | b | a | d | a
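As a cross-check of the block structure, the k-blocks can also be computed directly from the definition of ∼_k, by grouping neighbouring positions whose suffixes have the same subsequences of length at most k. The Python sketch below does this by brute force; `k_blocks` is a hypothetical helper name, and the approach is deliberately naive rather than the right-to-left scan described in the talk.

```python
from itertools import combinations


def sf_at_most(w, k):
    """Non-empty subsequences of w of length at most k (brute force)."""
    return frozenset(
        "".join(c) for m in range(1, k + 1) for c in combinations(w, m)
    )


def k_blocks(w, k):
    """k-blocks of w as 1-based (start, end) pairs, listed left to right."""
    n = len(w)
    # suffix_sets[i] describes the suffix starting at position i + 1.
    suffix_sets = [sf_at_most(w[i:], k) for i in range(n)]
    blocks, start = [], 1
    for i in range(1, n):
        if suffix_sets[i] != suffix_sets[i - 1]:
            # Positions i and i + 1 are not k-equivalent: a block ends at i.
            blocks.append((start, i))
            start = i + 1
    if n:
        blocks.append((start, n))
    return blocks


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, k_blocks("bacbaabada", k))
```

Grouping only neighbouring positions is enough because the suffix sets are nested, so equivalent positions always form an interval.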

slide-30
SLIDE 30

Simon-tree Definition

◮ New data structure: Simon-tree
◮ Represents the presented block structure
◮ Efficiently partitions the positions of a given word
◮ Construction takes linear time

slide-31
SLIDE 31

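[Figure: the Simon-tree of w = bacbaabada]

position  1  2  3  4  5  6  7  8   9  10  11
w         b  a  c  b  a  a  b  a   d  a   $
X         4  5  ∞  7  6  8  ∞  10  ∞  ∞   ∞

k = 0: [1 : 10] bacbaabada
k = 1: [10] a, [8:9] ad, [4:7] baab, [1:3] bac
k = 2: [9] d, [8] a (children of [8:9]); [7] b, [5:6] aa, [4] b (children of [4:7]); [3] c, [2] a, [1] b (children of [1:3])
k = 3: [6] a, [5] a (children of [5:6])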

slide-32
SLIDE 32

Simon-tree Definition

Simon-tree

The Simon-tree T_w associated with the word w, with |w| = n, is an ordered rooted tree. The nodes represent k-blocks of w, for 0 ≤ k ≤ n, and are defined recursively.

◮ The root corresponds to the 0-block of the word w, i.e., the interval [1 : n].
◮ For k ≥ 1 and for a node b on level k − 1 which represents a (k − 1)-block [i : j] with i < j, the children of b are exactly the blocks of the partition of [i : j] into k-blocks, ordered decreasingly by their starting position.
◮ For k ≥ 1, each node on level k − 1 which represents a (k − 1)-block [i : i] is a leaf.

slide-33
SLIDE 33

Simon-tree Construction

◮ Algorithm: Build the Simon-tree right to left as the word is traversed right to left. Only the leftmost branch is edited during construction.

  1. The level (block) to which a new position/letter should be assigned (resp., belongs) is computed efficiently.
  2. Insert the new position/letter into the tree by moving up the leftmost branch from leaf to root.
  3. Close traversed blocks on the path until the level for the new position/letter is reached.
  4. Insert the new position/letter as a leftmost child on its corresponding level.
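The slides do not spell out the linear-time procedure in detail, so the following Python sketch is not that algorithm. It builds the same tree in a simpler, recursive way from the block-splitting rule stated earlier in the talk (split a block, without its last letter, at each letter that is new when scanning right to left), under the reading that the last position of a block always forms its own child. The names `Node`, `split_right_to_left`, `build`, `simon_tree`, and `levels` are hypothetical, and the sentinel `$` is assumed not to occur in the word.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    start: int          # 1-based first position of the block
    end: int            # 1-based last position of the block
    level: int          # k, the level of the block in the tree
    children: list = field(default_factory=list)


def split_right_to_left(w, start, end):
    """Cut positions start..end at every first occurrence of a letter,
    scanning right to left; blocks are returned right to left."""
    blocks, seen, block_end = [], set(), end
    for i in range(end, start - 1, -1):
        if w[i - 1] not in seen:
            if i < end:
                blocks.append((i + 1, block_end))
            block_end = i
            seen.add(w[i - 1])
    blocks.append((start, block_end))
    return blocks


def build(w, start, end, level=0):
    node = Node(start, end, level)
    if start < end:
        # The last position is a child of its own; the rest of the block
        # is split as for 1-blocks.
        parts = [(end, end)] + split_right_to_left(w, start, end - 1)
        node.children = [build(w, a, b, level + 1) for a, b in parts]
    return node


def simon_tree(w):
    """Build the tree for w + '$', then drop the sentinel leaf again."""
    root = build(w + "$", 1, len(w) + 1)
    root.children = root.children[1:]
    root.end = len(w)
    return root


def levels(node, out=None):
    """Collect the (start, end) intervals of all nodes, grouped by level."""
    out = {} if out is None else out
    out.setdefault(node.level, []).append((node.start, node.end))
    for child in node.children:
        levels(child, out)
    return out


if __name__ == "__main__":
    print(levels(simon_tree("bacbaabada")))
```

Children are stored in decreasing order of their starting position, as in the definition of the Simon-tree. The recursion rescans blocks on every level, so this sketch does not achieve the linear running time claimed for the construction in the talk.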

slide-34
SLIDE 34

Simon-tree Construction

position  1  2  3  4  5  6  7  8   9  10  11
w         b  a  c  b  a  a  b  a   d  a   $
X         4  5  ∞  7  6  8  ∞  10  ∞  ∞   ∞
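The row X is not defined on the slide. Its values are consistent with X[i] being the position of the next occurrence of the letter w[i] to the right of i, or ∞ if there is none. Assuming that reading, the row can be computed in a single right-to-left pass; `next_occurrence` is an illustrative name.

```python
import math


def next_occurrence(w):
    """Return X as a list, where X[i - 1] is the smallest j > i with
    w[j] == w[i] (1-based positions), or math.inf if no such j exists."""
    last_seen = {}                       # letter -> leftmost position seen so far
    x = [math.inf] * len(w)
    for i in range(len(w), 0, -1):
        letter = w[i - 1]
        x[i - 1] = last_seen.get(letter, math.inf)
        last_seen[letter] = i
    return x


if __name__ == "__main__":
    print(next_occurrence("bacbaabada$"))
```

With a dictionary keyed by letter, the pass takes time linear in the length of the word (in expectation, given hashing).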

slide-35
SLIDE 35

Simon-tree Construction

[Animation, slides 35 to 75: the Simon-tree of w = bacbaabada$ is built while the word is read from right to left. A block written as m] is still open, i.e., its left end is not yet fixed.]

The frames show the following sequence of changes:

  1. Open the root 11] on level 0; add the leaf [11] ($) on level 1.
  2. Open 10] (a) on level 1; close it as [10].
  3. Open 9] (d) on level 1; add the leaf [9] on level 2.
  4. Open 8] (a) on level 2; close it as [8]; close 9] as [8:9] (ad).
  5. Open 7] (b) on level 1; add the leaf [7] on level 2.
  6. Open 6] (a) on level 2; add the leaf [6] on level 3.
  7. Open 5] (a) on level 3; close it as [5]; close 6] as [5:6] (aa).
  8. Open 4] (b) on level 2; close it as [4]; close 7] as [4:7] (baab).
  9. Open 3] (c) on level 1; add the leaf [3] on level 2.
  10. Open 2] (a) on level 2; close it as [2].
  11. Open 1] (b) on level 2; close it as [1]; close 3] as [1:3] (bac).
  12. Close the root as [1 : 11] (bacbaabada$); then remove the sentinel $: the node [11] is dropped and the root becomes [1 : 10] (bacbaabada).

Resulting Simon-tree:

k = 0: [1 : 10] bacbaabada
k = 1: [10] a, [8:9] ad, [4:7] baab, [1:3] bac
k = 2: [9] d, [8] a (children of [8:9]); [7] b, [5:6] aa, [4] b (children of [4:7]); [3] c, [2] a, [1] b (children of [1:3])
k = 3: [6] a, [5] a (children of [5:6])

slide-76
SLIDE 76

Short Recap

So far: a structure for one word, representing the equivalence classes w.r.t. ∼_k.
Now: relate two words to each other by using their respective Simon-trees.

MaxSimK

Given two words s and t over an alphabet Σ, with |s| = n, |t| = n′, and n ≥ n′, find the maximum k for which s ∼_k t.

slide-77
SLIDE 77

Connecting Two Simon-trees

◮ Transform the words s and t into Simon-trees as shown.
◮ Use the tree structure to connect equivalent nodes of the two words.

S-Connection

The k-node a of T_s and the k-node b of T_t are S-connected (i.e., the pair (a, b) is in the S-connection) if and only if s[i : n] ∼_k t[j : n′] for all positions i in block a and all positions j in block b.

slide-79
SLIDE 79

From P-Connection to S-Connection

Start from a larger relation (the P-Connection), which contains the S-Connection, and refine it.

◮ The 0-nodes of T_s and T_t are P-connected.
◮ For all levels k of T_s, if the explicit or implicit k-nodes a and b (from T_s and T_t, respectively) are P-connected, then the i-th child of a is P-connected to the i-th child of b, for all i.
◮ No other nodes are P-connected.

slide-80
SLIDE 80

From P-Connection to S-Connection

[Figure: the Simon-trees of abacab$ and baacabba$, with P-connected nodes linked]

slide-81
SLIDE 81

From P-Connection to S-Connection

How to refine the P-Connection:

◮ Let k ≥ 1. Let a and b be k-blocks in the words s and t, respectively, with a ∼_k b.
◮ Let a′ be a child of a and b′ a child of b.
◮ a′ ≁_{k+1} b′ if and only if there exists a letter x such that s[next(a′, x) + 1 : n] ≁_k t[next(b′, x) + 1 : n′].

slide-82
SLIDE 82

From P-Connection to S-Connection

[Figure, animated over slides 82 to 88: the P-connection between the Simon-trees of abacab$ and baacabba$ is refined step by step]

slide-89
SLIDE 89

Additional Notes and Analysis

◮ Solution of MaxSimK: the last level k at which the k-blocks containing position 1 of the two input words are equivalent.
◮ A distinguishing word can be obtained.
◮ By efficiently using union-find and split-find data structures, the algorithm achieves an optimal linear running time.
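The slides do not show how the union-find structure is used inside the algorithm. For readers unfamiliar with it, here is a generic textbook union-find in Python with union by size and path compression; the class name `UnionFind` and its interface are assumptions of this sketch. This generic version does not by itself give the linear bound stated in the talk, and the split-find structure is not shown.

```python
class UnionFind:
    """Disjoint sets over the elements 0 .. n - 1."""

    def __init__(self, n):
        self.parent = list(range(n))     # every element starts as its own root
        self.size = [1] * n

    def find(self, x):
        """Return the representative of the set containing x."""
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path directly at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        """Merge the sets of a and b; return False if already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra              # attach the smaller tree to the larger
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return True


if __name__ == "__main__":
    uf = UnionFind(5)
    uf.union(0, 1)
    uf.union(1, 2)
    print(uf.find(0) == uf.find(2), uf.find(3) == uf.find(0))
```

In the setting of the talk, such a structure would group positions or nodes that are found to be equivalent; that mapping is an assumption here and is not taken from the slides.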


Thank you!