SLIDE 1 Efficiently Testing Simon’s Congruence
Paweł Gawrychowski 1 Maria Kosche 2 Tore Koß 2 Florin Manea 2 Stefan Siemer 2
1 University of Wrocław
2G¨
September 16, 2020
SLIDES 2-5
Subsequences
w = b a c b a a b a d a
Subsequences of w: a, b, c, d, ba, bc, bb, baab, bbb, aaaaa, ababa
Not a subsequence of w: abc (the only c in w is at position 3, and no b that follows an a occurs before it)
SLIDE 6
Subsequences
[Figure: positions i1 < i2 < i3 < ... < ik marked in the word w]
Subsequence
We call w′ a subsequence of length k of a word w, where |w| = n, if there exist positions 1 ≤ i1 < i2 < ... < ik ≤ n such that w′ = w[i1]w[i2] · · · w[ik].
Set of Subsequences of length at most k
Let SF≤k(i, w) denote the set of subsequences of length at most k of w[i : n]. Accordingly, the set of subsequences of length at most k of the entire word w is denoted by SF≤k(1, w). SFk(i, w) analogously denotes the subsequences of length exactly k.
Example:
SF2(1, abaca) = {aa, ab, ac, ba, bc, ca}
SF≤2(1, abaca) = {a, b, c, aa, ab, ac, ba, bc, ca}
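The two definitions above can be checked by brute force on small words. The following is a minimal sketch, not part of the talk; the function names `is_subsequence` and `subsequences_up_to` are made up for illustration, and the enumeration is exponential, so it is only meant for toy inputs.

```python
from itertools import combinations


def is_subsequence(u: str, w: str) -> bool:
    """True if u can be read in w from left to right, skipping letters."""
    remaining = iter(w)
    # `ch in remaining` advances the iterator, so matches stay in order.
    return all(ch in remaining for ch in u)


def subsequences_up_to(w: str, k: int, i: int = 1) -> set:
    """Brute-force SF<=k(i, w): non-empty subsequences of w[i:n] of length
    at most k, with i counted from 1 as on the slides."""
    suffix = w[i - 1:]
    result = set()
    for length in range(1, min(k, len(suffix)) + 1):
        for letters in combinations(suffix, length):
            result.add("".join(letters))
    return result


print(sorted(subsequences_up_to("abaca", 2)))
print(is_subsequence("ababa", "bacbaabada"))
print(is_subsequence("abc", "bacbaabada"))
```

On the slide example, the first call reproduces the nine words listed for SF≤2(1, abaca).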
SLIDES 7-16
Simon’s Congruence
Simon’s Congruence
(i) Let w, w′ ∈ Σ∗. We say that w and w′ are equivalent under Simon’s congruence ∼k if SF≤k(1, w) = SF≤k(1, w′).
(ii) Let i, j be positions of w, with |w| = n. We define i ∼k j (w.r.t. w) if w[i : n] ∼k w[j : n], and we say that the positions i and j are k-equivalent.
(iii) A word u of length k distinguishes w and w′ w.r.t. ∼k if u occurs in exactly one of the sets SF≤k(1, w) and SF≤k(1, w′).
Example: w = abacab, w′ = baacabba
SF2(1, w) = {aa, ab, ac, ba, bb, bc, ca, cb}
SF2(1, w′) = {aa, ab, ac, ba, bb, bc, ca, cb}
SF2(1, w) = SF2(1, w′) ⇒ w ∼2 w′
bbb ∉ SF3(1, w), bbb ∈ SF3(1, w′)
SF3(1, w) ≠ SF3(1, w′) ⇒ w ≁3 w′
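The example can be reproduced directly from definition (i). This is an illustrative brute-force sketch with hypothetical helper names (`sf_at_most`, `distinguishing_words`, `simon_equivalent`); it compares the full subsequence sets and therefore returns distinguishing words of length at most k, slightly more than definition (iii) asks for.

```python
from itertools import combinations


def sf_at_most(w: str, k: int) -> set:
    """Non-empty subsequences of w of length at most k (exponential enumeration)."""
    return {"".join(c) for m in range(1, k + 1) for c in combinations(w, m)}


def distinguishing_words(w1: str, w2: str, k: int) -> set:
    """Words of length <= k that are a subsequence of exactly one of w1, w2."""
    return sf_at_most(w1, k) ^ sf_at_most(w2, k)


def simon_equivalent(w1: str, w2: str, k: int) -> bool:
    """w1 ~k w2 holds exactly when no word of length <= k distinguishes them."""
    return not distinguishing_words(w1, w2, k)


w, w_prime = "abacab", "baacabba"
print(simon_equivalent(w, w_prime, 2))
print(simon_equivalent(w, w_prime, 3))
print("bbb" in distinguishing_words(w, w_prime, 3))
```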
SLIDE 17
Problem Definition
SimK
Given two words s and t over an alphabet Σ, with |s| = n and |t| = n′, with n ≥ n′, and a natural number k, decide whether s ∼k t.
MaxSimK
Given two words s and t over an alphabet Σ, with |s| = n and |t| = n′, with n ≥ n′, find the maximum k for which s ∼k t.
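As a reference point for MaxSimK, a naive baseline simply increases k until the subsequence sets differ. The sketch below is such a baseline under that assumption; it is exponential and is not the linear-time algorithm of the talk. The name `max_sim_k` and the choice to return `None` for identical words (which are equivalent for every k) are illustrative.

```python
from itertools import combinations
from typing import Optional


def sf_at_most(w: str, k: int) -> set:
    """Non-empty subsequences of w of length at most k."""
    return {"".join(c) for m in range(1, k + 1) for c in combinations(w, m)}


def max_sim_k(s: str, t: str) -> Optional[int]:
    """Largest k with s ~k t, or None if s == t (then s ~k t for all k)."""
    if s == t:
        return None
    k = 0
    # For s != t the longer word (or either one, at equal length) is a
    # subsequence of itself but not of the other, so the loop terminates.
    while sf_at_most(s, k + 1) == sf_at_most(t, k + 1):
        k += 1
    return k


print(max_sim_k("abacab", "baacabba"))
```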
SLIDE 18
History
◮ Line of research originating in the PhD thesis of Imre Simon from 1972
◮ Long history of algorithm designs and improvements for associated problems. State of the art:
   SimK: optimal linear time [DLT 2020]
   MaxSimK: O(n log n) time [DLT 2020]
◮ Today: an optimal linear-time algorithm for the MaxSimK problem.
SLIDE 19
Simon-tree
Equivalence Classes
[Figure: positions i < l < j marked in w] SFk(i, w) ⊃ SFk(l, w) ⊃ SFk(j, w)
◮ Splitting a word suffix-wise into blocks of equivalence classes w.r.t. ∼k
◮ If i ∼k j, then SFk(i, w) = SFk(l, w) = SFk(j, w), and we say that i, l, and j are in the same k-block
◮ ∼k+1 is a refinement of ∼k
◮ Index i is a (k + 1)-splitting position if i ∼k i + 1 but not i ∼k+1 i + 1
SLIDES 20-29
Equivalence Classes
Use these properties to build a block structure for a word:
1. i ∼1 j iff alph(w[i : n]) = alph(w[j : n]) for any positions i, j of w
→ We can go from right to left through the word and determine the 1-splitting positions.
2. Split a k-block into (k + 1)-blocks by going from right to left through the block (without its last letter) and determining the (k + 1)-splitting positions exactly as for 1-splitting positions.
[Animation: w = bacbaabada is split step by step]
1-blocks: bac | baab | ad | a
2-blocks: b | a | c | b | aa | b | a | d | a
3-blocks: b | a | c | b | a | a | b | a | d | a
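The two rules can be turned into a small quadratic-time sketch. It assumes one particular reading of rule 2, namely that the last position of a block forms a block of its own and the rest is split like a whole word; this reading matches the blocks of the running example but is an interpretation, and the names `split_interval` and `refine` are hypothetical.

```python
def split_interval(w: str, lo: int, hi: int) -> list:
    """Split positions lo..hi (1-based, inclusive) into maximal runs on which
    alph(w[p:hi]) is constant, scanning right to left. Blocks are returned
    right to left as (start, end) pairs."""
    blocks, seen, end = [], set(), hi
    for p in range(hi, lo - 1, -1):
        if w[p - 1] not in seen:
            seen.add(w[p - 1])
            if p < hi:
                # A new letter enlarges the suffix alphabet: the run p+1..end is done.
                blocks.append((p + 1, end))
                end = p
    blocks.append((lo, end))
    return blocks


def refine(w: str, blocks: list) -> list:
    """Turn k-blocks into (k+1)-blocks under the reading of rule 2 stated above."""
    out = []
    for lo, hi in blocks:
        if lo == hi:
            out.append((lo, hi))  # singleton blocks cannot be split further
        else:
            out.append((hi, hi))  # assumption: the last letter is its own block
            out.extend(split_interval(w, lo, hi - 1))
    return out


w = "bacbaabada"
one_blocks = split_interval(w, 1, len(w))
two_blocks = refine(w, one_blocks)
print(one_blocks)
print(two_blocks)
```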
SLIDE 30
Simon-tree Definition
◮ New data structure: Simon-tree
◮ Represents the block structure presented above
◮ Efficiently partitions the positions of a given word
◮ Construction takes linear time
SLIDE 31
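[Figure: Simon-tree of w = bacbaabada]
position 1 2 3 4 5 6 7 8 9 10 11
w b a c b a a b a d a $
X 4 5 ∞ 7 6 8 ∞ 10 ∞ ∞ ∞
k = 0: [1 : 10] bacbaabada
k = 1: [10] a, [8:9] ad, [4 : 7] baab, [1:3] bac
k = 2: [9] d, [8] a (children of [8:9]); [7] b, [5:6] aa, [4] b (children of [4 : 7]); [3] c, [2] a, [1] b (children of [1:3])
k = 3: [6] a, [5] a (children of [5:6])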
SLIDE 32 Simon-tree Definition
Simon-tree
The Simon-tree Tw associated to the word w, with |w| = n, is an ordered rooted tree. The nodes represent k-blocks of w, for 0 ≤ k ≤ n, and are defined recursively.
◮ The root corresponds to the 0-block of the word w, i.e., the interval [1 : n].
◮ For k > 1 and for a node b on level k − 1, which represents a (k − 1)-block [i : j] with i < j, the children of b are exactly the blocks of the partition of [i : j] in k-blocks, ordered decreasingly by their starting position.
◮ For k > 1, each node on the level k − 1 which represents a (k − 1)-block [i : i] is a leaf.
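To make the recursive definition concrete, here is a naive top-down sketch that builds the tree as nested nodes. It is not the linear-time construction from the talk: it rescans every block, and it assumes that a non-root block is split by setting its last position aside and partitioning the rest by suffix alphabets. `Node`, `runs`, `build` and `levels` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    lo: int  # first position of the block (1-based)
    hi: int  # last position of the block
    children: list = field(default_factory=list)


def runs(w: str, lo: int, hi: int) -> list:
    """Maximal runs of lo..hi with constant alph(w[p:hi]), listed right to left."""
    out, seen, end = [], set(), hi
    for p in range(hi, lo - 1, -1):
        if w[p - 1] not in seen:
            seen.add(w[p - 1])
            if p < hi:
                out.append((p + 1, end))
                end = p
    out.append((lo, end))
    return out


def build(w: str, lo: int, hi: int, is_root: bool = True) -> Node:
    """Naive recursive construction; children are ordered by decreasing start."""
    node = Node(lo, hi)
    if lo < hi:
        # Assumption: below the root, the last position is a block of its own.
        parts = runs(w, lo, hi) if is_root else [(hi, hi)] + runs(w, lo, hi - 1)
        node.children = [build(w, a, b, False) for a, b in parts]
    return node


def levels(node: Node, depth: int = 0, acc: dict = None) -> dict:
    """Collect the (lo, hi) intervals of all nodes, grouped by level k."""
    acc = {} if acc is None else acc
    acc.setdefault(depth, []).append((node.lo, node.hi))
    for child in node.children:
        levels(child, depth + 1, acc)
    return acc


tree = build("bacbaabada", 1, 10)
for k, blocks in sorted(levels(tree).items()):
    print(k, blocks)
```

For the running example this yields four levels, with [5:6] as the only block that is split at level 3.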
SLIDE 33 Simon-tree Construction
◮ Algorithm: Build the Simon-tree right to left as the word is traversed right to left. Only the leftmost branch is edited during construction.
1. The level (resp. block) to which a new position/letter should be assigned (resp. belongs) is computed efficiently.
2. Insert the new position/letter into the tree by moving up the leftmost branch from leaf to root.
3. Close traversed blocks on the path until the level for the new position/letter is reached.
4. Insert the new position/letter as a leftmost child on its corresponding level.
SLIDES 34-75
Simon-tree Construction
[Animation: the Simon-tree of w = bacbaabada is built over w$ = bacbaabada$, one position per step from right to left; only the leftmost branch is open at any time.]
position 1 2 3 4 5 6 7 8 9 10 11
w b a c b a a b a d a $
X 4 5 ∞ 7 6 8 ∞ 10 ∞ ∞ ∞
Steps shown in the animation:
◮ Position 11 ($): the root is opened; 1-node [11].
◮ Position 10 (a): 1-node [10].
◮ Positions 9 (d) and 8 (a): 2-nodes [9] and [8]; their 1-block [8:9] = ad is closed.
◮ Position 7 (b): 2-node [7] in a new 1-block.
◮ Positions 6 (a) and 5 (a): 3-nodes [6] and [5]; their 2-block [5:6] = aa is closed.
◮ Position 4 (b): 2-node [4]; the 1-block [4 : 7] = baab is closed.
◮ Positions 3 (c), 2 (a) and 1 (b): 2-nodes [3], [2] and [1]; the 1-block [1:3] = bac is closed.
◮ The root [1 : 11] = bacbaabada$ is closed; removing the sentinel $ (node [11]) leaves the root [1 : 10] = bacbaabada.
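The X row in the table is not defined on the slides. Its values are consistent with X[i] being the next position to the right of i that holds the same letter as position i, or ∞ if there is none. Under that inferred reading (an assumption, not a statement from the talk), the row can be computed in one right-to-left pass; the name `next_same_letter` is made up for this sketch.

```python
import math


def next_same_letter(w: str) -> list:
    """For each 1-based position p, the next position q > p with w[q] == w[p],
    or math.inf if the letter does not occur again."""
    last_seen = {}  # letter -> closest position to the right seen so far
    x = [math.inf] * len(w)
    for p in range(len(w), 0, -1):
        letter = w[p - 1]
        x[p - 1] = last_seen.get(letter, math.inf)
        last_seen[letter] = p
    return x


print(next_same_letter("bacbaabada$"))
```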
SLIDE 76
Short Recap
So far: a structure for one word representing the equivalence classes w.r.t. ∼k
Now: relate two words to each other by using their respective Simon-trees
MaxSimK
Given two words s and t over an alphabet Σ, with |s| = n and |t| = n′, with n ≥ n′, find the maximum k for which s ∼k t.
SLIDES 77-78
Connecting Two Simon-trees
◮ Transform the words s and t into Simon-trees as shown.
◮ Use the tree structure to connect equivalent nodes of the two words.
S-Connection
The k-node a of Ts and the k-node b of Tt are S-connected (i.e., the pair (a, b) is in the S-connection) if and only if s[i : n] ∼k t[j : n′] for all positions i in block a and positions j in block b.
SLIDE 79
From P-Connection to S-Connection
Start from a larger relation (P-Connection), which contains the S-Connection, and refine it.
◮ The 0-nodes of Ts and Tt are P-connected.
◮ For all levels k of Ts, if the explicit or implicit k-nodes a and b (from Ts and Tt, respectively) are P-connected, then the ith child of a is P-connected to the ith child of b, for all i.
◮ No other nodes are P-connected.
SLIDE 80
From P-Connection to S-Connection
[Figure: Simon-trees of the example words abacab$ and baacabba$]
SLIDE 81
From P-Connection to S-Connection
How to refine the P-Connection:
◮ Let k ≥ 1. Let a, b be k-blocks in the words s and t, respectively, with a ∼k b.
◮ Let a′ be a child of a and b′ a child of b.
◮ a′ ≁k+1 b′ if and only if there exists a letter x such that s[next(a′, x) + 1 : n] ≁k t[next(b′, x) + 1 : n′].
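The refinement criterion mirrors a general recursive characterization of Simon's congruence on plain words: u ∼k+1 v holds exactly when u and v contain the same letters and, for every such letter x, the parts after the first x are ∼k. The sketch below implements that characterization on strings rather than on tree nodes, so it illustrates the idea and not the linear-time procedure; `sim` is a hypothetical name, and reading `next(·, x)` as "first occurrence of x from the block onward" is an assumption.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def sim(u: str, v: str, k: int) -> bool:
    """Decide u ~k v by peeling off one letter per recursion level."""
    if k == 0:
        return True  # no non-empty subsequence has length <= 0
    if set(u) != set(v):
        return False  # some single letter already distinguishes u and v
    # Every subsequence starting with x continues in the part after the
    # first x, so those remainders must agree up to length k - 1.
    return all(
        sim(u[u.index(x) + 1:], v[v.index(x) + 1:], k - 1)
        for x in set(u)
    )


print(sim("abacab", "baacabba", 2))
print(sim("abacab", "baacabba", 3))
```

On the running example the failing branch at k = 3 is the letter b taken twice, which corresponds to the distinguishing word bbb.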
SLIDES 82-88
From P-Connection to S-Connection
[Animation: the P-Connection between the Simon-trees of abacab$ and baacabba$ is refined step by step; the individual frames are not recoverable from the extracted text.]
SLIDE 89
Additional Notes and Analysis
◮ Solution of MaxSimK: last level k where the k-blocks containing position 1 of the input words are equivalent.
◮ Distinguishing word can be obtained.
◮ By efficiently using union-find and split-find data structures, the algorithm achieves an optimal linear runtime.
Thank you!