SLIDE 1
- Build a tree from the text
- Used when the text is expected to stay the same
across several pattern queries
- Tree building is O(m), where m is the size of the
text. This is preprocessing.
- Given any pattern of length n, we can answer
whether it occurs in the text in O(n) time
- Suffix tree = “modified” keyword tree of all
suffixes of text
Suffix tree
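The query bound above can be illustrated with a minimal sketch. It assumes a plain, uncollapsed keyword tree of all suffixes stored as nested dicts, and it builds that tree naively in O(m^2) time, not with a linear-time construction such as Ukkonen's algorithm. The names build_suffix_trie and occurs are hypothetical.

```python
def build_suffix_trie(text):
    """Naive build: insert every suffix character by character (O(m^2) time)."""
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.setdefault(ch, {})
    return root


def occurs(trie, pattern):
    """Follow one edge per pattern character: O(n) for a pattern of length n."""
    node = trie
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


trie = build_suffix_trie("ATCATG")
print(occurs(trie, "CAT"))  # True
print(occurs(trie, "TGA"))  # False
```

The query never looks at the text again: its cost depends only on the pattern length, which is why the preprocessing pays off when many patterns are checked against the same text.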
SLIDE 2
Text: ATCATG
Suffixes: ATCATG, TCATG, CATG, ATG, TG, G
[Figure: keyword tree and suffix tree of the suffixes]
Construct a suffix tree
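As a small sketch of the first step, the suffixes of the text can be enumerated together with their starting locations. The 1-based positions and the helper name suffixes are assumptions made for illustration.

```python
def suffixes(text):
    """Return (1-based start position, suffix) pairs, longest suffix first."""
    return [(i + 1, text[i:]) for i in range(len(text))]


for position, suffix in suffixes("ATCATG"):
    print(position, suffix)
# 1 ATCATG
# 2 TCATG
# 3 CATG
# 4 ATG
# 5 TG
# 6 G
```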
SLIDE 3 Similar to keyword trees, except edges that form paths are collapsed
- Each edge is labeled with a
substring of the text, which saves space
- All internal nodes have at least
two outgoing edges
- Leaves are labeled by the location
of the suffix in the text.
Suffix tree = Collapsed Keyword Tree on Suffixes
Text: ATCATG
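The collapsing step can be sketched as follows: build the keyword tree of all suffixes, then merge every chain of single-child nodes into one edge labeled with a substring. This assumes the last character of the text occurs only once (true for ATCATG), so every suffix ends at a leaf; otherwise a terminator symbol would be needed. The function names are hypothetical, and leaf locations are left out to keep the sketch short.

```python
def build_suffix_trie(text):
    """Keyword tree of all suffixes, stored as nested dicts."""
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.setdefault(ch, {})
    return root


def collapse(node):
    """Merge each non-branching path into a single edge labeled by a substring."""
    collapsed = {}
    for ch, child in node.items():
        label = ch
        # Keep extending the label while the path does not branch.
        while len(child) == 1:
            (next_ch, next_child), = child.items()
            label += next_ch
            child = next_child
        collapsed[label] = collapse(child)
    return collapsed


tree = collapse(build_suffix_trie("ATCATG"))
print(tree)
# {'AT': {'CATG': {}, 'G': {}}, 'T': {'CATG': {}, 'G': {}}, 'CATG': {}, 'G': {}}
```

Apart from the root, only the nodes reached by AT and T remain as internal nodes, and each has two outgoing edges.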
SLIDE 4
All suffixes of text T
SLIDE 5
Example: suffix keyword tree
SLIDE 6
Example: suffix keyword tree
SLIDE 7
Example: suffix keyword tree
SLIDE 8
Example: suffix keyword tree
SLIDE 9
Example: suffix keyword tree
SLIDE 10
Example: suffix keyword tree
SLIDE 11
Example: suffix keyword tree
SLIDE 12
Example: suffix keyword tree
SLIDE 13
Example: suffix keyword tree
SLIDE 14
How many nodes does a suffix keyword tree have?
SLIDE 15
How many nodes does a suffix keyword tree have?
SLIDE 16
How many nodes does a suffix keyword tree have?
SLIDE 17
How many nodes does a suffix keyword tree have?
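One way to explore this question is to count nodes directly. In the uncollapsed tree, every distinct non-empty substring of the text corresponds to exactly one node, so a text of length m has at most m(m+1)/2 + 1 nodes including the root. A minimal sketch, with the hypothetical name count_trie_nodes:

```python
def count_trie_nodes(text):
    """Count nodes (root included) of the keyword tree of all suffixes."""
    root = {}
    count = 1  # the root
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            if ch not in node:
                node[ch] = {}
                count += 1
            node = node[ch]
    return count


print(count_trie_nodes("ATCATG"))  # 19
print(count_trie_nodes("ABCDEF"))  # 22, the maximum for m = 6
print(count_trie_nodes("AAAAAA"))  # 7, the minimum for m = 6
```

The quadratic worst case is reached when all characters of the text differ.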
SLIDE 18 Trees built using the first 500 prefixes of the lambda phage virus genome
Actual growth: an example
SLIDE 19
How to compress these trees?
SLIDE 20 Similar to keyword trees, except edges that form paths are collapsed
- Each edge is labeled with a
substring of the text, which saves space
- All internal nodes have at least
two outgoing edges
- Leaves are labeled by the location
of the suffix in the text.
Suffix tree = Collapsed Keyword Tree on Suffixes
Text: ATCATG
SLIDE 21
Compression
SLIDE 22
How many nodes does a suffix tree have?
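A quick way to check the answer: collapsing removes exactly the nodes with a single child, so the suffix tree keeps the root, the branching nodes, and the leaves. With m leaves this gives at most 2m nodes. The sketch below assumes the last character of the text is unique, so that there is one leaf per suffix; suffix_tree_node_count is a hypothetical name.

```python
def suffix_tree_node_count(text):
    """Count the nodes that survive collapsing the keyword tree of all suffixes."""
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.setdefault(ch, {})

    count = 1  # the root is always kept
    stack = list(root.values())
    while stack:
        node = stack.pop()
        # Leaves (0 children) and branching nodes (2 or more children) are kept;
        # nodes with exactly one child are absorbed into an edge label.
        if len(node) != 1:
            count += 1
        stack.extend(node.values())
    return count


print(suffix_tree_node_count("ATCATG"))  # 9
print(suffix_tree_node_count("ABCDEF"))  # 7
```

For ATCATG the uncollapsed tree has 19 nodes, while the collapsed tree has 9: the root, 2 branching nodes, and 6 leaves.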
SLIDE 23
Compression
SLIDE 24
Compression
SLIDE 25
Space complexity
SLIDE 26
Add starting location/offset at each leaf node
SLIDE 27
Retrieve substrings
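The last three slides can be tied together in one sketch. Each edge stores only a (start, end) pair of indices into the text instead of a copy of its label, so an edge takes constant space. Each leaf stores the starting offset of its suffix, and labels are recovered by slicing the text. This is a naive O(m^2) construction under stated assumptions: 0-based offsets, a unique last character, and hypothetical names Node, build_suffix_tree and find_occurrences.

```python
class Node:
    def __init__(self, start=0, end=0, leaf_pos=None):
        self.start = start        # edge label into this node is text[start:end]
        self.end = end
        self.children = {}        # first character of edge label -> child Node
        self.leaf_pos = leaf_pos  # starting offset of the suffix (leaves only)


def build_suffix_tree(text):
    """Insert suffixes one by one, splitting an edge at the first mismatch."""
    if text and text.count(text[-1]) != 1:
        raise ValueError("sketch assumes the last character is unique")
    root, m = Node(), len(text)
    for i in range(m):
        node, j = root, i
        while j < m:
            ch = text[j]
            if ch not in node.children:
                node.children[ch] = Node(j, m, leaf_pos=i)
                break
            child = node.children[ch]
            k = child.start
            while k < child.end and text[k] == text[j]:
                k += 1
                j += 1
            if k == child.end:   # whole edge matched, keep descending
                node = child
                continue
            mid = Node(child.start, k)   # split the edge at the mismatch
            node.children[ch] = mid
            child.start = k
            mid.children[text[k]] = child
            mid.children[text[j]] = Node(j, m, leaf_pos=i)
            break
    return root


def find_occurrences(root, text, pattern):
    """Thread the pattern through the tree, then collect leaf offsets below it."""
    node, j = root, 0
    while j < len(pattern):
        child = node.children.get(pattern[j])
        if child is None:
            return []
        label = text[child.start:child.end]   # retrieve the substring
        part = pattern[j:j + len(label)]
        if not label.startswith(part):
            return []
        j += len(part)
        node = child
    positions, stack = [], [node]
    while stack:
        current = stack.pop()
        if current.leaf_pos is not None:
            positions.append(current.leaf_pos)
        stack.extend(current.children.values())
    return sorted(positions)


text = "ATCATG"
root = build_suffix_tree(text)
print(find_occurrences(root, text, "AT"))   # [0, 3]
print(find_occurrences(root, text, "TG"))   # [4]
print(find_occurrences(root, text, "GA"))   # []
```

Because the tree has a linear number of nodes and each node stores a constant number of integers, the total space is O(m) plus the text itself.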
SLIDE 28 Trees built using the first 500 prefixes of the lambda phage virus genome
[Plot legend: suffix tree, keyword tree]
Actual growth: comparison
SLIDE 29
- Keyword and suffix trees are used to find patterns in a
text
- Keyword trees:
- Build keyword tree of patterns, and thread text
through it
- Usage: checking a set of patterns within various texts
- Suffix trees:
- Build suffix tree of text, and thread patterns through it
- Usage: checking various patterns in the same text
Summary
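For contrast with the suffix tree usage, here is a minimal sketch of the keyword tree usage: the tree is built from the patterns, and the text is threaded through it. The threading below restarts from the root at every text position, which is a simple approach and not an optimized one such as Aho-Corasick. The names build_keyword_tree and thread_text, and the "$end" marker key, are assumptions.

```python
def build_keyword_tree(patterns):
    """Keyword tree of the patterns; "$end" marks a node where a pattern ends."""
    root = {}
    for pattern in patterns:
        node = root
        for ch in pattern:
            node = node.setdefault(ch, {})
        node["$end"] = pattern
    return root


def thread_text(tree, text):
    """Report (0-based start, pattern) for every pattern occurrence in the text."""
    hits = []
    for start in range(len(text)):
        node = tree
        for ch in text[start:]:
            if ch not in node:
                break
            node = node[ch]
            if "$end" in node:
                hits.append((start, node["$end"]))
    return hits


tree = build_keyword_tree(["ATG", "CAT", "TG"])
print(thread_text(tree, "ATCATG"))
# [(2, 'CAT'), (3, 'ATG'), (4, 'TG')]
```

The same keyword tree can be reused for any number of texts, just as a suffix tree can be reused for any number of patterns.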