Suffix tree


SLIDE 1
  • Build a tree from the text
  • Used if the text is expected to be the same during several pattern queries
  • Tree building is O(m), where m is the size of the text. This is preprocessing.
  • Given any pattern of length n, we can answer whether it occurs in the text in O(n) time
  • Suffix tree = “modified” keyword tree of all suffixes of the text

Suffix tree

SLIDE 2

Construct a suffix tree

Text: ATCATG

Suffixes: ATCATG, TCATG, CATG, ATG, TG, G

[Figure: keyword tree and suffix tree of these suffixes]
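
As a rough illustration of the first step, the sketch below lists the suffixes of the example text and inserts them into an uncompressed keyword tree, then answers a pattern query by walking from the root. The nested-dict representation and the names `suffixes`, `build_keyword_tree` and `occurs` are assumptions made for this example, not something the slides prescribe.

```python
def suffixes(text):
    """Return all suffixes of text, longest first."""
    return [text[i:] for i in range(len(text))]


def build_keyword_tree(words):
    """Build a keyword tree as nested dicts: one edge per character."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
    return root


def occurs(tree, pattern):
    """Walk the pattern down from the root; it occurs iff the walk never fails."""
    node = tree
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


text = "ATCATG"
tree = build_keyword_tree(suffixes(text))
print(suffixes(text))        # ['ATCATG', 'TCATG', 'CATG', 'ATG', 'TG', 'G']
print(occurs(tree, "CAT"))   # True
print(occurs(tree, "TGA"))   # False
```

The query takes one step per pattern character, which is where the O(n) query time comes from. Building the tree this way is not linear in the text size; it only shows the structure.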

SLIDE 3

Similar to keyword trees, except that edges forming non-branching paths are collapsed

  • Each edge is labeled with a substring of the text, to use less space
  • All internal nodes have at least two outgoing edges
  • Leaves are labeled by the location of the suffix in the text

Suffix tree = Collapsed Keyword Tree on Suffixes

Text: ATCATG
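
The collapsed structure can be sketched as follows. This is a naive construction that inserts one suffix at a time and splits an edge where a new suffix diverges; it takes O(m^2) time and is not the O(m) algorithm the slides refer to. The `Node` class, the `$` terminator, the 0-based leaf positions and the use of plain strings as edge labels are all assumptions of this example (storing index pairs into the text instead of strings would be the space-saving choice).

```python
class Node:
    def __init__(self, position=None):
        self.children = {}        # first char of edge label -> (label, child)
        self.position = position  # start of the suffix in the text (leaves only)


def build_suffix_tree(text, terminator="$"):
    """Naive build; assumes the terminator does not occur in text."""
    text += terminator
    root = Node()
    for start in range(len(text)):
        node, rest = root, text[start:]
        while True:
            first = rest[0]
            if first not in node.children:
                # No edge starts with this character: hang a new leaf here.
                node.children[first] = (rest, Node(position=start))
                break
            label, child = node.children[first]
            k = 0  # length of the common prefix of edge label and suffix
            while k < len(label) and k < len(rest) and label[k] == rest[k]:
                k += 1
            if k == len(label):
                # Whole edge matched: continue from the child.
                node, rest = child, rest[k:]
                continue
            # Mismatch inside the edge: split it with a new internal node.
            middle = Node()
            middle.children[label[k]] = (label[k:], child)
            middle.children[rest[k]] = (rest[k:], Node(position=start))
            node.children[first] = (label[:k], middle)
            break
    return root


def find(root, pattern):
    """Return the sorted start positions of pattern in the text."""
    node, rest = root, pattern
    while rest:
        entry = node.children.get(rest[0])
        if entry is None:
            return []
        label, child = entry
        k = min(len(label), len(rest))
        if label[:k] != rest[:k]:
            return []
        node, rest = child, rest[k:]
    # Every leaf below the reached node is one occurrence.
    positions, stack = [], [node]
    while stack:
        current = stack.pop()
        if current.position is not None:
            positions.append(current.position)
        stack.extend(child for _, child in current.children.values())
    return sorted(positions)


root = build_suffix_tree("ATCATG")
print(find(root, "AT"))  # [0, 3]
print(find(root, "GA"))  # []
```

The terminator guarantees that no suffix is a prefix of another, so every suffix ends at its own leaf.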

SLIDE 4

All suffixes of text T

SLIDE 5

Example: suffix keyword tree

SLIDE 6

Example: suffix keyword tree

SLIDE 7

Example: suffix keyword tree

SLIDE 8

Example: suffix keyword tree

SLIDE 9

Example: suffix keyword tree

SLIDE 10

Example: suffix keyword tree

SLIDE 11

Example: suffix keyword tree

SLIDE 12

Example: suffix keyword tree

SLIDE 13

Example: suffix keyword tree

SLIDE 14

How many nodes does a suffix keyword tree have?

SLIDE 15

How many nodes does a suffix keyword tree have?

SLIDE 16

How many nodes does a suffix keyword tree have?

SLIDE 17

How many nodes does a suffix keyword tree have?
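
One way to explore this question is to count the nodes directly. In an uncompressed suffix keyword tree every distinct substring of the text ends at its own node, so the node count is one (the root) plus the number of distinct substrings, which is at most m(m+1)/2 + 1 for a text of length m. The function name `count_keyword_tree_nodes` is made up for this sketch.

```python
def count_keyword_tree_nodes(text):
    """Count the nodes of the uncompressed keyword tree of all suffixes of text."""
    root = {}
    count = 1  # the root
    for i in range(len(text)):
        node = root
        for ch in text[i:]:
            if ch not in node:
                node[ch] = {}
                count += 1  # a new node for a substring not seen before
            node = node[ch]
    return count


print(count_keyword_tree_nodes("ATCATG"))  # 19
print(count_keyword_tree_nodes("ABCD"))    # 11, the quadratic worst case
print(count_keyword_tree_nodes("AAAA"))    # 5, the linear best case
```

The worst case occurs when all characters differ, which is why the uncompressed tree can need space quadratic in the text length.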

SLIDE 18

Trees built using the first 500 prefixes of the lambda phage virus genome

Actual growth: an example

SLIDE 19

How to compress these trees?

SLIDE 20

Similar to keyword trees, except that edges forming non-branching paths are collapsed

  • Each edge is labeled with a substring of the text, to use less space
  • All internal nodes have at least two outgoing edges
  • Leaves are labeled by the location of the suffix in the text

Suffix tree = Collapsed Keyword Tree on Suffixes

Text: ATCATG

SLIDE 21

Compression

SLIDE 22

How many nodes does a suffix tree have?
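
A small sketch for checking the answer empirically: after collapsing, only the root, the leaves and the branching nodes of the keyword tree remain, so it is enough to count those. For a non-empty text of length m with a terminator appended (an assumption of this example, as is the name `count_suffix_tree_nodes`), there are m + 1 leaves and every internal node has at least two children, which gives at most 2(m + 1) - 1 nodes in total.

```python
def count_suffix_tree_nodes(text, terminator="$"):
    """Count the nodes left after collapsing the suffix keyword tree of text."""
    text += terminator  # assumed not to occur in text
    root = {}
    for i in range(len(text)):
        node = root
        for ch in text[i:]:
            node = node.setdefault(ch, {})
    # Nodes with exactly one child disappear when paths are collapsed;
    # the root, the leaves and the branching nodes survive.
    count, stack = 1, list(root.values())
    while stack:
        node = stack.pop()
        if len(node) != 1:
            count += 1
        stack.extend(node.values())
    return count


for text in ["ATCATG", "AAAA", "ABCD"]:
    m = len(text)
    print(text, count_suffix_tree_nodes(text), "bound:", 2 * (m + 1) - 1)
```

Unlike the uncompressed keyword tree, the node count here grows only linearly with the text length.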

SLIDE 23

Compression

SLIDE 24

Compression

SLIDE 25

Space complexity

SLIDE 26

Add starting location/offset at each leaf node

SLIDE 27

Retrieve substrings

SLIDE 28

Trees built using the first 500 prefixes of the lambda phage virus genome

[Figure legend: suffix tree, keyword tree]

Actual growth: comparison

SLIDE 29
  • Keyword and suffix trees are used to find patterns in a text
  • Keyword trees:
      • Build a keyword tree of the patterns, and thread the text through it
      • Usage: checking a set of patterns within various texts
  • Suffix trees:
      • Build a suffix tree of the text, and thread patterns through it
      • Usage: checking various patterns in the same text

Summary
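
To contrast with the suffix tree usage, here is a minimal sketch of the keyword tree direction: the tree is built from the patterns and the text is threaded through it. This version simply restarts from the root at every text position, so it is a naive illustration rather than an efficient threading procedure. The `END` marker key and the function names are assumptions of this example.

```python
END = "$end"  # hypothetical marker key; never collides with a single character


def build_keyword_tree(patterns):
    """Keyword tree of the patterns; END marks a node where a pattern ends."""
    root = {}
    for pattern in patterns:
        node = root
        for ch in pattern:
            node = node.setdefault(ch, {})
        node[END] = pattern
    return root


def thread_text(tree, text):
    """Return (start, pattern) for every pattern occurrence in text."""
    hits = []
    for start in range(len(text)):
        node = tree
        for ch in text[start:]:
            if ch not in node:
                break
            node = node[ch]
            if END in node:
                hits.append((start, node[END]))
    return hits


tree = build_keyword_tree(["ATG", "TCA", "CAT", "GG"])
print(thread_text(tree, "ATCATG"))  # [(1, 'TCA'), (2, 'CAT'), (3, 'ATG')]
```

The same tree can be reused for any number of texts, which mirrors the usage note above: a fixed set of patterns checked against various texts.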