suffix tree



  1. Suffix tree
     • Build a tree from the text.
     • Used when the text is expected to stay the same across several pattern queries.
     • Tree building is O(m), where m is the size of the text. This is preprocessing.
     • Given any pattern of length n, we can answer whether it occurs in the text in O(n) time.
     • Suffix tree = “modified” keyword tree of all suffixes of the text.
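To make the O(n) query concrete, here is a minimal sketch that builds the plain keyword tree of all suffixes and threads a pattern through it. The function names are illustrative, not from the slides, and this naive construction takes O(m^2) time and space; the O(m) preprocessing mentioned on the slide needs a dedicated linear-time construction that is not shown here.

```python
def build_suffix_keyword_tree(text):
    """Naive keyword tree (trie) of all suffixes of text, as nested dicts."""
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            # Reuse the child for ch if it exists, otherwise create it.
            node = node.setdefault(ch, {})
    return root


def occurs(tree, pattern):
    """Follow pattern from the root: one step per character, so O(n)."""
    node = tree
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


tree = build_suffix_keyword_tree("ATCATG")
print(occurs(tree, "CAT"))  # True
print(occurs(tree, "TCG"))  # False
```

A pattern occurs in the text exactly when it is a prefix of some suffix, which is why a single walk from the root answers the query.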

  2. Construct a suffix tree
     Text: ATCATG
     Suffixes: ATCATG, TCATG, CATG, ATG, TG, G
     [Figure: keyword tree and suffix tree built from these suffixes]
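As a small illustration, the suffixes of the example text can be enumerated together with their 1-based starting locations. The helper name `suffixes_with_positions` is an assumption made for this sketch.

```python
def suffixes_with_positions(text):
    """Return (1-based start position, suffix) pairs for every suffix of text."""
    return [(start + 1, text[start:]) for start in range(len(text))]


for position, suffix in suffixes_with_positions("ATCATG"):
    print(position, suffix)
```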

  3. Suffix tree = Collapsed Keyword Tree on Suffixes
     Similar to keyword trees, except that edges forming non-branching paths are collapsed.
     • Each edge is labeled with a substring of the text, which takes less space.
     • All internal nodes have at least two outgoing edges.
     • Leaves are labeled by the location of the suffix in the text.
     Text: ATCATG
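The collapsing step can be sketched as follows: build the keyword tree of all suffixes, then merge every chain of single-child nodes into one edge whose label is the concatenated substring. The `Node` class and function names are hypothetical, and the approach is quadratic; it is meant to show the resulting shape, not an efficient construction.

```python
class Node:
    def __init__(self):
        self.children = {}    # edge label -> child Node
        self.position = None  # 1-based start of the suffix that ends here


def build_keyword_tree(text):
    """Keyword tree of all suffixes, one character per edge."""
    root = Node()
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.children.setdefault(ch, Node())
        node.position = start + 1
    return root


def collapse(node):
    """Return a copy of the tree with non-branching paths merged into single edges."""
    collapsed = Node()
    collapsed.position = node.position
    for label, child in node.children.items():
        # Extend the edge while the child has one way out and ends no suffix.
        while len(child.children) == 1 and child.position is None:
            next_label, child = next(iter(child.children.items()))
            label += next_label
        collapsed.children[label] = collapse(child)
    return collapsed


def leaves(node, path=""):
    """Yield (path label, suffix position) for every suffix end below node."""
    if node.position is not None:
        yield path, node.position
    for label, child in node.children.items():
        yield from leaves(child, path + label)


suffix_tree = collapse(build_keyword_tree("ATCATG"))
print(sorted(suffix_tree.children))  # ['AT', 'CATG', 'G', 'T']
print(sorted(leaves(suffix_tree), key=lambda item: item[1]))
```

For ATCATG the root keeps four edges (AT, T, CATG, G), and the nodes below AT and T each branch into CATG and G.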

  4. All suffixes of text T

  5. Example: suffix keyword tree

  6. Example: suffix keyword tree

  7. Example: suffix keyword tree

  8. Example: suffix keyword tree

  9. Example: suffix keyword tree

  10. Example: suffix keyword tree

  11. Example: suffix keyword tree

  12. Example: suffix keyword tree

  13. Example: suffix keyword tree

  14. How many nodes does a suffix keyword tree have?

  15. How many nodes does a suffix keyword tree have?

  16. How many nodes does a suffix keyword tree have?

  17. How many nodes does a suffix keyword tree have?
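One way to explore this question is to count nodes directly. In a suffix keyword tree every distinct non-empty substring of the text corresponds to exactly one node, so the count is the number of distinct substrings plus one for the root, which is at most m(m+1)/2 + 1. The function name below is illustrative, not from the slides.

```python
def keyword_tree_node_count(text):
    """Count nodes (including the root) of the keyword tree of all suffixes."""
    root = {}
    count = 1  # the root
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            if ch not in node:
                node[ch] = {}
                count += 1  # a new node for a substring not seen before
            node = node[ch]
    return count


print(keyword_tree_node_count("ATCATG"))  # 19
print(keyword_tree_node_count("ABCDEF"))  # 22, the worst case for m = 6
```

The worst case is reached when all characters are distinct, so the uncompressed tree can grow quadratically in the text length.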

  18. Actual growth: an example. Trees built using the first 500 prefixes of the lambda phage virus genome.

  19. How to compress these trees?

  20. Suffix tree = Collapsed Keyword Tree on Suffixes
     Similar to keyword trees, except that edges forming non-branching paths are collapsed.
     • Each edge is labeled with a substring of the text, which takes less space.
     • All internal nodes have at least two outgoing edges.
     • Leaves are labeled by the location of the suffix in the text.
     Text: ATCATG

  21. Compression

  22. How many nodes does a suffix tree have?
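A rough way to check the answer: assuming no suffix is a prefix of another (true for ATCATG; otherwise a terminator such as "$" can be appended, which is an assumption not taken from the slides), the collapsed tree has m leaves, and because every internal node branches, at most m - 1 internal nodes, so at most 2m - 1 nodes in total. The sketch below counts nodes without building the tree: an internal node below the root exists for each substring that is followed by at least two different characters. The function name is hypothetical, and the counting itself is quadratic.

```python
from collections import defaultdict


def suffix_tree_node_count(text):
    """Count nodes of the collapsed suffix tree of text.

    Assumes no suffix of text is a prefix of another suffix.
    """
    m = len(text)
    followers = defaultdict(set)  # substring -> characters that follow it
    for i in range(m):
        for j in range(i, m):
            followers[text[i:j]].add(text[j])
    # The root is always counted; other internal nodes must branch.
    internal = 1 + sum(
        1 for sub, chars in followers.items() if sub and len(chars) >= 2
    )
    return m + internal  # m leaves, one per suffix


print(suffix_tree_node_count("ATCATG"), "<=", 2 * 6 - 1)  # 9 <= 11
```

Compared with the quadratic growth of the uncompressed keyword tree, the node count here is linear in m.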

  23. Compression

  24. Compression

  25. Space complexity

  26. Add starting location/offset at each leaf node

  27. Retrieve substrings
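The slide bodies are not part of this transcript, so the following is only one common reading of these steps: store each edge as a pair of indices into the text instead of a copied substring, keep the starting location at each leaf, and retrieve labels by slicing the text. The tree for ATCATG is written out by hand, and the half-open `(start, end)` convention and all names are assumptions for illustration.

```python
TEXT = "ATCATG"

# Each edge is a (start, end) half-open index pair into TEXT.
# An int value is a leaf holding the 1-based start of its suffix.
TREE = {
    (0, 2): {        # "AT"
        (2, 6): 1,   # "CATG" -> suffix 1
        (5, 6): 4,   # "G"    -> suffix 4
    },
    (1, 2): {        # "T"
        (2, 6): 2,   # "CATG" -> suffix 2
        (5, 6): 5,   # "G"    -> suffix 5
    },
    (2, 6): 3,       # "CATG" -> suffix 3
    (5, 6): 6,       # "G"    -> suffix 6
}


def edge_label(text, edge):
    """Retrieve the substring an edge stands for."""
    start, end = edge
    return text[start:end]


def spelled_suffixes(text, node, prefix=""):
    """Yield (leaf position, string spelled from the root to that leaf)."""
    for edge, child in node.items():
        path = prefix + edge_label(text, edge)
        if isinstance(child, int):
            yield child, path
        else:
            yield from spelled_suffixes(text, child, path)


for position, path in sorted(spelled_suffixes(TEXT, TREE)):
    print(position, path, path == TEXT[position - 1:])
```

With index pairs each edge takes constant space, so the whole tree stays linear in the text length even though the edge labels together can be much longer than the text.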

  28. Actual growth: comparison. Trees built using the first 500 prefixes of the lambda phage virus genome. [Figure legend: suffix tree, keyword tree]

  29. Summary
     • Keyword and suffix trees are used to find patterns in a text.
     • Keyword trees:
       • Build a keyword tree of the patterns, and thread the text through it.
       • Usage: checking a set of patterns within various texts.
     • Suffix trees:
       • Build a suffix tree of the text, and thread patterns through it.
       • Usage: checking various patterns in the same text.
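For contrast with the suffix-tree direction, here is a minimal sketch of the keyword-tree direction: the tree is built from the patterns, and the text is threaded through it from every starting position. This is the naive threading, costing up to the longest pattern length per start position, not an optimized automaton, and the names are illustrative.

```python
def build_keyword_tree(patterns):
    """Trie of the patterns; the key None marks the end of a pattern."""
    root = {}
    for pattern in patterns:
        node = root
        for ch in pattern:
            node = node.setdefault(ch, {})
        node[None] = pattern
    return root


def thread(text, tree):
    """Return (1-based position, pattern) for every pattern occurrence in text."""
    hits = []
    for start in range(len(text)):
        node = tree
        for ch in text[start:]:
            if ch not in node:
                break
            node = node[ch]
            if None in node:
                hits.append((start + 1, node[None]))
    return hits


patterns_tree = build_keyword_tree(["ATG", "TCA", "CAT"])
print(thread("ATCATG", patterns_tree))  # [(2, 'TCA'), (3, 'CAT'), (4, 'ATG')]
```

The same `patterns_tree` can be reused for any number of texts, which mirrors the usage note for keyword trees in the summary.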
