Few Matches or Almost Periodicity: Faster Pattern Matching with Mismatches in Compressed Texts. Karl Bringmann, Marvin Künnemann, and Philip Wellnitz. Max Planck Institute for Informatics, Saarland Informatics Campus (SIC), Saarbrücken, Germany


  1. [Talk sections: Basic Definitions and General Overview; New Structural Insights; Faster Algorithm] Solution Structure of Pattern Matching. Fact (Folklore): Let text $t$ and pattern $p$ with $|t| \le \tfrac{3}{2}|p|$ be given such that there are at least two matches of $p$ in $t$ that together cover $t$ completely. Then both $p$ and $t$ are periodic with some period $x$, and every match of $p$ in $t$ starts at a position $1 + i \cdot |x|$.

  6. Solution Structure of Pattern Matching. [Figure for the folklore fact: both $t$ and $p$ drawn as concatenations of copies of the period $x$.]
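
The folklore fact can be checked by brute force on a small instance. The following is a minimal sketch, not part of the talk: the helper names `occurrences` and `smallest_period` are made up for illustration, and positions are 0-based, so a start position $1 + i \cdot |x|$ on the slide corresponds to a multiple of $|x|$ here.

```python
def occurrences(t: str, p: str) -> list[int]:
    """Return all 0-based start positions of exact matches of p in t."""
    return [s for s in range(len(t) - len(p) + 1) if t.startswith(p, s)]


def smallest_period(s: str) -> int:
    """Return the smallest q >= 1 such that s[i] == s[i + q] wherever both exist."""
    return next(q for q in range(1, len(s) + 1) if s[q:] == s[:-q])


x = "abc"
t = x * 5  # length 15
p = x * 4  # length 12, so |t| <= 3/2 * |p| holds
starts = occurrences(t, p)

# The two matches [0, 12) and [3, 15) together cover all of t.
print(starts, smallest_period(t), smallest_period(p))
```

In this instance both strings have smallest period 3 and every match starts at a multiple of 3, as the fact predicts.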

  7. What is the solution structure of Pattern Matching with Mismatches?

  8. Solution Structure of Pattern Matching with Mismatches. Question: if there are at least two $k$-matches of $p$ in $t$ (occurrences with at most $k$ mismatches), are $p$ and $t$ periodic, and does every $k$-match of $p$ start at a position $1 + i|x|$?

  10. Solution Structure of Pattern Matching with Mismatches. Counterexample to the question with "at least two $k$-matches": $t = \mathtt{A}^m \mathtt{B}^m$ and $p = \mathtt{A}^{m/2} \mathtt{B}^{m/2}$. Here $p$ and $t$ are not periodic, but there are about $2k$ $k$-matches of $p$ in $t$.
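
The counterexample can be verified with a naive scan. This is a sketch under stated assumptions: `hamming` and `k_matches` are illustrative helper names, positions are 0-based, and the quadratic-time scan is only meant for tiny inputs. For this instance the count comes out as $2k + 1$.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions where the equal-length strings a and b differ."""
    return sum(c != d for c, d in zip(a, b))


def k_matches(t: str, p: str, k: int) -> list[int]:
    """All 0-based starts s such that t[s : s + |p|] has at most k mismatches with p."""
    m = len(p)
    return [s for s in range(len(t) - m + 1) if hamming(t[s:s + m], p) <= k]


m, k = 20, 3
t = "A" * m + "B" * m                # t = A^m B^m
p = "A" * (m // 2) + "B" * (m // 2)  # p = A^{m/2} B^{m/2}

# Shifting p by d positions away from the centred alignment costs d mismatches.
starts = k_matches(t, p, k)
print(starts)
```

The matches form one block of consecutive positions around the centred alignment, even though neither string is periodic.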

  11. Solution Structure of Pattern Matching with Mismatches. Revised question: if there are at least $\Omega(\mathrm{poly}(k))$ $k$-matches of $p$ in $t$, are $p$ and $t$ periodic, and does every $k$-match of $p$ start at a position $1 + i|x|$? Insight 1: Periodicity can be expected only if the number of $k$-matches of $p$ in $t$ is $\Omega(\mathrm{poly}(k))$.

  12. Solution Structure of Pattern Matching with Mismatches. The same question for the example $t = \mathtt{A}^{2m}$ and $p = \mathtt{A}^{m}$.

  14. Solution Structure of Pattern Matching with Mismatches. Modify the example: place $\mathtt{B}$ at $k/2$ random positions each in $t = \mathtt{A}^{2m}$ and in $p = \mathtt{A}^{m}$. Then there are $O(m)$ $k$-matches of $p$ in $t$, but $p$ and $t$ are not perfectly periodic.
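
A small experiment illustrates why this example has so many $k$-matches: any alignment can see at most the $k/2$ modified positions of $p$ plus at most $k/2$ modified positions of $t$. The sketch below assumes a fixed random seed and illustrative helper names (`hamming`, `k_matches`, `sprinkle`); none of them come from the talk.

```python
import random


def hamming(a: str, b: str) -> int:
    """Number of positions where the equal-length strings a and b differ."""
    return sum(c != d for c, d in zip(a, b))


def k_matches(t: str, p: str, k: int) -> list[int]:
    """All 0-based starts s such that t[s : s + |p|] has at most k mismatches with p."""
    m = len(p)
    return [s for s in range(len(t) - m + 1) if hamming(t[s:s + m], p) <= k]


def sprinkle(base: str, count: int, rng: random.Random) -> str:
    """Overwrite `count` distinct random positions of `base` with 'B'."""
    chars = list(base)
    for i in rng.sample(range(len(chars)), count):
        chars[i] = "B"
    return "".join(chars)


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
m, k = 40, 6
t = sprinkle("A" * (2 * m), k // 2, rng)
p = sprinkle("A" * m, k // 2, rng)

# Every one of the m + 1 alignments has at most k/2 + k/2 = k mismatches.
starts = k_matches(t, p, k)
print(len(starts), "of", m + 1, "alignments are k-matches")
```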

  15. Solution Structure of Pattern Matching with Mismatches. Insight 2: Periodicity holds only up to $O(k)$ mismatches.

  16. Solution Structure of Pattern Matching with Mismatches. Refined question: if there are at least $\Omega(\mathrm{poly}(k))$ $k$-matches of $p$ in $t$, are $p$ and $t$ periodic up to $O(k)$ mismatches, and does every $k$-match of $p$ start at a position $1 + i|x|$?

  17. Main Result. Theorem (Structural Insight): For pattern $p$ and text $t$ with $|t| \le 2|p|$, let $t'$ be the shortest substring of $t$ such that any $k$-match of $p$ in $t$ is also a $k$-match in $t'$. Then at least one of the following holds: (a) the number of $k$-matches of $p$ in $t$ is at most $O(k^2)$, or (b) both $t'$ and $p$ have Hamming distance (HD) $O(k)$ to the same periodic string $x$, and all $k$-matches of $p$ in $t'$ start at a position $1 + i \cdot |x|$.

  19. Main Result, Proof Overview. Theorem (Structural Insight), with explicit constants: For pattern $p$ and text $t$ with $|t| \le 2|p|$, at least one of the following holds: (a) the number of $k$-matches of $p$ in $t$ is at most $1000k^2$, or (b) both $t'$ and $p$ have HD $< 20k$ to a periodic $x$, and all $k$-matches start at a position $1 + i \cdot |x|$.

  20. Main Result, Proof Overview. Consider $t'$: the shortest substring of $t$ that contains all $k$-matches.

  21. Main Result, Proof Overview. Split $p$ into $16k$ parts $p_1, p_2, \dots, p_{16k}$ of equal length.

  22. Main Result, Proof Overview. Fix a $p_i$.

  23. Main Result, Proof Overview. Consider a prefix $x_i$ of $p_i$ that is also a period of $p_i$.

  25. Main Result, Proof Overview. Find the first $3k$ mismatches between $p$ and $x_i^*$ before and after $p_i$. [Figure: $p$ aligned with repetitions of $x_i$; at most $3k$ mismatches are marked on each side of $p_i$.]
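
The mismatch search of this step can be sketched as a plain linear scan. This is only an illustration on an uncompressed pattern, not the tool used in the talk: the function name `mismatches_around`, its parameters, and the example strings are all assumptions. The periodic extension $x_i^*$ is anchored at the start of $p_i$, and the scan walks left from the start of $p_i$ and right from its end, stopping after `limit` mismatches on each side.

```python
from itertools import islice


def mismatches_around(p: str, start: int, end: int, x: str, limit: int):
    """Return (before, after): up to `limit` mismatch positions on each side.

    p[start:end] plays the role of the part p_i, and x is assumed to be a
    prefix and period of that part. Position i of p is compared with the
    periodic extension of x anchored at `start`.
    """
    q = len(x)

    def differs(i: int) -> bool:
        return p[i] != x[(i - start) % q]

    before = list(islice((i for i in range(start - 1, -1, -1) if differs(i)), limit))
    after = list(islice((i for i in range(end, len(p)) if differs(i)), limit))
    return before, after


chars = list("abc" * 6)
for pos, ch in ((1, "X"), (14, "Y"), (16, "Z")):
    chars[pos] = ch  # plant three mismatches
p = "".join(chars)

print(mismatches_around(p, 6, 12, "abc", limit=2))
```

With `limit` set to $3k$, the two returned lists correspond to the mismatches marked on the slide.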

  26. Main Result, Proof Overview. [Figure: $p$ has fewer than $6k$ mismatches to $x_i^*$ in total.]

  27. Main Result, Proof Overview. [Figure: $p$ has fewer than $6k$ mismatches to $x_i^*$, and $t'$ has fewer than $2 \cdot (6 + 1)k = 14k$ mismatches to $x_i^*$.]

  29. Main Result, Proof Overview. [Figure: the case of exactly $3k$ mismatches between $p$ and $x_i^*$ next to $p_i$.]

  30. Main Result, Proof Overview. Insight: Any $k$-match of $p$ in $t'$ must match at least one of the parts $p_i$ exactly.
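
This insight is a pigeonhole argument: $k$ mismatches can touch at most $k$ of the $16k$ parts. The sketch below checks it on a toy instance; the helper names and the example strings are assumptions made for illustration, and positions are 0-based.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions where the equal-length strings a and b differ."""
    return sum(c != d for c, d in zip(a, b))


def k_matches(t: str, p: str, k: int) -> list[int]:
    """All 0-based starts s such that t[s : s + |p|] has at most k mismatches with p."""
    m = len(p)
    return [s for s in range(len(t) - m + 1) if hamming(t[s:s + m], p) <= k]


def split_equal(p: str, parts: int) -> list[tuple[int, str]]:
    """Split p into `parts` pieces of equal length, returned as (offset, piece)."""
    size, rest = divmod(len(p), parts)
    if rest:
        raise ValueError("len(p) must be divisible by the number of parts")
    return [(j * size, p[j * size:(j + 1) * size]) for j in range(parts)]


def exactly_matched_parts(t: str, p: str, s: int, parts: int) -> list[int]:
    """Indices j of parts that match t exactly when p is aligned at start s."""
    return [j for j, (off, piece) in enumerate(split_equal(p, parts))
            if t.startswith(piece, s + off)]


k = 1
parts = 16 * k
p = "abcdefgh" * 4                               # length 32, so each part has length 2
t = "xx" + p[:10] + "Z" + p[11:] + "yy"          # one planted mismatch

for s in k_matches(t, p, k):
    exact = exactly_matched_parts(t, p, s, parts)
    assert len(exact) >= parts - k               # pigeonhole: at most k parts are hit
    print(s, exact)
```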

  32. Main Result, Proof Overview. Fix a $p_i$; count the $k$-matches in which $p_i$ is matched exactly.

  33. Main Result, Proof Overview. Consider the occurrences of $x_i$ in $t'$.

  34. Main Result, Proof Overview. Problem: there can be up to $O(m)$ exact matches of $x_i$ in $t'$.

  36. Main Result, Proof Overview. Consider power stretches of $x_i$ in $t'$ of length $\ge |p_i|$; there are at most $150k$ different power stretches.
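
The slides do not spell out what a power stretch is. The sketch below assumes it means a maximal run of back-to-back copies of $x_i$, and finds such runs by a naive scan over an uncompressed text. The function name `power_stretches` and the example are illustrative only, and the code further assumes that $x$ is primitive (not itself a repetition of a shorter string), since otherwise shifted runs would be reported separately.

```python
def power_stretches(t: str, x: str, min_len: int) -> list[tuple[int, int]]:
    """Return (start, length) of maximal runs x x ... x in t with length >= min_len."""
    q = len(x)
    occ = {s for s in range(len(t) - q + 1) if t.startswith(x, s)}
    runs = []
    for s in sorted(occ):
        if s - q in occ:
            continue  # s continues a run that started further left
        e = s
        while e in occ:
            e += q    # extend the run by one more copy of x
        if e - s >= min_len:
            runs.append((s, e - s))
    return runs


t = "ababab" + "c" + "abab" + "cc" + "ab"
print(power_stretches(t, "ab", min_len=4))
```

Grouping occurrences into stretches is what avoids looking at up to $O(m)$ individual occurrences of $x_i$ one by one.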

  38. Main Result, Proof Overview. Fix a power stretch $t_j$ of $x_i$ in $t'$. [Figure label at $t_j$: $\ge 2k$ mismatches.]

  39. Main Result, Proof Overview. Insight: A $k$-match must align at least one mismatch.

  40. Main Result, Proof Overview. Insight: There are at most $O(k^4)$ matches: $O(k)$ parts in $p$, $O(k)$ stretches, and $O(k^2)$ matches per combination.

  42. Faster Algorithm. Theorem (Algorithm): Pattern matching with $k$ mismatches on a text $t$ given by an SLP of size $n$ and a pattern $p$ of length $m$ can be solved in time $O(n k^3 (k \log k + \log m) + km)$.
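
For readers unfamiliar with SLPs (straight-line programs): an SLP is a grammar that generates exactly one string, and it can be exponentially smaller than that string. The sketch below uses an assumed toy encoding, where each rule is either a terminal string or a pair of indices of earlier rules; this encoding and the helper names are not taken from the talk.

```python
from typing import Union

Rule = Union[str, tuple[int, int]]  # terminal, or concatenation of two earlier rules


def expansion_lengths(rules: list[Rule]) -> list[int]:
    """Length of the string generated by each rule, computed without decompressing."""
    lengths: list[int] = []
    for rule in rules:
        if isinstance(rule, str):
            lengths.append(len(rule))
        else:
            left, right = rule
            lengths.append(lengths[left] + lengths[right])
    return lengths


def expand(rules: list[Rule]) -> str:
    """Fully decompress the last rule (only sensible for small examples)."""
    texts: list[str] = []
    for rule in rules:
        texts.append(rule if isinstance(rule, str) else texts[rule[0]] + texts[rule[1]])
    return texts[-1]


rules: list[Rule] = ["A", "B", (0, 1)]  # rule 2 generates "AB"
for _ in range(10):
    last = len(rules) - 1
    rules.append((last, last))          # each new rule doubles the previous one

print(len(rules), expansion_lengths(rules)[-1])
```

Here 13 rules describe a text of length 2048, which is why a running time stated in terms of the SLP size $n$ rather than the text length matters.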

  43. Faster Algorithm. Pattern-Compressed String [GS'13]: Let $p$ be a string of length $m$. We call a string $f = v_1 \dots v_q$ with $\sum_{i=1}^{q} |v_i| \le 2m$ a $p$-pattern-compressed string (pc-string) if every $v_i$ is a substring of $p$. We call the $v_i$'s the factors of $f$.
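
The definition can be restated as a small check. This is a sketch for illustration only: the function name `is_pc_string` and the example factors are assumptions, and a real representation would presumably store each factor as a pair of positions in $p$ rather than as an explicit string.

```python
def is_pc_string(factors: list[str], p: str) -> bool:
    """Check the pc-string conditions: total length <= 2|p| and every factor occurs in p."""
    total = sum(len(v) for v in factors)
    return total <= 2 * len(p) and all(v in p for v in factors)


p = "abracadabra"
factors = ["cad", "abra", "ra", "brac"]  # each one is a substring of p
f = "".join(factors)                       # the string that the factors represent

print(f, is_pc_string(factors, p))
print(is_pc_string(["abc"], p))            # "abc" does not occur in p
print(is_pc_string([p, p, p], p))          # total length 3|p| exceeds 2|p|
```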

  45. Faster Algorithm for Pattern-Compressed Strings. [Figure: reduction. An instance $I$, consisting of $k$, a pattern $p$ of length $m$, and an SLP $T = T_1, \dots, T_n$, is split into $n$ pc-string instances $J_1, \dots, J_n$, where $J_j$ consists of $k$, $p$, and a pc-string $f_j$ with $O(k)$ factors. An algorithm with running time $T(m, k)$ per pc-string instance gives an $O(n(\log m + T(m, k)) + km)$ algorithm for $I$; with the $O(k^3 (k \log k + \log m))$ algorithm for pc-strings this is $O(n k^3 (k \log k + \log m) + km)$.]
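
The two running times in the reduction are related by a direct substitution. As a worked step, assuming $k \ge 1$:

```latex
\begin{align*}
O\bigl(n(\log m + T(m,k)) + km\bigr)
  &= O\bigl(n \log m + n k^3 (k \log k + \log m) + km\bigr)
     && \text{with } T(m,k) = O\bigl(k^3 (k \log k + \log m)\bigr) \\
  &= O\bigl(n k^3 (k \log k + \log m) + km\bigr)
     && \text{since } \log m \le k^3 \log m \text{ for } k \ge 1 .
\end{align*}
```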

  48. Faster Algorithm for Pattern-Compressed Strings. Theorem (Algorithm for pc-strings): Pattern matching with $k$ mismatches on a pattern $p$ of length $m$ and a $p$-pc-string $f$ of size $O(k)$ representing at most $2m$ characters can be solved in time $O(k^3 (k \log k + \log m))$, with $O(km)$ preprocessing on $p$. Implementation of the structural insight: this needs, for example, tools for finding the first $O(k)$ mismatches to a periodic string, or for finding all power stretches of a given string in a pc-string.

  52. Open Problems

  53. Open Problems. Improve the insight to $O(k)$ $k$-matches in the aperiodic case. Theorem (Structural Insight′) [KW'19+]: For pattern $p$ and text $t$ with $|t| \le 2|p|$, let $t'$ be the shortest substring of $t$ such that any $k$-match of $p$ in $t$ is also a $k$-match in $t'$. Then at least one of the following holds: (a) the number of $k$-matches of $p$ in $t$ is at most $O(k)$, or (b) both $t'$ and $p$ have Hamming distance $O(k)$ to the same periodic string $x$, and all $k$-matches of $p$ in $t'$ start at a position $1 + i \cdot |x|$.

  54. Open Problems. Improve insight to O(k) mismatches in the aperiodic case. Improve the dependence on k in the algorithm. Theorem (Algorithm): Pattern matching with k mismatches on a text t given by an SLP of size n and a pattern p of length m can be solved in time O(n k³(k log k + log m) + km).

  55. Open Problems. Improve insight to O(k) mismatches in the aperiodic case. Improve the dependence on k in the algorithm. Fully-compressed setting (p also given as an SLP). Pattern matching with errors (edit distance instead of Hamming distance).

  56. Solution Structure of Pattern Matching with Mismatches. [Figure: a text t = A^(2m) and a pattern p = A^m, both consisting only of the letter A.]

  57. Solution Structure of Pattern Matching with Mismatches. [Figure: the text t has the letter B at positions 2m/3 and 4m/3 and the letter A everywhere else, i.e. t = A^(2m/3−1) B A^(2m/3−1) B A^(2m/3); the pattern p has B at its middle k + 1 positions, i.e. p = A^((m−k−1)/2) B^(k+1) A^((m−k−1)/2).]

  58. Solution Structure of Pattern Matching with Mismatches. Same example as on slide 57: all matches start in the union of two intervals.

  59. Solution Structure of Pattern Matching with Mismatches. Same example as on slide 57. Insight 3: an arithmetic progression only approximates the set of all matches.
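
The example with the two B's can be checked by brute force on a small instance. The sketch below builds t and p as described on the slide for concrete values of m and k (the choice m = 30, k = 3 and all function names are assumptions for illustration) and groups the starting positions of the k-matches into maximal runs of consecutive integers.

```python
def hamming(a, b):
    """Hamming distance of two equal-length strings."""
    assert len(a) == len(b)
    return sum(c != d for c, d in zip(a, b))


def k_matches(t, p, k):
    """0-indexed start positions of all k-matches of p in t (brute force)."""
    m = len(p)
    return [s for s in range(len(t) - m + 1) if hamming(t[s:s + m], p) <= k]


def build_example(m, k):
    """t of length 2m with B at (1-indexed) positions 2m/3 and 4m/3,
    p = A^((m-k-1)/2) B^(k+1) A^((m-k-1)/2)."""
    assert m % 3 == 0 and (m - k - 1) % 2 == 0
    t = ["A"] * (2 * m)
    for pos in (2 * m // 3, 4 * m // 3):
        t[pos - 1] = "B"  # convert the 1-indexed slide position to 0-indexed
    a = (m - k - 1) // 2
    p = "A" * a + "B" * (k + 1) + "A" * a
    return "".join(t), p


def maximal_runs(positions):
    """Group a sorted list of integers into maximal runs (first, last)."""
    runs = []
    for s in positions:
        if runs and s == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], s)
        else:
            runs.append((s, s))
    return runs


t, p = build_example(m=30, k=3)
starts = k_matches(t, p, k=3)
print(starts, maximal_runs(starts))
```

For this instance the matches are exactly the alignments in which the block of k + 1 B's in p covers one of the two B's in t, which gives two intervals of k + 1 consecutive starting positions each rather than a single arithmetic progression.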

  60. Main Result. Theorem (Structural Insight): Given strings p of length m and t of length at most 2m, at least one of the following holds: (1) The number of k-matches of p in t is at most O(k²). (2) Let t′ be the shortest substring of t such that any k-match of p in t is also a k-match in t′. There is a substring x of p, with |x| = O(m/k), such that δ_H(p, x*[1, m]) ≤ O(k) and δ_H(t′, x*[1, |t′|]) ≤ O(k). Moreover, any k-match of p in t′ starts at a position of the form 1 + i·|x| with 0 ≤ i ≤ (|t′| − |p|)/|x| (but not every starting position 1 + i·|x| necessarily yields a k-match).
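
The quantities in case (2) can be computed directly for small inputs. The following sketch, with hypothetical helper names and a made-up example text, trims t to t′ (from the first k-match to the end of the last one), measures the Hamming distances of p and t′ to prefixes of the periodic string x*, and checks whether every k-match starts at an offset in t′ that is a multiple of |x|. It takes x as an input and does not search for it.

```python
def hamming(a, b):
    """Hamming distance of two equal-length strings."""
    assert len(a) == len(b)
    return sum(c != d for c, d in zip(a, b))


def k_matches(t, p, k):
    """0-indexed start positions of all k-matches of p in t (brute force)."""
    m = len(p)
    return [s for s in range(len(t) - m + 1) if hamming(t[s:s + m], p) <= k]


def periodic_prefix(x, n):
    """The prefix x*[1, n] of the infinite repetition of x."""
    return (x * (n // len(x) + 1))[:n]


def check_periodic_case(t, p, x, k):
    """Report t', the two Hamming distances, and the alignment of matches."""
    starts = k_matches(t, p, k)
    if not starts:
        return None
    lo, hi = starts[0], starts[-1] + len(p)
    t_prime = t[lo:hi]  # shortest substring containing every k-match
    return {
        "t_prime": t_prime,
        "dist_p": hamming(p, periodic_prefix(x, len(p))),
        "dist_t_prime": hamming(t_prime, periodic_prefix(x, len(t_prime))),
        # offset 0 in t' corresponds to position 1 + 0 * |x| on the slide
        "aligned": all((s - lo) % len(x) == 0 for s in starts),
    }


report = check_periodic_case("QQPANPANPANQ", "PANPAN", "PAN", k=1)
print(report)
```

Trimming to t′ matters here: measured against the untrimmed text, the leading "QQ" would shift the alignment with x* and inflate the distance.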

  64. Main Result. Theorem (Structural Insight): For a pattern p and a text t with |t| ≤ 2|p|, at least one of the following holds: the number of k-matches of p in t is at most O(k²), or both t and p have Hamming distance (HD) O(k) to the same periodic string.

  65-68. Main Result. Theorem (Structural Insight) as on slide 64, illustrated with k = 2 over four build-up slides. [Figure: example texts t = PUNRANPAMPAN and t = PANCAKEPAN, searched for the patterns p = ANPAN and p = PANPAN; one panel is captioned "non-periodic case", the other "periodic case", and the 2-mismatch occurrences of p are overlaid on t.]
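
The two panels of the example can be replayed by brute force. The pairing of texts and patterns used below (PANPAN in PUNRANPAMPAN, ANPAN in PANCAKEPAN) is an assumption: the slide layout was lost in extraction, and this pairing is inferred from the number of pattern copies drawn on the last build-up slide. Function names are illustrative.

```python
def hamming(a, b):
    """Hamming distance of two equal-length strings."""
    assert len(a) == len(b)
    return sum(c != d for c, d in zip(a, b))


def k_matches_1_indexed(t, p, k):
    """1-indexed start positions of all k-matches of p in t (brute force)."""
    m = len(p)
    return [s + 1 for s in range(len(t) - m + 1) if hamming(t[s:s + m], p) <= k]


# Assumed pairing for the "periodic case" panel.
periodic = k_matches_1_indexed("PUNRANPAMPAN", "PANPAN", 2)
# Assumed pairing for the "non-periodic case" panel.
non_periodic = k_matches_1_indexed("PANCAKEPAN", "ANPAN", 2)

print(periodic)      # start positions of the form 1 + i * |PAN|
print(non_periodic)  # only a couple of isolated matches
```

Under this pairing, the first text differs from PANPANPANPAN in three positions (U, R, M), so both it and the pattern are close to the periodic string (PAN)*, while the second text is far from any short-period string and admits only two 2-matches.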

  69. Main Result, Proof Overview. Theorem (Structural Insight): For a pattern p and a text t with |t| ≤ 2|p|, at least one of the following holds: the number of k-matches of p in t is at most O(k²), or both t and p have Hamming distance (HD) O(k) to the same periodic string.

  70. Main Result, Proof Overview. Theorem (Structural Insight): Fix a pattern p of length m and a text t of length at most 2m. If the number of k-matches of p in t is at least 1000k², then both t and p have HD < 20k to the same periodic string.

  71. Main Result, Proof Overview. Theorem (Structural Insight): Fix a pattern p of length m and a text t of length at most 2m. If the number of k-matches of p in t is at least 1000k², then both t and p have HD < 20k to the same periodic string. Main steps: (1) There are at least 1000k² k-matches of p in t and p has HD < 6k to a specific periodic string x ∈ X(p) ⟹ t has HD < 20k to x. (2) p has HD ≥ 6k to every specific periodic string x ∈ X(p) ⟹ there are fewer than 1000k² k-matches of p in t.

  73. Main Result, Proof Overview. Lemma (Step 1): Fix a pattern p of length m and a text t of length at most 2m. If the number of k-matches of p in t is at least 1000k², and p has HD < 6k to a periodic string x ∈ X(p), then t has HD < 20k to x.

  75-76. Main Result, Proof Overview. Lemma (Step 1) as on slide 73. Construction of the periodic strings: split p into 16k parts p_1, ..., p_16k of equal length.

  77. Main Result, Proof Overview. Lemma (Step 1) as on slide 73. Fix a part p_i.

  78. Main Result, Proof Overview. Lemma (Step 1) as on slide 73. Consider a prefix x_i of p_i that is also a period of p_i.

  79-80. Main Result, Proof Overview. Lemma (Step 1) as on slide 73. Find the first 3k mismatches between p and x_i* before p_i and the first 3k mismatches after p_i (at most 3k mismatches on each side). [Figure: p drawn below the periodic string x_i* = x_i x_i x_i ..., aligned at p_i.]
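
The construction on these slides can be sketched in a few lines. This is only an illustration under assumptions, not the authors' implementation: it takes x_i to be the shortest period of the part p_i (the slides only say "a prefix that is also a period"), computes it with the standard KMP prefix function, and scans p naively instead of using the faster tools the talk refers to. All names are hypothetical.

```python
def split_bounds(n, parts):
    """Half-open index ranges that split 0..n into `parts` nearly equal pieces."""
    return [(i * n // parts, (i + 1) * n // parts) for i in range(parts)]


def shortest_period(s):
    """Length of the shortest period of s, via the KMP prefix function."""
    if not s:
        return 0
    fail = [0] * len(s)
    j = 0
    for i in range(1, len(s)):
        while j > 0 and s[i] != s[j]:
            j = fail[j - 1]
        if s[i] == s[j]:
            j += 1
        fail[i] = j
    return len(s) - fail[-1]


def mismatches_around(p, lo, hi, limit):
    """Extend the shortest period x_i of the part p[lo:hi] periodically over
    all of p, aligned so that it agrees with p on the part, and collect up to
    `limit` mismatch positions to the left and to the right of the part."""
    per = shortest_period(p[lo:hi])
    x = p[lo:lo + per]
    left, right = [], []
    for j in range(lo - 1, -1, -1):  # scan leftwards from the part
        if len(left) == limit:
            break
        if p[j] != x[(j - lo) % per]:  # Python's % is non-negative here
            left.append(j)
    for j in range(hi, len(p)):  # scan rightwards from the part
        if len(right) == limit:
            break
        if p[j] != x[(j - lo) % per]:
            right.append(j)
    return x, left, right


p = "abXabcabcabcaYc"
print(mismatches_around(p, 6, 12, limit=3))
```

In the setting of the lemma, `limit` would be 3k and the parts would come from `split_bounds(len(p), 16 * k)`; if fewer than 3k mismatches are found on each side, p has fewer than 6k mismatches to x_i* in total.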

  81. Main Result, Proof Overview. Lemma (Step 1): Fix a pattern p of length m and a text t of length at most 2m. If the number of k-matches of p in t is at least 1000k², and p has HD < 6k to some x_i*, 1 ≤ i ≤ 16k, then t has HD < 20k to x_i*. [Figure: p drawn below the periodic string x_i* = x_i x_i x_i ..., aligned at p_i, with fewer than 6k mismatches in total.]
