Trace reconstruction for deletion channels


Trace reconstruction for deletion channels
Yuval Peres, Microsoft Research
Based on joint work with Alex Zhai (Stanford University) and Fedor Nazarov (Kent State University)
December 24, 2017


  1. Bit statistics: the first bit
  For simplicity, take q = 1/2 (the general case is similar). Natural first attempt: suppose y ∼ D_q(x) and y' ∼ D_q(x'). Does the first bit of y look different from the first bit of y'?
  $E\,y_0 = \tfrac12 x_0 + \tfrac14 x_1 + \tfrac18 x_2 + \cdots$
  $E\,y'_0 = \tfrac12 x'_0 + \tfrac14 x'_1 + \tfrac18 x'_2 + \cdots$
  If x and x' agree in the first k digits, then $|E\,y_0 - E\,y'_0|$ is only $\approx 2^{-k}$. Exponentially many samples needed: at least $2^k$ traces are required to distinguish them.

  2. The key identity
  We can try other output bits y_j besides y_0. For y_j to come from x_k, this bit and exactly j bits among x_0, ..., x_{k-1} should be retained, so
  $E\,y_j = \sum_{k \ge j} \frac{1}{2} \binom{k}{j} \frac{1}{2^k}\, x_k.$
  The formula for E y_j is best summarized by a generating function identity:
  $\sum_{j=0}^{n-1} (E\,y_j)\, w^j = \frac{1}{2} \sum_{k=0}^{n-1} x_k \left( \frac{w+1}{2} \right)^k.$
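As a quick numerical sanity check of this identity (my own sketch, not from the talk; q = 1/2 and all function names are illustrative), one can simulate the deletion channel and compare the empirical mean of y_j, counting a missing position as 0, against the binomial formula:

```python
import random
from math import comb

def deletion_channel(x, q=0.5):
    """Delete each bit of x independently with probability q, keeping the order."""
    return [b for b in x if random.random() > q]

def predicted_Ey(x, j):
    """Key identity for q = 1/2:  E y_j = sum_{k >= j} (1/2) C(k, j) 2^{-k} x_k."""
    return sum(0.5 * comb(k, j) * 0.5 ** k * x[k] for k in range(j, len(x)))

def empirical_Ey(x, j, trials=200_000):
    """Monte Carlo estimate of E y_j, treating positions beyond the trace as 0."""
    total = 0
    for _ in range(trials):
        y = deletion_channel(x)
        if j < len(y):
            total += y[j]
    return total / trials

if __name__ == "__main__":
    x = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
    for j in range(4):
        print(j, round(predicted_Ey(x, j), 4), round(empirical_Ey(x, j), 4))
```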

  3. Using the key identity
  $\Psi_y(w) := E \sum_{j=0}^{n-1} y_j w^j = \frac{1}{2} \sum_{k=0}^{n-1} x_k \left( \frac{w+1}{2} \right)^k.$
  Goal: find small w so that Ψ_y(w) and Ψ_{y'}(w) differ substantially. Letting z = (w+1)/2, we have
  $\Psi_y(w) - \Psi_{y'}(w) = \frac{1}{2} \sum_{k=0}^{n-1} (x_k - x'_k)\, z^k.$
  It suffices to find z, with w = 2z - 1 small, so that the right-hand side of this expression is large.

  4. The maximum of a polynomial on a small arc
  Theorem (Borwein-Erdélyi). Let
  $f(z) = \sum_{k=0}^{n-1} a_k z^k$
  be a polynomial with coefficients $a_0 = 1$ and $|a_k| \le 1$. For any arc of length 1/L on the unit circle, there is a point z on the arc such that $|f(z)| \ge e^{-cL}$, where c is a universal constant.
  Apply this to
  $f(z) = \sum_{j=0}^{n-1} (x_j - x'_j)\, z^j,$
  dividing out by a power of z if needed. We can then find z in a given arc so that $|\Psi_y(w) - \Psi_{y'}(w)| \ge e^{-cL}$.
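To see the theorem in action numerically (an illustrative sketch of mine, with the constant c taken to be 1; not part of the talk), one can evaluate the difference polynomial of two random strings on the arc of length 1/L around z = 1 and compare the largest modulus found with e^{-L}:

```python
import numpy as np

def max_on_arc(a, L, num_points=20_000):
    """Largest |f(z)| over z = e^{i*theta}, |theta| <= 1/(2L), i.e. an arc of
    length 1/L around z = 1, where f(z) = sum_k a[k] z^k (Horner evaluation)."""
    z = np.exp(1j * np.linspace(-1 / (2 * L), 1 / (2 * L), num_points))
    vals = np.zeros_like(z)
    for coeff in a[::-1]:
        vals = vals * z + coeff
    return np.abs(vals).max()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 1000
    x, x_prime = rng.integers(0, 2, n), rng.integers(0, 2, n)
    a = x - x_prime   # coefficients in {-1, 0, 1}; the slides divide out a power
                      # of z so the lowest coefficient is +-1, skipped in this probe
    L = round(n ** (1 / 3))
    print("arc maximum:", max_on_arc(a, L), " e^{-L}:", np.exp(-L))
```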

  5. How to make w small?
  Choose z near 1. If $z = e^{i\theta}$, then $|w| = 1 + O(\theta^2)$. With $\theta = O(1/L)$, we obtain $|w| = 1 + O(1/L^2)$.
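Spelling out the expansion behind this step (my own computation, using w = 2z - 1):

$$|w|^2 = |2e^{i\theta} - 1|^2 = (2\cos\theta - 1)^2 + (2\sin\theta)^2 = 5 - 4\cos\theta = 1 + 2\theta^2 + O(\theta^4),$$

hence $|w| = 1 + \theta^2 + O(\theta^4) = 1 + O(\theta^2)$, and $\theta = O(1/L)$ indeed gives $|w| = 1 + O(1/L^2)$.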

  6. Using the key identity (cont'd)
  Conclusion:
  $\left| \sum_{j=0}^{n-1} \left( E\,y_j - E\,y'_j \right) w^j \right| \ge e^{-cL},$
  where $|w| = 1 + O(1/L^2) \Rightarrow |w^j| \le e^{Cn/L^2}$. We may assume C > c.
  Thus there is some j such that
  $|E\,y_j - E\,y'_j| \ge \tfrac{1}{n}\, e^{-CL - Cn/L^2} \ge e^{-3Cn^{1/3}} =: \epsilon$
  (taking $L = n^{1/3}$ to minimize $L + n/L^2$ and absorbing the 1/n term).
  $\Rightarrow$ $T = e^{7Cn^{1/3}}$ samples suffice to detect the difference in means: the probability of choosing wrongly between x and x' is $e^{-\Omega(T\epsilon^2)}$, which is much smaller than $2^{-n}$.
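Spelling out the final arithmetic (my own, not on the slide): with $\epsilon = e^{-3Cn^{1/3}}$ and $T = e^{7Cn^{1/3}}$,

$$T\epsilon^2 = e^{7Cn^{1/3}} \cdot e^{-6Cn^{1/3}} = e^{Cn^{1/3}},$$

so a Chernoff bound gives error probability $e^{-\Omega(T\epsilon^2)} = e^{-\Omega(e^{Cn^{1/3}})} \ll 2^{-n}$, small enough to union bound over all $2^n$ candidate input strings.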

  7. Reducing complexity
  To avoid enumerating over all $2^n$ possible input strings, one can use linear programming, following Holenstein et al. (2008). Suppose that $x_0, \ldots, x_{m-1}$ have been reconstructed and we wish to determine $x_m$. Write $\bar{y}_j = \frac{1}{T}\sum_{t=1}^{T} y^t_j$ for the empirical average of the output bits. Let $L := n^{1/3}$ and consider two linear programs (one where $x_m = 0$ and one where $x_m = 1$) in the relaxed variables $x_{m+1}, \ldots, x_n \in [0, 1]$:
  $|E(y_j) - \bar{y}_j| < e^{-cL} \quad \text{where} \quad E(y_j) = \sum_{k \ge j} \frac{1}{2} \binom{k}{j} \frac{1}{2^k}\, x_k.$
  Only one of these programs (either the LP determined by $x_m = 0$ or by $x_m = 1$) will be feasible if C is large enough and $T = e^{7Cn^{1/3}}$.
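A minimal sketch of how such a feasibility test could be run with an off-the-shelf LP solver (my illustration of the idea, not the authors' code; `tol` plays the role of $e^{-cL}$ and all names are made up):

```python
import numpy as np
from math import comb
from scipy.optimize import linprog

def coeff(j, k):
    """P(output bit y_j comes from input bit x_k) for q = 1/2: (1/2) C(k, j) 2^{-k}."""
    return 0.5 * comb(k, j) * 0.5 ** k if k >= j else 0.0

def lp_feasible(known_bits, x_m_guess, n, y_bar, tol):
    """Feasibility of the relaxed LP: x_{m+1}, ..., x_{n-1} in [0, 1] subject to
    |sum_k coeff(j, k) x_k - y_bar[j]| <= tol for every j, with x_0..x_{m-1}
    fixed to the reconstructed bits and x_m fixed to the guess 0 or 1."""
    m = len(known_bits)
    fixed = list(known_bits) + [x_m_guess]
    free = list(range(m + 1, n))
    A_ub, b_ub = [], []
    for j in range(n):
        known_part = sum(coeff(j, k) * fixed[k] for k in range(m + 1))
        row = np.array([coeff(j, k) for k in free])
        A_ub.append(row)
        b_ub.append(tol + y_bar[j] - known_part)   #  row.x <= tol + y_bar - known
        A_ub.append(-row)
        b_ub.append(tol - y_bar[j] + known_part)   # -row.x <= tol - y_bar + known
    res = linprog(c=np.zeros(len(free)), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(0, 1)] * len(free), method="highs")
    return res.success

# With enough traces averaged into y_bar, only one of
#   lp_feasible(recovered, 0, n, y_bar, tol), lp_feasible(recovered, 1, n, y_bar, tol)
# should report feasibility, which determines x_m.
```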

  8. Borwein-Erdélyi theorem: sketch of proof
  Take Γ to be a curve overlapping with the unit circle in an arc of length 1/L, as shown in the talk's figure (the "blue" part of Γ is that arc on the unit circle; the "green" part is the rest of Γ, which lies inside the unit disk).
  Since f is analytic, log |f(z)| is subharmonic. Thus, writing ω for harmonic measure from 0 in the region bounded by Γ,
  $0 = \log|f(0)| \le \int_{z \in \Gamma} \log|f(z)| \, d\omega(z).$
  Rearranging yields
  $\int_{z\ \text{blue}} \log|f(z)| \, d\omega(z) \ge - \int_{z\ \text{green}} \log|f(z)| \, d\omega(z).$
  For |z| < 1, we have
  $|f(z)| \le \sum_{j=0}^{\infty} |z|^j = \frac{1}{1 - |z|}.$
  One can show that this implies the green contribution is O(1). Therefore |f(z)| must be at least $e^{-O(L)}$ somewhere on the blue part, or else the integral over the blue part would be too negative.

  9. The Borwein-Erdélyi theorem is sharp
  As shown in [NP] and [DOS], this implies that for some c > 0 and all n large enough, there exist input strings x, x' of length n such that the corresponding outputs satisfy $|E\,y_j - E\,y'_j| < e^{-cn^{1/3}}$ for all j. Thus if $T = e^{o(n^{1/3})}$, then we cannot distinguish between x and x' by a linear test. However, the existence of such a pair x, x' is proved via a pigeonhole argument, and we are unable to produce them explicitly.

  10. Reconstruction of random strings

  11. Overview of strategy
  From now on, fix q < 1/2 and write p = 1 - q > 1/2.
  Given a trace y, figure out roughly which position in y corresponds to the last reconstructed position so far. Two steps:
  - Greedy matching: try to fit y as a subsequence of x; this gets within log n.
  - Aligning subsequences: analyze subsequences more carefully to align within log^{1/2} n.
  Then use bit statistics as before to reconstruct the next several bits. However, the alignment is not exact! But the approach can be modified to tolerate random shifts. We can only tolerate a small amount of shifting, hence the need to align accurately.

  12. Greedy matching
  Suppose we see both the input x and the output y. We still don't know which bits came from where. Nevertheless, we can try to fit y as a subsequence of x. Simple approach: just map bits in y "greedily" to the first possible match in x.
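A direct implementation of this greedy rule (a short sketch of mine for the binary-alphabet setting of the talk; names are illustrative):

```python
def greedy_match(x, y):
    """Map each bit of the trace y to the first still-unused position of x holding
    the same value.  Returns the list of matched positions, or None if y cannot be
    fit as a subsequence of x."""
    positions, i = [], 0
    for bit in y:
        while i < len(x) and x[i] != bit:
            i += 1              # skip input positions that cannot receive this bit
        if i == len(x):
            return None         # y is not a subsequence of x
        positions.append(i)
        i += 1                  # this input position is now used up
    return positions

# Example: greedy_match([1, 0, 0, 1, 1, 0, 1], [0, 1, 0, 1]) -> [1, 3, 5, 6]
```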

  13. Greedy matching (cont'd)
  The "true location" (gray arrows in the figure) advances like a geometric random variable with mean 1/p. The location given by the greedy algorithm (red arrows) advances like a geometric with mean 2 > 1/p, capped at hitting the true location. The gap between the true and greedy locations is therefore like a random walk biased towards zero, and so it stays O(log n) over the course of the length-n string.
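A simulation sketch of this claim (my own experiment; the deletion probability and length are arbitrary choices): the surviving bits remember their true source indices, and we measure how far the greedy positions lag behind them.

```python
import math
import random

def greedy_positions(x, y):
    """Greedy matching positions of the trace y inside the input x (y is a true
    subsequence of x here, so the scan never runs off the end)."""
    pos, i = [], 0
    for bit in y:
        while x[i] != bit:
            i += 1
        pos.append(i)
        i += 1
    return pos

def max_gap(n=100_000, q=0.4, seed=0):
    """Largest lag of the greedy location behind the true location for one trace."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    kept = [(i, b) for i, b in enumerate(x) if rng.random() > q]
    y = [b for _, b in kept]            # the trace
    true_pos = [i for i, _ in kept]     # where each trace bit really came from
    return max(t - g for t, g in zip(true_pos, greedy_positions(x, y)))

if __name__ == "__main__":
    print("max gap:", max_gap(), " log n:", round(math.log(100_000), 1))
```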

  14. Aligning by subsequences
  To get subpolynomial trace complexity, we need to align more precisely than log n.
  Consider a block of length log n and focus on the middle a := log^{1/2} n bits. After the deletion channel, this becomes a subsequence of length ≈ pa. But could this subsequence come from elsewhere (the bad event)?

  15. Aligning by subsequences (cont'd)
  Pick b such that (1 + ε) a < b < (2 − ε) pa. The bad event is covered by two unlikely events, each of probability ≈ e^{−const · a} (a small simulation sketch of both follows this slide):
  1. Only pa bits are retained from a block of length > b.
  2. A random string of length < b has a specific length-pa string as a substring.
  Event 1 depends only on the randomness of the deletions, not on the input; event 2 depends only on the randomness of the input, not on the deletions.
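A small simulation of the two bad events (my own sketch; the values of a, b, p below are arbitrary illustrative choices satisfying (1 + ε)a < b < (2 − ε)pa):

```python
import random

def bad_event_1(a, b, p, trials=100_000, seed=0):
    """Estimate P(only about p*a bits are retained from a block of length > b)."""
    rng = random.Random(seed)
    pa = int(p * a)
    hits = sum(sum(rng.random() < p for _ in range(b)) <= pa for _ in range(trials))
    return hits / trials

def bad_event_2(a, b, p, trials=20_000, seed=1):
    """Estimate P(a random string of length < b contains a fixed length-(p*a)
    string as a substring)."""
    rng = random.Random(seed)
    pa = int(p * a)
    target = "".join(rng.choice("01") for _ in range(pa))
    hits = sum(target in "".join(rng.choice("01") for _ in range(b))
               for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    a, b, p = 40, 50, 0.8   # here (1+eps)*a < b < (2-eps)*p*a for a small eps
    print(bad_event_1(a, b, p), bad_event_2(a, b, p))
```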

  16. Aligning by subsequences (cont'd)
  By "most" we will mean all but a fraction e^{−const · a}. We say an input is good if most length-pa subsequences of its middle a bits cannot be found elsewhere as subsequences of blocks of length b. For a good input, we can align to the middle a bits by finding a subsequence of length pa. Most inputs are good.

  17. Putting it all together
  Greedy matching can align to within log n. In a typical random block of length log n, we can then align to within log^{1/2} n. But this fails in a fraction e^{−const · log^{1/2} n} ≫ 1/n of the blocks. Not all blocks will be good, but among log^{1/2} n consecutive blocks there will (most likely) be a good one.

  18. Putting it all together (cont'd)
  Recall: using bit statistics we can recover m bits using e^{O(m^{1/3})} traces. A modification of the proof allows us to tolerate random shifts of size O(m^{1/3}). We can align to within log^{1/2} n and want to reconstruct log^{3/2} n bits ahead. The number of traces used is e^{O(log^{1/2} n)} = n^{o(1)}.
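Spelling out the exponent arithmetic (my own): per step we reconstruct $m = \log^{3/2} n$ bits, so $m^{1/3} = \log^{1/2} n$, which matches the alignment accuracy, and

$$e^{O(m^{1/3})} = e^{O(\log^{1/2} n)} = n^{O(1/\log^{1/2} n)} = n^{o(1)}.$$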

  19. Random strings and arbitrary deletion probability
  Holden-Pemantle-Peres '17: For an arbitrary deletion probability q ∈ [0, 1), we can reconstruct random strings with e^{O(log^{1/3} n)} = n^{o(1)} traces. We also allow insertions and substitutions. Further improvement for random strings cannot be obtained without an improvement for worst-case strings.
  Nina Holden (MIT), Robin Pemantle (University of Pennsylvania).

  20. Alignment with error
  [Figure: the reconstructed bits of x followed by unknown bits; an input window w of length log^{5/3} n (and a shorter one of length log^{2/3} n) is compared against a trace window w̃ of length ≈ p log^{5/3} n (resp. ≈ p log^{2/3} n).]
  Was w̃ likely obtained by sending w through the deletion channel? Divide w̃ and w into log n blocks. Let S be the number of corresponding blocks in w̃ and w with the same majority bit. Answer YES if S > (1/2 + c) log n; answer NO otherwise. Repeat with all strings w̃ of appropriate length.
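A sketch of this test in code (my own rendering of the slide's rule; the constant c and the block-splitting convention are illustrative):

```python
def majority_bit(block):
    """Majority bit of a 0/1 block (ties broken towards 1)."""
    return int(2 * sum(block) >= len(block))

def alignment_test(w_tilde, w, num_blocks, c=0.1):
    """Split both strings into num_blocks corresponding blocks, count the blocks
    whose majority bits agree, and answer YES iff the count exceeds
    (1/2 + c) * num_blocks.  On the slide, num_blocks = log n."""
    def blocks(s):
        size = max(1, len(s) // num_blocks)
        return [s[i * size:(i + 1) * size] for i in range(num_blocks)]
    S = sum(majority_bit(u) == majority_bit(v)
            for u, v in zip(blocks(w_tilde), blocks(w)))
    return S > (0.5 + c) * num_blocks
```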
