Mass Error-Correction Codes for Polymer-Based Data Storage


SLIDE 1

Motivation Problem Statement Construction THANK YOU!

Mass Error-Correction Codes for Polymer-Based Data Storage

Ryan Gabrys

Joint work with S. Pattabiraman and O. Milenkovic

ISIT

June 8th, 2020

SLIDE 2

Motivation

SLIDE 6

Protein Sequencing

▸ A protein is a long sequence of amino acids whose composition and order determine the protein’s functionality.
▸ Mass spectrometry (M/S) has emerged as an important technique for sequencing proteins.
▸ Mass spectrometry outputs the molecular masses of fragments of the protein sequence.
▸ From these molecular masses, the identities of the corresponding amino acids can be determined.

SLIDE 16

Model

▸ The composition multiset $C(s)$ of a string $s = (s_1, \ldots, s_n) \in \{0,1\}^n$ is the multiset
  $C(s) = \{\{s_i, s_{i+1}, \ldots, s_j\} : 1 \le i \le j \le n\}$
  ranging over all $\binom{n+1}{2}$ contiguous substrings of $s$.
▸ As an example, if $s = (0,1,0,0)$, then
  $C(s) = \{\{0\}, \{1\}, \{0\}, \{0\}, \{0,1\}, \{0,1\}, \{0,0\}, \{0,0,1\}, \{0,0,1\}, \{0,1,0,0\}\}$.
▸ Under this model, previous work in [ADMOP15] studied how to recover a string $s$ given its composition multiset $C(s)$.

[ADMOP15] J. Acharya, H. Das, O. Milenkovic, A. Orlitsky, and S. Pan, “String reconstruction from substring compositions,” SIAM J. Discrete Math, 2015.
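
The definition of the composition multiset can be sketched in a few lines of Python. This is only an illustration, not code from the talk: the function name `composition_multiset` and the choice to encode each composition as a pair (number of 0s, number of 1s) are assumptions made here.

```python
from collections import Counter

def composition_multiset(s):
    """Return C(s) as a Counter mapping (zeros, ones) -> multiplicity.

    Each contiguous substring s_i..s_j contributes one composition, recorded
    as the pair (number of 0s, number of 1s). This encoding is an assumption
    of this sketch; the slides write compositions as multisets of symbols.
    """
    n = len(s)
    compositions = Counter()
    for i in range(n):
        ones = 0
        for j in range(i, n):
            ones += s[j]
            length = j - i + 1
            compositions[(length - ones, ones)] += 1
    return compositions

C = composition_multiset((0, 1, 0, 0))
print(sum(C.values()))  # 10 substrings in total for n = 4
print(C[(2, 1)])        # the composition {0, 0, 1} appears twice
```

A string and its reversal have the same contiguous substrings up to order, so they produce the same multiset under this encoding.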

SLIDE 19

Previous Work

▸ The following results are known from [ADMOP15]:

Theorem. All strings of length one less than a prime are reconstructable.

Theorem. All strings of length one less than twice a prime are reconstructable.

[ADMOP15] J. Acharya, H. Das, O. Milenkovic, A. Orlitsky, and S. Pan, “String reconstruction from substring compositions,” SIAM J. Discrete Math, 2015.

SLIDE 20

Problem Statement

SLIDE 25

Composition Error-Correcting Codes (1/2)

▸ Motivated by applications in polymer-based storage, we wish to consider the setup where $s$ belongs to a code and errors may occur.
▸ For any string $s \in \{0,1\}^n$, we say that $t$ composition errors have occurred in $C(s)$, resulting in $\tilde{C}_t(s)$, if
  $|\tilde{C}_t(s)| = |C(s)|$ and $|C(s) \,\triangle\, \tilde{C}_t(s)| \le 2t$,
  where $C(s) \,\triangle\, \tilde{C}_t(s)$ denotes their symmetric difference.

SLIDE 27

Composition Error-Correcting Codes (2/2)

▸ We say that a code $\mathcal{C}$ is a $t$-composition error-correcting code ($t$-CECC for short) if for any distinct $s, s' \in \mathcal{C}$,
  $\tilde{C}_t(s) \ne \tilde{C}_t(s')$,
  where $\tilde{C}_t(s)$ and $\tilde{C}_t(s')$ are the results of at most $t$ composition errors in $C(s)$ and $C(s')$, respectively.

SLIDE 32

Example of Composition Errors

▸ Suppose $s = (0,1,0,0)$, so that
  $C(s) = \{\{0\}, \{1\}, \{0\}, \{0\}, \{0,1\}, \{0,1\}, \{0,0\}, \{0,0,1\}, \{0,0,1\}, \{0,1,0,0\}\}$.
▸ Let
  $\tilde{C}_1(s) = \{\{0\}, \{1\}, \{0\}, \{0\}, \{0,1\}, \{0,1\}, \{0,0\}, \{0,0,1\}, \{0,0,0\}, \{0,1,0,0\}\}$.
  Since $|\tilde{C}_1(s) \,\triangle\, C(s)| = 2$, we say that $\tilde{C}_1(s)$ is the result of a single composition error occurring in $C(s)$.
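
The composition-error condition can be checked mechanically. The sketch below is an illustration only: the helper names are hypothetical, and each composition is encoded as a pair (number of 0s, number of 1s), which is an assumption of this sketch rather than notation from the talk. It reproduces the example for $s = (0,1,0,0)$, where one copy of {0, 0, 1} is replaced by {0, 0, 0}.

```python
from collections import Counter

def multiset_symmetric_difference_size(a, b):
    """Size of the multiset symmetric difference of two Counters."""
    # Counter subtraction keeps only positive counts, so (a - b) + (b - a)
    # holds exactly the elements present in one multiset but not the other.
    return sum(((a - b) + (b - a)).values())

def within_t_composition_errors(original, received, t):
    """Check that the sizes match and the symmetric difference is at most 2t."""
    same_size = sum(original.values()) == sum(received.values())
    return same_size and multiset_symmetric_difference_size(original, received) <= 2 * t

# C(s) for s = (0, 1, 0, 0); keys are (zeros, ones), an encoding assumed here.
C = Counter({(1, 0): 3, (0, 1): 1, (1, 1): 2, (2, 0): 1, (2, 1): 2, (3, 1): 1})

# One copy of {0, 0, 1} is replaced by {0, 0, 0}.
C1 = C.copy()
C1[(2, 1)] -= 1
C1[(3, 0)] += 1

print(multiset_symmetric_difference_size(C, C1))  # 2
print(within_t_composition_errors(C, C1, t=1))    # True
```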

SLIDE 35

Problem Statement

▸ We will be interested in the following two problems:
  1. How many bits of redundancy are sufficient to construct a $t$-CECC?
  2. How can we construct a $t$-CECC that has a small amount of redundancy but possesses an efficient encoding/decoding algorithm?

SLIDE 39

Summary of Results

1. We show that at most $O(t) + \log n$ bits of redundancy are sufficient to construct a $t$-CECC.
2. We construct a systematic code with $O(t^2 \log n)$ bits of redundancy and decoding complexity $O(n^3)$.

▸ The results of this work extend our preliminary paper [PGM19], where we constructed a 1-CECC.
▸ For the remainder of the talk, we will focus on result 2.

[PGM19] S. Pattabiraman, R. Gabrys, and O. Milenkovic, “Reconstruction and error-correction codes for polymer-based data storage,” Information Theory Workshop, 2019.

SLIDE 40

Construction

SLIDE 57

A Polynomial Interpretation (1/2)

▸ In the following, similarly to [ADMOP15], we interpret the vectors as polynomials.
▸ For example, we associate the vector $s = (0,1,0,0)$ with the bivariate polynomial
  $P_s(x,y) = 1 + y + xy + xy^2 + xy^3$,
  where the symbol $x$ corresponds to 1 and $y$ to 0.
▸ Note that we can also interpret the composition multiset as a polynomial $S_s(x,y)$, where
  $S_s(x,y) = 3y + x + 2xy + y^2 + 2xy^2 + xy^3$.
▸ From [ADMOP15],
  $P_s(x,y)\,P_s(1/x, 1/y) = (n+1) + S_s(x,y) + S_s(1/x, 1/y)$.

[ADMOP15] J. Acharya, H. Das, O. Milenkovic, A. Orlitsky, and S. Pan, “String reconstruction from substring compositions,” SIAM J. Discrete Math, 2015.
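
The identity from [ADMOP15] can be checked numerically for the running example. The sketch below is illustrative and rests on stated assumptions: polynomials are stored as Counters keyed by (x-exponent, y-exponent), all helper names are hypothetical, and $S_s$ is taken to hold exactly one monomial per contiguous substring composition (so it has no constant term), which is the convention under which the identity balances.

```python
from collections import Counter

def prefix_polynomial(s):
    """P_s(x, y): one monomial x^(#1s) y^(#0s) per prefix, including the empty prefix."""
    poly = Counter({(0, 0): 1})
    ones = zeros = 0
    for bit in s:
        ones += bit
        zeros += 1 - bit
        poly[(ones, zeros)] += 1
    return poly

def composition_polynomial(s):
    """S_s(x, y): one monomial x^(#1s) y^(#0s) per contiguous substring."""
    poly = Counter()
    for i in range(len(s)):
        ones = zeros = 0
        for bit in s[i:]:
            ones += bit
            zeros += 1 - bit
            poly[(ones, zeros)] += 1
    return poly

def invert(poly):
    """Substitute x -> 1/x and y -> 1/y (exponents may become negative)."""
    return Counter({(-a, -b): c for (a, b), c in poly.items()})

def multiply(p, q):
    """Product of two sparse bivariate (Laurent) polynomials."""
    out = Counter()
    for (a, b), c in p.items():
        for (d, e), f in q.items():
            out[(a + d, b + e)] += c * f
    return out

s = (0, 1, 0, 0)
P, S = prefix_polynomial(s), composition_polynomial(s)

lhs = multiply(P, invert(P))
rhs = Counter({(0, 0): len(s) + 1})  # the constant term n + 1
rhs.update(S)                        # + S_s(x, y)
rhs.update(invert(S))                # + S_s(1/x, 1/y)
print(lhs == rhs)  # True
```

The identity holds because the product pairs every prefix with every other prefix: equal pairs contribute the constant $n+1$, and each unequal pair contributes the composition of the substring between the two prefix endpoints or its inverse.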

SLIDE 59

A Polynomial Interpretation (2/2)

▸ For a bivariate polynomial $f(x,y)$, let $f^*(x,y)$ denote its reciprocal polynomial,
  $f^*(x,y) = x^{d_x} y^{d_y} f(1/x, 1/y)$,
  where $d_x$ is the $x$-degree of $f(x,y)$ and $d_y$ is its $y$-degree.
▸ Using this interpretation, we can write
  $P_s(x,y)\,P_s^*(x,y) = x^{d_x} y^{d_y} (n + 1 + S_s(x,y)) + S_s^*(x,y)$,
  where $d_x$ is the $x$-degree of $P_s(x,y)$ and $d_y$ is its $y$-degree.
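
The reciprocal form can be verified for $s = (0,1,0,0)$ in the same way. This is a minimal sketch under assumptions made here: polynomials are Counters keyed by (x-exponent, y-exponent), the helper names are hypothetical, and $S_s$ carries one monomial per contiguous substring composition with no constant term.

```python
from collections import Counter

def prefix_polynomial(s):
    """P_s(x, y): one monomial x^(#1s) y^(#0s) per prefix, including the empty prefix."""
    poly = Counter({(0, 0): 1})
    ones = zeros = 0
    for bit in s:
        ones += bit
        zeros += 1 - bit
        poly[(ones, zeros)] += 1
    return poly

def composition_polynomial(s):
    """S_s(x, y): one monomial x^(#1s) y^(#0s) per contiguous substring."""
    poly = Counter()
    for i in range(len(s)):
        ones = zeros = 0
        for bit in s[i:]:
            ones += bit
            zeros += 1 - bit
            poly[(ones, zeros)] += 1
    return poly

def degrees(poly):
    """Return (x-degree, y-degree) of a sparse bivariate polynomial."""
    return max(a for a, _ in poly), max(b for _, b in poly)

def reciprocal(poly):
    """f*(x, y) = x^dx y^dy f(1/x, 1/y)."""
    dx, dy = degrees(poly)
    return Counter({(dx - a, dy - b): c for (a, b), c in poly.items()})

def shift(poly, dx, dy):
    """Multiply a polynomial by the monomial x^dx y^dy."""
    return Counter({(a + dx, b + dy): c for (a, b), c in poly.items()})

def multiply(p, q):
    out = Counter()
    for (a, b), c in p.items():
        for (d, e), f in q.items():
            out[(a + d, b + e)] += c * f
    return out

s = (0, 1, 0, 0)
P, S = prefix_polynomial(s), composition_polynomial(s)
dx, dy = degrees(P)

lhs = multiply(P, reciprocal(P))
rhs = shift(Counter({(0, 0): len(s) + 1}) + S, dx, dy) + reciprocal(S)
print(lhs == rhs)  # True
```

The check relies on $S_s$ and $P_s$ having the same $x$- and $y$-degrees, which holds because the full string is both a prefix and a substring.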

SLIDE 61

Some Useful Results

Lemma. Suppose $\mathrm{wt}(s) \equiv 0 \pmod{2t+1}$. Then, given $\tilde{S}_s(x,y)$, one can generate $P_s(x,y)\,P_s^*(x,y) + \tilde{E}(x,y)$, where $\tilde{E}(x,y)$ has at most $4t$ non-zero terms.

Lemma. Suppose $f(x,y)$ has at most $4t$ non-zero terms and total degree at most $n$. Then, we can uniquely recover $f(x,y)$ provided $f(\alpha^{\ell_1}, \alpha^{\ell_2}) = 0$, where $\alpha \in \mathbb{F}_q$ is a primitive element of the field, $q \ge 2n+1$, and $\ell_1, \ell_2 \in [[4t]]$.

SLIDE 67

Our Approach

Theorem. Let $\mathcal{C} \subseteq \{0,1\}^n$ be such that for any $s \in \mathcal{C}$,
  $P_s(\alpha^{\ell_1}, \alpha^{\ell_2}) = 0$ and $\mathrm{wt}(s) \equiv 0 \pmod{2t+1}$,
where $\alpha \in \mathbb{F}_q$ is primitive, $q \ge 2n+1$, and $\ell_1, \ell_2 \in [[4t]]$. Then, $\mathcal{C}$ is a $t$-CECC.

In light of the previous lemmas, the proof of correctness follows from two basic steps:

1. Step 1: From $\tilde{S}_s(x,y)$, generate $Z(x,y) = P_s(x,y)\,P_s^*(x,y) + \tilde{E}(x,y)$.
2. Step 2: Since $Z(\alpha^{\ell_1}, \alpha^{\ell_2}) = P_s(\alpha^{\ell_1}, \alpha^{\ell_2})\,P_s^*(\alpha^{\ell_1}, \alpha^{\ell_2}) + \tilde{E}(\alpha^{\ell_1}, \alpha^{\ell_2}) = \tilde{E}(\alpha^{\ell_1}, \alpha^{\ell_2})$, we can recover $\tilde{E}(x,y)$. Finally, from $P_s(x,y)\,P_s^*(x,y)$, one can recover $P_s(x,y)$ [PGM19].

[PGM19] S. Pattabiraman et al., “Reconstruction and error-correction codes for polymer-based data storage,” Information Theory Workshop, 2019.
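
The two membership conditions in the theorem can be sketched as a check on a candidate string. This is a rough illustration under several assumptions that are not stated in the talk: $q$ is taken to be an odd prime so that arithmetic in $\mathbb{F}_q$ is arithmetic modulo $q$, $[[4t]]$ is read as $\{1, \ldots, 4t\}$, and all function names are hypothetical. It is not the authors' encoder or decoder.

```python
def prefix_polynomial(s):
    """P_s(x, y) as {(x_exp, y_exp): coeff}, one monomial per prefix of s."""
    poly = {(0, 0): 1}
    ones = zeros = 0
    for bit in s:
        ones += bit
        zeros += 1 - bit
        poly[(ones, zeros)] = poly.get((ones, zeros), 0) + 1
    return poly

def find_primitive_root(q):
    """Smallest generator of the multiplicative group mod q (q assumed an odd prime)."""
    for a in range(2, q):
        value = 1
        for order in range(1, q):
            value = value * a % q
            if value == 1:
                break
        if value == 1 and order == q - 1:
            return a
    raise ValueError("no primitive root found; q is assumed to be an odd prime")

def evaluate(poly, x, y, q):
    """Evaluate a sparse bivariate polynomial at (x, y) modulo q."""
    return sum(c * pow(x, a, q) * pow(y, b, q) for (a, b), c in poly.items()) % q

def satisfies_code_conditions(s, t, q):
    """Check wt(s) = 0 mod (2t + 1) and P_s(alpha^l1, alpha^l2) = 0 for all l1, l2."""
    if sum(s) % (2 * t + 1) != 0:
        return False
    alpha = find_primitive_root(q)
    P = prefix_polynomial(s)
    exponents = range(1, 4 * t + 1)  # ASSUMPTION: [[4t]] read as {1, ..., 4t}
    return all(
        evaluate(P, pow(alpha, l1, q), pow(alpha, l2, q), q) == 0
        for l1 in exponents
        for l2 in exponents
    )

s = (0, 1, 0, 0)
q = 11  # a prime with q >= 2n + 1 = 9
alpha = find_primitive_root(q)
print(alpha)                                          # 2
print(evaluate(prefix_polynomial(s), alpha, alpha, q))  # 9, so s fails the root condition
print(satisfies_code_conditions(s, t=1, q=q))         # False: wt(s) = 1 is not 0 mod 3
```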

SLIDE 70

Conclusion and Future Work

▸ Is it possible to construct an efficient systematic $t$-CECC with less redundancy?
▸ We have shown that $O(t) + \log n$ bits of redundancy are sufficient to correct $t$ composition errors.
▸ Upper bounds on $t$-CECCs.

SLIDE 71

THANK YOU!