Coding for DNA Storage in Live Organisms
Moshe Schwartz
Electrical & Computer Engineering Ben-Gurion University Israel
Based on joint works with (alphabetically): Jehoshua Bruck (Caltech), Ohad Elishco (Ben-Gurion)
Introduction
1. An error ball: its size and shape depend on …
2. A packing of error balls: its density affects …
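[Figure: mutations acting on a string: u v w → u v′ w (substitution); u v w → u w (deletion); u w → u v w (insertion); u v w → u v v w (duplication)]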
[1] Lander et al., Nature 2001.
[Figure: four kinds of duplication]
- u v w → u v w v (end duplication)
- u v w → u v v^R w (v^R is the reversal of v)
- u v w → u v v w (tandem duplication)
- u v w z → u v w v z
1. $s \in S$.
2. $s' \in S$ and $T \in \mathcal{T}$ imply $T(s') \in S$.
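The two closure conditions above can be explored by brute force. The following is a minimal sketch, not the talk's own construction: it assumes strings are Python `str`, truncates the (infinite) system at a maximum length, and uses tandem duplication of length k as an illustrative rule family. The names `generate_system` and `tandem_rules` are hypothetical.

```python
from collections import deque

def tandem_rules(k):
    """Illustrative rule family (an assumption): all tandem duplications of length k."""
    def apply_all(x):
        # Duplicate the window x[i:i+k] in place, for every valid start i.
        return {x[:i + k] + x[i:i + k] + x[i + k:] for i in range(len(x) - k + 1)}
    return apply_all

def generate_system(seed, apply_all, max_len):
    """Strings of length <= max_len in the smallest set that contains the seed
    and is closed under the rules (breadth-first search)."""
    seen = {seed}                      # condition 1: the seed s is in S
    queue = deque([seed])
    while queue:
        x = queue.popleft()
        for y in apply_all(x):         # condition 2: T(s') is in S whenever s' is
            if len(y) <= max_len and y not in seen:
                seen.add(y)
                queue.append(y)
    return seen

print(sorted(generate_system("01", tandem_rules(1), 4), key=lambda w: (len(w), w)))
```

Truncating by length is safe here because duplications never shorten a string, so every string of length at most `max_len` is reached through strings that are also within the bound.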
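$$T^{\mathrm{end}}_{i,k}(x) = \begin{cases} uvwv & \text{if } x = uvw,\ |u| = i,\ |v| = k, \\ x & \text{otherwise,} \end{cases}$$
$$\mathcal{T}^{\mathrm{end}}_k = \{ T^{\mathrm{end}}_{i,k} : i \ge 0 \}, \qquad S^{\mathrm{end}}_k = S(\Sigma, s, \mathcal{T}^{\mathrm{end}}_k).$$
Example: u v w → u v w v.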
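$$T^{\mathrm{tan}}_{i,k}(x) = \begin{cases} uvvw & \text{if } x = uvw,\ |u| = i,\ |v| = k, \\ x & \text{otherwise,} \end{cases}$$
$$\mathcal{T}^{\mathrm{tan}}_k = \{ T^{\mathrm{tan}}_{i,k} : i \ge 0 \}, \qquad S^{\mathrm{tan}}_k = S(\Sigma, s, \mathcal{T}^{\mathrm{tan}}_k).$$
Example: u v w → u v v w.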
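The end and tandem duplication rules (u v w → u v w v and u v w → u v v w) can be written out directly. This is a minimal sketch, assuming 0-based positions, x = uvw with |u| = i and |v| = k, and that a rule leaves x unchanged when no such factorization exists; the function names `t_end` and `t_tan` are illustrative.

```python
def t_end(x, i, k):
    """End duplication: copy the window v = x[i:i+k] to the end of x."""
    if i < 0 or k < 1 or i + k > len(x):
        return x                      # no factorization x = uvw with |u| = i, |v| = k
    return x + x[i:i + k]

def t_tan(x, i, k):
    """Tandem duplication: insert a copy of v = x[i:i+k] right after v."""
    if i < 0 or k < 1 or i + k > len(x):
        return x
    return x[:i + k] + x[i:i + k] + x[i + k:]

print(t_end("ACGT", 1, 2))   # ACGTCG
print(t_tan("ACGT", 1, 2))   # ACGCGT
```

Both functions work on any sliceable sequence that supports `+`, so tuples of integers can be used in place of strings.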
End Duplication
k w1 w1
k w2 w1 w1 w2
k w1w2 w1w2
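Tandem Duplication
Let $\Sigma = \mathbb{Z}_q$, and let $\mathrm{Pref}_m(x)$ and $\mathrm{Suff}_m(x)$ denote the length-$m$ prefix and suffix of $x$. Define $\phi_k : \mathbb{Z}_q^{\ge k} \to \mathbb{Z}_q^k \times \mathbb{Z}_q^*$ by,
$$\phi_k(x) = \big(\mathrm{Pref}_k(x),\ \mathrm{Suff}_{|x|-k}(x) - \mathrm{Pref}_{|x|-k}(x)\big),$$
and $\zeta_{i,k} : \mathbb{Z}_q^k \times \mathbb{Z}_q^* \to \mathbb{Z}_q^k \times \mathbb{Z}_q^*$,
$$\zeta_{i,k}(x, y) = \begin{cases} (x, u 0^k w) & \text{if } y = uw,\ |u| = i, \\ (x, y) & \text{otherwise.} \end{cases}$$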
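$$\begin{array}{ccc}
\mathbb{Z}_q^{\ge k} & \xrightarrow{\;T^{\mathrm{tan}}_{i,k}\;} & \mathbb{Z}_q^{\ge k} \\
\downarrow \phi_k & & \downarrow \phi_k \\
\mathbb{Z}_q^k \times \mathbb{Z}_q^* & \xrightarrow{\;\zeta_{i,k}\;} & \mathbb{Z}_q^k \times \mathbb{Z}_q^*
\end{array}$$
The diagram commutes: for all $x \in \mathbb{Z}_q^{\ge k}$,
$$\phi_k(T^{\mathrm{tan}}_{i,k}(x)) = \zeta_{i,k}(\phi_k(x)).$$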
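The identity $\phi_k(T^{\mathrm{tan}}_{i,k}(x)) = \zeta_{i,k}(\phi_k(x))$ can be checked numerically. The sketch below assumes the usual definitions: $\phi_k(x)$ keeps the first k symbols and the k-step differences $x_{j+k} - x_j \bmod q$, and $\zeta_{i,k}$ inserts k zeros at position i of the second component. Strings are tuples over {0, …, q−1} with 0-based positions; all function names are illustrative.

```python
def t_tan(x, i, k):
    """Tandem duplication of the window x[i:i+k]; identity if the window does not fit."""
    if i < 0 or i + k > len(x):
        return x
    return x[:i + k] + x[i:i + k] + x[i + k:]

def phi(x, k, q):
    """Assumed transform: (first k symbols, k-step differences mod q)."""
    head = tuple(x[:k])
    diff = tuple((x[j + k] - x[j]) % q for j in range(len(x) - k))
    return head, diff

def zeta(pair, i, k):
    """Insert k zeros at position i of the difference part; identity if i is out of range."""
    head, y = pair
    if i < 0 or i > len(y):
        return pair
    return head, y[:i] + (0,) * k + y[i:]

x, k, q = (0, 1, 2, 3, 1), 2, 4
for i in range(len(x) + 1):
    # Both sides of the commutative diagram agree for every position i.
    assert phi(t_tan(x, i, k), k, q) == zeta(phi(x, k, q), i, k)
print(phi(x, k, q))
```

In the transformed domain a tandem duplication of length k is just the insertion of a run of k zeros, which is what makes duplications easy to track and undo.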
[Figure: example of $T^{\mathrm{tan}}_{1,2}$ and the corresponding $\zeta_{1,2}$]
… $(x, y_1)$ can never be obtained from $s$.
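$$T^{\mathrm{tan}}_{i,k}(x) = \begin{cases} uvvw & \text{if } x = uvw,\ |u| = i,\ |v| = k, \\ x & \text{otherwise,} \end{cases}$$
$$\mathcal{T}^{\mathrm{tan}}_k = \{ T^{\mathrm{tan}}_{i,k'} : \dots \}, \qquad S^{\mathrm{tan}}_k = S(\Sigma, s, \mathcal{T}^{\mathrm{tan}}_k).$$
Example: u v w → u v v w.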
k is fully expressive.
… $\log_2(r + 1)$, … $\sum_{i=0}^{|\Sigma|-2}$ … for which we can calculate the …
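The number of paths of length $n$ from $u$ to $v$ in $G$ is $(A_G^n)_{u,v}$.
The number of paths of length $n$ starting at $s$ is $\sum_v (A_G^n)_{s,v}$.
The capacity is therefore
$$\lim_{n\to\infty} \frac{1}{n} \log_2 \Big( \sum_v (A_G^n)_{s,v} \Big).$$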
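1. $\lambda = \lambda(A_G) = \max\{|\mu| : \mu \text{ is an eigenvalue of } A_G\}$ is also an eigenvalue of $A_G$.
2. There exist $y, x > 0$, unique (up to scalar multiplication), such that $y A_G = \lambda y$ and $A_G x^T = \lambda x^T$.
3. If $y \cdot x^T = 1$, then
$$\lim_{n\to\infty} \frac{A_G^n}{\lambda^n} = x^T \cdot y.$$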
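The eigenvalue statement above suggests a simple numerical recipe: the growth rate of the number of length-n paths is governed by the largest eigenvalue of the adjacency matrix. The sketch below uses plain power iteration and assumes a primitive 0/1 matrix. The golden-mean graph used as input is an illustrative choice, not one taken from the slides, and the function names are hypothetical.

```python
import math

def mat_vec(A, v):
    """Multiply a square matrix (list of rows) by a column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def perron_eigenvalue(A, iters=200):
    """Power iteration; for a primitive non-negative matrix this converges
    to the largest eigenvalue."""
    v = [1.0] * len(A)
    lam = 0.0
    for _ in range(iters):
        w = mat_vec(A, v)
        lam = max(w)                  # normalizing factor tends to the eigenvalue
        v = [c / lam for c in w]
    return lam

def count_paths(A, s, n):
    """Number of length-n paths starting at state s: the s-th entry of A^n * 1."""
    v = [1] * len(A)
    for _ in range(n):
        v = mat_vec(A, v)
    return v[s]

A = [[1, 1], [1, 0]]                  # illustrative graph: binary strings without "11"
lam = perron_eigenvalue(A)
print(lam, math.log2(lam))            # eigenvalue and the resulting capacity in bits
print(count_paths(A, 0, 3))
```

For large n, `math.log2(count_paths(A, 0, n)) / n` approaches `math.log2(lam)`, which is the limit that item 3 makes precise.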
[Figure: graph on the states $a_{|\Sigma|}, a_{|\Sigma|-1}, \dots, a_2, a_1$ and its 0/1 adjacency matrix]
… or improve the bounds on it.
Pólya String Models
… 3 but $\Pr(S(3) = 0111) = 1/6$.
1 ) = 0.
1 ) 0. Additionally,
1 ) capComb(Stan 1 ) = 0,
1. History is a permutation.
2. Each permutation is equally likely.
1. $\Pi_{01w}$: the set of history permutations that lead to a mutated …
2. $\Psi_w$: the set of permutations from $S_n$ with signature $w$.
3. For any string $v \in \{0, 1\}^{\ell}$, the set of positions where 0 is preceded …
1. $X_1, X_2, \dots, X_n$ generates a random permutation …
[Derivation: … $= \limsup_{n\to\infty}$ … $\sum_{i=1}^{n-1}$ …, evaluated through iterated integrals over $x_1, x_2, x_3$]
1. Find $\mathrm{cap}_{\mathrm{Prob}}(S^{\mathrm{tan}}_1)$.
2. We know nothing for duplication length 2.
Error-Correcting Codes
1. An error ball: its size and shape depend on …
2. A packing of error balls: its density affects …
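If $v$ is obtained from $u$ by a single tandem duplication of length $k$, we write $u \Longrightarrow_k v$, and $u \Longrightarrow^*_k v$ for any number (possibly zero) of such duplications. We define
$$D^*_k(u) = \{ v : u \Longrightarrow^*_k v \}, \qquad A^*_k(u) = \{ v : v \Longrightarrow^*_k u \}.$$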
[Figure: the sets $A^*_k(u)$ and $D^*_k(u)$ on either side of $u$ along a time axis]
… $D^*_k(u) \cap D^*_k(v) = \emptyset$.
… $_k(v) \cap C$.
[Figure: the sets $A^*_k(u)$ and $D^*_k(u)$, and the root $R_k(u)$, along a time axis]
If $A^*_k(v) = \{v\}$ we say $v$ is irreducible. The set of irreducible strings is denoted $\mathrm{Irr}_k$, and the roots of $v$ are $R_k(v) = A^*_k(v) \cap \mathrm{Irr}_k$.
k(v) must be of the form,
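$D^*_k(u) \cap D^*_k(v) \ne \emptyset$ if and only if $R_k(u) = R_k(v)$.
If $w \in D^*_k(u) \cap D^*_k(v)$ then
$$R_k(u) \Longrightarrow^*_k u \Longrightarrow^*_k w, \qquad R_k(v) \Longrightarrow^*_k v \Longrightarrow^*_k w,$$
so, since the root is unique, $R_k(u) = R_k(w) = R_k(v)$.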
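Conversely, if $R_k(u) = R_k(v)$, write
$$\phi_k(u) = (x,\ 0^{m'_0} y_1 0^{m'_1} y_2 0^{m'_2} \dots 0^{m'_{t-1}} y_t 0^{m'_t}),$$
$$\phi_k(v) = (x,\ 0^{m''_0} y_1 0^{m''_1} y_2 0^{m''_2} \dots 0^{m''_{t-1}} y_t 0^{m''_t}).$$
Then, for
$$w = \phi_k^{-1}\big((x,\ 0^{\max(m'_0, m''_0)} y_1 0^{\max(m'_1, m''_1)} \dots 0^{\max(m'_{t-1}, m''_{t-1})} y_t 0^{\max(m'_t, m''_t)})\big),$$
we have $u \Longrightarrow^*_k w$ and $v \Longrightarrow^*_k w$.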
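The construction of a common descendant $w$ by taking the maximum of the zero-run lengths can be sketched in a few lines. This assumes $\phi_k(x)$ is the pair (first k symbols, k-step differences mod q) and that strings are lists of integers in {0, …, q−1} of length at least k. The helper names are illustrative, and the function returns `None` when the two inputs do not share a root.

```python
def phi(x, k, q):
    """Assumed transform: (first k symbols, k-step differences mod q)."""
    return tuple(x[:k]), tuple((x[j + k] - x[j]) % q for j in range(len(x) - k))

def phi_inv(head, diff, k, q):
    """Inverse transform: rebuild x from its first k symbols and the differences."""
    x = list(head)
    for j, d in enumerate(diff):
        x.append((x[j] + d) % q)
    return x

def split_runs(y):
    """Write y = 0^{m_0} y_1 0^{m_1} ... y_t 0^{m_t}; return ([m_0..m_t], [y_1..y_t])."""
    runs, symbols, run = [], [], 0
    for s in y:
        if s == 0:
            run += 1
        else:
            runs.append(run)
            symbols.append(s)
            run = 0
    runs.append(run)
    return runs, symbols

def common_descendant(u, v, k, q):
    hu, yu = phi(u, k, q)
    hv, yv = phi(v, k, q)
    ru, su = split_runs(yu)
    rv, sv = split_runs(yv)
    # Same root: same head, same non-zero symbols, zero runs congruent mod k.
    if hu != hv or su != sv or any((a - b) % k for a, b in zip(ru, rv)):
        return None
    y = []
    for j, m in enumerate(max(a, b) for a, b in zip(ru, rv)):
        y.extend([0] * m)             # longest of the two zero runs
        if j < len(su):
            y.append(su[j])
    return phi_inv(hu, y, k, q)

print(common_descendant([0, 0, 1], [0, 1, 1], 1, 2))   # [0, 0, 1, 1]
```

Taking the maximum works because both run lengths are congruent modulo k, so each can be extended to the larger one by inserting whole blocks of k zeros, that is, by tandem duplications.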
1. Finding $\phi_k(v) = (x, y)$.
2. Reducing runs of 0's in $y$ modulo $k$ to obtain $y'$.
3. Returning the answer $\phi_k^{-1}(x, y')$.
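The three steps above translate almost line by line into code. The sketch below assumes $\phi_k(v)$ is the pair (first k symbols, k-step differences mod q), that $v$ is a list of integers in {0, …, q−1} with length at least k, and that the procedure is meant to return the root of $v$. The function names are illustrative.

```python
def phi(x, k, q):
    """Step 1 (assumed transform): first k symbols and k-step differences mod q."""
    return tuple(x[:k]), tuple((x[j + k] - x[j]) % q for j in range(len(x) - k))

def phi_inv(head, diff, k, q):
    """Step 3: rebuild the string from its head and difference part."""
    x = list(head)
    for j, d in enumerate(diff):
        x.append((x[j] + d) % q)
    return x

def reduce_zero_runs(y, k):
    """Step 2: shorten every maximal run of zeros to its length modulo k."""
    out, run = [], 0
    for s in y:
        if s == 0:
            run += 1
        else:
            out.extend([0] * (run % k))
            out.append(s)
            run = 0
    out.extend([0] * (run % k))       # trailing run of zeros
    return tuple(out)

def root(v, k, q):
    head, y = phi(v, k, q)
    return phi_inv(head, reduce_zero_runs(y, k), k, q)

print(root([0, 0, 1, 1, 1, 0], 1, 2))     # [0, 1, 0]
print(root([0, 1, 2, 1, 2, 3], 2, 4))     # [0, 1, 2, 3]
```

For k = 1 this collapses every run of repeated symbols to a single symbol, which matches the intuition that length-1 tandem duplications only lengthen runs.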
Conclusion
… [3]: forms a regular language, unique root, positive (though not …
- $|\Sigma| = 1$: $U \subseteq k\mathbb{N}$
- $|\Sigma| = 2$: $U = \{k\}$, $U \supseteq \{1, 2\}$
- $|\Sigma| \ge 3$: $U = \{k\}$, $U = \{1, 2\}$, $U = \{1, 2, 3\}$
[3] Jain et al., IEEE Trans. on Inform. Th. 2017.
[4] Alon et al., ISIT 2016.
[5] Elishco et al., ISIT 2016.
[6] Jain et al., ISIT 2017.