Analysis of Lempel-Ziv 78 for Markov sources
Ph Jacquet, W. Szpankowski Inria – Purdue U
the material is made available under the CC-BY-4.0 license https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode
Trees, Probab. Th. Rel. Fields, 1988.
Overflow, IEEE Trans. Information Theory, 1991
and digital search trees. Theoretical Computer Science, 1995
for digital search, Theoretical Computer Science, 1995
Lossy Data Compression, IEEE Trans. Information Theory, 1997
Algorithms and Combinatorial Structures, The Annals of Applied Probability, 2004
Markov model, DMTCS, 2005.
Models for Data Structures: The External Path Length in Tries under the Markov Model, Algorithms, SODA, 2013
[Figure: LZ78 parsing example; phrases numbered 1, 2, 3, ... with codewords 0+a, 1+b, 1+a, 2+a]
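The codewords in the figure (0+a, 1+b, 1+a, 2+a) pair a pointer to an earlier phrase with one new symbol. A minimal sketch of that parsing rule follows; the function name `lz78_parse` and the input string `"aabaaaba"` are assumptions chosen so that the output reproduces those four codewords, not material taken from the slides.

```python
def lz78_parse(text):
    """Split `text` into LZ78 phrases.

    Returns (codewords, leftover), where each codeword is a pair
    (pointer, symbol): pointer 0 is the empty phrase and pointer i >= 1
    is the i-th phrase found so far. `leftover` is the trailing piece of
    input that is a prefix of an already known phrase (possibly empty).
    """
    dictionary = {"": 0}  # phrase -> index
    codewords = []
    current = ""
    for symbol in text:
        candidate = current + symbol
        if candidate in dictionary:
            # Still a known phrase: keep extending it.
            current = candidate
        else:
            # New phrase = longest known prefix + one fresh symbol.
            codewords.append((dictionary[current], symbol))
            dictionary[candidate] = len(dictionary)
            current = ""
    return codewords, current


if __name__ == "__main__":
    # Hypothetical input: a | ab | aa | aba
    print(lz78_parse("aabaaaba"))
```

The number of codewords returned is the phrase count that the later slides call M_n for an input of length n.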
Jacquet, P., & Szpankowski, W. (1995). Asymptotic behavior of the Lempel-Ziv parsing scheme and digital search trees. Theoretical Computer Science, 144(1-2), 161-197.
$E[L_m] = \frac{m}{h}\left(\log m + \beta(m)\right)$ with …
$\mathrm{var}(L_m) = m\, v(m)$ with $v(m) = O(\log m)$
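$\frac{\partial}{\partial z} P(z,u) = P(p_a u z, u)\, P(p_b u z, u)$
$P\left(\frac{L_m - E[L_m]}{\sqrt{\mathrm{var}(L_m)}} \in [x, x+dx[\right) \to \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right) dx$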
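The differential equation above can be turned into a recurrence for the mean path length. The sketch below assumes that P(z,u) is the exponential generating function sum over m of E[u^{L_m}] z^m / m! with L_0 = 0 (the memoryless definition is not legible on this slide, so this is an assumption). Extracting the coefficient of z^m / m! then gives P_{m+1}(u) = u^m sum_k C(m,k) p_a^k p_b^{m-k} P_k(u) P_{m-k}(u), and differentiating at u = 1 yields the recurrence coded here. The function name is hypothetical.

```python
from math import comb, log


def expected_path_lengths(max_m, p_a):
    """Return [E[L_0], ..., E[L_max_m]] for a memoryless binary source.

    Recurrence (derived from the PDE under the stated assumption):
        E[L_{m+1}] = m + sum_k C(m,k) p_a^k p_b^(m-k) (E[L_k] + E[L_{m-k}])
    Intended for moderate max_m (a few hundred); very large binomials
    would overflow when converted to float.
    """
    p_b = 1.0 - p_a
    mean = [0.0] * (max_m + 1)
    for m in range(max_m):
        total = 0.0
        for k in range(m + 1):
            weight = comb(m, k) * p_a**k * p_b ** (m - k)
            total += weight * (mean[k] + mean[m - k])
        mean[m + 1] = m + total
    return mean


if __name__ == "__main__":
    m, p_a = 200, 0.5
    h = -p_a * log(p_a) - (1 - p_a) * log(1 - p_a)  # entropy in nats
    exact = expected_path_lengths(m, p_a)[m]
    # Compare the exact mean with the leading term (m / h) log m.
    print(exact, m * log(m) / h)
```

The gap between the two printed values corresponds to the (m/h) β(m) correction in the mean.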
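$E[M_n] = \ell^{-1}(n) + O(n^{\delta})$, with $\delta > 1/2$
$E[M_n] \sim \frac{h n}{\log n}$
$\mathrm{var}(M_n) \sim \frac{\ell^{-1}(n)\, v(\ell^{-1}(n))}{\left(\ell'(\ell^{-1}(n))\right)^2} = O\left(\frac{n}{\log^2 n}\right)$
$E[C_n] - h \sim h\, \frac{\log A - \beta(\ell^{-1}(n))}{\log n} = O\left(\frac{1}{\log n}\right)$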
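A short worked step connecting the two expressions for the mean number of phrases. It assumes that ℓ(m) denotes the mean path length, with ℓ(m) ~ (m log m)/h as stated for the Markov case later in the slides; the intermediate lines are a sketch, not taken from the deck.

```latex
\begin{align*}
\ell(m) &\sim \frac{m \log m}{h}, \qquad m = \ell^{-1}(n)
  \;\Longrightarrow\; m \log m \sim h n, \\
\log m + \log\log m &= \log n + \log h + o(1)
  \;\Longrightarrow\; \log m \sim \log n, \\
\ell^{-1}(n) = m &\sim \frac{h n}{\log m} \sim \frac{h n}{\log n}.
\end{align*}
```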
bbaababbaababaaababbaababababbabbaababbbbaabaababbaaaabbabbbabbbbba correlation
$E[M_n] = \ell^{-1}(n) + O(n^{\delta})$, with $\ell(m) \sim \frac{m \log m}{h}$
$\mathrm{var}(M_n) = O(n^{2\delta})$
$E[C_n] - h = O\left(\frac{1}{\log n}\right)$
the Lempel-Ziv parsing scheme for a Markovian source. Algorithmica, 31(3), 318-360.
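To get a feel for the Markov statements, one can simulate a binary Markov source, count LZ78 phrases, and compare with h n / log n. This is only an illustrative sketch: the function names, the transition probabilities 0.7 and 0.4, the fixed start symbol and the seed are all assumptions, and the entropy rate is computed in nats with the standard stationary-distribution formula.

```python
import random
from math import log


def markov_sequence(n, p_aa, p_ba, seed=0):
    """Generate n symbols from a binary Markov chain on {a, b}.

    p_aa = P(next = a | current = a), p_ba = P(next = a | current = b).
    The chain starts in 'a' (an arbitrary choice made for this sketch).
    """
    rng = random.Random(seed)
    p_next_a = {"a": p_aa, "b": p_ba}
    symbol = "a"
    out = []
    for _ in range(n):
        out.append(symbol)
        symbol = "a" if rng.random() < p_next_a[symbol] else "b"
    return "".join(out)


def count_phrases(text):
    """Number of complete LZ78 phrases in `text`."""
    phrases = {""}
    current = ""
    count = 0
    for symbol in text:
        current += symbol
        if current not in phrases:
            phrases.add(current)
            count += 1
            current = ""
    return count


def entropy_rate(p_aa, p_ba):
    """Entropy rate in nats; assumes 0 < p_aa < 1 and 0 < p_ba < 1."""
    def h(x):
        return -x * log(x) - (1 - x) * log(1 - x)

    pi_a = p_ba / (p_ba + (1 - p_aa))  # stationary probability of 'a'
    return pi_a * h(p_aa) + (1 - pi_a) * h(p_ba)


if __name__ == "__main__":
    n = 100_000
    text = markov_sequence(n, p_aa=0.7, p_ba=0.4)
    print(count_phrases(text), entropy_rate(0.7, 0.4) * n / log(n))
```

The two printed numbers should only agree roughly, since the relative correction to the leading term decays like 1/log n.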
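[Figure: tree with branches labeled a, b]
$P^a_{m,n} = P(L_m = n \mid \text{all sequences start with } a) = P(L^a_m = n)$
$\frac{\partial}{\partial z} P_a(z,u) = P_a(p_{aa} u z, u)\, P_b(p_{ab} u z, u)$
$E[L^a_m] = \frac{m}{h}\left(\log m + \beta_a(m)\right)$
$\mathrm{var}(L^a_m) = m\, v_a(m) = O(m \log m)$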
$\beta_c(m) = \beta(m) + O(m^{-\varepsilon})$
$\beta(m) = \bar{\beta} + P$ …
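[Figure: tree with branches labeled a, b, b]
$P^c_{m,k,n} = P(T_m = k \;\&\; L_m = n \mid \text{all sequences start with } c)$, $c \in \{a, b\}$
$P_c(z,u,v) = \sum_{m,k,n} P^c_{m,k,n}\, u^n v^k \frac{z^m}{m!}$
$\frac{\partial}{\partial z} P_c(z,u,v) = (p_{ca} v + p_{cb})\, P_a(p_{ca} u z, u, v)\, P_b(p_{cb} u z, u, v)$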
(…, $T^b_m$) is asymptotically normal
… $P_1(\log m)$, with $P_1(\cdot)$ periodic when the …
[Figure: tree with branches labeled a, b, b]
$\mathcal{P}_{m,n} = \sum_{m_a,k,n_a} P^a_{m_a,k,n_a}\, P^b_{m-m_a,\; m_a-k,\; n-n_a}$
b b b a b a b a a b b b a a a a b b b
$\sigma = (a, b, a, b, b, b)$, $\sigma_a = (a, b, b)$, $\sigma_b = (a, b, b)$
$\mathcal{P}_{m,n} = \sum_{\sigma} \mathcal{P}_{\sigma,n}$ and $P^a_{m,k,n} = \ldots\, P^a_{\sigma,n}$
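$\mathcal{P}_{\sigma,n} = P(\text{the first } m \text{ tail symbols follow } \sigma \;\&\; \text{cover length } n)$
$P^c_{\sigma,n} = P(\text{DST tail symbols follow } \sigma \;\&\; \text{path length is } n \mid \text{sequences start with } c)$
$\mathcal{P}_{m,n} = \sum_{|\sigma_a| + |\sigma_b| = m} P^a_{\sigma_a,n_a}\, P^b_{\sigma_b,\; n-n_a}$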
$\mathcal{P}_{m,n} \ne \sum_{m_a,n_a,k} P^a_{m_a,k,n_a}\, P^b_{m-m_a,\; m_a-k,\; n-n_a}$
b a a b a a b b a a b a
$\mathcal{P}_{m,n} \le \sum_{|\sigma_a| + |\sigma_b| = m} P^a_{\sigma_a,n_a}\, P^b_{\sigma_b,\; n-n_a} = \sum_{m_a,n_a,k} P^a_{m_a,k,n_a}\, P^b_{m-m_a,\; m_a-k,\; n-n_a}$
$\mathcal{P}_{m,n} \le \sum_{m_a,n_a,k} P^a_{m_a,k,n_a}\, P^b_{m-m_a,\; m_a-k,\; n-n_a} + \sum_{m_a,n_a,k} P^a_{m_a,k,n_a}\, P^b_{m-m_a,\; m_a-k-1,\; n-n_a} + \sum_{m_a,n_a,k} P^a_{m_a,k,n_a}\, P^b_{m-m_a,\; m_a-k+1,\; n-n_a}$
$\mathcal{P}_{m,n} \le 3 \sum_{m_a,n_a,k} P^a_{m_a,k,n_a}\, P^b_{m-m_a,\; m_a-k,\; n-n_a}$
– The distribution would be sub-gaussian and the claimed results would hold.
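$\mathcal{P}_{m,n} \le \frac{3}{\sqrt{2\pi C m \log m}} \exp\left(-\frac{\left(n - \frac{m \log m}{h} - \beta(m) - m\left(\frac{h(\bar{\tau})}{h} - 1\right)\right)^2}{2 C m \log m}\right)$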
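$\mathcal{P}_{m,n} \le B\, m^{1+\delta} \exp\left(-\frac{n - \frac{m \log m}{h} - \beta(m) - m\left(\frac{h(\bar{\tau})}{h} - 1\right)}{C m^{\delta}}\right)$
$h(x) = -x \log x - (1 - x) \log(1 - x)$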