6.975 Week 5: Universal Compression Via Grammar Based Codes


  1. 6.975 Week 5: Universal Compression Via Grammar Based Codes. Presenter: Emin Martinian

  2. Grammar Based Compression • Initial data may contain complex relationships. • Transform data to a “basis” with independent components. • Use simple, memoryless compression on these components.

  3. Example: Suppose we want to compress x = c c c c a b a b c c c a b a b c c c a b.
     Let A1 → a b, then x = c c c c A1 A1 c c c A1 A1 c c c A1
     Let A2 → c c c, then x = A2 c A1 A1 A2 A1 A1 A2 A1
     Let A3 → A1 A1 A2, then x = A2 c A3 A3 A1
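
The slide builds the grammar by hand; the sketch below does something similar automatically. It is a simplified, Re-Pair-style digram replacement written in Python purely for illustration (greedy_grammar is a hypothetical helper, not Kieffer and Yang's transform), so the grammar it produces for this x need not match the one constructed above.

```python
from collections import Counter

def greedy_grammar(x):
    """Build a toy grammar for x by repeatedly replacing the most
    frequent adjacent pair with a fresh variable (Re-Pair style).
    The result is a valid grammar for x, though not necessarily
    irreducible in the Kieffer-Yang sense."""
    seq = list(x)                 # current right member of the start rule A0
    rules = {}                    # variable -> right member (list of symbols)
    next_id = 1
    while True:
        # Count adjacent pairs in the current sequence.
        pairs = Counter(zip(seq, seq[1:]))
        pair, count = max(pairs.items(), key=lambda kv: kv[1], default=(None, 0))
        if count < 2:
            break                 # no pair repeats: stop
        var = f"A{next_id}"
        next_id += 1
        rules[var] = list(pair)
        # Replace occurrences left to right, without overlaps.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(var)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    rules["A0"] = seq
    return rules

print(greedy_grammar("ccccababcccababcccab"))
```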

  4. G_x vs. Lempel-Ziv
     Grammar transform parsing: x = ccc, c, ababccc, ababccc, ab
       A0 → A2 c A3 A3 A1,  A1 → ab,  A2 → ccc,  A3 → A1 A1 A2
     Lempel-Ziv parsing: x = c, cc, ca, b, a, bc, cca, ba, bcc, cab, with one rule per phrase:
       A0 → A1 A2 A3 A4 A5 A6 A7 A8 A9 A10,
       A1 → c,  A2 → A1 c,  A3 → A1 a,  A4 → b,  A5 → a,
       A6 → A4 c,  A7 → A2 a,  A8 → A4 a,  A9 → A6 c,  A10 → A3 b
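
The Lempel-Ziv phrase list above can be reproduced with a short sketch; lz78_parse is a hypothetical name, and the logic is the standard LZ78 incremental-parsing rule (each phrase extends a previously seen phrase by one new symbol).

```python
def lz78_parse(x):
    """Return the LZ78 phrase list for x: each phrase is the shortest
    prefix of the remaining input that has not been seen before."""
    phrases = []
    seen = set()
    cur = ""
    for sym in x:
        cur += sym
        if cur not in seen:        # longest previous match ended: emit phrase
            seen.add(cur)
            phrases.append(cur)
            cur = ""
    if cur:                        # possible incomplete final phrase
        phrases.append(cur)
    return phrases

print(lz78_parse("ccccababcccababcccab"))
# ['c', 'cc', 'ca', 'b', 'a', 'bc', 'cca', 'ba', 'bcc', 'cab']
```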

  5. Context-Free Grammar: G = (V, T, P, S)
     V = {A0, A1, A2, A3}
     T = {a, b, c}
     P = {A0 → A2 c A3 A3 A1, A1 → ab, A2 → ccc, A3 → A1 A1 A2}
     S = A0
     L(G) = all strings derivable from G.
     Grammar Transform: x → G_x where L(G_x) = {x}.
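
Since L(G_x) = {x}, the transform is inverted simply by expanding the start symbol until only terminals remain (the map written f_G^∞ on later slides). A minimal sketch, assuming variables are stored as dictionary keys 'A0', 'A1', ... and right members as lists of symbols:

```python
def expand(grammar, symbol="A0"):
    """Recursively expand a variable of G into the terminal string it
    derives; applied to the start symbol A0 this recovers x exactly,
    since L(G_x) = {x}."""
    if symbol not in grammar:          # terminal symbol
        return symbol
    return "".join(expand(grammar, s) for s in grammar[symbol])

G_x = {
    "A0": ["A2", "c", "A3", "A3", "A1"],
    "A1": ["a", "b"],
    "A2": ["c", "c", "c"],
    "A3": ["A1", "A1", "A2"],
}
assert expand(G_x) == "ccccababcccababcccab"
```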

  6. Advantages of Grammar Based Codes • Better matching of source correlations • Optimization for complexity, causality, side information, error resilience, etc. • Universal lossless compression

  7. Asymptotically Compact Grammars: a grammar transform is asymptotically compact if it satisfies
     • ∀ x, G_x ∈ G*(A)
     • lim_{n→∞} max_{x ∈ A^n} |G_x| / |x| = 0
     Theorem 7: asymptotically compact grammar transforms yield universal compression.

  8. Requirements for the set G*(A):
     1. ∀ A ∈ V(G), one rule in P(G) has left member A.
     2. The empty string is not the right member of any rule.
     3. L(G) is non-empty.
     4. G has no useless symbols.
     5. Canonical variable naming.
     6. f_G^∞(A) ≠ f_G^∞(B) for A ≠ B.

  9. Irreducible Grammar Transforms: A grammar G is called irreducible if
     1. G ∈ G*(A)
     2. ∀ A ∈ V(G) \ {A0}, A appears at least twice in the right members of P(G).
     3. No pair (Y1, Y2) of symbols in V(G) ∪ T(G) exists such that Y1 Y2 appears more than once as a substring of the right members of P(G).
     Kieffer and Yang present reduction rules that turn any grammar into one satisfying these conditions.
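
A small sketch of how conditions 2 and 3 might be checked mechanically. is_irreducible is a hypothetical helper; condition 1 (membership in G*(A)) is assumed rather than verified, and repeated digrams are counted without overlaps (e.g. the two overlapping cc inside ccc count once), which is only an approximation of the exact definition.

```python
from collections import Counter

def is_irreducible(grammar, start="A0"):
    """Check conditions 2 and 3 on a grammar stored as
    {variable: list of right-member symbols}."""
    right_members = list(grammar.values())
    # Condition 2: every non-start variable occurs >= 2 times on the right.
    occurrences = Counter(s for rhs in right_members for s in rhs)
    if any(v != start and occurrences[v] < 2 for v in grammar):
        return False
    # Condition 3: no digram Y1 Y2 appears more than once (non-overlapping).
    digrams = Counter()
    for rhs in right_members:
        i = 0
        while i + 1 < len(rhs):
            pair = (rhs[i], rhs[i + 1])
            digrams[pair] += 1
            # Skip ahead when the same pair repeats overlapping, so the
            # two overlapping 'cc' inside 'ccc' are counted only once.
            if i + 2 < len(rhs) and (rhs[i + 1], rhs[i + 2]) == pair:
                i += 2
            else:
                i += 1
    return all(count <= 1 for count in digrams.values())

G_x = {"A0": ["A2", "c", "A3", "A3", "A1"], "A1": ["a", "b"],
       "A2": ["c", "c", "c"], "A3": ["A1", "A1", "A2"]}
print(is_irreducible(G_x))   # True for the running example
```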

  10. Encoding G = (V, T, P, S)
      • The canonical variable set V(G) is described by |V(G)|, which takes |V(G)| bits in a unary encoding.
      • T ⊆ A is described by |A| bits in a one-hot (indicator) encoding.
      • S = A0 by the canonical naming convention, so it requires 0 bits.
      Total = |V(G)| + |A| + 0 bits.
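
A rough sketch of this header, under assumed bit conventions (encode_header is a hypothetical helper; the exact unary and indicator codes used in the paper may differ):

```python
def encode_header(grammar, alphabet):
    """Header bits for G: |V(G)| in unary (|V(G)| bits), then |A|
    indicator bits marking which alphabet symbols occur as terminals.
    S = A0 is fixed by the canonical naming, so it costs 0 bits."""
    unary = "0" * (len(grammar) - 1) + "1"      # unary code for |V(G)|
    terminals = {s for rhs in grammar.values() for s in rhs if s not in grammar}
    one_hot = "".join("1" if a in terminals else "0" for a in alphabet)
    return unary + one_hot                      # |V(G)| + |A| bits in total

print(encode_header({"A0": ["A1", "c"], "A1": ["a", "b"]}, "abc"))  # '01' + '111'
```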

  11. Encoding G = (V, T, P, S)
      To encode P we must describe f_G(A0), f_G(A1), ..., f_G(A_{|V(G)|−1}). Equivalently, we can describe the lengths |A0|, |A1|, ..., |A_{|V(G)|−1}| (using |G| bits in unary encoding) together with the concatenation ρ_G ≜ f_G(A0) f_G(A1) ... f_G(A_{|V(G)|−1}).

  12. Encoding G = (V, T, P, S)
      Instead of encoding ρ_G directly, define ω_G ≜ ρ_G with the first occurrence of each variable removed. Encode ρ_G by
      • indicating the removed entries (|G| bits),
      • sending the frequencies of the symbols of V(G) ∪ T(G) occurring in ω_G using unary encoding (|G| bits),
      • using those frequencies to entropy code ω_G (⌈H*(ω_G)⌉ bits).
      Total ≤ |A| + 4|G| + ⌈H*(ω_G)⌉ bits.
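
A sketch of the two objects involved, assuming the grammar is stored as a dict with variables named 'A0', 'A1', ...: omega builds ρ_G and strips the first occurrence of each variable to get ω_G, and h_star computes the unnormalized empirical entropy H*(·) = min over pmfs q of Σ_i −log q(·_i). Both names are illustrative, not from the slides.

```python
import math
from collections import Counter

def omega(grammar):
    """Concatenate the right members f_G(A0), f_G(A1), ... into rho_G,
    then drop the first occurrence of each variable to obtain omega_G."""
    variables = sorted(grammar, key=lambda v: int(v[1:]))   # 'A0', 'A1', ...
    rho = [s for v in variables for s in grammar[v]]
    seen, out = set(), []
    for s in rho:
        if s in grammar and s not in seen:
            seen.add(s)          # first occurrence of a variable: removed
        else:
            out.append(s)
    return out

def h_star(seq):
    """Unnormalized empirical entropy: H*(seq) = sum_s n_s * log2(n / n_s),
    which equals min over pmfs q of sum_i -log2 q(seq_i)."""
    n, counts = len(seq), Counter(seq)
    return sum(c * math.log2(n / c) for c in counts.values())

G_x = {"A0": ["A2", "c", "A3", "A3", "A1"], "A1": ["a", "b"],
       "A2": ["c", "c", "c"], "A3": ["A1", "A1", "A2"]}
w = omega(G_x)
print(w, round(h_star(w), 2))   # H*(omega_G) ≈ 23.22 bits for the running example
```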

  13. Bounding ⌈H*(ω_G)⌉ for G ∈ G*(A)
      • There exists a σ = σ1 σ2 ... σt ~ ω_G with f_G^∞(σ) = x.
      • Let π be the parsing π = (f_G^∞(σ1), f_G^∞(σ2), ..., f_G^∞(σt)) of x.
      • If |α| > 1 for every rule (A → α) ∈ P, then f_G^∞(·) is a one-to-one map between the σi and the πi, so H*(ω_G) = H*(σ) = H*(π). In any case, H*(ω_G) ≤ H*(π) + |G|.

  14. Bounding ⌈H*(π)⌉
      Consider a k-th order finite-state source µ and define
        τ(y) ≜ max_{s0, s1, ..., sm} ∏_{i=1}^{m} p(s_i, y_i | s_{i−1}),  where m = |y|.
      We design τ(y) to overestimate the probability of y. To get a valid distribution we normalize by Q_k k^{−1} |y|^{−2}, obtaining
        p*(y) = Q_k k^{−1} |y|^{−2} τ(y),  with Q_k ≥ 1/2.
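
τ(y) is a max-product over state sequences, so it can be computed with Viterbi-style dynamic programming. A sketch under an assumed interface (tau and the dict-based p below are illustrative, not from the slides):

```python
def tau(y, states, p):
    """tau(y) = max over state sequences s0, s1, ..., sm of
    prod_{i=1}^{m} p(s_i, y_i | s_{i-1}), computed by Viterbi-style
    max-product dynamic programming.  p is assumed to be a dict
    mapping (prev_state, state, symbol) -> probability."""
    best = {s: 1.0 for s in states}        # best product over choices of s0
    for sym in y:
        nxt = {}
        for prev, val in best.items():
            for s in states:
                cand = val * p.get((prev, s, sym), 0.0)
                if cand > nxt.get(s, 0.0):
                    nxt[s] = cand           # keep the best path ending in s
        best = nxt
    return max(best.values(), default=0.0)
```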

  15. Bounding ⌈H*(π)⌉ (Continued)
      Combining
        H*(π) = min_q Σ_{i=1}^{t} −log q(π_i) ≤ Σ_{i=1}^{t} −log p*(π_i)
      with
        µ(x) ≤ τ(x) ≤ ∏_{i=1}^{t} τ(π_i) ≤ ∏_{i=1}^{t} p*(π_i) · 2 k |π_i|²
      yields
        H*(π) ≤ −log µ(x) + t(1 + log k) + 2 Σ_{i=1}^{t} log |π_i|.

  16. Summary for Encoding G_x:
      • Code length ≤ −log µ(x) + |A| + 5|G_x| + O(|G_x| log(|x| / |G_x|)).
      • Many parsings have H*(π) near −log µ(x) + O(ν(|G_x| / |x|)).
      • Obtaining universal codes requires choosing a parsing/grammar that makes |G_x| / |x| small.

  17. Bounding |G_x| / |x| for G_x ∈ G*(A)
      • Consider the “worst case” G_x ∈ G*(A) which maximizes |G_x|.
      • But for G_x ∈ G*(A), rule expansions must be unique.
      • So there are at most |A|^l rules expanding to a string of length l.
      • Create all rules of length l before any of length l + 1.

  18. Bounding |G_x| / |x| for G_x ∈ G*(A)
      Exhausting all rules of expansion length ≤ l requires
        |x| ≥ Σ_{j=1}^{l} j |A|^j ≥ l |A|^{l+1} / (|A| − 1)².
      For G_x ∈ G*(A), rules look like A_i → A_{i′} α (i.e., each right member has length 2), so
        |G_x| ≤ Σ_{j=1}^{l} 2 |A|^j ≤ 2(|A|^{l+1} − 1) / (|A| − 1).
      Therefore
        |G_x| / |x| ≤ [2(|A|^{l+1} − 1) / (|A| − 1)] · [(|A| − 1)² / (l |A|^{l+1})] ≤ 2(|A| − 1) / l → 0.
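
A quick numeric check of this chain of bounds (worst_case_ratio is a hypothetical helper; it just evaluates the expressions above for a given alphabet size and l):

```python
def worst_case_ratio(alphabet_size, l):
    """With all rules of expansion length <= l exhausted,
    |x| >= l*|A|^(l+1)/(|A|-1)^2 while |G_x| <= 2*(|A|^(l+1)-1)/(|A|-1),
    so the ratio |G_x|/|x| is at most 2*(|A|-1)/l."""
    A = alphabet_size
    x_lower = l * A ** (l + 1) / (A - 1) ** 2
    g_upper = 2 * (A ** (l + 1) - 1) / (A - 1)
    return g_upper / x_lower, 2 * (A - 1) / l

for l in (1, 2, 5, 10, 50):
    print(l, worst_case_ratio(3, l))   # both numbers shrink like O(1/l)
```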

  19. Encoding Summary:
      • Grammar encoding takes ≤ |A| + 4|G| + ⌈H*(ω_G)⌉ bits.
      • H*(ω_G) ≈ H*(π) ≈ −log µ(x) + ν(|G_x| / |x|).
      • For G_x ∈ G*(A), |G_x| / |x| → 0.

  20. Conclusions
      • Grammar based codes provide a framework for building universal codes.
      • Many different parsings π = (π1, π2, ..., πt) yield H*(π) = O(−log µ(x) + t).
      • Irreducible grammars yield π with t ≤ |G_x| and |G_x| / |x| → 0, and also allow efficient encoding of π.

  21. Further Thoughts... Can grammar ideas be used in • universal lossy compression? • universal prediction/estimation? • error correction coding?
