
How to Compress Hidden Markov Sources - PowerPoint PPT Presentation



  1. How to Compress Hidden Markov Sources. Preetum Nakkiran, Harvard University. Joint works with: Venkatesan Guruswami, Madhu Sudan + Jarosław Błasiok, Atri Rudra

  2. Compression
  • Problem: Given n symbols from a probabilistic source, compress down to < n symbols (ideally down to the "entropy" of the source), such that decompression succeeds with high probability.
  • Sources: Usually iid. This talk: Hidden-Markov Model sources.
  [Figure: an iid source B(p), B(p), …, B(p), and a two-state HMM emitting B(0.1) and B(0.5) bits with transition probabilities 0.1 / 0.9; n output bits are compressed down to ≈ nH symbols.]
  (Symbol alphabet can be arbitrary.)
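
Not from the slides, just a minimal sketch of the "entropy of the source" target: for n iid Bernoulli(p) bits, the Shannon limit is n times the binary entropy h(p).

```python
import math

def binary_entropy(p: float) -> float:
    """Entropy h(p) of a Bernoulli(p) bit, in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

n, p = 10_000, 0.1
ideal = n * binary_entropy(p)   # Shannon limit: about 4690 bits instead of 10000
print(f"n = {n}, p = {p}: ideal compressed size ~ {ideal:.0f} bits")
```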

  3. Organization
  1. Goal: Compressing symbols (what / why)
  2. Polarization & polar codes (for iid sources)
  3. Polar codes for Markov sources

  4. Compression: Main Questions
  For a source distribution on (X_1, X_2, …, X_n):
  1. How much can we compress?
     • [Shannon '48]: Down to the entropy H(X_1, X_2, … X_n) [non-explicit]. E.g. for iid Bernoulli(p): entropy = nH(p).
  2. Efficiency?
     • Efficiency of algorithms: poly-time compression/decompression.
     • Efficiency of code: quickly approach the entropy rate: n symbols ↦ nH(p) + n^{1-δ} symbols, vs. n symbols ↦ nH(p) + o(n) symbols. I.e., get within ε of the entropy rate (n symbols ↦ n[H(p) + ε]) at blocklength n ≥ poly(1/ε).
  3. Linearity?
     • Useful for channel coding (as we will see).

  5. Our Scheme: Compressing HMM Sources
  Compression/decompression algorithms which, given the HMM source, achieve:
  1. Poly-time compression/decompression
  2. Linear compression
  3. Rapidly approach the entropy rate: for X^n ≔ X_1, X_2, … X_n from the source, n symbols ↦ H(X^n) + τ^{O(1)} · n^{1-δ} symbols (for an HMM with mixing time τ)
  • Previously unknown how to achieve all 3 above. Non-explicit: n ↦ H(X^n) + o(n). [Lempel-Ziv]: n ↦ H(X^n) + o(n); nonlinear, but works for an unknown HMM.
  • Our result: enabled by polar codes

  6. Detour: Compression ⇒ Error-Correction
  • Given a source Z, the corresponding additive channel: Alice sends x ∈ F_2^n, Bob receives y = x + z, for z = z_1, z_2, … z_n ∼ Z.
  • A linear compression scheme for z ∼ Z ⇒ a linear error-correcting code for the Z-channel:
     • Let P : F_2^n → F_2^m be the compression matrix, so that Pz can be decoded back to z w.h.p. when z ∼ Z.
     • Alice encodes her message into nullspace(P): x ∈ null(P).
     • Bob receives y = x + z.
     • Bob computes Py = Px + Pz = Pz, and recovers the error z w.h.p. (hence x = y + z).
  • Efficiency: compression which rapidly approaches the entropy rate ⇒ a code which rapidly approaches capacity.
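
A toy sketch of this reduction (not from the talk; the parameters, the iid noise, and the brute-force decoder are illustrative simplifications). The point it demonstrates: Bob's syndrome Py = Pz depends only on the noise, so "decompressing" the noise from its syndrome recovers the codeword.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, m, p = 12, 8, 0.1                         # toy sizes; real codes use m ~ n*h(p), large n

P = rng.integers(0, 2, size=(m, n))          # compression matrix over F_2
# Codewords live in the nullspace of P (found by brute force at this toy size).
codewords = [np.array(c) for c in itertools.product([0, 1], repeat=n)
             if not (P @ np.array(c) % 2).any()]

x = codewords[rng.integers(len(codewords))]  # Alice's codeword
z = (rng.random(n) < p).astype(int)          # noise (iid Bernoulli(p) here, for simplicity)
y = (x + z) % 2                              # Bob's received word

s = P @ y % 2                                # syndrome: Py = Px + Pz = Pz
# "Decompress" the noise: the most likely z consistent with the syndrome
# (minimum weight, since p < 1/2) -- brute force stands in for a real decoder.
z_hat = np.array(min((e for e in itertools.product([0, 1], repeat=n)
                      if not ((P @ np.array(e) - s) % 2).any()),
                     key=sum))
x_hat = (y + z_hat) % 2
print("decoded correctly:", np.array_equal(x_hat, x))
```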

  7. Application: Correcting Markovian Errors
  • Our result yields efficient error-correcting codes for Markovian errors.
  • E.g.: the channel has two states, a "noisy" channel and a "nice" channel, and transitions between them.
  [Figure: two-state Markov channel alternating between a nice BSC(0.1) state and a noisy BSC(0.5) state; it stays in its current state with probability 0.9 and switches with probability 0.1.]

  8. Remainder of this talk
  • Focus on compressing Hidden-Markov sources
  • For simplicity, alphabet = F_2
  The plan:
  1. Polar codes for compressing iid Bernoulli(p) bits.
  2. Reduce the HMM case to the iid case.

  9. Polar Codes
  • Linear compression / error-correcting codes
  • Introduced by [Arikan '08]; efficiency first analyzed in [Guruswami-Xia '13], extended in [BGNRS '18]
  • Efficiency: first error-correcting codes to "achieve capacity at polynomial blocklengths": within ε of capacity at blocklengths n ≥ poly(1/ε)
  • Simple, elegant, purely information-theoretic construction

  10. Compression via Polarization
  • Goal: Compress n iid Bernoulli(p) bits.
  • Polarization ⇒ Compression: Suppose we have an invertible transform P such that, on input X_1, …, X_n, the first block of outputs (set S) has ≈ full entropy, H(Y_S) ≈ |S|, while the rest (set T) has H(Y_T | Y_S) ≈ 0.
  • Compression: Output Y_S.
  • Decompression: Since H(Y_T | Y_S) ≈ 0, can guess Y_T w.h.p., then invert P to decompress.
  [Figure: n B(p) bits fed through the transform P; output set S with H(Y_S) ≈ |S| bits, output set T with H(Y_T | Y_S) ≈ 0.]

  11. Polar Transform
  • The following 2x2 transform over F_2 "polarizes" entropies: (X, Y) ↦ (X + Y, Y).
  • Consider X, Y iid B(p), for p ∈ (0, 1), so H(X) = H(Y):
     • The transform is invertible ⟹ H(X, Y) = H(X + Y, Y) = H(X + Y) + H(Y | X + Y)
     • H(X + Y) > H(X)
     • Thus H(Y | X + Y) < H(Y)
  • Now recurse!
  [Figure: the 2x2 butterfly mapping (X, Y) to (X + Y, Y); entropy plot at t = 0, 1 showing H(X + Y) above H(X) = H(Y) and H(Y | X + Y) below.]
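
A quick numeric check of this step (a sketch, not from the slides): for X, Y iid Bernoulli(p), X + Y (mod 2) is Bernoulli(2p(1-p)), so both new entropies have closed forms and can be compared against h(p).

```python
import math

def h(q: float) -> float:
    """Binary entropy in bits."""
    return 0.0 if q in (0.0, 1.0) else -q * math.log2(q) - (1 - q) * math.log2(1 - q)

p = 0.1
H_X = h(p)
H_sum = h(2 * p * (1 - p))      # H(X+Y): X+Y mod 2 is Bernoulli(2p(1-p))
H_cond = 2 * H_X - H_sum        # H(Y | X+Y) = H(X,Y) - H(X+Y), by the chain rule

print(f"H(X)       = {H_X:.4f}")
print(f"H(X+Y)     = {H_sum:.4f}  (> H(X))")
print(f"H(Y | X+Y) = {H_cond:.4f}  (< H(Y))")
assert H_cond < H_X < H_sum     # strict polarization, for p not in {0, 1/2, 1}
```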

  12. Polar Transform
  Consider W_1, …, W_4 iid B(p), for p ∈ (0, 1).
  [Figure: one level of 2x2 transforms maps the wires (W_1, …, W_4) at t = 0 to (X_1, …, X_4) at t = 1; the plot of conditional entropies splits into a value above and a value below H(W_i).]

  13. Polar Transform
  Consider W_1, …, W_4 iid B(p), for p ∈ (0, 1).
  [Figure: the second level of 2x2 transforms begins: wires (X_1, …, X_4) at t = 1 feed into wires (Z_1, Z_2, …) at t = 2.]

  14. Polar Transform
  Consider W_1, …, W_4 iid B(p), for p ∈ (0, 1).
  [Figure: both levels complete: the wires (W_1, …, W_4) at t = 0 map through (X_1, …, X_4) at t = 1 to (Z_1, …, Z_4) at t = 2.]

  15. Polar Transform
  Consider W_1, …, W_4 iid B(p), for p ∈ (0, 1), and consider the conditional entropies H(Z_i | Z_{<i}).
  Hope: most of these entropies are eventually close to 0 or 1.
  [Figure: the full two-level butterfly from (W_1, …, W_4) to (Z_1, …, Z_4), with the plot of conditional entropies spreading toward 0 and 1 as t grows.]

  16. Polar Transform
  • In general, the recursion applies the 2x2 transform to pairs of wires and then recurses on each half.
  • Equivalent to: P_n ≝ P_2^{⊗t}, the t-fold tensor power of the 2x2 transform (blocklength n = 2^t).
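
A small sketch of this tensor-power construction (an illustration, not the talk's code; the bit-reversal permutation used in standard presentations of polar codes is omitted): build P_n = P_2^{⊗t} over F_2 with Kronecker products and apply it to a block of bits.

```python
import numpy as np

P2 = np.array([[1, 1],
               [0, 1]])                     # the 2x2 kernel: (x, y) -> (x + y, y) over F_2

def polar_matrix(t: int) -> np.ndarray:
    """t-fold Kronecker power of the 2x2 kernel (blocklength n = 2^t)."""
    P = np.array([[1]])
    for _ in range(t):
        P = np.kron(P, P2)
    return P

t = 3
n = 2 ** t
P = polar_matrix(t)

rng = np.random.default_rng(1)
x = (rng.random(n) < 0.1).astype(int)       # n iid Bernoulli(0.1) bits
w = P @ x % 2                               # the transformed block, over F_2

# Over F_2 the kernel is an involution (P2 @ P2 = I mod 2), so P undoes itself:
assert np.array_equal(P @ w % 2, x)
print("x =", x, "\nw =", w)
```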

  17. Analysis: Arikan Martingale
  • Let I_t be the entropy of a uniformly random wire at level t, conditioned on the wires above it: I_t = H(Z^(t)_J | Z^(t)_{<J}) for a uniformly random index J.
  • (I_t)_t forms a martingale: E[I_{t+1} | I_t] = I_t, because entropy is conserved by the invertible 2x2 transform.
  [Figure: the two-level butterfly from (W_1, …, W_4) to (Z_1, …, Z_4), with the plot of conditional entropies at t = 0, 1, 2.]
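
The martingale is easiest to see in a standard special case (an illustration, not the talk's general analysis): for a binary erasure source with erasure probability h, one polarization step sends the wire's conditional entropy h to either 2h - h^2 or h^2, each with probability 1/2, and the average is exactly h. Simulating this recursion shows the values drifting toward 0 or 1 while the mean stays put.

```python
import random

random.seed(0)

def polarize_step(h: float) -> float:
    """One step of the Arikan martingale in the erasure special case:
    h goes to 2h - h^2 (the X+Y wire) or h^2 (the Y-given-X+Y wire), each w.p. 1/2."""
    return 2 * h - h * h if random.random() < 0.5 else h * h

def run(h0: float, steps: int) -> float:
    h = h0
    for _ in range(steps):
        h = polarize_step(h)
    return h

h0, steps, trials = 0.469, 20, 50_000      # h0 is just an arbitrary starting entropy
finals = [run(h0, steps) for _ in range(trials)]

frac_low = sum(f < 1e-3 for f in finals) / trials
frac_high = sum(f > 1 - 1e-3 for f in finals) / trials
print(f"empirical mean of I_t:  {sum(finals) / trials:.3f}  (stays ~ {h0})")
print(f"fraction near 0: {frac_low:.3f},  fraction near 1: {frac_high:.3f}")
```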

  18. Analysis: Arikan Martingale
  We want fast convergence: to get ε-close to the entropy rate efficiently, i.e. with blocklength n = 2^t = poly(1/ε), we need the martingale to polarize quickly: all but an ε-fraction of wires should have conditional entropy outside (1/n^c, 1 − 1/n^c).
  [Figure: the butterfly diagram and the entropy plot at t = 0, 1, 2.]

  19. Martingale Convergence
  • NOT every [0, 1] martingale converges to 0 or 1. E.g.: I_{t+1} = I_t ± 2^{-t} (each sign with probability 1/2); lim_{t→∞} I_t is distributed like Uniform[0, 1], not concentrated on {0, 1}.
  • Will introduce sufficient local conditions for fast convergence: "Local Polarization".
  [Figure: the entropy plot at t = 0, 1, 2.]
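
A quick simulation of that counterexample (illustrative; the steps are scaled as ±2^{-(t+2)} starting from 1/2 so that the process stays inside [0, 1]): the limit of this fair bet is exactly a uniform random point of [0, 1], so it almost never ends up near 0 or 1.

```python
import random

random.seed(0)

def lazy_martingale(steps: int) -> float:
    """Martingale with shrinking steps: x_{t+1} = x_t +/- 2^{-(t+2)}.
    Its limit is Uniform[0, 1] (the signs are the bits of a binary expansion)."""
    x = 0.5
    for t in range(steps):
        x += random.choice([-1, 1]) * 2.0 ** -(t + 2)
    return x

finals = [lazy_martingale(30) for _ in range(50_000)]
near_ends = sum(f < 0.01 or f > 0.99 for f in finals) / len(finals)
print(f"fraction near 0 or 1: {near_ends:.3f}  (~0.02 for a Uniform[0,1] limit)")
```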

  20. Local Polarization
  Properties of the martingale (roughly):
  1. Variance in the middle: whenever I_t is bounded away from 0 and 1, say I_t ∈ (τ, 1 − τ), the next step has non-negligible variance: Var[I_{t+1} | I_t] ≥ σ(τ) > 0.
  2. Suction at the ends: when I_t is close to 0, then with probability ≥ 1/2 it drops by a large factor (I_{t+1} ≤ I_t / K); and symmetrically for the upper end.
  Recall, we want to show fast convergence of I_t to {0, 1}. (It is easy to show these properties for the Arikan martingale.)
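
A numeric sanity check of both conditions in the erasure special case used above (an illustration under that assumption, not a proof): the step h ↦ {2h - h^2, h^2} has variance bounded below when h is in the middle, and the h^2 branch shrinks h by an arbitrarily large factor once h is small.

```python
def children(h: float) -> tuple[float, float]:
    """The two equally likely next values of the erasure-case martingale."""
    return 2 * h - h * h, h * h     # (higher child, lower child)

# 1. Variance in the middle: the children sit at distance h(1-h) on either side
#    of h, so Var[I_{t+1} | I_t = h] = (h(1-h))^2 is bounded below on (tau, 1-tau).
for h in (0.1, 0.3, 0.5, 0.7, 0.9):
    up, down = children(h)
    var = ((up - h) ** 2 + (down - h) ** 2) / 2
    print(f"h = {h:.1f}: children = ({down:.3f}, {up:.3f}), Var = {var:.4f}")

# 2. Suction at the lower end: once h <= 1/K, the h^2 branch (probability 1/2)
#    satisfies h^2 <= h/K, i.e. a factor-K drop; symmetrically, 1-h squares at the top.
K = 100
h = 1 / K
print(f"h = {h}: h^2 = {h * h} <= h/K = {h / K}")
```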

  21. Results of Polarization
  • So far: after t = O(log 1/ε) steps of polarization, the resulting polar code of blocklength n = 2^t = poly(1/ε) has a set T of indices s.t.:
     • ∀i ∈ T: H(Y_i | Y_{<i}) ≈ 0
     • |S|/n ≤ H(p) + ε for the complementary set S, i.e. |T|/n ≥ 1 − H(p) − ε
  • Compression: Output Y_S.
  • Decompression: Guess Y_T given Y_S (ML decoding).
  [Figure: n B(p) bits fed through P; set S with H(Y_S) ≈ |S|, set T with H(Y_T | Y_S) ≈ 0.]
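
An end-to-end toy version of this compressor (illustrative assumptions throughout: a tiny blocklength, the predictable set T chosen empirically by Monte Carlo rather than analytically, and empirically trained per-prefix predictors standing in for the exact conditional probabilities used by successive-cancellation decoding; at this size the compression is mild and decompression only succeeds with high probability):

```python
import numpy as np

rng = np.random.default_rng(0)
p, t = 0.1, 3
n = 2 ** t                               # tiny blocklength, so polarization is only partial

# Polar transform P = P2^{(x)t} over F_2, as in the earlier sketch.
P2 = np.array([[1, 1], [0, 1]])
P = np.array([[1]])
for _ in range(t):
    P = np.kron(P, P2)

def transform(x):
    return P @ x % 2

# "Training": find empirically which outputs Y_i are predictable from the prefix Y_{<i}.
samples = np.array([transform((rng.random(n) < p).astype(int)) for _ in range(20_000)])
predictors, accuracy = [], []
for i in range(n):
    counts = {}
    for row in samples:
        c = counts.setdefault(tuple(row[:i]), [0, 0])
        c[row[i]] += 1
    pred = {ctx: int(c[1] > c[0]) for ctx, c in counts.items()}   # majority vote per prefix
    predictors.append(pred)
    accuracy.append(np.mean([pred[tuple(r[:i])] == r[i] for r in samples]))

T = [i for i in range(n) if accuracy[i] > 0.97]   # nearly-deterministic positions
S = [i for i in range(n) if i not in T]
print(f"keep |S| = {len(S)} of n = {n} outputs (entropy lower bound ~ {n * 0.469:.1f} bits)")

# Compression: keep Y_S only.  Decompression: successively guess Y_T from the prefix.
def compress(x):
    return transform(x)[S]

def decompress(y_s):
    y = np.zeros(n, dtype=int)
    kept = iter(y_s)
    for i in range(n):
        y[i] = predictors[i].get(tuple(y[:i]), 0) if i in T else next(kept)
    return transform(y)                  # P is its own inverse over F_2

trials, ok = 500, 0
for _ in range(trials):
    x = (rng.random(n) < p).astype(int)
    ok += np.array_equal(decompress(compress(x)), x)
print(f"round-trip success rate: {ok / trials:.1%}")
```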

  22. Polar Codes
  Theorem: For every distribution π over (X, A), where X ∈ F_2 and A is auxiliary info: let X = X_1, X_2, … X_n and A = A_1, A_2, … A_n, where (X_i, A_i) ∼ π iid. Then the entropies of Z ≔ P_n(X) are polarized:
  ∀ε: if n ≥ poly(1/ε), then all but an ε-fraction of indices i ∈ [n] have entropies H(Z_i | Z_{<i}, A) ∉ (n^{-c}, 1 − n^{-c}).
  [Figure: input pairs (X_1, A_1), …, (X_n, A_n); the X's pass through P, and all H(Z_i | Z_{<i}, A) are ≈ 0 or 1 except for an ε-fraction of bad indices.]

  23. Compressing Hidden Markov Sources
  • X_1, X_2, … X_n are outputs of a Hidden-Markov Model
  • Not independent: lots of dependencies between neighboring symbols
  • Goal: want to compress to within H(X^n) + εn
  • First glance: everything breaks! The polar-code analysis (the martingale) relied on the input being independent and identically distributed.
  • But, a simple construction works…
  [Figure: the hidden chain emitting X_1, X_2, …, X_n.]

  24. Compression Construction
  • X_1, X_2, … X_n: outputs of a stationary HMM with mixing time ≪ block length
  • Break the input into blocks, and compress column by column (a sketch follows below):
     • Polarize the 1st symbols of each block; across blocks, these are approx. independent!
     • Then polarize the 2nd symbols, conditioned on the 1st: the joint distribution of the (1st, 2nd) pairs is approx. independent across blocks
     • …
     • Output the last δ-fraction of each block in the clear
  [Figure: the stream X_1, …, X_n split into blocks; the transform P is applied to each column of symbols, outputting the high-entropy set.]
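
A rough sketch of the index bookkeeping in this construction (the square-root block count, the helper names, and the `compress_column` interface are illustrative assumptions, not the talk's exact parameters; the default column "compressor" is the identity, standing in for the conditional polar compressor):

```python
import numpy as np

def compress_hmm_sketch(x, delta=0.1, compress_column=lambda col, context: col):
    """Arrange the HMM output into blocks and process it column by column."""
    n = len(x)
    b = int(np.sqrt(n))                  # number of blocks (illustrative choice);
    k = n // b                           # block length k must be >> the HMM mixing time
    grid = np.asarray(x[:b * k]).reshape(b, k)   # row j of `grid` = block j

    out = []
    clear_from = int((1 - delta) * k)
    for j in range(clear_from):
        # Column j = the j-th symbol of every block.  Its entries come from different
        # blocks, ~k >> mixing-time positions apart in the stream, so given the earlier
        # columns they are approximately independent -- the iid black box applies.
        out.append(compress_column(grid[:, j], context=grid[:, :j]))
    for j in range(clear_from, k):
        out.append(grid[:, j])           # last delta-fraction of each block: in the clear
    return out

# Example call on a random bit stream, with the identity placeholder "compressor":
bits = (np.random.default_rng(0).random(100) < 0.5).astype(int)
pieces = compress_hmm_sketch(bits)
print(len(pieces), "columns, each of length", len(pieces[0]))
```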

  25. Example
  • HMM: marginally, each X_i is a uniform bit
  • P1 (the 1st symbols of the blocks): the inputs have full entropy, so the entire set P1 is output
  • P2: the inputs have lower entropy conditioned on P1, so a smaller set is output
  • P3: …
  [Figure: a two-state HMM emitting B(0.1) and B(0.9) bits with transition probabilities 0.9 / 0.1 (marginally B(0.5)); the columns P1, P2, P3, … of the block construction.]

  26. Decompression
  Polar-decoder black box:
  • Input: a product distribution on the inputs, and the setting of the high-entropy polarization outputs
  • Output: an estimate of the input
  Markov decoding:
  1. Decompress the P1 outputs
  2. Compute the distribution of the P2 inputs, conditioned on P1
  3. Decompress the P2 outputs
  4. …
  [Figure: decoded bits filling in the columns P1, P2, P3, … of the blocks.]
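
The key step here is computing the product distribution that the black box needs, via the HMM's forward recursion. A small self-contained illustration (the two-state HMM below is an assumed example, not necessarily the talk's), showing how the distribution of a block's next symbol is updated from the symbols already decoded in that block:

```python
import numpy as np

# An assumed two-state HMM: state 0 emits Bernoulli(0.1), state 1 emits Bernoulli(0.9);
# each state persists with probability 0.9 and switches with probability 0.1.
T = np.array([[0.9, 0.1],
              [0.1, 0.9]])       # T[s, s'] = P(next state s' | state s)
E = np.array([[0.9, 0.1],
              [0.1, 0.9]])       # E[s, b]  = P(emit bit b | state s)
pi = np.array([0.5, 0.5])        # stationary initial state distribution

def forward_step(belief, observed_bit):
    """One HMM forward step: condition the state belief on a decoded bit,
    then propagate it through the transition matrix."""
    belief = belief * E[:, observed_bit]
    belief = belief / belief.sum()
    return belief @ T

def next_bit_distribution(decoded_bits):
    """P(next bit | bits decoded so far in this block) -- the per-block
    input distribution the iid-column black box needs for the next column."""
    belief = pi
    for b in decoded_bits:
        belief = forward_step(belief, b)
    return belief @ E            # distribution over the next emitted bit

print(next_bit_distribution([]))       # marginal of the first bit: [0.5, 0.5]
print(next_bit_distribution([1]))      # after decoding a 1, a 1 is more likely next
print(next_bit_distribution([1, 1]))   # and even more likely after two 1s
```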

  27. Decompression: Extras
  • Note: could have done this with any black-box compression scheme for independent, non-identically distributed symbols.
  • But: non-linear (and messy). A linear compression black box for every fixed distribution on symbols ⇏ overall linear compression.
  • Polar codes are particularly suited for this.
  [Figure: the stream X_1, X_2, …, X_n with per-column compressors C, C′, C″, ….]
