How to Compress Hidden Markov Sources - PowerPoint PPT Presentation

SLIDE 1

How to Compress Hidden Markov Sources

Preetum Nakkiran

Harvard University. Joint work with: Venkatesan Guruswami, Madhu Sudan + Jarosław Błasiok, Atri Rudra

SLIDE 2

Compression

  • Problem: Given $n$ symbols from a probabilistic source, compress down to $< n$ symbols (ideally to the entropy of the source), such that decompression succeeds with high probability.

  • Sources: Usually iid. This talk: Hidden-Markov Model

[Figure: compressing bits from a two-state hidden Markov source. Each state persists with probability 0.9 and switches with probability 0.1; one state emits Bernoulli(0.5) bits, the other Bernoulli(0.1). The symbol alphabet can be arbitrary; the output length is ≈ the source entropy.]

SLIDE 3

Organization

  • 1. Goal: Compressing Symbols
  • What/why
  • 2. Polarization & Polar Codes (for iid sources)
  • 3. Polar codes for Markov Sources
SLIDE 4

Compression: Main Questions

For a source distribution on $(X_1, X_2, \dots, X_n)$: 1. How much can we compress?

  • [Shannon ‘48]: Down to the entropy $H(X_1, X_2, \dots, X_n)$ [non-explicit]

E.g. for iid Bernoulli(p): entropy $= nH(p)$. (See the numerical sketch after this list.)

2. Efficiency?

  • Efficiency of algorithms: polynomial-time compression/decompression
  • Efficiency of code: Quickly approach the entropy rate

$n$ symbols $\mapsto nH(p) + O(n^{1-\delta})$ symbols (for a constant $\delta > 0$)

  • vs. $n$ symbols $\mapsto nH(p) + o(n)$ symbols

The former achieves within $\varepsilon$ of the entropy rate ($n$ symbols $\mapsto n[H(p) + \varepsilon]$ symbols) already at blocklength $n \geq \mathrm{poly}(1/\varepsilon)$.

  • 3. Linearity?
  • Useful for channel coding (as we will see)
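As referenced above, here is a minimal Python sketch (not part of the slides) computing the binary entropy $H(p)$ and the ideal compressed length $nH(p)$ for iid Bernoulli(p) bits:

```python
import math

def binary_entropy(p: float) -> float:
    """H(p) = -p log2(p) - (1-p) log2(1-p), the entropy of one Bernoulli(p) bit."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

n, p = 10_000, 0.1
print(f"ideal compressed length for n={n} iid B({p}) bits: ~{n * binary_entropy(p):.0f} bits")
# -> ~4690 bits, versus the 10000 uncompressed
```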
SLIDE 5

Our Scheme: Compressing HMM sources

Compression/decompression algorithms which, given the HMM source, achieve:

  • 1. Poly-time compression/decompression
  • 2. Linear
  • 3. Rapidly approach entropy rate: For $X^n \coloneqq X_1, X_2, \dots, X_n$ from the source,

$n$ symbols $\mapsto H(X^n) + \mathrm{poly}(\lambda) \cdot n^{1-\delta}$ symbols (for a constant $\delta > 0$; $\lambda$ is the mixing time)

  • Previously unknown how to achieve all 3 above.

Non-explicit: $n \mapsto H(X^n) + o(n)$. [Lempel-Ziv]: $n \mapsto H(X^n) + o(n)$. Nonlinear. But, works for unknown HMM.

  • Our result: Enabled by Polar Codes

(for an HMM with mixing time $\lambda$)

SLIDE 6

Detour: Compression ⇒ Error-Correction

  • Given a source $\mathcal{D}$, the corresponding Additive Channel:

Alice sends $x \in \mathbb{F}_2^n$

Bob receives $y = x + e$, for $e = e_1, e_2, \dots, e_n \sim \mathcal{D}$

  • Linear compression scheme for $e \sim \mathcal{D}$ ⇒ linear error-correcting code for the $\mathcal{D}$-channel:
  • Let $P : \mathbb{F}_2^n \to \mathbb{F}_2^m$ (for $m < n$) be the compression matrix: $Pe$ can be decoded to $e$ whp when $e \sim \mathcal{D}$
  • Alice encodes into the nullspace of $P$: $x \in \mathrm{null}(P)$
  • Bob receives $y = x + e$
  • Bob computes $Py = Px + Pe = Pe$, and recovers the error $e$

Efficiency: compression which rapidly approaches entropy rate ⇒ code which rapidly approaches capacity

[Figure: Alice sends $x$; the channel adds $e \sim \mathcal{D}$; Bob receives $y = x \oplus e$ and applies $P$.]
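To make the reduction concrete, here is a minimal self-contained Python sketch (not from the talk). The compression matrix `P` and the brute-force ML decoder are toy stand-ins: a real instantiation would use the polar compression matrix and its polynomial-time successive-cancellation decoder.

```python
import itertools
import numpy as np

n, m, p = 8, 4, 0.1   # toy sizes; the real scheme uses a polar matrix at large n

# Stand-in compression matrix P over F_2: column j is the binary expansion of j+1,
# so all columns are distinct and nonzero (any single-bit error is decodable).
P = np.array([[(j >> i) & 1 for j in range(1, n + 1)] for i in range(m)])

def ml_decode(syndrome):
    """ML decoding of Bernoulli(p) noise, p < 1/2: the minimum-weight e with
    P e = syndrome. Brute force (exponential time), only to illustrate the
    reduction; polar codes achieve this in polynomial time."""
    candidates = (np.array(bits) for bits in itertools.product([0, 1], repeat=n))
    return min((e for e in candidates if np.array_equal(P @ e % 2, syndrome)),
               key=lambda e: e.sum())

# Alice encodes into nullspace(P): columns 1, 2, 3 of P XOR to zero, so this works.
x = np.array([1, 1, 1, 0, 0, 0, 0, 0])
assert not (P @ x % 2).any()

e = np.array([0, 0, 0, 1, 0, 0, 0, 0])   # a typical low-weight draw of B(p)^n noise
y = (x + e) % 2                           # Bob receives y = x + e

syndrome = P @ y % 2                      # = P e, since P x = 0
x_hat = (y + ml_decode(syndrome)) % 2     # recover e, then strip it off
print("decoded correctly:", np.array_equal(x_hat, x))   # -> True
```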
SLIDE 7

Application: Correcting Markovian Errors

  • Our result yields efficient error-correcting codes for Markovian errors.
  • E.g.: the channel has two states, “noisy” and “nice”, and transitions between them.

[Figure: the “noisy” channel is BSC(0.5) and the “nice” channel is BSC(0.1); each state persists with probability 0.9 and switches with probability 0.1.]

SLIDE 8

Remainder of this talk

  • Focus on compressing Hidden-Markov Sources
  • For simplicity, alphabet $= \mathbb{F}_2$

The plan:

  • 1. Polar codes for compressing iid Bernoulli(p) bits.
  • 2. Reduce HMM to iid case
SLIDE 9

Polar Codes

  • Linear compression / error-correcting codes
  • Introduced by [Arikan ‘08]; efficiency first analyzed in [Guruswami-Xia ’13], extended in [BGNRS ’18]

  • Efficiency: First error-correcting codes to ``achieve capacity at polynomial blocklengths’’: within $\varepsilon$ of capacity at blocklengths $n \geq \mathrm{poly}(1/\varepsilon)$

  • Simple, elegant, purely information-theoretic construction
SLIDE 10

Compression via Polarization

  • Goal: Compress $n$ iid Bernoulli(p) bits
  • Polarization ⇒ Compression:
  • Suppose we have an invertible transform $P$ such that, on input $X \sim B(p)^n$, the outputs $W = PX$ in some set $S$ have ≈ full entropy, while the rest satisfy $H(W_{\bar S} \mid W_S) \approx 0$
  • Compression: Output $W_S$.
  • Decompression: Since $H(W_{\bar S} \mid W_S) \approx 0$, can guess $W_{\bar S}$ whp, then invert $P$ to decompress. (A schematic appears below.)

[Figure: $P$ maps inputs $X \sim B(p)^n$ to $W$; set $S$: $H(W_S) \approx |S|$; set $\bar S$: $H(W_{\bar S} \mid W_S) \approx 0$.]
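Here is a schematic of this scheme in Python (a sketch, not the talk's implementation): `P` is a polarizing transform over $\mathbb{F}_2$, `S` the high-entropy index set, and `predict_low_entropy` a hypothetical MAP-guessing oracle for the low-entropy coordinates.

```python
import numpy as np

def compress(P, S, x):
    """Apply the polarizing transform and keep only the high-entropy outputs W_S."""
    return (P @ x % 2)[np.asarray(S)]

def decompress(P, S, w_S, predict_low_entropy):
    """Fill in the kept coordinates, guess each remaining coordinate W_i from its
    prefix (correct whp, since H(W_i | W_<i) ~ 0 off S), then invert P."""
    n = P.shape[0]
    w = np.zeros(n, dtype=int)
    w[np.asarray(S)] = w_S
    S_set = set(S)
    for i in range(n):
        if i not in S_set:
            w[i] = predict_low_entropy(i, w[:i])   # hypothetical MAP guess of W_i
    # The polar transform is its own inverse over F_2, so "inverting" P
    # is just another application of P.
    return P @ w % 2
```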

SLIDE 11

Polar Transform

  • The following 2×2 transform over $\mathbb{F}_2$ “polarizes” entropies: $P_2 : (X, Y) \mapsto (X + Y,\ Y)$

  • Consider $X, Y$ iid B(p), for $p \in (0, 1)$
  • $P_2$ invertible ⟹ $H(X, Y) = H(X + Y, Y) = H(X + Y) + H(Y \mid X + Y)$
  • $H(X + Y) > H(X)$
  • Thus, $H(Y \mid X + Y) < H(Y)$
  • Now recurse! (A numerical check follows.)

[Figure: $H(X + Y) > H(X) = H(Y) > H(Y \mid X + Y)$; the two entropies move apart while their sum is conserved.]
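A quick numerical check of these claims (a sketch, not from the slides): for $X, Y$ iid B(p), the sum $X + Y$ is Bernoulli($2p(1-p)$), and the chain rule gives $H(Y \mid X+Y) = 2H(p) - H(2p(1-p))$:

```python
import math

def h(p):
    """Binary entropy of a Bernoulli(p) bit."""
    return 0.0 if p in (0, 1) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

p = 0.11
H_plus = h(2 * p * (1 - p))       # H(X+Y): X+Y ~ Bernoulli(2p(1-p)), moves up
H_minus = 2 * h(p) - H_plus       # H(Y | X+Y) = H(X,Y) - H(X+Y), moves down
print(f"H(X) = {h(p):.3f}, H(X+Y) = {H_plus:.3f}, H(Y|X+Y) = {H_minus:.3f}")
# -> H(X) = 0.500, H(X+Y) = 0.713, H(Y|X+Y) = 0.286: the entropies split,
#    while H(X+Y) + H(Y|X+Y) = 2 H(X) is conserved.
```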

SLIDE 12

Polar Transform

[Figure: one level of recursion: $P_2$ applied pairwise maps $X_1, \dots, X_4$ (iid B(p), $p \in (0, 1)$) to $W_1, \dots, W_4$; the conditional entropies $H(W_1), H(W_2 \mid W_1), \dots$ begin to spread around $H(X_i)$.]

SLIDE 13


Polar Transform

[Figure: two levels of recursion: $X_1, \dots, X_4 \mapsto W_1, \dots, W_4 \mapsto Z_1, Z_2$, with $X_i$ iid B(p), $p \in (0, 1)$; entropies shown at $t = 0, 1, 2$.]

SLIDE 14

Polar Transform

[Figure: the full two-level transform $X_1, \dots, X_4 \mapsto W_1, \dots, W_4 \mapsto Z_1, \dots, Z_4$, with $X_i$ iid B(p), $p \in (0, 1)$; the entropy spread grows at $t = 2$.]

SLIDE 15

Polar Transform

  • Consider the conditional entropies $H(Z_i \mid Z_{<i})$. Hope: most of these entropies are eventually close to 0 or 1.

[Figure: the same two-level transform, with $X_i$ iid B(p), $p \in (0, 1)$; conditional entropies at $t = 0, 1, 2$.]

SLIDE 16

Polar Transform

  • In general, the recursion is equivalent to: $P_n \coloneqq P_2^{\otimes t}$, the $t$-fold Kronecker power of the 2×2 transform, acting on $n = 2^t$ bits. (A code sketch follows.)

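A minimal sketch of this matrix in code (not from the talk). The dense Kronecker power is for illustration only; real encoders apply the butterfly network in $O(n \log n)$ time.

```python
import numpy as np
from functools import reduce

P2 = np.array([[1, 1],
               [0, 1]])   # the 2x2 transform (X, Y) -> (X + Y, Y)

def polar_transform_matrix(t):
    """P_n = the t-fold Kronecker power of P_2, acting on n = 2^t bits (mod 2)."""
    return reduce(np.kron, [P2] * t, np.array([[1]])) % 2

P4 = polar_transform_matrix(2)
w = P4 @ np.array([1, 0, 1, 1]) % 2
assert np.array_equal(P4 @ w % 2, [1, 0, 1, 1])   # P_n is its own inverse over F_2
print(P4)
```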
SLIDE 17

Analysis: Arikan Martingale

  • Let $H_t$ be the entropy of a uniformly random wire at level $t$, conditioned on the wires above it: $H_t = H(W_i \mid W_{<i})$
  • $H_t$ forms a martingale:
  • $\mathbb{E}[H_{t+1} \mid H_t] = H_t$, because entropy is conserved

[Figure: the 4-bit butterfly $X_1, \dots, X_4 \mapsto W_1, \dots, W_4 \mapsto Z_1, \dots, Z_4$, and the entropy spread at $t = 0, 1, 2$.]
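To see the Arikan martingale polarize numerically, here is a small Monte Carlo sketch (not from the talk). For simplicity it uses the clean closed-form update of the erasure special case, $h \mapsto h^2$ or $h \mapsto 2h - h^2$ with probability 1/2 each; for general sources the update rule differs, but the qualitative behavior is the same:

```python
import random

def polarize(h, steps):
    """Follow one uniformly random wire down `steps` levels of the recursion.
    The expectation is preserved (0.5*(2h - h^2) + 0.5*h^2 = h), so the wire
    entropy is a martingale, yet almost every path ends near 0 or 1."""
    for _ in range(steps):
        h = 2 * h - h * h if random.random() < 0.5 else h * h
    return h

random.seed(1)
samples = [polarize(0.5, 16) for _ in range(10_000)]
unpolarized = sum(1e-6 < h < 1 - 1e-6 for h in samples) / len(samples)
print(f"fraction of wires still unpolarized after 16 levels: {unpolarized:.3f}")
print(f"average entropy (martingale property, stays ~0.5): {sum(samples) / len(samples):.3f}")
```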

SLIDE 18

Analysis: Arikan Martingale

We want fast convergence: to achieve $\varepsilon$-closeness to the entropy rate efficiently, i.e. with blocklength $n = 2^t = \mathrm{poly}(1/\varepsilon)$, we need the martingale to be within $n^{-\beta}$ of 0 or 1 for all but an $\varepsilon$-fraction of wires after only $t = O(\log 1/\varepsilon)$ steps.

[Figure: the 4-bit butterfly $X_1, \dots, X_4 \mapsto W_1, \dots, W_4 \mapsto Z_1, \dots, Z_4$, and the entropy spread at $t = 0, 1, 2$.]

SLIDE 19

Martingale Convergence

  • NOT every [0, 1] martingale converges to {0, 1}
  • E.g.:
  • $X_{t+1} = X_t \pm 2^{-t}$
  • $\lim_{t \to \infty} X_t$ converges to Uniform[0, 1]
  • Will introduce sufficient local conditions for fast convergence: ``Local Polarization” (a small simulation of the counterexample follows)

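A tiny simulation of this counterexample (a sketch, not from the slides). Using offsets $2^{-(t+2)}$ keeps the walk inside $[0, 1]$; each step fixes one binary digit, so the limit is exactly Uniform[0, 1] and the martingale never polarizes:

```python
import random

def limit_sample(steps=40):
    """Martingale X_{t+1} = X_t +/- 2^{-(t+2)}, started at 1/2. It is bounded
    in [0, 1] and E[X_{t+1} | X_t] = X_t, yet the limit is Uniform[0, 1]:
    step t just chooses the (t+1)-st binary digit."""
    x = 0.5
    for t in range(steps):
        x += random.choice([-1, 1]) * 2.0 ** -(t + 2)
    return x

random.seed(0)
print(sorted(round(limit_sample(), 4) for _ in range(5)))   # ~ five uniform draws
```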

SLIDE 20

Local Polarization

Recall, we want to show fast convergence of the martingale to {0, 1}.

Properties of the Martingale:

  • 1. Variance in the Middle: while $X_t$ is bounded away from 0 and 1, the conditional variance $\mathrm{Var}[X_{t+1} \mid X_t]$ is bounded below by a constant.
  • 2. Suction at the Ends: once $X_t$ is sufficiently close to 0, it shrinks by a large constant factor with probability at least 1/2; and symmetrically for the upper end.

(easy to show these properties)

SLIDE 21

Results of Polarization

  • So far: After $t = O(\log 1/\varepsilon)$ steps of polarization, the resulting polar code of blocklength $n = 2^t = \mathrm{poly}(1/\varepsilon)$ has a set $T$ of indices s.t.:
  • $\forall i \in T:\ H(W_i \mid W_{<i}) \approx 0$
  • $|S|/n \leq H(p) + \varepsilon$, where $S = [n] \setminus T$
  • Compression: Output $W_S$
  • Decompression: Guess $W_T$ given $W_S$ (ML decoding)

[Figure: $P$ maps $B(p)$ inputs to $W$; set $S$: $H(W_S) \approx |S|$; set $T$: $H(W_T \mid W_S) \approx 0$.]

SLIDE 22

Polar Codes

Theorem: For every distribution $\mathcal{D}$ over $(X, Y)$, where $X \in \mathbb{F}_2$: let $X = X_1, X_2, \dots, X_n$ and $Y = Y_1, Y_2, \dots, Y_n$, where $(X_i, Y_i) \sim \mathcal{D}$ iid. Then the entropies of $Z \coloneqq P_n(X)$ are polarized:

$\forall \delta$: if $n \geq \mathrm{poly}(1/\delta)$, then all but a $\delta$-fraction of indices $i \in [n]$ have entropies $H(Z_i \mid Z_{<i}, Y) \notin (n^{-\beta}, 1 - n^{-\beta})$.

[Figure: $P$ takes inputs $X_1, \dots, X_n$ with auxiliary info $Y_1, \dots, Y_n$; all $H(Z_i \mid Z_{<i}, Y) \approx 0$ or $1$, except for a $\delta$-fraction of bad indices.]
SLIDE 23

Compressing Hidden Markov Sources

  • $X_1, X_2, \dots, X_n$ are the outputs of a Hidden-Markov Model
  • Not independent: lots of dependencies between neighboring symbols
  • Goal: Want to compress to within $H(X^n) + \varepsilon n$
  • First glance: everything breaks!
  • Polar code analysis (the martingale) relied on the inputs being independent and identically distributed
  • But, a simple construction works…


SLIDE 24

Compression Construction

  • $X_1, X_2, \dots, X_n$: outputs of a stationary HMM
  • Mixing time $\ll n$
  • Break the input into $\sqrt{n}$ blocks of $\sqrt{n}$ symbols each (see the sketch after this list)
  • Polarize the 1st symbols of each block.
  • These are approx. independent!
  • Then polarize the 2nd symbols,
  • conditioned on the 1st symbols.
  • The joint distribution of all {(block, position)} symbols is approx. independent across blocks
  • Output the last $\varepsilon$-fraction of each block in the clear

[Figure: $X_1, X_2, \dots, X_n$ split into blocks feeding parallel copies of $P$; an $\varepsilon$-fraction of each block is output in the clear, along with the high-entropy set.]
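A schematic of this block construction in Python (a sketch under the reading above; `polar_compress` is a hypothetical black box for the iid-case polar compressor):

```python
import numpy as np

def compress_hmm(x, eps, polar_compress):
    """Arrange the n HMM outputs into sqrt(n) blocks of sqrt(n) symbols each.
    For each position j, polarize the j-th symbols of all blocks (approximately
    independent once the block length exceeds the mixing time), conditioned on
    the already-processed positions < j. The last eps-fraction of positions of
    each block is output in the clear."""
    b = int(np.sqrt(len(x)))
    blocks = np.asarray(x[: b * b]).reshape(b, b)   # row i = block i
    cutoff = int((1 - eps) * b)
    compressed = [polar_compress(blocks[:, j], blocks[:, :j]) for j in range(cutoff)]
    compressed.append(blocks[:, cutoff:])           # tail of each block, in the clear
    return compressed
```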

SLIDE 25

Example

  • HMM: Marginally, each $X_i$ is a uniform bit
  • P1: its inputs have full entropy
  • P2: its inputs have lower entropy, conditioned on P1

[Figure: a two-state HMM (stay probability 0.9, switch probability 0.1) emitting B(0.9) and B(0.1) bits feeds P1, P2, P3, …; P1 outputs its entire set, while P2 outputs a smaller set.]
SLIDE 26

Decompression

Polar-decoder Black Box:

Input:

  • Product distribution on inputs
  • Setting of high-entropy polarization outputs

Output:

  • Estimate of input

Markov decoding: 1. Decompress the P1 outputs. 2. Compute the distribution of the P2 inputs, conditioned on P1. 3. Decompress the P2 outputs. 4. … (see the sketch below)

[Figure: the blocks $X_1, X_2, \dots, X_n$ pass through P1, P2, P3, …; decoded symbols fill in progressively.]
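A schematic of this sequential decoder (a sketch; `polar_decode` and `next_conditional` are hypothetical stand-ins for the black box above and for the HMM conditioning step):

```python
def decompress_hmm(compressed_columns, polar_decode, next_conditional):
    """Markov decoding, column by column: decode the P1 inputs from their kept
    high-entropy outputs, condition the HMM on what was just decoded, decode
    P2, and so on. Each call sees a product distribution over its inputs,
    which is exactly what the polar black box requires."""
    decoded, dist = [], None      # None: start from the stationary marginals
    for c in compressed_columns:
        column = polar_decode(c, dist)        # estimate this column's inputs
        decoded.append(column)
        dist = next_conditional(decoded)      # product distribution for the next column
    return decoded
```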

SLIDE 27

Decompression: Extras

Note: we could have done this with any black-box compression scheme for independent, non-identically distributed symbols. But: non-linear (and messy).

  • A linear compression black-box for every fixed distribution on symbols ⇏ overall linear compression. Polar codes are particularly suited for this.

[Figure: the blocks $X_1, X_2, \dots, X_n$ compressed by separate compressors C, C′, C″, …]