SLIDE 1

Sparse Regression Codes

Information and Control in Networks

October 18, 2012

Sekhar Tatikonda (Yale University) in collaboration with Ramji Venkataramanan (University of Cambridge) Antony Joseph (UC-Berkeley) Tuhin Sarkar (IIT-Bombay)

slide-2
SLIDE 2

Outline

Summary

  • Lossy coding is a fundamental component of networked control
  • Efficient codes for lossy compression of Gaussian sources
  • Based on sparse regression
  • Background
  • Sparse Regression Codes
  • Optimal Encoding
  • Practical Encoding
  • Multi-terminal Extensions
  • Conclusions
slide-3
SLIDE 3

Gaussian Data Compression

Source S = (S_1, . . . , S_n), codebook of 2^{nR} codewords (R bits/sample), reconstruction Ŝ = (Ŝ_1, . . . , Ŝ_n)

S: i.i.d. Gaussian source N(0, σ²)

MSE distortion: (1/n)‖S − Ŝ‖² ≤ D

Achievable iff R > R*(D) = (1/2) log(σ²/D)

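As a quick numerical check of the rate-distortion formula above, here is a minimal sketch; the σ² and D values are arbitrary illustrations, not numbers from the talk:

```python
import numpy as np

# Minimum rate (bits/sample) to compress an i.i.d. N(0, sigma^2) source to MSE D.
def rate_distortion(sigma2, D):
    return 0.5 * np.log2(sigma2 / D)

print(rate_distortion(sigma2=1.0, D=0.25))   # 1.0 bit/sample
```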
SLIDE 4

Achieving R∗(D)

Shannon random coding

  • Codewords Ŝ(1), . . . , Ŝ(2^{nR}), each i.i.d. N(0, σ² − D)
  • Exponential storage & encoding complexity

Lattice codes - compact representation

  • Conway-Sloane, Eyuboglu-Forney, Zamir-Shamai-Erez, . . .

GOAL: Compact representation + fast encoding & decoding

SLIDE 5

Related Work

Sparse regression codes for source coding

  • [Kontoyiannis, Rad, Gitzenis ITW ’10]
  • Computationally feasible constructions for finite-alphabet sources:
  • Gupta, Verdu, Weissman [ISIT ’08]
  • Jalali, Weissman [ISIT ’10]
  • Kontoyiannis, Gioran [ITW ’10]
  • LDGM codes: [Wainwright, Maneva, Martinian ’10]
  • Polar codes: [Korada, Urbanke ’10]

SLIDE 6

In this talk . . .

Ensemble of codes based on sparse linear regression

  • For point-to-point & multi-terminal problems

Provably achieve rates close to info-theoretic limits

  • with fast encoding + decoding

Based on construction of Barron & Joseph for AWGN channel

  • Achieve capacity with fast decoding [ISIT ’10, Arxiv ’12]

SLIDE 7

Outline

  • Background
  • Sparse Regression Codes
  • Optimal Encoding
  • Practical Encoding
  • Multi-terminal Extensions
  • Conclusions
SLIDE 8

Sparse Regression Codes (SPARC)

A: n × ML design matrix or ‘dictionary’ with i.i.d. N(0, 1) entries, partitioned into L sections of M columns each (Section 1, Section 2, . . . , Section L)

β: sparse ML × 1 vector with exactly one non-zero entry per section, each equal to c

Codewords of the form Aβ

  • c² = codeword variance / L

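As a concrete toy illustration of this structure, the sketch below builds a small dictionary and forms one codeword Aβ; the parameter values (n, M, L), the seed, and the unit codeword variance are arbitrary choices for illustration, not values from the talk:

```python
import numpy as np

# Toy SPARC codebook: dictionary A with L sections of M columns, one scaled
# column selected per section, codeword = A @ beta.
n, M, L = 64, 8, 4                    # illustrative sizes
R = L * np.log2(M) / n                # rate in bits/sample: M^L codewords
c = np.sqrt(1.0 / L)                  # non-zero value; codeword variance L*c^2 = 1 here

rng = np.random.default_rng(0)
A = rng.standard_normal((n, M * L))   # i.i.d. N(0, 1) entries

def make_beta(cols):
    """cols[l] in {0, ..., M-1}: chosen column within section l."""
    beta = np.zeros(M * L)
    for l, j in enumerate(cols):
        beta[l * M + j] = c           # exactly one non-zero entry per section
    return beta

codeword = A @ make_beta([3, 0, 7, 5])   # one of the M^L codewords
```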
SLIDE 9

SPARC Construction

[Figure: SPARC dictionary A (n rows) with L sections of M columns each, as on the previous slide]

Choosing M and L

For a rate-R codebook, need M^L = 2^{nR}

Shannon codebook: L = 1, M = 2^{nR}

We choose M = L^b ⇒ L ∼ Θ(n/log n)

Size of A ∼ n × (n/log n)^{b+1}: polynomial in n

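A hedged numeric sketch of this parameter choice, with arbitrary illustrative values of n, R and b: solve M^L = 2^{nR} with M = L^b and report the resulting dictionary size.

```python
import numpy as np

# Solve L * b * log2(L) = n * R for the smallest integer L, then set M = L^b.
n, R, b = 1000, 0.5, 2.0              # illustrative values only
L = 2
while L * b * np.log2(L) < n * R:
    L += 1
M = round(L ** b)
print(f"L = {L}, M = {M}, entries in A = {n * M * L:,}")   # polynomial in n
```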
SLIDE 10

Minimum Distance Encoding

[Figure: SPARC dictionary A with L sections of M columns each]

Encoder: find β̂ = argmin_β ‖S − Aβ‖

Decoder: reconstruct Ŝ = Aβ̂

P_n = P( (1/n)‖S − Ŝ‖² > D )

  • Error exponent: T = − lim sup_n (1/n) log P_n ⇒ P_n ≲ e^{−nT}

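To make the encoding rule concrete, here is a hedged brute-force sketch of minimum-distance encoding on toy parameters; the exhaustive search over all M^L column choices is exactly the exponential cost that motivates the practical encoder later in the talk (names and sizes are illustrative):

```python
import itertools
import numpy as np

# Brute-force minimum-distance SPARC encoding (exponential in L; toy sizes only).
n, M, L = 32, 4, 3
c = np.sqrt(1.0 / L)
rng = np.random.default_rng(1)
A = rng.standard_normal((n, M * L))
S = rng.standard_normal(n)                      # source sequence ~ N(0, 1)

def codeword(cols):
    """Sum of one scaled column taken from each section."""
    return c * sum(A[:, l * M + j] for l, j in enumerate(cols))

best = min(itertools.product(range(M), repeat=L),        # all M^L codewords
           key=lambda cols: np.sum((S - codeword(cols)) ** 2))
S_hat = codeword(best)
distortion = np.mean((S - S_hat) ** 2)          # (1/n) ||S - S_hat||^2
```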
SLIDE 11

Outline

  • Background
  • Sparse Regression Codes
  • Optimal Encoding
  • Practical Encoding
  • Multi-terminal Extensions
  • Conclusions
SLIDE 12

Correlated Codewords

Each codeword is a sum of L columns; codewords Ŝ(i), Ŝ(j) are dependent if they share common columns.

[Figure: dictionary A (n rows) with L sections of M = L^b columns each]

# codewords dependent with Ŝ(i) = M^L − 1 − (M − 1)^L

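A quick, hedged sanity check of that count on toy values (the reference codeword and the sizes are arbitrary):

```python
from itertools import product

# Count codewords sharing at least one chosen column with a fixed reference codeword.
M, L = 4, 3
ref = (0,) * L                                  # reference: column 0 in every section
dependent = sum(1 for cols in product(range(M), repeat=L)
                if cols != ref and any(c == r for c, r in zip(cols, ref)))
assert dependent == M**L - 1 - (M - 1)**L       # 63 - 27 = 36
```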
SLIDE 13

Error Analysis for SPARC

P(E) ≤ P(|S|² ≥ a²) + P(E | |S|² < a²)

  • First term: exponent given by a KL divergence
  • Second term: ?

Define U_i(S) = 1 if |Ŝ(i) − S|² < D, and 0 otherwise

P(E(S) | |S|² < a²) = P( Σ_{i=1}^{2^{nR}} U_i(S) = 0 | |S|² < a² )

  • {U_i(S)} are dependent

SLIDE 14

Dependency Graph

For random variables {U_i}, i ∈ I: any graph with vertex set I such that, if A and B are two disjoint subsets of I with no edges between them, then the families {U_i}, i ∈ A, and {U_i}, i ∈ B, are independent.

SLIDE 15

For our problem . . .

U_i(S) = 1 if |Ŝ(i) − S|² < D, and 0 otherwise,   i = 1, . . . , 2^{nR}

For the family {U_i(S)}, the relation {i ∼ j : i ≠ j and Ŝ(i), Ŝ(j) share at least one common term} defines a dependency graph.

SLIDE 16

Suen’s correlation inequality

Let {U_i}, i ∈ I, be Bernoulli random variables with dependency graph Γ. Then

P( Σ_{i∈I} U_i = 0 ) ≤ exp( − min( λ/2, λ²/(8Δ), λ/(6δ) ) )

where

λ = Σ_{i∈I} E[U_i],   Δ = (1/2) Σ_{i∈I} Σ_{j∼i} E[U_i U_j],   δ = max_{i∈I} Σ_{k∼i} E[U_k].

SLIDE 17

Optimal Error Exponent for Gaussian Source

[Figure: spheres |S|² = σ² and |S|² = a²]

R = (1/2) log(a²/D)   [Ihara, Kubo ’00]

2^{nR} codewords, i.i.d. N(0, a² − D)

P_n ≲ P(|S|² ≥ a²) + P(|S|² < a²) · P( error | |S|² < a² )

  • First term ∼ exp(−n D(a²‖σ²))
  • Second term ↓ double-exponentially

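For intuition about the first exponent, the sketch below evaluates the divergence term, reading D(a²‖σ²) as the KL divergence between N(0, a²) and N(0, σ²), i.e. ½(a²/σ² − 1 − ln(a²/σ²)); this reading and the numeric values are assumptions for illustration, not stated on the slide:

```python
import numpy as np

# Exponent governing P(|S|^2 >= a^2), read as KL( N(0, a^2) || N(0, sigma^2) ).
def kl_exponent(a2, sigma2):
    r = a2 / sigma2
    return 0.5 * (r - 1.0 - np.log(r))

print(kl_exponent(a2=1.3, sigma2=1.0))          # illustrative values
```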
SLIDE 18

Main Result

[Figure: dictionary A (n rows) with L sections of M = L^b columns each]

Theorem

SPARCs with minimum-distance encoding achieve the rate-distortion function with the optimal error exponent when b > 3.5R / (R − (1 − 2^{−2R})). This is possible whenever D/σ² < 0.203.

Codebook representation polynomial in n: n × (n/log n)^{b+1} elements

SLIDE 19

Performance: Min-distance Encoding

[Plot: Rate (bits) vs. D/σ²; curves 0.5 log(σ²/D) and 1 − D/σ²]

SLIDE 20

Outline

  • Background
  • Sparse Regression Codes
  • Optimal Encoding
  • Practical Encoding
  • Multi-terminal Extensions
  • Conclusions
SLIDE 21

SPARC Construction

[Figure: dictionary A (n rows, ML columns) with L sections of M columns each; β has one non-zero entry c_ℓ in section ℓ]

Choosing M and L: for a rate-R codebook, need M^L = 2^{nR}

Choose M polynomial in n ⇒ L ∼ n/log n

Storage complexity ↔ size of A: polynomial in n

SLIDE 22

A Simple Encoding Algorithm

[Figure: Section 1 of the dictionary]

Step 1: Choose the column in Section 1 that minimizes ‖X − c_1 A_j‖²

  • Equivalently, max among inner products ⟨X, A_j⟩
  • ‘Residue’ R_1 = X − c_1 Â_1

SLIDE 23

A Simple Encoding Algorithm

[Figure: Section 2 of the dictionary]

Step 2: Choose the column in Section 2 that minimizes ‖R_1 − c_2 A_j‖²

  • Max among inner products ⟨R_1, A_j⟩
  • Residue R_2 = R_1 − c_2 Â_2

SLIDE 24

A Simple Encoding Algorithm

[Figure: Section L of the dictionary]

Step L: Choose the column in Section L that minimizes ‖R_{L−1} − c_L A_j‖²

  • Max among inner products ⟨R_{L−1}, A_j⟩
  • Final residue R_L = R_{L−1} − c_L Â_L

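Putting the L steps together, here is a hedged sketch of the successive encoder; the parameters are toy values and the per-section coefficients c_ℓ are taken equal for simplicity, whereas the talk lets them vary by section:

```python
import numpy as np

# Greedy successive SPARC encoding: in each section, pick the column with the
# largest inner product with the current residue, then update the residue.
n, M, L = 64, 16, 8                       # toy sizes
c = np.full(L, np.sqrt(1.0 / L))          # section coefficients (equal here)
rng = np.random.default_rng(2)
A = rng.standard_normal((n, M * L))
X = rng.standard_normal(n)                # source sequence to compress

residue = X.copy()
chosen = []
for l in range(L):
    section = A[:, l * M:(l + 1) * M]              # the M columns of section l
    j = int(np.argmax(section.T @ residue))        # max <residue, A_j> over the section
    chosen.append(j)
    residue -= c[l] * section[:, j]                # R_l = R_{l-1} - c_l * A_hat_l

X_hat = X - residue                       # sum of the chosen scaled columns
distortion = np.mean(residue ** 2)        # (1/n) ||X - X_hat||^2
```

Encoding costs ML inner products and comparisons in total, which matches the complexity claim on the next slide.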
SLIDE 25

Performance

Theorem (RV, Sarkar, Tatikonda ’12)

The proposed encoding algorithm approaches the rate-distortion function with exponentially small probability of error. In particular,

P( Distortion > σ²e^{−2R} + ∆ ) ≤ e^{−L∆}   for ∆ ≥ 1/log M.

Computation Complexity

ML inner products and comparisons ⇒ polynomial in n

Storage Complexity

Design matrix A: n × ML ⇒ polynomial in n

SLIDE 26

Outline

  • Background
  • Sparse Regression Codes
  • Optimal Encoding
  • Practical Encoding
  • Multi-terminal Extensions
  • Conclusions
SLIDE 27

Point-to-point Communication

[Diagram: M → Encoder → X → + (Noise) → Z → Decoder → M̂]

Z = X + Noise,   ‖X‖²/n ≤ P,   Noise ∼ Normal(0, N)

SPARCs

Provably good with low-complexity decoding

  • [Barron-Joseph, ISIT ’10, ’11, arXiv ’12]

SLIDE 28

SPARC Construction

[Figure: dictionary A (n rows, ML columns) with L sections of M columns each; β has one non-zero entry c_ℓ per section]

β ↔ message, codeword Aβ

For a rate-R codebook, need M^L = 2^{nR}

  • choose M polynomial in n ⇒ L ∼ n/log n

Adaptive successive decoding achieves R < Capacity

SLIDE 29

Wyner-Ziv coding

[Diagram: X → Encoder → rate R → Decoder → X̂, with side information Y at the decoder]

Side-info Y = X + Z,   X ∼ N(0, σ²),   Z ∼ N(0, N)

SLIDE 30

Wyner-Ziv coding

Auxiliary codebook: 2^{nR₁} codewords for U

Encoder

U = X + V,   V ∼ N(0, Q). Quantize X to U

  • Find the U that minimizes ‖X − aU‖²,   a = σ²/(σ² + Q)

SLIDE 31

Wyner-Ziv coding

Group the 2^{nR₁} U-codewords into 2^{nR} bins

Encoder

Quantize X to U as before (minimize ‖X − aU‖², a = σ²/(σ² + Q)); send the index of the bin containing U

SLIDE 32

Wyner-Ziv coding

Decoder

Y = X + Z  ↔  Y = aU + Z′

Find the U within the received bin that minimizes ‖Y − aU‖²

  • Reconstruct X̂ = E[X | U, Y]

SLIDE 33

Binning with SPARCs

[Figure: SPARC dictionary A with L sections of M columns each]

Quantize X to aU using an n × ML SPARC (rate R₁)

SLIDE 34

Binning with SPARCs

[Figure: each section of M columns is split into subsections of M′ columns]

Quantize X to aU using an n × ML SPARC (rate R₁), with (M/M′)^L = 2^{nR}

SLIDE 35

Binning with SPARCs

[Figure: each section of M columns is split into subsections of M′ columns]

Quantize X to aU using an n × ML SPARC (rate R₁), with (M/M′)^L = 2^{nR}

Bin: defined by one subsection from each section

  • Encoder only sends the indices of the non-zero subsections

Decoder decodes Y to U within the smaller n × M′L SPARC defined by the bin

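To make the binning concrete, here is a hedged sketch of the index bookkeeping; splitting each section into contiguous subsections of M′ columns is an illustrative choice, since the talk does not fix a particular indexing:

```python
# Each section of M columns is split into M // M_sub subsections of M_sub columns;
# a bin fixes one subsection per section, so there are (M // M_sub)**L bins.
M, M_sub, L = 16, 4, 3                     # illustrative sizes (M_sub plays the role of M')

def bin_index(columns):
    """Bin of a codeword = which subsection its chosen column lies in, per section."""
    return tuple(j // M_sub for j in columns)

def columns_in_bin(bin_idx, section):
    """Columns of `section` the decoder searches within this bin."""
    start = bin_idx[section] * M_sub
    return range(start, start + M_sub)

cols = (5, 14, 2)                          # chosen column per section (rate-R1 codeword)
print(bin_index(cols))                     # (1, 3, 0): what the encoder transmits
print(list(columns_in_bin(bin_index(cols), section=0)))   # [4, 5, 6, 7]
```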
SLIDE 36

Writing on Dirty Paper

[Diagram: M → Encoder → X → + (interference S, Noise) → Z → Decoder → M̂; S known at the encoder]

Z = X + S + N,   ‖X‖²/n ≤ P

SLIDE 37

Writing on Dirty Paper

Z = X + S + N,   ‖X‖²/n ≤ P

[Figure: SPARC dictionary A with L sections of M columns each]

Encoder

n × ML SPARC of rate R₁; divide each section into subsections of M′ columns

  • Defines (M/M′)^L = 2^{nR} bins

SLIDE 39

Writing on Dirty Paper

Z = X + S + N,   ‖X‖²/n ≤ P

[Figure: SPARC dictionary with each section split into subsections of M′ columns]

Encoder

Within the message bin, ‘quantize’ S to U:  U = X + αS,   U ∼ N(0, P + α²σ²_s)

SLIDE 40

Writing on Dirty Paper

Z = X + S + N,   ‖X‖²/n ≤ P

[Figure: SPARC dictionary with each section split into subsections of M′ columns]

Decoder

Z = X + S + N  ↔  Z = (1 + κ)U + N′

Decode U from Z within the big (rate R₁) codebook

SLIDE 41

Main Result

[Diagrams: the Wyner-Ziv setup (X → Encoder → rate R → Decoder → X̂, with side information Y) and the Gelfand-Pinsker setup (M → Encoder → X → + (S, Noise) → Y → Decoder → M̂)]

Theorem

SPARCs attain the optimal information-theoretic limits for the Gaussian Wyner-Ziv and Gelfand-Pinsker problems with exponentially decaying probability of error.

SLIDE 42

Other multi-terminal networks

Multiple-access

[Diagram: X₁, X₂, X₃ → + (Noise) → Z → Decoder → M̂₁, M̂₂, M̂₃]

Broadcast

[Diagram: X → three noisy channels → Z₁, Z₂, Z₃ → M̂₁, M̂₂, M̂₃]

SLIDE 43

Outline

  • Background
  • Sparse Regression Codes
  • Optimal Encoding
  • Practical Encoding
  • Multi-terminal Extensions
  • Conclusions
SLIDE 44

Summary

Sparse Regression Codes

  • Rate-optimal codes for compression and communication
  • Low-complexity coding algorithms
  • Nice structure that enables

  • Binning (Wyner-Ziv, Gelfand-Pinsker)
  • Superposition (Multiple-access, Broadcast)

Future Directions

  • Interference channels, multiple descriptions, . . .
  • Improved coding algorithms - ℓ₁ minimization etc.?
  • General design matrices
  • Finite-field analogs?
