Data-Driven Ensembles for Deep and Hard-Decision Hybrid Decoding


SLIDE 1

Introduction

Data-Driven Ensembles for Deep and Hard-Decision Hybrid Decoding

International Symposium on Information Theory
Tomer Raviv, Nir Raviv, Yair Be’ery
School of Electrical Engineering, Tel Aviv University, Israel

June 2020

SLIDE 2

Outline

I. Background
II. On the Importance of Data
III. Data-Driven Ensembles
IV. Summary

SLIDE 4

Error Correction Codes

Error correction codes increase the reliability of transmission by adding redundancy.
Assumptions: linear codes, AWGN channel, Galois field GF(2).
Encoding: a message u of k bits is mapped to a codeword c = uG of n bits, where G is the generator matrix.
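
As an illustration (mine, not from the talk), encoding over GF(2) with a toy generator matrix; the specific (7,4) code below is only an example:

import numpy as np

# Toy (7,4) Hamming code in systematic form G = [I | P]; any binary linear code works the same way.
G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 1, 0, 1],
              [0, 0, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]], dtype=np.uint8)

u = np.array([1, 0, 1, 1], dtype=np.uint8)  # k = 4 message bits
c = u @ G % 2                               # n = 7 codeword bits, arithmetic over GF(2)
print(c)                                    # the last n - k = 3 bits are the added redundancy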

SLIDE 5

Simplified Communication System

SLIDE 6

Parity Check Matrix

A property of linear codes, known as the parity check matrix.
Size: (n − k) × n for an (n, k) code.
Each row corresponds to a check constraint; each column corresponds to a variable.
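
Continuing the toy example (again an illustration, not from the slides): for the (7,4) code above, H has size (n − k) × n = 3 × 7, and a valid codeword gives the all-zero syndrome:

import numpy as np

# Parity check matrix of the toy (7,4) code: 3 rows = check constraints, 7 columns = variables.
H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]], dtype=np.uint8)

c = np.array([1, 0, 1, 1, 0, 1, 0], dtype=np.uint8)  # codeword from the encoding example
syndrome = H @ c % 2                                  # zero syndrome <=> every check is satisfied
print(syndrome)                                       # [0 0 0]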

SLIDE 7

Tanner Graph

[1] Urbanke and Richardson, “Modern Coding Theory”, 2008
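
The Tanner graph of the toy code can be read directly off H (a small illustration of the bipartite structure, not the talk’s example):

import numpy as np

H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]], dtype=np.uint8)  # same toy H as above

# Check node i is connected to variable node j whenever H[i, j] = 1.
edges = [(i, j) for i in range(H.shape[0]) for j in range(H.shape[1]) if H[i, j]]
print(edges)  # bipartite edge list: (check node, variable node) pairs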

SLIDE 8

Belief Propagation

[2] Judea Pearl, “Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference”, 1988
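
For reference (not reproduced from the slides), the standard sum-product (BP) message updates on the Tanner graph, with channel LLRs \ell_v = \log\frac{p(y_v \mid c_v = 0)}{p(y_v \mid c_v = 1)}, are:

m_{v \to c} = \ell_v + \sum_{c' \in N(v) \setminus \{c\}} m_{c' \to v},
\qquad
m_{c \to v} = 2\tanh^{-1}\!\Big( \prod_{v' \in N(c) \setminus \{v\}} \tanh\big( m_{v' \to c} / 2 \big) \Big),

with output LLR o_v = \ell_v + \sum_{c \in N(v)} m_{c \to v}.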

SLIDE 16

Belief Propagation

[2] Judea Pearl, “Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference”, 1988

Hard decision rule:
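
With the LLR convention above, the standard hard decision on the output LLR o_v is:

\hat{c}_v = \begin{cases} 0, & o_v \ge 0 \\ 1, & o_v < 0 \end{cases}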

SLIDE 17

Suboptimal Performance

Cycles in the Tanner graph degrade the performance of BP… How can we compensate for them?

SLIDE 18

Suboptimal Performance

Cycles in the Tanner graph degrade the performance of BP… How can we compensate for them? Deep learning!

SLIDE 19

Weighted Belief Propagation

Deep learning approach:
  • Assign weights over the edges
  • Choose an appropriate loss
  • Apply SGD

Intuition: adjusting the weights compensates for small cycles.

[3] Nachmani et al., “Learning to Decode Linear Codes Using Deep Learning”, 2016
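
The exact parameterization is given in [3]; as a rough sketch of the idea (notation mine), a learnable weight is attached to the channel LLR and to every incoming edge in the variable-to-check update at iteration i:

m_{v \to c}^{(i)} = w_v^{(i)} \ell_v + \sum_{c' \in N(v) \setminus \{c\}} w_{c',v}^{(i)} \, m_{c' \to v}^{(i-1)}

so that SGD can learn to down-weight messages arriving along short cycles.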

SLIDE 20

WBP

[3] Nachmani et al., “Learning to Decode Linear Codes Using Deep Learning”, 2016

SLIDE 21

WBP

Binary cross-entropy multiloss.
The BP decoder and the channel exhibit symmetry, so training is done with the zero codeword only.

[3] Nachmani et al., “Learning to Decode Linear Codes Using Deep Learning”, 2016
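
A hedged reconstruction of the multiloss, in the spirit of [3] (notation mine): with o_v^{(i)} the estimated probability of bit v after BP iteration i, c_v the transmitted bit, N the code length and I the number of iterations,

\mathcal{L} = -\frac{1}{N} \sum_{i=1}^{I} \sum_{v=1}^{N} \Big[ c_v \log o_v^{(i)} + (1 - c_v) \log\big(1 - o_v^{(i)}\big) \Big]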

SLIDE 22

Outline

I. Background
II. On the Importance of Data
III. Data-Driven Ensembles
IV. Summary

SLIDE 23

ML Model Selection in Decoding

  • Incorporating domain knowledge into the model
  • Constraints-free model

SLIDE 24

ML Model Selection in Decoding

Incorporating domain knowledge into the model:
  • Choice of hypothesis class
  • Learnable parameters assignment

SLIDE 25

ML Model Selection in Decoding

Constraints-free model:
  • Choice of any state-of-the-art architecture
  • Fewer limitations on the solution space

[4] Tobias Gruber et al. “On deep learning-based channel decoding”. In: 51st Annual Conference on Information Sciences and Systems (CISS). 2017.

SLIDE 26

ML Model Selection in Decoding

Currently, the model-based approach dominates, since constraints-free models suffer from the curse of dimensionality.
Yet, the model isn’t everything…

SLIDE 27

Data Importance in Deep Learning

Training data is at the core of DL, yet its role is not fully understood.

SLIDE 28

Data Distinction

Classical ML:
  • Limited amount of data
  • Distribution is unknown

Decoding ML:
  • Unlimited amount of data
  • Distribution is specified by the channel

SLIDE 29

Outline

I. Background
II. On the Importance of Data
III. Data-Driven Ensembles
IV. Summary

SLIDE 30

Ensembles

A single decoder is limited. A combination of decoders is superior: the divide-and-conquer principle.

[5] Lior Rokach, “Pattern Classification using Ensemble Methods”, 2010

SLIDE 31

Ensembles for Decoding

List decoding is suboptimal: “there exists no clear evidence on which graph permutation performs best for a given input” [6].

[6] Elkelesh et al., “Belief Propagation List Decoding of Polar Codes”, 2018

SLIDE 32

Ensembles for Decoding

Key points: decoding performance and complexity.

[6] Elkelesh et al., “Belief Propagation List Decoding of Polar Codes”, 2018

SLIDE 33

Data-Driven Ensemble

SLIDE 34

Data Specialized Decoders

Based on Hamming distance:
  • Error pattern computation
  • Region assignment
  • Match to dataset and train on it
Can also be based on the syndrome (won’t have time to cover)…
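
A minimal sketch of one possible reading of this partition (the region boundaries and helper names are mine, not the paper’s):

import numpy as np

def assign_region(tx_codeword, channel_llrs, boundaries=(2, 4)):
    # Hard-decide the channel output, compute the error pattern w.r.t. the
    # transmitted codeword, and map its Hamming weight to a region index.
    # `boundaries` are hypothetical region edges, not values from the paper.
    hard = (channel_llrs < 0).astype(np.uint8)   # negative LLR -> bit 1
    error_pattern = hard ^ tx_codeword           # bits flipped by the channel
    weight = int(error_pattern.sum())            # Hamming distance to the codeword
    for region, edge in enumerate(boundaries):
        if weight <= edge:
            return region
    return len(boundaries)

# Each training sample is matched to the dataset of its region, and one
# specialized (weighted BP) decoder is trained per region.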

SLIDE 35

Gating Function

A classical hard-decision decoder (HDD) is run first; its outputs are used to calculate the gating decision.
Gating types:
  • single-choice gating - selects a single decoder
  • all-decoders gating - runs all decoders
  • random-choice gating - selects a decoder randomly
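
A minimal single-choice-gating sketch under my reading of the slide; `hdd`, the region mapping and `specialized_decoders` are placeholders, not the paper’s exact interface:

import numpy as np

def single_choice_gating(channel_llrs, hdd, specialized_decoders):
    # Run the classical hard-decision decoder, measure how far its output is
    # from the hard-decided channel word, and let that distance pick one decoder.
    hard = (channel_llrs < 0).astype(np.uint8)            # hard decision on the channel output
    hdd_word = hdd(hard)                                  # classical HDD (e.g., an algebraic decoder)
    distance = int((hard ^ hdd_word).sum())               # Hamming distance, a proxy for the noise level
    region = min(distance, len(specialized_decoders) - 1) # hypothetical region mapping
    return specialized_decoders[region](channel_llrs)     # run only the selected decoder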

SLIDE 36

Combiner

Likelihood-maximizing function [7]:
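
For a memoryless channel with LLRs \ell_v = \log\frac{p(y_v \mid 0)}{p(y_v \mid 1)}, selecting the most likely word among the ensemble’s candidate outputs \{\hat{c}^{(1)},\dots,\hat{c}^{(M)}\} reduces to the standard rule (the slide’s exact expression follows [7]):

\hat{c} = \arg\max_{j} \, p\big(y \mid \hat{c}^{(j)}\big) = \arg\min_{j} \sum_{v} \hat{c}^{(j)}_v \, \ell_v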

[7] Ralf Koetter et al., “Characterizations of pseudo-codewords of (low-density) parity check codes”, 2007

SLIDE 37

Experimental Results

Results for the CR-BCH(63,36) code:
  • Waterfall region - 0.3 dB FER gain
  • Error-floor region - 1.25 dB FER gain
Compared to the best previous results [8]; ensemble of 3 decoders.

[8] Ishay Be’ery et al., "Active Deep Decoding of Linear Codes," in IEEE Transactions on Communications, Feb. 2020.

SLIDE 38

Experimental Results

Results for the CR-BCH(63,45) code:
  • Waterfall region - 0.3 dB FER gain
  • Error-floor region - 1 dB FER gain
Compared to the best previous results [8]; ensemble of 3 decoders.

[8] Ishay Be’ery et al., "Active Deep Decoding of Linear Codes," in IEEE Transactions on Communications, Feb. 2020.

SLIDE 39

Outline

I. Background
II. On the Importance of Data
III. Data-Driven Ensembles
IV. Summary

SLIDE 40

Summary

  • Domain knowledge is vital for ML in communication
  • Data is as important as the algorithm (!)
  • How to utilize data effectively and efficiently is still insufficiently researched

SLIDE 41

Bibliography

[1] Urbanke and Richardson, “Modern Coding Theory”, 2008.
[2] Judea Pearl, “Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference”, 1988.
[3] Nachmani et al., “Learning to Decode Linear Codes Using Deep Learning”, 2016.
[4] Tobias Gruber et al., “On Deep Learning-Based Channel Decoding”, 51st Annual Conference on Information Sciences and Systems (CISS), 2017.
[5] Lior Rokach, “Pattern Classification Using Ensemble Methods”, 2010.
[6] Elkelesh et al., “Belief Propagation List Decoding of Polar Codes”, 2018.
[7] Ralf Koetter et al., “Characterizations of Pseudo-Codewords of (Low-Density) Parity Check Codes”, 2007.
[8] Ishay Be’ery et al., “Active Deep Decoding of Linear Codes”, IEEE Transactions on Communications, Feb. 2020.