Jet Propulsion Laboratory
California Institute of Technology CCSDS, Toulouse, Nov., 2004
JPL Proprietary Material
Support Material for Presentation of Orange Book on LDPC Code Selection for CCSDS Standard
CCSDS, Toulouse, Nov. 15, 2004
Degree measure
codewords
number of iterations required.
Protograph of ARA Family
[Figure: family of protographs; code rate = (n+1)/(n+2), n = 0, 1, ...; edge label 2n]
This simple seed protograph, replicated enough times to obtain the large code, yields a much more structured code, suitable for high speed decoding.
Threshold table (near-capacity)
Code Rate   Protograph Threshold (dB)   Capacity (dB)   Difference (dB)
1/2         0.516                       0.187           0.329
2/3         1.288                       1.059           0.229
4/5         2.277                       2.040           0.237
7/8         3.129                       2.845           0.284
Code Family (Code rates and block lengths)
Information       Code block length n
block length k    rate 1/2    rate 2/3    rate 4/5
1024              2048        1536        1280
4096              8192        6144        5120
16384             32768       24576       20480
Fast encoder structure
[Figure: encoder block diagram. Labels: input message, output codeword, sparse circulant G matrices, sparse matrix multiplies, accumulate, permute, puncture, D, α, Π1, Π2, Π3, Π4+Π5, Π6+Π7, s0, s1, p0, p1, p2]
ARAx2_4c_64c parity check matrix, with structure indicated
– Protograph is 3 rows by 5 columns
– Expanded 2 times (by hand) to eliminate parallel edges
– Expanded 4 times with circulants to introduce necessary irregularity
– Expanded 64 times with circulants to construct full code
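The circulant expansion steps listed above can be sketched in a few lines. This is a minimal illustration only: the base matrix and shift values are made up, the names `circulant` and `lift` are hypothetical, and nothing here reproduces the actual ARAx2_4c_64c code. It assumes parallel edges have already been removed, so the base matrix holds only 0s and 1s.

```python
def circulant(size, shift):
    """size x size permutation matrix: row r has its single 1 in column (r + shift) mod size."""
    return [[1 if (r + shift) % size == c else 0 for c in range(size)]
            for r in range(size)]

def lift(base, shifts, size):
    """Replace each 1 in `base` by a circulant permutation and each 0 by a zero block."""
    rows, cols = len(base), len(base[0])
    H = [[0] * (cols * size) for _ in range(rows * size)]
    for i in range(rows):
        for j in range(cols):
            if base[i][j]:
                block = circulant(size, shifts[i][j])
                for r in range(size):
                    for c in range(size):
                        H[i * size + r][j * size + c] = block[r][c]
    return H

# Hypothetical 3 x 5 base matrix (NOT the ARA protograph) with arbitrary shifts.
base = [[1, 1, 0, 1, 0],
        [0, 1, 1, 0, 1],
        [1, 0, 1, 1, 1]]
shifts = [[0, 1, 0, 2, 0],
          [0, 3, 1, 0, 2],
          [1, 0, 2, 3, 0]]
H = lift(base, shifts, 4)

# Size bookkeeping for the chain of expansions on the slide: 2 * 4 * 64 copies.
copies = 2 * 4 * 64
full_rows, full_cols = 3 * copies, 5 * copies
```

Each row and column of the lifted matrix keeps the degree of its protograph node, which is what makes the expanded code inherit the protograph's structure. The final two values give the full matrix size implied by the slide, 1536 rows by 2560 columns.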
[Figure: matrix of m rows by m+k columns; column blocks labeled p1, p2, s0, p0, s1, s2, each marked transmitted or punctured; block indices 0 to 7; c2(j) plotted against j]
n = 2048, k = 1024, m = 1536
punctured rate = k/n = 1/2
unpunctured rate = k/(m+k) = 2/5
m = n + (punctured) - k
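As a quick check of the rate bookkeeping above, the following sketch recomputes m and both rates with exact fractions. The count of 512 punctured symbols is taken from the punctured column range 1536-2047 listed for the n = 2048 code in the back-up material; the variable names are illustrative.

```python
from fractions import Fraction

n = 2048          # transmitted code symbols
k = 1024          # information bits
punctured = 512   # punctured symbols (columns 1536-2047 in the back-up table)

m = n + punctured - k        # number of parity check rows
total_columns = m + k        # all variable nodes, transmitted or punctured

punctured_rate = Fraction(k, n)                 # rate seen on the channel
unpunctured_rate = Fraction(k, total_columns)   # rate implied by the matrix dimensions
```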
Conceived and developed two promising types of
BenONE™: Single-slot DIME-II™ Motherboard PCI card
Java GUI user interface for remote access to HW platform
$14K purchase from Nallatech
Daughter Card BenDATA-WS™: 24 MByte ZBT SRAM, Xilinx Virtex II, 8M gates
Parallel Decoder Type-1 – for protograph codes
Parallelization method
Decoder speed = (k/2) x clock speed/iterations
from seed protograph) yields 20 Mbps FPGA decoder with 50 MHz clock and 20 average iterations
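The speed formula can be checked against the quoted figures with a short sketch. It assumes that k in the formula is the number of information bits of the expanded protograph, inferred here as 40 variable nodes minus 24 check nodes = 16; that reading is an assumption, chosen because it reproduces the 20 Mbps figure. The function name is hypothetical.

```python
def type1_decoder_speed_bps(k, clock_hz, iterations):
    """Decoder speed = (k/2) x clock speed / iterations, as stated on the slide."""
    return (k / 2) * clock_hz / iterations

# Assumption: k = variable nodes - check nodes of the expanded protograph.
k_protograph = 40 - 24

speed = type1_decoder_speed_bps(k_protograph, clock_hz=50e6, iterations=20)

# With up to 512 protograph slices on the FPGA, the input block size follows.
max_block_bits = 512 * k_protograph
```

Under this assumption the sketch gives 20 Mbps and an input block size of 8192 bits, matching the numbers on the slide.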
Pros:
decoders
Cons:
protograph codes
Expanded protograph has 40 variable nodes, 24 check nodes, and 112 edges.
FPGA can support up to 512 slices of protograph. This corresponds to an input block size of up to 8192 bits.
FPGA utilization factor is 39% logic, 67% RAM.
[Figure: parallelization method. Protograph slices #1, #2, ..., #N; legend: check node, variable node connected to channel, variable node not connected to channel]
Parallel Decoder Type-1 – for protograph codes (Cont’d)
Hardware implementation
both variable and check nodes
– Variable nodes add “reliabilities” = Log-likelihoods
– Check nodes add “unreliabilities”
– Exchanged messages transformed between reliability and unreliability
transformation
[Figure: decoder implementation for sample protograph. Blocks: edge memories, variable node processors, constraint node processors, rel/unrel transformation; variable nodes, check nodes]
Quantized reliability/unreliability transformation
tolerant FPGA (largest rad-tol FPGA available today)
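The add-only node processing described on this slide can be sketched in floating point. The slides do not name the reliability/unreliability transformation; this sketch assumes the textbook choice φ(x) = −ln tanh(x/2), which is its own inverse, and ignores quantization. All function names are hypothetical.

```python
import math

def phi(x):
    """Assumed transform between reliability and unreliability (self-inverse for x > 0)."""
    x = min(max(x, 1e-9), 30.0)   # clamp so tanh and log stay finite
    return -math.log(math.tanh(x / 2.0))

def check_node(incoming):
    """Outgoing LLR on each edge, computed from the other edges' incoming LLRs."""
    out = []
    for i in range(len(incoming)):
        others = incoming[:i] + incoming[i + 1:]
        sign = 1.0
        for v in others:
            if v < 0:
                sign = -sign
        unreliability = sum(phi(abs(v)) for v in others)   # check nodes add unreliabilities
        out.append(sign * phi(unreliability))              # transform back to reliability
    return out

def variable_node(channel_llr, incoming):
    """Outgoing LLR on each edge: channel LLR plus the other edges' incoming LLRs."""
    total = channel_llr + sum(incoming)                    # variable nodes add reliabilities
    return [total - v for v in incoming]
```

With this choice of φ, the check node output agrees with the usual tanh rule, so both node types reduce to adders plus a lookup for the transformation.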
decoder’s speed and decodable code size
– Comparison: Type-1 protograph decoder processing e edges in parallel every half-iteration is roughly e/(2L) times faster than Type-2 universal decoder processing 2L edges in parallel every half-iteration
  – e.g., e = 140 vs L = 16 yields speedup factor > 4
– Nearly a factor of 2 additional parallelizability/speedup may be possible if the FPGA logic can make use of the Virtex RAM’s read-before-write mode
  – This would increase the parallelizability limit on e; revised constraint would be e/2 + n/2 < B
  – e.g., e = 18*14 = 252 for the rate-1/2 ARA protograph would yield speedup factor ≈ 8 vs universal decoder with L = 16
– Maximum decodable code size is (nT, kT), where (n,k) is the size of the protograph and T is the size of the circulant expansion
  – e.g., (nT,kT) = (40960, 20480) for the rate-1/2 ARA protograph expanded to e = 140
  – e.g., (nT,kT) = (73728, 36864) for the rate-1/2 ARA protograph expanded to e = 252
Notation and other relevant details:
– Virtex block RAMs are 2048 x 9 bits
– Design achieves protograph expansion factor T = 1024 by using two half-RAMs for 1024 inputs and 1024 outputs for each protograph edge
  – Half-RAM addresses are accessed in sequence, exploiting simplicity of circulant permutations on protograph edges
– Design achieves decoder parallelizability corresponding to a maximum protograph size with e + n/2 < B where
  – B = # block RAMs (B = 168 for current FPGA, Virtex II 8000)
  – e = # edges in protograph
  – n = # channel symbols input to protograph
– Example: small rate-1/2 ARA protograph can be preliminarily expanded T′ = 10 times to yield e = 140, n = 40, e + n/2 = 160 < B = 168
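The block RAM constraint, speedup factor, and maximum code size quoted above can be reproduced with a small sketch. The per-copy figures (14 edges and 4 channel symbols per copy of the small rate-1/2 ARA protograph) are inferred from e = 140 and n = 40 at T′ = 10, and n = 72 at T′ = 18 is likewise an inference consistent with the quoted (73728, 36864) code size. The function name is hypothetical.

```python
B = 168      # block RAMs on the Virtex II 8000 (from the slide)
T = 1024     # circulant expansion factor (from the slide)
L = 16       # universal decoder parallelism used in the slide's comparison

def summarize(t_prime, read_before_write=False):
    """Scale the small rate-1/2 ARA protograph by t_prime and check block RAM use."""
    e = 14 * t_prime      # inferred: 14 edges per protograph copy
    n = 4 * t_prime       # inferred: 4 channel symbols per protograph copy
    k = n // 2            # rate 1/2
    # Current design needs e + n/2 RAMs; read-before-write mode would need e/2 + n/2.
    rams = (e + n) / 2 if read_before_write else e + n / 2
    return {
        "e": e, "n": n, "rams": rams, "fits": rams < B,
        "speedup": e / (2 * L), "code_size": (n * T, k * T),
    }

current = summarize(10)
revised = summarize(18, read_before_write=True)
```

Here `current` corresponds to the e = 140 example and `revised` to the e = 252 example; one more copy in the current design (T′ = 11) would exceed the 168 block RAMs.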
Pros:
write (no FPGA redesign)
Cons:
description
[Figure labels: 4x4 MUX (two), inverse interleaver RAMs, interleaver RAMs]
to 2L=32 edges/clock cycle are feasible
– Check nodes use reduced-complexity approx min*: min(reliability) + correction terms
– 8-bit quantizer is uniform in reliability domain
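The "min(reliability) + correction terms" form can be illustrated with the exact pairwise check-node operation on log-likelihood ratios. This is the standard floating-point identity, not the reduced-complexity approximation or 8-bit quantization used in the hardware, whose details are not given here; the function names are hypothetical.

```python
import math
from functools import reduce

def min_star(a, b):
    """Exact pairwise check-node combine: signed minimum plus two correction terms."""
    sign = 1.0 if (a >= 0) == (b >= 0) else -1.0
    core = sign * min(abs(a), abs(b))
    correction = (math.log1p(math.exp(-abs(a + b)))
                  - math.log1p(math.exp(-abs(a - b))))
    return core + correction

def min_star_all(llrs):
    """Combine any number of incoming LLRs by repeated pairwise min*."""
    return reduce(min_star, llrs)
```

Dropping the correction terms gives the plain min-sum rule; a hardware approximation would typically replace the two logarithm terms with a small lookup or piecewise function.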
[Figure: decoder architecture for parallelism L=4. Blocks: double buffered edge memories, variable node processors, constraint node processors]
algorithm for load balancing and collision avoidance
– Implemented stopping rule
– Integrated noise generation module -> FER testing to 10^-9, BER to 10^-11. Measured frame error rates two orders of magnitude lower than possible by software simulation
– Screened more than 20 candidate codes for error floor location
– Receiver data can be interfaced to decoder via PCI backplane (tested)
Universal Decoder for SPArse CodEs (UDSPACE)
– Max edges depends on # of available Block RAMs
– Speed of decoder approximately proportional to parallelism factor (L)
codeblock size by using a larger portion of available RAM
Technology migration path Oct 2004 new accomplishments/revised plans in red
RAMs
– Includes 6 of 9 CCSDS codes at 10 Msps
underway
XC2V8000)
– Double speed (20 Msps) now verified
– Add’l testing/tweaking in progress
V2P100 — price quote received this month) will supply more block RAM and faster clock speed
– V2P100 is nearly capable of decoding 3 largest CCSDS codes (~25 Msps for L=16); V2P100’s RAM can accommodate codes with k up to ~14000, not quite k = 16384
– V2P125 part with sufficient block RAM for 3 largest CCSDS codes will not be available; new technology path skips this stage and goes straight to Virtex 4
– Planned Virtex 4 part will provide further improvements
[Figure: technology migration path. Decoding speed, Msps (8 iterations, 90 MHz; ticks at 10 and 20) versus edges (4K to 128K) for Virtex II XC2V8000, Virtex II Pro V2P100, Virtex II Pro V2P125, and Virtex 4. Labels: L=1, L=4, L=8, 2xL=8, L=16; CCSDS k=1K, CCSDS k=4K, CCSDS k=16K; Sep’04, Oct’04; revised plan to skip V2P125 (part not available)]
decoder down to 10^-11 BER
hardware decoder
achieves lower error floor than SW decoder
in SW decoder, which was using full precision values.
noticeably the error floor
that can be decoded incorrectly while satisfying all but a few neighboring check nodes at the set’s “frontier”
[Figure: BER (10^-1 down to 10^-11) versus Eb/No (dB), 0.5 to 1.7. Labels: HW, SW, ARA, FLARION, IMPLEMENTATION GAIN !!!, IMPLEMENTATION LOSS, PROPOSED CCSDS STANDARD (5/04)]
Comparison of some codes wrt. requirements and evaluation criteria
                          Goddard C1    Goddard C2    DVB             JPL ARA
Rate                      0.822222222   0.875244618   1/4 to 11/12    1/2; 2/3; 4/5
n                         4095          8176          16200; 64800
k                         3367          7156                          1K; 4K; 16K
Family                    no            no            yes             yes
Threshold                 *             **            **              ***
Error floor               ***?          **?           **?             *
Decoder computation       *             ***           ***             ***
Encoder computation       **            ***?          ***             ***
Simple code description   ***           ***           **              ***
Public domain             yes           yes           no?             yes
Other labels on slide: All-but-one; Regular (3,6); JPL ARA; Jeremy's linear dmin; Code J; Flarion
Measuring the Asymptotic Near-Optimality of Protograph Families
it
ML decoding
Measuring the Asymptotic Near-Optimality of Protograph Families
[Figure: Optimality of Decoding Thresholds for Code Families. Decoding threshold minus capacity limit (dB), 0.0 to 1.4, versus code rate, 0.4 to 1.0. Curves: Ci, G4d, G3d, Ci+, Ci+pre, AR4A, AR3A]
[Figure: BER/FER versus Eb/No, 0.5 to 4.5. Legend:
ARA3chr12 4c 512c 14 8 ACE, k = 4096
ARA3chr23 4c 256c 9 5 ACE, k = 4096
ARA3chr34 4c 170c 8 5 ACE, k = 4080
ARA3chr45 4c 128c 8 5 ACE, k = 4096
ARA3chr56 4c 102c 8 4 ACE, k = 4080
ARA3chr78 4c 128c 8 4 ACE, k = 7168
GSFC C2 Rate 0.875 HW, k = 7154]
[Figure: first protograph; edge label 2n] code rate = (n+1)/(n+2), n = 0, 1, ...
Code Rate   Protograph Threshold   Capacity   Difference
1/2         0.516                  0.187      0.329
2/3         1.288                  1.059      0.229
3/4         1.848                  1.626      0.222
4/5         2.277                  2.040      0.237
5/6         2.620                  2.362      0.258
6/7         2.897                  2.625      0.272
7/8         3.129                  2.845      0.284
8/9         3.324                  3.033      0.291
[Figure: second protograph; edge label 2n] code rate = (n+1)/(n+2), n = 0, 1, ...
Code Rate   Protograph Threshold   Capacity   Difference
1/2         0.560                  0.187      0.373
2/3         1.414                  1.059      0.355
3/4         1.980                  1.626      0.354
4/5         2.396                  2.040      0.356
5/6         2.717                  2.362      0.355
6/7         2.980                  2.625      0.355
7/8         3.197                  2.845      0.352
8/9         3.385                  3.033      0.352
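The rate formula and the Difference column can be reproduced with a short sketch. The threshold and capacity values are copied from the first set of rows above; the function and variable names are illustrative.

```python
from fractions import Fraction

def ara_rate(n):
    """Code rate (n+1)/(n+2) of the family member with index n = 0, 1, ..."""
    return Fraction(n + 1, n + 2)

rates = [ara_rate(n) for n in range(8)]   # 1/2, 2/3, ..., 8/9

# (protograph threshold, capacity) pairs from the first table, keyed by code rate.
first_table = {
    Fraction(1, 2): (0.516, 0.187),
    Fraction(2, 3): (1.288, 1.059),
    Fraction(3, 4): (1.848, 1.626),
    Fraction(4, 5): (2.277, 2.040),
    Fraction(5, 6): (2.620, 2.362),
    Fraction(6, 7): (2.897, 2.625),
    Fraction(7, 8): (3.129, 2.845),
    Fraction(8, 9): (3.324, 3.033),
}

# Difference = threshold minus capacity, rounded to the table's three decimals.
gaps = {rate: round(t - c, 3) for rate, (t, c) in first_table.items()}
```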
Attached are nine files containing the parity check matrices for our proposed CCSDS codes. Each file contains one line per edge in the graph, i.e. each nonzero entry in the H matrix. Each line contains two numbers, giving the column and row indices (numbered from 0) for those entries. Matlab users, for example, could read the file "YCCSDS1280.cr" and construct the sparse parity check matrix with the two commands:
    cr=load('YCCSDS1280.cr')+1;
    H=sparse(cr(:,2),cr(:,1),ones(1,length(cr)));
Note these are punctured codes, so their rates are higher than is apparent from the dimensions of the parity check matrices. The punctured column ranges below are inclusive (numbered from zero). The filenames and codes are:
filename          n       k       edges    rate   punctured columns
YCCSDS1280.cr     1280    1024    4096     4/5    384-511
YCCSDS1536.cr     1536    1024    5120     2/3    768-1023
YCCSDS2048.cr     2048    1024    7168     1/2    1536-2047
YCCSDS5120.cr     5120    4096    16384    4/5    1536-2047
YCCSDS6144.cr     6144    4096    20480    2/3    3072-4095
YCCSDS8192.cr     8192    4096    28672    1/2    6144-8191
YCCSDS20480.cr    20480   16384   65536    4/5    6144-8191
YCCSDS24576.cr    24576   16384   81920    2/3    12288-16383
YCCSDS32768.cr    32768   16384   114688   1/2    24576-32767
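For readers not using Matlab, the same file format can be parsed with plain Python. This is a minimal sketch run on toy data, not on one of the nine files; the function names are hypothetical, and the computation of k assumes the parity check matrix has full row rank.

```python
def read_edges(text):
    """Parse 'column row' index pairs (numbered from 0), one edge per line."""
    edges = []
    for line in text.splitlines():
        if line.strip():
            col, row = (int(v) for v in line.split())
            edges.append((row, col))
    return edges

def code_parameters(edges, punctured_cols):
    """Return (k, n) for a punctured code described by its edge list."""
    rows = 1 + max(r for r, _ in edges)
    cols = 1 + max(c for _, c in edges)
    k = cols - rows                    # assumption: H has full row rank
    n = cols - len(punctured_cols)     # punctured columns are not transmitted
    return k, n

# Toy example: 2 checks, 4 variable nodes, column 3 punctured.
toy = "0 0\n1 0\n3 0\n1 1\n2 1\n3 1\n"
edges = read_edges(toy)
k, n = code_parameters(edges, range(3, 4))
```

For a real file one would pass its contents, for example `read_edges(open("YCCSDS1280.cr").read())`, together with the punctured column range from the table, such as `range(384, 512)`.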
Parity check matrices Back-up
ARA Back-up
Family of Accumulate-Repeat-Accumulate codes (encoder)
[Figure: encoder block diagram. Labels: input, puncture 0x, accumulator (D), repetition 3 (two blocks), permutation (interleaver) π, accumulator (D), puncture P0, puncture P1, puncture P2, to channel]
Two possible, low-complexity implementations of the encoder
“Puncturing” yields a family of codes with higher code rates
Patent Pending