SoC-Network for Interleaving in Wireless Communications Norbert - PDF document

Microelectronic System Design Research Group University Kaiserslautern www.eit.uni-kl.de/wehn SoC-Network for Interleaving in Wireless Communications Norbert Wehn wehn@eit.uni-kl.de MPSoC’03 7-11 July 2003, Chamonix, France Outline MPSoC’03 N. Wehn � Motivation � Outer Modem Algorithms � Channel Coding � Interleaving (Turbo-Codes) � Application Specific Processing Node � Application Specific Communication Network � Network Structure � Network Analysis � Results � Conclusion 2 1

Wireless Implementation Challenges I MPSoC’03 N. Wehn � DECT 10 MIPS, GSM 100 MIPS, UMTS x 1000 MIPS 3 Wireless Implementation Challenges II MPSoC’03 N. Wehn � Algorithmic Complexity � “Shannon‘s Law beats Moore‘s Law” � Programmability and Flexibility � different QoS � „multi-mode“ support: different algorithms & standards � „software radio“ � different throughput requirements � Low Power/Low Energy � BUT: „Energy-Flexibility Gap“ � Design Space � algorithms, architecture …. 4 2

Motivation MPSoC’03 N. Wehn New architectures: AP-MPSoC � scalable, highly parallel, programmable, energy-efficient � application-specific processor node running with low frequency � application-specific communication network Wireless baseband algorithms � Inner modem � signal processing based on matrix computations e.g. multi-user detection, interference cancellation, filtering, correlators � many publications on efficient multi-processor implementations of matrix computations e.g. systolic arrays � Outer Modem � Channel coding, Interleaving, Data stream segmentation � efficient multi-processor implementation largely unexplored 5 Importance of Channel Coding MPSoC’03 N. Wehn Efficient channel coding is key for reliable communication High throughput: complexity is in data distribution and not in computation 6 3

Channel Coding Techniques MPSoC’03 N. Wehn � Convolutional Codes � Viterbi decoding algorithm � intensively studied (HW/SW/DSP_extensions) � Most efficient Codes: Turbo-Codes (1993), LDPC-Codes (1996) � block-based � iterative decoding techniques � computational complexity increased by order of magnitude � memory access and data transfers are very critical � Turbo-Codes � one of the big changes when moving from 2G to 3G � part of many emerging standards e.g. WLAN, 4G � Turbo-principle extended to modulation � Very active research area in the communication community Mapping of this type of algorithms onto programmable architectures largely unexplored 7 Turbo-En/Decoder Structure MPSoC’03 N. Wehn r r Systematic r s s x x s x r Parity r x 1 p RSC Coder 1 x 1 RSC Coder 1 Interleaver p Interleaver r r In s x int x 2 p RSC Coder 2 RSC Coder 2 int reliability information Deinterleaver r Deinterleaver r Λ Λ 1 2 a r a r Λ int 1 Λ e r 2 r e Softoutput Softoutput Softoutput Softoutput Λ Λ int s s Decoder Interleaver Decoder Decoder Interleaver r int Decoder r MAP 1 MAP 2 Λ Λ 1 MAP 1 2 MAP 2 p p int reliability information 8 4

Turbo-Codes MPSoC’03 N. Wehn � Iterative decoding process � block-based 3GPP: 20-5114 bits, 3GPP2: 378-20730 bits � DEC1, Interleaving, DEC2, Deinterleaving � interleaved reliability information is exchanged between decoders � Softoutput Decoder � determine Log-Likelihood Ratio (LLR) of each bit being sent „0“ or „1“ (Viterbi determines only most likely path in trellis) � three step algorithm: forward/backward recursion, LLR calculation � ~2.5 x computational complexity of Viterbi algorithm � memory complexity (size,access) >> Viterbi algorithm � Interleaving/Deinterleaving � important step on the physical layer � scrambles data processing order to yield timing diversity � minimizes burst errors 9 Implementation Challenges MPSoC’03 N. Wehn � Programmability and Flexibility „...It is critical for next generation programmable DSP to adress the requirements of algorithms such as Turbo-Codes since these algorithms are essential for improved 2G and 3G wireless communication “ (I. Verbauwhede „DSP‘s for wireless communications“) � High throughput requirements � UMTS: 2 Mbit/s (terminal), >10Mbit/s (basestation) � emerging standards >100 Mbit/s � DSP performance (UMTS compliant based on Log-MAP algorithm) Clock freq. cycles/ Throughput Processor Architecture [MHz] (bit*MAP) @ 5 Iter. MOT 56603 16-bit DSP 80 472 17 kbit/s STM ST120 VLIW, 2 ALU 200 100 ~ 200 kbit/s SC140 VLIW, 4 ALU 300 50 600 kbit/s ADI TS (1) VLIW, 2 ALU 180 27 666 kbit/s (1) With special ACS-instruction support 10 5

Multiprocessor Solution (Block Level) MPSoC’03 N. Wehn Simple MP solution Multiprocessor solution becomes mandatory MAP- Interleaver/ P 1 MAP- Interleaver/ Decoder Deinterleav Decoder Deinterleav Single Processor MAP- Interleaver/ MAP- Interleaver/ MAP- Interleaver/ P 2 Decoder Deinterleav MAP- Interleaver/ Decoder Deinterleav Decoder Deinterleaver Decoder Deinterleaver ............... MAP- Interleaver/ P N MAP- Interleaver/ Decoder Deinterleav Decoder Deinterleav � Sequential processing of � MAP algorithm � two MAP component decoders � N blocks are processed � Interleaving and Deinterleaving � Large latency � Low architectural efficiency � large area (memory!) � high energy 11 Optimized MPSoC (Sub-Block Level) MPSoC’03 N. Wehn Better solution: parallelization on algorithmic level (sub-block level) � MAP decoder parallelization (exploiting trellis windowing technique) � each processor can execute a sub-block of of the complete block independently � slight increase in computational complexity due to acquisition phase � allows distributed computing � Iterative exchange of interleaved information yields only limited locality write P 1 P 1 Subblock 1 � Low Latency (decreases with N) Subblock 1 read � Large architectural efficiency � Computational locality but P 2 Interleaver/ P 2 Interleaver/ Subblock 2 Deinterleaver Subblock 2 Deinterleaver network-centric architecture Network Network P N P N Subblock N Subblock N 12 6

Interleaver Bottleneck MPSoC’03 N. Wehn � Data from N sources have to be „perfectly randomly“ distributed BI T P I I nterl. P I Interleaving position Network 1 1 3 1 1,2,3 M 1 2 6 1 2 P 1 3 5 1 2 4 2 2 1 4,5,6 5 2 4 2 M 2 P 2 6 1 2 1 � Average : P i sends & receives same amount of values/cycle � Peak : P i can receive up to N-1 more values than average value Crossbar functionality, but with output blocking conflict 13 Interleaving Network Requirements MPSoC’03 N. Wehn � Flexibility and Scalability � Interleaver scheme can change from decoding block to block � e.g. ~ 5000 different interleaver tables in UMTS � Different throughput requirements � Global data distribution � Good interleavers imply no locality � 0-latency penalty � data distribution should be completely done in parallel to data calculation � Write conflicts i.e. different PEs write simultanously onto same target PE � multi-port memories infeasable � conflict-free interleaver design (e.g. IMEC approach), but lack of flexibility 14 7

Application Specific Processing Node MPSoC’03 N. Wehn � Increased ILP by Tensilica Xtensa RISC core for MAP calculation � double add-compare-select operation (butterfly) α k (2n) = max* ( α k-1 (n) + Λ in k (I), α k-1 (n+M/2) + Λ in k (II)) α k (2n+1) = max* ( α k-1 (n) + Λ in k (II), α k-1 (n+M/2) + Λ in k (I)) � max* operation max*(x 1 , x 2 ) = max (x 1 , x 2 ) + ln(1+exp(-| x 2 -x 1 |)) � zero overhead data-transfers: memory operations parallel to butterfly operation 1.54mm 2 (0.18um techology), f=133 MHz � Clock freq. cycles/ Throughput Processor [MHz] (bit*MAP) @ 5 Iter. Xtensa 133 9 1,4 Mbit/s STM ST120 200 100 ~ 200 kbit/s SC140 300 50 600 kbit/s ADI TS 180 27 666 kbit/s 15 Processing Node Interface MPSoC’03 N. Wehn � Fast single-cycle local data memory M C � mapped into processors adress space � XLMI single-cycle data interface for interprocessor communication � Communication device for data distribution � message passing network (message=data + target addr.) � single cycle access CPU-Core (Xtensa) CPU-Address-Space Custom-Hardware 32 32 S R PIF Core I/O 32 32 FIFO X L 16 XLMI CPU Bus Data Comm. Data 16 M M C M P Buffer Addr. Bus Dev. I Sel 0 Addr. 0 Interface Sel 1 16 Data 16 Cluster Bus Buffer 1 Addr. Cluster Bus Local address in Node ID target 0 Buffer ID 0 Message format Data (8bit) (1bit) Processor (7bit) buffer (14bit) 16 8

SoC-Network for Interleaving in Wireless Communications Norbert - PDF document

Microelectronic System Design Research Group University Kaiserslautern www.eit.uni-kl.de/wehn SoC-Network for Interleaving in Wireless Communications Norbert Wehn wehn@eit.uni-kl.de MPSoC03 7-11 July 2003, Chamonix, France Outline

SoC SoC Design SoC SoC Design Design Design Lecture Lecture 1 1: Introduction :

Eduardo Gandara LEARNING OUTCOMES Introduced to the SOC Program Brief History of SOC

1 Memory SoC Persistent Memory-Driven Memory Memory Processor-Centric Memory SoC SoC

Error Resilient Image Communication with Chaotic Pixel Interleaving for Wireless Sensor Networks

Preparing SOC 1, SOC 2 or SOC 3 Reports: Best Practices Meeting Challenges Arising From SSAE 16,

TKT TKT- -2431 SoC design 2431 SoC design Introduction to exercises SoC design / September 09

FPAA: FPGA Extension SoC FPAA SoC FPAA: Computing SoC Dev Board Large Scale FPAA Devices

WIRELESS SYSTEMS CO.,LTD Wireless System Company Wireless & Computer & Telecom Solution

Interweaving THE ROLE OF RETRIEVAL, SPACING AND INTERLEAVING IN THE CURRICULUM BY MARK ENSER

SCTP User Message Interleaving Integration and Validation Felix Weinrank Michael Txen Irene

Interleaving Cryptography and Mechanism Design The Case of Online Auctions Edith Elkind and

Intention Interleaving Via Classical Replanning Mengwei Xu , Kim Bauters, Kevin McAreavey, Weiru

SoC Pain Points and Gaps An Overview - MUKUND PAI Intel Corp. All Discussions in this Slide

System on on a Chip (SoC) Cristian Sisterna Universidad Nacional San Juan Argentina SoC ICTP

Elements Home network Networks of a wireless Network Regional ISP Yanmin Zhu Institutional

Wireless monitoring device Goal Wireless monitoring device monitors existing wireless network

Search Algorithms for Speech Recognition Berlin Chen 2004 References Books 1. X. Huang,

INF4820: Algorithms for Artificial Intelligence and Natural Language Processing Hidden Markov

Hidden Markov Models CMSC 473/673 UMBC Recap from last time Expectation Maximization

ANLP Lecture 9: Algorithms for HMMs Sharon Goldwater 4 Oct 2019 Recap: HMM Elements of HMM:

Problem statement of SDN and NFV co-deploy ment in cloud datacenters dr af t - gu- sdnr g- pr obl

EM & Variational Bayes Hanxiao Liu September 9, 2014 1 / 19 Outline 1. EM Algorithm 1.1

Improving Data Centre Performance using Multipath TCP (work in progress) Mark Handley Costin

OPEN-O Unified NFV/SDN Open Source Orchestrator Hui Deng, China Mobile Kai Liu, China Telecom

SoC-Network for Interleaving in Wireless Communications Norbert - PDF document

Microelectronic System Design Research Group University Kaiserslautern www.eit.uni-kl.de/wehn SoC-Network for Interleaving in Wireless Communications Norbert Wehn wehn@eit.uni-kl.de MPSoC03 7-11 July 2003, Chamonix, France Outline

SoC SoC Design SoC SoC Design Design Design Lecture Lecture 1 1: Introduction :

Eduardo Gandara LEARNING OUTCOMES Introduced to the SOC Program Brief History of SOC

1 Memory SoC Persistent Memory-Driven Memory Memory Processor-Centric Memory SoC SoC

Error Resilient Image Communication with Chaotic Pixel Interleaving for Wireless Sensor Networks

Preparing SOC 1, SOC 2 or SOC 3 Reports: Best Practices Meeting Challenges Arising From SSAE 16,

TKT TKT- -2431 SoC design 2431 SoC design Introduction to exercises SoC design / September 09

FPAA: FPGA Extension SoC FPAA SoC FPAA: Computing SoC Dev Board Large Scale FPAA Devices

WIRELESS SYSTEMS CO.,LTD Wireless System Company Wireless &amp; Computer &amp; Telecom Solution

Interweaving THE ROLE OF RETRIEVAL, SPACING AND INTERLEAVING IN THE CURRICULUM BY MARK ENSER

SCTP User Message Interleaving Integration and Validation Felix Weinrank Michael Txen Irene

Interleaving Cryptography and Mechanism Design The Case of Online Auctions Edith Elkind and

Intention Interleaving Via Classical Replanning Mengwei Xu , Kim Bauters, Kevin McAreavey, Weiru

SoC Pain Points and Gaps An Overview - MUKUND PAI Intel Corp. All Discussions in this Slide

System on on a Chip (SoC) Cristian Sisterna Universidad Nacional San Juan Argentina SoC ICTP

Elements Home network Networks of a wireless Network Regional ISP Yanmin Zhu Institutional

Wireless monitoring device Goal Wireless monitoring device monitors existing wireless network

Search Algorithms for Speech Recognition Berlin Chen 2004 References Books 1. X. Huang,

INF4820: Algorithms for Artificial Intelligence and Natural Language Processing Hidden Markov

Hidden Markov Models CMSC 473/673 UMBC Recap from last time Expectation Maximization

ANLP Lecture 9: Algorithms for HMMs Sharon Goldwater 4 Oct 2019 Recap: HMM Elements of HMM:

Problem statement of SDN and NFV co-deploy ment in cloud datacenters dr af t - gu- sdnr g- pr obl

EM &amp; Variational Bayes Hanxiao Liu September 9, 2014 1 / 19 Outline 1. EM Algorithm 1.1

Improving Data Centre Performance using Multipath TCP (work in progress) Mark Handley Costin

OPEN-O Unified NFV/SDN Open Source Orchestrator Hui Deng, China Mobile Kai Liu, China Telecom

WIRELESS SYSTEMS CO.,LTD Wireless System Company Wireless & Computer & Telecom Solution

EM & Variational Bayes Hanxiao Liu September 9, 2014 1 / 19 Outline 1. EM Algorithm 1.1