Performance Driven Optimizations in FPGA Based QAM Systems



SLIDE 1

KIT – University of the State of Baden-Wuerttemberg and National Laboratory of the Helmholtz Association

Institut für Technik der Informationsverarbeitung (ITIV)

www.kit.edu

Directors

  • Prof. Dr.-Ing. Dr. h. c. J. Becker
  • Prof. Dr.-Ing. E. Sax
  • Prof. Dr. rer. nat. W. Stork

Master Thesis

Performance Driven Optimizations in FPGA Based QAM Systems

Alberto Sonnino

Supervising Tutors

  • M. Tech G. Shalina
  • Dipl.-Ing. P. Figuli

SLIDE 2

Institut für Technik der Informationsverarbeitung (ITIV) 2 09.11.2009 Alberto Sonnino – Performance driven optimization in FPGA based QAM systems

Introduction and Motivation

Challenges in the current trend

  • Pursuit of high SNR and high data rates
  • Contribution towards future terabit communication
  • FPGAs are clocked below 1 GHz: need for parallelism

06.10.2015

[1]

SLIDE 3

Introduction and Motivation

My work: performance optimization of QAM transmitter

  • Exploiting parallelism
  • FPGA platform
  • Mixed-domain (time and frequency) approach

Current state-of-the-art

  • 2012: 128.6 MHz achieved (Tongji University, Shanghai, China) [2]; transmitter and receiver, on Xilinx Virtex IV
  • 2013: 625.0 MHz achieved (University of Paderborn, Germany) [3]; transmitter only, on Xilinx Virtex VI
  • 2015: 750.0 MHz achieved (E2v Semiconductors, UK) [4]; transmitter only and no filter, on Xilinx Virtex VI

SLIDE 4

Introduction and Motivation

Hardware Choice

  • FPGA, because of its great configurability, flexibility and cost
  • Growing technology

Modulation Choice

  • Quadrature Amplitude Modulation (QAM)
  • Allows carrying many bits per symbol

Filter Choice

  • Avoid inter-symbol interference (ISI)
  • Finite impulse response (FIR)
  • Square Root Raised Cosine (SRRC)
  • No filter optimizations in this work [5]

SLIDE 5

Outline

  • Introduction and Motivation
  • Fundamentals: standard transmission chain, fundamentals of each block
  • Concepts & Methodology: strategy, ideal model
  • Implementation: implementation of each block
  • Experimental Results: achieved precision, achieved performance
  • Summary & Further Improvements

SLIDE 6

Fundamentals

Standard Transmission Chain

[Figure: standard transmission chain. Transmitter side: generator (bit stream …0101110101011…), encryptor, encoder, symbol mapper, filter, modulator. Then the channel. Receiver side: demodulator, filter, symbol demapper, decoder, decryptor.]

Focus of this work

[Figure: QAM mapper, filter and modulator. The QAM mapper splits its input into I and Q branches; each branch passes through a direct-form FIR filter (delay elements z^-1, coefficients C0, C1, ..., CN-1, adders); the modulator multiplies the real (Re) and imaginary (Im) branches by the cos and sin carriers and combines them into the output, out.]

SLIDE 7

Fundamentals

QAM Mapper

  • M-QAM formats (M = 8, 16, 32, etc.)
  • Input bits are grouped into clusters of log2(M) bits
  • Gray code, so that neighbouring symbols have a Hamming distance of 1
  • A rectangular constellation is considered
  • A larger M gives a higher data rate, but makes symbol misinterpretation more likely
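The clustering and Gray-coding steps can be sketched in a few lines of Python. This is only an illustration: the function names, the bit-to-level assignment, and the choice to use the first half of each cluster for I and the second half for Q are assumptions, and may not match the constellation labels on the slide. For simplicity it handles only square constellations (M = 4, 16, 64), not the 8-QAM and 32-QAM formats.

```python
def gray_to_level(bits: str) -> int:
    """Map a Gray-coded bit group to an odd amplitude level (..., -3, -1, 1, 3, ...)."""
    g = int(bits, 2)
    b = 0
    while g:  # Gray-to-binary conversion: XOR together all right shifts of g
        b ^= g
        g >>= 1
    return 2 * b - (2 ** len(bits) - 1)


def qam_map(bitstream: str, m: int = 16) -> list:
    """Cluster a bit string into log2(M)-bit groups and map each to a complex symbol.

    Hypothetical convention: first half of the cluster selects I, second half Q.
    Levels are in units of d (so 3 means 3d).
    """
    k = m.bit_length() - 1  # bits per symbol, log2(M)
    if (1 << k) != m or k % 2:
        raise ValueError("this sketch only handles square constellations")
    if len(bitstream) % k:
        raise ValueError("bit stream length must be a multiple of log2(M)")
    half = k // 2
    symbols = []
    for i in range(0, len(bitstream), k):
        cluster = bitstream[i:i + k]
        symbols.append(complex(gray_to_level(cluster[:half]),
                               gray_to_level(cluster[half:])))
    return symbols


symbols = qam_map("00011110")  # two 16-QAM symbols: (-3-1j) and (1+3j)
```

With this assignment the levels -3, -1, 1, 3 carry the labels 00, 01, 11, 10, so any two neighbouring constellation points differ in exactly one bit.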

Modulator

  • Local oscillator delivering orthogonal trigonometric carriers
  • Multiplication and subtraction operations

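[Figure: Gray-coded rectangular 16-QAM constellation in the I/Q plane, with 4-bit labels 0000 to 1111.]

$$\mathrm{out}(t) = \Re\left\{ [I(t) + iQ(t)]\, e^{i 2\pi f_0 t} \right\} = I(t)\cos(2\pi f_0 t) - Q(t)\sin(2\pi f_0 t)$$

[Figure: modulator. A local oscillator (LO) with a 90° phase shift supplies the cos and sin carriers; the real (Re) and imaginary (Im) inputs are multiplied by them and combined into the output, out.]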
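The two forms of the modulator equation can be checked against each other numerically. The sketch below is an illustration only; the function names and the sampling rate are assumptions, and the 100 Hz carrier is borrowed from the test conditions reported later in the deck.

```python
import cmath
import math


def modulate_real(i_t: float, q_t: float, f0: float, t: float) -> float:
    """I(t) cos(2 pi f0 t) - Q(t) sin(2 pi f0 t): two multiplications, one subtraction."""
    return i_t * math.cos(2 * math.pi * f0 * t) - q_t * math.sin(2 * math.pi * f0 * t)


def modulate_complex(i_t: float, q_t: float, f0: float, t: float) -> float:
    """Real part of (I + iQ) e^{i 2 pi f0 t}."""
    return (complex(i_t, q_t) * cmath.exp(2j * math.pi * f0 * t)).real


F0 = 100.0   # carrier frequency in Hz
FS = 1600.0  # hypothetical sampling rate in Hz
samples = [(modulate_real(3.0, -1.0, F0, n / FS),
            modulate_complex(3.0, -1.0, F0, n / FS)) for n in range(32)]
```

Both columns of `samples` agree to floating-point precision, which is why the hardware only needs real multipliers and a subtracter.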
SLIDE 8

Fundamentals

Fourier Transform

  • Decomposes a signal into an alternative representation
  • The Discrete Fourier Transform (DFT) maps a signal into the Fourier domain
  • The Inverse Discrete Fourier Transform (IDFT) maps it back
  • Linear operations have an equivalent in the Fourier domain
  • Useful for this work: convolution becomes multiplication

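$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-2\pi i k n / N}, \quad k \in \mathbb{Z}$$

$$x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, e^{2\pi i k n / N}, \quad n \in \mathbb{Z}$$

$$\mathcal{F}\{f * g\} = \mathcal{F}\{f\} \cdot \mathcal{F}\{g\} = G \cdot F$$

$$f * g = \mathcal{F}^{-1}\{\mathcal{F}\{f\} \cdot \mathcal{F}\{g\}\} = \mathcal{F}^{-1}\{G \cdot F\}$$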
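A direct evaluation of the DFT and IDFT sums makes the convolution property easy to verify. The following is a minimal sketch with illustrative names and test vectors. For finite frames of length N, the product of two DFTs corresponds to circular convolution, which is what the example compares against.

```python
import cmath


def dft(x):
    """Direct N-point DFT: X[k] = sum_n x[n] e^{-2 pi i k n / N}."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * k * j / n) for j in range(n))
            for k in range(n)]


def idft(big_x):
    """Direct N-point IDFT: x[n] = (1/N) sum_k X[k] e^{2 pi i k n / N}."""
    n = len(big_x)
    return [sum(big_x[k] * cmath.exp(2j * cmath.pi * k * j / n) for k in range(n)) / n
            for j in range(n)]


def circular_convolve(f, g):
    """Time-domain circular convolution of two equal-length sequences."""
    n = len(f)
    return [sum(f[m] * g[(j - m) % n] for m in range(n)) for j in range(n)]


f = [1.0, 2.0, 0.0, -1.0]
g = [0.5, 0.25, 0.0, 0.0]

via_time = circular_convolve(f, g)                          # convolution
via_freq = idft([a * b for a, b in zip(dft(f), dft(g))])    # multiplication
round_trip = idft(dft(f))                                   # IDFT undoes DFT
```

Each product in `via_freq` depends on a single frequency bin, whereas each entry of `via_time` needs a sum over all taps.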

SLIDE 9

Fundamentals

Filter

  • Meeting the Nyquist criterion avoids ISI
  • Pulse-shaping filter to limit the transmission band
  • FIR filter: linear phase, inherent stability, no feedback
  • A matched filter improves the SNR (if the noise is purely stochastic)
  • Good compromise: SRRC filter
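For readers who want to see what SRRC taps look like, the textbook impulse response can be sampled directly. This is a sketch under stated assumptions: the roll-off factor, the number of samples per symbol and the unit-energy normalization are hypothetical choices, and the result is not claimed to reproduce the coefficients used in this work.

```python
import math


def srrc_tap(t: float, beta: float) -> float:
    """SRRC impulse response at time t, measured in symbol periods (unnormalized)."""
    if abs(t) < 1e-12:
        return 1.0 - beta + 4.0 * beta / math.pi
    if beta > 0 and abs(abs(t) - 1.0 / (4.0 * beta)) < 1e-12:
        # Removable singularity at t = +/- 1/(4 beta)
        return (beta / math.sqrt(2.0)) * (
            (1.0 + 2.0 / math.pi) * math.sin(math.pi / (4.0 * beta))
            + (1.0 - 2.0 / math.pi) * math.cos(math.pi / (4.0 * beta))
        )
    num = (math.sin(math.pi * t * (1.0 - beta))
           + 4.0 * beta * t * math.cos(math.pi * t * (1.0 + beta)))
    den = math.pi * t * (1.0 - (4.0 * beta * t) ** 2)
    return num / den


def srrc_taps(order: int, sps: int, beta: float) -> list:
    """Return order + 1 taps centred on t = 0, sampled sps times per symbol."""
    centre = order / 2.0
    taps = [srrc_tap((k - centre) / sps, beta) for k in range(order + 1)]
    norm = math.sqrt(sum(c * c for c in taps))  # scale to unit energy
    return [c / norm for c in taps]


taps = srrc_taps(order=10, sps=2, beta=0.5)  # hypothetical parameters
```

The taps are symmetric about the centre, which is what gives the FIR filter its linear phase.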

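Time Domain

[Figure: direct-form FIR filter with delay elements z^-1, coefficients C0, C1, C2, ..., CN, adders, input x[n] and output y[n].]

$$y[n] = x[n] * h[n]$$

  • Convolution: difficult to parallelize

Frequency Domain

[Figure: parallel multipliers, each forming one output Y_k from X_k and H_k, for k = 0, 1, ..., N.]

$$Y[k] = X[k] \cdot H[k]$$

  • Multiplication: easily parallelizable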

SLIDE 10

Concepts & Methodology

Strategy

  • Build a reference MATLAB model
  • Identify which parts to implement in the frequency domain
  • Prototype a single-channel (non-parallel) transmitter
  • Optimize for Xilinx Virtex 7
  • Generic model with parallelization and scalability

Conceptual Model

[Figure: conceptual model. QAM in (N-bit frame; 2-bit labels 00, 01, 10, 11 mapped to the amplitude levels ±d and ±3d) → DFT (x_n → x_k) → FIR filters of order 10 on the I and Q branches, applied as multiplications by the coefficients c_i → IDFT (x_k → x_n) → modulator with a local oscillator (LO) at the carrier frequency f0 and a 90° phase shift → out.]

Filter coefficients: c0 = 0.022507907, c1 = 0.028298439, c2 = -0.07620194, c3 = -0.03750077, c4 = 0.307673479, c5 = 0.540985931

SLIDE 11

Concepts & Methodology

Ideal Behaviour

[Figure: the conceptual model (QAM in, DFT, order-10 FIR filters applied as multiplications by the coefficients c0 to c5, IDFT, modulator with LO at carrier frequency f0) driven by the input bit stream 0001101010…, with plots of the signal at each stage: I component, Q component, DFT re. part, DFT im. part, Filter re. part, Filter im. part, IDFT re. part, IDFT im. part, Transmitter out.]
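The chain in the figure (DFT, multiplication by the filter response, IDFT, modulation) can be modelled on one frame of N = 16 symbols. This is a rough sketch rather than the reference MATLAB model. It assumes that the order-10 filter is symmetric (taps c0 ... c5 ... c0), that it is applied circularly with the centre tap at index 0 so that its frequency response is real, and that the sampling rate `fs` is a free parameter. None of these choices is stated on the slides.

```python
import cmath
import math

# c0..c5 as listed on the slide; the mirrored half is an assumption
C = [0.022507907, 0.028298439, -0.07620194, -0.03750077, 0.307673479, 0.540985931]


def dft(x):
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * k * j / n) for j in range(n))
            for k in range(n)]


def idft(big_x):
    n = len(big_x)
    return [sum(big_x[k] * cmath.exp(2j * cmath.pi * k * j / n) for k in range(n)) / n
            for j in range(n)]


def filter_response(n: int = 16):
    """Zero-phase circular impulse response h and its (real) frequency response H."""
    if n < 11:
        raise ValueError("frame too short for an order-10 filter")
    h = [0.0] * n
    h[0] = C[5]                      # centre tap at index 0
    for d in range(1, 6):
        h[d] = h[n - d] = C[5 - d]   # symmetric taps wrap around the frame
    return h, [v.real for v in dft(h)]


def transmit_frame(symbols, f0: float, fs: float):
    """DFT -> per-bin multiplication -> IDFT -> I cos - Q sin."""
    _, big_h = filter_response(len(symbols))
    xk = dft(symbols)
    yk = [x * hk for x, hk in zip(xk, big_h)]  # one real coefficient per bin
    sn = idft(yk)
    out = []
    for n, s in enumerate(sn):
        phase = 2 * math.pi * f0 * n / fs
        out.append(s.real * math.cos(phase) - s.imag * math.sin(phase))
    return out


frame = [complex(3, 1), complex(-1, -3)] * 8   # 16 example symbols
out = transmit_frame(frame, f0=100.0, fs=1600.0)
```

Because the assumed response is real, filtering one bin costs two real multiplications (one for the real part, one for the imaginary part).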

SLIDE 12

Implementation

Implemented System

[Figure: block diagram of transmitter.v (N = 16, W = 16, FORMAT = 4; inputs clk, reset and in; outputs out and tvalid, driven by sequential logic). The 64-bit input feeds qam.v, whose I and Q outputs pass through dft.v, srrc_filter.v, idft.v and modulator.v. Coefficients come from dft_coeff.v, filter_coeff.v and carriers.v. The blocks are built from combinational logic, adders, multipliers and complex multipliers. Internal signals: xk_re, xk_im, Y_re, Y_im, sn_re, sn_im. Bus labels: 64, 255 and 4096.]

Data Packing

  • Parallel inputs/outputs are packed into the same bus
  • Precision is fixed to 16 bits
  • Each data_i is a 16-bit vector

[Figure: bus layout with data_{N-1}, data_{N-2}, ..., data_0 packed side by side.]
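The packing scheme can be modelled in software by treating the bus as one wide integer. The sketch below assumes that data_0 occupies the least significant 16 bits and that values are signed two's complement; both are assumptions, as the slide shows only the ordering data_{N-1} ... data_0.

```python
W = 16  # bits per value, as fixed in the design


def pack(values, w: int = W) -> int:
    """Pack signed w-bit integers into one bus word, values[0] in the lowest bits."""
    bus = 0
    for i, v in enumerate(values):
        if not -(1 << (w - 1)) <= v < (1 << (w - 1)):
            raise ValueError(f"value {v} does not fit in {w} signed bits")
        bus |= (v & ((1 << w) - 1)) << (i * w)  # two's complement slice
    return bus


def unpack(bus: int, n: int, w: int = W) -> list:
    """Recover n signed w-bit integers from a packed bus word."""
    mask = (1 << w) - 1
    values = []
    for i in range(n):
        raw = (bus >> (i * w)) & mask
        values.append(raw - (1 << w) if raw >> (w - 1) else raw)
    return values


bus = pack([1, -1, 32767, -32768])
recovered = unpack(bus, 4)
```

With N = 16 values the packed word fits in 256 bits, matching the 16N output width of the transmitter.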

SLIDE 13

Implementation

Specifications

Parameters

  • N: number of parallel inputs
  • FORMAT: QAM format

Inputs

  • clk: clock
  • reset: reset
  • in: clustered stream

Outputs

  • tvalid: validity flag
  • out: output data

Characteristics

  • Input width: FORMAT × N
  • Output width: 16N
  • Uses 2N² complex multipliers, 4N² - 2N adders and 4N multipliers
  • Latency: 17 cycles

[Figure: transmitter block symbol with parameters N and FORMAT, inputs clock, reset and in, and outputs tvalid and out.]
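The formulas above are easy to evaluate for a given configuration. The helper below is purely illustrative (its name and dictionary keys are not from the thesis) and simply plugs N and FORMAT into the expressions listed under Characteristics.

```python
def transmitter_resources(n: int, fmt: int, w: int = 16) -> dict:
    """Evaluate the bus widths and operator counts quoted on the slide."""
    return {
        "input_width_bits": fmt * n,
        "output_width_bits": w * n,
        "complex_multipliers": 2 * n * n,
        "adders": 4 * n * n - 2 * n,
        "multipliers": 4 * n,
    }


resources = transmitter_resources(n=16, fmt=4)
```

For N = 16 and FORMAT = 4 this gives a 64-bit input, a 256-bit output, 512 complex multipliers, 992 adders and 64 multipliers. The quadratic growth of the first two operator counts is what makes larger N expensive.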

SLIDE 14

Implementation

QAM Mapper

  • Three parameters (N, W, FORMAT): number of inputs, bus width, QAM format
  • Supports 8-QAM, 16-QAM, 32-QAM and 64-QAM
  • Each format is implemented in a separate Verilog file
  • Only the circuit for the desired format is generated

[Figure: block diagram of the implemented system, as on slide 12 (qam.v, dft.v, srrc_filter.v, idft.v, modulator.v inside transmitter.v).]

SLIDE 15

Implementation

DFT & IDFT

  • One parameter (N): number of inputs
  • No parallel DFT / IDFT Xilinx IP cores are available yet
  • Each one uses N² complex multipliers and 2N(N-1) adders
  • Rescaling by 2^17 to fit the 16-bit bus

[Figure: block diagram of the implemented system, as on slide 12 (dft.v and idft.v inside transmitter.v).]

SLIDE 16

Implementation

Filter

  • One parameter (N): number of inputs
  • Frequency domain: simple multiplication with the filter coefficients
  • Uses 2N multipliers
  • Rescaling by 2^16 to fit the 16-bit bus

[Figure: block diagram of the implemented system, as on slide 12 (srrc_filter.v inside transmitter.v).]

SLIDE 17

Implementation

Modulator

  • One parameter (N): number of inputs
  • Uses 2N multipliers and N adders (configured in subtracter mode)
  • Rescaling by 2^16 to fit the 16-bit bus

[Figure: block diagram of the implemented system, as on slide 12 (modulator.v inside transmitter.v).]

SLIDE 18

Implementation

Fourier QAM Modulator (FQM) Utility

[Figure: screenshot of the FQM utility with callouts: carrier's frequency, summary, filter's coefficients, generate files, add & remove rows.]

SLIDE 19

Experimental Results

Test Conditions

  • N = 16, 100 Hz carriers
  • Different configurations for the adder and multiplier cores
  • All supported QAM formats

Design Precision

  • Less than 1% error with respect to the MATLAB model

[Figure: design output plotted against samples, with a zoomed view; axes labelled amplitude and magnitude versus samples.]

SLIDE 20

Experimental Results

Final Result

  • Adders use the fabric and multipliers use DSP slices
  • Effective speed of 16 × 62.5 MHz = 1 GHz (instead of 750 MHz [4])
  • Throughput per modulation format:

[Table: throughput per modulation format, not recovered from the slide; all formats run at a clock frequency of 62.5 MHz.]
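The effective-speed arithmetic can be extended to a raw bit rate per format. The sketch below assumes that each of the N parallel lanes produces one symbol per clock cycle and that throughput means log2(M) bits per symbol; the slide's own throughput figures did not survive, so these numbers are a derived illustration, not reported results.

```python
def throughput_bps(n_lanes: int, f_clk_hz: float, m: int) -> float:
    """Raw bit rate: lanes x clock frequency x bits per M-QAM symbol."""
    bits = m.bit_length() - 1  # log2(M)
    if (1 << bits) != m:
        raise ValueError("M must be a power of two")
    return n_lanes * f_clk_hz * bits


effective_rate_hz = 16 * 62.5e6  # 1e9, the 1 GHz effective speed
rates = {m: throughput_bps(16, 62.5e6, m) for m in (8, 16, 32, 64)}
```

Under these assumptions the four supported formats would carry 3, 4, 5 and 6 Gbit/s respectively.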

SLIDE 21

Summary & Further Improvements

Topic

  • Performance optimization of a QAM transmitter
  • Exploiting parallelism using a mixed-domain approach

Achieved during this term

  • Familiarization with the Xilinx tools
  • Understanding of the underlying physical concepts
  • MATLAB simulation and prototyping of a single-channel transmitter
  • Building and optimizing the parallel design
  • Scalable generic model

Further improvements

  • Implement an FFT instead of the DFT (or wait for the next Xilinx release)
  • Reduce the DSP utilization to allow N = 32
  • Support additional modulation formats

SLIDE 22

Bibliography

[1] S. Maital, "The End of Moore's Law? Why It Matters", TIMnovate. https://timnovate.wordpress.com/2015/01/23/the-end-of-moores-law-why-it-matters/

[2] Siqiang Ma, Yong'en Chen, "FPGA Implementation of High-throughput Complex Adaptive Equalizer for QAM Receiver", Tongji University, Shanghai, China, 2012.

[3] A. Al-Bermani, C. Wördehoff, O. Jan, K. Puntsri, M. F. Panhwar, U. Rückert, R. Noé, "The Influence of Laser Phase Noise on Carrier Phase Estimation of a Real-Time 16-QAM Transmission with FPGA Based Coherent Receiver", University of Paderborn, Paderborn, Germany, 2013.

SLIDE 23

Bibliography

[4] Marc Stackler, Andrew Glascott-Jones, Nicolas Chantier, "A high speed transmission system using QAM and direct conversion with high bandwidth converters", E2v Semiconductors, 2015.

[5] S. Percy George Ford, P. Figuli and J. Becker, "Parametric Design Space Exploration for Optimizing QAM Based High-speed Communication", IEEE/CIC International Conference on Communications in China, 2015.

SLIDE 24

Thank you for your attention!