SLIDE 1 Multi-Dimensional Indexing: FFT
- Use a transform to generate an index pattern that still contains all values in the range (0 to N-1).
- Size N
- n takes on the values
n: 0, 1, 2, 3, ... N-2, N-1 (or, equivalently mod N: N, 1+N, 2+N, 3+N, ... 2N-2, 2N-1)
- Use 2 variables, n1 and n2.
- Rewrite n = (A*n1 + B*n2) mod N
n1: 0, 1, 2, 3, ... N1-2, N1-1
n2: 0, 1, 2, 3, ... N2-2, N2-1
where N = N1*N2
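A minimal sketch of the index rewrite, to check that a chosen (A, B) still visits every value 0 to N-1 exactly once. The helper names `index_map` and `is_permutation` are made up for illustration. The choices A = N2, B = 1 (for n) and A = 1, B = N1 (for k) are the ones used in the Matlab example on the following slides.

```python
def index_map(N1, N2, A, B):
    """Return n = (A*n1 + B*n2) mod N as a table: one row per n2, one column per n1."""
    N = N1 * N2
    return [[(A * n1 + B * n2) % N for n1 in range(N1)] for n2 in range(N2)]


def is_permutation(table, N):
    """True if the table contains every index 0..N-1 exactly once."""
    flat = [n for row in table for n in row]
    return sorted(flat) == list(range(N))


n_map = index_map(5, 6, A=6, B=1)  # n = N2*n1 + n2
k_map = index_map(5, 6, A=1, B=5)  # k = k1 + N1*k2

print(n_map[1])                   # row n2 = 1
print(is_permutation(n_map, 30))
print(is_permutation(k_map, 30))
```

Not every (A, B) works: with A = B = 5 the pattern only reaches multiples of 5, so `is_permutation` returns False.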
SLIDE 2 N-Point complex DFT (even)
[Block diagram: n-counter and k-counter; Sin ROM and CORDIC producing WN; z-1 delay element; concatenated I&Q Data In Memory and Data Out Memory; control signals done, rst, ena, addr.]
SLIDE 3 Matlab Example
N = 30, N1 = 5, N2 = 6
n1 = ones(6,1)*(0:4);
 0 1 2 3 4
 0 1 2 3 4
 0 1 2 3 4
 0 1 2 3 4
 0 1 2 3 4
 0 1 2 3 4
n2 = (ones(5,1)*(0:5))';
 0 0 0 0 0
 1 1 1 1 1
 2 2 2 2 2
 3 3 3 3 3
 4 4 4 4 4
 5 5 5 5 5
n = N2*n1 + n2;
 0  6 12 18 24
 1  7 13 19 25
 2  8 14 20 26
 3  9 15 21 27
 4 10 16 22 28
 5 11 17 23 29
SLIDE 4
Do the same thing for k.
SLIDE 5 Matlab Example
N = 30, N1 = 5, N2 = 6
k1 = ones(6,1)*(0:4);
 0 1 2 3 4
 0 1 2 3 4
 0 1 2 3 4
 0 1 2 3 4
 0 1 2 3 4
 0 1 2 3 4
k2 = (ones(5,1)*(0:5))';
 0 0 0 0 0
 1 1 1 1 1
 2 2 2 2 2
 3 3 3 3 3
 4 4 4 4 4
 5 5 5 5 5
k = k1 + N1*k2;
  0  1  2  3  4
  5  6  7  8  9
 10 11 12 13 14
 15 16 17 18 19
 20 21 22 23 24
 25 26 27 28 29
SLIDE 6 Multi-Dimensional Indexing
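\[ X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{nk} \]
Replace n and k.
\[ X(k_1 + N_1 k_2) = \sum_{n_2=0}^{N_2-1} \sum_{n_1=0}^{N_1-1} x(N_2 n_1 + n_2)\, W_N^{(N_2 n_1 + n_2)(k_1 + N_1 k_2)} \]
Multiply terms.
\[ X(k_1 + N_1 k_2) = \sum_{n_2=0}^{N_2-1} \sum_{n_1=0}^{N_1-1} x(N_2 n_1 + n_2)\, W_N^{N_2 n_1 k_1 + N_1 N_2 n_1 k_2 + n_2 k_1 + N_1 n_2 k_2} \]
Eliminate the N1N2 = N term.
\[ X(k_1 + N_1 k_2) = \sum_{n_2=0}^{N_2-1} \sum_{n_1=0}^{N_1-1} x(N_2 n_1 + n_2)\, W_N^{N_2 n_1 k_1 + n_2 k_1 + N_1 n_2 k_2} \]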
SLIDE 7 Multi-Dimensional Indexing
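\[ X(k_1 + N_1 k_2) = \sum_{n_2=0}^{N_2-1} \sum_{n_1=0}^{N_1-1} x(N_2 n_1 + n_2)\, W_N^{N_2 n_1 k_1 + n_2 k_1 + N_1 n_2 k_2} \]
Move out of summation.
\[ X(k_1 + N_1 k_2) = \sum_{n_2=0}^{N_2-1} W_N^{N_1 n_2 k_2}\, W_N^{n_2 k_1} \sum_{n_1=0}^{N_1-1} x(N_2 n_1 + n_2)\, W_N^{N_2 n_1 k_1} \]
Finally group terms.
\[ X(k_1 + N_1 k_2) = \sum_{n_2=0}^{N_2-1} \left[ W_N^{n_2 k_1} \sum_{n_1=0}^{N_1-1} x(N_2 n_1 + n_2)\, W_{N_1}^{n_1 k_1} \right] W_{N_2}^{n_2 k_2} \]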
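The elimination and grouping steps rest on three identities. They are spelled out here assuming the usual convention W_N = e^{-j2\pi/N}, which the slides do not state explicitly, and N = N1 N2:

\[ W_N^{N_1 N_2 n_1 k_2} = e^{-j 2\pi n_1 k_2} = 1 \]
\[ W_N^{N_2 n_1 k_1} = e^{-j 2\pi N_2 n_1 k_1 / (N_1 N_2)} = e^{-j 2\pi n_1 k_1 / N_1} = W_{N_1}^{n_1 k_1} \]
\[ W_N^{N_1 n_2 k_2} = e^{-j 2\pi N_1 n_2 k_2 / (N_1 N_2)} = e^{-j 2\pi n_2 k_2 / N_2} = W_{N_2}^{n_2 k_2} \]

The only factor that stays at size N is W_N^{n_2 k_1}, which couples the two smaller DFTs.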
SLIDE 8 Interpretation
\[ X(k_1 + N_1 k_2) = \sum_{n_2=0}^{N_2-1} \left[ W_N^{n_2 k_1} \sum_{n_1=0}^{N_1-1} x(N_2 n_1 + n_2)\, W_{N_1}^{n_1 k_1} \right] W_{N_2}^{n_2 k_2} \]
- For each n2 from 0 to N2-1, perform an N1 point DFT over the values of n1 (see next slide):
- DFT across rows.
x(n = [(0)N2,(1)N2,(2)N2,...,(N1-1)N2]+0 ) → x[n2=0, k1=0:N1-1]
x(n = [(0)N2,(1)N2,(2)N2,...,(N1-1)N2]+1 ) → x[n2=1, k1=0:N1-1]
x(n = [(0)N2,(1)N2,(2)N2,...,(N1-1)N2]+2 ) → x[n2=2, k1=0:N1-1]
...
x(n = [(0)N2,(1)N2,(2)N2,...,(N1-1)N2]+N2-1) → x[n2=N2-1, k1=0:N1-1]
- Multiply each intermediate term by the twiddle factor selected by k1 and n2.
SLIDE 9 Matlab Example
N = 30, N1 = 5, N2 = 6
n1 = ones(6,1)*(0:4); n2 = (ones(5,1)*(0:5))';
k1 = ones(6,1)*(0:4); k2 = (ones(5,1)*(0:5))';
(rows: n2 or k2 = 0:5, columns: n1 or k1 = 0:4)
n = N2*n1 + n2;      k = k1 + N1*k2;
 0  6 12 18 24        0  1  2  3  4
 1  7 13 19 25        5  6  7  8  9
 2  8 14 20 26       10 11 12 13 14
 3  9 15 21 27       15 16 17 18 19
 4 10 16 22 28       20 21 22 23 24
 5 11 17 23 29       25 26 27 28 29
SLIDE 10 Interpretation
\[ X(k_1 + N_1 k_2) = \sum_{n_2=0}^{N_2-1} \left[ W_N^{n_2 k_1} \sum_{n_1=0}^{N_1-1} x(N_2 n_1 + n_2)\, W_{N_1}^{n_1 k_1} \right] W_{N_2}^{n_2 k_2} \]
- For each k1 from 0 to N1-1, perform an N2 point DFT over the values of n2:
- DFT across columns.
x[n2=0:N2-1, k1=0 ] → X[k2=0:N2-1, k1=0 ]
x[n2=0:N2-1, k1=1 ] → X[k2=0:N2-1, k1=1 ]
x[n2=0:N2-1, k1=2 ] → X[k2=0:N2-1, k1=2 ]
...
x[n2=0:N2-1, k1=N1-1] → X[k2=0:N2-1, k1=N1-1]
SLIDE 12 Matlab Example
N = 30, N1 = 5, N2 = 6
n = N2*n1 + n2;
x(n1,n2=0) = x(0) x(6)  x(12) x(18) x(24)
x(n1,n2=1) = x(1) x(7)  x(13) x(19) x(25)
x(n1,n2=2) = x(2) x(8)  x(14) x(20) x(26)
x(n1,n2=3) = x(3) x(9)  x(15) x(21) x(27)
x(n1,n2=4) = x(4) x(10) x(16) x(22) x(28)
x(n1,n2=5) = x(5) x(11) x(17) x(23) x(29)
Perform N2 DFTs along n1
        k1 = 0 1 2 3 4
y(k1,n2=0) = fft(x(0) x(6)  x(12) x(18) x(24))
y(k1,n2=1) = fft(x(1) x(7)  x(13) x(19) x(25))
y(k1,n2=2) = fft(x(2) x(8)  x(14) x(20) x(26))
y(k1,n2=3) = fft(x(3) x(9)  x(15) x(21) x(27))
y(k1,n2=4) = fft(x(4) x(10) x(16) x(22) x(28))
y(k1,n2=5) = fft(x(5) x(11) x(17) x(23) x(29))
Multiply (Wm is the twiddle factor with exponent m = n2*k1)
  k1 = 0  1  2   3   4
n2=0 = W0 W0 W0  W0  W0
n2=1 = W0 W1 W2  W3  W4
n2=2 = W0 W2 W4  W6  W8
n2=3 = W0 W3 W6  W9  W12
n2=4 = W0 W4 W8  W12 W16
n2=5 = W0 W5 W10 W15 W20
SLIDE 13 Matlab Example
Multiply
  k1 = 0     1     2      3      4
n2=0 = W0y00 W0y10 W0y20  W0y30  W0y40
n2=1 = W0y01 W1y11 W2y21  W3y31  W4y41
n2=2 = W0y02 W2y12 W4y22  W6y32  W8y42
n2=3 = W0y03 W3y13 W6y23  W9y33  W12y43
n2=4 = W0y04 W4y14 W8y24  W12y34 W16y44
n2=5 = W0y05 W5y15 W10y25 W15y35 W20y45
fft along columns
  k1 = 0   1   2   3   4
k2=0 = z00 z10 z20 z30 z40
k2=1 = z01 z11 z21 z31 z41
k2=2 = z02 z12 z22 z32 z42
k2=3 = z03 z13 z23 z33 z43
k2=4 = z04 z14 z24 z34 z44
k2=5 = z05 z15 z25 z35 z45
Reorder based on k = k1 + N1*k2;
N = 30, N1 = 5, N2 = 6
k = k1 + N1*k2;
  0  1  2  3  4
  5  6  7  8  9
 10 11 12 13 14
 15 16 17 18 19
 20 21 22 23 24
 25 26 27 28 29
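The four steps of the example (row DFTs, twiddle multiply, column DFTs, reorder) can be sketched end to end and checked against a direct DFT. This is an illustrative sketch in Python rather than the Matlab of the slides. The function names are made up, and W_N = exp(-j*2*pi/N) is an assumed sign convention.

```python
import cmath


def dft(x):
    """Direct O(N^2) DFT with W_N = exp(-2j*pi/N) (assumed convention)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]


def dft_two_factor(x, N1, N2):
    """N-point DFT (N = N1*N2) built from N2 N1-point DFTs and N1 N2-point DFTs."""
    N = N1 * N2
    # Step 1: for each n2, N1-point DFT over n1 of x[N2*n1 + n2]  ->  y[n2][k1]
    y = [dft([x[N2 * n1 + n2] for n1 in range(N1)]) for n2 in range(N2)]
    # Step 2: multiply each intermediate term by W_N^(n2*k1)
    for n2 in range(N2):
        for k1 in range(N1):
            y[n2][k1] *= cmath.exp(-2j * cmath.pi * n2 * k1 / N)
    # Step 3: for each k1, N2-point DFT over n2  ->  z[k1][k2]
    z = [dft([y[n2][k1] for n2 in range(N2)]) for k1 in range(N1)]
    # Step 4: reorder the outputs with k = k1 + N1*k2
    X = [0j] * N
    for k1 in range(N1):
        for k2 in range(N2):
            X[k1 + N1 * k2] = z[k1][k2]
    return X


x = [complex(n % 7, (3 * n) % 5) for n in range(30)]  # arbitrary test data
err = max(abs(a - b) for a, b in zip(dft_two_factor(x, 5, 6), dft(x)))
print(err)  # rounding-level difference only
```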
SLIDE 14 Performance numbers
- 1x 30 point DFT:
- 30*30 (900) complex multiplications.
- 30*30 (900) complex additions.
- 6x 5 point DFTs:
- 6*5*5 (150) complex multiplications.
- 6*5*5 (150) complex additions.
- 6*5 (30) complex multiplications (twiddle factors).
- 5x 6 point DFTs:
- 5*6*6 (180) complex multiplications.
- 5*6*6 (180) complex additions.
- For a total of:
- 360 complex multiplications.
- 330 complex additions.
SLIDE 16 Performance
- 1x 6 point DFT:
- 6*6 (36) complex multiplications.
- 6*6 (36) complex additions.
- 3x 2 point DFTs:
- 3*2*2 (12) complex multiplications.
- 3*2*2 (12) complex additions.
- 3*2 (6) complex multiplications (twiddle factors).
- 2x 3 point DFTs:
- 2*3*3 (18) complex multiplications.
- 2*3*3 (18) complex additions.
- For a total of:
- 36 complex multiplications.
- 30 complex additions.
SLIDE 17 Performance: one step further
- 6x 5 point DFTs:
- 6*5*5 (150) complex multiplications.
- 6*5*5 (150) complex additions.
- 6*5 (30) complex multiplications (twiddle factors).
- 5x 6 point DFTs, each built from 2 point and 3 point DFTs:
- 5*36 (180) complex multiplications.
- 5*30 (150) complex additions.
- For a total of:
- 360 complex multiplications.
- 300 complex additions.
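The totals on the performance slides can be reproduced with a small counting sketch. It follows the slides' own counting convention (an N point direct DFT costs N*N complex multiplications and N*N complex additions). The function names are illustrative only.

```python
def direct_cost(N):
    """(multiplications, additions) for a direct N-point DFT, counted as on the slides."""
    return (N * N, N * N)


def split_cost(N1, N2, cost_N1=None, cost_N2=None):
    """Cost of an N1*N2-point DFT as N2 N1-point DFTs, N1*N2 twiddles, N1 N2-point DFTs.

    cost_N1 / cost_N2 optionally override the cost of the smaller DFTs,
    e.g. when one of them has itself been decomposed.
    """
    m1, a1 = cost_N1 or direct_cost(N1)
    m2, a2 = cost_N2 or direct_cost(N2)
    mults = N2 * m1 + N1 * N2 + N1 * m2
    adds = N2 * a1 + N1 * a2
    return (mults, adds)


print(direct_cost(30))                                  # 1x 30 point DFT
print(split_cost(5, 6))                                 # 6x 5 point + 5x 6 point
six = split_cost(2, 3)                                  # 6 point via 3x 2 point + 2x 3 point
print(six)
print(split_cost(5, 6, cost_N2=six))                    # one step further
```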
SLIDE 18 Basic building blocks
- 2 point fft (just addition and subtraction)
- X[0] = x[0] + x[1]
- Xr[0] = xr[0] + xr[1]
- Xi[0] = xi[0] + xi[1]
- X[1] = x[0] – x[1]
- Xr[1] = xr[0] - xr[1]
- Xi[1] = xi[0] - xi[1]
SLIDE 19 Basic building blocks
- 4 point fft (just addition, subtraction, swap r/i)
y00 = x[0] + x[2], y10 = x[0] - x[2]
y01 = x[1] + x[3], y11 = -j(x[1] - x[3])
z00 = y00 + y01, z10 = y10 + y11
z01 = y00 - y01, z11 = y10 - y11
X[0] = z00 = x[0] + x[1] + x[2] + x[3]
Xr[0] = xr[0] + xr[1] + xr[2] + xr[3]
Xi[0] = xi[0] + xi[1] + xi[2] + xi[3]
X[1] = z10 = x[0] - jx[1] - x[2] + jx[3]
Xr[1] = xr[0] + xi[1] - xr[2] - xi[3]
Xi[1] = xi[0] - xr[1] - xi[2] + xr[3]
X[2] = z01 = x[0] - x[1] + x[2] - x[3]
Xr[2] = xr[0] - xr[1] + xr[2] - xr[3]
Xi[2] = xi[0] - xi[1] + xi[2] - xi[3]
X[3] = z11 = x[0] + jx[1] - x[2] - jx[3]
Xr[3] = xr[0] - xi[1] - xr[2] + xi[3]
Xi[3] = xi[0] + xr[1] - xi[2] - xr[3]
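A short sketch of the 4 point block, written so that the multiplication by -j is done as a swap of real and imaginary parts with a sign change and no multiplier is needed. The function name `fft4` is illustrative.

```python
def fft4(x):
    """4-point DFT using only additions, subtractions and a real/imag swap."""
    y00 = x[0] + x[2]
    y10 = x[0] - x[2]
    y01 = x[1] + x[3]
    d = x[1] - x[3]
    y11 = complex(d.imag, -d.real)  # -j*d: swap r/i and negate the new imaginary part
    # Returned in natural order X[0], X[1], X[2], X[3]
    return [y00 + y01, y10 + y11, y00 - y01, y10 - y11]


print(fft4([1 + 0j, 2 + 0j, 3 + 0j, 4 + 0j]))
```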
SLIDE 20
8-point via 2x2x2
SLIDE 21
8-point via 2x2x2
[Figure: 8-point signal flow graph. Stage labels: 1 group of 4 butterflies (distance of 4); 2 groups of 2; 4 groups of 1.]
SLIDE 22 8-point sequence
- Distances:
- (butterfly distance, group distance)
- The twiddle multiplier step equals the number of groups
- 1 group with a distance of 8
- 0 1 2 3
- 4 5 6 7 .* W0 W1 W2 W3
- 2 groups with a distance of 4
- 0 1, 4 5
- 2 3, 6 7 .* W0 W2, W0 W2
- 4 groups with a step/distance of 2
- 0, 2, 4, 6
- 1, 3, 5, 7 .* W0, W0, W0, W0
SLIDE 23 16-point sequence
- 1 group with a distance of 16
- 0 1 2 3 4 5 6 7
- 8 9 10 11 12 13 14 15 .* W0 W1 W2 ... W7
- Group counter: limit 16, step 16
- Butterfly counter: limit 8, step 1
- 2 groups with a distance of 8
- 0 1 2 3, 8 9 10 11
- 4 5 6 7, 12 13 14 15 .* W0 W2 W4 W6, ...
- Group counter: limit 16, step 8
- Butterfly counter: limit 4, step 1
- 4 groups with a distance of 4
- 0 1, 4 5, 8 9, 12 13
- 2 3, 6 7, 10 11, 14 15 .* W0 W4, W0 ...
- Group counter: limit 16, step 4
- Butterfly counter: limit 2, step 1
- 8 groups with a distance of 2
- 0, 2, 4, 6, 8, 10, 12, 14
- 1, 3, 5, 7, 9, 11, 13, 15 .* W0, W0 ...
- Group counter: limit 16, step 2
- Butterfly counter: limit 1, step 1
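The group and butterfly counters listed above can be sketched as two nested loops, one stepping by the group distance and one stepping by 1. The function name `stage_pairs` and the tuple layout (top index, bottom index, twiddle exponent) are choices made for this sketch only.

```python
def stage_pairs(N, distance):
    """List (top, bottom, twiddle_exponent) for one stage.

    distance is the group distance of the stage: N, N/2, ..., 2.
    """
    half = distance // 2            # butterfly distance and butterfly counter limit
    twiddle_step = N // distance    # number of groups in this stage
    rows = []
    for group_start in range(0, N, distance):  # group counter: limit N, step distance
        for b in range(half):                  # butterfly counter: limit half, step 1
            top = group_start + b
            rows.append((top, top + half, b * twiddle_step))
    return rows


for distance in (16, 8, 4, 2):
    print(distance, stage_pairs(16, distance))
```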
SLIDE 24 Output Sequencing for N=8
- X[0] is located at 0:000
- X[1] is located at 4:100
- X[2] is located at 2:010
- X[3] is located at 6:110
- X[4] is located at 1:001
- X[5] is located at 5:101
- X[6] is located at 3:011
- X[7] is located at 7:111
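The output locations listed above are the bit-reversed values of the output index. A minimal sketch, with `bit_reverse` as an illustrative helper name:

```python
def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)  # shift the result left and append the next LSB of i
        i >>= 1
    return r


for k in range(8):
    loc = bit_reverse(k, 3)
    print(f"X[{k}] is located at {loc}:{loc:03b}")
```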
SLIDE 26 Basic building blocks
- 2 point DFT with a Multiply
- Xr[0] = xr[0] + xr[1]
- Xi[0] = xi[0] + xi[1]
- X[1] = (x[0] - x[1])*(Wr + jWi)
- Xr[1] = Wr*(xr[0] - xr[1]) - Wi*(xi[0] - xi[1])
- Xi[1] = Wi*(xr[0] - xr[1]) + Wr*(xi[0] - xi[1])
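The same butterfly written with real arithmetic only (four real multiplies and two real adds for the twiddle product), checked against complex arithmetic. The function name and argument order are assumptions of this sketch.

```python
def butterfly(xr0, xi0, xr1, xi1, wr, wi):
    """2-point DFT followed by a twiddle multiply on the difference output."""
    Xr0 = xr0 + xr1
    Xi0 = xi0 + xi1
    dr = xr0 - xr1          # real part of x[0] - x[1]
    di = xi0 - xi1          # imaginary part of x[0] - x[1]
    Xr1 = wr * dr - wi * di
    Xi1 = wi * dr + wr * di
    return Xr0, Xi0, Xr1, Xi1


result = butterfly(1.0, 2.0, 3.0, -1.0, 0.6, -0.8)
reference = ((1 + 2j) - (3 - 1j)) * (0.6 - 0.8j)
print(result, reference)
```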
SLIDE 27 Basic building blocks
- 2 point fft
- Xr[0] = xr[0] + xr[1]
- Xi[0] = xi[0] + xi[1]
- Xr[1] = xr[0] - xr[1]
- Xi[1] = xi[0] - xi[1]
- 2x DSP48E1 Blocks
- 2x 24-bit adder
- 2x 24-bit subtraction
[Diagram: inputs {xr[0],xi[0]} and {xr[1],xi[1]}; outputs {Xr[0],Xi[0]} and {Xr[1],Xi[1]}.]
SLIDE 28 Basic building blocks
- 2-point DFT with twiddle multiply.
- Similar utilization as 1 channel of the IIR.
- 16-point latency
- 8*4 = 32
- In general (N/2)*log2(N)
- 1024-point
- 512*10 = 5120 vs
- 1024*1024 = 1048576
- About 200 times faster (1048576/5120 = 204.8)
- Complexity in counting, indexing
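A quick check of the latency figures, assuming one butterfly per cycle for the FFT and one multiply-accumulate per cycle for the direct DFT (the counting the slide numbers imply). Function names are illustrative.

```python
def fft_cycles(N):
    """(N/2) * log2(N) butterflies for a power-of-two N."""
    log2n = N.bit_length() - 1
    return (N // 2) * log2n


def dft_cycles(N):
    """N * N multiply-accumulates for a direct DFT."""
    return N * N


for N in (16, 32, 1024):
    print(N, fft_cycles(N), dft_cycles(N), dft_cycles(N) / fft_cycles(N))
```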
SLIDE 29
8-point via 2x2x2
SLIDE 30 Implementation
- Three Counters
- Stage Counter 0:log2(N)-1
- Group Counter 0:(2^Stage)-1
- Butterfly Counter 0:(N/2^(Stage+1))-1
SLIDE 31 Implementation
- Three Counters
- Stage Counter
- Group Counter
- Butterfly Counter
- Stage 0
- Group counts 0:0 (there is 1 group)
- Butterfly counts 0:3
- Stage 1
- Group counts 0:1
- Butterfly counts 0:1
- Stage 2
- Group counts 0:3
- Butterfly counts 0:0
SLIDE 32 Implementation
- Three Counters
- Stage Counter 0:log2(N)-1
- Group Counter 0:(2^Stage)-1
- Butterfly Counter 0:((N/2)/(2^Stage))-1
- Stage 0
- Group counts 0:0 (there is 1 group)
- Butterfly counts 0:3
- Stage 1
- Group counts 0:1
- Butterfly counts 0:1
- Stage 2
- Group counts 0:3
- Butterfly counts 0:0
SLIDE 33 Implementation
- Three Counters
- Stage Counter 0:log2(N)-1
- Group Counter 0:(2^Stage)-1
- Butterfly Counter 0:((N/2)/(2^Stage))-1
[Diagram: three cascaded counters, stage (s), group (g) and b-fly (b), each with limit, count, step, reset, enable and wrap ports. Limits: log2(N) for stage, 1<<s for group, (N/2)>>s for b-fly. The combined wraps produce done.]
Depending on how you implement your counter module, the "and" could be contained within the counter.
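A behavioural sketch of the cascaded counters, assuming the b-fly counter is the innermost one, each wrap enables the next counter, and done is the "and" of all three wraps. The names `tick` and `run_counters` are hypothetical and this is a software model, not the hardware module itself.

```python
def tick(count, limit, step=1):
    """One enabled clock of a wrapping counter: returns (next_count, wrap)."""
    nxt = count + step
    if nxt >= limit:
        return 0, True
    return nxt, False


def run_counters(N):
    """Return the (stage, group, b_fly) sequence for a power-of-two N."""
    log2n = N.bit_length() - 1
    s = g = b = 0
    seq = []
    done = False
    while not done:
        seq.append((s, g, b))
        g_wrap = s_wrap = False
        b, b_wrap = tick(b, (N // 2) >> s)       # b-fly limit: (N/2) >> s
        if b_wrap:                               # group enabled by b-fly wrap
            g, g_wrap = tick(g, 1 << s)          # group limit: 1 << s
        if b_wrap and g_wrap:                    # stage enabled by both wraps
            s, s_wrap = tick(s, log2n)           # stage limit: log2(N)
        done = b_wrap and g_wrap and s_wrap
    return seq


for row in run_counters(8):
    print(row)
```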
SLIDE 34
Implementation N=32
SLIDE 36
Implementation Data Storage
SLIDE 37 Implementation Data Storage
[Block diagram: dual-port Data RAM (di_a, do_a, addr_a, di_b, do_b, addr_b, reset, enable); 2-pt DFT with Mult (di_a, di_b, coef, do_a, do_b); Coef ROM (addr_a, do_a, reset, enable).]
SLIDE 38 Implementation Addressing
- Addressing the 2-point DFTs
Stage 0: 0 0 0 0
group = 0 0 0 0
b-fly = 0 1 2 3
xb[0] = x[0], x[1], x[2], x[3]
xb[1] = x[4], x[5], x[6], x[7]
Stage 1: 1 1 1 1
group = 0 0 1 1
b-fly = 0 1 0 1
xb[0] = x[0], x[1], x[4], x[5]
xb[1] = x[2], x[3], x[6], x[7]
Stage 2: 2 2 2 2
group = 0 1 2 3
b-fly = 0 0 0 0
xb[0] = x[0], x[2], x[4], x[6] : address_a
xb[1] = x[1], x[3], x[5], x[7] : address_b
SLIDE 39
Implementation addressing
For the N = 8 addressing table (xb[0] at address_a, xb[1] at address_b):
address_a = (b_fly) + (group << (3 - stage))
SLIDE 40
Implementation
For the N = 8 addressing table (xb[0] at address_a, xb[1] at address_b):
address_b = address_a + (4 >> stage)
SLIDE 41
Implementation
In general, for an N point transform:
address_a = (b_fly) + (group << (log2(N) - stage))
address_b = address_a + ((N/2) >> stage)
SLIDE 42
32-point fft
SLIDE 43 Coefficients WN
Stage 0: 0 0 0 0
group = 0 0 0 0
b_fly = 0 1 2 3
xb[0] = x[0], x[1], x[2], x[3]
xb[1] = x[4], x[5], x[6], x[7]
WN = W[0], W[1], W[2], W[3]
Stage 1: 1 1 1 1
group = 0 0 1 1
b_fly = 0 1 0 1
xb[0] = x[0], x[1], x[4], x[5]
xb[1] = x[2], x[3], x[6], x[7]
WN = W[0], W[2], W[0], W[2]
Stage 2: 2 2 2 2
group = 0 1 2 3
b_fly = 0 0 0 0
xb[0] = x[0], x[2], x[4], x[6] : address_a
xb[1] = x[1], x[3], x[5], x[7] : address_b
WN = W[0], W[0], W[0], W[0] : address_c
address_a = (b_fly) + (group << (log2(N) - stage))
address_b = address_a + ((N/2) >> stage)
address_c = (b_fly << stage)
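The three address formulas are enough to run a complete in-place radix-2 FFT. The sketch below is a software model of that loop, assuming W[m] = exp(-j*2*pi*m/N) is stored in the coefficient ROM and that the result is left in bit-reversed order, as on the output sequencing slide. The function names are illustrative.

```python
import cmath


def addresses(N, stage, group, b_fly):
    """Return (address_a, address_b, address_c) from the slide formulas."""
    log2n = N.bit_length() - 1
    a = b_fly + (group << (log2n - stage))
    b = a + ((N // 2) >> stage)
    c = b_fly << stage
    return a, b, c


def fft_in_place(x):
    """Radix-2 FFT of a power-of-two-length list; output is in bit-reversed order."""
    N = len(x)
    log2n = N.bit_length() - 1
    data = list(x)                                            # models the data RAM
    W = [cmath.exp(-2j * cmath.pi * m / N) for m in range(N // 2)]  # models the coef ROM
    for stage in range(log2n):
        for group in range(1 << stage):
            for b_fly in range((N // 2) >> stage):
                a, b, c = addresses(N, stage, group, b_fly)
                xa, xb = data[a], data[b]
                data[a] = xa + xb                 # 2-pt DFT sum output
                data[b] = (xa - xb) * W[c]        # difference output times twiddle
    return data


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i (used to read the outputs in order)."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


x = [complex(n, 7 - n) for n in range(8)]
raw = fft_in_place(x)
X = [raw[bit_reverse(k, 3)] for k in range(8)]   # X[k] in natural order
print(X)
```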
SLIDE 44 Implementation
[Block diagram: dual-port Data RAM (di_a, do_a, addr_a, di_b, do_b, addr_b, reset, enable); 2-pt DFT with Mult (di_a, di_b, coef, do_a, do_b); Coef ROM (addr_a, do_a, reset, enable).]
SLIDE 45
Basic building blocks
2 point DFT with a Multiply
Xr[0] = xr[0] + xr[1]
Xi[0] = xi[0] + xi[1]
X[1] = (x[0] - x[1])*(Wr + jWi)
Xr[1] = Wr*(xr[0] - xr[1]) - Wi*(xi[0] - xi[1])
Xi[1] = Wi*(xr[0] - xr[1]) + Wr*(xi[0] - xi[1])
SLIDE 48
8-point via 2x2x2
SLIDE 50 Implementation
[Block diagram: dual-port Data RAM (di_a, do_a, addr_a, di_b, do_b, addr_b, reset, enable); 2-pt DFT with Mult (di_a, di_b, coef, do_a, do_b); Coef ROM (addr_a, do_a, reset, enable); second dual-port Data RAM (di_a, do_a, addr_a, di_b, do_b, addr_b, reset, enable).]
SLIDE 51
SLIDE 52
Results +j
SLIDE 53 32-point comparison
- DFT
- Main Processing Operation
- complex multiply
- 4 multiplies
- 2 adds
- complex adder
- 2 adds
- FFT
- Main Processing Operation
- complex multiply
- 4 multiplies
- 2 adds
- 2 complex adders
- 4 adds
SLIDE 54 32-point comparison
Overhead
SLIDE 55 32-point comparison
- DFT
- Counters & Latency
- k-counter
- 0:N-1
- n-counter
- 0:N-1
- FFT
- Counters & Latency
- b_fly & group:
- 0:N/2-1
- Stage
- 0:log2(N)-1
SLIDE 56 Comparison, 32 and 1024 point
- DFT
- Latency
- 32-point: 32*32 = 1024
- 1024-point: 1024*1024 = 1048576
- FFT
- Latency
- 32-point: 16*5 = 80
- 1024-point: 512*10 = 5120