Hardware Implementations of Fixed-Point Atan2, by Florent de Dinechin and Matei Iștoan



slide-1
SLIDE 1

Hardware Implementations of Fixed-Point Atan2

Florent de Dinechin, Matei Iștoan

Université de Lyon, INRIA, INSA-Lyon, CITI-Lab

ARITH22

slide-2
SLIDE 2

Introduction: Methods for computing Atan2

Methods for Computing atan2 in Hardware

Yet another arithmetic function...

  • ... that is useful in telecom, to recover the phase of a signal (12–24 bits of precision)
  • ... and in general for Cartesian-to-polar coordinate transformation
  • and an interesting function, nonetheless

[Figure: plot of arctan(y/x); axes x and y]

slide-3
SLIDE 3

Common Specification

  • target function

$f(x, y) = \frac{1}{\pi} \arctan\left(\frac{y}{x}\right)$

[Figure: the (x, y) plane with both axes from −1 to 1, a point (x, y) and its angle $\alpha = \mathrm{atan2}(y, x)$]

  • input: fixed-point format in $[-1, 1)$

$\arctan\left(\frac{ky}{kx}\right) = \arctan\left(\frac{y}{x}\right)$

  • output: fixed-point format and binary angles

[Figure: unit circle through (1, 0), (0, 1) and (−1, 0); angles from $-\pi$ to $\pi$ are mapped to $[-1, 1)$]
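
The specification above can be illustrated with a small floating-point reference model. This is only a sketch: the function name `atan2_binary_angle` and the parameter `frac_bits` (number of fractional bits of the output) are assumptions made here, not names from the talk.

```python
import math

def atan2_binary_angle(x, y, frac_bits):
    """Reference model: (1/pi)*atan2(y, x) rounded to a signed fixed-point
    code with frac_bits fractional bits, representing a value in [-1, 1)."""
    angle = math.atan2(y, x) / math.pi        # real value in [-1, 1]
    code = round(angle * 2 ** frac_bits)      # nearest fixed-point code
    span = 2 ** (frac_bits + 1)               # number of representable codes
    # Binary angles wrap around: +1 (angle pi) is the same point as -1.
    return (code + span // 2) % span - span // 2

print(atan2_binary_angle(1.0, 1.0, 8))    # 45 degrees is 0.25 in binary angles
print(atan2_binary_angle(-1.0, 0.0, 8))   # angle pi wraps to the code for -1
```

The wrap-around is the point of binary angles: the output format covers exactly one turn, so no code is wasted.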

slide-8
SLIDE 8

A Meaningful Comparison

3 different methods for evaluating atan2 in hardware

  • same accuracy specification: $f(x, y)$ computed with last-bit accuracy (faithful rounding)
  • same implementation effort

Target platform: FPGAs (Field Programmable Gate Arrays)
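
As a hedged illustration of the accuracy specification, faithful rounding can be read as "the error stays below one unit in the last place". The helper names below (`ulp`, `is_faithful`) and the choice of 12 fractional bits are assumptions of this sketch.

```python
import math

def ulp(w):
    """Weight of the last bit for a result with w fractional bits."""
    return 2.0 ** -w

def is_faithful(computed, exact, w):
    """True when the error is strictly below one ulp (last-bit accuracy)."""
    return abs(computed - exact) < ulp(w)

w = 12
exact = math.atan2(0.3, 0.6) / math.pi
truncated = math.floor(exact * 2 ** w) / 2 ** w   # error below 1 ulp
nearest = round(exact * 2 ** w) / 2 ** w          # error at most 0.5 ulp

print(is_faithful(truncated, exact, w), is_faithful(nearest, exact, w))
print(is_faithful(truncated + 2 * ulp(w), exact, w))
```

Both truncation and round-to-nearest of the exact value pass the check; a result two ulps above the truncated value does not.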

slide-9
SLIDE 9

Hello FPGAs!

Island-style homogeneous FPGAs

slide-10
SLIDE 10

CORDIC

First Method: An Unrolled CORDIC

Initialization: $x_0 = x$, $y_0 = y$, $\alpha_0 = 0$

Iteration:

$x_{i+1} = x_i - 2^{-i} s_i y_i$
$y_{i+1} = y_i + 2^{-i} s_i x_i$
$\alpha_{i+1} = \alpha_i - s_i \arctan 2^{-i}$

Convergence:

$x_n \to K \sqrt{x^2 + y^2}$
$y_n \to 0$
$\alpha_n \to \arctan\frac{y}{x}$
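
The iteration can be tried out in floating point. The sketch below assumes the usual vectoring-mode rule $s_i = -\mathrm{sign}(y_i)$ and $x > 0$, neither of which is spelled out on the slide; `cordic_atan2` is a hypothetical name, and no fixed-point rounding is modelled.

```python
import math

def cordic_atan2(x, y, n=24):
    """Run n CORDIC iterations in vectoring mode; assumes x > 0.

    Returns (x_n, y_n, alpha_n), computed in floating point.
    """
    alpha = 0.0
    for i in range(n):
        s = -1 if y >= 0 else 1            # assumed rule: s_i = -sign(y_i)
        x, y = x - s * y * 2.0**-i, y + s * x * 2.0**-i
        alpha -= s * math.atan(2.0**-i)
    return x, y, alpha

# Scale factor K accumulated by the n micro-rotations
K = math.prod(math.sqrt(1 + 4.0**-i) for i in range(24))

xn, yn, alpha = cordic_atan2(0.6, 0.3)
print(alpha, math.atan2(0.3, 0.6))         # alpha_n approaches arctan(y/x)
print(xn, K * math.hypot(0.6, 0.3))        # x_n approaches K*sqrt(x^2 + y^2)
print(yn)                                  # y_n approaches 0
```

Each iteration only needs shifts, additions and one table constant, which is why the unrolled operator maps well to logic.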

slide-11
SLIDE 11

CORDIC Iteration: Datapath Implementation

slide-14
SLIDE 14

CORDIC Iteration: Accurate Datapath Implementation

⇒ $p = w - 1 - \lceil \log_2 \varepsilon_{w-1} \rceil$ bits for the $x_i$ and $y_i$ datapath

  • we can stop updating $x_i$ when $2i - 1 > p$ (unrolled operator)

⇒ $g_\alpha = 1 + \lceil \log_2((w - 1) \times 0.5) \rceil$ guard bits for the $\alpha_i$ datapath

slide-15
SLIDE 15

Hello, again, FPGAs!

Current heterogeneous FPGAs

slide-16
SLIDE 16

Recip-Mult-Atan

Polynomial Approximations

Polynomial approximations and their derivatives (bipartite, etc.):

  • the straightforward solution for implementing univariate functions
  • problem: area asymptotically exponential in the input width... and for a bivariate function, we double the input width
  • solutions:
      • range reduction?
      • multiple consecutive one-input functions?
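
To make the "exponential in the input width" remark concrete, here is a back-of-the-envelope sketch for a plain lookup table with one entry per input value. The function name `table_size_bits` is hypothetical, and real table-based operators (bipartite, multipartite) are smaller than this worst case.

```python
def table_size_bits(input_bits, output_bits):
    """Bits stored by a plain table with one entry per possible input."""
    return (2 ** input_bits) * output_bits

for w in (8, 12, 16):
    univariate = table_size_bits(w, w)        # one w-bit input
    bivariate = table_size_bits(2 * w, w)     # two w-bit inputs, concatenated
    print(w, univariate, bivariate)
```

Doubling the input width squares the number of entries, which is what motivates range reduction or a chain of one-input functions.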

slide-19
SLIDE 19

The $1/x$ and $\arctan(x)$ Functions

$\arctan\left(\frac{y}{x}\right) = \arctan\left(y \times \frac{1}{x}\right)$

[Figures: the reciprocal function; the arctangent function]
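
The decomposition into a reciprocal, a multiplication and an arctangent can be sketched as three quantized steps. Everything named here (`quantize`, `recip_mult_atan`, `frac_bits`) is an assumption for illustration, and the sketch uses floating-point library functions where hardware would use tables or polynomials.

```python
import math

def quantize(value, frac_bits):
    """Round value to the nearest multiple of 2**-frac_bits."""
    scale = 2 ** frac_bits
    return round(value * scale) / scale

def recip_mult_atan(x, y, frac_bits=20):
    """Approximate (1/pi)*arctan(y/x) in three steps, for 0 <= y <= x, x > 0."""
    r = quantize(1.0 / x, frac_bits)                     # reciprocal function
    z = quantize(y * r, frac_bits)                       # multiplication y * (1/x)
    return quantize(math.atan(z) / math.pi, frac_bits)   # arctangent function

print(recip_mult_atan(0.75, 0.25), math.atan2(0.25, 0.75) / math.pi)
```

Each step is a one-input function or a multiplier, so no table ever sees both inputs at once.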

slide-20
SLIDE 20

Range Reductions – Symmetry and Parity

[Figure: the (x, y) plane, axes x and y]

$\arctan\left(\frac{y}{x}\right) = -\frac{\pi}{2} - \arctan\left(\frac{|x|}{|y|}\right)$
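
A possible way to exploit symmetry and parity is to reduce every input to the first octant and fix up the angle afterwards. The sketch below is one such reduction, assuming $(x, y) \neq (0, 0)$; the function name `atan2_by_symmetry` is hypothetical and the talk's exact case split may differ.

```python
import math

def atan2_by_symmetry(x, y):
    """atan2(y, x) using arctan only on [0, 1]; assumes (x, y) != (0, 0)."""
    ax, ay = abs(x), abs(y)
    swap = ay > ax
    small, large = (ax, ay) if swap else (ay, ax)
    angle = math.atan(small / large)     # first octant: in [0, pi/4]
    if swap:
        angle = math.pi / 2 - angle      # mirror across the diagonal y = x
    if x < 0:
        angle = math.pi - angle          # mirror across the y axis
    if y < 0:
        angle = -angle                   # parity in y
    return angle

# x < 0, y < 0 reproduces the identity -pi/2 - arctan(|x|/|y|)
print(atan2_by_symmetry(-0.3, -0.6), -math.pi / 2 - math.atan(0.3 / 0.6))
```

After this reduction the arctangent only needs to be evaluated for arguments between 0 and 1.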

slide-21
SLIDE 21

Range Reductions – Scaling

$\arctan\left(\frac{2^s y}{2^s x}\right) = \arctan\left(\frac{y}{x}\right)$

[Figure: the (x, y) domain split into regions s = 0, s = 1, s = 2, s = 3, each scaled to the normalized domain]

[Figure: block diagram: |x| and |y| feed a bitwise OR, then a leading-zero counter (LZC) that produces s; ShiftX and ShiftY shift by s to produce xr and yr]
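
The block diagram can be mimicked on integers. This is a minimal sketch, assuming unsigned magnitudes smaller than `2**width` that are not both zero; the name `normalize` is hypothetical.

```python
def normalize(abs_x, abs_y, width):
    """Shift both magnitudes left until the larger one has its top bit set.

    abs_x, abs_y: non-negative integers below 2**width, not both zero.
    Returns (xr, yr, s).
    """
    merged = abs_x | abs_y                 # bitwise OR of the two magnitudes
    s = width - merged.bit_length()        # leading-zero count (LZC) of the OR
    return abs_x << s, abs_y << s, s

xr, yr, s = normalize(0b00010110, 0b00000101, width=8)
print(bin(xr), bin(yr), s)
```

Both inputs are shifted by the same amount, so the ratio y/x, and therefore the angle, is unchanged, while the larger input lands in the upper half of its range.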

slide-22
SLIDE 22

The $1/x$ and $\arctan(x)$ Functions – Reduced Domain

[Figures: reciprocal function on $[0.5, 1)$; arctangent function on $[0, 1)$]

Now we can evaluate them with tables, multipartite tables, polynomial approximators, etc. (all available as faithful FloPoCo operators)

slide-23
SLIDE 23

Reciprocal-Multiply-Arctangent

slide-26
SLIDE 26

Reciprocal-Multiply-Arctangent: Datapath Dimensioning

Goal: minimize architecture cost, such that $|\varepsilon_{total}| < \mathrm{ulp} = 2^{-w}$

$|\varepsilon_{total}| < \frac{1}{3} |\varepsilon_{recip}| + \frac{1}{3} |\varepsilon_{mult}| + |\varepsilon_{atan}|$
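
One plausible reading of the factors of 1/3 (an interpretation, not stated on the slide): the reciprocal and multiplier errors perturb the argument $z = y \times \frac{1}{x}$ of the arctangent, and the target function scales any such perturbation by its derivative, which never exceeds $1/\pi < 1/3$. The symbols $z$ and $\varepsilon_z$ are introduced here for illustration only.

```latex
\left|\frac{d}{dz}\,\frac{1}{\pi}\arctan z\right|
  = \frac{1}{\pi\,(1+z^2)} \le \frac{1}{\pi} < \frac{1}{3}
\quad\Longrightarrow\quad
\left|\tfrac{1}{\pi}\arctan(z+\varepsilon_z) - \tfrac{1}{\pi}\arctan z\right|
  < \tfrac{1}{3}\,|\varepsilon_z|
  \quad (\varepsilon_z \neq 0)
```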

slide-27
SLIDE 27

Delays, Delays, Delays

slide-28
SLIDE 28

Bi-variate Polynomial Approximations

The $\arctan\left(\frac{y}{x}\right)$ Function

[Figure: plot of arctan(y/x); axes x and y]

slide-29
SLIDE 29

First Order Bi-variate Polynomial Approximation

$\alpha \approx T_1(x, y) = ax + by + c$
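
As a sketch of where such coefficients could come from, the code below takes a first-order Taylor expansion of $\frac{1}{\pi}\mathrm{atan2}(y, x)$ around a point $(x_0, y_0)$. The choice of expansion point and the name `taylor1_coefficients` are assumptions of this example; in hardware the coefficients would be read from a table.

```python
import math

def taylor1_coefficients(x0, y0):
    """Coefficients (a, b, c) of the first-order Taylor expansion of
    (1/pi)*atan2(y, x) around (x0, y0), written as a*x + b*y + c."""
    r2 = x0 * x0 + y0 * y0
    a = -y0 / (math.pi * r2)      # partial derivative with respect to x
    b = x0 / (math.pi * r2)       # partial derivative with respect to y
    c = math.atan2(y0, x0) / math.pi - a * x0 - b * y0
    return a, b, c

a, b, c = taylor1_coefficients(0.75, 0.25)
x, y = 0.75 + 2**-6, 0.25 + 2**-6          # a nearby point
approx = a * x + b * y + c
exact = math.atan2(y, x) / math.pi
print(abs(approx - exact))                 # error grows with the squared offset
```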

slide-32
SLIDE 32

Second Order Bi-variate Polynomial Approximation

$\alpha \approx T_2(x, y) = ax + by + c + dx^2 + ey^2 + fxy$

Goal: $|\varepsilon_{total}| < \mathrm{ulp} = 2^{-w}$

$\varepsilon_{total} = \varepsilon_{meth} + \varepsilon_{rnd} + \varepsilon_{final\ rnd}$
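
The method error $\varepsilon_{meth}$ of a second-order approximation can be observed numerically. The sketch below is a Taylor expansion written in offsets `dx = x - x0`, `dy = y - y0` from an expansion point, which is an assumption of this example; `taylor2` is a hypothetical name and no coefficient rounding is modelled.

```python
import math

def taylor2(x0, y0, x, y):
    """Second-order Taylor expansion of (1/pi)*atan2(y, x) around (x0, y0)."""
    r2 = x0 * x0 + y0 * y0
    k1 = 1.0 / (math.pi * r2)
    k2 = 1.0 / (math.pi * r2 * r2)
    c = math.atan2(y0, x0) / math.pi
    a = -y0 * k1                      # first derivative in x
    b = x0 * k1                       # first derivative in y
    d = x0 * y0 * k2                  # half of the second derivative in x
    e = -x0 * y0 * k2                 # half of the second derivative in y
    f = (y0 * y0 - x0 * x0) * k2      # mixed second derivative
    dx, dy = x - x0, y - y0
    return a * dx + b * dy + c + d * dx * dx + e * dy * dy + f * dx * dy

x, y = 0.75 + 2**-6, 0.25 + 2**-6
method_error = abs(taylor2(0.75, 0.25, x, y) - math.atan2(y, x) / math.pi)
print(method_error)                   # shrinks with the cube of the offset
```

Because the neglected terms are of third order, halving the offsets reduces this error by roughly a factor of eight.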

slide-33
SLIDE 33

Comparisons

Comparisons: Logic-only Synthesis

CORDIC

Bitwidth   LUT     Latency (ns)
8          173     9.3
12         435     14.6
16         734     19.7
24         1504    31.0
32         2606    43.1

Taylor degree 1

Bitwidth   LUT     Latency (ns)
8          207     12.64
12         1258    14.74
16         37744   20.20

Taylor degree 2

Bitwidth   LUT     Latency (ns)
8          356     13.72
12         469     14.75
16         1509    17.90

RecipMultAtan

Bitwidth   Method     LUT     Latency (ns)
8          degree 0   175     11.8
12         degree 0   683     16.2
12         degree 1   443     19.0
16         degree 1   1049    19.1
24         degree 2   2583    35.2
32         degree 2   6190    40.7
32         degree 3   5423    50.8

slide-34
SLIDE 34

Logic-only Synthesis: Area

[Figure: LUT count (log scale, $10^3$ to $10^4$) versus bitwidth (8, 12, 16) for Taylor 1, Taylor 2, CORDIC and RecipMultAtan]

slide-35
SLIDE 35

Logic-only Synthesis: Delay

[Figure: latency (ns) versus bitwidth (8, 12, 16, 24, 32) for CORDIC, Taylor 1, Taylor 2 and RecipMultAtan; latency axis marked at 9.3, 14.6, 19.7, 31 and 41.3]

slide-36
SLIDE 36

Comparisons: 16-bit Pipelined Architectures

Method            LUT + Reg.   BRAM + DSP   Speed (cycles@freq.)
CORDIC            816 + 44     0 + 0        2@191
                  799 + 202                 5@274
                  796 + 336                 8@389
RecipMultAtan 1   320 + 51     2 + 1        2@112
                  315 + 68                  3@199
RecipMultAtan 2   425 + 199    0 + 5        10@130
                  432 + 250                 14@253
Taylor degree 2   331 + 53     4 + 6        1@135
                  327 + 103                 3@144
                  329 + 140                 5@220

slide-37
SLIDE 37

Conclusions

Very unlike "Fixed-Point Trigonometric Functions on FPGAs" (in HEART 2013)

CORDIC?

  • efficient
  • scales well

Polynomial Approximations?

  • limited by memory requirements
  • no unique optimal solution
  • Could it be saved by better bivariate polynomial approximation?

slide-38
SLIDE 38

Questions?

Thank you for your attention!

slide-40
SLIDE 40

First Method: CORDIC

slide-41
SLIDE 41

The CORDIC Iteration: Datapath Dimensioning

The $x_i$ datapath:

  • $\tilde{x}_{i+1} = \tilde{x}_i - s_i 2^{-i} \tilde{y}_i + u^x_i$
    $= x_i + \varepsilon^x_i - s_i 2^{-i} (y_i + \varepsilon^y_i) + u^x_i$
    $= x_{i+1} + \varepsilon^x_i - s_i 2^{-i} \varepsilon^y_i + u^x_i$
    where $\tilde{x}_i = x_i + \varepsilon^x_i$ and $\tilde{y}_i = y_i + \varepsilon^y_i$ are the computed values and $u^x_i$ is the rounding error of iteration $i$
  • using the error bound $\varepsilon_i$ for $\varepsilon^x_i$ and $\varepsilon^y_i$, and $u^x_i < 2^{-p}$:
    $\varepsilon_{i+1} = \varepsilon_i (1 + 2^{-i}) + 2^{-p}$

⇒ $p = w - 1 - \lceil \log_2 \varepsilon_{w-1} \rceil$ bits for the $x_i$ and $y_i$ datapath

  • we can stop updating $x_i$ when $2i - 1 > p$ (useful because the operator is unrolled)
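
The error recurrence $\varepsilon_{i+1} = \varepsilon_i (1 + 2^{-i}) + 2^{-p}$ is easy to evaluate exactly. The sketch below assumes exact inputs ($\varepsilon_0 = 0$) and expresses the bound in units of $2^{-p}$; both choices, and the function name, are assumptions of this example. It only evaluates the recurrence and does not derive the value of $p$.

```python
import math
from fractions import Fraction

def cordic_error_bound(iterations):
    """Error bound after the given number of iterations, in units of 2**-p,
    following e_{i+1} = e_i * (1 + 2**-i) + 1 with e_0 = 0."""
    e = Fraction(0)
    for i in range(iterations):
        e = e * (1 + Fraction(1, 2 ** i)) + 1
    return e

for n in (1, 2, 3, 15):
    bound = cordic_error_bound(n)
    print(n, float(bound), math.ceil(math.log2(bound)))
```

The bound grows roughly linearly with the number of iterations once $2^{-i}$ becomes small, so its base-2 logarithm, the quantity that enters the datapath width, grows slowly.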

slide-42
SLIDE 42

The CORDIC Iteration: Datapath Dimensioning (2)

The $\alpha_i$ datapath:

  • $\varepsilon_{atan(2^{-i})} = 2^{-p_\alpha - 1}$ (or 0.5 ulp on the $p_\alpha$ precision)
  • $\varepsilon_{final\ round} = 2^{-w}$

⇒ $g_\alpha = 1 + \lceil \log_2((w - 1) \times 0.5) \rceil$ extra guard bits needed
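
The guard-bit formula can be evaluated directly for the bitwidths used in the comparisons. A minimal sketch; the function name `alpha_guard_bits` is hypothetical.

```python
import math

def alpha_guard_bits(w):
    """g_alpha = 1 + ceil(log2((w - 1) * 0.5)) for a w-bit result."""
    return 1 + math.ceil(math.log2((w - 1) * 0.5))

for w in (8, 12, 16, 24, 32):
    print(w, alpha_guard_bits(w))
```

Reading the formula: each of the $w - 1$ arctangent constants contributes up to 0.5 ulp of error, so the accumulated error is at most $(w - 1) \times 0.5$ ulp, and its base-2 logarithm gives the number of bits that must be kept below the result.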

slide-43
SLIDE 43

Bi-variate Polynomial Approximations: Datapath Dimensioning

Goal: $|\varepsilon_{total}| < \mathrm{ulp} = 2^{-w}$

$\varepsilon_{total} = \varepsilon_{meth} + \varepsilon_{final\ rnd} + \varepsilon_{rnd}$

  • method error: $\varepsilon_{meth}$
      • due to neglected terms of the Taylor series
      • constraint on $k$ ⇒ $k \ge \lceil \frac{w+1}{3} \rceil$
  • rounding errors: $\varepsilon_{rnd} = \varepsilon_a \delta x + \varepsilon_b \delta y + \varepsilon_c + \varepsilon_d \delta x^2 + \varepsilon_e \delta y^2 + \varepsilon_f \delta x \delta y$
      • depends on $\delta x$, $\delta y$ ⇒ $\varepsilon_{rnd}$ depends on $k$
  • $\varepsilon_{meth} + \varepsilon_{rnd} < 2^{-w-1}$

⇒ number of guard bits $g$ ⇔ compromise between size of tables and size of multipliers
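
The constraint $k \ge \lceil \frac{w+1}{3} \rceil$ can be tabulated for a few bitwidths. The sketch assumes that $k$ is the number of leading bits of each of x and y used to address the coefficient table, so that the table has $2^{2k}$ entries; that reading, and the names below, are assumptions of this example.

```python
def min_index_bits(w):
    """Smallest k satisfying k >= ceil((w + 1) / 3), in integer arithmetic."""
    return -(-(w + 1) // 3)

def table_entries(k):
    """Entries of a table addressed by k bits of x and k bits of y (assumed)."""
    return 2 ** (2 * k)

for w in (8, 12, 16):
    k = min_index_bits(w)
    print(w, k, table_entries(k))
```

A larger $k$ shrinks $\delta x$ and $\delta y$, and with them the multipliers, at the price of a table that quadruples with every extra bit.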
