Fixed Point Real Numbers 16-bit Unsigned with Binary Point: 8 - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Fixed Point Real Numbers

  • 16-bit Unsigned with Binary Point: 8
  • XXXXXXXX.XXXXXXXX
  • Maximum/Minimum Values
  • 11111111.11111111 = 255+255/256
  • 00000000.00000000 = 0
  • 16-bit Signed with Binary Point: 8
  • XXXXXXXX.XXXXXXXX
  • Maximum/Minimum Values
  • 01111111.11111111 = 127+255/256
  • 10000000.00000000 = -128
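The limits above follow directly from the word width and the binary point position. A minimal Python sketch that reproduces them; the helper name `fixed_point_range` is hypothetical and not part of the slides:

```python
def fixed_point_range(width, bp, signed):
    """Return (min, max) real values of a width-bit word with bp fractional bits."""
    scale = 1 << bp  # one LSB is worth 1/scale
    if signed:
        lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    else:
        lo, hi = 0, (1 << width) - 1
    return lo / scale, hi / scale


unsigned_range = fixed_point_range(16, 8, signed=False)
signed_range = fixed_point_range(16, 8, signed=True)
print(unsigned_range)  # (0.0, 255.99609375), i.e. 255 + 255/256
print(signed_range)    # (-128.0, 127.99609375), i.e. 127 + 255/256
```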
slide-2
SLIDE 2

Multiplication of Signed FP

  • Let a have width Wa and binary point bpa, and let b have width Wb and binary point bpb.
  • The output of the multiplier will need width Wa+Wb and a bp of bpa+bpb.
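The width and binary point rule can be checked numerically. The sketch below is illustrative only: the operand formats (10:6) and (16:8) and the values -3.25 and 1.5 are arbitrary choices, and the helper names are hypothetical.

```python
def to_fixed(value, bp):
    """Quantize a real value to an integer with bp fractional bits."""
    return round(value * (1 << bp))


def multiply_format(wa, bpa, wb, bpb):
    """Width and binary point of a signed product, per the rule above."""
    return wa + wb, bpa + bpb


a_raw = to_fixed(-3.25, 6)   # a is 10-bit signed with bp 6
b_raw = to_fixed(1.5, 8)     # b is 16-bit signed with bp 8
w, bp = multiply_format(10, 6, 16, 8)

product_raw = a_raw * b_raw              # plain integer multiply
product_real = product_raw / (1 << bp)   # interpret with the summed bp
print((w, bp), product_real)             # (26, 14) -4.875

# Worst case: the two most negative operands still fit in Wa+Wb signed bits.
worst = (-(1 << 9)) * (-(1 << 15))
fits = worst <= (1 << (w - 1)) - 1
```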

slide-3
SLIDE 3

Number Representation

  • Previous examples of FIR filters used integer representations for the filter coefficients.
  • What if we have coefficients with fractional components?
  • Two options.
  • 1. Apply a scaling factor to all the coefficients to get the desired resolution.
  • 2. Use binary point numbers to represent our coefficients.
slide-4
SLIDE 4

Example:

  • input is 8-bit signed
  • filter coefficients

b = [1.42 2.05 -3.23 4.71 -3.11 -5.10]

What digital values of b should we use? What is the required output data width?

slide-5
SLIDE 5

Example:

  • input is 8-bit signed
  • filter coefficients

b = [1.42 2.05 -3.23 4.71 -3.11 -5.10]

What values of b should we use? What is the required output data width?

  • Scaling factor approach
  • Multiply coefficients by 100 and use a 10-bit signed format (-512 to 511).

b = [142 205 -323 471 -311 -510]

  • Determine the maximum output.

max = abs(-128)*sum(abs(b)) = 251136

  • ceil(log2(251136))+1 = 19-bit signed
  • 19-bit signed has a range (-262144 to 262143)
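The arithmetic on this slide can be reproduced in a few lines. This is a minimal Python sketch of the same calculation (the slides use MATLAB-style expressions); the variable names are chosen here for illustration.

```python
import math

b = [1.42, 2.05, -3.23, 4.71, -3.11, -5.10]

# Scale by 100 and round to integers.
b_scaled = [round(100 * c) for c in b]

# All scaled coefficients must fit in 10-bit signed (-512 to 511).
fits_10_bit = all(-512 <= c <= 511 for c in b_scaled)

# Upper bound on the output: largest input magnitude (128 for 8-bit signed)
# times the sum of coefficient magnitudes.
max_out = 128 * sum(abs(c) for c in b_scaled)

# Bits for the magnitude plus one sign bit.
out_bits = math.ceil(math.log2(max_out)) + 1

print(b_scaled)           # [142, 205, -323, 471, -311, -510]
print(max_out, out_bits)  # 251136 19
```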

slide-6
SLIDE 6

Example:

  • input is 8-bit signed
  • filter coefficients

b = [1.42 2.05 -3.23 4.71 -3.11 -5.10]

What values of b should we use? What is the required output data width?

  • Scaling factor approach
  • If the absolute scale of the output is to be retained, the output must be divided by 100 to undo the scaling applied to the filter coefficients.
  • How do we divide by 100 in binary?
  • Maybe not a good approach in all instances.
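One common workaround, which is not taken from the slides, is to replace the division with a multiply by a rounded reciprocal followed by a right shift. The sketch below assumes a 19-bit shift (an arbitrary choice) and shows that the result is only an approximation, which illustrates why a power-of-two scale is more convenient.

```python
SHIFT = 19                            # assumed precision of the reciprocal
RECIP = round((1 << SHIFT) / 100)     # integer approximation of 2**19 / 100


def div100_approx(y):
    """Approximate y / 100 with one multiply and one arithmetic right shift."""
    return (y * RECIP) >> SHIFT


y = 251136                            # maximum output from the scaling example
approx = div100_approx(y)
exact = y / 100
print(RECIP, approx, exact)           # 5243 2511 2511.36
```

The shift truncates toward negative infinity, so the fractional part of the quotient is lost unless extra fractional bits are carried along.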

slide-7
SLIDE 7

Example:

  • input is 8-bit signed
  • filter coefficients

b = [1.42 2.05 -3.23 4.71 -3.11 -5.10]

What values of b should we use? What is the required output data width?

  • Binary Point Approach
  • Represent coefficients with 10-bit signed and binary point at the 6th position
  • This is a design choice.
  • XXXX.XXXXXX
  • Can handle values -8+(0/64) to 7+(63/64)
slide-8
SLIDE 8

Example:

  • input is 8-bit signed
  • filter coefficients

b = [1.42 2.05 -3.23 4.71 -3.11 -5.10]

What values of b should we use? What is the required output data width?

  • Binary Point Approach
  • Determine the digital coefficient values.
  • bbp = dec2bin(mod(round(64*b)+1024,1024))
  • bbp = [0001011011 0010000011 1100110001 0100101101 1100111001 1010111010]
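The same conversion can be sketched in Python as a cross-check of the bit patterns. The function name is hypothetical; note that Python's `round` resolves exact .5 ties differently from MATLAB's, which does not matter here because none of the scaled coefficients is a tie.

```python
b = [1.42, 2.05, -3.23, 4.71, -3.11, -5.10]
WIDTH, BP = 10, 6   # 10-bit signed, 6 fractional bits


def to_twos_complement_bits(value, width, bp):
    """Quantize value to bp fractional bits and return its two's complement bit string."""
    raw = round(value * (1 << bp))
    if not -(1 << (width - 1)) <= raw < (1 << (width - 1)):
        raise ValueError("value does not fit in the chosen format")
    # Taking the value modulo 2**width maps negative integers to their
    # two's complement encoding.
    return format(raw % (1 << width), "0{}b".format(width))


bbp_bits = [to_twos_complement_bits(c, WIDTH, BP) for c in b]
print(bbp_bits)
```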

slide-9
SLIDE 9

Example:

  • input is 8-bit signed
  • filter coefficients

b = [1.42 2.05 -3.23 4.71 -3.11 -5.10]

What values of b should we use? What is the required output data width?

  • Binary Point Approach
  • Determine the maximum output.

bbp = round(64*b)
max = abs(-128)*sum(abs(bbp)) = 160640

  • ceil(log2(160640))+1 = 19-bit signed
  • 19-bit signed has a range (-262144 to 262143)
  • Final output is a 19-bit signed with a bp of 6.
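A short Python sketch of this calculation, which also interprets the bound using the output binary point. The comparison against the unquantized coefficients is an addition for illustration and is not on the slide.

```python
import math

b = [1.42, 2.05, -3.23, 4.71, -3.11, -5.10]
BP = 6

bbp = [round((1 << BP) * c) for c in b]      # integer coefficient values
max_out = 128 * sum(abs(c) for c in bbp)     # bound on the raw integer output
out_bits = math.ceil(math.log2(max_out)) + 1

# The input has bp 0 and the coefficients have bp 6, so the output has bp 6.
max_out_real = max_out / (1 << BP)
ideal_real = 128 * sum(abs(c) for c in b)    # bound with unquantized coefficients

print(bbp)                     # [91, 131, -207, 301, -199, -326]
print(max_out, out_bits)       # 160640 19
print(max_out_real)            # 2510.0
```

The real-valued bound of 2510.0 is close to the unquantized bound of about 2511.36; the difference comes from rounding the coefficients to 6 fractional bits.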
slide-10
SLIDE 10

IIR Implementation: a0 = 1

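[Diagram: second-order IIR feedback section with two z-1 delays and coefficients a1 and a2. The adder inputs are marked + and -, indicating subtraction of the feedback.]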

slide-11
SLIDE 11

IIR Implementation: Pipelining?

[Diagram: several versions of the a1/a2 feedback section, drawn with additional z-1 registers inserted as candidate pipeline stages.]

slide-12
SLIDE 12

IIR Implementation

[Diagram: IIR section with coefficients a1 and a2, registers in_reg, out_reg, p1, p2, and signals dif, m1, m2.]

//functional description
assign dif = in_reg - p1;
assign m1 = dif*a1;
assign m2 = dif*a2;
always@(posedge clock) begin
  in_reg <= in;
  out_reg <= dif;
  p1 <= m1 + p2;
  p2 <= m2;
end
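To see what this register structure computes, it can help to step through it in software. The following is a minimal cycle-by-cycle Python model of the functional description above, assuming integer coefficients and unbounded word widths; the function name and the example coefficients a1=2, a2=3 are hypothetical.

```python
def simulate_iir(samples, a1, a2):
    """Cycle-by-cycle model of the registered IIR section (all registers start at 0)."""
    in_reg = out_reg = p1 = p2 = 0
    outputs = []
    for x in samples:
        # Combinational logic (the assign statements) uses current register values.
        dif = in_reg - p1
        m1 = dif * a1
        m2 = dif * a2
        # Clock edge: all registers update together (nonblocking assignments).
        in_reg, out_reg, p1, p2 = x, dif, m1 + p2, m2
        outputs.append(out_reg)
    return outputs


impulse_response = simulate_iir([1, 0, 0, 0, 0, 0], a1=2, a2=3)
print(impulse_response)  # [0, 1, -2, 1, 4, -11]
```

Apart from the two cycles of register latency, the output follows dif[n] = in[n] - a1*dif[n-1] - a2*dif[n-2].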

slide-13
SLIDE 13

IIR Implementation: DSP Blocks

[Diagram: the same IIR section mapped onto DSP blocks, with registers in_reg, out_reg, p1, p2 and signal dif.]

//dsp48 structural
always@(posedge clock) begin
  in_reg <= in;
  out_reg <= dif;
end
macc_wrap dsp2 (.C(0),.A(dif),.B(a2),.PCOUT(p2));
macc_wrap dsp1 (.PCIN(p2),.A(dif),.B(a1),.POUT(p1));
assign dif = p1 + in;

slide-14
SLIDE 14

Number Representation

  • For the IIR filter diagram in the previous slides, there is a requirement that a0=1.
  • For cases when a1 and a2 are near 1 or are fractional values, we cannot accurately represent these values with integer coefficients.
  • Two options.
  • 1. Add a pre-multiplier to the input to incorporate an a0 scale term.
  • 2. If we care about the absolute scale, use binary point numbers to represent our coefficients.
  • 3. Remember to keep track of binary point locations, especially in the feedback path.

slide-15
SLIDE 15

IIR Implementation

[Diagram: IIR section with an a0 pre-multiplier on the input; registers in_reg, out_reg, p1, p2; signals dif, m0, m1, m2. The adder inputs are marked + and -, indicating subtraction.]

//functional description
assign dif = m0 - p1;
assign m0 = in_reg*a0;
assign m1 = dif*a1;
assign m2 = dif*a2;
always@(posedge clock) begin
  in_reg <= in;
  out_reg <= dif;
  p1 <= m1 + p2;
  p2 <= m2;
end

slide-16
SLIDE 16

IIR Implementation

[Diagram: IIR section with an a0 pre-multiplier on the input; registers in_reg, out_reg, p1, p2; signals dif, m0, m1, m2. The adder inputs are marked + and -, indicating subtraction.]

Keeping track of bp locations

  • We will use (W:BP) notation.
  • Assume all values are signed.
  • Input is 8-bit signed (8:0)
  • Coefficients are (10:6)
  • Assume p1 is (18:6) for subtraction
  • m0 (18:6)
  • dif (19:6)
  • m1, m2 (29:12)
  • p1 (30:12)
  • p1 needs to have a bp of 6 so the subtraction will have equivalent input formats.
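The (W:BP) bookkeeping can be mechanized with two small rules: a multiply adds widths and binary points, and an add or subtract needs matching binary points and grows by one bit. A minimal Python sketch under those assumptions, with hypothetical helper names:

```python
def mul_fmt(a, b):
    """(W, BP) of a signed product: widths add, binary points add."""
    return (a[0] + b[0], a[1] + b[1])


def add_fmt(a, b):
    """(W, BP) of a signed sum or difference: binary points must match, one growth bit."""
    if a[1] != b[1]:
        raise ValueError("binary points must match before adding or subtracting")
    return (max(a[0], b[0]) + 1, a[1])


inp = (8, 0)
coef = (10, 6)
p1_assumed = (18, 6)             # format assumed for p1 at the subtractor

m0 = mul_fmt(inp, coef)          # (18, 6)
dif = add_fmt(m0, p1_assumed)    # (19, 6)
m1 = m2 = mul_fmt(dif, coef)     # (29, 12)
p1 = add_fmt(m1, m2)             # (30, 12), since p2 is a registered copy of m2

# p1 comes out with bp 12 but the subtractor expects bp 6.
realign_shift = p1[1] - m0[1]
print(m0, dif, m1, p1, realign_shift)  # (18, 6) (19, 6) (29, 12) (30, 12) 6
```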

slide-17
SLIDE 17

IIR Implementation

[Diagram: IIR section with an a0 pre-multiplier on the input; registers in_reg, out_reg, p1, p2; signals dif, m0, m1, m2. The adder inputs are marked + and -, indicating subtraction.]

//functional description
assign dif = m0 - (p1 >>> bp); //bp is the binary point of the coefficients
assign m0 = in_reg*a0;
assign m1 = dif*a1;
assign m2 = dif*a2;
always@(posedge clock) begin
  in_reg <= in;
  out_reg <= dif;
  p1 <= m1 + p2;
  p2 <= m2;
end

slide-18
SLIDE 18

IIR Implementation

[Diagram: IIR section with an a0 pre-multiplier on the input; registers in_reg, out_reg, p1, p2; signals dif, m0, m1, m2. The adder inputs are marked + and -, indicating subtraction.]

//functional description
assign dif = m0 - p1;
assign m0 = in_reg*a0;
assign m1 = (dif*a1) >>> bp; //or shift
assign m2 = (dif*a2) >>> bp; //here
always@(posedge clock) begin
  in_reg <= in;
  out_reg <= dif;
  p1 <= m1 + p2;
  p2 <= m2;
end
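The effect of shifting a product right by bp can be seen with plain integers. The sketch below is illustrative: the values 2.5 and 1.42 are arbitrary, and Python's `>>` on negative integers behaves like an arithmetic shift, which is what `>>>` gives in Verilog when the operand is signed.

```python
BP = 6


def to_fixed(value):
    """Quantize a real value to an integer with BP fractional bits."""
    return round(value * (1 << BP))


a1 = to_fixed(1.42)       # 91, binary point 6
dif = to_fixed(2.5)       # 160, binary point 6

product = dif * a1        # 14560, binary point 12
m1 = product >> BP        # 227, back to binary point 6

m1_real = m1 / (1 << BP)                   # 3.546875
exact_real = (dif * a1) / (1 << (2 * BP))  # 3.5546875

# The shift truncates toward negative infinity, also for negative products.
m1_negative = (-product) >> BP             # -228
print(m1, m1_real, exact_real, m1_negative)
```

The shifted result differs from the full-precision product by less than one LSB (1/64), which is the price of keeping the feedback path at a fixed binary point.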

slide-19
SLIDE 19

IIR Implementation

[Diagram: IIR section with an a0 pre-multiplier mapped onto DSP blocks, with registers out_reg, p1, p2 and signal dif.]

//dsp48 structural
always@(posedge clock) begin
  out_reg <= dif;
end
macc_wrap dsp0 (.C(p1 >>> bp),.A(in),.B(a0),.POUT(dif));
macc_wrap dsp2 (.C(0),.A(dif),.B(a2),.PCOUT(p2));
macc_wrap dsp1 (.PCIN(p2),.A(dif),.B(a1),.POUT(p1));

slide-20
SLIDE 20

IIR Implementation

[Diagram: the same DSP-block structure with the feedback coefficients negated to -a1 and -a2 and the adder output named sum; registers out_reg, p1, p2; pre-multiplier a0.]

//dsp48 structural
always@(posedge clock) begin
  out_reg <= sum;
end
macc_wrap dsp0 (.C(p1 >>> bp),.A(in),.B(a0),.POUT(sum));
macc_wrap dsp2 (.C(0),.A(sum),.B(-a2),.PCOUT(p2));
macc_wrap dsp1 (.PCIN(p2),.A(sum),.B(-a1),.POUT(p1));

slide-21
SLIDE 21

IIR Implementation: Pipelining?

[Diagram: 4th-order IIR feedback structures with coefficients a1 to a4, drawn with z-1 delays and with a z-4 delay.]

We really only need to pipeline a 2nd-order IIR filter, which realizes 2 poles. We can then cascade a number of them to realize M poles.

slide-22
SLIDE 22

IIR Implementation: Pipelining?

[Diagram: IIR feedback structure with four z-1 delays and only the a2 and a4 coefficients.]

The above diagram has 4 poles and has extra registers for pipelining. The idea is to start with an IIR filter with 4 poles (2 we want to keep and 2 of our choosing that will be canceled). Based on the 2 we want to keep, determine what the 2 additional poles need to be to eliminate the a1 and a3 terms. Then cascade an FIR filter in front of it, with zeros placed at the locations of the two additional poles.

slide-23
SLIDE 23

IIR Implementation: Pipelining?

  • Turn to the math...
  • Z-domain

$$H(z) = \frac{1}{1 - a_1 z^{-1} - a_2 z^{-2}} = \frac{1}{(1 - p_1 z^{-1})(1 - p_2 z^{-1})}$$

  • Add some poles but compensate in the numerator with an FIR filter.

$$H(z) = \frac{1}{(1 - p_1 z^{-1})(1 - p_2 z^{-1})} \cdot \frac{(1 - p_3 z^{-1})(1 - p_4 z^{-1})}{(1 - p_3 z^{-1})(1 - p_4 z^{-1})}$$

  • Choose p3 & p4 to cancel the z^-1 & z^-3 coefficients in the denominator.

slide-24
SLIDE 24

IIR Implementation: Pipelining

  • Using the original polynomial.

$$H(z) = \frac{1}{1 - a_1 z^{-1} - a_2 z^{-2}} \cdot \frac{1 + a_1 z^{-1} + (-a_2) z^{-2}}{1 - (-a_1) z^{-1} - a_2 z^{-2}}$$

$$H(z) = \frac{1 + a_1 z^{-1} + (-a_2) z^{-2}}{1 - (a_1^2 + 2a_2) z^{-2} - (-a_2^2) z^{-4}}$$
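The cancellation of the odd-power terms can be verified by multiplying the two denominator polynomials. A minimal Python sketch, where the coefficient values a1 = 0.5 and a2 = 0.25 are arbitrary examples and `poly_mul` is a hypothetical helper:

```python
def poly_mul(p, q):
    """Multiply two polynomials in z^-1 given as coefficient lists [c0, c1, c2, ...]."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


a1, a2 = 0.5, 0.25               # example values only

original_den = [1, -a1, -a2]     # 1 - a1 z^-1 - a2 z^-2
added_den = [1, a1, -a2]         # 1 + a1 z^-1 - a2 z^-2 (the added poles)

new_den = poly_mul(original_den, added_den)
expected = [1, 0, -(a1 ** 2 + 2 * a2), 0, a2 ** 2]
print(new_den)                   # [1.0, 0.0, -0.75, 0.0, 0.0625]
```

Only the z^-2 and z^-4 terms survive in the feedback path, so two registers sit between each feedback tap and the adder, which is what makes room for a pipeline stage.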

slide-25
SLIDE 25

IIR Implementation: Pipelining

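[Diagram: pipelined implementation. An IIR section with z-2 delays and coefficients a1^2+2a2 and a2^2 is combined with an FIR section with z-1 delays and coefficients a1 and -a2. The structure is drawn twice, the second time with additional z-1 registers.]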

slide-26
SLIDE 26

IIR Implementation: Pipelining

  • Still not fully pipelined.
  • Add 4 zeros/poles instead of 2.

[Diagram: pipelined structure with feedback coefficients a1' and a2' on z-2 delays, FIR coefficients b1 to b4, and z-1 pipeline registers.]