AN ASYNCHRONOUS DIVIDER IMPLEMENTATION


slide-1
SLIDE 1

Asynchronous Research Center

AN ASYNCHRONOUS DIVIDER IMPLEMENTATION

Navaneeth Jamadagni and Jo Ebergen

slide-2
SLIDE 2

Acknowledgements

  • Oracle Labs
  • DARPA
  • Asynchronous Research Center
  • Reviewers

slide-3
SLIDE 3

Take Away

  • Division Algorithm
      • Usually retires 1 quotient digit per iteration
      • Sometimes retires 2 quotient digits per iteration
  • An Asynchronous Design
      • Exploits the time disparity between addition and shift operations
  • Result
      • Improvement in speed

slide-4
SLIDE 4

Outline

  • Introduction
  • Divine Division
  • Hardware Design
  • Results
  • Conclusion

slide-6
SLIDE 6

Division

Numerator N = (Quotient Q × Denominator D) + Remainder R

slide-7
SLIDE 7

Long Division, Example

375 ÷ 15 = 25
  37 - 2 × 15 = 37 - 30 = 7; bring down the 5 to get 75
  75 - 5 × 15 = 75 - 75 = 0

slide-8
SLIDE 8

Long Division

Given a Numerator and a Denominator:

1. Guess the quotient digit
   a. Ten choices, from the set {0 … 9}
2. Multiply the denominator by your guess
3. Subtract the product from the remainder to get a new remainder
4. Retire that quotient digit
5. Repeat steps 1 to 4 until the remainder is 0 or you run out of time

Steps 1 to 4 form one iteration.
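
The guess, multiply, subtract, retire loop can be sketched in Python. This is a minimal illustration for non-negative integers in base 10. The function name `long_division` and the rule of always picking the largest digit that fits are assumptions made for the example, not taken from the slides.

```python
def long_division(numerator, denominator):
    """Schoolbook base-10 long division for non-negative integers."""
    if denominator <= 0 or numerator < 0:
        raise ValueError("expects numerator >= 0 and denominator > 0")
    remainder = 0
    quotient_digits = []
    for ch in str(numerator):
        # Bring down the next digit of the numerator.
        remainder = remainder * 10 + int(ch)
        # Step 1: guess the quotient digit (largest of 0..9 that fits).
        guess = max(g for g in range(10) if g * denominator <= remainder)
        # Steps 2 and 3: multiply, then subtract to get the new remainder.
        remainder -= guess * denominator
        # Step 4: retire the digit.
        quotient_digits.append(guess)
    quotient = int("".join(map(str, quotient_digits)))
    return quotient, remainder


q, r = long_division(375, 15)
assert 375 == q * 15 + r  # N = Q * D + R
print(q, r)  # 25 0
```

Each pass through the loop is one iteration in the sense of the slide: exactly one quotient digit is retired per pass.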

slide-9
SLIDE 9

Two Division Methods

  • Multiplicative Methods
      • Many choices (e.g., {0 … 28})
      • Expensive multiplication
      • Retires many quotient bits per iteration
      • Few iterations
  • Digit-Recurrence Methods
      • Very few choices (e.g., {0, 1} or {-2, -1, 0, 1, 2})
      • Inexpensive multiplication
      • Retires one or two quotient bits per iteration
      • Many iterations

slide-10
SLIDE 10

SRT Division

  • Most common implementation in microprocessors
  • Carry-Save Additions or Subtractions
  • Guess by Carry Propagate Addition
  • Guess by Table Lookup
  • Named after Sweeney, Robertson and Tocher

slide-11
SLIDE 11

SRT Division, Iteration

1. Guess: quotient digit from the set {-1, 0, 1}
2. Multiply: -1×D, 0×D, 1×D
3. Subtract or Add: CSA
4. Retire: always one quotient digit

Basis for the guess: Table Lookup, CPA

CSA = Carry Save Addition
CPA = Carry Propagate Addition
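
A radix-2 iteration with the digit set {-1, 0, 1} can be sketched as follows. This is a hedged illustration, not the hardware on the slides: it uses exact rational arithmetic where a divider would use carry-save addition and a table lookup. The normalization (1/2 ≤ d < 1, |x| ≤ d) and the comparison against ±1/2 are a common textbook selection rule, assumed here. The name `srt_radix2` is hypothetical.

```python
from fractions import Fraction


def srt_radix2(x, d, iterations):
    """Radix-2 SRT-style division with digits from {-1, 0, 1}.

    Assumes 1/2 <= d < 1 and |x| <= d (textbook normalization, not from
    the slides). Exact rationals stand in for carry-save hardware.
    """
    if not (Fraction(1, 2) <= d < 1 and abs(x) <= d):
        raise ValueError("expects 1/2 <= d < 1 and |x| <= d")
    half = Fraction(1, 2)
    w = Fraction(x)  # partial remainder
    digits = []
    for _ in range(iterations):
        shifted = 2 * w
        # Guess: compare the shifted remainder with +1/2 and -1/2.
        if shifted >= half:
            q = 1
        elif shifted < -half:
            q = -1
        else:
            q = 0
        # Multiply and subtract (an addition when q is -1).
        w = shifted - q * d
        # Retire exactly one quotient digit per iteration.
        digits.append(q)
    quotient = sum(Fraction(q, 2 ** (j + 1)) for j, q in enumerate(digits))
    return digits, quotient, w


x, d = Fraction(1, 5), Fraction(3, 5)
digits, quotient, w = srt_radix2(x, d, 8)
assert x == d * quotient + w / 2 ** 8  # N = Q * D + scaled remainder
```

Because a digit may be negative, an earlier guess that was slightly too large can be corrected by a later digit, which is what allows the guess to be made from only a rough view of the remainder.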

slide-16
SLIDE 16

Divine Division

“Divine” = Discover by guesswork or intuition

  • Carry-Save Addition or Subtraction only
  • Two versions in the paper: E and H
  • Developed at Sun Labs by Jo Ebergen, Ivan Sutherland and Danny Cohen

slide-17
SLIDE 17

Divine Division, Iteration

1. Guess: quotient digit from the set {-2, -1, 0, 1, 2}
2. Multiply: -2×D, -1×D, 0×D, 1×D, 2×D
3. Subtract or Add: CSA (diagram labels: Parity Bits, Majority Bits, 2 MSBs)
4. Retire: usually one quotient digit, sometimes two quotient digits
5. One more iteration than SRT for equal accuracy
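
The carry-save addition named in step 3 can be sketched in a few lines. The names `parity` and `majority` follow the slide labels. Whether the slides' majority word is already shifted up by one bit position is an assumption here, and the sketch uses unbounded non-negative Python integers rather than a fixed-width hardware word.

```python
def carry_save_add(a, b, c):
    """Add three non-negative integers without propagating carries."""
    # Parity: per-bit sum of the three inputs, ignoring carries.
    parity = a ^ b ^ c
    # Majority: per-bit carry, moved up one position to its proper weight.
    majority = ((a & b) | (a & c) | (b & c)) << 1
    return parity, majority


parity, majority = carry_save_add(0b1011, 0b0110, 0b0011)
assert parity + majority == 0b1011 + 0b0110 + 0b0011
```

Each output bit depends on only three input bits, so the delay does not grow with operand width. That is the property that makes carry-save steps cheap compared with carry-propagate addition.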

slide-22
SLIDE 22

Divine Division Choices

[Figure: the choices SUB2 & 2X*, SUB1 & 2X*, 2X, 2X*, 4X*, ADD1 & 2X* and ADD2 & 2X* shown as regions over the remainder range. Value of the Remainder = Parity + Majority.]

slide-24
SLIDE 24

Divine Division Choice: 4X*

  • Quotient = 0 and 0
  • Remainder = Left shift by 2 and Invert MSBs

slide-25
SLIDE 25

Divine Division Choice: 2X*

  • Quotient = 0
  • Remainder = Left shift by 1 and Invert MSBs

slide-26
SLIDE 26

Divine Division Choice: 2X

  • Quotient = 0
  • Remainder = Left shift by 1

slide-27
SLIDE 27

Divine Division Choice: SUB1 & 2X*

  • Quotient = 1
  • Remainder = SUB 1×D & Left shift by 1 and Invert MSBs

slide-28
SLIDE 28

Divine Division Choice: SUB2 & 2X*

  • Quotient = 2
  • Remainder = SUB 2×D & Left shift by 1 and Invert MSBs

slide-29
SLIDE 29

Divine Division Choice: ADD1 & 2X*

  • Quotient = -1
  • Remainder = ADD 1×D & Left shift by 1 and Invert MSBs

slide-30
SLIDE 30

Divine Division Choice: ADD2 & 2X*

  • Quotient = -2
  • Remainder = ADD 2×D & Left shift by 1 and Invert MSBs
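
The seven choices can be collected into one table and applied to a remainder value. This is a sketch under stated assumptions: it tracks only the numeric value of the remainder, so the "Invert MSBs" step is left out on the assumption that it changes the carry-save encoding rather than the value. The quotient digit 0 for the 2X and 2X* choices is also an assumption based on the shift-only data path description. The rule that picks a choice from the parity and majority bits is not modeled, and `CHOICES` and `apply_choice` are hypothetical names.

```python
from fractions import Fraction

# choice -> (multiple of D to subtract, left-shift amount, digits retired)
CHOICES = {
    "4X*":        (0, 2, (0, 0)),
    "2X*":        (0, 1, (0,)),   # digit 0 is an assumption
    "2X":         (0, 1, (0,)),   # digit 0 is an assumption
    "SUB1 & 2X*": (1, 1, (1,)),
    "SUB2 & 2X*": (2, 1, (2,)),
    "ADD1 & 2X*": (-1, 1, (-1,)),
    "ADD2 & 2X*": (-2, 1, (-2,)),
}


def apply_choice(remainder, divisor, choice):
    """Return (new remainder value, quotient digits retired) for one step."""
    multiple, shift, digits = CHOICES[choice]
    # Subtract (or add, for a negative multiple) and then shift left.
    new_remainder = (remainder - multiple * divisor) * 2 ** shift
    return new_remainder, digits


r, digits = apply_choice(Fraction(3, 8), Fraction(3, 4), "SUB1 & 2X*")
assert digits == (1,)
r, digits = apply_choice(Fraction(1, 16), Fraction(3, 4), "4X*")
assert digits == (0, 0)  # the only choice that retires two digits
```

Only the 4X* entry retires two digits, which is where the "sometimes two quotient digits per iteration" behavior comes from in this model.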

slide-31
SLIDE 31

Number of Iterations per Division

[Figure: probability (axis from 0.1 to 1) versus number of iterations per division (axis from 1 to 26) for 25-bit operands, comparing Divine Division and SRT Division. Average = 22.6.]

  • One million pairs of uniform-random 25-bit input operands

slide-33
SLIDE 33

Asynchronous Divine Divider

  • Control Path
      • Uses GasP modules
      • Generates the control signals for the registers
      • Delay matched to data path
      • Shift steps are faster than addition steps
      • Asynchronous loop counter
  • Data Path
      • Registers and computational blocks (e.g., CSA)
      • Single-rail bundled data

slide-34
SLIDE 34

Data Path

1. Guess the first quotient digit

slide-35
SLIDE 35

Data Path

2. Carry Save Add
3. Retires one digit
4. Guess the next quotient digit

slide-36
SLIDE 36

Data Path

2. Left shift by 1
3. Retires 1 quotient digit, namely 0
4. Guess the next quotient digit

slide-37
SLIDE 37

Data Path

2. Left shift by 2
3. Retires 2 quotient digits, 0 and 0
4. Guess the next quotient digit

slide-39
SLIDE 39

Simulation

  • SPICE Simulation in TSMC 90nm
      • Partial layout with estimated wire lengths
      • 50 pairs of random test vectors
      • Iteration statistics similar to 10^6 random test vectors
  • Delay measurements are normalized to FO4 delays
      • Comparisons: 1 FO4 ≈ 25 ps in 90nm process
      • Shift Path = 6 FO4, Add Path = 8.5 FO4
  • Energy data estimated for Data and Control Paths
      • Comparison normalized to 1 V Vdd

slide-40
SLIDE 40

Average Delay

  • From SPICE
      • Shift Path = 6 FO4
      • Add Path = 8.5 FO4
  • Asynchronous Design
      • Sometimes retires two bits and Shift steps are quicker
      • Average delay per bit is 6.3 FO4
  • Synchronous Design
      • Sometimes retires two bits
      • Average delay per bit is 7.4 FO4
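
The difference between the two designs can be illustrated with a small delay model. This is a rough sketch, not the authors' calculation: it assumes a synchronous design pays the slower (add) path delay on every step while an asynchronous design pays each step's own path delay, and it ignores control and sequencing overhead. The path delays and the FO4 conversion come from the slides. The step counts and the 26 quotient bits in the usage lines are hypothetical inputs chosen for illustration.

```python
FO4_PS = 25.0    # from the slides: 1 FO4 is about 25 ps in the 90 nm process
SHIFT_FO4 = 6.0  # from the slides: shift path delay
ADD_FO4 = 8.5    # from the slides: add path delay


def delay_per_bit(shift_steps, add_steps, bits, synchronous):
    """Average delay per quotient bit in FO4, ignoring control overhead."""
    if synchronous:
        # Every step takes as long as the slowest path.
        total = (shift_steps + add_steps) * max(SHIFT_FO4, ADD_FO4)
    else:
        # Each step takes only as long as its own path.
        total = shift_steps * SHIFT_FO4 + add_steps * ADD_FO4
    return total / bits


# Hypothetical inputs: 22.6 steps split evenly, 26 quotient bits.
sync_fo4 = delay_per_bit(11.3, 11.3, 26, synchronous=True)
async_fo4 = delay_per_bit(11.3, 11.3, 26, synchronous=False)
print(round(sync_fo4, 1), round(async_fo4, 1))
print(round(async_fo4 * FO4_PS), "ps per bit (asynchronous)")
```

With these hypothetical inputs the model gives values close to the 7.4 and 6.3 FO4 reported on the slide. That agreement does not show that the reported figures were derived this way.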

slide-41
SLIDE 41

Delay per Quotient Bit, Normalized

[Figure: bar chart of delay per quotient bit in FO4 (axis from 0 to 16), grouping asynchronous and synchronous designs. Bar labels as extracted, with the mapping to designs not recoverable: 06, 16, 16, 10, 07, 07.]

Designs compared:

  • Jamadagni and Ebergen, 2012: Divine Division, CMOS
  • Williams and Horowitz, 1991: Radix-2 SRT, Domino, Async
  • Renaudin et al., 1996: Radix-2 SRT, LDCVSL, Async
  • Harris et al., 1996: Radix-4 SRT, CMOS, Sync
  • Liu and Nannarelli, 2008: Radix-4, CMOS, Sync

slide-42
SLIDE 42

Energy per Division

[Figure: bar chart of energy per division in picojoules (axis from 100 to 300), normalized by Vdd² to a 1 V process. Bar labels as extracted, with the mapping to designs not recoverable: 118, 92, 112, 275.]

Designs compared:

  • Jamadagni and Ebergen, 2012: 25-bit Division, TSMC 90nm, Asynchronous
  • Liu and Nannarelli, 2008: 24-bit Division, STM 90nm, Synchronous
  • Liu and Nannarelli, 2008: 24-bit Division, STM 90nm, Synchronous, Low Power
  • Renaudin et al., 1996: 32-bit Division, 0.5um, Asynchronous, Low Power
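
The Vdd² normalization mentioned on this slide can be written as a one-line scaling. This is a minimal sketch that assumes energy scales with the square of the supply voltage. The function name and the sample numbers are hypothetical and are not measurements from the slides.

```python
def normalize_energy_to_1v(energy_pj, vdd_volts):
    """Scale an energy figure to a 1 V supply, assuming energy ~ Vdd^2."""
    if vdd_volts <= 0:
        raise ValueError("supply voltage must be positive")
    return energy_pj / vdd_volts ** 2


# Hypothetical measurement: 144 pJ at a 1.2 V supply.
print(normalize_energy_to_1v(144.0, 1.2))
```

Scaling this way lets designs reported at different supply voltages be compared on one axis. It does not remove differences in process node or operand width, which the slide lists separately for each design.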

slide-44
SLIDE 44

Summary and Conclusion

  • An Asynchronous design
      • Exploits the average case behavior of the Divine Division algorithm
      • Exploits the disparity in data path delays
  • Future Work
      • Add computation in the feedback path
      • Insert another data path
      • Mitigate the effect of sequencing overhead
      • Reduce power consumption
      • Controlling the data inputs to the adder

slide-45
SLIDE 45

Questions?

slide-46
SLIDE 46

Thank You 
