PATMOS 2010: 20th International Workshop on Power and Timing Modeling, Optimization and Simulation, Grenoble, France


SLIDE 1

PATMOS 2010

Residue arithmetic for designing low-power multiply-add units

I. Kouretas and V. Paliouras

Electrical and Computer Engineering Dept., University of Patras, Greece

20th International Workshop on Power and Timing Modeling, Optimization and Simulation, Grenoble, France

SLIDE 2

Outline

  • Review of RNS basics
  • Architecture of RNS-based systems
  • Multi-Vdd RNS architecture
  • Structure of processing units
  • Relevance to RNS bases
  • Results
  • Conclusions

SLIDE 3

Low-power through alternative number representations

  • Sign-magnitude versus two's-complement
      • Depends on data (signal) statistics
  • Logarithmic number system
      • Choice of representation parameters
      • V. Paliouras, T. Stouraitis, "Low-power properties of logarithmic number system," IEEE Symposium on Computer Arithmetic, 2001.
  • Residue representations
      • Numerical properties of RNS
      • Inherently parallel structure
      • T. Stouraitis, V. Paliouras, "Considering the alternatives in low power design," IEEE Circuits and Devices, 2001.

SLIDE 4

RNS basics

  • RNS maps an integer X to an N-tuple of residues x_i:

    $$X \xrightarrow{\text{RNS}} \{x_1, x_2, \ldots, x_N\}$$

  • x_i = X mod m_i, and x_i is called the i-th residue.
  • m_i is a member of a set of pairwise co-prime integers, $\{m_1, m_2, \ldots, m_N\}$, called the base.
  • m_i is called a modulus.
  • Dynamic range: $\prod_{i=1}^{N} m_i$
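The definitions above can be sketched in a few lines of Python. The function names `to_rns` and `from_rns` are illustrative, and the inverse conversion uses the Chinese Remainder Theorem, a standard method that the slides do not spell out. The base {8, 7, 9} is a small example of the form {2^n, 2^n - 1, 2^n + 1}, chosen here only for readability.

```python
from math import prod

def to_rns(x, base):
    """Forward conversion: map integer x to its residues x_i = x mod m_i."""
    return [x % m for m in base]

def from_rns(residues, base):
    """Inverse conversion via the Chinese Remainder Theorem (result in [0, M))."""
    M = prod(base)                           # dynamic range: product of all moduli
    x = 0
    for x_i, m_i in zip(residues, base):
        M_i = M // m_i                       # product of all the other moduli
        x += x_i * M_i * pow(M_i, -1, m_i)   # pow(v, -1, m) is the modular inverse (Python 3.8+)
    return x % M

base = [8, 7, 9]                  # pairwise co-prime, dynamic range 8 * 7 * 9 = 504
residues = to_rns(100, base)      # [4, 2, 1]
assert from_rns(residues, base) == 100
```

Every integer in [0, 504) has a unique residue tuple under this base, so the round trip is exact within the dynamic range.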

SLIDE 5

RNS architecture vs binary architecture

  • Data is processed in L parallel independent channels
  • Benefit: $n_i \ll n$

[Figure: a binary processor with n-bit operands and n-bit results, next to an RNS architecture in which n-bit operands pass through a binary-to-RNS converter, are processed by L parallel mod m_i processors of n_i bits each (mod m_1, mod m_2, ..., mod m_L), and return as n-bit results through an RNS-to-binary converter.]
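A minimal sketch of the channel independence described above: a multiply-add is evaluated separately in each channel and agrees with the ordinary binary result reduced modulo each m_i. The function name `rns_multiply_add` and the operand values are illustrative. The base is one of the four-moduli bases from the results slides.

```python
from math import prod

def rns_multiply_add(a, b, c, base):
    """Compute a*b + c channel by channel; each channel uses only its own residues."""
    return [(a_i * b_i + c_i) % m for a_i, b_i, c_i, m in zip(a, b, c, base)]

base = [16, 31, 2047, 1025]
A, B, C = 1234, 5678, 91011                             # arbitrary example operands
a, b, c = ([v % m for m in base] for v in (A, B, C))    # forward conversion
y = rns_multiply_add(a, b, c, base)
assert y == [(A * B + C) % m for m in base]             # matches the binary computation

n = (prod(base) - 1).bit_length()             # word length of an equivalent binary unit: 30
n_i = [(m - 1).bit_length() for m in base]    # word length of each channel: [4, 5, 11, 11]
```

For this base, a 30-bit binary datapath is replaced by channels of at most 11 bits, which illustrates the benefit n_i << n.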

SLIDE 6

Remarks on RNS architecture

  • The conversion issue
      • Forward and inverse
  • Implementation of moduli channels is not identical
      • There are fast channels and slow channels

[Figure: the RNS architecture of Slide 5: n-bit operands, binary-to-RNS converter, L parallel mod m_i processors of n_i bits each, RNS-to-binary converter, n-bit results.]

SLIDE 7

RNS advantages / disadvantages

Advantages

  • Parallel multiplication or addition
  • Fault tolerance
  • Reduced power dissipation in filters

Disadvantages

  • Difficult comparisons
  • Overflow detection
  • Sign detection
  • Division
  • Scaling / Rounding / Truncation

SLIDE 8

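RNS multi-Vdd architecture

  • p is the number of moduli channels.
  • Dynamic power is quadratically related to the supply voltage:

    $$P_{\text{dyn}} = \sum_{i=1}^{p} C_{L_i} \cdot V_{dd,i}^{2} \cdot f_i \cdot a_i$$

    where, for channel i, $C_{L_i}$ is the switched load capacitance, $V_{dd,i}$ the supply voltage, $f_i$ the clock frequency, and $a_i$ the switching activity.
  • Moduli channels are divided into Vdd(H) and Vdd(L) channels.
  • Benefit: easy to implement.

[Figure: multi-Vdd RNS architecture, from operands to results.]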
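The dynamic power sum above can be evaluated directly. The sketch below assumes that some channels have enough timing slack to run at the lower supply. Every numeric value is hypothetical and chosen only to exercise the formula; none is taken from the slides.

```python
def dynamic_power(channels):
    """Sum C_L * Vdd^2 * f * a over channels; each channel is a (C_L, Vdd, f, a) tuple."""
    return sum(c_l * vdd ** 2 * f * a for c_l, vdd, f, a in channels)

# Hypothetical values (not from the slides):
VDD_H, VDD_L = 1.2, 1.0            # supply voltages in volts
F = 100e6                          # clock frequency in Hz, shared by all channels
C_L = [1e-12, 2e-12, 4e-12]        # switched capacitance per channel, in farads
ACTIVITY = [0.2, 0.2, 0.2]         # switching activity per channel

# Reference design: every channel at the high supply.
single_vdd = dynamic_power([(c, VDD_H, F, a) for c, a in zip(C_L, ACTIVITY)])

# Multi-Vdd design: assume channels 0 and 1 are the fast ones and can use Vdd(L).
low_channels = {0, 1}
multi_vdd = dynamic_power([
    (c, VDD_L if i in low_channels else VDD_H, F, a)
    for i, (c, a) in enumerate(zip(C_L, ACTIVITY))
])

savings_percent = 100 * (single_vdd - multi_vdd) / single_vdd
```

Because of the quadratic term, a channel moved from Vdd(H) to Vdd(L) keeps only the fraction (Vdd(L)/Vdd(H))^2 of its dynamic power.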

SLIDE 9

Employed RNS bases

Bases used in this study:

  • three-moduli base: $\{2^{n_1},\ 2^{n_2}-1,\ 2^{n_3}+1\}$
  • four-moduli base: $\{2^{n_1},\ 2^{n_2}-1,\ 2^{n_3}-1,\ 2^{n_4}+1\}$
  • five-moduli base: $\{2^{n_1},\ 2^{n_2}-1,\ 2^{n_3}-1,\ 2^{n_4}+1,\ 2^{n_5}+1\}$

Cases of common choices in the literature:

[Equation: two families of bases built from moduli of the forms 2^n, 2^n - 1, and 2^n + 1; the exponent indices are not recoverable from the extracted text.]
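As a rough illustration, a base can be generated from its exponents and checked for pairwise co-primality. The sketch assumes the four-moduli form {2^n1, 2^n2 - 1, 2^n3 - 1, 2^n4 + 1}, which matches the base {16, 31, 2047, 1025} in the results. The helper names are hypothetical.

```python
from itertools import combinations
from math import gcd, prod

def four_moduli_base(n1, n2, n3, n4):
    """Build a base of the assumed form {2^n1, 2^n2 - 1, 2^n3 - 1, 2^n4 + 1}."""
    return [2 ** n1, 2 ** n2 - 1, 2 ** n3 - 1, 2 ** n4 + 1]

def is_valid_base(base):
    """A base is usable only if its moduli are pairwise co-prime."""
    return all(gcd(a, b) == 1 for a, b in combinations(base, 2))

base = four_moduli_base(4, 5, 11, 10)
assert base == [16, 31, 2047, 1025]
assert is_valid_base(base)

# Not every exponent choice works: 2^4 - 1 = 15 divides 2^8 - 1 = 255.
assert not is_valid_base(four_moduli_base(4, 4, 8, 10))

dynamic_range_bits = (prod(base) - 1).bit_length()   # bits needed to cover the dynamic range
```

The co-primality check matters because the exponents cannot be chosen freely: moduli of the form 2^n - 1 share factors whenever their exponents do.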

SLIDE 10

Architectures of multiply adders

[Figure: multiply-add unit architectures for three channel types: modulo-$(2^n - 1)$, modulo-$(2^n + 1)$, and modulo-$2^n$ (binary).]
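The three channel types differ in how the result of a*b + c is reduced. The following is a behavioural sketch only, not the hardware architecture shown on the slide. It relies on the standard identities 2^n ≡ 1 (mod 2^n - 1) and 2^n ≡ -1 (mod 2^n + 1), and the function names are illustrative.

```python
def mac_mod_pow2(a, b, c, n):
    """(a*b + c) mod 2^n: keep the n least significant bits."""
    return (a * b + c) & ((1 << n) - 1)

def mac_mod_pow2_minus1(a, b, c, n):
    """(a*b + c) mod (2^n - 1): fold the high bits back in (end-around carry)."""
    m = (1 << n) - 1
    t = a * b + c
    while t > m:
        t = (t & m) + (t >> n)      # valid because 2^n ≡ 1 (mod 2^n - 1)
    return 0 if t == m else t       # the all-ones pattern also represents zero

def mac_mod_pow2_plus1(a, b, c, n):
    """(a*b + c) mod (2^n + 1): subtract the high part from the low part."""
    m = (1 << n) + 1
    t = a * b + c
    lo, hi = t & ((1 << n) - 1), t >> n
    return (lo - hi) % m            # valid because 2^n ≡ -1; Python's % is non-negative

# Cross-check one case of each type against the generic % operator.
assert mac_mod_pow2(13, 11, 7, 4) == (13 * 11 + 7) % 16
assert mac_mod_pow2_minus1(13, 11, 7, 4) == (13 * 11 + 7) % 15
assert mac_mod_pow2_plus1(13, 11, 7, 4) == (13 * 11 + 7) % 17
```

The different reduction steps are one reason why the channels of a base need not have the same complexity or delay.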

SLIDE 11

Four-moduli bases results

base                       Power before (mW)  Power after (mW)  Area (µm²)  Delay (ns)  Power savings (%)
{16, 31, 2047, 1025*}      3.16               2.28              12507.15    2           27.79
{32, 15, 511*, 4097}       3.11               2.36              12056.04    2           24.05
{16, 31, 2047*, 1025*}     3.16               2.07              14390.08    2           34.52
{32, 511*, 2047, 17}       2.99               2.24              11585.72    2           25.01
{16, 31*, 2047, 1025*}     3.16               2.15              12301.9     2           31.9
{32, 511*, 2047*, 17}      2.99               2.03              13468.65    2           32.14
{16, 31*, 2047*, 1025*}    3.16               1.94              14184.83    2           38.63
{256*, 31, 4095, 17}       1.82               1.68              7958.7      2           7.77
{16*, 31, 2047, 1025*}     3.16               2.25              12265.13    2           28.89
{32*, 15, 511*, 4097}      3.11               2.33              12082.38    2           25.08
{16*, 31, 2047*, 1025*}    3.16               2.03              14148.06    2           35.63
{32*, 511*, 2047, 17}      2.99               2.21              11612.06    2           26.08
{16*, 31*, 2047, 1025*}    3.16               2.12              12059.88    2           33
{32*, 511*, 2047*, 17}     2.99               2                 13494.99    2           33.2

SLIDE 12

Five-moduli bases results

base                          Power before (mW)  Power after (mW)  Area (µm²)   Delay (ns)  Power savings (%)
{16, 31, 63, 17, 1025*}       3.3735             2.4955            13912.0799   1.67        26.03
{64, 31, 127, 33*, 65}        2.1642             2.0944            10590.7424   1.3         3.23
{16, 31, 63, 17*, 1025*}      3.3735             2.4145            14082.7567   1.67        28.43
{64, 31, 511*, 17, 33}        3.1573             2.4103            12844.1152   1.73        23.66
{16, 31, 63*, 17, 1025*}      3.3735             2.3016            13527.3711   1.67        31.77
{64, 31, 511*, 17*, 33}       3.1573             2.3293            13014.792    1.73        26.22
{16, 31, 63*, 17*, 1025*}     3.3735             2.2206            13698.0479   1.67        34.18
{64, 63*, 127, 17, 65}        2.2674             2.0735            10667.5744   1.47        8.55
{16, 31*, 63, 17, 1025*}      3.3735             2.3656            13706.8287   1.67        29.88
{64, 63*, 127, 17*, 65}       2.2674             1.9925            10838.512    1.47        12.12
{16, 31*, 63, 17*, 1025*}     3.3735             2.2846            13877.5055   1.67        32.28
{64, 31*, 511*, 17, 33}       3.1573             2.2804            12638.864    1.73        27.77
{16, 31*, 63*, 17, 1025*}     3.3735             2.1717            13322.1199   1.67        35.62
{64, 31*, 511*, 17*, 33}      3.1573             2.1994            12809.5408   1.73        30.34
{16, 31*, 63*, 17*, 1025*}    3.3735             2.0907            13492.7967   1.67        38.03
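The power savings column appears to be the relative reduction between the "before" and "after" power figures. This is an inference from the numbers, not something the slides state. A short check against three rows of the five-moduli table, with a hypothetical helper name:

```python
def power_savings_percent(before_mw, after_mw):
    """Relative power reduction in percent."""
    return 100.0 * (before_mw - after_mw) / before_mw

# (before, after, reported savings) taken from rows of the five-moduli table.
rows = [
    (3.3735, 2.4955, 26.03),
    (2.1642, 2.0944, 3.23),
    (3.3735, 2.0907, 38.03),
]
for before, after, reported in rows:
    assert abs(power_savings_percent(before, after) - reported) < 0.01
```

The four-moduli table lists power to only two decimals, so recomputed savings there can differ from the reported values in the first decimal place.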

SLIDE 13

Conclusions

  • The multi-Vdd low-power technique is suitable for RNS systems.
  • Applying multi-Vdd further reduces power dissipation in RNS systems.
  • High-Vdd and low-Vdd channels can be easily determined.

SLIDE 14

The End

Thank you for your attention!