Generic Circuit Operators, Jean Vuillemin, École Normale Supérieure



SLIDE 1

DCC/ETAPS02 Jean.Vuillemin@ens.fr 1

Generic Circuit Operators

Jean Vuillemin

École Normale Supérieure, Paris

  • Minimal area meets IO/Bandwidth.
  • Maximal IO/Bandwidth meets area.
  • Fit design to technology constraints.

Motivations: synthesis tools to ease exploring area/speed hardware trade-offs.

SLIDE 2


Overview

  • Multi-media Data Flow: MPEG, FHT, …
  • Focus on feed-forward networks.

  • Generic operators: ¬, ∪, ∩, ⊕, ⊗, +, −, ×, P
  • Synthesis driven by input types.
  • Types implement identical semantics.
  • Systematic Area/Time trade-offs.
  • Efficient software synthesis.
SLIDE 3


Digital Number

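$\mathbb{D} \cong 2^{\mathbb{N}} \cong (\mathbb{N} \to \mathbb{F}_2) \cong \mathbb{Z}_2(z) \cong {}_2\mathbb{Z}$

Integer set: $\{b\} = \{ n \in \mathbb{N} : b_n = 1 \}$

z-series: $b(z) = \int_0^{\infty} b(t)\, z^t\, dt = \sum_{n \in \mathbb{N}} b_n z^n$

2-adic integer: $b(2) = \int_0^{\infty} b(t)\, 2^t\, dt = \sum_{n \in \mathbb{N}} b_n 2^n$

Binary sequence: $b = [\, b_0 \; b_1 \; b_2 \cdots ]$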

Synchronous signal: $b(t) = \sum_{N} b_N\, \partial(t - N)$

Norm: $|\cdot| : \mathbb{D} \to 2^{-\mathbb{N}}$, with $|1 + 2b| = 1$ and $|2b| = |b| / 2$

Distance: $|b - b'| = |b \oplus b'|$

Truncation to the first $n$ bits: $b_{[n+1]} = b_{[n]} + 2^n b_n$, and $|b - b_{[n]}| < 2^{1-n}$
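The norm and distance on this slide can be checked numerically on ordinary integers, which Python already treats as two's-complement bit sequences of unbounded length. The sketch below is only an illustration under that assumption; the helper names `norm` and `truncate` are hypothetical and do not come from the slides.

```python
from fractions import Fraction

def norm(b: int) -> Fraction:
    """2-adic norm |b| = 2^-v, where v is the index of the lowest set bit; |0| = 0."""
    if b == 0:
        return Fraction(0)
    v = (b & -b).bit_length() - 1   # b & -b isolates the lowest set bit
    return Fraction(1, 2 ** v)

def truncate(b: int, n: int) -> int:
    """Keep the n low-order bits b_0 .. b_{n-1} of b (assumed reading of truncation)."""
    return b & ((1 << n) - 1)

a, b = 12, 10
dist_sub = norm(a - b)   # distance computed with integer subtraction
dist_xor = norm(a ^ b)   # distance computed with bitwise XOR
odd_norm = norm(1 + 2 * 7)
```

Both distances agree because the lowest bit where two sequences differ is also the lowest set bit of their difference.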

SLIDE 4


Minus x

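$x = \sum_N x_N 2^N$, $r = \sum_N r_N 2^N$, $y = \sum_N y_N 2^N$, $u = \sum_N u_N 2^N$

Specification: $y = -x$

Circuit: $r = z\,u$, $y = x \oplus r$, $u = x \cup r$

Bit level: $r_N = u_{N-1}$, $u_{-1} = 0$, $y_N = x_N \oplus r_N$, $u_N = x_N \cup r_N$

Arithmetic form: $y_N = x_N + r_N - 2\,x_N r_N$, $r_N = x_{N-1} + r_{N-1} - x_{N-1} r_{N-1}$

Summing over $N$, with $p = \sum_N x_N r_N 2^N$: $y = x + r - 2p$, $r = 2(x + r - p)$

Eliminating $p$: $r - y = x + r$, hence $y = -x$.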
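The negation circuit and its arithmetic identities can be exercised on a finite word. This is a minimal sketch, assuming the recurrences read as y_N = x_N ⊕ r_N and r_{N+1} = x_N ∪ r_N with r_0 = 0; the function name `negate_serial` and the fixed `width` are illustrative choices, not part of the slides.

```python
def negate_serial(x: int, width: int):
    """Bit-serial negation, least significant bit first.

    Returns (y, r): the negated word and the carry word r, both on `width` bits.
    """
    r_bit = 0                      # r_0 = 0
    y = 0
    r = 0
    for n in range(width):
        x_bit = (x >> n) & 1
        y |= (x_bit ^ r_bit) << n  # y_N = x_N xor r_N
        r |= r_bit << n            # record r_N
        r_bit = x_bit | r_bit      # u_N = x_N or r_N becomes r_{N+1}
    return y, r

y, r = negate_serial(20, 8)
p = 20 & r                         # p collects the products x_N * r_N
```

On 8 bits the identities hold modulo 2^8, since the carry leaving the top bit is dropped.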

SLIDE 5


Digital Algebra

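$\mathbb{D} \cong 2^{\mathbb{N}} \cong (\mathbb{N} \to \mathbb{F}_2) \cong \mathbb{Z}_2(z) \cong {}_2\mathbb{Z}$

$\langle \mathbb{D}, \neg, \cap, \cup \rangle$ is a Boolean Algebra isomorphic to the subsets of $\mathbb{N}$.

$\langle \mathbb{D}, z, \oplus, \otimes \rangle$ is an Integral Domain isomorphic to the power series $\mathbb{Z}_2(z)$.

$\langle \mathbb{D}, -, +, \times \rangle$ is an Integral Domain isomorphic to the 2-adic integers ${}_2\mathbb{Z}$.

Subsets of $\mathbb{D}$, by increasing generality: Finite ⊂ Integer ⊂ Rational ⊂ Algebraic ⊂ Computable ⊂ $\mathbb{D}$

Type T implements D: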

1. T supports some subset of the Digital operators.

2. For each supported operator, the semantics is that of D.

Operators: T.not, T.and, T.or, T.xor, T.shift, T.conv, T.add, T.sub, T.mul, T.input, T.constant
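As an illustration of what "T implements D" can mean, here is a sketch of one hypothetical type that keeps only the low-order bits of a 2-adic integer. The class name `Word`, the 16-bit default and the reading of `conv` as the carry-less (power-series) product are all assumptions made for the example; `input` and `constant` are left out.

```python
class Word:
    """Hypothetical type T: a 2-adic integer truncated to its `width` low-order bits."""

    def __init__(self, value: int, width: int = 16):
        self.width = width
        self.value = value & ((1 << width) - 1)

    def _new(self, value: int) -> "Word":
        return Word(value, self.width)

    # Boolean algebra on bit sequences
    def not_(self): return self._new(~self.value)
    def and_(self, other): return self._new(self.value & other.value)
    def or_(self, other): return self._new(self.value | other.value)
    def xor(self, other): return self._new(self.value ^ other.value)

    # Power series: shift multiplies by z^k, conv is assumed to be the carry-less product
    def shift(self, k: int = 1): return self._new(self.value << k)

    def conv(self, other):
        acc = 0
        for n in range(self.width):
            if (other.value >> n) & 1:
                acc ^= self.value << n
        return self._new(acc)

    # 2-adic integer arithmetic (exact on the retained bits)
    def add(self, other): return self._new(self.value + other.value)
    def sub(self, other): return self._new(self.value - other.value)
    def mul(self, other): return self._new(self.value * other.value)

# -x and (not x) + 1 name the same digital number
minus_five_a = Word(0).sub(Word(5))
minus_five_b = Word(5).not_().add(Word(1))
square = Word(3).conv(Word(3))   # (1 + z)^2 = 1 + z^2 over F_2
```

Truncation is compatible with every operator above because bit n of each result depends only on bits 0..n of the operands.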

SLIDE 6


Area vs. Time

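$y = -x$, with $x = \sum_N x_N 2^N$ and $y = \sum_N y_N 2^N$

W[1], $z = 2$ (one bit per cycle):

$r = z\,(x \cup r)$, $y = x \oplus r$

W[n], $z = 2^n$ ($n$ bits per cycle); for $n = 2$:

$r[1] = x[0] \cup r[0]$, $y[0] = x[0] \oplus r[0]$, $r[0] = z\,(x[1] \cup r[1])$, $y[1] = x[1] \oplus r[1]$

where $x = x[0] + 2\,x[1]$, $y = y[0] + 2\,y[1]$, $x[0] = \sum_N x_{2N} 4^N$, $y[0] = \sum_N y_{2N} 4^N$, $x[1] = \sum_N x_{2N+1} 4^N$, $y[1] = \sum_N y_{2N+1} 4^N$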
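The trade-off between the one-bit and two-bit versions of the negation circuit can be simulated directly. The following is a sketch under the assumption that the two-bit circuit chains the carry combinationally from the even bit to the odd bit and registers it once per cycle; the function names and the cycle count returned are illustrative.

```python
def negate_w1(x: int, width: int):
    """One bit per cycle: one XOR, one OR, one carry register."""
    r, y = 0, 0
    for cycle in range(width):
        bit = (x >> cycle) & 1
        y |= (bit ^ r) << cycle
        r = bit | r                      # registered for the next cycle
    return y, width                      # result, cycles used

def negate_w2(x: int, width: int):
    """Two bits per cycle: twice the gates, still one carry register (width must be even)."""
    r, y = 0, 0
    for cycle in range(width // 2):
        x0 = (x >> (2 * cycle)) & 1      # even bit of this cycle
        x1 = (x >> (2 * cycle + 1)) & 1  # odd bit of this cycle
        r1 = x0 | r                      # combinational carry into the odd bit
        y |= (x0 ^ r) << (2 * cycle)
        y |= (x1 ^ r1) << (2 * cycle + 1)
        r = x1 | r1                      # registered for the next cycle
    return y, width // 2

serial = negate_w1(20, 8)
wide = negate_w2(20, 8)
```

Both versions compute the same word; the wider one halves the cycle count at the cost of doubling the gates.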

SLIDE 7


Jazz

http://www.exentis.com/jazz/

  • Goals
    – High-level language for synchronous circuits
    – Single source from specification to synthesis
    – Invariant 2-adic semantics
    – Circuit proofs by symbolic evaluation

  • Means
    – Strong types & inference = ML, Haskell, Lava
    – Higher types & lazy evaluation = ML, Haskell, Lava
    – Objects & classes = Java, Haskell
    – Generic operators & overload = C++, PamDC
    – Nets as a first class type > Lava, JHDL
    – Net-lists are not first-class < Lava, JHDL
    – Symbolic net-lists can be programmed = Lava, JHDL

SLIDE 8


Example

fun SumDiff (a,b) = (s,d) { s = a+b; d = a-b; }

Generic code interface

fun SumDiff@(a,b:T)->(s,d:T) { s = T.add(a,b); d = T.sub(a,b); }

Type T default Implementation: All types implement the same 2-adic semantics.

fun SumDiff@H { H(s,d)= Haddsub(H(a,b)); }

Specific type H implementation

  • SumDiff@N generates bit-level simulator. As efficient as BigNum package: N.and, …, N.add, N.sub, …
  • SumDiff@H generates hyper-serial circuit. Area > A(1)/2. Bandwidth > B(1)/2. z=1/2
  • SumDiff@W(1) generates bit-serial circuit. Least area A(1). Least bandwidth B(1). z=2
  • SumDiff@W(4) generates nibble-serial circuit. Area <16A(1). Bandwidth <16B(1). z=16
  • SumDiff@W(∞) generates an ∞ parallel Boolean Circuit.
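The idea of one source instantiated at several types can be mimicked in Python. This is only a sketch, not Jazz: `N` stands in for a big-number simulator and `W1` for a bit-serial circuit built from a single full adder, both limited to an assumed 8-bit word. The class names echo the slide, but their bodies are hypothetical.

```python
WIDTH = 8  # assumed word length for the example

class N:
    """Integer type: one big-number operation per call (values mod 2**WIDTH)."""
    @staticmethod
    def add(a, b): return (a + b) % (1 << WIDTH)
    @staticmethod
    def sub(a, b): return (a - b) % (1 << WIDTH)

class W1:
    """Bit-serial type: one full adder, one bit per cycle, least significant bit first."""
    @staticmethod
    def _serial(a, b, invert_b, carry):
        out = 0
        for n in range(WIDTH):
            x = (a >> n) & 1
            y = ((b >> n) & 1) ^ invert_b
            out |= (x ^ y ^ carry) << n                # sum bit
            carry = (x & y) | (carry & (x ^ y))        # carry register
        return out
    @staticmethod
    def add(a, b): return W1._serial(a, b, 0, 0)
    @staticmethod
    def sub(a, b): return W1._serial(a, b, 1, 1)       # a + not(b) + 1

def sum_diff(T, a, b):
    """Generic SumDiff: the same code for every type T."""
    return T.add(a, b), T.sub(a, b)

as_integers = sum_diff(N, 100, 58)
as_bit_serial = sum_diff(W1, 100, 58)
```

The two instantiations differ in cost per call but return the same values, which is the sense in which all types share one semantics.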

SLIDE 9


JPEG DCT

software      add   sub   mul   mask   cycles/px
1px / 32b      14    15     5      5          39
2px / 32b       7     8     3      4          21
4px / 64b       4     4     3      4          14

Typing: $a, b \in \mathbb{T}$: $a + b \in \mathbb{T}$, $a - b \in \mathbb{T}$; $p \in \mathbb{P}$, $a \in \mathbb{T}$: $p \times a \in \mathbb{T}$

hardware          fulladd   reg   cycles/px   add/px
1 bit / cycle          59    89          12      708
0.5 bit / cycle      29.5   178          24      708
2 bit / cycle         118    45           6      708
4 bit / cycle         236    22           3      708
12 bit / cycle        622                 1      622

[Figure: 3×3 matrix product x = …, with entries c2 c1 c0, b2 b1 b0, p2 p1 p0, r2 r1 r0, q2 q1 q0]
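Reading the hardware figures as full adders, registers, cycles per pixel and additions per pixel (an assumption about the flattened column order), the last column is the product of adders and cycles. A short check of that arithmetic, with the row values copied from the table:

```python
# (bits per cycle, full adders, cycles per pixel, additions per pixel)
rows = [
    (1,   59,   12, 708),
    (0.5, 29.5, 24, 708),
    (2,   118,  6,  708),
    (4,   236,  3,  708),
    (12,  622,  1,  622),
]

# Work per pixel = adders available each cycle * cycles spent on the pixel
products = [fulladd * cycles for _, fulladd, cycles, _ in rows]
matches = [prod == adds for prod, (_, _, _, adds) in zip(products, rows)]
```

The serial designs all spend the same 708 additions per pixel and only redistribute them between area and time; the 12-bit row is the one that needs fewer.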

SLIDE 10


Linear Hough Transform

$h(L) = \sum_{p \in L} p$

$L_{\max}$: $h(L_{\max}) = \max_L \{ h(L) \}$

SLIDE 11


Fast Hough Transform FHT

fun FHT(n: int)(in : _[n]) = ht : _[n] {
  if (n==1) ht = in;                 // end of recursion
  else {
    m = n div 2;                     // middle point
    lh = FHT(m)(in[0..m-1]);         // Left Histogram
    rh = FHT(m)(in[m..n-1]);         // Right Histogram
    for (k<m) {                      // FHT Butterfly
      dh[k] = lh[k] << k;            // Delay k lines
      ht[2*k] = dh[k] + rh[k];       // even Histogram
      ht[2*k+1] = rh[k]+dh[k]<<1;    // odd Histogram
    }
  }
}
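A software rendering of the same recursion may help in following the butterfly. This sketch makes several assumptions that are not in the slide: each input is a column of pixel values listed line by line, `<<` is read as a delay by whole lines, the odd output is taken as rh[k] + (dh[k] << 1), and n is a power of two. The helper names are hypothetical.

```python
def delay(stream, k):
    """Delay a stream by k lines: prepend k zeros and keep the original length."""
    return ([0] * k + list(stream))[:len(stream)]

def add(a, b):
    """Pixel sum of two streams, line by line."""
    return [x + y for x, y in zip(a, b)]

def fht(cols):
    """Recursive Fast Hough Transform over n = len(cols) columns (n a power of two)."""
    n = len(cols)
    if n == 1:
        return [list(cols[0])]           # end of recursion
    m = n // 2                           # middle point
    lh = fht(cols[:m])                   # left histograms
    rh = fht(cols[m:])                   # right histograms
    ht = [None] * n
    for k in range(m):                   # butterfly
        dh = delay(lh[k], k)             # delay k lines
        ht[2 * k] = add(dh, rh[k])       # even histogram
        ht[2 * k + 1] = add(rh[k], delay(dh, 1))   # odd histogram
    return ht

# A 4x4 image holding one diagonal: column i is lit on line i.
image_columns = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
histograms = fht(image_columns)
```

With this input the steepest output collects all four pixels on the last line, while the flat output sees one pixel per line.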

SLIDE 12


FHT Circuit

[Figure: FHT circuit with 1-bit, 2-bit and 3-bit stages. Line delay: z. Pixel sum: +.]

  • Serial
  • Numeric
  • Parallel
  • Symbolic => Circuit Proofs
SLIDE 13


TRT Circuit

[Figure: TRT circuit. Labels: Input Line Receiver, Bit Reverse, Max Tree, Max Serial.]

Minimize registers: 2a+2b = 2(a+b), max(2a,2b) = 2max(a,b)
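The two identities let a synthesis pass move delays from the inputs of an adder or a max to its output, which saves registers. A toy rewriter illustrates the effect; it assumes that multiplying by 2 stands for a one-cycle delay written 'z', and the tuple encoding and function names are invented for this example.

```python
def hoist(e):
    """Rewrite op(z a, z b) into z op(a, b), bottom-up, for op in {'add', 'max'}.

    Expressions are input names (str), ('z', e) delays, or (op, left, right).
    """
    if isinstance(e, str):
        return e
    if e[0] == 'z':
        return ('z', hoist(e[1]))
    op, a, b = e[0], hoist(e[1]), hoist(e[2])
    both_delayed = (not isinstance(a, str) and not isinstance(b, str)
                    and a[0] == 'z' and b[0] == 'z')
    if both_delayed:
        return ('z', hoist((op, a[1], b[1])))
    return (op, a, b)

def registers(e):
    """Count delay nodes in an expression."""
    if isinstance(e, str):
        return 0
    if e[0] == 'z':
        return 1 + registers(e[1])
    return registers(e[1]) + registers(e[2])

before = ('max', ('add', ('z', 'a'), ('z', 'b')), ('z', 'c'))
after = hoist(before)
```

Here three delays collapse into one at the output of the max.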

SLIDE 14


Conclusions

  • Methodology
    – All hardware synthesis from a single source code.
    – All (must) implement the same 2-adic semantics.
    – Software synthesis from same source.

  • JPEG Synthesis
    – Compare JPEG layout for W(4), W(8) and W(12).
    – Dynamic instructions support Hyper-Serial implementations: half size/half rate JPEG on CHESS.

  • FHT Synthesis
    – Compare FHT layout for W(1) through W(8).
    – Uses simple symbolic simplification along synthesis: 0-fold, register swap.

  • Software Synthesis
    – Efficiency from underlying BigNum package.
    – Limited by I/O corner turning.
    – Symbolic evaluation can lead to circuit proofs: periodic, algebraic, …