Chapter 6: Folding
Keshab K. Parhi
- Folding is a technique to reduce silicon area by time-multiplexing many algorithm operations onto a single functional unit (such as an adder or a multiplier).
- Fig. (a) shows a DSP program: y(n) = a(n) + b(n) + c(n).
- Fig. (b) shows a folded architecture in which the 2 additions are folded (time-multiplexed) onto a single pipelined adder. One output sample is produced every 2 clock cycles, so each input must be held valid for 2 clock cycles.
- In general, the data on the input of a folded realization is assumed to be valid for N cycles before changing, where N is the number of algorithm operations executed on a single functional unit in hardware.
- Chap. 6
Folding Transformation:
- Nl + u and Nl + v are, respectively, the time units at which the l-th iteration of nodes U and V is scheduled.
- u and v are called folding orders (the time partition at which the node is scheduled to be executed) and satisfy 0 ≤ u, v ≤ N − 1.
- N is the folding factor, i.e., the number of operations folded onto a single functional unit.
- H_U and H_V are the functional units that execute nodes U and V, respectively.
- H_U is pipelined by P_U stages, so its output is available at Nl + u + P_U.
- Edge U→V has w(e) delays ⇒ the l-th iteration of U is used by the (l + w(e))-th iteration of node V, which is executed at N(l + w(e)) + v. So the result must be stored for:
D_F(U→V) = [N(l + w(e)) + v] − [Nl + P_U + u]
⇒ D_F(U→V) = N w(e) − P_U + v − u (independent of l)
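The claim that the storage time does not depend on l can be checked numerically. The following is a minimal sketch; the function names and the edge values (N = 4, w(e) = 1, P_U = 1, u = 3, v = 1) are chosen here only for illustration.

```python
def folded_delay(N, w, P_U, u, v):
    """Closed form: D_F(U->V) = N*w(e) - P_U + v - u."""
    return N * w - P_U + v - u


def storage_time(N, w, P_U, u, v, l):
    """Consumption time minus production time for the l-th iteration of U."""
    produced = N * l + u + P_U      # output of H_U becomes available
    consumed = N * (l + w) + v      # (l + w)-th iteration of V executes
    return consumed - produced


# Illustrative edge values (assumed, not tied to a specific figure).
N, w, P_U, u, v = 4, 1, 1, 3, 1

# The same delay is obtained for every iteration index l.
delays = {storage_time(N, w, P_U, u, v, l) for l in range(10)}
```

The set `delays` collapses to a single value, which matches the closed-form expression.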
- Folding set: an ordered set of N operations executed by the same functional unit. The operations are ordered from 0 to N − 1, and some of them may be null. For example, the folding set S1 = {A1, ∅, A2} is for folding factor N = 3. A1 has a folding order of 0 and A2 a folding order of 2; they are denoted by (S1|0) and (S1|2), respectively.
- Example: folding a retimed biquad filter by N = 4.
Addition time = 1 u.t., multiplication time = 2 u.t., with a 1-stage pipelined adder and a 2-stage pipelined multiplier (i.e., P_A = 1 and P_M = 2).
The folding sets are S1 = {4, 2, 3, 1} and S2 = {5, 8, 6, 7}.
Folding equations for each of the 11 edges are as follows:
D_F(1→2) = 4(1) − 1 + 1 − 3 = 1
D_F(1→5) = 4(1) − 1 + 0 − 3 = 0
D_F(1→6) = 4(1) − 1 + 2 − 3 = 2
D_F(1→7) = 4(1) − 1 + 3 − 3 = 3
D_F(1→8) = 4(2) − 1 + 1 − 3 = 5
D_F(3→1) = 4(0) − 1 + 3 − 2 = 0
D_F(4→2) = 4(0) − 1 + 1 − 0 = 0
D_F(5→3) = 4(0) − 2 + 2 − 0 = 0
D_F(6→4) = 4(1) − 2 + 0 − 2 = 0
D_F(7→3) = 4(1) − 2 + 2 − 3 = 1
D_F(8→4) = 4(1) − 2 + 0 − 1 = 1
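The eleven folding equations can be reproduced mechanically from the folding sets. The sketch below assumes that the edge delays w(e) are the ones that appear as the first factor in each equation above (the retimed biquad graph itself is not reproduced here); the variable names are illustrative.

```python
N = 4
S1 = [4, 2, 3, 1]   # adder folding set: position in the list = folding order
S2 = [5, 8, 6, 7]   # multiplier folding set

# Folding order of every node, read from its position in its folding set.
order = {node: pos for fs in (S1, S2) for pos, node in enumerate(fs)}

# Pipeline depth of the unit executing each node: P_A = 1, P_M = 2.
pipe = {**{n: 1 for n in S1}, **{n: 2 for n in S2}}

# (U, V, w(e)) for each edge; w(e) taken from the equations in the text.
edges = [
    (1, 2, 1), (1, 5, 1), (1, 6, 1), (1, 7, 1), (1, 8, 2),
    (3, 1, 0), (4, 2, 0), (5, 3, 0),
    (6, 4, 1), (7, 3, 1), (8, 4, 1),
]

# D_F(U->V) = N*w(e) - P_U + v - u
DF = {(U, V): N * w - pipe[U] + order[V] - order[U] for U, V, w in edges}

for (U, V), d in DF.items():
    print(f"D_F({U}->{V}) = {d}")
```

Every value is non-negative, so this folding is realizable without further retiming.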
- Retiming for Folding:
– For a folded system to be realizable, D_F(U→V) ≥ 0 must hold for all edges.
– If D'_F(U→V) is the folded delay on edge U→V of the retimed graph, then D'_F(U→V) ≥ 0. So,
N w_r(e) − P_U + v − u ≥ 0, where w_r(e) = w(e) + r(V) − r(U)
⇒ N(w(e) + r(V) − r(U)) − P_U + v − u ≥ 0
⇒ r(U) − r(V) ≤ D_F(U→V) / N
⇒ r(U) − r(V) ≤ ⌊D_F(U→V) / N⌋ (since retiming values are integers)
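As a small illustration of the floor in the last inequality, the sketch below computes the bound for one hypothetical edge with a negative folded delay and confirms that a retiming meeting the bound makes the folded delay non-negative. The function names and the value D_F = −3 are assumptions for the example; solving the full system of inequalities over all edges is not shown.

```python
def retiming_bound(d_f, n):
    """Largest integer allowed for r(U) - r(V): floor(D_F(U->V) / N)."""
    return d_f // n  # Python's // rounds toward negative infinity


def retimed_folded_delay(d_f, n, r_u, r_v):
    """D'_F(U->V) = D_F(U->V) + N * (r(V) - r(U))."""
    return d_f + n * (r_v - r_u)


N = 4
d_f = -3                         # hypothetical edge, not realizable as is
bound = retiming_bound(d_f, N)   # floor(-0.75)

# Choose retiming values with r(U) - r(V) equal to the bound.
r_u, r_v = 0, 1
new_delay = retimed_folded_delay(d_f, N, r_u, r_v)
```

Floor division is used deliberately: truncating toward zero would give 0 for this edge and leave the folded delay negative.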
- Register minimization technique: lifetime analysis is used for register minimization in DSP hardware.
- A data sample or variable is live from the time it is produced through the time it is consumed. After that it is dead.
- Linear lifetime chart: represents the lifetimes of the variables in a linear fashion.
- Example:
Note: The linear lifetime chart uses the convention that a variable is not live during the clock cycle in which it is produced but is live during the clock cycle in which it is consumed.
- Due to the periodic nature of DSP programs, the lifetime chart can be drawn for only one iteration to give an indication of the number of registers needed. This is done as follows:
Let N be the iteration period.
Let the number of live variables at a time partition n ≥ N be the sum of the numbers of live variables due to the 0-th iteration at cycles n − kN for k ≥ 0.
In the example, the number of live variables at cycle 7 ≥ N (= 6) is the sum of the numbers of live variables due to the 0-th iteration at cycles 7 and (7 − 1×6) = 1, which is 2 + 1 = 3.
- Matrix transpose example:
Matrix Transposer (sample a is first in time, on the right)
Input:  i | h | g | f | e | d | c | b | a
Output: i | f | c | h | e | b | g | d | a
Sample   T_in   T_zlout   T_diff   T_out   Life
a        0      0          0       4       0→4
b        1      3          2       7       1→7
c        2      6          4       10      2→10
d        3      1         −2       5       3→5
e        4      4          0       8       4→8
f        5      7          2       11      5→11
g        6      2         −4       6       6→6
h        7      5         −2       9       7→9
i        8      8          0       12      8→12
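T_zlout is the zero-latency output time and T_diff = T_zlout − T_in. To make the system causal, a latency of 4 (the magnitude of the most negative T_diff) is added, so that T_out = T_zlout + 4 is the actual output time.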
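The lifetime table for the transposer can be derived from the input and output orderings alone. The sketch below assumes one sample enters per cycle starting at cycle 0 and that the zero-latency output follows the transposed order a, d, g, b, e, h, c, f, i; the variable names are illustrative.

```python
inputs = list("abcdefghi")    # arrival order, one sample per cycle
outputs = list("adgbehcfi")   # order in which the transposer must emit them

t_in = {s: t for t, s in enumerate(inputs)}
t_zlout = {s: t for t, s in enumerate(outputs)}     # zero-latency output time
t_diff = {s: t_zlout[s] - t_in[s] for s in inputs}

# Smallest latency that makes every T_diff non-negative (causal system).
latency = max(0, -min(t_diff.values()))

t_out = {s: t_zlout[s] + latency for s in inputs}
life = {s: (t_in[s], t_out[s]) for s in inputs}

for s in inputs:
    print(s, t_in[s], t_zlout[s], t_diff[s], t_out[s], life[s])
```

Sample g sets the latency, since it arrives at cycle 6 but would have to leave at cycle 2 in a zero-latency schedule.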
- Circular lifetime chart: useful for representing the periodic nature of DSP programs.
- In a circular lifetime chart of periodicity N, the point marked i (0 ≤ i ≤ N − 1) represents the time partition i and all time instances {Nl + i}, where l is any non-negative integer.
- For example: if N = 8, then time partition i = 3 represents time instances {3, 11, 19, …}.
- Note: A variable produced during time unit j and consumed during time unit k is shown to be alive from j + 1 to k.
- The numbers in brackets in the adjacent figure correspond to the number of live variables at each time partition.
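The per-partition counts on a circular lifetime chart can be computed by folding each variable's live cycles modulo N. The sketch below applies this to the matrix transposer lifetimes from the table earlier in this chapter, assuming a period of N = 9 (one 3×3 matrix every 9 cycles); the function name is illustrative.

```python
def live_per_partition(lifetimes, N):
    """Count live variables in each time partition 0..N-1.

    A variable produced in cycle t_in and consumed in cycle t_out is
    live during cycles t_in + 1 .. t_out (inclusive).
    """
    counts = [0] * N
    for t_in, t_out in lifetimes:
        for cycle in range(t_in + 1, t_out + 1):
            counts[cycle % N] += 1   # cycle Nl + i maps to partition i
    return counts


# (T_in, T_out) for samples a..i of the matrix transposer.
transposer = [(0, 4), (1, 7), (2, 10), (3, 5), (4, 8),
              (5, 11), (6, 6), (7, 9), (8, 12)]

counts = live_per_partition(transposer, 9)
min_registers = max(counts)
```

The maximum over all partitions is the minimum number of registers; for these lifetimes every partition holds the same number of live variables.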
Forward-Backward Register Allocation Technique:
Note: Hashing (marking register cells as unavailable) is done to avoid conflicts during backward allocation.
Steps for forward-backward register allocation:
1. Determine the minimum number of registers using lifetime analysis.
2. Input each variable at the time step corresponding to the beginning of its lifetime. If multiple variables are input in a given cycle, they are allocated to multiple registers, with preference given to the variable with the longest lifetime.
3. Allocate each variable in a forward manner until it is dead or it reaches the last register. In forward allocation, if register i holds the variable in the current cycle, then register i + 1 holds the same variable in the next cycle. If register i + 1 is not free, use the first available forward register.
4. Because the allocation is periodic, it repeats in each iteration. So hash out register Rj for cycle l + N if it holds a variable during cycle l.
5. Variables that reach the last register and are still alive are allocated in a backward manner on a first-come, first-served basis.
6. Repeat steps 4 and 5 until the allocation is complete.
- Example: forward-backward register allocation
- Folded architecture for matrix transposer:
- Register minimization in folded architectures:
1. Perform retiming for folding.
2. Write the folding equations.
3. Use the folding equations to construct a lifetime table.
4. Draw the lifetime chart and determine the required number of registers.
5. Perform forward-backward register allocation.
6. Draw the folded architecture that uses the minimum number of registers.
- Example: Biquad Filter
Steps 1 & 2 have already been done.
Step 3: The lifetime table is then constructed. The 2nd row is empty because node 2 has no outgoing edge, so D_F(2→U) is not present.
Node   T_in → T_out
1      4 → 9
2      (empty)
3      3 → 3
4      1 → 1
5      2 → 2
6      4 → 4
7      5 → 6
8      3 → 4
Note: As retiming for folding ensures causality, we need not add any latency.
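The lifetime table in step 3 follows directly from the folding equations: a result of node U is produced at T_in = u + P_U and must be kept until T_out = T_in + max over V of D_F(U→V). The sketch below assumes the folding orders, pipeline depths, and D_F values stated earlier for the biquad example; the variable names are illustrative.

```python
# Folding orders from S1 = {4, 2, 3, 1} and S2 = {5, 8, 6, 7}.
order = {4: 0, 2: 1, 3: 2, 1: 3, 5: 0, 8: 1, 6: 2, 7: 3}

# Pipeline depth: adders (nodes 1-4) use P_A = 1, multipliers (5-8) P_M = 2.
pipe = {1: 1, 2: 1, 3: 1, 4: 1, 5: 2, 6: 2, 7: 2, 8: 2}

# Folded delays D_F(U->V) from the eleven folding equations.
DF = {
    (1, 2): 1, (1, 5): 0, (1, 6): 2, (1, 7): 3, (1, 8): 5,
    (3, 1): 0, (4, 2): 0, (5, 3): 0,
    (6, 4): 0, (7, 3): 1, (8, 4): 1,
}

table = {}
for (U, V), d in DF.items():
    t_in = order[U] + pipe[U]                       # result of U is produced
    prev_out = table.get(U, (t_in, t_in))[1]
    table[U] = (t_in, max(prev_out, t_in + d))      # latest consumer wins

for node in sorted(table):
    print(node, table[node])
```

Node 2 never appears as a source in `DF`, which is why it has no row in the table.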
Step 4: The lifetime chart is constructed and the number of registers is determined.
Step 5: Forward-backward register allocation is performed.
Step 6: The folded architecture is drawn with the minimum number of registers.