CS137: Electronic Design Automation
Day 11: October 21, 2005
Retiming
CALTECH CS137 Fall2005 -- DeHon

Today
• Retiming
  – Cycle time (clock period)
  – C-slow
  – Initial states
  – Register minimization
  – Necessary delays (time permitting)

Task
• Move registers to:
  – Preserve semantics
  – Minimize path length between registers
  – (make path length 1 for maximum throughput or reuse)
  – Maximize reuse rate
  – …while minimizing number of registers required

Problem
• Given: clocked circuit
• Goal: minimize clock period without changing (observable) behavior
• I.e., minimize maximum delay between any pair of registers
• Freedom: move placement of internal registers

Other Goals
• Minimize number of registers in circuit
• Achieve target cycle time
• Minimize number of registers while achieving target cycle time
• …start talking about minimizing cycle...

Simple Example
[figure]
Path Length (L) = 4
Can we do better?

Canonical Graph Representation
[figure]
• Separate arc for each path
• Weight edges by number of registers
• (weight nodes by delay through node)

Legal Register Moves
• Retiming Lag/Lead
[figure]

Critical Path Length
• Critical Path: length of the longest path with zero register weight (no registers between its nodes)
• Compute in O(|E|) time by levelizing network:
  – Topological sort, push path lengths forward until find register.

Retiming Lag/Lead
• Retiming: Assign a lag to every vertex
• weight(e′) = weight(e) + lag(head(e)) - lag(tail(e))

Valid Retiming
• Retiming is valid as long as:
  – ∀ e in graph
    • weight(e′) = weight(e) + lag(head(e)) - lag(tail(e)) ≥ 0
• Assuming original circuit was a valid synchronous circuit, this guarantees:
  – non-negative register weights on all edges
    • no travel backward in time :-)
  – all cycles have strictly positive register counts
  – propagation delay on each vertex is non-negative (assumed 1 for today)

Retiming Task
• Move registers ≡ assign lags to nodes
  – lags define all locally legal moves
• Preserving non-negative edge weights
  – (previous slide)
  – guarantees collection of lags remains consistent globally
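
The lag rule and the levelized critical-path computation can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not code from the lecture: the function names, the `(tail, head, registers)` edge triples, and the four-node ring are all hypothetical, and every node is given unit delay as assumed for today.

```python
from collections import defaultdict, deque

def retime(edges, lag):
    """Apply weight(e') = weight(e) + lag(head(e)) - lag(tail(e)).

    edges is a list of (tail, head, register_count) triples (assumed encoding).
    """
    return [(t, h, w + lag[h] - lag[t]) for t, h, w in edges]

def is_valid(edges):
    """A retiming is legal when no edge ends up with a negative register count."""
    return all(w >= 0 for _, _, w in edges)

def critical_path(nodes, edges):
    """Longest unit-delay path along register-free edges, found by levelizing."""
    succ = defaultdict(list)
    indeg = {v: 0 for v in nodes}
    for t, h, w in edges:
        if w == 0:                      # only register-free edges extend a path
            succ[t].append(h)
            indeg[h] += 1
    arrival = {v: 1 for v in nodes}     # each node contributes delay 1
    ready = deque(v for v in nodes if indeg[v] == 0)
    visited = 0
    while ready:                        # topological order over zero-weight edges
        v = ready.popleft()
        visited += 1
        for h in succ[v]:
            arrival[h] = max(arrival[h], arrival[v] + 1)
            indeg[h] -= 1
            if indeg[h] == 0:
                ready.append(h)
    if visited != len(nodes):
        raise ValueError("cycle with no registers: not a valid synchronous circuit")
    return max(arrival.values())

# Hypothetical ring: four unit-delay nodes, all four registers on one edge.
nodes = ["a", "b", "c", "d"]
ring = [("a", "b", 0), ("b", "c", 0), ("c", "d", 0), ("d", "a", 4)]
lag = {"a": 0, "b": 1, "c": 2, "d": 3}
retimed = retime(ring, lag)
```

On this ring the critical path is 4 before retiming and 1 after, and the number of registers around the cycle stays at 4.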

Retiming Transformation
• N.B.: unchanged by retiming
  – number of registers around a cycle
  – delay along a cycle
• Cycle of length P must have
  – at least P/c registers on it
  – to be retimeable to cycle c

Optimal Retiming
• There is a retiming of
  – graph G
  – w/ clock cycle c
  – iff G − 1/c has no negative-weight cycles (no cycle whose edge weights sum to less than zero)
• G − α ≡ subtract α from each edge weight

1/c Intuition
• Want to place a register every c delay units
• Each register adds one
• Each delay subtracts 1/c
• As long as the positives are not outweighed by the negatives around every cycle
  – can move registers to accommodate
  – Captures the regs ≥ P/c constraints

G − 1/c
[figure]

Compute Retiming
• Lag(v) = shortest path to I/O in G − 1/c
• Compute shortest paths in O(|V||E|)
  – Bellman-Ford
  – also use to detect negative weight cycles when c too small

Bellman Ford
• For i ← 0 to N
  – u_i ← ∞ (except u_i = 0 for IO)
• For k ← 0 to N
  – for e_{i,j} ∈ E
    • u_i ← min(u_i, u_j + w(e_{i,j}))
• For e_{i,j} ∈ E // still update ⇒ negative cycle
  – if u_i > u_j + w(e_{i,j})
    • cycles detected
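
The Bellman-Ford pass above can be made concrete with a short Python sketch. It assumes unit-delay nodes (so 1/c is subtracted from every edge), uses exact fractions to avoid rounding, and takes the ceiling of each shortest-path length to get integer lags. The function name, the edge-triple encoding, and the ring with node `a` treated as the I/O reference are hypothetical choices for illustration.

```python
from fractions import Fraction
from math import ceil

def lags_for_cycle(nodes, edges, io, c):
    """Return integer lags for target cycle c, or None if G - 1/c has a negative cycle.

    edges: (tail, head, register_count) triples; io: set of I/O nodes (u = 0).
    """
    shifted = [(t, h, Fraction(w) - Fraction(1, c)) for t, h, w in edges]
    u = {v: (Fraction(0) if v in io else None) for v in nodes}  # None means "infinity"

    def relax(t, h, w):
        # u_t <- min(u_t, u_h + w): shortest path from t to I/O through h
        if u[h] is None:
            return False
        cand = u[h] + w
        if u[t] is None or cand < u[t]:
            u[t] = cand
            return True
        return False

    for _ in range(len(nodes)):
        for t, h, w in shifted:
            relax(t, h, w)
    # Any further improvement means a negative-weight cycle: c is too small.
    if any(relax(t, h, w) for t, h, w in shifted):
        return None
    if any(u[v] is None for v in nodes):
        raise ValueError("some node has no path to I/O")
    return {v: ceil(u[v]) for v in nodes}

# Hypothetical ring of four unit-delay nodes with 2 registers on d -> a.
nodes = ["a", "b", "c", "d"]
ring = [("a", "b", 0), ("b", "c", 0), ("c", "d", 0), ("d", "a", 2)]
lags_c1 = lags_for_cycle(nodes, ring, {"a"}, 1)   # P/c = 4 > 2 registers
lags_c2 = lags_for_cycle(nodes, ring, {"a"}, 2)   # P/c = 2 registers suffice
retimed = [(t, h, w + lags_c2[h] - lags_c2[t]) for t, h, w in ring]
```

Here c=1 is rejected because the ring has length 4 but only 2 registers, while c=2 yields lags that leave one register after every two nodes.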

Apply to Example
[figure]

Try c=1
[figure]

Apply: Find Lags
[figure]
Negative weight cycles?
Shortest paths?

Apply: Lags
[figure]

Apply: Retimed
[figure: edge weights 1 1 1 1 1]

Apply: Move Registers
[figure]
weight(e′) = weight(e) + lag(head(e)) - lag(tail(e))

Revise Example (fanout delay)
[figure]

Apply: Retimed Design
[figure]

Revised: Graph
[figure]

Revised: Graph
[figure]

Revised: C=1?
[figure]

Revised: C=2?
[figure]

Revised: Lag
[figure]

Revised: Lag
Take ceiling to convert to integer lags:
[figure: lags 0, -1, 0]

Revised: Apply Lag
[figure]

Revised: Apply Lag
[figure]

Revised: Retimed
[figure]

Pipelining
• We can use this retiming to pipeline
• Assume we have enough (infinite supply) registers at edge of circuit
• Retime them into circuit

Add Registers
[figure: G with n registers]

C>1 ==> Pipeline
[figure]

Add Registers
[figure: G with n registers]

Pipeline Retiming: Lag
[figure: G − 1/1]

Pipelined Retimed
[figure]

Real Cycle
[figure]

Real Cycle
[figure]

Cycle C=1?
[figure]

Cycle C=2?
[figure]

Cycle: C-slow
• Cycle=c ⇒ C-slow network has Cycle=1

2-slow Cycle ⇒ C=1
[figure]

2-Slow Lags
[figure]

2-Slow Retime
[figure]

Retimed 2-Slow Cycle
[figure]

C-Slow applicable?
• Available parallelism
  – solve C identical, independent problems
    • e.g. process packets (blocks) separately
    • e.g. independent regions in images
• Commutative operators
  – e.g. max example

Max Example
[figure]

Max Example
[figure]

Note
• Algorithm/examples shown
  – for special case of unit-delay nodes
• For general delay,
  – a bit more complicated
  – still polynomial
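
To make the "C identical, independent problems" case concrete, here is a small behavioral sketch of C-slowing a running-max loop. It is an assumed illustration and not necessarily the circuit in the Max Example figures: the single feedback register is replaced by a chain of C registers, and samples from C independent streams are interleaved one per clock tick. The function names and the sample data are hypothetical.

```python
from collections import deque

def running_max(stream, init=0):
    """Reference design: one feedback register holding the max seen so far."""
    state, out = init, []
    for x in stream:
        state = max(state, x)
        out.append(state)
    return out

def c_slow_running_max(streams, init=0):
    """C-slow version: C registers on the feedback edge, C interleaved streams."""
    C = len(streams)
    regs = deque([init] * C)            # chain of C registers replacing the one register
    outs = [[] for _ in range(C)]
    for samples in zip(*streams):       # one sample from each stream per round
        for k, x in enumerate(samples): # each sample takes one clock tick
            state = regs.popleft()      # value written C ticks ago: stream k's state
            state = max(state, x)
            regs.append(state)
            outs[k].append(state)
    return outs

streams = [[3, 1, 4, 1], [2, 7, 1, 8]]
outs = c_slow_running_max(streams)
```

Each interleaved stream sees exactly the results the original single-register loop would produce for it, which is why the extra registers are free to be retimed into the logic.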

Initial State
• What about initial state?

Initial State
[figure]

Initial State
[figure]

Initial State
[figure]
In general, constraints ⇒ satisfiable?

Initial State
• Cannot always get exactly the same initial state behavior on the retimed circuit
  – without additional care in the retiming transformation
  – sometimes have to modify structure of retiming to preserve initial behavior
• Only a problem for startup transient
  – if you’re willing to clock to get into initial state, not a limitation

Initial State
[figure]
• Original (register value 0)
  – Cycle1: 1
  – Cycle2: /(0*/in) = 1
• Retimed (register value init = ?)
  – Cycle1: /init
  – Cycle2: /(/init*/in) = in+init
• init=0
  – Cycle1: 1
  – Cycle2: /(/init*/in) = in
• init=1
  – Cycle1: 0
  – Cycle2: /(/init*/in) = 1
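
The cycle-by-cycle comparison above can be checked by brute force. The sketch below only encodes the output expressions listed on the slide (with `/x` written as `NOT(x)` and `*` as `&`); the function names are hypothetical and the circuit structure itself is not modeled.

```python
def NOT(x):
    """Logical complement on 0/1 values (the slide's '/' prefix)."""
    return 1 - x

def original_outputs(inp):
    """(Cycle1, Cycle2) for the original circuit, register initialized to 0."""
    return (1, NOT(0 & NOT(inp)))

def retimed_outputs(init, inp):
    """(Cycle1, Cycle2) for the retimed circuit with initial register value init."""
    return (NOT(init), NOT(NOT(init) & NOT(inp)))

# Initial values for which the retimed circuit matches the original on every input.
matching_inits = [
    init for init in (0, 1)
    if all(retimed_outputs(init, inp) == original_outputs(inp) for inp in (0, 1))
]
```

`matching_inits` comes out empty: init=1 already differs in Cycle1, and init=0 differs in Cycle2 whenever in=0, so no single initial value reproduces the original startup behavior.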

Minimize Registers
• Number of registers: Σ w(e)
• After retime: Σ w(e) + Σ (FI(v)-FO(v))lag(v)
• delta only in lags
• So want to minimize: Σ (FI(v)-FO(v))lag(v)
  – subject to earlier constraints
    • non-negative register weights, delays
    • positive cycle counts

Minimize Registers
• Can be formulated as flow problem
• Can add cycle time constraints to flow problem
• Time: O(|V||E| log(|V|) log(|V|²/|E|))

HSRA Retiming
• HSRA
  – adds mandatory pipelining to interconnect
• One additional twist
  – long, pipelined interconnect
    • ⇒ need more than one register on paths

Accommodating HSRA Interconnect Delays
[figure]

Accommodating HSRA Interconnect Delays
• Add buffers to LUT → LUT path to match interconnect register requirements
• Retime to C=1 as before
• Buffer chains force enough registers to cover interconnect delays
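
The register-count expression from the Minimize Registers slide follows from summing weight(e′) = weight(e) + lag(head(e)) - lag(tail(e)) over all edges: each node's lag is added once per incoming edge and subtracted once per outgoing edge. A small Python check, using a hypothetical five-edge graph and hypothetical function names, with registers counted per edge (no sharing across fanout):

```python
from collections import Counter

def retime(edges, lag):
    """weight(e') = weight(e) + lag(head(e)) - lag(tail(e)) for every edge."""
    return [(t, h, w + lag[h] - lag[t]) for t, h, w in edges]

def total_registers(edges):
    """Number of registers: sum of w(e) over all edges."""
    return sum(w for _, _, w in edges)

def register_delta(edges, lag):
    """Sum over nodes of (FI(v) - FO(v)) * lag(v)."""
    fan_in, fan_out = Counter(), Counter()
    for tail, head, _ in edges:
        fan_out[tail] += 1
        fan_in[head] += 1
    return sum((fan_in[v] - fan_out[v]) * lag[v] for v in lag)

# Hypothetical graph: a fans out to b and c, which both feed d, which feeds a.
edges = [("a", "b", 0), ("a", "c", 0), ("b", "d", 1), ("c", "d", 1), ("d", "a", 1)]
lag = {"a": 0, "b": 0, "c": 0, "d": -1}   # pull one register from each input of d to its output
retimed = retime(edges, lag)
```

Here d has two inputs and one output, so giving it lag -1 changes the count by (2 - 1) × (-1) = -1: the design goes from 3 registers to 2, and the direct count of the retimed edges agrees with the formula.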

Summary
• Can move registers to minimize cycle time
• Formulate as a lag assignment to every node
• Optimally solve cycle time in O(|V||E|) time
• Also
  – Compute multithreaded computations
  – Minimize registers
• Watch out for initial values
• Can accommodate mandatory delays

Admin
• Homework Due Today
• No class on Monday and Wednesday
• Class next Friday

Big Ideas
• Exploit freedom
• Formulate transformations (lag assignment)
• Express legality constraints
• Technique:
  – graph algorithms
  – network flow