analysis and optimization of global interconnects

Analysis and Optimization of Global Interconnects Sachin Sapatnekar - PowerPoint PPT Presentation

Analysis and Optimization of Global Interconnects. Sachin Sapatnekar, ECE Department, University of Minnesota, Minneapolis, MN, USA, sachin@umn.edu. Acknowledgements: Prashant Saxena, Synopsys; many slides borrowed from Jiang Hu, Texas A&M.


  1. Analysis and Optimization of Global Interconnects Sachin Sapatnekar ECE Department University of Minnesota Minneapolis, MN, USA sachin@umn.edu

  2. Acknowledgements • Chuck Alpert, IBM • Prashant Saxena, Synopsys • Many slides borrowed from Jiang Hu, Texas A&M

  3. Outline of the talk • Interconnect delay metrics • Interconnects and scaling theory • Synthesis of signal interconnects • Noise and congestion issues

  4. Simple delay metrics

  5. Interconnect modeling • Precise model requires transmission line analysis • Break up wire into segments of length dx; each segment can be modeled as – π-model: series R(+sL) with C/2 at each end – L-model: series R(+sL) with C at one end – T-model: R/2(+sL/2) on each side of a central C • Other issues (crosstalk etc.) modeled using coupling caps • Interconnect extraction – Most precise with a 3-D field solver (takes a long time!) – Other faster approximate techniques useful for design analysis/optimization (R per square, C per unit area, 2.5-D models)

  6. Gate delay models • Traditionally: assume that the gate drives a capacitor – Build macromodels for individual gates • Delay = f(widths, transition times, loads) • Example: K-factor equations • Similar idea used in standard cell characterization: Delay = f(transition times, load) – Table lookup models: storage/accuracy tradeoff (e.g. .lib format) – Fast circuit simulation: used in many delay calculators • More recently: effective capacitances, current source/voltage source models
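The table lookup idea on this slide can be sketched as a bilinear interpolation over a small delay table indexed by input transition time and output load. This is only an illustrative sketch: the function name `lookup_delay`, the axis values, and the table entries are hypothetical, and it does not parse or reproduce any real .lib file.

```python
from bisect import bisect_right

def lookup_delay(slews, loads, table, slew, load):
    """Bilinear interpolation of table[i][j] = delay at (slews[i], loads[j]).

    Points outside the table range are linearly extrapolated from the
    nearest table cell (an assumption made for this sketch).
    """
    def cell_index(axis, x):
        # Index i of the cell [axis[i], axis[i+1]] used for interpolation.
        i = bisect_right(axis, x) - 1
        return max(0, min(i, len(axis) - 2))

    i = cell_index(slews, slew)
    j = cell_index(loads, load)
    ts = (slew - slews[i]) / (slews[i + 1] - slews[i])
    tl = (load - loads[j]) / (loads[j + 1] - loads[j])
    return (table[i][j] * (1 - ts) * (1 - tl)
            + table[i][j + 1] * (1 - ts) * tl
            + table[i + 1][j] * ts * (1 - tl)
            + table[i + 1][j + 1] * ts * tl)

# Hypothetical characterization data (ps for slew and delay, fF for load).
slews = [10, 50, 100]
loads = [1, 4, 16]
table = [[20, 30, 70],
         [25, 35, 75],
         [32, 42, 82]]

d_grid = lookup_delay(slews, loads, table, 50, 4)     # exactly on a grid point
d_mid = lookup_delay(slews, loads, table, 30, 2.5)    # centre of the first cell
```

A denser table improves accuracy at the cost of storage, which is the tradeoff the slide mentions.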

  7. RC delay calculations • Delays can be calculated easily • For example: RC driven by a step excitation • Response (unit step): $V(t) = 1 - e^{-t/RC}$ • Time constant = RC • Time constants for more complicated circuits?
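As a small numerical illustration of the step response above, the sketch below evaluates $V(t)$ and the time at which it crosses a given fraction of its final value. The component values are hypothetical, and the function names are made up for this example.

```python
import math

def rc_step_response(t, r, c, v_final=1.0):
    """Voltage across C at time t for a single RC stage driven by a step."""
    return v_final * (1.0 - math.exp(-t / (r * c)))

def rc_delay(r, c, fraction=0.5):
    """Time at which the step response reaches `fraction` of its final value."""
    return -r * c * math.log(1.0 - fraction)

# Hypothetical values: 1 kOhm driving 100 fF gives a 100 ps time constant.
R, C = 1e3, 100e-15
tau = R * C
t50 = rc_delay(R, C)   # 50% delay, equal to RC * ln(2)
```

For a single RC stage, the 50% delay is therefore about 0.69 times the time constant.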

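  8. Elmore delay for an RC tree • $T_{D,k} = \sum_{i \in Path(k)} R_i \sum_{j \in downstream(i)} C_j$ • Example tree: Root –Ra– a; a –Rb– b and a –Rc– c; b –Rd– d and b –Re– e; capacitances Ca, Cb, Cc, Cd, Ce at the corresponding nodes – Elmore delay to node e = Ra·(Ca+Cb+Cc+Cd+Ce) + Rb·(Cb+Cd+Ce) + Re·Ce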
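The Elmore formula can be sketched directly in code for the five-node tree on this slide. The tree encoding (a `parent` dictionary) and the numeric R and C values are assumptions chosen for illustration; only the topology and the formula come from the slide.

```python
def elmore_delay(parent, R, C, target):
    """Elmore delay from the root to `target` in an RC tree.

    parent[n] is the upstream node of n (None if n hangs off the root),
    R[n] is the resistance of the branch feeding n, C[n] the capacitance at n.
    """
    def path_to_root(n):
        path = []
        while n is not None:
            path.append(n)
            n = parent[n]
        return path

    # downstream[i] = sum of C[j] over all nodes j at or below node i.
    downstream = {n: 0.0 for n in parent}
    for j in parent:
        for i in path_to_root(j):
            downstream[i] += C[j]

    return sum(R[i] * downstream[i] for i in path_to_root(target))

# Topology from the slide; the numeric values are hypothetical.
parent = {"a": None, "b": "a", "c": "a", "d": "b", "e": "b"}
R = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
C = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}

# Ra*(Ca+Cb+Cc+Cd+Ce) + Rb*(Cb+Cd+Ce) + Re*Ce = 1*15 + 2*11 + 5*5
delay_e = elmore_delay(parent, R, C, "e")
```

This version recomputes downstream capacitances with a simple double loop, which is quadratic in the worst case; a single bottom-up traversal would give the same values in linear time.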

  9. Incrementally calculating the Elmore delay • RC chain A → B → C with $R_1$, $C_1$ on the first segment and $R_2$, $C_2$ on the second: Delay(A→C) = $R_1(C_1 + C_2) + R_2 C_2$

  10. Model order reduction methods • Elmore delay: RC transfer function $H(s) \approx \frac{a_0}{b_0 + b_1 s}$ • Can approximate RC circuit transfer function as $\frac{a_0 + a_1 s + \dots + a_{n-1} s^{n-1}}{b_0 + b_1 s + \dots + b_{n-1} s^{n-1} + b_n s^n}$ – Response approximated as a sum of exponentials – Useful for interconnect simulation – Other variants: PVL, PRIMA, etc. – Handles linear systems, but drivers may be nonlinear (Figure: waveforms e(t) and e'(t) versus t, with delay $t_d$.)

  11. Effective capacitance model • Includes the effects of gate nonlinearities • Gate driving RC interconnect (gate output node x) – Determine waveform at gate output; analyze interconnect as a linear system after that • Possible model for waveform at x – Gate driving total capacitance of net? • Gives erroneous results due to resistive shielding – Actual effective capacitance < total wiring capacitance – Techniques exist for determining $C_{effective}$, or modeling the gate using a voltage/current source (Figure: gate driving a load made of $C_1$, R, and $C_2$.)

  12. Computing $C_{eff}$: Overall flow • Initialize $C_{new} = C_{tot}$ • Set $C_{eff} = C_{new}$ • Compute Thevenin model at $C_{eff}$ • Match charge to get $C_{new}$ • $C_{eff} = C_{new}$? – No: set $C_{eff} = C_{new}$ and repeat – Yes: compute delay, slew [C. Kashyap]

  13. Current source model • Represents the transistor I-V curve as a function of input slew and output load • Linear Thevenin driver (voltage source with series resistance $r_d$): delay = f(slew, $C_{load}$) • Current source: $I_{out}$ = f(slew, $C_{load}$) • CCSM (Synopsys), ECSM (Cadence) [Amin, DAC06]

  14. Wire tapering and layer assignment • Elmore delay: $T_{D,k} = \sum_{i \in Path(k)} R_i \sum_{j \in downstream(i)} C_j$ – Wires near the root must have low resistances – Wires near the leaves must have low capacitances – Wider wires near root, narrower near leaves • In practice: # of wire widths limited to two or three • Same principle applies to layer assignment
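A minimal sketch of why tapering helps, using the Elmore delay of a two-segment wire. The width model is an assumption made for this example: each unit-length segment has resistance r / w and capacitance c · w lumped at its far end, with fringe and coupling capacitance ignored, and all numeric values are hypothetical.

```python
def two_segment_elmore(w_root, w_leaf, r=1.0, c=1.0, c_load=1.0):
    """Elmore delay of a two-segment wire driving c_load.

    Assumed model: segment resistance r / w, segment capacitance c * w,
    capacitance lumped at the far end of each segment.
    """
    r1, c1 = r / w_root, c * w_root
    r2, c2 = r / w_leaf, c * w_leaf
    # r1 sees everything downstream; r2 sees only its own cap and the load.
    return r1 * (c1 + c2 + c_load) + r2 * (c2 + c_load)

tapered = two_segment_elmore(w_root=2.0, w_leaf=1.0)   # wide near the root
reverse = two_segment_elmore(w_root=1.0, w_leaf=2.0)   # wide near the leaf
```

With these values the tapered wire gives a delay of 4.0 against 5.5 for the reversed assignment, using the same total wire area in both cases.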

  15. Simple buffer insertion problem • Given: Source and sink locations, sink capacitances and RATs, a buffer type, source delay rules, unit wire resistance and capacitance (Figure: source $s_0$ and four sinks with required arrival times RAT 1 to RAT 4, with a buffer.)

  16. Simple buffer insertion problem • Find: Buffer locations and a routing tree such that slack at the source is maximized: $q(s_0) = \min_{1 \le i \le 4} \{ RAT(s_i) - delay(s_0, s_i) \}$ (Figure: source $s_0$ and four sinks with required arrival times RAT 1 to RAT 4.)
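The slack expression $q(s_0)$ can be evaluated with a one-line function. The sketch below compares two candidate solutions for a four-sink net; the sink names, RATs, and delays are hypothetical numbers chosen for illustration and are not taken from the slides.

```python
def source_slack(rat, delay):
    """q(s0) = min over sinks s_i of RAT(s_i) - delay(s0, s_i)."""
    return min(rat[s] - delay[s] for s in rat)

# Hypothetical required arrival times and source-to-sink delays.
rat = {"s1": 500, "s2": 400, "s3": 450, "s4": 600}
delay_unbuffered = {"s1": 400, "s2": 600, "s3": 500, "s4": 550}
delay_buffered = {"s1": 380, "s2": 300, "s3": 350, "s4": 500}

q_unbuffered = source_slack(rat, delay_unbuffered)   # worst sink is s2
q_buffered = source_slack(rat, delay_buffered)
```

A negative value means at least one sink misses its required arrival time; the sink that attains the minimum is the critical one.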

  17. Slack example (Figure: two candidate solutions for a net whose sinks have RAT = 400 and RAT = 500. Delays shown: 400, 600, 350, 300. One solution has slack = −200, the other slack = +100.)

  18. Interconnects and Scaling Theory

  19. A scaling primer • Ideal process scaling: – Device geometries shrink by σ (= 0.7x) • Device delay shrinks by σ – Wire geometries shrink by σ • Resistance: $\rho l / (w\sigma \cdot h\sigma) = R/\sigma^2$ • Coupling cap: $\varepsilon (h\sigma) l / (S\sigma)$ = same • Capacitance to ground: similar • In each process generation, R doubles, C and Cc unchanged • But it doesn’t quite work that way • h scales by less than σ to control R (Figure: transistor with G, S, D terminals; wire cross-section with width w, height h, spacing S, and length l, shown before and after scaling by σ.)

  20. Block scaling • Block area often stays the same – # cells and # nets double • Wiring histogram shape (almost) invariant – Global interconnect lengths don’t shrink – Local interconnect lengths shrink by σ

  21. A typical chip cross-section • Wires become “fatter” as you move to upper layers • From one technology to the next, wire aspect ratios become more skewed [Intel] • R is controlled, at the expense of coupling capacitance

  22. The role of interconnects • Short interconnect – Used to connect nearby cells, $R_{driver} \gg R_{interconnect}$ – Minimize wire C, i.e., use short min-width wires • Medium to long-distance (“global”) interconnect – $R_{driver} \approx R_{interconnect}$ – Size wires to trade off area vs. delay – Increasing width ⇒ capacitance increases, resistance decreases – Need to find an acceptable tradeoff: the wire sizing problem • “Fat” wires – Thicker cross-sections in higher metal layers – Useful for reducing delays for global wires – Inductance issues, sharing of limited resource

  23. Interconnect delay scaling • Delay of a wire of length $l$: $\tau_{int} = (rl)(cl) = rcl^2$ (first order) • Local interconnects: $\tau_{int}$: $(r/\sigma^2)(c)(l\sigma)^2 = rcl^2$ – Local interconnect delay unchanged (but devices get faster) • Global interconnects: $\tau_{int}$: $(r/\sigma^2)(c)(l)^2 = rcl^2/\sigma^2$ – Global interconnect delay doubles: unsustainable! – Problem somewhat mitigated using buffers, using nonideal scaling as outlined earlier • Interconnect delay increasingly more dominant
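The two scaling results above follow from plugging the per-generation factors into $rcl^2$. A short sketch, assuming ideal scaling with σ = 0.7 as in the scaling primer (the function name is made up for this example):

```python
def wire_delay_scale(sigma, length_shrinks):
    """Per-generation change in first-order wire delay r*c*l^2 under ideal scaling.

    r per unit length grows by 1/sigma^2, c per unit length is unchanged,
    and the length shrinks by sigma only for local wires.
    """
    r_factor = 1.0 / sigma ** 2
    c_factor = 1.0
    l_factor = sigma if length_shrinks else 1.0
    return r_factor * c_factor * l_factor ** 2

sigma = 0.7
local_factor = wire_delay_scale(sigma, length_shrinks=True)     # about 1.0
global_factor = wire_delay_scale(sigma, length_shrinks=False)   # about 2.04
```

The global factor of 1/0.49 ≈ 2.04 is the "delay doubles" figure on the slide, while device delay shrinks by σ over the same generation.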

  24. ITRS projections • (Figure: relative delay versus feature size, 250 nm to 32 nm, for gate delay (fanout 4), local interconnect (M1,2), global interconnect with repeaters, and global interconnect without repeaters. Source: ITRS, 2003) • (Figure: ITRS ILD roadmap evolution, effective k versus technology node (µm), comparing the 1997, 1999, and 2003 ITRS with the industry actual trend. Source: Chia Hong Jan, IEDM 2003 Interconnect Short Course) • ITRS projections are often a “best case scenario” projection

  25. Buffer insertion • Consider an unbuffered net vs. the same net with a buffer inserted • A buffer effectively isolates the downstream capacitance

  26. Optimizing medium/long interconnects • Delays of interconnects may become very large • Wire sizing helps to control the delay • Repeater insertion is another effective technique • Effects of a buffer – Isolates load capacitances of different “stages” – Adds a delay • Example: a driver ($R_{driver}$) feeds two subtrees with capacitances $C_{L1}$ and $C_{L2}$; with a buffer of input capacitance $C_{buf}$ in front of the second subtree, the downstream capacitance at the driver is $C_{L1} + C_{buf}$ ($C_{L2}$ is isolated by the buffer)
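The isolation effect can be illustrated with a small calculation of the capacitance seen by the driver, with and without a buffer in front of the second subtree. The capacitance and resistance values are hypothetical, and the sketch covers only the driver stage; the delay added by the buffer itself is not modeled.

```python
def driver_load(c_l1, c_l2, c_buf=None):
    """Capacitance seen by the driver.

    Without a buffer the driver sees both subtrees. With a buffer in front
    of subtree 2, C_L2 is hidden and only the buffer input cap is seen.
    """
    if c_buf is None:
        return c_l1 + c_l2
    return c_l1 + c_buf

# Hypothetical values: capacitances in fF, driver resistance in kOhm.
C_L1, C_L2, C_BUF, R_DRIVER = 20, 100, 5, 1

load_unbuffered = driver_load(C_L1, C_L2)        # C_L1 + C_L2
load_buffered = driver_load(C_L1, C_L2, C_BUF)   # C_L1 + C_buf

# First-order driver-stage delay terms (kOhm * fF = ps).
stage_unbuffered = R_DRIVER * load_unbuffered
stage_buffered = R_DRIVER * load_buffered
```

Whether the buffer is a net win depends on its own delay driving $C_{L2}$, which has to be added on the path through the second subtree.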

  27. Buffered global interconnects: Intuition • Unbuffered wire of length $l$: interconnect delay = $rcl^2$ • Wire split by buffers into segments $l_1, l_2, l_3, \dots, l_n$ (where $l = \sum_j l_j$): now, interconnect delay = $\sum_i rcl_i^2 < rcl^2$, since $\sum_j l_j^2 < (\sum_j l_j)^2$ • (Of course, account for intrinsic buffer delay also)
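A quick numerical check of this inequality, using hypothetical unit values for r and c and an equal split into five segments. Only the wire term is computed; intrinsic buffer delay is left out, as the slide cautions.

```python
def unbuffered_wire_delay(r, c, length):
    """First-order delay r*c*l^2 of a single unbuffered wire."""
    return r * c * length ** 2

def buffered_wire_delay(r, c, segments):
    """Sum of r*c*l_i^2 over buffered segments (buffer delay not included)."""
    return sum(r * c * l ** 2 for l in segments)

r, c = 1.0, 1.0                 # hypothetical per-unit-length values
segments = [2.0] * 5            # five equal segments, total length 10

d_unbuffered = unbuffered_wire_delay(r, c, sum(segments))
d_buffered = buffered_wire_delay(r, c, segments)
```

With n equal segments the wire term drops by a factor of n, here from 100 to 20.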

  28. More precise analysis: Optimal inter-buffer length • First order (lumped parasitic, Elmore delay) analysis of a buffered line of total length $L$ – $R_d$: on resistance of inverter – $C_g$: gate input capacitance – $r$, $c$: resistance, cap. per micron • Assume N identical buffers with equal inter-buffer length $l$ ($L = Nl$): $T = N\left[R_d(C_g + cl) + rl(C_g + cl)\right] = L\left[rcl + (rC_g + R_d c) + \frac{1}{l} R_d C_g\right]$ • For minimum delay, $\frac{dT}{dl} = 0 \Rightarrow L\left[rc - \frac{R_d C_g}{l_{opt}^2}\right] = 0 \Rightarrow l_{opt} = \sqrt{\frac{R_d C_g}{rc}}$

  29. Optimal interconnect delay • Substituting $l_{opt}$ back into the interconnect delay expression: $T_{opt} = L\left[rcl_{opt} + (rC_g + R_d c) + \frac{1}{l_{opt}} R_d C_g\right] = L\left[2\sqrt{R_d C_g rc} + rC_g + R_d c\right]$ • Delay grows linearly with $L$ (instead of quadratically) • $l_{opt} = \sqrt{\frac{R_d C_g}{rc}}$: buffer-to-buffer spacing reduces in successive technology nodes (Figure: buffer spacing $d$ under a “dumb shrink” vs. a “smart shrink”.)
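The closed-form results for $l_{opt}$ and $T_{opt}$ can be checked numerically against the delay expression $T = L[rcl + (rC_g + R_d c) + R_d C_g / l]$. The parameter values below are hypothetical and chosen only to give round numbers; the number of buffers N = L/l is treated as continuous, as in the derivation.

```python
import math

def buffered_delay(L, l, r, c, Rd, Cg):
    """T = L * [r*c*l + (r*Cg + Rd*c) + Rd*Cg / l] for inter-buffer length l."""
    return L * (r * c * l + (r * Cg + Rd * c) + Rd * Cg / l)

def optimal_spacing(r, c, Rd, Cg):
    """l_opt = sqrt(Rd*Cg / (r*c))."""
    return math.sqrt(Rd * Cg / (r * c))

def optimal_delay(L, r, c, Rd, Cg):
    """T_opt = L * [2*sqrt(Rd*Cg*r*c) + r*Cg + Rd*c]."""
    return L * (2.0 * math.sqrt(Rd * Cg * r * c) + r * Cg + Rd * c)

# Hypothetical parameters: ohm/um, F/um, ohm, F, um.
r, c = 1.0, 0.2e-15
Rd, Cg = 1000.0, 2e-15
L = 10000.0

l_opt = optimal_spacing(r, c, Rd, Cg)     # 100 um for these values
t_opt = optimal_delay(L, r, c, Rd, Cg)    # about 2.42 ns
t_unbuffered = r * c * L ** 2             # wire-only r*c*L^2, 20 ns
```

Evaluating `buffered_delay` at spacings on either side of `l_opt` gives larger delays, and doubling `L` doubles `t_opt`, which matches the linear growth stated on the slide.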
