Fast Buffer Insertion Considering Process Variation
Jinjun Xiong, Lei He
EE Department, University of California, Los Angeles
Sponsors: NSF, UC MICRO, Actel, Mindspeed

Agenda
• Introduction and motivation
• Modeling
• Problem formulation
• Detailed algorithms with complexity analysis
• Experimental results
• Conclusion

Buffer Insertion Flashback
• Buffer insertion and sizing is a commonly used technique for high-performance chip designs to minimize delay
• Classic results on buffer insertion
  – Two-pin nets: closed form for optimal solution [Bakoglu 90]
  – Multi-pin nets: dynamic-programming based algorithm to find the optimal solution [Van Ginneken 90]
• Extensions
  – Multiple buffer libraries considering power minimization [Lillis 96]
  – Wire segmentation [Alpert DAC97]
  – Simultaneous buffer insertion and wire sizing [Chu ISPD97]
  – Simultaneous tree construction and buffer insertion [Okamoto DAC96]
  – Simultaneous dual Vdd assignment and buffered tree construction [Tam DAC05]
  – …

Design Optimization in Nanometer Manufacturing
• Probabilistic design approaches showed great promise to achieve better design quality
  – Compared to deterministic approaches, statistical circuit tuning achieved
    • 20% area reduction [Choi DAC04]
    • 17% power reduction [Mani DAC05]
• Buffer insertion considering process variation is also gaining attention recently, but existing work has limitations
  – Limited consideration of process variation
    • Wire-length variation [Khandelwal ICCAD03]
  – Independence assumption on process variation
    • Ignores global and spatial correlation [Xiong DATE05], [He ISPD05]
  – High complexity
    • Numerical integration to obtain the joint probability density function (JPDF) [Xiong DATE05]
  – Applicable to only special routing structures
    • Two-pin nets only [Deng ICCAD05]
• Our major contributions: theoretical foundations that lift these limitations

Modeling
• Linear delay model for buffer
  – Input capacitance (Cb), output resistance (Rb), and intrinsic delay (Tb)
• π-model for interconnect
  – Wire capacitance (Cw) and wire resistance (Rw)
• How to model these quantities with correlated process variation?

First-order Canonical Form for Variation Modeling
$$A = a_0 + a_1 X_1 + a_2 X_2 + \cdots + a_n X_n + a_R X_{R_a}$$
• Mean value $E(A) = a_0$
• Random variables $X_1, X_2, \ldots, X_n$ model
  – Die-to-die global variation: instances are affected in the same way
  – Within-die spatial correlation: instances physically nearby are more likely to be similar [Agarwal ASPDAC03, Chang ICCAD03, Khandelwal DAC05]
• Random variable $X_{R_a}$ models
  – Independent variation: instances next to each other are different
• All $X_i$ follow independent normal distributions
  – Well-accepted practice in SSTA [Chang ICCAD03, Visweswariah DAC04]
• In vector form, write device and interconnect parameters with process variation as
  – Device: $T_b = T_{b0} + \gamma_b^T X$, $C_b = C_{b0} + \eta_b^T X$, $R_b = R_{b0} + \zeta_b^T X$
  – Interconnect: $C_w = C_{w0} + \eta_w^T X$, $R_w = R_{w0} + \zeta_w^T X$
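
The canonical form can be held in a small data structure. The sketch below is illustrative only: the class name `Canonical`, its field names, and the numeric values are assumptions, and every random variable is assumed to be an independent standard normal (unit variance). Under that assumption the variance of a form is the sum of its squared coefficients, and only the shared sources contribute to the covariance between two forms.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Canonical:
    """A = mean + sum(sens[i] * X_i) + indep * X_R, all X independent N(0, 1)."""
    mean: float          # a_0
    sens: tuple          # (a_1, ..., a_n): sensitivities to shared sources
    indep: float = 0.0   # a_R: coefficient of the form's private random term

    def variance(self) -> float:
        return sum(a * a for a in self.sens) + self.indep ** 2

    def std(self) -> float:
        return sqrt(self.variance())


def covariance(a: Canonical, b: Canonical) -> float:
    # Only the shared sources X_1..X_n correlate two forms; private terms do not.
    return sum(x * y for x, y in zip(a.sens, b.sens))


# Hypothetical buffer parameters that share two variation sources.
R_b = Canonical(mean=100.0, sens=(3.0, 4.0))
C_b = Canonical(mean=2.0, sens=(0.5, 0.0), indep=0.25)
print(R_b.std(), covariance(R_b, C_b))
```

Keeping the shared sensitivities explicit is what lets later operations track correlation between solutions instead of treating them as independent.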

Buffer Insertion Considering Process Variation
• Given: a routing tree with required arrival time (RAT) and loading capacitance specified at sinks, and N possible buffer locations
• Considering: both FEOL device and BEOL interconnect process variations
• Find: locations to insert buffers
• So that: the timing slack at the root is maximized
  – Timing slack: $\min_i (\mathrm{RAT}_i - \mathrm{delay}_i)$
[Figure: routing tree with a root, sinks, nodes labeled s0 to s4, and possible buffer locations]
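
As a concrete reading of the objective, the timing slack is the worst-case margin over all sinks. The helper below is a minimal deterministic sketch; the function name and the four-sink numbers are made up for illustration.

```python
def timing_slack(rats, delays):
    """Slack at the root: the smallest RAT_i - delay_i over all sinks."""
    if len(rats) != len(delays):
        raise ValueError("need exactly one delay per sink")
    return min(rat - delay for rat, delay in zip(rats, delays))


# Hypothetical four-sink net, all times in ps.
rats = [500.0, 480.0, 520.0, 510.0]
delays = [120.0, 130.0, 90.0, 150.0]
slack = timing_slack(rats, delays)  # sink 2 is critical: 480 - 130
print(slack)
```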

Agenda
• Introduction and motivation
• Modeling
• Problem formulation
• Detailed algorithms
  – Key operations for buffering solutions
  – Transitive-closure pruning rule
  – Complexity analysis
• Experimental results
• Conclusion

Key Operations in Van Ginneken Algorithm
• Associate each node with two metrics $(C_t, T_t)$
  – Downstream loading capacitance ($C_t$) and RAT ($T_t$)
  – DP-based algorithm propagates potential solutions bottom-up [Van Ginneken 90]
• Add a wire $(C_w, R_w)$ above node $n$
  $$C_t = C_n + C_w, \qquad T_t = T_n - R_w \cdot L_n - \tfrac{1}{2} R_w \cdot C_w$$
• Add a buffer above node $n$
  $$C_t = C_b, \qquad T_t = T_n - T_b - R_b \cdot L_n$$
• Merge two solutions $(C_n, T_n)$ and $(C_m, T_m)$
  $$C_t = C_n + C_m, \qquad T_t = \min(T_n, T_m)$$
• How to define these operations in a statistical sense?
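
The three deterministic operations translate almost line for line into code. This sketch assumes that the load $L_n$ in the formulas is the downstream capacitance $C_n$ stored with the solution at node $n$; the type name `Solution` and all numeric values are hypothetical.

```python
from typing import NamedTuple


class Solution(NamedTuple):
    C: float  # downstream loading capacitance
    T: float  # required arrival time (RAT)


def add_wire(n: Solution, Rw: float, Cw: float) -> Solution:
    # The wire resistance drives the downstream load plus half its own capacitance.
    return Solution(n.C + Cw, n.T - Rw * n.C - 0.5 * Rw * Cw)


def add_buffer(n: Solution, Cb: float, Rb: float, Tb: float) -> Solution:
    # The buffer hides the downstream load behind its input capacitance.
    return Solution(Cb, n.T - Tb - Rb * n.C)


def merge(n: Solution, m: Solution) -> Solution:
    # Loads add up; the tighter branch sets the required arrival time.
    return Solution(n.C + m.C, min(n.T, m.T))


sink = Solution(C=1.0, T=100.0)
after_wire = add_wire(sink, Rw=2.0, Cw=0.5)
after_buf = add_buffer(after_wire, Cb=0.25, Rb=1.0, Tb=3.0)
root = merge(after_buf, Solution(C=0.5, T=95.0))
print(after_wire, after_buf, root)
```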

Atomic Operations
• Keep all quantities in canonical form after operations
  – Maintain correlation w.r.t. sources of variation
  – Updated solutions can still be handled by the same set of operations
• Add a wire
  $$C_t = C_n + C_w, \qquad T_t = T_n - R_w \cdot L_n - \tfrac{1}{2} R_w \cdot C_w$$
• Add a buffer
  $$C_t = C_b, \qquad T_t = T_n - T_b - R_b \cdot L_n$$
• Merge two solutions
  $$C_t = C_n + C_m, \qquad T_t = \min(T_n, T_m)$$
• Addition/subtraction of two canonical forms is another canonical form
  $$A + B = (a_0 + \alpha^T X) + (b_0 + \beta^T X) = (a_0 + b_0) + (\alpha + \beta)^T X$$
• Multiplication (the $R_w \cdot L_n$, $R_w \cdot C_w$, and $R_b \cdot L_n$ terms) and minimum ($\min(T_n, T_m)$) no longer yield a canonical form
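
Addition and subtraction stay closed because they act coefficient by coefficient. The sketch below represents a canonical form as a hypothetical `(mean, coefficients)` pair over the shared sources and assumes unit-variance sources; the values are invented.

```python
def add(a, b):
    """(a0 + alpha^T X) + (b0 + beta^T X) = (a0 + b0) + (alpha + beta)^T X."""
    return (a[0] + b[0], [x + y for x, y in zip(a[1], b[1])])


def sub(a, b):
    return (a[0] - b[0], [x - y for x, y in zip(a[1], b[1])])


def variance(a):
    # Valid when the sources are independent with unit variance.
    return sum(x * x for x in a[1])


C_n = (1.0, [0.5, 0.25])
C_w = (0.5, [0.25, -0.25])
C_t = add(C_n, C_w)
print(C_t, variance(C_t), variance(C_n) + variance(C_w))
```

In this example the variance of the sum (0.5625) differs from the sum of the individual variances (0.4375), which is the effect of the correlation carried by the shared coefficients.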

Approximate Multiplication as Canonical Form
• Multiplication of two canonical forms results in a quadratic term
  $$C = A \cdot B = (a_0 + \alpha^T X)(b_0 + \beta^T X) = a_0 b_0 + (b_0 \alpha + a_0 \beta)^T X + X^T \alpha \beta^T X = a_0 b_0 + \gamma^T X + X^T \Gamma X$$
  – Vector $\gamma = b_0 \alpha + a_0 \beta$, matrix $\Gamma = \alpha \beta^T$
• Approximate it as a canonical form by matching the mean and variance with those of the exact solution
  $$C' = E(C) + \sqrt{\frac{E(C^2) - E(C)^2}{\gamma^T \gamma}}\; \gamma^T X = c_0' + \eta'^T X$$
  – $E(C)$ is the mean value (first moment) of $C$
  – $E(C^2)$ is the second moment of $C$, so $E(C^2) - E(C)^2$ is the variance
  – $C'$ is a new canonical form with the same mean and variance as $C$
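
The moment-matching step can be sketched as follows, assuming $X \sim N(0, I)$ and forms without a private random term. Rather than the trace expressions, this sketch uses the standard closed-form mean and variance of a product of two jointly normal variables, written with dot products; the function names and the example values are illustrative.

```python
from math import sqrt


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def multiply(a, b):
    """Approximate A*B by a canonical form with the exact mean and variance.

    a and b are (mean, coefficients) pairs over the same sources X ~ N(0, I).
    """
    a0, alpha = a
    b0, beta = b
    gamma = [b0 * x + a0 * y for x, y in zip(alpha, beta)]  # linear part of A*B
    cov = dot(alpha, beta)                                   # cov(A, B)
    mean = a0 * b0 + cov                                     # E(C)
    # E(C^2) - E(C)^2 for a product of jointly normal A and B.
    var = dot(gamma, gamma) + cov ** 2 + dot(alpha, alpha) * dot(beta, beta)
    gg = dot(gamma, gamma)
    if gg == 0.0:
        raise ValueError("gamma is zero; no direction to rescale")
    k = sqrt(var / gg)  # scale gamma so the linear form carries the full variance
    return (mean, [k * g for g in gamma])


A = (2.0, [1.0, 0.0])
B = (3.0, [0.0, 1.0])
C_approx = multiply(A, B)
print(C_approx)
```

Here $A$ and $B$ are uncorrelated, so the mean stays at $a_0 b_0 = 6$, while the variance grows from $\gamma^T\gamma = 13$ to 14 because of the quadratic term.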
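
Closed Form for Moment Computation
• 1st moment, with $c_0$ the constant term of $C = c_0 + \gamma^T X + X^T \Gamma X$
  $$E(C) = c_0 + \gamma^T E(X) + E(X^T \Gamma X)$$
• 2nd moment
  $$E(C^2) = E\big(c_0^2 + 2 c_0 \gamma^T X + 2 X^T \Gamma X\, \gamma^T X + 2 c_0 X^T \Gamma X + X^T \gamma \gamma^T X + (X^T \Gamma X)^2\big)$$
  $$= c_0^2 + 2 c_0 \gamma^T E(X) + 2 E(X^T \Gamma X\, \gamma^T X) + E\big(X^T (2 c_0 \Gamma + \gamma \gamma^T) X\big) + E\big((X^T \Gamma X)^2\big)$$
• Theorem: if $X$ follows an independent multivariate normal distribution, $X \sim N(0, I)$, then for any vector $\gamma$ and matrix $\Gamma$
  $$E(X^T \Gamma X) = \mathrm{tr}(\Gamma)$$
  $$E(X^T \Gamma X\, \gamma^T X) = 0$$
  $$E\big((X^T \Gamma X)^2\big) = 2\,\mathrm{tr}(\Gamma^2) + \mathrm{tr}(\Gamma)^2$$
  – The trace of a matrix (tr) equals the sum of all diagonal elements
  – The third identity as written holds for symmetric $\Gamma$; for a general $\Gamma$ the term $2\,\mathrm{tr}(\Gamma^2)$ becomes $\mathrm{tr}(\Gamma^2) + \mathrm{tr}(\Gamma \Gamma^T)$
• These moments give the canonical approximation of a product
  $$A \cdot B \approx C' = E(C) + \sqrt{\frac{E(C^2) - E(C)^2}{\gamma^T \gamma}}\; \gamma^T X = c_0' + \eta'^T X$$
• In general, $\mathrm{tr}(\Gamma)$ and $\mathrm{tr}(\Gamma^2)$ are expensive, but if $\Gamma = \alpha \beta^T + \varepsilon \eta^T$ (a low-rank matrix), we can show
  $$\mathrm{tr}(\Gamma) = \beta^T \alpha + \eta^T \varepsilon, \qquad \mathrm{tr}(\Gamma^2) = (\beta^T \alpha)^2 + (\eta^T \varepsilon)^2 + 2 (\beta^T \varepsilon)(\eta^T \alpha)$$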
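
The low-rank shortcut can be checked numerically. The sketch below computes $\mathrm{tr}(\Gamma)$ and $\mathrm{tr}(\Gamma^2)$ for $\Gamma = \alpha\beta^T + \varepsilon\eta^T$ from four dot products, using the expansion $\mathrm{tr}(\Gamma^2) = (\beta^T\alpha)^2 + (\eta^T\varepsilon)^2 + 2(\beta^T\varepsilon)(\eta^T\alpha)$, and compares the result with the explicit matrix. Function names and the test vectors are illustrative assumptions.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def low_rank_traces(alpha, beta, eps, eta):
    """tr(G) and tr(G^2) for G = alpha beta^T + eps eta^T, in O(n) time."""
    ba = dot(beta, alpha)
    he = dot(eta, eps)
    tr = ba + he
    tr_sq = ba ** 2 + he ** 2 + 2.0 * dot(beta, eps) * dot(eta, alpha)
    return tr, tr_sq


def dense_traces(alpha, beta, eps, eta):
    """Same quantities from the explicit n x n matrix, in O(n^2) time."""
    n = len(alpha)
    G = [[alpha[i] * beta[j] + eps[i] * eta[j] for j in range(n)]
         for i in range(n)]
    tr = sum(G[i][i] for i in range(n))
    # tr(G^2) = sum over i, j of G[i][j] * G[j][i]
    tr_sq = sum(G[i][j] * G[j][i] for i in range(n) for j in range(n))
    return tr, tr_sq


alpha, beta = [1.0, 2.0], [3.0, 4.0]
eps, eta = [0.0, 1.0], [1.0, -1.0]
fast = low_rank_traces(alpha, beta, eps, eta)
slow = dense_traces(alpha, beta, eps, eta)
print(fast, slow)
```

The matrix never has to be formed in the fast path, which is the point of exploiting the low-rank structure.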

Approximate Minimum as Canonical Form
• Minimum of two canonical forms is also not a canonical form
• Approximate it as a canonical form by matching the exact mean and variance
  $$\min(A, B) \approx c_0 + (T_A \beta + T_B \alpha)^T X + c_R X_R$$
  – Tightness probability of A: $T_A = P(A > B) = \Phi\left(\frac{a_0 - b_0}{\theta}\right)$, with $T_B = 1 - T_A$
    • $\Phi$ is the CDF of a standard normal distribution
    • $\theta$ is given by $\theta = \sqrt{\sigma_A^2 + \sigma_B^2 - 2\,\mathrm{cov}(A, B)}$
  – Exact mean and variance can be computed in closed form [Clark 65]
    • Well known for statistical timing analysis
• Design for mean value ≠ design for nominal value because of mean shift
  $$E(\min(A, B)) = T_A b_0 + T_B a_0 - \theta\, \phi\left(\frac{b_0 - a_0}{\theta}\right) \neq \min(a_0, b_0)$$
  – $\phi$ is the PDF of a standard normal distribution
  – The left-hand side is the target when designing for mean value; $\min(a_0, b_0)$ is the target when designing for nominal value
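
A minimal sketch of the statistical minimum follows. It takes $T_A = P(A > B)$ as on the slide, uses the closed-form first and second moments of the minimum of two jointly normal variables, and assigns whatever variance the linear part does not explain to the private term $c_R$. That last choice follows common SSTA practice and is an assumption here, as are the tuple layout `(mean, shared coefficients, private coefficient)` and the example values.

```python
from math import erf, exp, pi, sqrt


def Phi(x):  # standard normal CDF
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def phi(x):  # standard normal PDF
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def stat_min(a, b):
    """Canonical approximation of min(A, B) matching mean and variance."""
    a0, alpha, aR = a
    b0, beta, bR = b
    var_a = dot(alpha, alpha) + aR * aR
    var_b = dot(beta, beta) + bR * bR
    theta = sqrt(var_a + var_b - 2.0 * dot(alpha, beta))
    if theta == 0.0:
        raise ValueError("A - B is deterministic; compare the means directly")
    TA = Phi((a0 - b0) / theta)          # P(A > B)
    TB = 1.0 - TA                        # P(A < B)
    d = theta * phi((b0 - a0) / theta)   # mean-shift term
    mean = TA * b0 + TB * a0 - d
    second = TB * (a0 * a0 + var_a) + TA * (b0 * b0 + var_b) - (a0 + b0) * d
    var = second - mean * mean
    coeffs = [TA * y + TB * x for x, y in zip(alpha, beta)]
    # Assumption: the private term absorbs the variance left unexplained.
    cR = sqrt(max(var - dot(coeffs, coeffs), 0.0))
    return (mean, coeffs, cR)


A = (0.0, [1.0, 0.0], 0.0)
B = (0.0, [0.0, 1.0], 0.0)
m = stat_min(A, B)
print(m)
```

For two independent standard normals the nominal minimum is 0, while the mean of the minimum is $-1/\sqrt{\pi} \approx -0.564$, which illustrates the mean shift.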

Deterministic Pruning Rule
• If $T_1 > T_2$ and $C_1 < C_2$, then $(C_1, T_1)$ dominates $(C_2, T_2)$
  – Dominated solution $(C_2, T_2)$ is redundant
[Figure: solutions plotted as RAT versus load, with redundant solutions marked]
• Deterministic pruning has linear time complexity because of the following two desired properties
  – Ordering property
    • Either A > B or A < B holds
  – Transitive ordering (transitive-closure) property
    • A > B and B > C imply A > C
  – Together they make it possible to sort solutions in order
    • Assume sorted by load ⇒ linear time to prune redundant solutions
• Can we achieve the same time complexity for statistical pruning?
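
The linear-time sweep can be sketched as below. The function sorts by load first, so the call as a whole costs O(n log n); the pruning pass itself is a single linear scan, matching the "assume sorted by load" condition. Treating ties (equal load or equal RAT) as redundant is a choice made for this sketch, and the sample solutions are invented.

```python
def prune(solutions):
    """Return the non-dominated (C, T) pairs, ordered by increasing load C."""
    kept = []
    best_T = float("-inf")
    # Sort by load; among equal loads visit the largest RAT first.
    for C, T in sorted(solutions, key=lambda s: (s[0], -s[1])):
        # A solution with more load survives only if it buys a better RAT.
        if T > best_T:
            kept.append((C, T))
            best_T = T
    return kept


candidates = [(3.0, 90.0), (1.0, 80.0), (2.0, 70.0), (4.0, 95.0)]
survivors = prune(candidates)  # (2.0, 70.0) is dominated by (1.0, 80.0)
print(survivors)
```

The scan relies on both properties listed above: a total order lets the solutions be sorted, and transitivity means comparing against the best RAT seen so far is enough.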