TAU2019 Timing Contest
Team: iTimer
Hsien-Han Cheng¹, Tung-Wei Lin², Yu-Cheng Lin², Iris Hui-Ru Jiang², Pei-Yu Lee³
¹National Chiao Tung University ²National Taiwan University ³Maxeda Technology
Problem Formulation
The Design Optimization Problem
Initial circuit netlist (.v)
RC parasitics (.spef)
Timing and design constraint file (.sdc)
Multi-corner Liberty libraries (.lib)
No hold time violations across multiple corners
No slew or cap violations across multiple corners
Maximize working frequency
Minimize leakage
Minimize area
Minimize runtime
Minimize memory
Gate sizing is NP-hard
Multi-corner timing optimization is first considered
Unbalanced clock tree complicates timing optimization
The corner which has the slowest cells bounds the
The corner with the most total negative slack (TNS) is
All subsequent optimization steps focus on timing from
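As an illustration of comparing corners by TNS, the minimal sketch below sums the negative endpoint slacks of each corner and picks the corner with the most negative total. The function names, corner names, and slack values are hypothetical, and treating that corner as the one later steps focus on is an assumption, not a description of iTimer's actual implementation.

```python
from typing import Dict, List


def total_negative_slack(slacks: List[float]) -> float:
    """Sum only the negative endpoint slacks; met endpoints contribute nothing."""
    return sum(s for s in slacks if s < 0.0)


def pick_dominant_corner(endpoint_slacks: Dict[str, List[float]]) -> str:
    """Return the corner whose TNS is most negative (hypothetical selection rule)."""
    return min(
        endpoint_slacks,
        key=lambda corner: total_negative_slack(endpoint_slacks[corner]),
    )


# Made-up endpoint slacks (ns) for three illustrative corners.
corners = {
    "slow": [-0.30, -0.10, 0.05],
    "typical": [-0.05, 0.10, 0.20],
    "fast": [0.15, 0.25, 0.30],
}
dominant = pick_dominant_corner(corners)
```

With these made-up numbers the "slow" corner has the most negative TNS and would be the one selected.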
Gate upsizing or buffer insertion can solve the violations
Apply the following procedures sequentially unless the
Perform cap/slew violation fixing in BFS order first and
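A minimal sketch of max-cap fixing in BFS order, under simplifying assumptions: each gate's load capacitance is treated as fixed (a real flow would update loads after every resize, since upsizing changes input capacitance), the BFS starts from the given source gates, and a gate that no available size can fix is only reported, as a candidate for buffer insertion. All names, sizes, and values are hypothetical.

```python
from collections import deque
from typing import Dict, List, Tuple


def fix_cap_violations_bfs(
    fanout: Dict[str, List[str]],
    sources: List[str],
    load_cap: Dict[str, float],
    sizes: List[Tuple[str, float]],  # (size name, max cap), ascending by max cap
    current: Dict[str, str],
) -> List[str]:
    """Visit gates in BFS order and upsize any gate whose load exceeds its max cap.

    Returns the gates that no available size can fix.
    """
    max_cap = dict(sizes)
    visited = set(sources)
    queue = deque(sources)
    unresolved = []
    while queue:
        gate = queue.popleft()
        if load_cap[gate] > max_cap[current[gate]]:
            # Pick the smallest size whose max cap covers the load.
            for name, limit in sizes:
                if limit >= load_cap[gate]:
                    current[gate] = name
                    break
            else:
                unresolved.append(gate)  # upsizing alone is not enough
        for nxt in fanout.get(gate, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(nxt)
    return unresolved


fanout = {"g1": ["g2", "g3"], "g2": ["g4"], "g3": ["g4"], "g4": []}
load_cap = {"g1": 3.0, "g2": 1.0, "g3": 9.0, "g4": 5.0}
sizes = [("X1", 2.0), ("X2", 4.0), ("X4", 8.0)]
current = {g: "X1" for g in fanout}
unresolved = fix_cap_violations_bfs(fanout, ["g1"], load_cap, sizes, current)
```

In this toy netlist g1 and g4 are upsized, g2 is left alone, and g3 is returned as unresolved because its load exceeds the largest size's limit.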
CLK Buffer Removal
CLK Buffer Insertion for Hold Time Fixing
Clock tree split point buffer insertion
Clock tree leaf point buffer insertion
Data path buffer insertion
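To make the data-path case concrete, the sketch below estimates how many delay buffers a path needs so that hold slack becomes non-negative in every corner without pushing setup slack negative in any corner. It assumes each inserted buffer adds a fixed per-corner delay; the function name, corner names, and numbers are hypothetical and this is not the team's actual procedure.

```python
import math
from typing import Dict, Optional


def buffers_needed(
    hold_slack: Dict[str, float],
    buf_delay: Dict[str, float],
    setup_slack: Dict[str, float],
) -> Optional[int]:
    """Smallest buffer count that fixes hold in all corners, or None if it breaks setup."""
    need = 0
    for corner, slack in hold_slack.items():
        if slack < 0.0:
            # Each buffer recovers buf_delay[corner] of hold slack in this corner.
            need = max(need, math.ceil(-slack / buf_delay[corner]))
    for corner, slack in setup_slack.items():
        # The same added delay is charged against setup slack.
        if slack - need * buf_delay[corner] < 0.0:
            return None
    return need


hold = {"fast": -0.05, "slow": 0.02}      # ns, made-up values
delay = {"fast": 0.02, "slow": 0.04}      # ns of delay added per buffer
count_ok = buffers_needed(hold, delay, {"fast": 0.5, "slow": 0.2})
count_bad = buffers_needed(hold, delay, {"fast": 0.5, "slow": 0.1})
```

Here the fast corner drives the buffer count, while the slow corner decides whether that count is affordable on the setup side.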
Gate Upsizing
Useful Skew
Segment Dependency Graph (SDG) can estimate the
With the global view provided by SDG, we can identify
Apply Max Cap/Slew Fixing and Multi-corner Hold Time
Multi-corner Hold Time Fixing
Platform: Intel Xeon 2.6GHz Linux Workstation with
Slack is measured w.r.t. a zero clock period, so |WNS (setup)| = longest path delay = 1/frequency
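The relation can be checked with a small worked example. With a zero clock period, the setup slack of a path is 0 minus its delay, so the worst negative slack equals the negated longest path delay and its reciprocal gives the achievable frequency. The sketch ignores setup time and clock skew, and the function name and delay values are made up for illustration.

```python
from typing import List, Tuple


def wns_and_frequency(path_delays_ns: List[float]) -> Tuple[float, float]:
    """Return (WNS in ns, frequency in GHz) for a zero clock period."""
    if not path_delays_ns or max(path_delays_ns) <= 0.0:
        raise ValueError("need at least one path with positive delay")
    # Zero clock period: slack = required time (0) - path delay.
    wns = min(0.0 - d for d in path_delays_ns)
    # |WNS| is the longest path delay; 1 / ns gives GHz.
    return wns, 1.0 / abs(wns)


wns, freq_ghz = wns_and_frequency([0.8, 1.25, 2.0])
```

For these delays the longest path is 2.0 ns, so WNS is -2.0 ns and the corresponding frequency is 0.5 GHz.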
usb_function: enormous clock skew
Clk Tree Opt reduces |WNS (setup)| by 30%
On average, our flow can decrease worst setup slack by
Experiment results show that our proposed algorithm is
Our future work includes further shortening the runtime