
UC San Diego / VLSI CAD Laboratory

Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions

Andrew B. Kahng, Seokhyeong Kang VLSI CAD LABORATORY, UC San Diego

International Symposium on Physical Design March 27th, 2012


Outline

 Background and Motivation
 Benchmark Generation
 Experimental Framework and Results
 Conclusions and Ongoing Work


Gate Sizing in VLSI Design

 Gate sizing
– Essential for power, delay and area optimization
– Tunable parameters: gate width, gate length and threshold voltage
– The sizing problem arises in all phases of the RTL-to-GDS flow

 Common heuristics/algorithms
– LP, Lagrangian relaxation, convex optimization, DP, sensitivity-based gradient descent, ...

  • 1. Which heuristic is better?
  • 2. How suboptimal is a given sizing solution?

→ A systematic and quantitative comparison is required


Suboptimality of Sizing Heuristics

 Eyechart*
– Built from three basic topologies (chain, mesh, star), optimally sized with DP
– Allows suboptimality of heuristics to be evaluated
– Not realistic: Eyechart circuits differ in topology from real designs – large depth (650 stages) and small Rent parameter (0.17)

 More realistic benchmarks are required, along with an automated generation flow

*Gupta et al., "Eyecharts: Constructive Benchmarking of Gate Sizing Heuristics", DAC 2010.


Our Work: Realistic Benchmark Generation w/ Known Optimal Solution

  • 1. Propose benchmark circuits with known optimal solutions
  • 2. Make the benchmarks resemble real designs
– in gate count, path depth, Rent parameter and net degree
  • 3. Assess suboptimality of standard gate sizing approaches

→ Automated benchmark generation flow


Outline

 Background and Motivation
 Benchmark Considerations and Generation
 Experimental Framework and Results
 Conclusions and Ongoing Work


Benchmark Considerations

 Realism vs. tractability of analysis – opposing goals

 To construct realistic benchmarks: use design characteristic parameters
– # primary ports, path depth, fanin/fanout distribution

 To enable known optimal solutions
– Library simplification as in Gupta et al. 2010: slew-independent library

Example (JPEG Encoder):
– Fanin distribution: 25% 1-input, 60% 2-input, 15% 3-or-more-input
– Path depth: 72
– Avg. net degree: 1.84
– Rent parameter: 0.72

(Figure: fanin and fanout distribution histograms.)
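The characteristic parameters can be computed from any gate-level netlist. A minimal sketch, assuming a hypothetical representation (one fanin count per gate instance, one pin count per net):

```python
from collections import Counter

def characterize(gates, nets):
    """Compute the fanin distribution and average net degree of a netlist.

    gates: list of fanin counts, one per gate instance (hypothetical format).
    nets:  list of net degrees (number of pins on each net).
    """
    fanin_hist = Counter(gates)
    n = len(gates)
    fanin_dist = {k: cnt / n for k, cnt in sorted(fanin_hist.items())}
    avg_net_degree = sum(nets) / len(nets)
    return fanin_dist, avg_net_degree

# Toy netlist: 25% 1-input, 60% 2-input, 15% 3-input gates (as for JPEG)
gates = [1] * 5 + [2] * 12 + [3] * 3
nets = [2, 2, 3, 1, 2]
dist, deg = characterize(gates, nets)
print(dist)  # {1: 0.25, 2: 0.6, 3: 0.15}
print(deg)   # 2.0
```

Path depth and the Rent parameter need the actual connectivity graph (and a recursive partitioner for Rent), so they are not shown here.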


Benchmark Generation

 Input parameters
  • 1. timing budget T
  • 2. depth of data path K
  • 3. number of primary ports N
  • 4. fanin/fanout distributions fid(i), fod(j)

 Constraints
– T must be larger than the minimum delay of a K-stage chain

 Generation flow
  • 1. construct N chains with depth K
  • 2. attach connection cells (C)
  • 3. connect chains

→ netlist with N*K + C cells
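The three-step flow can be sketched in a few lines. Everything here is illustrative, not the paper's implementation: `generate_benchmark`, its argument format, and the simplification of counting one connection cell per open fanin port (the actual flow attaches connection cells to open fanouts and enforces the timing budget T):

```python
import random

def generate_benchmark(N, K, fid, seed=0):
    """Simplified sketch of the generation flow: build N chains of depth K,
    drawing each gate's fanin from the distribution fid, then count the
    connection cells needed for the open fanin ports.

    fid: dict mapping fanin count -> probability (hypothetical format).
    """
    rng = random.Random(seed)
    fanins, probs = zip(*fid.items())
    chains = [[rng.choices(fanins, probs)[0] for _ in range(K)]
              for _ in range(N)]
    # One input of each gate continues its chain; the remaining
    # (fanin - 1) inputs are open and receive a connection cell each.
    C = sum(f - 1 for chain in chains for f in chain)
    return chains, N * K + C   # N*K chain cells + C connection cells

chains, total_cells = generate_benchmark(N=4, K=10,
                                         fid={1: 0.25, 2: 0.6, 3: 0.15})
```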


Benchmark Generation: Construct Chains

  • 1. Construct N chains, each with depth K (N*K cells)
  • 2. Assign gate instances according to fid(i)
  • 3. Assign fanout counts to output ports according to fod(j)

 Assignment strategy: arranged or random

(Figure: fanin/fanout assignment across chain stages – arranged vs. random assignment.)


Benchmark Generation: Find Optimal Solution with DP

  • 1. Attach connection cells to all open fanouts, to connect chains while keeping the optimal solution
  • 2. Perform dynamic programming with timing budget T

→ Optimal solution is achievable w/ a slew-independent library

Benchmark Generation: Solving a Chain Optimally (Example)

Example: a three-stage inverter chain (INV1 → INV2 → INV3) with delay budget Dmax = 8.

Library (slew-independent):

size   | input cap | leakage power | delay @ load 3 | delay @ load 6
Size 1 | 3         | 5             | 3              | 4
Size 2 | 6         | 10            | 1              | 2

Loads of 3 and 6 are the input capacitances of a Size-1 and a Size-2 gate, respectively. For each stage and each possible load, the DP tabulates the minimum power attainable under every delay budget (per-stage budget/power/size tables shown in the slide figure). With the full budget of 8, the minimum total leakage power is 25, giving the optimized chain: size 2, size 1, size 1.


Benchmark Generation: Connect Chains

  • 1. Run STA and find the arrival time at each gate
  • 2. Connect each connection cell to an open fanin port
– connect only if timing constraints are satisfied
– connection cells do not change the optimal chain solution
  • 3. Tie unconnected ports to logic high (VDD) or logic low
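The connection rule can be sketched as follows. `connect_or_tie` and its data format are hypothetical, but the check mirrors the rule above: a connection is made only when it meets the sink's required time, so the optimal chain solution is untouched:

```python
def connect_or_tie(conn_cells, open_fanins):
    """Greedy sketch of steps 2-3: attach each connection cell to the first
    open fanin port whose timing constraint it satisfies; leftover ports
    are tied to logic high or low.

    conn_cells:  list of (name, arrival_time, delay)  -- hypothetical STA data
    open_fanins: list of (name, required_time)
    """
    edges = []
    remaining = list(open_fanins)
    for cname, arrival, delay in conn_cells:
        for i, (fname, required) in enumerate(remaining):
            if arrival + delay <= required:   # cannot create a violation
                edges.append((cname, fname))  # optimal solution preserved
                del remaining[i]
                break
    ties = [fname for fname, _ in remaining]  # tie these to VDD/GND
    return edges, ties

edges, ties = connect_or_tie([("c1", 2, 1), ("c2", 5, 1)],
                             [("f1", 4), ("f2", 3)])
```

Here `c1` reaches `f1` with slack to spare, while `c2` arrives too late for either port, so its target `f2` is tied off instead.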


Benchmark Generation: Generated Netlist

 Generated output:
– benchmark circuit of N*K + C cells w/ optimal solution
– chains are connected to each other → various topologies

(Schematic of generated netlist, N = 10, K = 20.)


Outline

 Background and Motivation
 Benchmark Generation
 Experimental Framework and Results
 Conclusions and Ongoing Work


Experimental Setup

 Delay and power model (library)
– LP: linear increase in power – gate sizing context
– EP: exponential increase in power – Vt or gate-length context

 Heuristics compared
– Two commercial tools (BlazeMO, Cadence Encounter)
– UCLA sizing tool
– UCSD sensitivity-based leakage optimizer

 Realistic benchmarks: six open-source designs

 Suboptimality calculation:

Suboptimality = (power_heuristic − power_opt) / power_opt
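The metric in code (the power values in the usage line are made-up illustrations):

```python
def suboptimality(power_heuristic, power_opt):
    """Relative excess power of a heuristic solution over the known optimum."""
    return (power_heuristic - power_opt) / power_opt

# A heuristic that spends 112.8 units against an optimum of 100.0:
print(f"{suboptimality(112.8, 100.0):.1%}")  # 12.8%
```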


Generated Benchmark - Complexity

 Complexity (suboptimality) of generated benchmarks: chain-only vs. connected-chain topologies

(Charts: suboptimality, 0%–20%, of a commercial tool and a greedy optimizer on chain-only vs. connected benchmarks, labeled [library]-[N]-[k].)

– Chain-only: avg. 2.1%
– Connected-chain: avg. 12.8%


Generated Benchmark - Connectivity

 Problem complexity and circuit connectivity
  • 1. Arranged assignment: improves connectivity (larger fanin → later stage, larger fanout → earlier stage)
  • 2. Random assignment: improves diversity of topology

arranged | random | unconnected | subopt.
100%     | 0%     | 0.00%       | 2.60%
75%      | 25%    | 0.00%       | 6.80%
50%      | 50%    | 0.25%       | 10.30%
25%      | 75%    | 0.75%       | 11.20%
0%       | 100%   | 17.00%      | 7.70%


Suboptimality w.r.t. Parameters

 For different numbers of chains (N = 40 to 640) and numbers of stages (K = 20 to 100)

(Plots: suboptimality, 8%–14%, and runtime in minutes, log scale, of Comm, Greedy and SensOpt vs. number of chains and number of stages.)

→ Total # paths increases significantly with N and K


Suboptimality w.r.t. Parameters (2)

 For different average net degrees (1.2 to 2.4) and different delay constraints (0.4 to 1.1 ns)

(Plots: suboptimality and runtime in minutes, log scale, of Comm, Greedy and SensOpt vs. average net degree and timing constraint.)


Generated Realistic Benchmarks

 Target benchmarks
– SASC, SPI, AES, JPEG, MPEG (from OpenCores)
– EXU (from OpenSPARC T1)

 Characteristic parameters of real and generated benchmarks:

design | data depth | # instances | Rent param. (real) | net degree (real) | Rent param. (generated) | net degree (generated)
SASC   | 20 | 624    | 0.858 | 2.06 | 0.865 | 2.06
SPI    | 33 | 1092   | 0.880 | 1.81 | 0.877 | 1.80
EXU    | 31 | 25560  | 0.858 | 1.91 | 0.814 | 1.90
AES    | 23 | 23622  | 0.810 | 1.89 | 0.820 | 1.88
JPEG   | 72 | 141165 | 0.721 | 1.84 | 0.831 | 1.84
MPEG   | 33 | 578034 | 0.848 | 1.59 | 0.848 | 1.60


Suboptimality of Heuristics

 Suboptimality w.r.t. known optimal solutions for the generated realistic benchmarks
– Vt-swap context (EP library): up to 52.2%, avg. 16.3%
– Gate sizing context (LP library): up to 43.7%, avg. 25.5%

(Charts: suboptimality, 0%–60%, of Comm1, Comm2, Greedy and SensOpt on eyechart, SASC, SPI, AES, EXU, JPEG and MPEG, with the EP and LP libraries. Greedy results for MPEG are missing.)


Comparison w/ Real Designs

 Suboptimality versus one specific heuristic (SensOpt): real designs with a real delay/leakage library (TSMC 65nm)

(Charts: suboptimality, −10% to 50%, of Comm1, Comm2 and Greedy relative to SensOpt on SASC, SPI, AES, EXU, JPEG and MPEG, alongside the suboptimality from our benchmarks.)

→ Actual suboptimality will be greater!

 Discrepancy: simplified delay model, reduced library set, ...


Conclusions

 A new benchmark generation technique for gate sizing → constructs realistic circuits with known optimal solutions

 Our benchmarks enable systematic and quantitative study of common sizing heuristics

 Common sizing methods are suboptimal on realistic benchmarks by up to 52.2% (Vt assignment) and 43.7% (sizing)

 http://vlsicad.ucsd.edu/SIZING/


Ongoing Work

 Analyze discrepancies between real and artificial benchmarks

 Handle more realistic delay models
– Use a realistic delay library in the context of realistic benchmarks with tight upper bounds

 Alternate approach to netlist generation
– (1) cut nets in a real design and find an optimal solution → (2) reconnect the nets while keeping the optimal solution


Thank you