From Recovering Time to Timing Recovery: Some Challenges for the - PowerPoint PPT Presentation

From Recovering Time to Timing Recovery: Some Challenges for the TAU Community Andrew B. Kahng Depts. of CSE and ECE UC San Diego abk@ucsd.edu http://vlsicad.ucsd.edu/~abk

TAU-2016 Keynote: “In Search of Lost Time” • “Recovering Time”: machine learning, optimization, margin reduction, … 2

Agenda • Motivations 3

Design Crises: Cost, Expertise, Unpredictability • Design cost: not scaling • Design, process roadmaps not coupled • Figure: Andreas Olofsson, DARPA, ISPD-2018 keynote • Quality: also not scaling • Design Capability Gap • Available density: 2x/node • Realizable density: 1.6x/node • Figure: UCSD / 2013 ITRS 4

Design is Too Difficult ! • Tools and flows have steadily increased in complexity • Modern P&R tool: 10000+ commands/options • Hard to design with latest tools in latest technologies • Even harder to predict quality, schedule • Expert users required • Increased cost and risk not good for industry ! • Still have “CAD” mindset more than “DA” mindset • Again: assumes expert users How do we escape this “local minimum” ? 5

IDEA: No-Humans, 24-Hours A. Olofsson, DARPA ISPD-2018 keynote • Part of DARPA Electronics Resurgence Initiative • Traditional focus: ultimate quality • New focus = ultimate ease of use • No humans, 24-hour TAT = “equivalent scaling” • Overarching goal: designer access to silicon 6

DARPA IDEA and POSH Programs, 2018-2022 https://vlsicad.ucsd.edu/NEWS18/dac_v5_DISTAR.pdf 7

theopenroadproject.org 8

OpenROAD: A New Design Paradigm 24 hours, no humans – no PPA loss Machine Learning Restricted layout of tools, flows optimization partitioning Extreme Parallel Design Complexity Mindsets  Quality • Achieve predictability from the user’s POV  Schedule • Use cloud/parallel to recover solution quality • Focus on reducing time and effort = schedule, cost  Cost Machine Learning is CENTRAL to this 9

The OpenROAD Project • Initial target: digital IC flow “RTL to GDS” • Open source • No-human-in-loop • Limited “knobs”, restricted field of use • Must replace intelligent humans (partition, floorplan, …) 10

Agenda • Motivations • OpenROAD + Initial Target 11

Initial Target: RTL-to-GDS Layout Generation Verilog + .lib, .sdc, .lef • Inputs: .v, .sdc, .lib, .lef Logic Synthesis • .def, .spef in point tools • config files required • pre-characterizations required Floorplan/PDN • Outputs: post-route .def, timing/power estimates Placement • V1.0 release: June 2020 Clock Tree Synthesis Global and Detailed Routing Layout Finishing GDSII 12

Placement https://github.com/abk-openroad/RePlAce • RePlAce features • Timing-driven (OpenSTA timer integrated) • Mixed-size (macros + cells) .def from FP/PDN • Electrostatics analogy in analytic placement (+ .v, .sdc, .lef, .lib) • RePlAce used in: • Physical synthesis Placement • Floorplanning • Clock tree synthesis • Traditional standard-cell placement Placed .def • BSD-3 License 13

RePlAce: Routability-Driven Placement • Global routing during routability-driven global placement Routability-driven loop 14

Static Timing Analysis https://github.com/abk-openroad/OpenSTA • OpenSTA : open-sourced static timing analysis tool • Developer: James Cherry (Parallax Software) • Tested with ASAP7, GF14, TSMC16, ST28, etc. • GPLv3 license 15

Slack, WNS, TNS 28nm aes_cipher_top (28nm, 12T, clkp=1000ps) Reg-to-Out/ Reg-to-Reg In-to-Reg aes_cipher_top WNS (ps) TNS (ps) #viol. Signoff STA -61 -289 7 OpenSTA (arnoldi) -57 -314 9 16

Slack, WNS, TNS 16nm Coyote (16nm, 9T, clkp=2000ps) Signoff STA OpenSTA WNS (ns) -0.660 -0.603 TNS (ns) -1758.004 -1219.239 #viol. 8096 6926 17

Challenges for the TAU Community • #1. Help improve open-source STA engine • In particular: OpenSTA • Delay calculation, SI analysis, advanced timing models, MCMM, … • Priorities = ? • Will revisit: Signoff STA OpenSTA WNS (ns) -0.660 -0.603 TNS (ns) -1758.004 -1219.239 #viol. 8096 6926 18

The OpenROAD Project • Initial target: digital IC flow “RTL to GDS” • Open source • No-human-in-loop • Limited “knobs”, restricted field of use • Must replace intelligent humans (partition, floorplan, …) 19

Agenda • Motivations • OpenROAD + Initial Target • Machine Learning 20

ML in IC Design: Not Like Chess or Cat Pics • Getting to self-driving IC design: not so obvious • Do recent ML successes transfer well? • 3-week SP&R&Opt run is NOT like playing chess! • Design lives in a {servers, licenses, schedule} box • Distributions of outcomes matter cloud, parallel • A “stack of models” is mandatory: Predictions of downstream outcomes are also optimization objectives • Still uncharted road to self-driving tools and flows • How do we overcome “small, expensive data” challenges? • Standards : Learning comes from {design + tool + technology}, all of which are highly proprietary • Need mechanisms for IP-preserving sharing of data and models 21

4 Stages of ML to Recover Time, Effort Four Stages of Machine Learning 1. Mechanization and Automation 2. Orchestration of Search and Optimization 3. Pruning via Predictors and Models 4. From Reinforcement Learning Huge space of tool, command, option trajectories through design through Intelligence flow 22

Stage 3. Modeling and Prediction • Prediction of tool- and design-specific outcomes over longer and longer subflows • Wiggling of longer and longer ropes • Enables pruning and termination  avoid wasted design resources • Simple way to think about it: “identify doomed X” • Doomed floorplan, Opt run, DRoute run, … • Allocate resources elsewhere • Better outcome within given resource budget • Complementary dream: New heuristics and tools that are inherently more predictable and modelable  lessen chaos • Ensembles might be modeled/predicted • Prediction requirement might be relaxed “get user into a ballpark”? 23

Generic Need: Predicting Doomed Runs • NOTE: “Doomed” often wrt timing, or due to fear of timing!!! • Picture: progressions of #DR violations in commercial router • Simple approach: track and project metrics as time series • Can use Markov decision process (MDP): “GO” vs. “STOP” strategy card to terminate “doomed runs” early 24

Obtaining Golden From Non-Golden ML shifts the Accuracy-Cost Tradeoff Curve (for free) ! 25

DATE14, SLIP15 (Old) Example: ML-based Timer Correlation If INCREMENTAL Outliers error > (data points) threshol d New Designs MODELS (Path slack, setup Train Validate Test time, stage, cell, wire delays) Artificial Real Circuits Designs ONE-TIME AFTER BEFORE 0.1 T 2 Path Slack (ns) 0 T 2 Path Slack (ns) -0.1 -0.2 31 ps ML -0.3 Modeling ~4 � reduction -0.4 123 ps -0.5 -0.6 -0.6 -0.5 -0.4 -0.3 -0.2 -0.1 0 0.1 T 1 Path Slack (ns) T 1 Path Slack (ns) 26

ICCD18 Lately: Predicting PBA from GBA • PBA (Path-Based Analysis) is less pessimistic than GBA (Graph-Based Analysis) • But, can have MUCH more expensive runtime ! • ML task: Predict PBA timing from GBA timing •  Improved quality of results in P&R, optimization •  Less-expensive timing analysis usable earlier in flow GBA Mode PBA Mode 27

Bigram- and CART-based Modeling • Bigram-based path modeling • Classification and regression tree (CART) approach • Model based on 13 bigram parameters Reduced GBA pessimism vs. PBA https://vlsicad.ucsd.edu/Publications/Conferences/361/c361.pdf 28

DATE19 Lately: Reduce #Corners in STA and Opt • Want all the benefits of STA at N corners , but want to pay for analysis at only M << N corners • “Missing Corner Prediction” (“matrix completion”) saves runtime, licenses • “Primary corners” methodology  errors caught at signoff cause iteration 29

“Missing Corners” = Matrix Completion Predicting missing delay values = matrix completion problem STA at relatively few known corners  reasonably accurate prediction of timing at all unknown corners PCA: low-dimensional modeling problem 30

Recent: Strong Design-Independent Models Trained using Trained using Error initial artificial testcases richer artificial testcases 10X improvement !! # Corners megaboom (990K instances, 350K FF) 31

Recent: “ML-LEAK” (leakage recovery predictor) • ML to predict how much leakage will be recovered if user runs {Tweaker, Tempus ECO, PTSI ECO, homegrown script, …} • Gives expectation of post-recovery power • Beneficial to methodology team when trying out various DOEs. • Saves time for implementation team: skip leakage recovery if it won’t help • Blended model of design and instance level predictions gives best results. Power recovered in this design was 0.076%. Our Plot showing actual vs predicted percentage change in leakage model predicts 1% power recovery for this graph power after recovery 32

From Recovering Time to Timing Recovery: Some Challenges for the - PowerPoint PPT Presentation

From Recovering Time to Timing Recovery: Some Challenges for the TAU Community Andrew B. Kahng Depts. of CSE and ECE UC San Diego abk@ucsd.edu http://vlsicad.ucsd.edu/~abk TAU-2016 Keynote: In Search of Lost Time Recovering

Timing and Coordination Essential Knowledge 2.E.2 and 2.E.3 Timing and Coordination Timing

Is the ozone layer Is the ozone layer recovering ? recovering ? Johannes Staehelin Institute

Recovering Minerals and Bitumen Recovering Minerals and Bitumen from Oil Sands Tailings from Oil

Brighton High School Credit Recovery What is Credit Recovery? Recovering the credit if you

Brighton High School Credit Recovery What is Credit Recovery? Recovering the credit if you

Liberty Timing File (LIB) Advanced VLSI Design CMPE 641 Liberty Timing File The .lib file is an

Timing Library Format (TLF) Advanced VLSI Design CMPE 414 Timing Library Format (TLF) TLF is an

Timing Analysis Timing Path Groups and Types Timing paths are grouped into path groups

Digital Design Discussion: RTL Storage Components Shift Register Timing Register File Timing

Common Challenges with CDBG-Disaster Recovery Programs Guidance for Navigating the Recovery

Strip Recovery: Strip Recovery: Strip Recovery: Strip Recovery: A 12 A 12- -Step

Timing Attacks for Recovering Private Entries From Database Engines August 1, 2007 Damian

Iterative Timing Recovery John R. Barry School of Electrical and Computer Engineering, Georgia

Cycle time: 40 sec Cycle time: 12 sec Cycle time: 0.75 sec Cycle time: 1.25 sec Cycle time: 5

Community Recovery Forum Presenter: Cr Mary Brown Overview of Recovery Structure

RECOVERY OPERATIONS Performing recovery and related operations Acronis Training and Certification

Translating Open Source Apps Kukuh Syafaat openSUSE-ID cho2@opensuse.{org,id} openSUSE

Deadlocks: Detection & Avoidance (Chapter 6) CS 4410 Operating Systems The slides are the

Threat modelling for developers Arne Padmos xkcd Safety vs Security William Warby Warner

Lies, Damned Lies and Statistics EuroPython 2018 Edinburgh, UK July 2018 @MarcoBonzanini

Securi rity P y Principles & & Sandboxes CS 161: Computer Security Prof. Raluca Ada

Verification of Transactional Memories that support Non-Transactional Memory Accesses Ariel Cohen

Content, Topology and Cooperation in In-Network Caching Liang Wang Email:

Kevin Stadmeyer Garrett Held Worst of the Best of the Best Agenda Motives Goals

From Recovering Time to Timing Recovery: Some Challenges for the - PowerPoint PPT Presentation

From Recovering Time to Timing Recovery: Some Challenges for the TAU Community Andrew B. Kahng Depts. of CSE and ECE UC San Diego abk@ucsd.edu http://vlsicad.ucsd.edu/~abk TAU-2016 Keynote: In Search of Lost Time Recovering

Timing and Coordination Essential Knowledge 2.E.2 and 2.E.3 Timing and Coordination Timing

Is the ozone layer Is the ozone layer recovering ? recovering ? Johannes Staehelin Institute

Recovering Minerals and Bitumen Recovering Minerals and Bitumen from Oil Sands Tailings from Oil

Brighton High School Credit Recovery What is Credit Recovery? Recovering the credit if you

Brighton High School Credit Recovery What is Credit Recovery? Recovering the credit if you

Liberty Timing File (LIB) Advanced VLSI Design CMPE 641 Liberty Timing File The .lib file is an

Timing Library Format (TLF) Advanced VLSI Design CMPE 414 Timing Library Format (TLF) TLF is an

Timing Analysis Timing Path Groups and Types Timing paths are grouped into path groups

Digital Design Discussion: RTL Storage Components Shift Register Timing Register File Timing

Common Challenges with CDBG-Disaster Recovery Programs Guidance for Navigating the Recovery

Strip Recovery: Strip Recovery: Strip Recovery: Strip Recovery: A 12 A 12- -Step

Timing Attacks for Recovering Private Entries From Database Engines August 1, 2007 Damian

Iterative Timing Recovery John R. Barry School of Electrical and Computer Engineering, Georgia

Cycle time: 40 sec Cycle time: 12 sec Cycle time: 0.75 sec Cycle time: 1.25 sec Cycle time: 5

Community Recovery Forum Presenter: Cr Mary Brown Overview of Recovery Structure

RECOVERY OPERATIONS Performing recovery and related operations Acronis Training and Certification

Translating Open Source Apps Kukuh Syafaat openSUSE-ID cho2@opensuse.{org,id} openSUSE

Deadlocks: Detection &amp; Avoidance (Chapter 6) CS 4410 Operating Systems The slides are the

Threat modelling for developers Arne Padmos xkcd Safety vs Security William Warby Warner

Lies, Damned Lies and Statistics EuroPython 2018 Edinburgh, UK July 2018 @MarcoBonzanini

Securi rity P y Principles &amp; &amp; Sandboxes CS 161: Computer Security Prof. Raluca Ada

Verification of Transactional Memories that support Non-Transactional Memory Accesses Ariel Cohen

Content, Topology and Cooperation in In-Network Caching Liang Wang Email:

Kevin Stadmeyer Garrett Held Worst of the Best of the Best Agenda Motives Goals

Deadlocks: Detection & Avoidance (Chapter 6) CS 4410 Operating Systems The slides are the

Securi rity P y Principles & & Sandboxes CS 161: Computer Security Prof. Raluca Ada