George Chen Amit Dhuria Intel Cadence The TAU 2017 Contest Path - - PowerPoint PPT Presentation



slide-1
SLIDE 1

George Chen Amit Dhuria Intel Cadence

slide-2
SLIDE 2

Programmable Solutions Group

The TAU 2017 Contest

Path Reporting Contest

George Chen, Intel [Speaker]
Amit Dhuria, Cadence
Xi Chen, Synopsys

Sponsors:

slide-3
SLIDE 3


Why Path Reporting?

Path reporting occupies a large portion of static timing analysis

  • Core timing analysis is generated only once.
  • But timing report generation occurs many times after the initial timing update.
  • Thousands of commands may be issued for specific subsets of paths to check for timing closure.
  • Web-based dashboard systems built to parse data depend upon the path reporting infrastructure.

Raise awareness of the need for efficient parallelization of path reporting

  • Accurate handling of from/through/to parameters
  • Explore limitations to the performance of path reporting.

slide-4
SLIDE 4


Contest Overview

Leveraged past TAU Timing Contest Infrastructure

Very limited time frame: 2 months only (12/17/2017 to 2/16/2018)

Contest scope: Graph-Based Analysis only, with support of

  • -from/-rise_from/-fall_from
  • -through/-rise_through/-fall_through
  • -to/-rise_to/-fall_to

Provided to Contestants

Detailed Documentation

  • Timing and path reporting tutorials, file formats, timing model basics, evaluation rules, etc.

Open Source Code and Binaries (previous contest winners, utilities)

  • 1. PATMOS 2011: NTU-Timer
  • 2. TAU 2013: IITiMer
  • 3. TAU 2014: UI-Timer
  • 4. ISPD 2013: .spef/.lib parsers
  • 5. TAU 2015 binary: iTimerC v2.0
  • 6. OpenTimer (UI-Timer v2.0)

Benchmarks

  • Design connectivity: Verilog (.v)
  • Early and late libraries: Liberty (.lib)
  • Design parasitics: SPEF (.spef)
  • Wrapper file (.tau2018)
  • Operations file (.ops)

Evaluation

  • ~100K individual path queries across two designs
  • Accuracy: STA run output (.output) checked against the golden result*
  • Performance: runtime and memory usage
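As a rough illustration of what the from/through/to options in the contest scope ask of a path, the following Python sketch checks a single path against such a query. It is not the contest's file format or any contestant's code: the `Constraint` and `Query` types, the pin names, and the rule that multiple through points must appear in order along the path are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# A path is a list of (pin_name, transition) pairs; transition is "rise" or "fall".
Path = List[Tuple[str, str]]


@dataclass
class Constraint:
    """Hypothetical pin constraint; transition=None accepts rise or fall."""
    pin: str
    transition: Optional[str] = None

    def matches(self, point: Tuple[str, str]) -> bool:
        pin, transition = point
        return pin == self.pin and self.transition in (None, transition)


@dataclass
class Query:
    """Hypothetical query holding the from/through/to constraints."""
    from_: Optional[Constraint] = None
    through: List[Constraint] = field(default_factory=list)
    to: Optional[Constraint] = None


def path_matches(path: Path, query: Query) -> bool:
    """Return True if the path satisfies every constraint in the query."""
    if not path:
        return False
    if query.from_ and not query.from_.matches(path[0]):
        return False
    if query.to and not query.to.matches(path[-1]):
        return False
    # Assumption: through points must be visited in the order they are listed.
    pos = 0
    for constraint in query.through:
        while pos < len(path) and not constraint.matches(path[pos]):
            pos += 1
        if pos == len(path):
            return False
        pos += 1
    return True


# Example path with made-up pin names.
example = [("inp1", "rise"), ("u1:a", "rise"), ("u1:o", "fall"), ("out", "fall")]
print(path_matches(example, Query(from_=Constraint("inp1"), to=Constraint("out", "fall"))))
```

A plain `Constraint("pin")` plays the role of the unqualified option, while setting `transition` to `"rise"` or `"fall"` mirrors the rise/fall variants.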

slide-5
SLIDE 5


Benchmarks

Based on TAU 2015 Benchmarks

Test setup

  • ~50K paths per testcase pre-generated by commercial timer
  • Used special constant-delay Liberty models to reduce/eliminate delay calculation impact.

Design          Gates     Nets
vga_lcd         139.5K    139.6K
leon3mp_iccad   1247.7K   1248.0K

slide-6
SLIDE 6


Evaluation Metrics

Overview

Metric          Weight   Remarks
Accuracy        30%      Correct sub-path reported
Runtime         30%      Elapsed time (not user time)
Memory          20%      Peak memory usage, including allocated memory
Documentation   20%      Clear description of algorithm and features

Detailed Scoring Algorithm (10 points possible)

  • Accuracy = $3 \times \frac{\#\,\text{correct paths}}{\#\,\text{total paths}}$ (custom script)
  • Runtime = $3 \times \frac{\min(\text{all entries})}{\text{entry's runtime}}$ (measured wall clock time using the "time" cmd)
  • Memory = $2 \times \frac{\min(\text{all entries})}{\text{entry's memory}}$ (measured peak memory using the "time" cmd)
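The accuracy, runtime, and memory formulas on this slide can be sketched in a few lines of Python. The function names are made up for illustration, and the sample runtime and memory figures are the ones listed on the contest results slide later in this deck (their units are not stated there).

```python
def accuracy_score(correct_paths, total_paths):
    """3 points scaled by the fraction of correctly reported paths."""
    return 3.0 * correct_paths / total_paths


def runtime_score(entry_runtime, all_runtimes):
    """3 points scaled by the fastest entry's runtime over this entry's runtime."""
    return 3.0 * min(all_runtimes) / entry_runtime


def memory_score(entry_memory, all_memories):
    """2 points scaled by the smallest peak memory over this entry's peak memory."""
    return 2.0 * min(all_memories) / entry_memory


# Figures taken from the results slide of this deck.
runtimes = {"Team 1": 128, "Team 2": 49}
memories = {"Team 1": 4854800, "Team 2": 2252696}

for team in runtimes:
    r = runtime_score(runtimes[team], runtimes.values())
    m = memory_score(memories[team], memories.values())
    print(f"{team}: runtime {r:.1f}, memory {m:.1f}")
```

Because both ratios are taken against the best entry, the fastest and leanest entries always receive the full 3 and 2 points, and every other entry is scaled down proportionally.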

slide-7
SLIDE 7


TAU 2018 Contestants

University                                                     Team
National Chiao Tung University / National Taiwan University   iTimerP
Texas A&M University                                           Concha-STA
National Tsing Hua University                                  BattleCats

slide-8
SLIDE 8

Pei-Yu Lee (1), Hsien-Han Cheng (1), Iris Hui-Ru Jiang (1, 2)

(1) National Chiao Tung University
(2) National Taiwan University

slide-9
SLIDE 9

Team Name: iTimerP

Pei-Yu Lee (1), Hsien-Han Cheng (1), Iris Hui-Ru Jiang (1, 2)

(1) National Chiao Tung University
(2) National Taiwan University

TAU 2018 Contest

slide-10
SLIDE 10

Outline

  • Algorithm flow
  • Motivation
  • Timing graph marking
  • Path searching
  • Experimental result

slide-11
SLIDE 11

Algorithm Flow

Block-Based Timing Analysis

  • Initial Timing Graph Construction
  • Forward Propagating Arrival Time
  • Worst Slack Calculation

Path Extracting (shown as several parallel instances in the diagram)

  • Timing Graph Marking
  • Backward Traversing Critical Paths
  • Report Generation

slide-12
SLIDE 12

Timing Graph Marking

  • Mark each pair from endpoint to startpoint

[Figure: timing graph with the startpoint and endpoint labeled]

slide-13
SLIDE 13

Motivation

  • For each two levels, fanin cone size < fanout cone size
      - Always mark fanin first to bound fanout region

[Figure label: Fanin cone size = fanout cone size]

slide-14
SLIDE 14

Pairwise Marking

[Figure: marking between a start level and an end level (left level, right level), showing the fanin cone, the fanout cone (bounded by the fanin cone), and the overlapped region]
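One way to picture the marking step is as two graph searches: the fanin cone of the endpoint is marked first, and the fanout search from the startpoint is then allowed to visit only nodes inside that cone. The Python sketch below illustrates that idea under simplifying assumptions. It ignores the level-based bookkeeping shown in the figure, the graph is a made-up example, and `reachable` and `overlapped_region` are hypothetical helper names rather than iTimerP's actual routines.

```python
from collections import deque


def reachable(start, edges, allowed=None):
    """Breadth-first search from `start`; if `allowed` is given, never leave it."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen and (allowed is None or nxt in allowed):
                seen.add(nxt)
                queue.append(nxt)
    return seen


def overlapped_region(fanout, startpoint, endpoint):
    """Nodes lying on some path from startpoint to endpoint."""
    # Build the reverse (fanin) adjacency from the fanout adjacency.
    fanin = {}
    for u, successors in fanout.items():
        for v in successors:
            fanin.setdefault(v, []).append(u)
    # Mark the fanin cone of the endpoint first.
    fanin_cone = reachable(endpoint, fanin)
    if startpoint not in fanin_cone:
        return set()
    # The fanout search is bounded by the fanin cone marked above.
    return reachable(startpoint, fanout, allowed=fanin_cone)


# Made-up example graph: node -> list of fanout nodes.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["e"], "d": ["f"], "x": ["d"]}
print(sorted(overlapped_region(graph, "a", "f")))
```

In this example the fanout search from `a` never expands `c` or `e`, because neither lies in the fanin cone of `f`.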

slide-15
SLIDE 15

Path Searching

  • Backward traverse nodes in overlapped region
  • Enhanced by two techniques
      - Local slack bounds
      - Slack priority queue
  • Local slack bounds
      - Record the worst-k slacks for each node
      - Prune if the current slack is better than the slack bound (the best recorded slack of the node)
      - Reduces redundant traversal
  • Slack priority queue
      - Obtain the globally worst-slack node during backward traversal
          · Use a priority queue prioritized by path slack
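The slack priority queue can be illustrated with a small best-first search. The Python sketch below is a generic k-worst-path enumeration, not iTimerP's implementation: it assumes late-mode analysis on an acyclic graph, a single required time at the endpoint, and made-up delays, and it leaves out the local slack bounds described above. Each partial path is keyed by the worst slack any completion of it can reach, so complete paths pop out in worst-first order.

```python
import heapq
from itertools import count


def arrival_times(fanin, order):
    """Latest arrival time per node; `order` must be a topological order."""
    arrival = {}
    for node in order:
        preds = fanin.get(node, [])
        arrival[node] = max((arrival[u] + d for u, d in preds), default=0.0)
    return arrival


def worst_paths(fanin, order, endpoint, required, k):
    """Return up to k (slack, path) pairs ending at `endpoint`, worst slack first."""
    arrival = arrival_times(fanin, order)
    tie = count()  # tie-breaker so the heap never has to compare paths
    heap = [(required - arrival[endpoint], next(tie), 0.0, (endpoint,))]
    results = []
    while heap and len(results) < k:
        slack, _, suffix_delay, path = heapq.heappop(heap)
        preds = fanin.get(path[0], [])
        if not preds:
            # Reached a startpoint: the slack of this path is now exact.
            results.append((slack, list(path)))
            continue
        for u, delay in preds:
            new_suffix = suffix_delay + delay
            # Worst slack achievable by any path that continues backward through u.
            bound = required - (arrival[u] + new_suffix)
            heapq.heappush(heap, (bound, next(tie), new_suffix, (u,) + path))
    return results


# Made-up graph: node -> list of (fanin node, edge delay).
fanin = {"c": [("a", 2.0), ("b", 3.0)], "d": [("c", 1.0), ("b", 1.0)]}
for slack, path in worst_paths(fanin, ["a", "b", "c", "d"], "d", required=5.0, k=2):
    print(slack, " -> ".join(path))
```

Because the key of an extended partial path is never smaller than the key of the path it extends, the search can stop as soon as k complete paths have been popped.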

slide-16
SLIDE 16

Runtime & Memory Tradeoff

  • Perform multiple report_timing concurrently
      - Without parallelism: slowest runtime, smallest memory usage (100 sec/300MB)
      - With parallelism: a command cache size of 2^8 to 2^14 gives the best runtime and memory tradeoff (200 sec/300MB)
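The slide does not define "command cache size". One plausible reading, assumed here, is the number of report_timing commands buffered and dispatched to workers at a time. Under that assumption, the tradeoff can be sketched in Python as below, where `run_all`, `batches`, and the `handler` callback are hypothetical names. A larger batch keeps workers busier but holds more pending commands and results in memory.

```python
from concurrent.futures import ThreadPoolExecutor


def batches(commands, cache_size):
    """Yield consecutive slices of at most `cache_size` commands."""
    for i in range(0, len(commands), cache_size):
        yield commands[i:i + cache_size]


def run_all(commands, handler, cache_size=2**8, workers=4):
    """Run `handler` on every command, one batch of `cache_size` at a time."""
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch in batches(commands, cache_size):
            # pool.map preserves input order, so reports stay in command order.
            results.extend(pool.map(handler, batch))
    return results


# Stand-in handler: a real one would run a single path query.
reports = run_all(list(range(1000)), lambda cmd: f"report {cmd}", cache_size=2**8)
print(len(reports), reports[0], reports[-1])
```

This shows only the batching structure. Python threads do not speed up CPU-bound work, so a real timer would more likely use native threads or processes.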

slide-17
SLIDE 17

Scalability

  • Perform 2^21 (1 Million) report_timing in 20 sec/360MB
  • Runtime increases linearly as the command count doubles

slide-18
SLIDE 18


Thank you!

slide-19
SLIDE 19


slide-20
SLIDE 20


2018 Contest Results

Accuracy   Correct Paths   Total Score
Team 1     100%            3.0
Team 2     100%            3.0

Runtime    Runtime         Total Score
Team 1     128             1.1
Team 2     49              3.0

Memory     Memory          Total Score
Team 1     4854800         0.9
Team 2     2252696         2.0

Documentation   Score
Team 1          1.5
Team 2          2.0

Totals     Score   Placement
Team 1     6.6     2nd Place
Team 2     10.0    1st Place

slide-21
SLIDE 21

1st Prize

Presented to

iTimerP
National Chiao Tung University, National Taiwan University
Pei-Yu Lee, Hsien-Han Cheng, and Iris Hui-Ru Jiang

For

TAU 2018 Timing Contest on Path Reporting

Song Chen, Technical Chair
George Chen, Contest Chair
Tom Spyrou, General Chair

slide-22
SLIDE 22

2nd Prize

Presented to

BattleCat
National Tsing Hua University
Kuan-Ming Lai and Wen-Hung Hsu

For

TAU 2018 Timing Contest on Path Reporting

Song Chen, Technical Chair
George Chen, Contest Chair
Tom Spyrou, General Chair

slide-23
SLIDE 23


Requests

Ideas for future contests?

  • What sort of topics may be interesting to the group?

Increasing contestant participation?

  • How to increase contestants?

Increasing committee participation?

  • We are looking for additional individuals to help plan the contest. Please contact George.J.Chen@intel.com if you are interested!

slide-24
SLIDE 24


Acknowledgements

Xi Chen, Contest Committee Member
Tom Spyrou, Workshop General Chair
Song Chen, Workshop Technical Chair

TAU 2018 Contestants

This contest would not have been successful without your hard work and dedication

slide-25
SLIDE 25