Bridging Pre- and Post-silicon Debugging with BiPeD



SLIDE 1

Bridging Pre- and Post-silicon Debugging with BiPeD

Andrew DeOrio, Jialin Li, and Valeria Bertacco

November 2012

University of Michigan

ICCAD “Simulation-based Verification” Session

SLIDE 2

Verification Opportunities

Pre-Silicon
  + High observability
  + Reproducible bugs
  - Low speed

Post-Silicon
  + High speed
  - Poor observability
  - Intermittent bugs

Little information sharing between the two phases.

SLIDE 3

Verification Opportunities

Pre-silicon and post-silicon verification have complementary strengths, yet little information is shared between them. A shared correctness model connects the two:

  • Pre-silicon: high observability → learn correct behavior
  • Post-silicon: high speed → enforce correct behavior

SLIDE 4

Contributions

A shared correctness model replaces the little information sharing between pre-silicon and post-silicon verification, providing:

  • High speed
  • High observability, detailed debugging info
  • No need for bug reproduction

SLIDE 5

BiPeD Overview

Pre-silicon (protocol extraction)
  • Run correct tests
  • Monitor interfaces
  • Learn correct protocols

Post-silicon, on-line (protocol detection)
  • Run many unknown tests
  • HW detects protocols
  • Detect errors in protocols

Post-silicon, off-line (transaction extraction)
  • Transfer debug data off-chip
  • Extract debugging information

SLIDE 6

BiPeD Overview

  1. Pre-silicon protocol extraction
  2. Post-silicon protocol detection
  3. Offline transaction extraction

(Figure: BiPeD flow. Pre-silicon: tests run on a logic simulator, and protocol extraction fills the Protocol Database. Post-silicon: on-line protocol detection runs on the test platform; on an error, off-line transaction extraction reports the occurrence time, the errant transaction, the location (signals), and the transaction history.)

SLIDE 7

Pre-silicon Protocol Extraction

  • Run the pre-silicon tests on the design under test in simulation
  • Select interface signals to analyze
  • Protocol extraction produces a protocol diagram, which describes interface behavior

(Figure: testbench code for the pre-silicon tests ("module testbench initial begin clock = 0; #5 clock = 1; end"), the design under test, and an extracted protocol diagram with events 01100, 00000, 00010, 00100, 00101.)

“INFERNO: Streamlining Verification with Inferred Semantics”, DeOrio et al., 2009
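The extraction step can be pictured with a small software model. This is a minimal sketch, not the authors' implementation: it assumes each simulation trace is already a list of sampled interface values (events) and that a protocol is simply the set of events seen plus the set of consecutive event pairs (transitions). The name `extract_protocol` and the trace format are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Event = str  # one sampled value of the selected interface signals, e.g. "01100"


def extract_protocol(
    traces: Iterable[List[Event]],
) -> Tuple[Set[Event], Set[Tuple[Event, Event]]]:
    """Collect every event and every consecutive event pair seen in passing traces."""
    events: Set[Event] = set()
    transitions: Set[Tuple[Event, Event]] = set()
    for trace in traces:
        events.update(trace)
        # zip(trace, trace[1:]) pairs each event with its successor in the trace
        transitions.update(zip(trace, trace[1:]))
    return events, transitions


# Two hypothetical passing traces of a 5-bit interface
example_traces = [
    ["00000", "00100", "00000", "01100"],
    ["01100", "00100"],
]
learned_events, learned_transitions = extract_protocol(example_traces)
```

The resulting sets play the role of one entry in the protocol database: anything outside them is treated as suspicious later on.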

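SLIDE 8

Pre-silicon Protocol Extraction

(Figure: waveform of the interface signals protect, thread sync, TLB bypass, ASI reload, and flush over time (cycles), next to the extracted protocol diagram. Each value of the signals is an event (01100, 00000, 00010, 00100, 00101); each step from one event to the next is a transition.)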

SLIDE 9

TLU Protocol Example

Interface: TLU/LSU interface in the SPARC core

Events: 01100, 00000, 00010, 00100, 00101

  • bit 0: protect
  • bit 1: thread sync
  • bit 2: TLB bypass
  • bit 3: ASI reload
  • bit 4: flush
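To read the event encodings, a short decoding helper can be useful. This is a sketch under one assumption: bit 0 is the leftmost character of the printed event, which is inferred from the transaction labels elsewhere in the talk (01100 involves thread sync and TLB bypass, 00101 involves TLB bypass and flush). The function name `decode_event` is hypothetical.

```python
from typing import List

# Signal names in bit order 0..4, as listed on the slide
SIGNALS = ["protect", "thread sync", "TLB bypass", "ASI reload", "flush"]


def decode_event(bits: str) -> List[str]:
    """Return the names of the signals asserted in a 5-character event string.

    Assumption: the leftmost character is bit 0.
    """
    if len(bits) != len(SIGNALS) or set(bits) - {"0", "1"}:
        raise ValueError(f"expected {len(SIGNALS)} binary digits, got {bits!r}")
    return [name for name, bit in zip(SIGNALS, bits) if bit == "1"]


print(decode_event("01100"))  # ['thread sync', 'TLB bypass']
print(decode_event("00101"))  # ['TLB bypass', 'flush']
```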

SLIDE 10

Outline

  1. Pre-silicon protocol extraction
  2. Post-silicon protocol detection
  3. Offline transaction extraction

SLIDE 11

Post-silicon Protocol Detection

  • Load protocols into programmable HW
  • Run high-coverage post-silicon tests
  • Only stop when error is detected

(Figure: test platform with TLU, cache, crossbar, and memory running the post-si tests; a protocol detector with a circular buffer monitors the interface and raises "Error!".)

SLIDE 12

Post-silicon protocol detection

  • Detect multiple protocols simultaneously

(Figure: protocol detector and circular buffer on the test platform. The monitored interface feeds an event CAM and a priority encoder, which yield the current event and a valid-event flag; a transition CAM checks the previous and current events and yields a valid-transition flag. Outputs: error out and, from the event history, history out.)

SLIDE 13

Protocol detector hardware

  • Programmable
  • Circular history buffer

Each observed value passes through three steps:

  • Check event: event CAM and priority encoder
  • Check transition: transition CAM on the previous and current events
  • Record history: event history buffer

An invalid event or invalid transition drives error out; the recorded events are read through history out.
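The check event, check transition, record history sequence can be mimicked in software. This is a behavioral sketch only, not a model of the actual CAM hardware: Python sets stand in for the event and transition CAMs, and a bounded `collections.deque` stands in for the circular history buffer. The class name, method names, and default buffer size are assumptions.

```python
from collections import deque
from typing import Iterable, Optional, Tuple


class ProtocolDetector:
    """Software stand-in for one programmable protocol detector."""

    def __init__(
        self,
        events: Iterable[str],
        transitions: Iterable[Tuple[str, str]],
        history_size: int = 1024,
    ) -> None:
        self.events = set(events)            # plays the role of the event CAM
        self.transitions = set(transitions)  # plays the role of the transition CAM
        self.history = deque(maxlen=history_size)  # oldest entries drop off
        self.previous: Optional[str] = None

    def step(self, current: str) -> Optional[str]:
        """Check one observed event; return None if legal, else the error kind."""
        if current not in self.events:
            error = "invalid event"
        elif self.previous is not None and (self.previous, current) not in self.transitions:
            error = "invalid transition"
        else:
            error = None
        self.history.append(current)  # record history, including an errant event
        self.previous = current
        return error


detector = ProtocolDetector(
    events={"00000", "00100", "01100"},
    transitions={("00000", "00100"), ("00100", "00000")},
    history_size=2,
)
results = [detector.step(e) for e in ["00000", "00100", "10100"]]
```

In this toy run the third value is not a known event, so the detector reports an error while the history keeps the last two observed values for later analysis.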

SLIDE 14

Area overhead

  • 0.7% of OpenSPARC T2 for 10 detectors

– 15.3KB storage each, for biggest OST2 protocol

(Figure: protocol detector annotated with sizes: 33 bits x 62 events, 622 transitions, 1,024 events, 10 protocols.)

SLIDE 15

TLU Protocol Example

  • Injected bug in OpenSPARC TLU/LSU interface

– Cycle 10,000

  • Programmed TLU/LSU protocol into detector
  • Ran test
  • BiPeD HW detected bug at cycle 10,017, 17 cycles after injection

SLIDE 16

Outline

  1. Pre-silicon protocol extraction
  2. Post-silicon protocol detection
  3. Offline transaction extraction

SLIDE 17

Off-line transaction extraction

  • Transfer the circular buffer contents off-chip
  • Run transaction extraction on the transferred events

(Figure: test platform with TLU, cache, crossbar, and memory running the post-si tests; the protocol detector's circular buffer is transferred off-chip and passed to transaction extraction.)

SLIDE 18

Transaction extraction

  • Leverage transaction extraction similar to Inferno [DeOrio et al., 2009]
  • Input: circular event buffer
  • Output: intuitive, high-level transactions

Example transactions:

  • 01100 00100: thread sync
  • 00100 00000 01100: burst TLB bypass w/ thread sync
  • 00000 00100: TLB bypass
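One simple way to turn an event buffer into named transactions is a greedy longest-match scan over known event sequences. This is an illustrative sketch only; the talk says the extraction is similar to Inferno but does not spell out the algorithm, so the matching strategy, the `PATTERNS` table layout, and the function name are assumptions. The three patterns are the ones shown on the slide.

```python
from typing import Dict, List, Sequence, Tuple

# Event sequences and their labels, taken from the slide's examples
PATTERNS: Dict[Tuple[str, ...], str] = {
    ("01100", "00100"): "thread sync",
    ("00100", "00000", "01100"): "burst TLB bypass w/ thread sync",
    ("00000", "00100"): "TLB bypass",
}


def extract_transactions(
    history: Sequence[str],
    patterns: Dict[Tuple[str, ...], str] = PATTERNS,
) -> List[str]:
    """Segment an event history into labeled transactions, longest pattern first."""
    ordered = sorted(patterns, key=len, reverse=True)
    labels: List[str] = []
    i = 0
    while i < len(history):
        for pattern in ordered:
            if tuple(history[i:i + len(pattern)]) == pattern:
                labels.append(patterns[pattern])
                i += len(pattern)
                break
        else:
            # No known transaction starts here: surface the raw event instead
            labels.append("unmatched: " + history[i])
            i += 1
    return labels


buffer_contents = ["01100", "00100", "00100", "00000", "01100", "00000", "00100"]
print(extract_transactions(buffer_contents))
# ['thread sync', 'burst TLB bypass w/ thread sync', 'TLB bypass']
```

Reporting unmatched events rather than dropping them keeps an errant value visible in the extracted history.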

SLIDE 19

TLU Protocol Example

(Figure: TLU/LSU interface in the SPARC core with events 01100, 00000, 00010, 00100, 00101 and bits 0 to 4 = protect, thread sync, TLB bypass, ASI reload, flush. The protocol diagram is annotated with the transactions address reload, TLB bypass w/ flush, TLB bypass, and burst TLB bypass w/ sync.)

SLIDE 20

Transaction extraction example

TLU protocol diagram (subset): events 00100, 00000, 00101, and 00010, plus a buggy transition to 10100.

Extracted transaction history:

  • Event sequences: 01100 00100; 00100 00000 01100; 00000 00100; ...
  • Transactions: thread sync, burst TLB bypass w/ thread sync, TLB bypass, TLB bypass
  • Cycle ranges: 3,694-3,732; 4,492-4,531; 4,539-4,543; 4,545-4,602; ...; 4,609-10,017

SLIDE 21

Transaction extraction example

  • Time: cycle 10,017
  • Interface: TLU
  • Signals: protect, thread sync, TLB bypass, ASI reload, flush
  • Preceding activity: thread sync, burst TLB bypass w/ thread sync, TLB bypass, TLB bypass
  • Event: 10100
  • Transition: 00100 -> 10100
  • Transaction: (figure: protocol diagram with events 00100, 00000, 00101, 00010 and the buggy transition to 10100)

SLIDE 22

Limitations

  • False negatives

– May miss bugs that only affect data signals
– Interface signal selection is important
– Control signals work well in practice

  • False positives

– High pre-silicon coverage → fewer false positives
– If a false positive is encountered, update the database

SLIDE 23

Experimental setup

  • 10 testcases
  • 100 random seeds: variable memory delay, crossbar random traffic
  • 1,000 passing runs
  • 10 bugs: e.g., functional bug in PCX, fetch thread ID
  • 100 buggy runs
  • 10 interfaces
  • BiPeD HW and BiPeD SW produce the detected transactions

SLIDE 24

Signal Localization

(Figure: matrix of bugs versus monitored interfaces. Interface labels: branch EX valid inst. cache-proc MEM rd ack FPU except. fetch thread LSU access table walk PCX stall CCX/PCX req CPX. Bug labels include branch, CCX, memory, execute, FPU, fetch, perf., TLU, PCX. Highlighted cells mark the first interface to find each bug. Legend: f.p. = false positive, f.n. = false negative.)

SLIDE 25

Protocol Extraction

(Figure: cumulative events and cumulative transitions versus testcase and total number of test executions.)

SLIDE 26

Transaction Extraction

(Figure: number of transactions, total and unique, versus circular buffer size from 32 to 1,024 entries; the buffer sizes are annotated 0.1 KB to 4 KB.)

SLIDE 27

Leave-one-out Cross Validation

(Figure: false positives (percent, axis from 0% to 30%) for each omitted testcase.)
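The leave-one-out procedure can be sketched as follows. This is a hedged illustration, not the authors' evaluation code: it assumes each testcase contributes one bug-free event trace, learns the legal transitions from all other testcases, and counts as a false positive every distinct transition in the omitted trace that was never learned. How the talk defines the false-positive percentage is not stated, so this metric and the names `learn_transitions` and `leave_one_out` are assumptions.

```python
from typing import Dict, Iterable, List, Set, Tuple

Transition = Tuple[str, str]


def learn_transitions(traces: Iterable[List[str]]) -> Set[Transition]:
    """Collect all consecutive event pairs from the training traces."""
    learned: Set[Transition] = set()
    for trace in traces:
        learned.update(zip(trace, trace[1:]))
    return learned


def leave_one_out(traces_by_test: Dict[str, List[str]]) -> Dict[str, float]:
    """For each omitted testcase, percent of its distinct transitions flagged in error."""
    rates: Dict[str, float] = {}
    for omitted, held_out in traces_by_test.items():
        training = [t for name, t in traces_by_test.items() if name != omitted]
        learned = learn_transitions(training)
        observed = set(zip(held_out, held_out[1:]))
        flagged = observed - learned  # correct behavior the model never saw
        rates[omitted] = 100.0 * len(flagged) / len(observed) if observed else 0.0
    return rates


# Three hypothetical bug-free traces
toy_traces = {
    "a": ["00000", "00100", "00000"],
    "b": ["00000", "00100", "01100"],
    "c": ["00100", "00000", "00100"],
}
loo_rates = leave_one_out(toy_traces)
```

In the toy data, only testcase "b" exercises a transition that no other testcase covers, so it is the only one with a nonzero rate.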

SLIDE 28

Related Work

  • Invariant detection [Ammons 2002, Ernst 2008]

– Detect invariants
– Check tests against invariants

  • Pre-silicon verification

– Inferno: verification with transactions [DeOrio 2009]
– Data mining high-level specifications [Li 2010]

  • Post-silicon validation

– Manual debugging [Abramovici 2006]
– Automated debugging of specific components [Park 2011]
– Manual, hardcoded transaction checkers [Singerman 2011]

SLIDE 29

Conclusions and Future Work

  • BiPeD bridges pre-silicon protocol extraction with post-silicon detection
  • Automatically detects bugs
  • Provides intuitive debugging information
  • Future applications for flexible hardware

– Coverage metrics
– Runtime verification