RTLCheck: Verifying the Memory Consistency of RTL Designs

RTLCheck: Verifying the Memory Consistency of RTL Designs. Yatin A. Manerkar, Daniel Lustig*, Margaret Martonosi, and Michael Pellauer*. Princeton University, *NVIDIA. MICRO-50. http://check.cs.princeton.edu/


slide-1
SLIDE 1

Yatin A. Manerkar, Daniel Lustig*, Margaret Martonosi, and Michael Pellauer*

RTLCheck: Verifying the Memory Consistency of RTL Designs

http://check.cs.princeton.edu/

Princeton University *NVIDIA MICRO-50

slide-2
SLIDE 2

Memory Consistency Models (MCMs) are Complex

▪MCMs specify ordering requirements of memory operations in parallel programs

  • Essential to correct parallel systems

▪Difficult to specify and verify!

(All locations initially have a value of 0)

Core 0        | Core 1
Data = 100;   | while (Flag != 1) {}
Flag = 1;     | int r1 = Data;

slide-8
SLIDE 8

How to Verify Hardware MCM Behaviour?

▪Hardware enforces consistency model using smaller localized orderings

  • In-order fetch/WB
  • Coherence protocol orderings
  • …and many more

[Figure: two cores, each with a Fetch, Dec., Exec., Mem., WB pipeline, a store buffer (SB), and an L1, sharing an L2; labels: Coherence Protocol (SWMR, DVI, etc.), Lds.]

FIFO store buffers help ensure Total Store Order (TSO)

Do individual orderings correctly work together to satisfy consistency model?

slide-10
SLIDE 10

Our Prior Work: Microarchitectural Consistency Verification

Axiom "StoreBuffer_is_in_order": ... EdgeExists ((i1, SB_Enter), (i2, SB_Enter)) => AddEdge ((i1, SB_Exit), (i2, SB_Exit)).

Axiom "PO_Fetch": ... SameCore i1 i2 /\ ProgramOrder i1 i2 => AddEdge ((i1, Fetch), (i2, Fetch)).

Microarchitecture in µspec DSL; Litmus Test

Each axiom specifies an ordering that µarch should respect
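
One way to read the two axioms above is as rules that add edges to a graph whose nodes are (instruction, pipeline location) pairs; an execution that forces a cycle in that graph cannot happen. The sketch below is only an illustration of that reading. The Python encoding, the function names, and the two-store example are assumptions and are not part of µspec or of the Check tools.

```python
from itertools import product

# Hypothetical encoding: a node is (instruction name, pipeline location).

def po_fetch(instrs, edges):
    """PO_Fetch: same-core instructions are fetched in program order."""
    for a, b in product(instrs, repeat=2):
        if a["core"] == b["core"] and a["idx"] < b["idx"]:
            edges.add(((a["name"], "Fetch"), (b["name"], "Fetch")))

def store_buffer_in_order(instrs, edges):
    """StoreBuffer_is_in_order: SB_Enter order implies the same SB_Exit order."""
    for a, b in product(instrs, repeat=2):
        if ((a["name"], "SB_Enter"), (b["name"], "SB_Enter")) in edges:
            edges.add(((a["name"], "SB_Exit"), (b["name"], "SB_Exit")))

def has_cycle(edges):
    """Depth-first search for a cycle in the happens-before graph."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "visiting":
            return True
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        if any(visit(nxt) for nxt in graph[node]):
            return True
        state[node] = "done"
        return False

    return any(visit(node) for node in graph)

# Two stores on core 0; i1 is assumed to enter the store buffer first.
instrs = [{"name": "i1", "core": 0, "idx": 0},
          {"name": "i2", "core": 0, "idx": 1}]
edges = {(("i1", "SB_Enter"), ("i2", "SB_Enter"))}
po_fetch(instrs, edges)
store_buffer_in_order(instrs, edges)
consistent = not has_cycle(edges)

# An execution where i2 leaves the store buffer before i1 contradicts the axiom.
violating = edges | {(("i2", "SB_Exit"), ("i1", "SB_Exit"))}
```

Here `consistent` is true for the in-order execution, while `has_cycle(violating)` reports the cycle created by the out-of-order exit.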

slide-14
SLIDE 14

Our Prior Work: Microarchitectural Consistency Verification

Microarchitectural happens-before (µhb) graphs

  • Microarch. verification checks that combination of axioms satisfies MCM

[http://check.cs.princeton.edu]

Higher-level verif. requires maintaining ordering axioms

Does RTL maintain microarchitectural orderings?

slide-19
SLIDE 19

RTL Verification is Maturing…

▪…but usually ignores memory consistency!
▪Often use SystemVerilog Assertions (SVA)

ISA-Formal [Reid et al. CAV 2016]

  • Instr. Operational Semantics
  • No MCM verification!

DOGReL [Stewart et al. DIFTS 2014]

  • Memory subsystem transactions
  • No multicore MCM verification!

Kami [Vijayaraghavan et al. CAV 2015] [Choi et al. ICFP 2017]

  • MCM correctness for all programs, but…
  • Needs Bluespec design and manual proofs!

Lack of automated memory consistency verification at RTL!

slide-20
SLIDE 20

RTLCheck: Verifying Consistency Orderings at RTL

[Flow diagram: RTL Design; µspec Microarch. Axioms; Litmus Test; Mapping Functions; RTLCheck; Temporal SystemVerilog Assertions (SVA); JasperGold (RTL Verifier); Proven?]

slide-21
SLIDE 21

RTLCheck: Verifying Consistency Orderings at RTL

User-provided mapping functions translate microarch. primitives to RTL equivalents

slide-22
SLIDE 22

RTLCheck: Verifying Consistency Orderings at RTL

RTLCheck automatically translates µarch. ordering axioms to temporal properties

slide-23
SLIDE 23

RTLCheck: Verifying Consistency Orderings at RTL

Properties may be proven or counterexample found
slide-26
SLIDE 26

Meaning can be Lost in Translation!

[Image: Barbara Younger] [Inspiration: Tae Jun Ham]

小心地滑

(Caution: Slippery Floor)

slide-31
SLIDE 31

RTLCheck: Verifying Consistency at RTL

Axiomatic Microarch. Verification: abstract nodes and happens-before edges

Temporal RTL Verification (SVA, etc.): concrete signals and clock cycles

[Figure: waveform over clock cycles 2 to 7 showing clk, Core[0].DX, Core[0].WB, Core[0].SData, Core[1].DX, Core[1].WB, and Core[1].LData for St x, St y, Ld y, and Ld x, with data values 0x1]

Axiomatic/Temporal Mismatch!

slide-32
SLIDE 32

Instances of the Axiomatic/Temporal Mismatch

▪Outcome Filtering: enforcing particular outcome for litmus test

  • Discussed next

▪Mapping Individual Happens-Before Edges (detailed in paper)
▪Filtering Match Attempts (detailed in paper)

slide-36
SLIDE 36

mp (Message Passing)

Core 0        | Core 1
(i1) x = 1;   | (i3) r1 = y;
(i2) y = 1;   | (i4) r2 = x;

Outcome Filtering in Axiomatic Verification

▪Axiomatic models make outcome filtering easy and efficient

Outcome: r1 = 1, r2 = 1

Execution examined as a whole, so outcome can be enforced!

[Figure label: rf (reads-from) edge]
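
To make "execution examined as a whole" concrete, the following sketch enumerates every complete execution of mp and keeps only those that end in a chosen outcome. It assumes sequentially consistent interleavings, which is a simplification chosen for illustration; the tuple encoding and function names are hypothetical and are not how the axiomatic tools represent executions.

```python
from itertools import permutations

# mp litmus test as (core, kind, address, value-or-register) tuples.
PROGRAM = [
    (0, "st", "x", 1),     # (i1) x = 1
    (0, "st", "y", 1),     # (i2) y = 1
    (1, "ld", "y", "r1"),  # (i3) r1 = y
    (1, "ld", "x", "r2"),  # (i4) r2 = x
]

def respects_program_order(order):
    """Each core's operations must appear in the order the program lists them."""
    for core in (0, 1):
        positions = [PROGRAM.index(op) for op in order if op[0] == core]
        if positions != sorted(positions):
            return False
    return True

def run(order):
    """Execute one complete interleaving and return the outcome (r1, r2)."""
    mem = {"x": 0, "y": 0}
    regs = {}
    for _core, kind, addr, arg in order:
        if kind == "st":
            mem[addr] = arg
        else:
            regs[arg] = mem[addr]
    return regs["r1"], regs["r2"]

def executions_with_outcome(outcome):
    """Outcome filtering: keep only whole executions that end in `outcome`."""
    return [order for order in permutations(PROGRAM)
            if respects_program_order(order) and run(order) == outcome]

matching = executions_with_outcome((1, 1))   # the outcome on the slide
forbidden = executions_with_outcome((1, 0))  # no interleaving produces this
```

Because each candidate is a finished execution, the filter is a simple comparison on the final register values; nothing has to be predicted about future steps.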

slide-42
SLIDE 42

Outcome Filtering in Temporal Verification

▪Filtering executions by outcome requires expensive global analysis

  • Not done by many SVA verifiers, including JasperGold!

mp

Core 0        | Core 1
(i1) x = 1;   | (i3) r1 = y;
(i2) y = 1;   | (i4) r2 = x;

Is r1 = 1, r2 = 0 possible?

[Figure: tree of execution steps. Step 1: (i1) x = 1. Step 2: (i2) y = 1. Step 3: (i3) r1 = y = 1 or (i3) r1 = y = 0. Step 4: (i4) r2 = x = 1 or (i4) r2 = x = 0? Remaining branches shown as "…".]

Need to examine all possible paths from current step to end of execution: too expensive!

SVA Verifier Approximation: Only check if constraints hold up to current step

Makes Outcome Filtering impossible!

slide-45
SLIDE 45

µspec Verification Uses Outcome Filtering

mp

Core 0        | Core 1
(i1) x = 1;   | (i3) r1 = y;
(i2) y = 1;   | (i4) r2 = x;

SC Forbids: r1 = 1, r2 = 0

Axiom "Read_Values": Every load either reads BeforeAllWrites OR reads FromLatestWrite

Note: Axioms abstracted for brevity

No write for load to read from!

slide-46
SLIDE 46

µspec Verification Uses Outcome Filtering

Outcome Filtering leads to simpler axioms!

slide-51
SLIDE 51

Temporal Outcome Filtering Fails!

mp

Core 0        | Core 1
(i1) x = 1;   | (i3) r1 = y;
(i2) y = 1;   | (i4) r2 = x;

SC Forbids: r1 = 1, r2 = 0

BeforeAllWrites: Unless Load returns non-zero value, Load happens before all stores to its address

[Figure: waveform of clk, Core[0].Commit, Core[0].SData, Core[1].Commit, and Core[1].LData over cycles 1 to 6: St x (0x1) at cycle 3, St y (0x1) at cycle 4, Ld y (0x1) at cycle 5, Ld x (0x1) at cycle 6]

After 3 cycles: Store happens before load! Property Violated?

After 6 cycles: Load does not read 0. No Violation! But verifiers don’t check future cycles!

Note: Axioms/properties abstracted for brevity

slide-52
SLIDE 52

Temporal Outcome Filtering Fails!

Counterexample flagged despite hardware doing nothing wrong!
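
The spurious counterexample can be reproduced with a small model of a checker that only sees the trace up to the current cycle. This is a sketch under stated assumptions: the trace values are read off the waveform on the earlier slide, the property is the abstracted BeforeAllWrites statement for the load of x, and the function names are hypothetical. It does not model how JasperGold actually evaluates SVA.

```python
# Commit trace read off the waveform: cycle -> (operation, address, data).
TRACE = {
    3: ("St", "x", 1),
    4: ("St", "y", 1),
    5: ("Ld", "y", 1),
    6: ("Ld", "x", 1),
}

def before_all_writes_holds(events, addr):
    """Unless the load of addr returns non-zero, it precedes all stores to addr.

    `events` is a list of (cycle, (operation, address, data)) seen so far.
    """
    loads = [(c, d) for c, (op, a, d) in events if op == "Ld" and a == addr]
    stores = [c for c, (op, a, _) in events if op == "St" and a == addr]
    if loads and loads[0][1] != 0:
        return True  # escape clause: the load returned a non-zero value
    if not loads:
        # Load not seen yet: any store seen so far already precedes it.
        return not stores
    return all(loads[0][0] < c for c in stores)

def first_failing_cycle(trace, addr):
    """Evaluate the property on each prefix, as a step-by-step checker would."""
    ordered = sorted(trace.items())
    for cycle in sorted(trace):
        prefix = [(c, e) for c, e in ordered if c <= cycle]
        if not before_all_writes_holds(prefix, addr):
            return cycle
    return None

flagged_at = first_failing_cycle(TRACE, "x")
whole_trace_ok = before_all_writes_holds(sorted(TRACE.items()), "x")
```

The prefix-based check flags cycle 3, when the store to x has committed and the load has not, even though the complete trace satisfies the property once the load returns 0x1 at cycle 6.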

slide-53
SLIDE 53

Solution: Load Value Constraints

mp

Core 0        | Core 1
(i1) x = 1;   | (i3) r1 = y;
(i2) y = 1;   | (i4) r2 = x;

SC Forbids: r1 = 1, r2 = 0

Axiom "Read_Values": Every load either reads BeforeAllWrites OR reads FromLatestWrite

▪Don’t simplify axioms; translate all cases
▪Tag each case with appropriate load value constraints

  • reflect the data constraints required for edge(s)

Property to check: mapNode(Ld x → St x, Ld x == 0) or mapNode(St x → Ld x, Ld x == 1);

Note: Axioms and properties abstracted for brevity
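
The effect of tagging each case with its load value can be sketched as a disjunction of event sequences, where a prefix only fails once no case can still match. The encoding below is an assumption made for illustration: `CASES`, `status`, and the event tuples are hypothetical stand-ins for the abstracted property on the slide, not the SVA that RTLCheck generates.

```python
# Each case is an ordered event sequence on x with its load value constraint.
CASES = [
    [("Ld", "x", 0), ("St", "x", 1)],  # mapNode(Ld x -> St x, Ld x == 0)
    [("St", "x", 1), ("Ld", "x", 1)],  # mapNode(St x -> Ld x, Ld x == 1)
]

def status(observed, cases, addr="x"):
    """Return 'pass' if a case matched, 'pending' if one still can, else 'fail'."""
    relevant = [e for e in observed if e[1] == addr]
    if any(relevant == case for case in cases):
        return "pass"
    if any(relevant == case[:len(relevant)] for case in cases):
        return "pending"
    return "fail"

def statuses(trace, cases):
    """Evaluate the property after every event, without looking ahead."""
    return [status(trace[:n], cases) for n in range(1, len(trace) + 1)]

# The execution from the waveform: both stores commit, then both loads read 1.
good = [("St", "x", 1), ("St", "y", 1), ("Ld", "y", 1), ("Ld", "x", 1)]
# A genuinely bad execution: the load of x reads 0 after the store to x.
bad = [("St", "x", 1), ("St", "y", 1), ("Ld", "y", 1), ("Ld", "x", 0)]

good_statuses = statuses(good, CASES)
bad_statuses = statuses(bad, CASES)
```

With both cases kept, the store to x committing first no longer counts as a violation; the property stays pending until the load value decides which case applies.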

slide-57
SLIDE 57

Multi-V-scale: a Multicore Case Study

[Figure: Core 0, Core 1, Core 2, and Core 3, each with an IF, DX, WB pipeline, connected through an Arbiter to Memory]

slide-58
SLIDE 58

Multi-V-scale: a Multicore Case Study

3-stage in-order pipelines

slide-59
SLIDE 59

Multi-V-scale: a Multicore Case Study

Arbiter enforces that only one core can access memory at any time

slide-60
SLIDE 60

Bug Discovered in V-scale

▪ V-scale memory internally writes stores to wdata register
▪ wdata pushed to memory when subsequent store occurs
▪ Akin to single-entry store buffer
▪ When two stores are sent to memory in successive cycles, first of two stores is dropped by memory!
▪ Fixed bug by eliminating wdata
▪ V-scale has since been deprecated by RISC-V Foundation

[Figure: Core 0 to Core 3 (IF, DX, WB pipelines) connected through the Arbiter to Memory, which contains the wdata register and the mem array; stores x = 1 and y = 1 shown]
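
The symptom described above can be reproduced with a toy model. The slides do not show the RTL, so everything about the timing here is an assumption: store data is taken to arrive one cycle after the store request, and the push of wdata to the array is taken to use the register's old value. `ToyMemory` and its fields are hypothetical names; this is one plausible way such a drop could arise, not the actual V-scale design.

```python
class ToyMemory:
    """Toy memory with a single wdata register (hypothetical, not V-scale RTL)."""

    def __init__(self):
        self.array = {}        # the mem array
        self.held_addr = None  # address of the store currently held in wdata
        self.wdata = 0         # single-entry store register
        self.incoming = None   # data that will be latched on the next cycle

    def step(self, request=None):
        """Advance one cycle; `request` is None or a store (address, data)."""
        if request is not None and self.held_addr is not None:
            # Assumed behavior: push the OLD wdata when the next store arrives.
            self.array[self.held_addr] = self.wdata
        if self.incoming is not None:
            # Data of the previous cycle's store is latched only now.
            self.wdata = self.incoming
            self.incoming = None
        if request is not None:
            self.held_addr, self.incoming = request

    def read(self, addr):
        """Read with bypass from wdata for the store still held there."""
        if addr == self.held_addr and self.incoming is None:
            return self.wdata
        return self.array.get(addr, 0)

def run(requests):
    mem = ToyMemory()
    for request in requests:
        mem.step(request)
    mem.step(None)  # let the last store's data reach wdata
    return mem

# Stores x = 1 and y = 1 in successive cycles: the first store is lost.
back_to_back = run([("x", 1), ("y", 1)])
# The same stores with an idle cycle between them: both are visible.
spaced = run([("x", 1), None, ("y", 1)])
```

In the back-to-back run, `back_to_back.read("x")` still returns the initial value 0, whereas the spaced run returns 1 for both addresses.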

slide-63
SLIDE 63

Results: Time to Verification

▪Two configurations (Hybrid and Full_Proof), avg. runtime 6.2 hrs

  • See paper for configuration details

[Figure: bar chart of Time (hours), axis from 2 to 12, for Hybrid and Full_Proof on each litmus test: safe006, lb, safe007, mp, safe022, safe010, ssl, safe000, safe008, n4, n5, co-mp, safe001, wrc, sb, safe018, podwr000, safe003, mp+staleld, safe012, safe002, safe014, iwp23b, safe009, safe029, safe027, rwc, n2, rfi013, safe030, safe011, rfi015, rfi003, safe021, iriw, n7, iwp24, podwr001, safe017, rfi012, n6, safe019, rfi001, rfi000, rfi011, safe026, safe004, safe016, rfi002, rfi005, rfi014, rfi004, rfi006, n1, amd3, co-iriw, Mean]

slide-64
SLIDE 64

Results: Time to Verification

Verified very quickly through covering traces (details in paper)

slide-65
SLIDE 65

Results: Time to Verification

Max runtime 11 hours (if some properties unproven)

slide-67
SLIDE 67

Results: Proven Properties

▪Full_Proof generally better (90%/test) than Hybrid (81%/test)
▪On average, Full_Proof can prove more properties in same time

[Figure: bar chart of % Proven Properties, axis from 10 to 100, for Hybrid and Full_Proof on each litmus test: safe006, lb, safe007, safe000, n4, safe011, safe016, safe030, rfi000, safe017, safe019, safe004, safe021, rfi011, rfi006, n1, rfi012, n7, co-iriw, rfi005, safe002, n2, iriw, rfi002, safe012, rfi003, safe003, safe014, safe001, iwp24, rfi015, rfi001, safe026, safe027, podwr001, safe008, rfi014, n6, n5, wrc, safe018, rwc, safe009, rfi004, amd3, mp+staleld, rfi013, mp, safe022, safe010, ssl, co-mp, sb, podwr000, iwp23b, safe029, Mean]

Hybrid better for only a few tests

slide-70
SLIDE 70

MCM Verification: The Big Picture

Requires Bluespec design and manual proof

slide-71
SLIDE 71

MCM Verification: The Big Picture

[Figure: verification stack with layers High-Level Languages (HLL), Compiler, OS, Architecture, Microarchitecture, and Processor RTL, annotated with related work: [Batty et al. POPL 2012], [Sarkar et al. PLDI 2011], [Vafeiadis et al. PLDI 2017], [Alglave et al. TOPLAS 2014], [PipeCheck, MICRO-47], [COATCheck, ASPLOS 2016], [TriCheck, ASPLOS 2017], [CCICheck, MICRO-48], [Vijayaraghavan et al. CAV 2015], [Choi et al. ICFP 2017], [RTLCheck, MICRO-50]]

Higher-level tools directly or indirectly assume correctness of underlying RTL!

  • RTLCheck validates RTL against µarch, filling µarch-RTL verification gap!
  • Automated MCM verification of arbitrary RTL for suite of litmus tests
slide-72
SLIDE 72

▪RTLCheck: Automated MCM Verification of arbitrary RTL against arbitrary microarchitectural orderings

  • Translates microarch. axioms into equivalent temporal SVA properties
  • Allows RTL to be validated against µarch ordering specification
  • Capable of handling arbitrary ISA-level MCMs (SC, TSO, ARM, Power,…)
  • Most of generated properties proven by JasperGold in minutes or hours

▪RTLCheck enables full-stack HLL-to-RTL MCM verification (with rest of Check suite) across a collection of litmus tests

Conclusions

Code available at https://github.com/ymanerka/rtlcheck
