PipeProof: Automated Memory Consistency Proofs for Microarchitectural Specifications



slide-1
SLIDE 1

Yatin A. Manerkar, Daniel Lustig*, Margaret Martonosi, and Aarti Gupta

PipeProof: Automated Memory Consistency Proofs for Microarchitectural Specifications

http://check.cs.princeton.edu/

Princeton University *NVIDIA MICRO-51

slide-5
SLIDE 5

Memory Consistency Models (MCMs)

▪Specify rules governing values returned by loads in parallel programs
▪MCM must be correctly implemented for all possible programs

[Figure: the ISA-Level MCM (x86-TSO, Power, ARMv8, etc) shown with the Compiler and the Microarchitecture. It is a target for compilers… and a specification that microarchitecture must implement.]

slide-6
SLIDE 6

Memory Consistency Models (MCMs)

▪Specify rules governing values returned by loads in parallel programs
▪MCM must be correctly implemented for all possible programs

[Figure: Compiler and Microarchitecture, labelled "Target for compilers…" and "…and a specification that microarchitecture must implement", with "???".]

slide-7
SLIDE 7

[Images: HeeWann Kim, tzblacktd, audino]

The Infinite Forest

slide-8
SLIDE 8

The Infinite Forest

Forest goes on forever (infinite number of possible programs)

slide-9
SLIDE 9

The Infinite Forest

Can check known hideouts (verify design for test programs)

slide-10
SLIDE 10

The Infinite Forest

Are Pokemon lurking in unexplored areas? (Do tested programs provide complete coverage?)

slide-11
SLIDE 11

The Infinite Forest

Have we caught all the Pokemon? (Are there any MCM bugs left in the design?)

slide-12
SLIDE 12

PipeProof Overview

[Diagram: µarch and ISA MCM Specs + Auxiliary Inputs → PipeProof → All-Program MCM Correctness Proof!]

▪First automated all-program microarchitectural MCM verification!

  • Covers all possible addresses, values, numbers of cores

▪Proof methodology based on automatic abstraction refinement
▪Early-stage: Can be conducted before RTL is written!

slide-13
SLIDE 13

Outline

▪Background

  • ISA-level MCM specs
  • Microarchitectural ordering specs

▪Microarchitectural Correctness Proof

  • Transitive Chain (TC) Abstraction

▪Overall PipeProof Operation

  • TC Abstraction Support Proof
  • Chain Invariants

▪Results

slide-14
SLIDE 14

ISA-Level MCM Specifications

▪Defined in terms of relational patterns [Alglave et al. TOPLAS 2014]
▪ISA-level executions are graphs

  • Nodes: instructions, edges: ISA-level relations between instrs

▪Correctness based on acyclicity, irreflexivity, etc of relational patterns

  • Eg: SC is acyclic(po ∪ co ∪ rf ∪ fr)

[Figure: the message passing (mp) litmus test and an ISA-level execution of mp. Instructions: (i1) [x] ← 1, (i2) [y] ← 1, (i3) r1 ← [y], (i4) r2 ← [x]. Edges: po, rf, po, fr.]
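
The acyclicity condition for SC can be sketched in a few lines of Python. This is only an illustration, not PipeProof code: the edge endpoints below assume the usual mp execution in which i3 reads the value written by i2 (rf) and i4 reads the initial value of [x] (fr to i1), and `is_acyclic` is a hypothetical helper name.

```python
from collections import defaultdict

def is_acyclic(edges):
    """Return True if the directed graph given as (src, dst) pairs has no cycle."""
    graph = defaultdict(list)
    nodes = set()
    for src, dst in edges:
        graph[src].append(dst)
        nodes.update((src, dst))
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the DFS stack, finished
    colour = {n: WHITE for n in nodes}

    def visit(n):
        colour[n] = GREY
        for m in graph[n]:
            if colour[m] == GREY:         # back edge: a cycle exists
                return False
            if colour[m] == WHITE and not visit(m):
                return False
        colour[n] = BLACK
        return True

    return all(colour[n] != WHITE or visit(n) for n in nodes)

# Assumed ISA-level execution of mp (endpoints are an assumption, see above).
po = {("i1", "i2"), ("i3", "i4")}   # program order on each core
co = set()                          # no two stores to the same address
rf = {("i2", "i3")}                 # i3 reads the value written by i2
fr = {("i4", "i1")}                 # i4 reads [x] before i1's write

sc_allows = is_acyclic(po | co | rf | fr)
print("SC allows this execution:", sc_allows)
```

Under these assumptions the four edges form the cycle i1 → i2 → i3 → i4 → i1, so the check reports that SC forbids the execution.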

slide-19
SLIDE 19

Microarchitectural Ordering Specifications

▪Set of axioms in µspec DSL [Lustig et al. ASPLOS 2016]
▪Used to generate microarchitectural executions as µhb graphs

  • Nodes: instr. sub-events, edges: happens-before relations between instrs

▪Observability based on cyclicity of graphs

  • Cyclic graph → Unobservable
  • Acyclic graph → Observable

[Figure: the message passing (mp) litmus test and a µhb graph of mp on simpleSC. Columns: instructions (i1), (i2), (i3), (i4); rows: stages IF, EX, WB; ISA-level edges po, rf, po, fr.]
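
The cyclic/acyclic rule can be sketched as a topological-sort check over (instruction, stage) nodes. This is a minimal illustration only: the edges below are hypothetical stand-ins for what the simpleSC axioms might generate (in particular, which stages the rf and fr edges connect is an assumption), and `has_cycle` is not a PipeProof function.

```python
from collections import defaultdict, deque

STAGES = ["IF", "EX", "WB"]

def has_cycle(edges):
    """Kahn's algorithm: the graph is cyclic iff a topological sort cannot consume every node."""
    succ = defaultdict(set)
    indeg = defaultdict(int)
    nodes = set()
    for src, dst in edges:
        nodes.update((src, dst))
        if dst not in succ[src]:
            succ[src].add(dst)
            indeg[dst] += 1
    ready = deque(n for n in nodes if indeg[n] == 0)
    seen = 0
    while ready:
        n = ready.popleft()
        seen += 1
        for m in succ[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                ready.append(m)
    return seen != len(nodes)

edges = set()
for i in ["i1", "i2", "i3", "i4"]:
    edges.add(((i, "IF"), (i, "EX")))      # each instruction flows IF -> EX -> WB
    edges.add(((i, "EX"), (i, "WB")))
for a, b in [("i1", "i2"), ("i3", "i4")]:  # assumed: in-order pipeline keeps po at every stage
    for s in STAGES:
        edges.add(((a, s), (b, s)))
rf_edge = (("i2", "WB"), ("i3", "EX"))     # assumed placement of the rf edge
fr_edge = (("i4", "EX"), ("i1", "WB"))     # assumed placement of the fr edge
edges |= {rf_edge, fr_edge}

observable = not has_cycle(edges)
print("mp outcome observable on this sketch:", observable)
```

With these assumed edges the graph contains the cycle (i1, WB) → (i2, WB) → (i3, EX) → (i4, EX) → (i1, WB), so the execution is unobservable.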

slide-28
SLIDE 28

Our Prior Work: Litmus Test-Based MCM Verification

Microarchitectural happens-before (µhb) graphs

Microarchitecture in µspec DSL; Litmus Test

Axiom "Decode_is_FIFO":
  ...
  EdgeExists ((i1, Decode), (i2, Decode)) =>
  AddEdge ((i1, Execute), (i2, Execute)).
...
Axiom "PO_Fetch":
  ...
  SameCore i1 i2 /\ ProgramOrder i1 i2 =>
  AddEdge ((i1, Fetch), (i2, Fetch)).

ISA-Level Outcome   Observable (≥ 1 Graph Acyclic)   Not Observable (All Graphs Cyclic)
Allowed             OK                               OK (stricter than necessary)
Forbidden           Consistency violation!           OK

[Lustig et al. MICRO-47, …]

Perennial Question: “Do your litmus tests cover all possible MCM bugs?”
How to automatically prove correctness for all programs?
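
The outcome table for a single litmus test can be written as a small decision function. This is a sketch under stated assumptions: `classify` and its arguments are hypothetical names, and the per-graph cyclicity results are taken as given rather than computed from µspec axioms.

```python
from typing import Sequence

def classify(isa_allowed: bool, graph_is_cyclic: Sequence[bool]) -> str:
    """Compare the ISA-level verdict with microarchitectural observability.

    graph_is_cyclic holds one flag per µhb graph generated for the outcome;
    the outcome is observable if at least one graph is acyclic.
    """
    observable = any(not cyclic for cyclic in graph_is_cyclic)
    if observable and not isa_allowed:
        return "Consistency violation!"
    if not observable and isa_allowed:
        return "OK (stricter than necessary)"
    return "OK"

# A forbidden outcome with one acyclic graph is a bug.
print(classify(isa_allowed=False, graph_is_cyclic=[True, False]))
# A forbidden outcome whose graphs are all cyclic is fine.
print(classify(isa_allowed=False, graph_is_cyclic=[True, True]))
```

The function only restates the table. It says nothing about programs outside the litmus suite, which is the gap the perennial question points at.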

slide-30
SLIDE 30

The Transitive Chain (TC) Abstraction

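[Figure: all non-unary cycles containing fr (cycles through i1, i2; i1, i2, i3; i1, i2, i3, i4; … built from po, co, rf, and fr edges) are represented by one abstract cycle: i1 and in joined by r1…n-1 and closed by an fr edge.]

r1…n-1: Transitive chain (sequence) of ISA-level edges
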
slide-31
SLIDE 31

The Transitive Chain (TC) Abstraction

[Figure: all non-unary cycles containing fr (cycles through i1, i2; i1, i2, i3; i1, i2, i3, i4; … built from po, co, rf, and fr edges) are represented by one abstract cycle: i1 and in joined by r1…n-1 and closed by an fr edge.]

Using TC Abstraction

[Figure: µhb graphs (stages IF, EX, WB) over i1 and in, with the fr edge and some µhb edge from i1 to in (transitive connection).]

slide-35
SLIDE 35

The Transitive Chain (TC) Abstraction

Using TC Abstraction

[Figure: Infinite: the ISA-level cycles containing fr (through i1, i2; i1, i2, i3; i1, i2, i3, i4; and so on, built from po, co, rf, and fr edges). Finite!: a fixed set of µhb graphs (stages IF, EX, WB) over i1 and in, each with the fr edge and one transitive connection.]

Soundness verified as a supporting proof!
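
One way to see why the abstract side is finite: if a transitive connection is taken to be a µhb edge from some node of i1 to some node of in, then the number of cases depends only on the number of pipeline stages, not on the chain length. The sketch below enumerates the cases under that assumption; `transitive_connections` is a hypothetical helper, not PipeProof's API.

```python
from itertools import product

STAGES = ("IF", "EX", "WB")  # pipeline stages named on the slides

def transitive_connections(stages=STAGES):
    """Every candidate µhb edge from a node of i1 to a node of in."""
    return [(("i1", s), ("in", t)) for s, t in product(stages, repeat=2)]

cases = transitive_connections()
print(len(cases), "candidate transitive connections")
for src, dst in cases:
    print(src, "->", dst)
```

For three stages this gives nine candidate connections, however many instructions the underlying ISA-level chain contains.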

slide-39
SLIDE 39

Microarchitectural Correctness Proof

[Flowchart: ISA-level cycles are grouped by an edge they contain (cycles containing fr, cycles containing po, other ISA-level cycles…). In each group, i1 and in are joined by some µhb edge from i1 to in (transitive connection), and all possible transitive connections are considered. Each resulting µhb graph (stages IF, EX, WB) is marked either ✓ NoDecomp or ? AbsCounterX; other transitive connections follow the same pattern.]

Acyclic graph with transitive connection => Abstract Counterexample (i.e. possible bug)

slide-40
SLIDE 40

Microarchitectural Correctness Proof

[Flowchart: ISA-level cycles are grouped by an edge they contain (cycles containing fr, cycles containing po, other ISA-level cycles…). In each group, i1 and in are joined by some µhb edge from i1 to in (transitive connection), and all possible transitive connections are considered. Each resulting µhb graph (stages IF, EX, WB) is marked either ✓ NoDecomp or ? AbsCounterX; other transitive connections follow the same pattern.]

Transitive connection may represent one or multiple ISA-level edges

slide-43
SLIDE 43

Microarchitectural Correctness Proof

[Flowchart, in text form]

▪ISA-level cycles are grouped by an edge they contain: cycles containing fr, cycles containing po, other ISA-level cycles…
▪In each group, i1 and in are joined by some µhb edge from i1 to in (transitive connection); all possible transitive connections are considered
▪Each resulting µhb graph (stages IF, EX, WB) is marked ✓ NoDecomp or ? AbsCounterX; other transitive connections follow the same pattern
▪For an AbsCounterX: try to Concretize (replace transitive connection with one ISA-level edge)

  • Observable: Microarch Buggy, Return Counterexample
  • Unobs.: Consider all Decompositions (inductively break down Transitive Chain) and check each again

▪“Refinement Loop”: decomposed cases go back through the same checks
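
The control flow of the refinement loop can be sketched as a recursive procedure. Everything here is a hypothetical stand-in chosen to make the sketch runnable: a case is a pair (ISA edges modelled so far, whether a transitive connection remains), and `make_oracle` returns a toy cyclicity oracle in place of a real µhb graph check.

```python
ISA_EDGES = ("po", "co", "rf", "fr")

def concretize(case):
    """Replace the transitive connection with one ISA-level edge."""
    edges, _ = case
    return [(edges + (e,), False) for e in ISA_EDGES]

def decompose(case):
    """Peel one ISA-level edge off the chain; a shorter chain remains abstract."""
    edges, _ = case
    return [(edges + (e,), True) for e in ISA_EDGES]

def make_oracle(buggy_chain=None):
    """Toy stand-in for the µhb cyclicity check (NOT a real microarchitecture)."""
    def is_cyclic(case):
        edges, abstract = case
        if abstract:
            return len(edges) >= 2      # assumed: one peeled edge adds enough constraints
        return edges != buggy_chain     # assumed: only `buggy_chain` is observable
    return is_cyclic

def prove(case, is_cyclic, depth=0, max_depth=10):
    """Return None if the case is proven, else a concrete counterexample."""
    if is_cyclic(case):                 # NoDecomp: abstract graph already cyclic
        return None
    for concrete in concretize(case):   # AbsCounterX: try to concretize
        if not is_cyclic(concrete):
            return concrete             # observable forbidden execution
    if depth == max_depth:
        raise RuntimeError("no termination; a chain invariant may be needed")
    for sub in decompose(case):         # refinement: check every decomposition
        cex = prove(sub, is_cyclic, depth + 1, max_depth)
        if cex is not None:
            return cex
    return None

start = (("fr",), True)                 # cycles containing fr, chain still abstract
print(prove(start, make_oracle()))
print(prove(start, make_oracle(buggy_chain=("fr", "rf"))))
```

With the correct toy oracle the procedure returns None (proven). With the buggy one it returns the concrete chain that the oracle treats as observable. The depth bound is only a guard for the sketch.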

slide-45
SLIDE 45

Concretization

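Concretization: Replace transitive connection with single ISA-level edge

[Figure: the abstract counterexample (? AbsCounterX), a µhb graph over stages IF, EX, WB with nodes p, q, r on instructions i1 and in and an fr edge, is concretized in two ways: the transitive connection is replaced by a single rf edge, or by a single po edge.]

▪All concretizations must be unobservable
▪Observable concretizations are counterexamples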

slide-50
SLIDE 50

Decomposition

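Decomposition: Inductively break down transitive chain
(Chain of length n == Chain of length n-1 + single “peeled-off” edge)

[Figure: the abstract counterexample (? AbsCounterX), a µhb graph over stages IF, EX, WB with nodes p, q, r on instructions i1 and in and an fr edge, is decomposed in two ways: one adds instruction in-1 with a peeled-off rf edge, the other adds instruction i2 with a peeled-off co edge. One decomposition is marked ✓ and the other ?.]

▪Additional instruction and ISA-level edge modelled => extra constraints

  • May be enough to make execution unobservable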

slide-52
SLIDE 52

PipeProof Block Diagram

[Block diagram: the inputs (Microarchitecture Ordering Spec., ISA-Level MCM Spec., ISA Edge -> Microarch. Mapping, Chain Invariants) feed PipeProof's stages (Proof of Chain Invariants, Transitive Chain Abstraction Support Proof, Microarch. Correctness Proof, Cex. Generation), with Pass/Fail outcomes on the stages.]

Result: All-Program MCM Correctness Proof? Counterexample found?

slide-54
SLIDE 54

PipeProof Block Diagram

[Block diagram: the inputs (Microarchitecture Ordering Spec., ISA-Level MCM Spec., ISA Edge -> Microarch. Mapping, Chain Invariants) feed PipeProof's stages (Proof of Chain Invariants, Transitive Chain Abstraction Support Proof, Microarch. Correctness Proof, Cex. Generation), with Pass/Fail outcomes on the stages.]

Result: All-Program MCM Correctness Proof? Counterexample found?

ISA Edge -> Microarch. Mapping: Links ISA-level and µarch executions

slide-55
SLIDE 55

PipeProof Block Diagram

[Block diagram: the inputs (Microarchitecture Ordering Spec., ISA-Level MCM Spec., ISA Edge -> Microarch. Mapping, Chain Invariants) feed PipeProof's stages (Proof of Chain Invariants, Transitive Chain Abstraction Support Proof, Microarch. Correctness Proof, Cex. Generation), with Pass/Fail outcomes on the stages.]

Result: All-Program MCM Correctness Proof? Counterexample found?

Chain Invariants: Represent repeated ISA-level patterns

slide-56
SLIDE 56

PipeProof Block Diagram

[Block diagram: the inputs (Microarchitecture Ordering Spec., ISA-Level MCM Spec., ISA Edge -> Microarch. Mapping, Chain Invariants) feed PipeProof's stages (Proof of Chain Invariants, Transitive Chain Abstraction Support Proof, Microarch. Correctness Proof, Cex. Generation), with Pass/Fail outcomes on the stages.]

Result: All-Program MCM Correctness Proof? Counterexample found?

If design can’t be verified, a counterexample (a forbidden execution that is observable) is often returned

slide-57
SLIDE 57

PipeProof Block Diagram

[Block diagram: the inputs (Microarchitecture Ordering Spec., ISA-Level MCM Spec., ISA Edge -> Microarch. Mapping, Chain Invariants) feed PipeProof's stages (Proof of Chain Invariants, Transitive Chain Abstraction Support Proof, Microarch. Correctness Proof, Cex. Generation), with Pass/Fail outcomes on the stages.]

Result: All-Program MCM Correctness Proof? Counterexample found?

Supporting proofs provide foundation for correctness proof

slide-58
SLIDE 58

Transitive Chain (TC) Abstraction Support Proof

▪Ensure that ISA-level pattern and µarch. support TC Abstraction
▪Base case: Do initial ISA-level edges guarantee connection?
▪Inductive case: Extend transitive chain => extend transitive connection?

[Figure, base case: four µhb graphs (stages IF, EX, WB) over i1 and i2, one each for a po, rf, fr, and co edge. Inductive case: i1 and in joined by some transitive connection, plus an ISA-level edge rn to in+1, should give some transitive connection between i1 and in+1.]

slide-61
SLIDE 61

Chain Invariants

▪Abstractly represent repeated ISA-level patterns
▪Sometimes needed for refinement loop to terminate
▪Inductively proven by PipeProof before their use in proof algorithms
▪Example: checking for edge from i1 to i5 (TC abstraction support proof)

[Figure: Repeating ISA-Level Pattern. The abstract counterexample (instructions i1, i3, i4, i5 with fr and po edges) is decomposed into a graph with an added instruction i2 and one more po edge.]

Can continue decomposing in this way forever!

slide-62
SLIDE 62

Chain Invariants

▪Abstractly represent repeated ISA-level patterns
▪Sometimes needed for refinement loop to terminate
▪Inductively proven by PipeProof before their use in proof algorithms
▪Example: checking for edge from i1 to i5 (TC abstraction support proof)

[Figure: Chain Invariant Applied. The abstract counterexample (i1, i3, i4, i5 with fr and po edges) and its decomposition (added i2 and one more po edge) are replaced by a graph over i1, i4, i2, i5 with edges fr and po_plus.]

  • po_plus = arbitrary number of repetitions of po
  • Next edge peeled off will be something other than po
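
The effect of a po_plus invariant on a chain of edge labels can be sketched as collapsing each run of po edges into one abstract edge. This only shows the representation: `apply_chain_invariant` is a hypothetical helper, and it does not model the inductive proof that PipeProof carries out before an invariant is used.

```python
from itertools import groupby

def apply_chain_invariant(chain, repeated="po"):
    """Collapse every run of `repeated` edges into a single '<repeated>_plus' edge."""
    out = []
    for label, run in groupby(chain):
        count = len(list(run))
        if label == repeated:
            out.append(f"{repeated}_plus")   # stands for the whole run
        else:
            out.extend([label] * count)      # other edges are kept as they are
    return out

print(apply_chain_invariant(["fr", "po", "po", "po"]))
print(apply_chain_invariant(["po", "rf", "po"]))
```

Because a run of po edges of any length maps to the same po_plus edge, peeling further po edges no longer creates new cases, which is why the next peeled-off edge is something other than po.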

slide-63
SLIDE 63

In the paper…

▪Optimizations

  • Covering Sets: Eliminate redundant transitive connections
  • Memoization: Eliminate redundant ISA-level cycles

▪Inductive ISA edge generation
▪Adequate Model Over-Approximation

  • Needed to ensure soundness of PipeProof’s abstraction-based approach

▪…and more!

slide-64
SLIDE 64

Results

▪Ran PipeProof on simpleSC (SC) and simpleTSO (TSO) µarches

  • 3-stage in-order pipelines

▪Proved correctness of both microarchitectures for all programs

  • With optimizations, runtimes < 1 hour!

Total Time    Base         w/ Covering Sets + Memoization
simpleSC      225.9 sec    19.1 sec
simpleTSO     Timeout      2449.7 sec (≈ 41 mins)
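
The reported runtimes can be put in perspective with a little arithmetic. The snippet uses only the numbers on the slide; the variable names are illustrative.

```python
base_sc = 225.9      # simpleSC, base algorithm (seconds)
opt_sc = 19.1        # simpleSC, with Covering Sets + Memoization (seconds)
opt_tso = 2449.7     # simpleTSO, with Covering Sets + Memoization (seconds)

speedup_sc = base_sc / opt_sc
tso_minutes = opt_tso / 60

print(f"simpleSC speedup from the optimizations: {speedup_sc:.1f}x")
print(f"simpleTSO optimized runtime: {tso_minutes:.1f} min")
```

No speedup can be computed for simpleTSO, since the base run timed out.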

slide-69
SLIDE 69

Conclusions

▪PipeProof: Automated All-Program Microarchitectural MCM Verification

  • Designers no longer need to choose between completeness and automation

▪Transitive Chain Abstraction allows inductive modelling and verification of the infinite set of all possible executions

  • Abstraction is automatically refined as necessary to prove correctness

▪Verified simple microarchitectures implementing SC and TSO in < 1 hour!

Code available at https://github.com/ymanerka/pipeproof

[Image: Napish]

slide-70
SLIDE 70

Yatin A. Manerkar, Daniel Lustig*, Margaret Martonosi, and Aarti Gupta

PipeProof: Automated Memory Consistency Proofs for Microarchitectural Specifications

http://check.cs.princeton.edu/

Code available at https://github.com/ymanerka/pipeproof

slide-74
SLIDE 74

Covering Sets Optimization

▪ Must verify across all possible transitive connections
▪ Each decomposition creates a new set of transitive connections

  • Can quickly lead to a case explosion

▪ The Covering Sets Optimization eliminates redundant transitive connections

[Figure: two µhb graphs, A and B, over stages IF, EX, WB, each with nodes x, y, z on instructions i1 and in and an fr edge.]

▪ Graph A has an edge from x→z (tran conn.)
▪ Graph B has edges from y→z (tran conn.) and x→z (by transitivity)
▪ Correctness of A => Correctness of B (since B contains A’s tran conn.)
▪ Checking B explicitly is redundant!
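
The redundancy argument can be sketched as a filter over candidate transitive connections. The sketch assumes that every graph contains the intra-instruction edges IF → EX → WB for both i1 and in, so a connection from a later stage of i1 (or to an earlier stage of in) implies the weaker ones by transitivity. `implies` and `covering_set` are hypothetical helpers, not PipeProof's implementation.

```python
from itertools import product

STAGE_ORDER = {"IF": 0, "EX": 1, "WB": 2}

def implies(b, a):
    """True if a graph with connection b also contains connection a by transitivity.

    A connection is (stage of i1, stage of in). Assumes IF -> EX -> WB edges
    exist within each instruction.
    """
    (b_src, b_dst), (a_src, a_dst) = b, a
    return (STAGE_ORDER[a_src] <= STAGE_ORDER[b_src]
            and STAGE_ORDER[b_dst] <= STAGE_ORDER[a_dst])

def covering_set(connections):
    """Drop every connection B for which another connection A is contained in B's graph."""
    conns = set(connections)
    return {b for b in conns
            if not any(a != b and implies(b, a) for a in conns)}

all_conns = set(product(STAGE_ORDER, repeat=2))
print(covering_set(all_conns))
print(covering_set({("IF", "EX"), ("EX", "WB")}))
```

In the first call, every connection implies (IF, WB), so only that case needs to be checked. In the second, neither connection implies the other, so both remain.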

slide-79
SLIDE 79

Memoization Optimization

▪Base PipeProof algorithm examines some cycles multiple times
▪Memoization eliminates redundant checks of cycles that have already been verified

[Figure: the ISA-level cycle over i1, i2, i3, i4 with edges fr, rf, po, po is covered by three abstract cases: cycles containing fr, cycles containing rf, and cycles containing po, each a µhb graph (stages IF, EX, WB) over i1 and in with some transitive connection.]

Same cycle is checked 3 times!

Procedure: If all ISA-level cycles containing edge ri have been checked, do not peel off ri edges when checking subsequent cycles
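
The procedure can be sketched as bookkeeping over edge types. The order in which edge types are processed and the names `edges_to_peel`, `fully_checked`, and `schedule` are assumptions made for illustration; the verification of each group of cycles is left out.

```python
ISA_EDGES = ("fr", "po", "rf", "co")   # processing order is an assumption

def edges_to_peel(isa_edges, fully_checked):
    """Edge types that may still be peeled off during decomposition."""
    return [e for e in isa_edges if e not in fully_checked]

fully_checked = set()
schedule = {}
for edge in ISA_EDGES:
    # While verifying all cycles containing `edge`, skip edge types whose
    # cycles have already been verified in full.
    schedule[edge] = edges_to_peel(ISA_EDGES, fully_checked)
    fully_checked.add(edge)            # all cycles containing `edge` are now checked

for edge, peelable in schedule.items():
    print(f"cycles containing {edge}: may peel off {peelable}")
```

Each later group peels off fewer edge types, so a cycle such as the fr, rf, po, po example is examined once rather than once per edge type it contains.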

slide-80
SLIDE 80

The Adequate Model Over-Approximation

▪Addition of an instruction can make unobservable execution observable!
▪Need to work with over-approximation of microarchitectural constraints
▪PipeProof sets all exists clauses to true as its over-approximation

[Figure: SubsetExec, a µhb graph (stages IF, EX, WB) over instructions i1, i2, i3 with nodes t, u, v and edges fr and co; SubsetWithExternal, the same graph with an added instruction i4 and an rf edge.]