[PPT] - Aditya V. Nori, Sriram K. Rajamani Programming Languages and Tools PowerPoint Presentation

SLIDE 1

Aditya V. Nori, Sriram K. Rajamani

Programming Languages and Tools

Microsoft Research India

SLIDE 2

▪ An industrial-strength program verifier
▪ Philosophy: synergize verification and testing
▪ Synergy [FSE '06], Dash [ISSTA '08], Smash [POPL '10], Bolt [submitted] algorithms to perform scalable analysis
▪ Engineered a number of optimizations for scalability
▪ Integrated with Microsoft's Static Driver Verifier (SDV) toolkit and used internally

SLIDE 3

void f(int *p, int *q) {
0:  *p = 4;
1:  *q = 5;
2:  assert(¬φ_error)
}

Question

Does the assertion hold for all possible inputs?

Must analysis: finds bugs, but can't prove their absence

May analysis: can prove the absence of bugs, but can result in false errors

More generally, we are interested in the query $\langle \varphi_{pre} \stackrel{?}{\Rightarrow}_f \varphi_{error} \rangle$

SLIDE 4

$\langle T \stackrel{?}{\Rightarrow}_f (*p \neq 4) \rangle = \text{yes}$

Captures facts that are guaranteed to hold on particular executions of the program (under-approximation)

Error condition is reachable by any input that satisfies $(p = q)$

Weakest precondition of $(*p \neq 4)$ with respect to $f$: $(p = q)$

void f(int *p, int*q) { 0: *p = 4; 1: *q = 5; }

test

SLIDE 5

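$\langle (p \neq q) \stackrel{?}{\Rightarrow}_f (*p \neq 4) \rangle = \text{no}$ (proof)

[Figure: region graph of f with regions labeled $(p \neq q)$, $(p = q)$ and $(*p \neq 4)$]

Captures facts that are true for all executions of the program (over-approximation)

Proof can be obtained by keeping track of the predicates $(p = q)$ and $(*p \neq 4)$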

void f(int *p, int*q) { 0: *p = 4; 1: *q = 5; }
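The two queries on f can be illustrated on a toy model. The sketch below is only an illustration, not part of the tool: it assumes memory is a Python dictionary and pointers are its keys, and the names `run_f`, `reaches_error` and the addresses `"a"` and `"b"` are hypothetical.

```python
from itertools import product

def run_f(memory, p, q):
    """Toy model of f(int *p, int *q); pointers are keys of `memory`."""
    memory = dict(memory)  # copy, so the caller's memory is left untouched
    memory[p] = 4          # 0: *p = 4;
    memory[q] = 5          # 1: *q = 5;
    return memory

def reaches_error(memory, p, q):
    """True if the final state satisfies the error condition *p != 4."""
    return run_f(memory, p, q)[p] != 4

initial = {"a": 0, "b": 0}  # two hypothetical addresses

# Must-style evidence: one concrete test with p == q hits the error condition.
aliased_hits_error = reaches_error(initial, "a", "a")

# Inputs with p != q: in this two-cell model none of them reaches the error.
unaliased_hits_error = any(
    reaches_error(initial, p, q)
    for p, q in product(initial, repeat=2)
    if p != q
)
print(aliased_hits_error, unaliased_hits_error)
```

Enumerating a two-cell model is still testing; the "no" answer on the slide rests on the abstraction that tracks $(p = q)$ and $(*p \neq 4)$, which covers all executions.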

SLIDE 6

Algorithm uses only test case generation operations

Maintains two data structures:

▪ A forest of reachable concrete states (tests)
  ▪ Under-approximates executions of the program
▪ A region graph (an abstraction)
  ▪ Over-approximates all executions of the program

Our goal: bug finding and proving

▪ If a test reaches an error, we have found a bug
▪ If we refine the abstraction so that there is *no* path from the initial region to the error region, we have a proof

Key ideas

▪ Frontier
▪ $\mathit{WP}_\alpha$ uses only aliases $\alpha$ that are present along concrete tests that are executed

SLIDE 7

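Step 1: Try to generate a test that crosses the frontier

Perform symbolic simulation on the path until the frontier and generate a constraint $\varphi_1$

Conjoin with the condition $\varphi_2$ needed to cross the frontier

Is $\varphi_1 \wedge \varphi_2$ satisfiable?

[Figure: region graph with nodes 1 to 10; the frontier edge is marked, with $\varphi_1$ on the path up to the frontier and $\varphi_2$ on the edge across it]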

SLIDE 8

Step 1: Try to generate a test that crosses the frontier

Perform symbolic simulation on the path until the frontier and generate a constraint $\varphi_1$

Conjoin with the condition $\varphi_2$ needed to cross the frontier

Is $\varphi_1 \wedge \varphi_2$ satisfiable? [YES]

Step 2: run the test and extend the frontier

[Figure: region graph with nodes 1 to 10; the frontier has moved forward]

SLIDE 9

Step 1: Try to generate a test that crosses the frontier

Perform symbolic simulation on the path until the frontier and generate a constraint $\varphi_1$

Conjoin with the condition $\varphi_2$ needed to cross the frontier

Is $\varphi_1 \wedge \varphi_2$ satisfiable? [NO]

Step 2: use $\mathit{WP}_\alpha$ to refine so that the frontier moves back!

[Figure: region graph with nodes 1 to 10; node 4 is split by the refinement and the frontier has moved back]
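The frontier step can be sketched as a single decision. This is a minimal illustration under loud assumptions: the real algorithm uses symbolic execution and a theorem prover, whereas here a bounded enumeration over a small integer domain stands in for the solver, and the constraints passed in are hypothetical examples. A `None` from a bounded search is not a proof of unsatisfiability in general.

```python
from typing import Callable, Iterable, Optional

Constraint = Callable[[int], bool]

def try_cross_frontier(phi1: Constraint, phi2: Constraint,
                       domain: Iterable[int]) -> Optional[int]:
    """Return an input satisfying phi1 and phi2, or None if the search finds none."""
    for x in domain:
        if phi1(x) and phi2(x):
            return x
    return None

def frontier_step(phi1: Constraint, phi2: Constraint,
                  domain: Iterable[int]) -> str:
    """Decide between the two outcomes of Step 2."""
    test_input = try_cross_frontier(phi1, phi2, domain)
    if test_input is not None:
        return f"run test with input {test_input} and extend the frontier"
    return "refine the abstraction so that the frontier moves back"

domain = range(-50, 51)
# Hypothetical path constraint x > 10, hypothetical crossing conditions.
sat_case = frontier_step(lambda x: x > 10, lambda x: x % 2 == 0, domain)
unsat_case = frontier_step(lambda x: x > 10, lambda x: x < 5, domain)
print(sat_case)
print(unsat_case)
```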

SLIDE 10

[Flowchart]
Input: Program $P$, Property $\psi$
▪ Construct initial abstraction
▪ Construct random tests
▪ Test succeeded? yes: Bug!
▪ Abstraction succeeded? yes: Proof!
▪ $\tau$ = error path in abstraction, $f$ = frontier of error path
▪ Can extend test beyond frontier? yes: run the new test; no: Refine abstraction

void f(int y) {
0:  int lock, x;
1:  do {
2:    lock = 1;
3:    x = y;
4:    if (*) {
5:      lock = 0;
6:      y = y+1;
      }
7:  } while (x != y);
8:  if (lock != 1)
9:    error();
10: }

[Figure: region graph with nodes 1 to 10; the test stops at the frontier on the error path]

$y = 1$

$\tau = (0,1,2,3,4,7,8,9)$

Symbolic execution + Theorem proving

SLIDE 11

[Figure: refinement splits region 8 on the edge from 8 to 9 into 8:¬ρ and 8:ρ; region graph with nodes 1 to 10]

ρ = (lock.state != L)

SLIDE 12

[Figure: refinement splits region 8 on the edge from 8 to 9 into 8:¬p and 8:p; region graph with nodes 1 to 7, 8:¬p, 8:p, 9, 10]

p = (lock.state != L)

SLIDE 13

[Flowchart]
Input: Program $P$, Property $\psi$
▪ Construct initial abstraction
▪ Construct random tests
▪ Test succeeded? yes: Bug!
▪ Abstraction succeeded? yes: Proof!
▪ $\tau$ = error path in abstraction, $f$ = frontier of error path
▪ Can extend test beyond frontier? yes: run the new test; no: Refine abstraction

void f(int y) {
0:  int lock, x;
1:  do {
2:    lock = 1;
3:    x = y;
4:    if (*) {
5:      lock = 0;
6:      y = y+1;
      }
7:  } while (x != y);
8:  if (lock != 1)
9:    error();
10: }

[Figure: region graph with nodes 1 to 7, 8:¬p, 8:p, 9, 10; the frontier is marked on the new error path]

$\tau = (0,1,2,3,4,7,\langle 8, p \rangle, 9)$

SLIDE 14

void f(int y) {
0:  int lock, x;
1:  do {
2:    lock = 1;
3:    x = y;
4:    if (*) {
5:      lock = 0;
6:      y = y+1;
      }
7:  } while (x != y);
8:  if (lock != 1)
9:    error();
10: }

[Figure: final region graph with regions 1, 2, 3, 4∧¬s, 4∧s, 5∧¬s, 5∧s, 6∧¬r, 6∧r, 7∧¬q, 7∧q, 8∧¬p, 8∧p, 9, 10]
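The behavior of the lock example can be explored with bounded testing. The sketch below is an assumption-laden illustration: `run_f` is a hypothetical Python transcription of f in which each `if (*)` is resolved by the next boolean in `choices`. It is testing only, so it under-approximates; the proof comes from the refined region graph, not from this enumeration.

```python
from itertools import product

def run_f(y, choices):
    """Simulate f(y); each entry of `choices` resolves one `if (*)`.

    Returns "error" if line 9 is reached, "ok" if f returns normally,
    and "unfinished" if the choices run out before the loop exits.
    """
    for take_branch in choices:      # one do-while iteration per choice
        lock = 1                     # 2: lock = 1;
        x = y                        # 3: x = y;
        if take_branch:              # 4: if (*)
            lock = 0                 # 5: lock = 0;
            y = y + 1                # 6: y = y+1;
        if x == y:                   # 7: loop exits when x != y is false
            return "error" if lock != 1 else "ok"   # 8, 9
    return "unfinished"

# All choice sequences up to length 5, for a few initial values of y.
outcomes = {
    run_f(y, choices)
    for y in range(-3, 4)
    for n in range(1, 6)
    for choices in product([False, True], repeat=n)
}
print(outcomes)
```

Whenever the branch at line 4 is taken, y becomes x + 1 and the loop repeats, so the loop can only exit on an iteration where lock is still 1.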

SLIDE 15

[Figure: test T reaches $S_{k-2}$ and $S_{k-1}$; the frontier is the edge CALL(foo(i, j)) into $S_k$]

Key idea

Perform a recursive Dash query on the called procedure and use the result to either generate a test or compute $\mathit{WP}_\alpha$

SLIDE 16

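[Figure: test T reaches $S_{k-2}$ and $S_{k-1}$; the edge CALL(foo(i, j)) leads to $S_k$]

Dash $\langle \varphi_1 \stackrel{?}{\Rightarrow}_{foo} \varphi_2 \rangle$

pass: perform refinement
fail: generate test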

SLIDE 17

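A must summary for a procedure $\mathcal{P}_i$ is of the form $(\varphi_1, \varphi_2) \in \mathit{must}_{\mathcal{P}_i}$

$\forall t \in \varphi_2 .\ \exists s \in \varphi_1 .\ t$ can be obtained by executing $\mathcal{P}_i$ from an initial state $s$

[Figure: procedure $\mathcal{P}_i$ between $\varphi_1$ and $\varphi_2$, labeled "must summary"]

A $\neg\mathit{may}$ summary for a procedure $\mathcal{P}_i$ is of the form $(\varphi_1, \varphi_2) \in \neg\mathit{may}_{\mathcal{P}_i}$

$\forall s \in \varphi_1 .\ \forall t \in \varphi_2 .\ t$ cannot be obtained by executing $\mathcal{P}_i$ starting in state $s$

[Figure: procedure $\mathcal{P}_i$ between $\varphi_1$ and $\varphi_2$, labeled "$\neg\mathit{may}$ summary"]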
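The two definitions can be checked directly on finite sets of states. The sketch below assumes a deterministic, terminating procedure modeled as a Python function on integer states; `inc` and the helper names are hypothetical and chosen only for illustration.

```python
def is_must_summary(proc, phi1, phi2):
    """Every t in phi2 is obtained by executing proc from some s in phi1."""
    reachable = {proc(s) for s in phi1}
    return all(t in reachable for t in phi2)

def is_not_may_summary(proc, phi1, phi2):
    """No t in phi2 can be obtained by executing proc from any s in phi1."""
    return all(proc(s) not in phi2 for s in phi1)

def inc(s):
    """Hypothetical procedure: the state is a single integer."""
    return s + 1

must_ok = is_must_summary(inc, {0, 1, 2}, {1, 2})          # 1 and 2 are both produced
must_bad = is_must_summary(inc, {0, 1, 2}, {1, 5})         # 5 is never produced
not_may_ok = is_not_may_summary(inc, {0, 1, 2}, {7, 8})    # 7 and 8 are never produced
not_may_bad = is_not_may_summary(inc, {0, 1, 2}, {3})      # 2 produces 3
print(must_ok, must_bad, not_may_ok, not_may_bad)
```

A must summary quantifies over the post-states (each one has a witness input), while a $\neg\mathit{may}$ summary quantifies over all pairs, which is why the first supports bug finding and the second supports proofs.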

SLIDE 18

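$$
\frac{
\begin{array}{c}
\varphi_1 \in \Pi_{n_1} \quad \varphi_2 \in \Pi_{n_2} \quad \varphi_1 \cap \Omega_{n_1} \neq \emptyset \quad \varphi_2 \cap \Omega_{n_2} = \emptyset \\
e = (n_1, n_2) \in E_{\mathcal{P}_i} \text{ is a call to procedure } \mathcal{P}_j \quad (\hat{\varphi}_1, \hat{\varphi}_2) \in \mathit{must}_{\mathcal{P}_j} \\
\Omega_{n_1} \supseteq \hat{\varphi}_1 \quad \theta \subseteq \hat{\varphi}_2 \quad \varphi_2 \cap \theta \neq \emptyset
\end{array}
}{
\Omega_{n_2} := \Omega_{n_2} \cup \theta
}
\quad [\text{MUST-POST-USESUM}]
$$

Check if frontier $(n_1, n_2)$ can be extended by a must summary $(\hat{\varphi}_1, \hat{\varphi}_2)$

If yes, grow $\Omega_{n_2}$ with $\theta \subseteq \hat{\varphi}_2$

[Figure: procedure $\mathcal{P}_i$ with nodes 1 to 7 and frontier edge $\Gamma_e = \text{call } \mathcal{P}_j$ between $\varphi_1$ (containing $\Omega_{n_1}$) and $\varphi_2$; must summary of $\mathcal{P}_j$ with $\hat{\varphi}_1 \subseteq \Omega_{n_1}$ and $(\hat{\varphi}_2 \supseteq \theta) \wedge (\varphi_2 \cap \theta \neq \emptyset)$]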

SLIDE 19

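$$
\frac{
\begin{array}{c}
\varphi_1 \in \Pi_{n_1} \quad \varphi_2 \in \Pi_{n_2} \quad \varphi_1 \cap \Omega_{n_1} \neq \emptyset \quad \varphi_2 \cap \Omega_{n_2} = \emptyset \\
e = (n_1, n_2) \in E_{\mathcal{P}_i} \text{ is a call to procedure } \mathcal{P}_j \quad (\hat{\varphi}_1, \hat{\varphi}_2) \in \mathit{must}_{\mathcal{P}_j} \\
\Omega_{n_1} \supseteq \hat{\varphi}_1 \quad \theta \subseteq \hat{\varphi}_2 \quad \varphi_2 \cap \theta \neq \emptyset
\end{array}
}{
\Omega_{n_2} := \Omega_{n_2} \cup \theta
}
\quad [\text{MUST-POST-USESUM}]
$$

Check if frontier $(n_1, n_2)$ can be extended by a must summary $(\hat{\varphi}_1, \hat{\varphi}_2)$

If yes, grow $\Omega_{n_2}$ with $\theta \subseteq \hat{\varphi}_2$

[Figure: procedure $\mathcal{P}_i$ with nodes 1 to 7 and frontier edge $\Gamma_e = \text{call } \mathcal{P}_j$; $\theta$ now lies inside $\varphi_2$ beyond the frontier; must summary of $\mathcal{P}_j$ with $\hat{\varphi}_1 \subseteq \Omega_{n_1}$ and $(\hat{\varphi}_2 \supseteq \theta) \wedge (\varphi_2 \cap \theta \neq \emptyset)$]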
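The MUST-POST-USESUM rule can be read as a check followed by a set update. The sketch below is an illustration over finite Python sets, not the tool's implementation: it takes the partition memberships of $\varphi_1$ and $\varphi_2$ for granted, ignores the mapping between caller and callee states, and picks $\theta = \hat{\varphi}_2 \cap \varphi_2$ as one admissible choice. The function name and the example sets are hypothetical.

```python
def must_post_usesum(omega_n1, omega_n2, phi1, phi2, summary):
    """Apply the rule if its premises hold; return the new Omega_n2, else None."""
    hat_phi1, hat_phi2 = summary
    theta = hat_phi2 & phi2                  # theta is a subset of hat_phi2
    premises_hold = (
        bool(phi1 & omega_n1)                # phi1 and Omega_n1 intersect
        and not (phi2 & omega_n2)            # phi2 and Omega_n2 are disjoint
        and hat_phi1 <= omega_n1             # Omega_n1 contains hat_phi1
        and bool(theta)                      # phi2 and theta intersect
    )
    return omega_n2 | theta if premises_hold else None

omega_n1 = {0, 1, 2}          # states known to be reachable before the call
omega_n2 = set()              # nothing known to be reachable after the call
phi1 = {0, 1, 2, 3}
phi2 = {2, 3, 9}

# Hypothetical must summaries of a callee that increments its state.
extended = must_post_usesum(omega_n1, omega_n2, phi1, phi2, ({0, 1}, {1, 2}))
rejected = must_post_usesum(omega_n1, omega_n2, phi1, phi2, ({5, 8}, {9}))
print(extended, rejected)
```

The second summary is rejected because its precondition $\{5, 8\}$ is not contained in $\Omega_{n_1}$, so nothing guarantees that the witness input is actually reachable.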

SLIDE 20

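$$
\frac{
\begin{array}{c}
\varphi_1 \in \Pi_{n_1} \quad \varphi_2 \in \Pi_{n_2} \quad \varphi_1 \cap \Omega_{n_1} \neq \emptyset \quad \varphi_2 \cap \Omega_{n_2} = \emptyset \\
e = (n_1, n_2) \in E_{\mathcal{P}_i} \text{ is a call to procedure } \mathcal{P}_j \quad (\hat{\varphi}_1, \hat{\varphi}_2) \in \neg\mathit{may}_{\mathcal{P}_j} \\
\varphi_2 \subseteq \hat{\varphi}_2 \quad \theta \subseteq \hat{\varphi}_1 \quad \neg\theta \cap \Omega_{n_1} = \emptyset
\end{array}
}{
\begin{array}{c}
\Pi_{n_1} := (\Pi_{n_1} \setminus \{\varphi_1\}) \cup \{\varphi_1 \cap \theta,\ \varphi_1 \cap \neg\theta\} \\
N_e := N_e \cup \{(\varphi_1 \cap \theta,\ \varphi_2)\}
\end{array}
}
\quad [\text{NMAY-PRE-USESUM}]
$$

Check if frontier $(n_1, n_2)$ can be refined by a $\neg\mathit{may}$ summary $(\hat{\varphi}_1, \hat{\varphi}_2)$

If yes, use $\theta \subseteq \hat{\varphi}_1$ to refine the abstraction

If both must and $\neg\mathit{may}$ summaries are not available, analyze procedure $\mathcal{P}_j$

▪ yes: must summary for $\mathcal{P}_j$
▪ no: $\neg\mathit{may}$ summary for $\mathcal{P}_j$

[Figure: procedure $\mathcal{P}_i$ with nodes 1 to 7 and frontier edge $\Gamma_e = \text{call } \mathcal{P}_j$; $\varphi_1$ is split into $\varphi_1 \cap \theta$ and $\varphi_1 \cap \neg\theta$, with the edge $N_e$ from $\varphi_1 \cap \theta$ to $\varphi_2$; $\neg\mathit{may}$ summary of $\mathcal{P}_j$ with $(\hat{\varphi}_1 \supseteq \theta) \wedge (\neg\theta \cap \Omega_{n_1} = \emptyset)$ and $\hat{\varphi}_2 \supseteq \varphi_2$]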
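The NMAY-PRE-USESUM rule can likewise be sketched as a premise check followed by a partition split. This is an illustration over finite sets under stated assumptions: states are integers, regions are `frozenset`s so they can be stored in sets, $\theta$ is chosen as $\hat{\varphi}_1$ itself (the simplest admissible choice), and the membership $\varphi_2 \in \Pi_{n_2}$ is taken for granted. The function name and example values are hypothetical.

```python
def nmay_pre_usesum(partition_n1, omega_n1, omega_n2, phi1, phi2,
                    summary, not_may_edges):
    """Apply the rule if its premises hold.

    Returns (new partition of n1, new set of not-may edges), or None.
    """
    hat_phi1, hat_phi2 = summary
    theta = hat_phi1                         # theta is a subset of hat_phi1
    premises_hold = (
        phi1 in partition_n1
        and bool(phi1 & omega_n1)            # phi1 and Omega_n1 intersect
        and not (phi2 & omega_n2)            # phi2 and Omega_n2 are disjoint
        and phi2 <= hat_phi2                 # phi2 is inside hat_phi2
        and not (omega_n1 - theta)           # no reachable state outside theta
    )
    if not premises_hold:
        return None
    # Split phi1 into the part inside theta and the part outside it.
    new_partition = (partition_n1 - {phi1}) | {phi1 & theta, phi1 - theta}
    # The part inside theta cannot lead to phi2 across the call edge.
    new_edges = not_may_edges | {(phi1 & theta, phi2)}
    return new_partition, new_edges

phi1 = frozenset({0, 1, 2, 3})
phi2 = frozenset({4})
omega_n1 = frozenset({0, 1})
# Hypothetical not-may summary of a callee that increments its state.
summary = (frozenset({0, 1, 2}), frozenset({4, 5}))

result = nmay_pre_usesum({phi1}, omega_n1, frozenset(), phi1, phi2,
                         summary, set())
print(result)
```

After the split, every tested state at $n_1$ sits in the region $\varphi_1 \cap \theta$, whose edge to $\varphi_2$ is recorded as infeasible, so this abstract error path is eliminated.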

SLIDE 21

▪ Engineering for making Yogi robust, scalable and industrial strength
▪ Several of the implemented optimizations are folklore
  ▪ Very difficult to design tools that are bug free; evaluating optimizations is hard!
  ▪ Our empirical evaluation gives tool builders information about what gains can be realistically expected from optimizations
  ▪ Details in ICSE '10
▪ Vanilla implementation of algorithms: (flpydisk, CancelSpinLock) took 2 hours
▪ Algorithms + engineering + optimizations: (flpydisk, CancelSpinLock) took less than 1 second!

SLIDE 22

▪ Benchmarks:
  ▪ 30 WDM drivers and 83 properties (2490 runs)
  ▪ Anecdotal belief: most bugs in the tools are usually caught with this test suite

SLIDE 23

Summaries   Total time (minutes)   #defects   #timeouts
yes         2160                   241        77
no          3780                   236        165

42%

SLIDE 24

▪ Bolt: a generic framework that uses MapReduce style parallelism to scale top-down analysis

[Figure labels: Intraprocedural parameter; Summary database]

SLIDE 25

SLIDE 26

~Linear speedup!

SLIDE 27


PLDI 2012 tutorial

http://research.microsoft.com/yogi/pldi2012.aspx