Program verification via Machine learning


  1. Program verification via Machine learning
     Aditya V. Nori, Programming Languages & Tools group, Microsoft Research India
     Joint work with Rahul Sharma and Alex Aiken (Stanford University)

  2. Program verification

     First program:

         x = y = 0;
         while (*) { x++; y++; }      // (*) is a nondeterministic choice
         while (x != 0) { x--; y--; }
         assert (y == 0);

     Question: Is the assertion satisfied for all possible inputs?

     Second program:

         gcd(int x, int y)
         {
           assume(x>0 && y>0);
           while (x != y) {
             if (x > y) x = x-y;
             if (y > x) y = y-x;
           }
           return x;
         }

     Question: Does gcd terminate for all inputs $x, y$?

  3. Current state of affairs
     • Precision
     • Scalability
     • Testing is still the dominant technique for establishing software quality

  4. Question …
     • Most applications are associated with test suites, primarily used for regression or fuzz testing
     • Can we use these test suites profitably for proving program correctness?

  5. Here's the plan …
     • Guess: analyse data from tests in order to infer a candidate invariant (use ML techniques)
     • Check: validate the candidate invariant using sound program analysis techniques
     • If the check succeeds, then we have a proof!
     • If the check fails, use the failure to generate more data and repeat guess + check
     • Why is this nice?
       - Program analysis is not so good at guessing invariants
       - Program analysis is good at checking invariants
       - Able to make use of data generated from programs and existing ML algorithms for analysis
     [Figure: loop between Guess and Check]

  6. Instantiations of Guess
     • Classification
       - Interpolants as Classifiers. Sharma, Nori, Aiken. Computer-Aided Verification (CAV 2012)
       - Program Verification as Learning Geometric Concepts. Sharma, Gupta, Hariharan, Aiken, Nori. Submitted
     • Linear algebra
       - A Data Driven Approach for Algebraic Loop Invariants. Sharma, Gupta, Hariharan, Aiken, Nori. European Symposium on Programming (ESOP 2012)
     • Regression
       - Termination proofs from tests. Nori, Sharma. Submitted

  7. Interpolants
     • An interpolant for a pair of formulas $A, B$ such that $A \land B = \bot$ is a formula $I$ satisfying:
       - $A \Rightarrow I$
       - $I \land B = \bot$
       - $\mathit{Vars}(I) \subseteq \mathit{Vars}(A) \cap \mathit{Vars}(B)$
     • An interpolant is a "simple" proof

  8. Example
     • $A = x \ge y$
     • $B = y \ge x + 1$
     • $I = 2x + 1 \ge 2y$
     [Figure: plot with axes x and y]
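The three formulas on this slide can be sanity-checked by brute force. The sketch below is only a bounded test over integer points, not a proof; the function names and the grid bound are assumptions made for illustration.

```python
from itertools import product


def A(x, y):
    return x >= y


def B(x, y):
    return y >= x + 1


def I(x, y):
    return 2 * x + 1 >= 2 * y


def is_interpolant_on_grid(a, b, i, bound=20):
    """Check a => i and (i and b) unsatisfiable on all integer points in [-bound, bound]^2."""
    points = list(product(range(-bound, bound + 1), repeat=2))
    a_implies_i = all(i(x, y) for x, y in points if a(x, y))
    i_excludes_b = not any(i(x, y) and b(x, y) for x, y in points)
    return a_implies_i and i_excludes_b


print(is_interpolant_on_grid(A, B, I))
```

A candidate that is too weak, such as `lambda x, y: True`, fails the second condition because it overlaps with B.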

  9. Binary classification
     • Input: a set of points $X$ with labels $l \in \{+1, -1\}$
     • Goal: find a classifier $C : X \to \{\mathit{true}, \mathit{false}\}$ such that:
       - $C(a) = \mathit{true},\ \forall a \in X .\ \mathit{label}(a) = +1$, and
       - $C(b) = \mathit{false},\ \forall b \in X .\ \mathit{label}(b) = -1$

  10. Verification & Machine-learning
      • Interpolant: separates formula $A$ from formula $B$
      • Classifier: separates positive examples from negative examples
      Is there a connection?

  11. Yes!
      • Main result: view interpolants as classifiers which distinguish "+" examples from "−" examples
      • Use state-of-the-art classification algorithms (SVMs) for computing invariants
      • SVMs are predictive → generalized predicates for verification

  12. Verification & Machine-learning
      Verification: unroll the loops
      • Find interpolants
      • Get general proofs (loop invariants)
      Machine learning: get positive and negative examples
      • Find a classifier
      • This is a predicate which generalizes to test data

  13. Example

          x = y = 0;
          while (*) { x++; y++; }      // (*) is a nondeterministic choice
          while (x != 0) { x--; y--; }
          assert (y == 0);

  14. Example …

          x = y = 0;                     // A
          while (*) { x++; y++; }        // A
          while (x != 0) { x--; y--; }   // B
          assert (y == 0);               // B

      • $A \equiv x_1 = 0 \land y_1 = 0 \land \mathit{ite}(b,\ x = x_1 + 1 \land y = y_1 + 1,\ x = x_1 \land y = y_1)$
      • $B \equiv \mathit{ite}(x = 0,\ x_2 = x - 1 \land y_2 = y - 1,\ x_2 = x \land y_2 = y) \land x_2 = 0 \land y_2 \neq 0$
      • $A \land B = \bot$
      • $I(x, y) \equiv x = y$

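  15. Example
      • $A \equiv x_1 = 0 \land y_1 = 0 \land \mathit{ite}(b,\ x = x_1 + 1 \land y = y_1 + 1,\ x = x_1 \land y = y_1)$
      • $B \equiv \mathit{ite}(x = 0,\ x_2 = x - 1 \land y_2 = y - 1,\ x_2 = x \land y_2 = y) \land x_2 = 0 \land y_2 \neq 0$
      • $I_1 \equiv 2y \le 2x + 1$
      [Figure: positive examples (0,0) and (1,1) plotted in the x-y plane]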

  16. Example
      • $A \equiv x_1 = 0 \land y_1 = 0 \land \mathit{ite}(b,\ x = x_1 + 1 \land y = y_1 + 1,\ x = x_1 \land y = y_1)$
      • $B \equiv \mathit{ite}(x = 0,\ x_2 = x - 1 \land y_2 = y - 1,\ x_2 = x \land y_2 = y) \land x_2 = 0 \land y_2 \neq 0$
      • $I_2 \equiv 2y \le 2x + 1 \land 2y \ge 2x - 1$
      Interpolant!
      [Figure: positive examples (0,0) and (1,1) plotted in the x-y plane]

  17. The algorithm
      Theorem: $\mathit{Interpolant}(A, B)$ terminates only if its output $H$ is an interpolant between $A$ and $B$.

          Interpolant(A, B):
            (X+, X-) = Init(A, B)
            while (true) {
              H = SVMI(X+, X-)          // find candidate interpolant
              if (SAT(A ∧ ¬H))          // checks A ⇒ H; s is the satisfying assignment
                add s to X+ and continue;
              if (SAT(B ∧ H))           // checks H ∧ B = ⊥
                add s to X- and continue;
              break;                    // exit if interpolant found
            }
            return H;
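The guess-and-check loop above can be sketched end to end on the earlier example $A = x \ge y$, $B = y \ge x + 1$. This is a toy illustration under loud assumptions: a perceptron stands in for the SVM call, and exhaustive search over a small integer grid stands in for the SAT/SMT query. The names `GRID`, `perceptron`, and `interpolant` are hypothetical and not taken from the authors' tool.

```python
from itertools import product

# Assumption: a finite grid replaces the solver's search space.
GRID = list(product(range(-5, 6), repeat=2))


def A(p):
    return p[0] >= p[1]


def B(p):
    return p[1] >= p[0] + 1


def holds(w, p):
    """Candidate predicate H: w[0]*x + w[1]*y + w[2] >= 0."""
    return w[0] * p[0] + w[1] * p[1] + w[2] >= 0


def perceptron(pos, neg):
    """Stand-in for the SVM: update until every sample is classified correctly."""
    w = [0, 0, 0]
    changed = True
    while changed:
        changed = False
        for p in pos:
            if not holds(w, p):
                w = [w[0] + p[0], w[1] + p[1], w[2] + 1]
                changed = True
        for p in neg:
            if holds(w, p):
                w = [w[0] - p[0], w[1] - p[1], w[2] - 1]
                changed = True
    return w


def interpolant():
    pos, neg = [(0, 0)], [(0, 1)]  # Init(A, B): one sample from each side
    while True:
        w = perceptron(pos, neg)   # guess
        cex = next((p for p in GRID if A(p) and not holds(w, p)), None)
        if cex is not None:        # A => H fails on cex
            pos.append(cex)
            continue
        cex = next((p for p in GRID if B(p) and holds(w, p)), None)
        if cex is not None:        # H and B both hold on cex
            neg.append(cex)
            continue
        return w                   # both checks passed


w = interpolant()
print(w)
```

The loop terminates here because the samples are linearly separable and every counterexample is a new grid point; with a real solver the check ranges over all states rather than a finite grid.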

  18. Evaluation
      • 1000 lines of C++
      • LIBSVM for SVM queries
      • Z3 theorem prover

  19. Proving termination
      • For every loop, guess a bound on the number of iterations
      • Check the bound with a safety checker

  20. Example: GCD

          gcd(int x, int y)
          {
            assume(x>0 && y>0);
            while (x != y) {
              if (x > y) x = x-y;
              if (y > x) y = y-x;
            }
            return x;
          }

  21. Example: Instrumented GCD
      • Inputs: $(x, y) \in \{(1,2), (2,1), (1,3), (3,1)\}$

          gcd(int x, int y)
          {
            assume(x>0 && y>0);
            // instrumented code
            a = x; b = y; c = 0;
            while (x != y) {
              // instrumented code
              c = c+1;
              writeLog(a, b, c, x, y);
              if (x > y) x = x-y;
              if (y > x) y = y-x;
            }
            return x;
          }

      • Logged data, one row per loop iteration (the rows of $A$ are $(1, a, b)$; $C$ holds the logged values of $c$):

          1   a   b   |   c
          1   1   2   |   1
          1   2   1   |   1
          1   1   3   |   1
          1   1   3   |   2
          1   3   1   |   1
          1   3   1   |   2

      • Find $c \approx w_1 a + w_2 b + w_3$ (linear regression)
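The logged data on this slide can be regenerated by running the instrumented loop on the four inputs. The sketch below is a direct Python transliteration; `gcd_logged` and the list-based log are illustrative stand-ins for the slide's `gcd` and `writeLog`.

```python
def gcd_logged(x, y, log):
    """Subtraction-based GCD with the slide's instrumentation (a, b, c)."""
    assert x > 0 and y > 0          # assume(x>0 && y>0)
    a, b, c = x, y, 0               # instrumented: remember inputs, reset counter
    while x != y:
        c += 1                      # instrumented: count loop iterations
        log.append((a, b, c, x, y)) # stand-in for writeLog(a, b, c, x, y)
        if x > y:
            x = x - y
        if y > x:
            y = y - x
    return x


log = []
for x0, y0 in [(1, 2), (2, 1), (1, 3), (3, 1)]:
    gcd_logged(x0, y0, log)

A_rows = [(1, a, b) for a, b, _, _, _ in log]
C_vals = [c for _, _, c, _, _ in log]
for row, c in zip(A_rows, C_vals):
    print(row, c)
```

Each input contributes one row per loop iteration, so the inputs (1,3) and (3,1) each contribute two rows.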

  22. Linear regression
      • $\min_{w} \sum_i (w_1 a_i + w_2 b_i + w_3 - c_i)^2$
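As a worked instance of this objective, the sketch below solves the least-squares problem for the six logged rows of the instrumented GCD example using exact rational arithmetic. Solving via the normal equations with Gauss-Jordan elimination is an assumption made to keep the example dependency-free; it is not necessarily how the authors computed the fit.

```python
from fractions import Fraction

# (a, b, c) rows logged by the instrumented GCD example
DATA = [(1, 2, 1), (2, 1, 1), (1, 3, 1), (1, 3, 2), (3, 1, 1), (3, 1, 2)]


def least_squares(data):
    """Return [w1, w2, w3] minimising sum_i (w1*a_i + w2*b_i + w3 - c_i)^2."""
    rows = [(Fraction(a), Fraction(b), Fraction(1)) for a, b, _ in data]
    ys = [Fraction(c) for _, _, c in data]
    n = 3
    # Augmented normal equations [M^T M | M^T c]
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, ys))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination (assumes M^T M is nonsingular)
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


print(least_squares(DATA))  # [Fraction(1, 2), Fraction(1, 2), Fraction(-1, 2)]
```

The unconstrained fit $c \approx (a + b - 1)/2$ passes between the logged values, so it underestimates $c$ on some rows, for example the row with $a = 1$, $b = 3$, $c = 2$. A plain regression is therefore not yet an upper bound on the iteration count.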

  23. Quadratic programming
      • $\min_{w} \sum_i (w_1 a_i + w_2 b_i + w_3 - c_i)^2 \quad \text{s.t.} \quad Aw \ge C$
      • Guess is $\tau(a, b) = a + b - 2$
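The role of the constraint can be seen by checking candidate bounds against the logged data. The sketch below only tests feasibility of a given candidate on the six rows; it does not solve the quadratic program, and the helper name `is_upper_bound` is an assumption for illustration.

```python
# (a, b, c) rows logged by the instrumented GCD example
DATA = [(1, 2, 1), (2, 1, 1), (1, 3, 1), (1, 3, 2), (3, 1, 1), (3, 1, 2)]


def tau(a, b):
    """The guessed loop bound from the slide."""
    return a + b - 2


def unconstrained_fit(a, b):
    """Least-squares fit of the same data without the constraint."""
    return (a + b - 1) / 2


def is_upper_bound(f, data):
    """True if f(a, b) >= c on every logged row, i.e. the constraint holds."""
    return all(f(a, b) >= c for a, b, c in data)


print(is_upper_bound(tau, DATA))                # True
print(is_upper_bound(unconstrained_fit, DATA))  # False
```

Passing this check only means the guess is consistent with the tests; soundness still comes from the safety checker in the next step.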

  24. Example: Annotated GCD
      • Check with a safety checker
      • Free invariant to aid checker: $c \le a + b - x - y \land x > 0 \land y > 0$
      • Corrective measures
        - Sound rounding for polynomials with integer coefficients
        - Partitioning of tests for discovering disjunctive loop bounds

          gcd(int x, int y)
          {
            assume(x>0 && y>0);
            a = x; b = y; c = 0;
            while (x != y) {
              // annotation
              free_invariant(c <= a+b-x-y);
              // annotation
              assert(c <= a+b-2);
              if (x > y) x = x-y;
              if (y > x) y = y-x;
            }
            return x;
          }
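Before handing the annotated program to a safety checker, the annotations can be exercised dynamically. The sketch below turns both annotations into runtime assertions and runs them over a range of inputs. It is testing, not a proof, and it assumes the counter increment from the instrumented version is kept inside the loop, which the listing above does not show.

```python
def gcd_checked(x, y):
    """Subtraction-based GCD with the slide's annotations as runtime checks."""
    assert x > 0 and y > 0                # assume(x>0 && y>0)
    a, b, c = x, y, 0
    while x != y:
        assert c <= a + b - x - y         # free invariant at the loop head
        assert x > 0 and y > 0            # remaining conjuncts of the free invariant
        c += 1                            # assumption: increment as in the instrumented code
        assert c <= a + b - 2             # guessed bound tau(a, b) = a + b - 2
        if x > y:
            x = x - y
        if y > x:
            y = y - x
    return x


# Exercise the annotations on a grid of inputs; any violation raises AssertionError.
for x0 in range(1, 40):
    for y0 in range(1, 40):
        gcd_checked(x0, y0)

print(gcd_checked(12, 18))  # 6
```

The bound follows from the invariant: inside the loop $x \ne y$ with both positive, so $x + y \ge 3$, hence $c \le a + b - 3$ before the increment and $c \le a + b - 2$ after it.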

  25. Evaluation

  26. Summary
      • Classification-based algorithms can be used for computing proofs in program verification
      • Follow-up work on using techniques from linear algebra and PAC learning for scalable proofs
      • Proving program termination via linear regression
      • Data Driven Program Analysis
