Testing a Saturation-Based Theorem Prover: Experiences and Challenges

Giles Reger (1), Martin Suda (2), and Andrei Voronkov (1, 2)
(1) School of Computer Science, University of Manchester, UK; (2) TU Wien, Vienna, Austria
TAP 2017, Marburg, July 19, 2017


  1. Testing a Saturation-Based Theorem Prover: Experiences and Challenges
     Giles Reger (School of Computer Science, University of Manchester, UK), Martin Suda (TU Wien, Vienna, Austria), and Andrei Voronkov (University of Manchester and TU Wien)
     TAP 2017, Marburg, July 19, 2017

  7. Introduction
     - First-order automatic theorem proving:
       - a well-established discipline of automated deduction
       - main approach: refutational, saturation-based proving
       - example systems: E, SPASS, Vampire
     - Often used as black boxes in larger projects and systems, e.g., program verification, static analysis, interpolation, ...
     - ➥ Importance of ensuring correctness
     - How are we doing?
       - CASC competition: preliminary period for testing soundness
       - SMT-COMP 2016: 79 answers classified as incorrect

  10. Our Prover
      - Vampire: an automatic theorem prover for first-order logic and theories
        - regular winner of the main divisions of the CASC competition
        - since 2016, also a successful participant in SMT-COMP
      - A quite complex piece of software (≈ 194,000 lines of C++)
      - ➥ easy to introduce incorrectness when adding a new feature

  11. Outline
      1. What Does Correctness Mean for Us
      2. Detecting and Investigating Bugs
      3. Challenges
      4. Conclusion

  16. Theorem proving basics
      Standard form of the input: $F := (\mathit{Axiom}_1 \land \dots \land \mathit{Axiom}_n) \to \mathit{Conjecture}$
      1. Negate $F$ (to seek a refutation): $\lnot F := \mathit{Axiom}_1 \land \dots \land \mathit{Axiom}_n \land \lnot \mathit{Conjecture}$
      2. Preprocess and transform $\lnot F$ to a normal form $S := \{C_1, \dots, C_n\}$
      3. Saturate $S$ with respect to an inference system $I$
      Example inference rule: $\dfrac{C_1 \lor P \qquad C_2 \lor \lnot P}{C_1 \lor C_2}$
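The example inference rule on this slide can be illustrated with a small sketch. It is restricted to propositional clauses (no variables or unification), which is an assumption made for brevity and not how a first-order prover such as Vampire works internally. The names `Clause` and `resolve` are hypothetical.

```python
from typing import FrozenSet, Set

# Assumed encoding: a literal is a non-zero integer, where 3 stands for the
# atom P3 and -3 for its negation. A clause is a set of literals.
Clause = FrozenSet[int]


def resolve(c1: Clause, c2: Clause) -> Set[Clause]:
    """Return every clause obtainable from c1 and c2 by one resolution step.

    For each literal P in c1 whose complement occurs in c2, the conclusion
    is the union of the remaining literals (C1 or C2 in the rule).
    """
    resolvents: Set[Clause] = set()
    for lit in c1:
        if -lit in c2:
            # Drop the complementary pair and merge what is left.
            resolvents.add(frozenset((c1 - {lit}) | (c2 - {-lit})))
    return resolvents


if __name__ == "__main__":
    # (P1 or P2) and (P3 or not P2) resolve to (P1 or P3).
    print(resolve(frozenset({1, 2}), frozenset({3, -2})))
    # P2 and not P2 resolve to the empty clause, i.e. false.
    print(resolve(frozenset({2}), frozenset({-2})))
```

The empty clause plays the role of false: deriving it from the negated input is what constitutes a refutation.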

  22. The Saturation Process
      - Saturation = fixed-point (closure) computation
      - Does the final set S contain false?
      - Basic properties:
        - explosive in nature
        - may not terminate
        - various tricks to mitigate the explosion
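As a rough illustration of saturation as a closure computation, here is a minimal sketch of a given-clause style loop. It assumes propositional clauses and plain binary resolution, so it always terminates; the first-order case may not. The step budget, the FIFO clause selection, and the names `saturate` and `resolve` are all assumptions of this sketch, not details taken from the talk.

```python
from collections import deque
from typing import FrozenSet, Iterable, Optional, Set

Clause = FrozenSet[int]  # assumed encoding: 3 is atom P3, -3 is its negation
EMPTY: Clause = frozenset()  # the empty clause stands for false


def resolve(c1: Clause, c2: Clause) -> Set[Clause]:
    """All clauses obtainable from c1 and c2 by one binary resolution step."""
    return {frozenset((c1 - {l}) | (c2 - {-l})) for l in c1 if -l in c2}


def saturate(clauses: Iterable[Clause], max_steps: int = 10_000) -> Optional[bool]:
    """Close the clause set under resolution.

    Returns True if false was derived, False if the set is saturated
    without false, and None if the step budget ran out first.
    """
    seen: Set[Clause] = set(clauses)
    passive = deque(seen)        # clauses waiting to be processed
    active: Set[Clause] = set()  # clauses already resolved pairwise
    steps = 0
    while passive:
        if steps >= max_steps:
            return None
        steps += 1
        given = passive.popleft()  # FIFO: every clause is eventually selected
        if given == EMPTY:
            return True
        active.add(given)
        for other in list(active):
            for new in resolve(given, other):
                if new == EMPTY:
                    return True
                if new not in seen:
                    seen.add(new)
                    passive.append(new)
    return False
```

For example, `saturate([frozenset({1, 2}), frozenset({-1, 2}), frozenset({-2})])` derives false, while `saturate([frozenset({1, 2}), frozenset({-1})])` reaches a fixed point without it. Even in this toy setting the number of derived clauses can grow quickly, which hints at why the process is described as explosive.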

  28. Possible Answers:
      - Theorem (together with a proof) if the input F is logically valid
      - Non-theorem if F is invalid (there is a counter-example)
        - relies on a completeness argument
      - Unknown
        1. time limit / memory limit
        2. incomplete strategy failed
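The decision between the three answers can be sketched as a small function. The flags `derived_false`, `saturated`, and `strategy_complete` are hypothetical names for this illustration; the sketch only encodes what the slide states, namely that Non-theorem is justified only when a complete strategy saturates.

```python
from enum import Enum


class Answer(Enum):
    THEOREM = "Theorem"
    NON_THEOREM = "Non-theorem"
    UNKNOWN = "Unknown"


def report(derived_false: bool, saturated: bool, strategy_complete: bool) -> Answer:
    """Map the outcome of a proof attempt to one of the three answers."""
    if derived_false:
        # A refutation of the negated input was found: F is valid.
        return Answer.THEOREM
    if saturated and strategy_complete:
        # Saturation without false only shows invalidity if the
        # strategy is complete (the completeness argument).
        return Answer.NON_THEOREM
    # Time or memory limit reached, or an incomplete strategy saturated.
    return Answer.UNKNOWN
```

Under these assumptions, `report(False, True, False)` yields `Answer.UNKNOWN`: an incomplete strategy that saturates must not claim Non-theorem.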

  34. Different Ways of Being Incorrect
      - unsoundness: Reports Theorem for an invalid F. (Derives false for a satisfiable S.)
        - Check the proof and see what went wrong.
      - completeness issue: Reports Non-theorem for a valid F. (Finitely saturates an unsatisfiable S without deriving false.)
        - Should have said Unknown here!
      - fairness issue: Prover runs indefinitely, while a proof exists. (Violation of fairness criteria in saturation.)
        - never (strictly) violated after finitely many steps
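When the true status of a test problem is known in advance, the first two kinds of incorrectness can be detected mechanically. The following is a minimal sketch under that assumption; the function name `classify` and the result labels are hypothetical. Fairness issues are deliberately absent, since they cannot be observed from any single finite run.

```python
def classify(input_is_valid: bool, reported: str) -> str:
    """Compare a reported answer with the known status of the input F."""
    if reported == "Unknown":
        # Giving up is never incorrect, only inconclusive.
        return "inconclusive"
    if reported == "Theorem":
        return "correct" if input_is_valid else "unsoundness"
    if reported == "Non-theorem":
        return "completeness issue" if input_is_valid else "correct"
    raise ValueError(f"unexpected answer: {reported!r}")


if __name__ == "__main__":
    # An invalid input reported as a theorem is an unsoundness.
    print(classify(input_is_valid=False, reported="Theorem"))
    # A valid input reported as a non-theorem is a completeness issue.
    print(classify(input_is_valid=True, reported="Non-theorem"))
```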

  38. Violating the Contract of Proper Behaviour
      General error conditions shared with any other program:
      - program crash, e.g.,
        - unhandled exceptions
        - signal interrupts (SIGFPE, SIGSEGV)
      - assertion violation
        - defensive development via assertions
        - around 2500 assertions in total (one per 77 lines on average)
        - potential errors detected early on
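To illustrate what defensive development via assertions can look like, here is a small sketch in Python. Vampire itself is written in C++ and its actual assertions are not shown in the talk, so the function `resolve_on` and the invariants it checks are purely illustrative assumptions.

```python
from typing import FrozenSet

Clause = FrozenSet[int]  # assumed encoding: 3 is atom P3, -3 is its negation


def resolve_on(c1: Clause, c2: Clause, lit: int) -> Clause:
    """Resolve c1 and c2 on the literal lit, checking the contract on the way."""
    # Preconditions: the caller must pass a genuinely complementary pair.
    assert lit in c1, "lit must occur in the first premise"
    assert -lit in c2, "the complement of lit must occur in the second premise"

    resolvent = frozenset((c1 - {lit}) | (c2 - {-lit}))

    # Postcondition: the conclusion never has more literals than the
    # premises combined, minus the resolved pair.
    assert len(resolvent) <= len(c1) + len(c2) - 2
    return resolvent
```

A call that violates the contract, such as `resolve_on(frozenset({1}), frozenset({2}), 1)`, fails immediately with an `AssertionError` at the point of misuse instead of silently producing an unsound conclusion later on.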
