
SLIDE 1

Automated Reasoning for Systems Engineering

Laura Kovács

Vienna University of Technology

SLIDE 2

Future and Our Motivation

1. Automated reasoning, in particular theorem proving, will remain central in software verification and program analysis. The role of theorem proving in these areas will keep growing.

2. Theorem provers will be used by a large number of users who do not understand theorem proving, and by users with only elementary knowledge of logic.

3. Reasoning with both quantifiers and theories will remain the main challenge in practical applications of theorem proving, at least for the next decade.

4. Theorem provers will be used for reasoning with very large theories. Such theories will appear in knowledge mining and natural language processing.


SLIDE 6

Outline

Automated Theorem Proving - An Overview
Challenges of Automated Theorem Proving

SLIDE 9

First-Order Theorem Proving. Example

Group theory theorem: if a group satisfies the identity x² = 1, then it is commutative.

More formally: in a group, "assuming that x² = 1 holds for all x, prove that x · y = y · x holds for all x, y."

What is implicit: the axioms of group theory.

∀x (1 · x = x)
∀x (x⁻¹ · x = 1)
∀x∀y∀z ((x · y) · z = x · (y · z))

SLIDE 10

Formulation in First-Order Logic

Axioms (of group theory):
∀x (1 · x = x)
∀x (x⁻¹ · x = 1)
∀x∀y∀z ((x · y) · z = x · (y · z))

Assumption:
∀x (x · x = 1)

Conjecture:
∀x∀y (x · y = y · x)
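As a sanity check, the conjecture can be tested in a concrete finite model before calling a prover. The Python snippet below uses the Klein four-group, encoded as 2-bit integers under XOR (our own choice of model): every element satisfies x · x = 1, and the group is indeed commutative. This checks one model only; the general claim is what the theorem prover establishes.

```python
from itertools import product

# Klein four-group: elements 0b00..0b11, group operation XOR, identity 0.
G = list(range(4))
mult = lambda x, y: x ^ y
e = 0

# The assumption x * x = 1 holds for every element...
assert all(mult(x, x) == e for x in G)

# ...and the group is commutative, as the theorem predicts.
commutes = all(mult(x, y) == mult(y, x) for x, y in product(G, G))
print(commutes)  # True
```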

SLIDE 12

In the TPTP Syntax

The TPTP library (Thousands of Problems for Theorem Provers), http://www.tptp.org, contains a large collection of first-order problems. For representing these problems it uses the TPTP syntax, which is understood by all modern theorem provers, including our Vampire prover.

First-Order Logic (FOL)      TPTP
⊥, ⊤                         $false, $true
¬F                           ~F
F1 ∧ ... ∧ Fn                F1 & ... & Fn
F1 ∨ ... ∨ Fn                F1 | ... | Fn
F1 → F2                      F1 => F2
(∀x1)...(∀xn)F               ! [X1,...,Xn] : F
(∃x1)...(∃xn)F               ? [X1,...,Xn] : F
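To make the correspondence in the table concrete, here is a small Python pretty-printer from a formula AST to TPTP concrete syntax. The tuple-based AST encoding is our own illustration, not part of TPTP itself.

```python
# Map a tiny FOL AST (nested tuples) to TPTP syntax, following the
# FOL-to-TPTP table: ~ for negation, &/|/=> for connectives,
# ! [Xs] : F and ? [Xs] : F for quantifiers.
def tptp(f):
    op = f[0]
    if op == "atom":                 # ("atom", "mult(X,Y) = mult(Y,X)")
        return f[1]
    if op == "not":
        return "~" + tptp(f[1])
    if op in ("&", "|", "=>"):       # n-ary for & and |, binary for =>
        return "(" + (" " + op + " ").join(tptp(g) for g in f[1:]) + ")"
    if op in ("!", "?"):             # ("!", ["X","Y"], body)
        return op + " [" + ",".join(f[1]) + "] : " + tptp(f[2])
    raise ValueError(op)

conj = ("!", ["X", "Y"], ("atom", "mult(X,Y) = mult(Y,X)"))
print("fof(commutativity,conjecture, " + tptp(conj) + " ).")
# fof(commutativity,conjecture, ! [X,Y] : mult(X,Y) = mult(Y,X) ).
```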

SLIDE 13

Example in the TPTP Syntax

%---- 1 * x = x
fof(left_identity,axiom,(
    ! [X] : mult(e,X) = X )).

%---- i(x) * x = 1
fof(left_inverse,axiom,(
    ! [X] : mult(inverse(X),X) = e )).

%---- (x * y) * z = x * (y * z)
fof(associativity,axiom,(
    ! [X,Y,Z] : mult(mult(X,Y),Z) = mult(X,mult(Y,Z)) )).

%---- x * x = 1
fof(group_of_order_2,hypothesis,
    ! [X] : mult(X,X) = e ).

%---- prove x * y = y * x
fof(commutativity,conjecture,
    ! [X,Y] : mult(X,Y) = mult(Y,X) ).

SLIDE 17

Example in the TPTP Syntax

◮ Comments;
◮ Input formula names;
◮ Input formula roles (very important);
◮ Equality.

%---- 1 * x = x
fof(left_identity,axiom,(
    ! [X] : mult(e,X) = X )).

%---- i(x) * x = 1
fof(left_inverse,axiom,(
    ! [X] : mult(inverse(X),X) = e )).

%---- (x * y) * z = x * (y * z)
fof(associativity,axiom,(
    ! [X,Y,Z] : mult(mult(X,Y),Z) = mult(X,mult(Y,Z)) )).

%---- x * x = 1
fof(group_of_order_2,hypothesis,
    ! [X] : mult(X,X) = e ).

%---- prove x * y = y * x
fof(commutativity,conjecture,
    ! [X,Y] : mult(X,Y) = mult(Y,X) ).

SLIDE 20

Running a Theorem Prover on a TPTP file

Running a prover is easy, for example:

vampire <filename>

One can also run Vampire with various options. For example, save the group theory problem in a file group.tptp and try:

vampire --thanks ECSS group.tptp


SLIDE 29

Proof by Vampire (Slightly Modified)

Refutation found.

270. $false [trivial inequality removal 269]
269. mult(sK0,sK1) != mult(sK0,sK1) [superposition 14,125]
125. mult(X2,X3) = mult(X3,X2) [superposition 21,90]
90. mult(X4,mult(X3,X4)) = X3 [forward demodulation 75,27]
75. mult(inverse(X3),e) = mult(X4,mult(X3,X4)) [superposition 22,19]
27. mult(inverse(X2),e) = X2 [superposition 21,11]
22. mult(inverse(X4),mult(X4,X5)) = X5 [forward demodulation 17,10]
21. mult(X0,mult(X0,X1)) = X1 [forward demodulation 15,10]
19. e = mult(X0,mult(X1,mult(X0,X1))) [superposition 12,13]
17. mult(e,X5) = mult(inverse(X4),mult(X4,X5)) [superposition 12,11]
15. mult(e,X1) = mult(X0,mult(X0,X1)) [superposition 12,13]
14. mult(sK0,sK1) != mult(sK1,sK0) [cnf transformation 9]
13. e = mult(X0,X0) [cnf transformation 4]
12. mult(X0,mult(X1,X2)) = mult(mult(X0,X1),X2) [cnf transformation 3]
11. e = mult(inverse(X0),X0) [cnf transformation 2]
10. mult(e,X0) = X0 [cnf transformation 1]
9. mult(sK0,sK1) != mult(sK1,sK0) [skolemisation 7,8]
8. ?[X0,X1]: mult(X0,X1) != mult(X1,X0) <=> mult(sK0,sK1) != mult(sK1,sK0) [choice axiom]
7. ?[X0,X1]: mult(X0,X1) != mult(X1,X0) [ennf transformation 6]
6. ~![X0,X1]: mult(X0,X1) = mult(X1,X0) [negated conjecture 5]
5. ![X0,X1]: mult(X0,X1) = mult(X1,X0) [input]
4. ![X0]: e = mult(X0,X0) [input]
3. ![X0,X1,X2]: mult(X0,mult(X1,X2)) = mult(mult(X0,X1),X2) [input]
2. ![X0]: e = mult(inverse(X0),X0) [input]
1. ![X0]: mult(e,X0) = X0 [input]

◮ Each inference derives a formula from zero or more other formulas;
◮ Input, preprocessing, new-symbol introduction, superposition calculus;
◮ Proof by refutation, generating and simplifying inferences, unused formulas, ...

SLIDE 31

Vampire

◮ Completely automatic: once a proof attempt has started, it can only be interrupted by terminating the process.

◮ Champion of the CASC world cup in first-order theorem proving: won CASC 38 times.

SLIDE 32

What an Automated Theorem Prover is Expected to Do

Input:
◮ a set of axioms (first-order formulas) or clauses;
◮ a conjecture (a first-order formula or a set of clauses).

Output:
◮ a proof (hopefully).

SLIDE 35

Proof by Refutation

Given a problem with axioms and assumptions F1, ..., Fn and conjecture G:

1. negate the conjecture (¬G);
2. establish unsatisfiability of the set of formulas F1, ..., Fn, ¬G.

Thus, we reduce the theorem-proving problem to the problem of checking unsatisfiability.

In this formulation the negation of the conjecture ¬G is treated like any other formula. In fact, Vampire (and other provers) internally treat conjectures differently, to make proof search more goal-oriented.
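The refutation recipe can be illustrated in miniature. The Python sketch below works propositionally, by truth-table enumeration, rather than with Vampire's first-order machinery; the formulas F1, F2, G over atoms p, q, r are hypothetical examples. To prove that F1 and F2 entail G, we check that {F1, F2, ¬G} has no satisfying assignment.

```python
from itertools import product

# Hypothetical axioms and conjecture over atoms p, q, r:
#   F1: p -> q,  F2: q -> r,  conjecture G: p -> r
F1 = lambda p, q, r: (not p) or q
F2 = lambda p, q, r: (not q) or r
G  = lambda p, q, r: (not p) or r

def refutes(formulas):
    """True iff the set of formulas is (propositionally) unsatisfiable."""
    return not any(all(f(p, q, r) for f in formulas)
                   for p, q, r in product([False, True], repeat=3))

# F1, F2, not-G is unsatisfiable, hence G follows from F1 and F2.
entailed = refutes([F1, F2, lambda p, q, r: not G(p, q, r)])
print(entailed)  # True
```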

SLIDE 37

General Scheme (simplified)

◮ Read a problem;
◮ Determine proof-search options to be used for this problem;
◮ Preprocess the problem;
◮ Convert it into a normal form (CNF);
◮ Run a saturation algorithm on it, trying to derive false;
◮ If false is derived, report the result, maybe including a refutation.

Trying to derive false using a saturation algorithm is the hardest part; in practice it may not terminate or may run out of memory.

SLIDE 38

How to Establish Unsatisfiability?

Idea:

◮ Take a set of clauses S (the search space), initially S = S0. Repeatedly apply inferences to clauses in S and add their conclusions to S, unless these conclusions are already in S.

◮ If, at any stage, we obtain false, we terminate and report unsatisfiability of S0.
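A minimal executable sketch of this idea follows, using propositional binary resolution as the only inference rule in place of first-order superposition; the clause encoding and the example clauses are our own illustration, not Vampire's data structures.

```python
# Clauses are frozensets of (atom, polarity) literals.

def resolvents(c1, c2):
    """All binary resolvents of two clauses."""
    out = []
    for atom, pol in c1:
        if (atom, not pol) in c2:
            out.append((c1 - {(atom, pol)}) | (c2 - {(atom, not pol)}))
    return out

def saturate(s0):
    """Return True if false (the empty clause) is derivable from s0,
    False once the set is saturated without deriving it."""
    search_space = set(s0)        # S, the search space
    unprocessed = list(s0)        # queue of future given clauses
    while unprocessed:
        given = unprocessed.pop(0)             # select the given clause
        for candidate in list(search_space):   # pair with each candidate
            for child in resolvents(given, candidate):
                if not child:                  # empty clause: false
                    return True
                if child not in search_space:  # add only new conclusions
                    search_space.add(child)
                    unprocessed.append(child)
    return False

# {p}, {~p, q}, {~q} is unsatisfiable:
clauses = [frozenset({("p", True)}),
           frozenset({("p", False), ("q", True)}),
           frozenset({("q", False)})]
print(saturate(clauses))  # True
```

Propositionally this loop always terminates; in the first-order setting it may run forever, which is why fairness and resource limits matter (next slides).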

SLIDE 39

Saturation Algorithms

[Animation over several slides: a given clause is selected from the search space, paired with candidate clauses, and the resulting children are added back to the search space; the cycle repeats until the search space exhausts MEMORY.]

SLIDE 54

Saturation Algorithm in Practice

In practice there are three possible scenarios:

1. At some moment false is generated; in this case the input set of clauses is unsatisfiable.

2. Saturation terminates without ever generating false; in this case the input set of clauses is satisfiable.

3. Saturation runs until we run out of resources, but without generating false. In this case it is unknown whether the input set is unsatisfiable.

SLIDE 58

From Theory to Practice

In practice, saturation theorem provers implement:

◮ Preprocessing and CNF transformation;
◮ The superposition system;
◮ Orderings and selection functions;
◮ Fairness (saturation algorithms);
◮ Deletion and generation of clauses in the search space;
◮ Many, many proof options and strategies – example: limited resource strategy.

Try:

vampire --age_weight_ratio 10:1 --time_limit 86400 GRP140-1.p
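The age-weight ratio interleaves two criteria for choosing the next given clause. The sketch below is our own simplified model of the idea, not Vampire's implementation, and the exact semantics of the 10:1 ratio (which side counts age) is an assumption here: out of every 11 selections, 10 pick the oldest clause and 1 picks the lightest, so neither old heavy clauses nor new light ones are starved.

```python
# Toy model of age-weight given-clause selection (assumed semantics).
def select_given(passive, tick, ratio=(10, 1)):
    """passive: list of (age, clause) pairs; returns (index, clause)."""
    by_age, by_weight = ratio
    if tick % (by_age + by_weight) < by_age:   # an "age" pick: oldest clause
        i = min(range(len(passive)), key=lambda k: passive[k][0])
    else:                                      # a "weight" pick: fewest literals
        i = min(range(len(passive)), key=lambda k: len(passive[k][1]))
    return i, passive[i][1]

passive = [(0, ["p", "q", "r"]), (1, ["s"]), (2, ["t", "u"])]
print(select_given(passive, tick=0))   # (0, ['p', 'q', 'r'])  oldest
print(select_given(passive, tick=10))  # (1, ['s'])            lightest
```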

SLIDE 59

Outline

Automated Theorem Proving - An Overview
Challenges of Automated Theorem Proving

SLIDE 60

What an Automated Theorem Prover was Expected to Do

Input:
◮ a set of axioms (first-order formulas) or clauses;
◮ a conjecture (a first-order formula or a set of clauses).

Output:
◮ a proof (hopefully).

SLIDE 61

What an Automated Theorem Prover is Expected to Do

Input:
◮ a set of axioms (first-order formulas) or clauses;
◮ a conjecture (a first-order formula or a set of clauses).

Output:
◮ a readable proof;
◮ relevant lemmas extracted from proofs;
◮ Craig interpolants extracted from software safety proofs;
◮ program analysis;
◮ invariant generation;
◮ inductive reasoning;
◮ reasoning with first-order theories of data structures;
◮ ...

SLIDE 62

Chalmers, Laura Kovács

Focus of my Research:

Automated Reasoning for Program Analysis
(ex. ~200kLoC, Vampire prover)

SLIDE 63

a = 0; b = 0; c = 0;
while (a < n) do
   if A[a] > 0 then
      B[b] = A[a] + h(b); b = b + 1;
   else
      C[c] = A[a]; c = c + 1;
   a = a + 1;
end do

Program property:
(∀p)(0 ≤ p < b ⇒ (∃q)(0 ≤ q < a ∧ B[p] = A[q] + h(p) ∧ A[q] > 0))

Focus of my Research:

Automated Reasoning for Program Analysis
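The quantified property of the array-partitioning loop can be sanity-checked on a concrete run. In the Python simulation below, the input array A and the interpretation chosen for the uninterpreted function h are arbitrary test values; the assertion re-checks the property after every loop iteration.

```python
def h(i):
    return 10 * i  # arbitrary interpretation of h, fixed for this test

def run(A):
    """Simulate the loop from the slide and check its invariant."""
    a = b = c = 0
    n = len(A)
    B, C = [None] * n, [None] * n
    while a < n:
        if A[a] > 0:
            B[b] = A[a] + h(b)
            b += 1
        else:
            C[c] = A[a]
            c += 1
        a += 1
        # (forall p)(0 <= p < b  implies
        #     (exists q)(0 <= q < a and B[p] = A[q] + h(p) and A[q] > 0))
        assert all(any(B[p] == A[q] + h(p) and A[q] > 0 for q in range(a))
                   for p in range(b))
    return B[:b], C[:c]

print(run([3, -1, 5, 0, 2]))  # ([3, 15, 22], [-1, 0])
```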

SLIDE 65

cnt = 0; fib1 = 1; fib2 = 0;
while (cnt < n) do
   t = fib1;
   fib1 = fib1 + fib2;
   fib2 = t;
   cnt++;
end do

Program property:
fib1⁴ + fib2⁴ + 2·fib1·fib2³ - 2·fib1³·fib2 - fib1²·fib2² - 1 = 0

Focus of my Research:

Automated Reasoning for Program Analysis
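The polynomial identity is a loop invariant: it holds initially and is preserved by every iteration. A quick Python check, with an arbitrary test bound n = 20, confirms this on a concrete run:

```python
# The degree-4 polynomial from the slide, as a function of the loop state.
def invariant(fib1, fib2):
    return (fib1**4 + fib2**4 + 2*fib1*fib2**3
            - 2*fib1**3*fib2 - fib1**2*fib2**2 - 1)

cnt, fib1, fib2, n = 0, 1, 0, 20
assert invariant(fib1, fib2) == 0      # holds before the loop
while cnt < n:
    t = fib1
    fib1 = fib1 + fib2
    fib2 = t
    cnt += 1
    assert invariant(fib1, fib2) == 0  # preserved by each iteration
print(fib1, fib2, invariant(fib1, fib2))  # 10946 6765 0
```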

SLIDE 67

Math and Logic capture the properties of both loops:

fib1⁴ + fib2⁴ + 2·fib1·fib2³ - 2·fib1³·fib2 - fib1²·fib2² - 1 = 0

(∀p)(0 ≤ p < b ⇒ (∃q)(0 ≤ q < a ∧ B[p] = A[q] + h(p) ∧ A[q] > 0))

Focus of my Research:

Automated Reasoning for Program Analysis

SLIDE 68

My Research: Math, Logic, Program Analysis

Vampire prover

SLIDE 69

My Research: Symbolic Computation, Automated Theorem Proving, Program Analysis

funded by: [sponsor logos]