Instantiation-Based Automated Theorem Proving for First-Order Logic


SLIDE 1

Instantiation-Based Automated Theorem Proving for First-Order Logic

Konstantin Korovin

The University of Manchester, UK. korovin@cs.man.ac.uk

SLIDE 3

Theorem proving for first-order logic

Theorem proving: show that a given first-order formula is a theorem.

Maths: axioms of groups (Group):

◮ ∀x, y, z (x · (y · z) ≃ (x · y) · z)
◮ ∀x (x · x⁻¹ ≃ e)
◮ ∀x (x · e ≃ x)

Consider F = ∀x∃y ((x · y)⁻¹ ≃ y⁻¹ · x⁻¹). Is F a theorem of group theory: Group ⊨ F?

Verification: axioms of arrays:

◮ ∀a, i, e (select(store(a, i, e), i) ≃ e)
◮ ∀a, i, j, e (i ̸≃ j → select(store(a, i, e), j) ≃ select(a, j))
◮ ∀a1, a2 ((∀i (select(a1, i) ≃ select(a2, i))) → a1 ≃ a2)

Is ∃a∃i∀j (select(a, i) ≃ select(a, j)) a theorem in the theory of arrays?

SLIDE 6

Why first-order logic

◮ Expressive: most of mathematics can be formalised in FOL*
◮ Complete calculi: uniform reasoning methods
◮ Efficient reasoning: well-understood algorithms and data structures
◮ Reductions from HOL to FOL: Blanchette (Sledgehammer), Urban (Mizar)

FOL provides a good balance between expressivity and efficiency.

* ["The Unreasonable Effectiveness of Mathematics in the Natural Sciences", E. Wigner] [Unreasonable effectiveness of logic in computer science]

SLIDE 7

Calculi for first-order logic

Calculi complete for first-order logic:

◮ natural deduction
  ◮ difficult to automate
◮ tableaux-based calculi
  ◮ popular for special fragments: modal and description logics
  ◮ difficult to automate efficiently in the general case
◮ resolution/superposition calculi
  ◮ general purpose
  ◮ can be efficiently automated
  ◮ decision procedure for many fragments
◮ instantiation-based calculi
  ◮ combination of efficient propositional reasoning with first-order reasoning
  ◮ can be efficiently automated
  ◮ decision procedure for the effectively propositional fragment (EPR)

SLIDE 11

Refutational theorem proving

Theorem proving: ⊨ Axioms → Theorem
Refutational theorem proving: Axioms ∧ ¬Theorem ⊨ ⊥

Other reasoning problems (validity, equivalence, etc.) can be reduced to (un)satisfiability. In order to apply efficient reasoning methods we need to transform formulas into an equi-satisfiable conjunctive normal form.

SLIDE 16

CNF transformation

Main steps in the basic CNF transformation:

1. Prenex normal form: moving all quantifiers up front
∀y [∀x [p(f(x), y)] → ∀v∃z [q(f(z)) ∧ p(v, z)]]
⇒ ∀y∃x∀v∃z [p(f(x), y) → (q(f(z)) ∧ p(v, z))]

2. Skolemization: eliminating existential quantifiers
∀y∃x∀v∃z [p(f(x), y) → (q(f(z)) ∧ p(v, z))]
⇒ ∀y∀v [p(f(sk1(y)), y) → (q(f(sk2(y, v))) ∧ p(v, sk2(y, v)))]

3. CNF transformation of the quantifier-free part
∀y∀v [p(f(sk1(y)), y) → (q(f(sk2(y, v))) ∧ p(v, sk2(y, v)))]
⇒ ∀y∀v [(¬p(f(sk1(y)), y) ∨ q(f(sk2(y, v)))) ∧ (¬p(f(sk1(y)), y) ∨ p(v, sk2(y, v)))]

Main reasoning problem: given a set of clauses S, prove that it is (un)satisfiable.
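Step 3 above can be sketched in code. This is a minimal illustrative sketch of the naive (worst-case exponential) CNF transformation — eliminate implications, push negations inward, then distribute ∨ over ∧ — not the structure-preserving transformation real provers use; the tuple encoding of formulas (atoms as strings) is an assumption of this sketch:

```python
def nnf(phi, neg=False):
    """Eliminate '->' and push negations down to atoms (negation normal form)."""
    if isinstance(phi, str):                       # atom
        return ("not", phi) if neg else phi
    op = phi[0]
    if op == "not":
        return nnf(phi[1], not neg)
    if op == "imp":                                # a -> b  ==  ~a | b
        return nnf(("or", ("not", phi[1]), phi[2]), neg)
    a, b = nnf(phi[1], neg), nnf(phi[2], neg)      # De Morgan when neg is set
    if op == "and":
        return ("or", a, b) if neg else ("and", a, b)
    return ("and", a, b) if neg else ("or", a, b)  # op == "or"

def cnf(phi):
    """Clauses (lists of literals) of an NNF formula."""
    if isinstance(phi, str) or phi[0] == "not":
        return [[phi]]
    if phi[0] == "and":
        return cnf(phi[1]) + cnf(phi[2])
    # "or": distribute over the clause sets of both sides
    return [c1 + c2 for c1 in cnf(phi[1]) for c2 in cnf(phi[2])]

# Same propositional shape as the slide's example p(...) -> (q(...) & p(...)):
print(cnf(nnf(("imp", "p", ("and", "q", "r")))))
# -> [[('not', 'p'), 'q'], [('not', 'p'), 'r']], i.e. (~p | q) & (~p | r)
```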

SLIDE 17

Inference systems: propositional resolution

SLIDE 20

Inference-based theorem proving

Given: S, a set of clauses. Example: S = {q ∨ ¬p, p ∨ q, ¬q}
We want to prove that S is unsatisfiable.

General idea:
◮ use a set of simple rules for deriving new logical consequences from S;
◮ use these inference rules to derive the contradiction, signified by the empty clause.

SLIDE 21

Propositional Resolution

The propositional resolution inference system BR consists of the following inference rules:

◮ Binary Resolution rule (BR): from C ∨ p and ¬p ∨ D derive C ∨ D
◮ Binary Factoring rule (BF): from C ∨ L ∨ L derive C ∨ L, where L is a literal.

SLIDE 22

Example

Given: S = {q ∨ ¬p, p ∨ q, ¬q}. A proof in the resolution calculus:

1. q ∨ q from q ∨ ¬p and p ∨ q by (BR)
2. q from q ∨ q by (BF)
3. the empty clause from q and ¬q by (BR)
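The derivation above can be replayed with a small saturation sketch; the encoding (clauses as frozensets of string literals, negation as a '~' prefix) is an assumption of this illustration, and Binary Factoring is implicit because a frozenset collapses L ∨ L to L:

```python
from itertools import combinations

def neg(l):
    """Complement of a literal: q <-> ~q."""
    return l[1:] if l.startswith("~") else "~" + l

def resolvents(c1, c2):
    """All Binary Resolution conclusions of two clauses."""
    for l in c1:
        if neg(l) in c2:
            # BF is implicit: the set union merges duplicate literals
            yield (c1 - {l}) | (c2 - {neg(l)})

def saturate(clauses):
    """Exhaustively apply BR until the empty clause appears or S is saturated."""
    s = set(clauses)
    while True:
        new = {r for c1, c2 in combinations(s, 2) for r in resolvents(c1, c2)}
        if frozenset() in new:
            return "unsatisfiable"
        if new <= s:
            return "saturated (satisfiable)"
        s |= new

S = [frozenset({"q", "~p"}), frozenset({"p", "q"}), frozenset({"~q"})]
print(saturate(S))  # -> unsatisfiable, matching the three-step proof above
```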
SLIDE 24

Soundness/Completeness

Theorem (Soundness)
Resolution is a sound inference system: S ⊢BR □ implies S ⊨ ⊥.

Theorem (Completeness)
Resolution is a complete inference system: S ⊨ ⊥ implies S ⊢BR □.

SLIDE 27

Proof search based on inference systems

Basic approach, a saturation process: given a set of clauses S, we exhaustively apply all inference rules, adding the conclusions to this set, until the contradiction (the empty clause □) is derived: S0 ⇒ S1 ⇒ … ⇒ Sn ⇒ …

Three outcomes:

1. □ is derived (□ ∈ Sn for some n); then S is unsatisfiable (soundness).
2. No new clauses can be derived from S and □ ∉ S; then S is saturated, and in this case S is satisfiable (completeness).
3. S grows ad infinitum; the process does not terminate.

The main challenge: speed up the first two cases and reduce non-termination.

SLIDE 28

First-order resolution

SLIDE 31

Herbrand theorem

First-order clauses S: p(a) ∨ q(a, f(b)); ∀x, y [¬p(x) ∨ ¬q(x, f(y))]; … How do we check whether S is (un)satisfiable?

Theorem (Herbrand)
S is unsatisfiable if and only if there is a finite set of ground instances of clauses in S which is propositionally unsatisfiable.

General approach: enumerate ground instances and apply resolution to the ground instances.
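The enumeration idea can be sketched as below; the depth-bounded universe generator and the term encoding are illustrative assumptions of this sketch, since the real Herbrand universe is infinite whenever a function symbol is present:

```python
from itertools import product

def herbrand_universe(consts, funcs, depth):
    """Ground terms over the signature, nested at most `depth` function levels."""
    terms = set(consts)
    for _ in range(depth):
        terms |= {(f, args) for f, arity in funcs
                  for args in product(terms, repeat=arity)}
    return terms

# Signature of the slides' example: constants a, b and a unary function f
universe = herbrand_universe({"a", "b"}, [("f", 1)], depth=1)
print(len(universe))  # -> 4: the terms a, b, f(a), f(b)

# All ground instances of the clause ~q(x, f(y)) over this bounded universe;
# among them is ~q(a, f(b)), the partner needed to resolve with p(a) | q(a, f(b)).
instances = {("~q", (x, ("f", (y,)))) for x in universe for y in universe}
print(len(instances))  # -> 16
```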

SLIDE 36

Herbrand theorem

First-order clauses S: p(a) ∨ q(a, f(b)); ¬p(z); ¬q(x, f(y)). How do we check whether S is (un)satisfiable? Replace variables by ground terms and apply resolution. Among the ground instances of ¬q(x, f(y)) are ¬q(a, f(a)), ¬q(b, f(f(a))), …, and in particular ¬q(a, f(b)):

1. p(a) from p(a) ∨ q(a, f(b)) and ¬q(a, f(b)) by (BR)
2. the empty clause from p(a) and the instance ¬p(a) of ¬p(z) by (BR)

SLIDE 42

Non-ground resolution

◮ A non-ground clause can be seen as a representation of a (possibly infinite) set of its ground instances.
◮ Consider q(x, a) ∨ p(x) and q(y, z) ∨ ¬p(f(y)).
◮ A common instance to which ground resolution is applicable: q(f(a), a) ∨ p(f(a)) and q(a, a) ∨ ¬p(f(a)).
◮ There are other ground instances, e.g.: q(f(f(a)), a) ∨ p(f(f(a))) and q(f(a), f(f(f(a)))) ∨ ¬p(f(f(a))).
◮ In order to apply ground resolution we need to find a substitution which makes the atoms p(x) and p(f(y)) syntactically equal.
◮ Such substitutions are called unifiers.
◮ Even for two clauses there is an infinite number of possible instances to which resolution is applicable.

SLIDE 44

Most general unifiers

◮ Consider q(x, a) ∨ p(x) and q(y, z) ∨ ¬p(f(y)).
◮ Substitute σ = {x ↦ f(y)};
◮ then we obtain q(f(y), a) ∨ p(f(y)) and q(y, z) ∨ ¬p(f(y)).

Note:
1. the unified atoms p(f(y)) are now syntactically equal;
2. any other unifying substitution can be seen as an instance of σ: σ is a most general unifier, σ = mgu(p(x), p(f(y)));
3. σ can be seen as a finite representation of the infinitely many substitutions which make the terms equal.

Theorem [Robinson 1965]: If two atoms p(t(x̄)) and p(s(x̄)) have a common ground instance then there is a unique (up to variable renaming) most general unifier σ, which can be effectively computed. Note p(t(x̄))σ = p(s(x̄))σ.
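A textbook version of Robinson's algorithm can be sketched as follows; the term encoding (variables as '?'-prefixed strings, compound terms as (symbol, args) tuples) is an assumption of this sketch, not Robinson's original presentation:

```python
def apply(sub, t):
    """Apply a substitution (dict var -> term) to a term, following chains."""
    if isinstance(t, str):
        return apply(sub, sub[t]) if t in sub else t
    f, args = t
    return (f, tuple(apply(sub, a) for a in args))

def occurs(v, t):
    """Occurs check: does variable v appear inside term t?"""
    if isinstance(t, str):
        return v == t
    return any(occurs(v, a) for a in t[1])

def mgu(s, t, sub=None):
    """Most general unifier of s and t, or None if they do not unify."""
    sub = {} if sub is None else sub
    s, t = apply(sub, s), apply(sub, t)
    if s == t:
        return sub
    if isinstance(s, str) and s.startswith("?"):
        if occurs(s, t):          # ?x vs f(?x) has no unifier
            return None
        return {**sub, s: t}
    if isinstance(t, str) and t.startswith("?"):
        return mgu(t, s, sub)
    if isinstance(s, str) or isinstance(t, str) or s[0] != t[0] or len(s[1]) != len(t[1]):
        return None               # clash of constants or function symbols
    for a, b in zip(s[1], t[1]):  # unify argument lists left to right
        sub = mgu(a, b, sub)
        if sub is None:
            return None
    return sub

# The slides' example mgu(p(x), p(f(y))) = {x -> f(y)}:
print(mgu(("p", ("?x",)), ("p", (("f", ("?y",)),))))  # -> {'?x': ('f', ('?y',))}
```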

SLIDE 46

First-order resolution

◮ Resolution rule (BR): from C ∨ p and ¬p′ ∨ D derive (C ∨ D)σ, where σ = mgu(p, p′).
◮ Example: from q(x, a) ∨ p(x) and q(y, z) ∨ ¬p(f(y)) derive q(f(y), a) ∨ q(y, z), where mgu(p(x), p(f(y))) = {x ↦ f(y)}.

Theorem [Bachmair, Ganzinger]: Resolution with many refinements is complete for first-order logic.

SLIDE 48

The magic of resolution

The resolution calculus with appropriate simplifications, selection functions and saturation strategies is a decision procedure for many fragments:

◮ monadic fragment [Bachmair, Ganzinger, Waldmann]
◮ modal logic translations [Hustadt, Schmidt]
◮ guarded fragment [Ganzinger, de Nivelle]
◮ two-variable fragment [de Nivelle, Pratt-Hartmann]
◮ fluted fragment [Hustadt, Schmidt, Georgieva]
◮ many description logic fragments [Kazakov, Motik, Sattler, …]
◮ …

◮ Original proofs of decidability for these fragments are based on diverse, complicated, model-theoretic arguments.
◮ Resolution-based methods provide practical procedures.
◮ Vampire, E and SPASS are based on extensions of resolution.

SLIDE 49

Modular instantiation-based reasoning

SLIDE 50

SAT/SMT vs First-Order

The main reasoning problem: check that a given set of clauses S is (un)satisfiable.

Ground (SAT/SMT):
◮ e.g. bv(a) ∨ mem(c, d), ¬bv(a) ∨ mem(d, c)
◮ very efficient solvers; not very expressive
◮ CDCL/congruence closure

First-order:
◮ e.g. ∀x∃y ¬mem1(x, y) ∨ mem2(y, f(x)), bv(a) ∨ mem(d, c)
◮ very expressive; ground: not as efficient
◮ resolution/superposition

From ground to first-order: efficient at ground + expressive?

SLIDE 51

Resolution weaknesses

Resolution: from C ∨ L and L′ ∨ D derive (C ∨ D)σ.
Example: from Q(x) ∨ P(x) and ¬P(a) ∨ R(y) derive Q(a) ∨ R(y).

Weaknesses:
◮ inefficient in the propositional case
◮ proof search without model search
◮ length of clauses can grow fast
◮ recombination of clauses L1 ∨ C1, …, Ln ∨ Cn
◮ no effective model representation

SLIDE 54

Basic idea behind instantiation proving

Can we approximate first-order by ground reasoning?

Theorem (Herbrand): S is unsatisfiable if and only if there is a finite set of ground instances of clauses of S which is propositionally unsatisfiable.

Basic idea: interleave instantiation with propositional reasoning. Main issues:
◮ how to restrict instantiations;
◮ how to interleave instantiation with propositional reasoning.

[Wang'59; Gilmore'60; Plaisted'92; Inst-Gen: Ganzinger, Korovin; Model Evolution: Baumgartner, Tinelli; AVATAR: Voronkov; SGGS: Bonacina, Plaisted; Weidenbach, …; SMT quantifier instantiations: Ge, de Moura, Reynolds, …]

SLIDE 60

Overview of the Inst-Gen procedure

◮ Abstract the first-order clauses S to ground clauses S⊥ by mapping all variables to a distinguished constant: ⊥ : x̄ ↦ ⊥.
◮ If S⊥ is unsatisfiable, the theorem is proved.
◮ If S⊥ is satisfiable, a propositional model Igr ⊨ S⊥ guides instantiation: for clauses C ∨ L and L′ ∨ D whose selected literals satisfy Igr ⊨ L⊥, L′⊥ and whose atoms unify, σ = mgu(L, L̄′), add the instances (C ∨ L)σ and (L′ ∨ D)σ, and repeat.

Theorem (Ganzinger, Korovin): Inst-Gen is sound and complete for FOL.

SLIDE 66

Example:

S: p(f(x), b) ∨ q(x, y); ¬p(f(f(x)), y); ¬q(f(x), x)
S⊥: p(f(⊥), b) ∨ q(⊥, ⊥); ¬p(f(f(⊥)), ⊥); ¬q(f(⊥), ⊥)

S⊥ is satisfiable, so new instances are generated: p(f(f(x)), b) ∨ q(f(x), y) and ¬p(f(f(x)), b).
Their abstractions p(f(f(⊥)), b) ∨ q(f(⊥), ⊥) and ¬p(f(f(⊥)), b) are added to S⊥.

The final set is propositionally unsatisfiable.

SLIDE 67

Resolution vs Inst-Gen

Resolution: from C ∨ L and L′ ∨ D derive (C ∨ D)σ, where σ = mgu(L, L̄′).
Instantiation: from C ∨ L and L′ ∨ D derive (C ∨ L)σ and (L′ ∨ D)σ, where σ = mgu(L, L̄′).

Weaknesses of resolution:
◮ proof search without model search
◮ inefficient in the ground/EPR case
◮ length of clauses can grow fast
◮ recombination of clauses
◮ no explicit model representation

Strengths of instantiation:
◮ proof search guided by propositional models
◮ modular ground reasoning
◮ length of clauses is fixed
◮ decision procedure for EPR
◮ no recombination of clauses
◮ redundancy elimination
◮ effective model representation

SLIDE 68

Redundancy Elimination (Inst-Gen)

The key to efficiency is redundancy elimination.

◮ usual: tautology elimination, strict subsumption
◮ global subsumption: non-ground simplifications using SAT/SMT reasoning
◮ blocking non-proper instantiators
◮ dismatching constraints
◮ predicate elimination
◮ sort inference/redundancies
◮ definitional redundancies
◮ …

SLIDE 71

Redundancy Elimination

The key to efficiency is redundancy elimination.

A ground clause C is redundant if there are clauses C1, …, Cn such that:
◮ C1, …, Cn ⊨ C
◮ C1, …, Cn ≺ C
where ≺ is a well-founded ordering.

Example: P(a) ⊨ Q(b) ∨ P(a) and P(a) ≺ Q(b) ∨ P(a), so Q(b) ∨ P(a) is redundant.

Theorem: Redundant clauses/closures can be eliminated.

Consequences:
◮ many usual redundancy elimination techniques
◮ redundancy for inferences
◮ new instantiation-specific redundancies

SLIDE 75

Simplifications by SAT/SMT solver (K. IJCAR’08)

Can an off-the-shelf ground solver be used to simplify ground clauses?

Abstract redundancy requires C1, …, Cn ⊨ C with C1, …, Cn ≺ C; "follows from smaller clauses" is approximated by a ground-solver check Sgr ⊨ C.

Basic idea:
◮ split off D ⊂ C
◮ check Sgr ⊨ D
◮ if so, add D to S and remove C

Global ground subsumption: replace D ∨ C′ by D, where Sgr ⊨ D and C′ ≠ ∅.

SLIDE 79

Global Ground Subsumption

Sgr: ¬Q(a, b) ∨ P(a) ∨ P(b); P(a) ∨ Q(a, b); ¬P(b)
C: P(a) ∨ Q(c, d) ∨ Q(a, c)

Since Sgr ⊨ P(a), both Q(c, d) and Q(a, c) can be removed, and C simplifies to P(a).

A minimal D ⊂ C such that Sgr ⊨ D can be found in a linear number of implication checks.

Global ground subsumption generalises:
◮ strict subsumption
◮ subsumption resolution
◮ …
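The linear minimisation described above can be sketched as follows; the '+/-'-prefixed string encoding of literals and the truth-table entailment check are illustrative stand-ins for a real SAT solver:

```python
from itertools import product

def flip(lit):
    """Complement of a literal: '+Pa' <-> '-Pa'."""
    return ("-" if lit[0] == "+" else "+") + lit[1:]

def prop_sat(clauses):
    """Toy truth-table SAT check; a stand-in for a real SAT/SMT solver."""
    atoms = sorted({lit[1:] for c in clauses for lit in c})
    for bits in product([False, True], repeat=len(atoms)):
        m = dict(zip(atoms, bits))
        if all(any(m[lit[1:]] == (lit[0] == "+") for lit in c) for c in clauses):
            return True
    return False

def entails(S, D):
    """S |= D  iff  S plus the negation of every literal of D is unsat."""
    return not prop_sat(S + [[flip(l)] for l in D])

def globally_subsume(S, C):
    """Shrink C to a minimal sub-clause D with S |= D: one check per literal."""
    D = list(C)
    if not entails(S, D):
        return C                       # C is not redundant w.r.t. S
    for lit in list(D):
        rest = [l for l in D if l != lit]
        if rest and entails(S, rest):  # this literal can be dropped
            D = rest
    return D

# The slides' example, literals encoded as '+atom' / '-atom' strings:
Sgr = [["-Qab", "+Pa", "+Pb"], ["+Pa", "+Qab"], ["-Pb"]]
C = ["+Pa", "+Qcd", "+Qac"]
print(globally_subsume(Sgr, C))  # -> ['+Pa']
```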

SLIDE 85

Non-ground simplifications by SAT/SMT (K. IJCAR’08)

An off-the-shelf SAT solver can be used to simplify ground clauses. Can we also use a SAT solver to simplify non-ground clauses? Yes!

The main idea: to establish Sgr ⊨ ∀x̄ C(x̄), check Sgr ⊨ C(d̄) for fresh constants d̄. Indeed, if C1(x̄), …, Cn(x̄) ∈ S with C1(x̄), …, Cn(x̄) ≺ C(x̄), then C1(d̄), …, Cn(d̄) ⊨ C(d̄), as in global subsumption. This yields non-ground global subsumption.

SLIDE 90

Non-Ground Global Subsumption

S: ¬P(x) ∨ Q(x); ¬Q(x) ∨ S(x, y); P(x) ∨ S(x, y); C = S(x, y) ∨ Q(x)
Grounding with fresh constants a, b gives Sgr: ¬P(a) ∨ Q(a); ¬Q(a) ∨ S(a, b); P(a) ∨ S(a, b); Cgr = S(a, b) ∨ Q(a).

Since Sgr ⊨ S(a, b), the literal Q(a) can be removed from Cgr, and lifting back, C simplifies to S(x, y).

Simplify first-order by purely ground reasoning!
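The grounding trick can be sketched as follows, reusing the slides' example; the literal encoding and the truth-table entailment check are assumptions of this sketch, standing in for a real SAT/SMT call:

```python
from itertools import product

def freshen(clause):
    """Map each variable '?v' of a clause to a fresh constant 'd_v'."""
    return [(sign, p, tuple(a.replace("?", "d_") for a in args))
            for sign, p, args in clause]

def prop_sat(clauses):
    """Toy truth-table SAT check on ground clauses."""
    atoms = sorted({(p, args) for c in clauses for _, p, args in c})
    for bits in product([False, True], repeat=len(atoms)):
        m = dict(zip(atoms, bits))
        if all(any(m[(p, a)] == s for s, p, a in c) for c in clauses):
            return True
    return False

def entails(S, D):
    """S |= D iff S plus the negated literals of D is propositionally unsat."""
    return not prop_sat(S + [[(not s, p, a)] for s, p, a in D])

# S from the slides, variables written as '?x', '?y':
S = [
    [(False, "P", ("?x",)), (True, "Q", ("?x",))],
    [(False, "Q", ("?x",)), (True, "S", ("?x", "?y"))],
    [(True, "P", ("?x",)), (True, "S", ("?x", "?y"))],
]
Sgr = [freshen(c) for c in S]
C = [(True, "S", ("?x", "?y")), (True, "Q", ("?x",))]
# The sub-clause S(x, y) of C already follows from Sgr, so C simplifies to it:
print(entails(Sgr, freshen([(True, "S", ("?x", "?y"))])))  # -> True
```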

SLIDE 91

Inst-Gen summary

Inst-Gen: modular instantiation-based reasoning for first-order logic.

◮ Inst-Gen combines efficient ground reasoning with first-order reasoning
◮ sound and complete for first-order logic
◮ decision procedure for effectively propositional logic (EPR)
◮ redundancy elimination:
  ◮ strict subsumption, subsumption resolution
  ◮ global subsumption: non-ground simplifications using SAT/SMT reasoning
  ◮ dismatching constraints
◮ preprocessing:
  ◮ predicate elimination
  ◮ sort inference: EPR and non-cyclic sorts
  ◮ semantic filter
  ◮ definition inference

SLIDE 92

Equational instantiation-based reasoning

SLIDE 93

Equality and Paramodulation

Superposition calculus: from C ∨ s ≃ t and L[s′] ∨ D derive (C ∨ D ∨ L[t])θ,
where (i) θ = mgu(s, s′), (ii) s′ is not a variable, (iii) sθ ≻ tθ, (iv) …

It has the same weaknesses as resolution:
◮ inefficient in the ground/EPR case
◮ length of clauses can grow fast
◮ recombination of clauses
◮ no explicit model representation

SLIDE 95

Equality: Superposition vs Inst-Gen

Superposition: from C ∨ l ≃ r and L[l′] ∨ D derive (C ∨ D ∨ L[r])θ, where θ = mgu(l, l′).
Instantiation?: from C ∨ l ≃ r and L[l′] ∨ D derive (C ∨ l ≃ r)θ and (L[l′] ∨ D)θ, where θ = mgu(l, l′).

Incomplete!

slide-96
SLIDE 96

Superposition+Instantiation

f(h(y)) ≃ c    h(x) ≃ x    f(a) ≃ c
This set is inconsistent, but the contradiction is not deducible by the inference system above.

96 / 1

slide-97
SLIDE 97

Superposition+Instantiation

f(h(y)) ≃ c    h(x) ≃ x    f(a) ≃ c
This set is inconsistent, but the contradiction is not deducible by the inference system above. The idea is to consider proofs generated by unit superposition:
  h(x) ≃ x,  f(h(y)) ≃ c   ⊢  f(x) ≃ c
  f(x) ≃ c,  f(a) ≃ c      ⊢  c ≃ c

97 / 1
slide-98
SLIDE 98

Superposition+Instantiation

f(h(y)) ≃ c    h(x) ≃ x    f(a) ≃ c
This set is inconsistent, but the contradiction is not deducible by the inference system above. The idea is to consider proofs generated by unit superposition:
  h(x) ≃ x,  f(h(y)) ≃ c   ⊢  f(x) ≃ c   [x/y]
  f(x) ≃ c,  f(a) ≃ c      ⊢  c ≃ c      [a/x]

98 / 1
slide-99
SLIDE 99

Superposition+Instantiation

f(h(y)) ≃ c    h(x) ≃ x    f(a) ≃ c
This set is inconsistent, but the contradiction is not deducible by the inference system above. The idea is to consider proofs generated by unit superposition:
  h(x) ≃ x,  f(h(y)) ≃ c   ⊢  f(x) ≃ c   [x/y]
  f(x) ≃ c,  f(a) ≃ c      ⊢  c ≃ c      [a/x]

  • Propagating substitutions:

{h(a) ≃ a; f (h(a)) ≃ c; f (a) ≃ c} ground unsatisfiable.

99 / 1

slide-100
SLIDE 100

Superposition+Instantiation

f(h(y)) ≃ c ∨ C1(y, u)    h(x) ≃ x ∨ C2(x, v)    f(a) ≃ c ∨ C3(e)
This set is inconsistent, but the contradiction is not deducible by the inference system above. The idea is to consider proofs generated by unit superposition:
  h(x) ≃ x,  f(h(y)) ≃ c   ⊢  f(x) ≃ c   [x/y]
  f(x) ≃ c,  f(a) ≃ c      ⊢  c ≃ c      [a/x]

  • Propagating substitutions:

{h(a) ≃ a; f (h(a)) ≃ c; f (a) ≃ c} ground unsatisfiable.

100 / 1

slide-101
SLIDE 101

Superposition+Instantiation

f(h(y)) ≃ c ∨ C1(y, u)    h(x) ≃ x ∨ C2(x, v)    f(a) ≃ c ∨ C3(e)
f(h(a)) ≃ c ∨ C1(a, u)    h(a) ≃ a ∨ C2(a, v)    f(a) ≃ c ∨ C3(e)
This set is inconsistent, but the contradiction is not deducible by the inference system above. The idea is to consider proofs generated by unit superposition:
  h(x) ≃ x,  f(h(y)) ≃ c   ⊢  f(x) ≃ c   [x/y]
  f(x) ≃ c,  f(a) ≃ c      ⊢  c ≃ c      [a/x]

  • Propagating substitutions:

{h(a) ≃ a; f (h(a)) ≃ c; f (a) ≃ c} ground unsatisfiable.

101 / 1
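Propagating the substitutions collected along a unit-superposition proof amounts to composing them and applying the result to the original clauses. A small Python sketch; terms are nested tuples and the representation is ours, not iProver's:

```python
def subst(term, sig):
    """Apply substitution sig (dict: variable -> term) to a term.
    Variables are strings; compound terms/constants are tuples."""
    if isinstance(term, str):
        return sig.get(term, term)
    return (term[0],) + tuple(subst(a, sig) for a in term[1:])

def compose(s1, s2):
    """Composition: applying compose(s1, s2) equals applying s1 then s2."""
    out = {v: subst(t, s2) for v, t in s1.items()}
    for v, t in s2.items():
        out.setdefault(v, t)
    return out

# Substitutions from the proof above: first [x/y], then [a/x].
theta = compose({'y': 'x'}, {'x': ('a',)})
print(theta)                              # {'y': ('a',), 'x': ('a',)}
print(subst(('f', ('h', 'y')), theta))    # ('f', ('h', ('a',)))
```

Applying θ to the three input clauses yields exactly the ground instances {h(a) ≃ a; f(h(a)) ≃ c; f(a) ≃ c} shown on the slide.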

slide-102
SLIDE 102

Inst-Gen-Eq instantiation-based equational reasoning

f.-o. clauses S

102 / 1

slide-103
SLIDE 103

Inst-Gen-Eq instantiation-based equational reasoning

f.-o. clauses S → ground clauses S⊥, where ⊥ : x̄ ↦ ⊥

103 / 1
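The ⊥-abstraction S⊥ is easy to sketch: every variable of every clause is mapped to the distinguished constant ⊥. A small Python illustration; the term representation is ours, not iProver's:

```python
BOT = ('bot',)  # the distinguished constant ⊥

def ground(term):
    """The ⊥-mapping from the slide: every variable goes to ⊥.
    Variables are strings; compound terms/constants are tuples."""
    if isinstance(term, str):
        return BOT
    return (term[0],) + tuple(ground(a) for a in term[1:])

def ground_clause(clause):
    """S⊥: ground every (polarity, atom) literal of a clause."""
    return [(pol, ground(atom)) for (pol, atom) in clause]

# ¬P(x) ∨ Q(x) becomes ¬P(⊥) ∨ Q(⊥)
print(ground_clause([(False, ('P', 'x')), (True, ('Q', 'x'))]))
```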

slide-104
SLIDE 104

Inst-Gen-Eq instantiation-based equational reasoning

f.-o. clauses S → ground clauses S⊥, where ⊥ : x̄ ↦ ⊥; if S⊥ is UnSAT: theorem proved.

104 / 1

slide-105
SLIDE 105

Inst-Gen-Eq instantiation-based equational reasoning

f.-o. clauses S → ground clauses S⊥, where ⊥ : x̄ ↦ ⊥; if S⊥ is UnSAT: theorem proved; if S⊥ is SAT: I⊥ ⊨ S⊥, and semantic selection picks literals with I⊥ ⊨ L⊥.

105 / 1

slide-106
SLIDE 106

Inst-Gen-Eq instantiation-based equational reasoning

f.-o. clauses S → ground clauses S⊥, where ⊥ : x̄ ↦ ⊥; if S⊥ is UnSAT: theorem proved; if S⊥ is SAT: I⊥ ⊨ S⊥, and semantic selection picks literals with I⊥ ⊨ L⊥; instances are generated from unit-superposition (UP) proofs from the selected literals L.

106 / 1

slide-107
SLIDE 107

Inst-Gen-Eq instantiation-based equational reasoning

[Diagram: the Inst-Gen-Eq loop]
f.-o. clauses S → ground clauses S⊥, where ⊥ : x̄ ↦ ⊥
◮ if S⊥ is UnSAT: theorem proved
◮ if S⊥ is SAT: I⊥ ⊨ S⊥, and semantic selection picks literals L with I⊥ ⊨ L⊥
◮ instances are generated from unit-superposition (UP) proofs from L
◮ if no new instances can be generated: S is satisfiable

• Theorem. Inst-Gen-Eq is sound and complete.

107 / 1

slide-108
SLIDE 108

Inst-Gen-Eq: Key properties

Inst-Gen-Eq:

◮ combines SMT for ground reasoning and superposition-based unit reasoning
◮ sound and complete for first-order logic with equality
◮ unit superposition does not have the weaknesses of general superposition
◮ all redundancy elimination techniques from Inst-Gen are applicable to Inst-Gen-Eq
◮ redundancy elimination becomes more powerful: we can now use SMT rather than SAT to simplify first-order clauses

108 / 1

slide-109
SLIDE 109

Theory instantiation

slide-110
SLIDE 110

Theory instantiation

f.-o. clauses S theory T

110 / 1

slide-111
SLIDE 111

Theory instantiation

f.-o. clauses S, theory T → ground clauses S⊥, where ⊥ : x̄ ↦ ⊥

111 / 1

slide-112
SLIDE 112

Theory instantiation

f.-o. clauses S, theory T → ground clauses S⊥, where ⊥ : x̄ ↦ ⊥; if S⊥ is T-UnSAT: theorem proved.

112 / 1

slide-113
SLIDE 113

Theory instantiation

f.-o. clauses S, theory T → ground clauses S⊥, where ⊥ : x̄ ↦ ⊥; if S⊥ is T-UnSAT: theorem proved; if S⊥ is T-SAT: I⊥ ⊨T S⊥, and semantic selection picks literals with I⊥ ⊨T L⊥.

113 / 1

slide-114
SLIDE 114

Theory instantiation

f.-o. clauses S, theory T → ground clauses S⊥, where ⊥ : x̄ ↦ ⊥; if S⊥ is T-UnSAT: theorem proved; if S⊥ is T-SAT: I⊥ ⊨T S⊥, and semantic selection picks literals with I⊥ ⊨T L⊥.
Theory instantiation rule:

  L1 ∨ C1, . . . , Ln ∨ Cn
  ──────────────────────────────
  (L1 ∨ C1)θ, . . . , (Ln ∨ Cn)θ

  where L1θ⊥ ∧ . . . ∧ Lnθ⊥ ⊨T ⊥

114 / 1

slide-115
SLIDE 115

Theory instantiation

f.-o. clauses S, theory T → ground clauses S⊥, where ⊥ : x̄ ↦ ⊥; if S⊥ is T-UnSAT: theorem proved; if S⊥ is T-SAT: I⊥ ⊨T S⊥, and semantic selection picks literals with I⊥ ⊨T L⊥.
Theory instantiation rule:

  L1 ∨ C1, . . . , Ln ∨ Cn
  ──────────────────────────────
  (L1 ∨ C1)θ, . . . , (Ln ∨ Cn)θ

  where L1θ⊥ ∧ . . . ∧ Lnθ⊥ ⊨T ⊥

If no such instances can be generated: S is satisfiable.

115 / 1

slide-116
SLIDE 116

Implementation

slide-117
SLIDE 117

iProver general features

iProver an instantiation-based theorem prover for FOL based on Inst-Gen.

◮ Proof search guided by SAT solver
◮ Redundancy elimination: global subsumption, dismatching constraints, predicate elimination, semantic filtering, splitting, . . .
◮ Indexing techniques for inferences and simplifications
◮ Sort inference, non-cyclic sorts
◮ Combination with resolution
◮ Finite model finding based on EPR/sort inference/non-cyclic sorts
◮ Bounded model checking and k-induction
◮ QBF and bit-vectors
◮ Planning
◮ Query answering
◮ Proof representation: non-trivial due to global solver simplifications
◮ Model representation: using definitional extensions

117 / 1

slide-118
SLIDE 118

Inst-Gen Loop

[Diagram: the Inst-Gen loop. Input clauses pass through simplification I into Unprocessed, then through simplification II into Passive (queues); a given clause moves to Active (unification index), where instantiation inferences are performed and their results go back to Unprocessed. Clauses are grounded and fed to the SAT solver: unsat ⇒ Unsatisfiable; sat ⇒ the propositional model drives literal selection, and literal-selection changes move clauses between Active and re-processing. SAT with an empty Passive set ⇒ satisfiable.]

118 / 1

slide-119
SLIDE 119

CASC 2018

EPR (problems solved):             iProver 133, Vampire 128, E 27, LEO-III 17
First-order SAT (problems solved): Vampire 191, iProver 137, CVC4 116, E 38

119 / 1

slide-120
SLIDE 120

Applications and the EPR fragment

slide-121
SLIDE 121

Effectively Propositional Logic (EPR)

EPR: ∃∗∀∗ fragment of first-order logic EPR after Skolemization: No functions except constants P(x, y, d) ∨ ¬Q(c, y, x)

121 / 1

slide-122
SLIDE 122

Effectively Propositional Logic (EPR)

EPR: ∃∗∀∗ fragment of first-order logic EPR after Skolemization: No functions except constants P(x, y, d) ∨ ¬Q(c, y, x) Transitivity: ¬P(x, y) ∨ ¬P(y, z) ∨ P(x, z) Symmetry: P(x, y) ∨ ¬P(y, x) Verification: ∀A(wrenh1 ∧ A = wraddrFunc → ∀B(range[35,0](B) → (imem′(A, B) ↔ iwrite(B)))).

Applications:

◮ Hardware verification: bounded model checking/bit-vectors
◮ Program verification: linked data structures (Sagiv)
◮ Planning/Scheduling
◮ Knowledge representation
◮ Finite model finding

EPR is hard for resolution, but decidable by instantiation methods.

122 / 1

slide-123
SLIDE 123

Hardware verification

Functional Equivalence Checking

◮ The same functional behaviour can be implemented in different ways
◮ Optimised for:
  ◮ Timing – better performance
  ◮ Power – longer battery life
  ◮ Area – smaller chips
◮ Verification: optimisations do not change functional behaviour

Method of choice: Bounded Model Checking (BMC)

Biere, Cimatti, Clarke, Zhu (TACAS’99)

123 / 1

slide-124
SLIDE 124

SAT-based bounded model checking

[Diagram: example circuit with signals a, b, c, d, g]

Symbolic representation:

I = (a0 ↔ ¬c0) ∧ (c0 → b0) ∧ (g0 ↔ a0 ∧ b0) ∧ (d0 ↔ ¬g0 ∧ ¬c0)
T = (a′ ↔ a) ∧ (b′ ↔ b) ∧ (g′ ↔ a′ ∧ b′) ∧ (c′ ↔ d) ∧ (d′ ↔ ¬c′ ∧ ¬g′)
P = (d ↔ ¬g)

124 / 1

slide-125
SLIDE 125

SAT-based bounded model checking (unrolling)

[Diagram: the unrolled circuit: copies (a0, b0, c0, g0, d0), (a1, b1, c1, g1, d1), . . . , (ak, bk, ck, gk, dk), with I0 constraining the first copy and ¬Pk the last.]

The system is unsafe if and only if I0 ∧ T⟨0,1⟩ ∧ . . . ∧ T⟨k−1,k⟩ ∧ ¬Pk is satisfiable for some k.

  • A. Biere, A. Cimatti, E. Clarke, Y. Zhu (TACAS’99)

125 / 1

slide-126
SLIDE 126

EPR-based BMC

EPR encoding:

◮ EPR formulas Finit(S), Ftarget(S), Fnext(S, S′)
◮ encoding predicates init(S), target(S), next(S, S′)

Transition system:
  ∀S [init(S) → Finit(S)]              (1)
  ∀S, S′ [next(S, S′) → Fnext(S, S′)]  (2)
  ∀S [target(S) ↔ Ftarget(S)]          (3)
BMC: init(s0) ∧ next(s0, s1) ∧ . . . ∧ next(sn−1, sn) ∧ ¬target(sn)

126 / 1

slide-127
SLIDE 127

EPR-based BMC

EPR encoding:

◮ EPR formulas Finit(S), Ftarget(S), Fnext(S, S′)
◮ encoding predicates init(S), target(S), next(S, S′)

Transition system:
  ∀S [init(S) → Finit(S)]              (1)
  ∀S, S′ [next(S, S′) → Fnext(S, S′)]  (2)
  ∀S [target(S) ↔ Ftarget(S)]          (3)
BMC: init(s0) ∧ next(s0, s1) ∧ . . . ∧ next(sn−1, sn) ∧ ¬target(sn)

◮ EPR encoding provides a succinct representation
◮ avoids copying the transition relation
◮ reasoning can be done at a higher level
◮ major challenge: hardware designs are very large and complex

127 / 1
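The BMC query of this encoding can be generated mechanically. A hypothetical Python sketch that emits the bound-n query as a string, following the slide's convention that ¬target marks the states to search for:

```python
def bmc_query(n):
    """EPR-style BMC query for bound n, as on the slide:
    init(s0) ∧ next(s0,s1) ∧ ... ∧ next(s_{n-1},s_n) ∧ ¬target(s_n)."""
    parts = ['init(s0)']
    parts += [f'next(s{i}, s{i + 1})' for i in range(n)]
    parts.append(f'~target(s{n})')
    return ' & '.join(parts)

print(bmc_query(2))
# init(s0) & next(s0, s1) & next(s1, s2) & ~target(s2)
```

Note how the query grows by one `next` literal per step, instead of by a full copy of the transition relation as in the SAT encoding.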

slide-128
SLIDE 128

Word level

[Diagram: word-level memory circuit: inputs wraddr[5:0], rdaddr[5:0], wrdata[63:0]; control signals rden, wren, clock, sel; a memory of 64-bit cachelines feeding a mux; 64-bit outputs.]

∀S, S’(next(S, S’) → // write is enabled ∀y(Assocwraddr(S’, y) → ∀A(clock(S’) ∧ wren(S’) ∧ A = y → ∀B(range[0,63](B) → (mem(S’, A, B) ↔ wrdata(S, B)))))).

BMC with memories and bit-vectors first-order predicates: mem(S, A, B), wrdata(S, B).

  • M. Emmer, Z. Khasidashvili, K. Korovin, C. Sticksel, A. Voronkov IJCAR’12

128 / 1

slide-129
SLIDE 129

Properties of EPR

Direct reduction to SAT — exponential blow-up. Satisfiability for EPR is NEXPTIME-complete. More succinct but harder to solve.... Any gain?

129 / 1

slide-130
SLIDE 130

Properties of EPR

Direct reduction to SAT — exponential blow-up. Satisfiability for EPR is NEXPTIME-complete. More succinct but harder to solve. . . Any gain?
Yes: reasoning can be done at a more general level.
Restricting instances:
  ¬mem(a1, x1) ∨ ¬mem(a2, x2) ∨ . . . ∨ ¬mem(an, xn)
  mem(b1, x1) ∨ mem(b2, x2) ∨ . . . ∨ mem(bn, xn)

130 / 1

slide-131
SLIDE 131

Properties of EPR

Direct reduction to SAT — exponential blow-up. Satisfiability for EPR is NEXPTIME-complete. More succinct but harder to solve. . . Any gain?
Yes: reasoning can be done at a more general level.
Restricting instances:
  ¬mem(a1, x1) ∨ ¬mem(a2, x2) ∨ . . . ∨ ¬mem(an, xn)
  mem(b1, x1) ∨ mem(b2, x2) ∨ . . . ∨ mem(bn, xn)
General lemmas:
  ¬bv1(x) ∨ bv2(x)
  ¬bv2(x) ∨ mem(x, y)
  bv1(x) ∨ mem(x, y)

131 / 1

slide-132
SLIDE 132

Properties of EPR

Direct reduction to SAT — exponential blow-up. Satisfiability for EPR is NEXPTIME-complete. More succinct but harder to solve. . . Any gain?
Yes: reasoning can be done at a more general level.
Restricting instances:
  ¬mem(a1, x1) ∨ ¬mem(a2, x2) ∨ . . . ∨ ¬mem(an, xn)
  mem(b1, x1) ∨ mem(b2, x2) ∨ . . . ∨ mem(bn, xn)
General lemmas:
  ¬bv1(x) ∨ bv2(x)   [¬bv2(x) ∨ mem(x, y) struck out]   [bv1(x) ∨ mem(x, y) struck out]
  ⇒ mem(x, y)

132 / 1


slide-134
SLIDE 134

Properties of EPR

Direct reduction to SAT — exponential blow-up. Satisfiability for EPR is NEXPTIME-complete. More succinct but harder to solve. . . Any gain?
Yes: reasoning can be done at a more general level.
Restricting instances:
  ¬mem(a1, x1) ∨ ¬mem(a2, x2) ∨ . . . ∨ ¬mem(an, xn)
  mem(b1, x1) ∨ mem(b2, x2) ∨ . . . ∨ mem(bn, xn)
General lemmas:
  ¬bv1(x) ∨ bv2(x)   [¬bv2(x) ∨ mem(x, y) struck out]   [bv1(x) ∨ mem(x, y) struck out]
  ⇒ mem(x, y)
Quantified invariants: ∀s∀x [cond(s, x) → prop(s, x)]

134 / 1

slide-135
SLIDE 135

Properties of EPR

Direct reduction to SAT — exponential blow-up. Satisfiability for EPR is NEXPTIME-complete. More succinct but harder to solve. . . Any gain?
Yes: reasoning can be done at a more general level.
Restricting instances:
  ¬mem(a1, x1) ∨ ¬mem(a2, x2) ∨ . . . ∨ ¬mem(an, xn)
  mem(b1, x1) ∨ mem(b2, x2) ∨ . . . ∨ mem(bn, xn)
General lemmas:
  ¬bv1(x) ∨ bv2(x)   [¬bv2(x) ∨ mem(x, y) struck out]   [bv1(x) ∨ mem(x, y) struck out]
  ⇒ mem(x, y)
Quantified invariants: ∀s∀x [cond(s, x) → prop(s, x)]
Using more expressive logics can speed up reasoning!

135 / 1

slide-136
SLIDE 136

Experiments: iProver vs Intel BMC

Problem  #Memories         #Transient BVs      Intel BMC  iProver BMC
ROB2     2  (4704 bits)    255  (3479 bits)    50         8
DCC2     4  (8960 bits)    426  (1844 bits)    8          11
DCC1     4  (8960 bits)    1827 (5294 bits)    7          8
DCI1     32 (9216 bits)    3625 (6496 bits)    6          4
BPB2     4  (10240 bits)   550  (4955 bits)    50         11
SCD2     2  (16384 bits)   80   (756 bits)     4          14
SCD1     2  (16384 bits)   556  (1923 bits)    4          12
PMS1     8  (46080 bits)   1486 (6109 bits)    2          10

Large memories: iProver performs well compared to the highly optimised Intel SAT-based model checker.

136 / 1

slide-137
SLIDE 137

From bounded to unbounded model checking EPR-based k-induction

slide-138
SLIDE 138

EPR-based k-induction

Base case:
  init(s0) ∧ target(s0) ∧ next(s0, s1) ∧ . . . ∧ next(sk−1, sk) ∧ ¬target(sk)
Bad states are not reachable in ≤ k steps.
Induction case:
  target(s0) ∧ next(s0, s1) ∧ . . . ∧ target(sk) ∧ next(sk, sk+1) ∧ ¬target(sk+1)
Assume that bad states are not reachable in ≤ k steps; then bad states are not reachable in k + 1 steps.

138 / 1

slide-139
SLIDE 139

EPR-based k-induction

Base case:
  init(s0) ∧ target(s0) ∧ next(s0, s1) ∧ . . . ∧ next(sk−1, sk) ∧ ¬target(sk)
Bad states are not reachable in ≤ k steps.
Induction case:
  target(s0) ∧ next(s0, s1) ∧ . . . ∧ target(sk) ∧ next(sk, sk+1) ∧ ¬target(sk+1)
Assume that bad states are not reachable in ≤ k steps; then bad states are not reachable in k + 1 steps.
Visited states are non-equivalent:
  ∀S, S′ (S ≢p S′ ↔ ∃x̄ [p(S, x̄) ↔ ¬p(S′, x̄)])
  ∀S, S′ (S ≡Σ S′ ↔ ∧p∈Σ S ≡p S′)
  ∧0≤i<j≤k si ≢Σ sj

• Z. Khasidashvili, K. Korovin, D. Tsarkov (EPR k-induction)

139 / 1

slide-140
SLIDE 140

QBF to EPR

slide-141
SLIDE 141

QBF to EPR

QBF: ∀x1∃y1∀x2∃y2 [(x1 ∨ y1 ∨ ¬y2) ∧ . . .]
First-order: Domain {1, 0}; p(1); ¬p(0)
  ∀x1∃y1∀x2∃y2 [(p(x1) ∨ p(y1) ∨ ¬p(y2)) ∧ . . .]
Skolemize:
  ∀x1∀x2 [(p(x1) ∨ p(sk1(x1)) ∨ ¬p(sk2(x1, x2))) ∧ . . .]
EPR: replace Skolem functions with predicates:
  ∀x1∀x2 [(p(x1) ∨ psk1(x1) ∨ ¬psk2(x1, x2)) ∧ . . .]

  • M. Seidl, F. Lonsing, A. Biere (PAAR’12)

141 / 1
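The translation on this slide can be sketched in a few lines. A Python illustration; the prefix/clause encodings are our own, the recipe follows Seidl, Lonsing, Biere:

```python
def qbf_to_epr(prefix, clause):
    """Translate one QBF clause to an EPR clause: a universal x
    becomes p(x); an existential y becomes a Skolem predicate
    psk_y over the universals quantified before it.  `prefix` is a
    list like [('A','x1'), ('E','y1'), ...]; `clause` is a list of
    (variable, polarity) pairs."""
    deps, universals = {}, []
    for q, v in prefix:
        if q == 'A':
            universals.append(v)
        else:
            deps[v] = tuple(universals)
    lits = []
    for v, pos in clause:
        atom = f"p({v})" if v not in deps else f"psk_{v}({', '.join(deps[v])})"
        lits.append(atom if pos else '~' + atom)
    return ' | '.join(lits)

prefix = [('A', 'x1'), ('E', 'y1'), ('A', 'x2'), ('E', 'y2')]
print(qbf_to_epr(prefix, [('x1', True), ('y1', True), ('y2', False)]))
# p(x1) | psk_y1(x1) | ~psk_y2(x1, x2)
```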

slide-142
SLIDE 142

BV with log-encoded width to EPR

142 / 1

slide-143
SLIDE 143

BV with log-encoded width to EPR

Encode bit indexes in binary using n bits. E.g. ¬bv(0, . . . , 0, 1, 0, 0, 0, 0, 1), with n index arguments, represents value 0 at index 65. Succinct encodings of bit-vector operations avoid bit-blasting: bv_and, bv_or, bv_shl, bv_shr, bv_mult, bv_add, . . .

• G. Kovásznai, A. Fröhlich, and A. Biere (CADE’13)

143 / 1

slide-144
SLIDE 144

What’s next ? Abstraction refinement reasoning

slide-145
SLIDE 145

Large theories in TPTP

TPTP large theories benchmarks:

◮ Mizar – formalising mathematics
◮ Isabelle, HOL 4, HOL Light – translations of higher-order problems from different domains into FOL
◮ CakeML – verification
◮ Cyc/SUMO – large first-order ontologies

Many of these benchmarks contain hundreds of thousands of axioms.

145 / 1


slide-147
SLIDE 147

Large theories in TPTP

TPTP large theories benchmarks:

◮ Mizar – formalising mathematics
◮ Isabelle, HOL 4, HOL Light – translations of higher-order problems from different domains into FOL
◮ CakeML – verification
◮ Cyc/SUMO – large first-order ontologies

Many of these benchmarks contain hundreds of thousands of axioms.
Observation: a large number of axioms is only one indication of complexity.

147 / 1

slide-148
SLIDE 148

QBF benchmarks

148 / 1

slide-149
SLIDE 149

HOL benchmarks

149 / 1

slide-150
SLIDE 150

Reasoning with large theories: axiom selection

Previous approaches: select “relevant axioms”

◮ Semantic or syntactic structure: SRASS, SInE
◮ Machine learning: MaLARea
◮ Two phases: axiom selection, then reasoning

[Diagram: axiom selection phase → selected axioms → reasoning phase]

150 / 1

slide-151
SLIDE 151

Reasoning with large theories: axiom selection


Observation: a large number of axioms is only one source of complexity. We also have: large numbers of arguments, large signatures, long/deep clauses, etc.

151 / 1

slide-152
SLIDE 152

Abstraction-refinement approach – L. Hernandez, K. Korovin (IJCAR’18)

◮ Abstraction-Refinement
  ◮ Interleaving abstraction and reasoning phases
  ◮ The abstraction is easier to solve
  ◮ If there is no solution, the abstraction is refined

152 / 1

slide-153
SLIDE 153

Abstraction-refinement approach – L. Hernandez, K. Korovin (IJCAR’18)

◮ Abstraction-Refinement
  ◮ Interleaving abstraction and reasoning phases
  ◮ The abstraction is easier to solve
  ◮ If there is no solution, the abstraction is refined
◮ Over-Approximation
  ◮ If A ⊨ ⊥ then α(A) ⊨ ⊥

153 / 1

slide-154
SLIDE 154

Abstraction-refinement approach – L. Hernandez, K. Korovin (IJCAR’18)

◮ Abstraction-Refinement
  ◮ Interleaving abstraction and reasoning phases
  ◮ The abstraction is easier to solve
  ◮ If there is no solution, the abstraction is refined
◮ Over-Approximation
  ◮ If A ⊨ ⊥ then α(A) ⊨ ⊥
◮ Under-Approximation
  ◮ If α(A) ⊨ ⊥ then A ⊨ ⊥

154 / 1

slide-155
SLIDE 155

Abstraction-refinement approach – L. Hernandez, K. Korovin (IJCAR’18)

◮ Abstraction-Refinement
  ◮ Interleaving abstraction and reasoning phases
  ◮ The abstraction is easier to solve
  ◮ If there is no solution, the abstraction is refined
◮ Over-Approximation
  ◮ If A ⊨ ⊥ then α(A) ⊨ ⊥
◮ Under-Approximation
  ◮ If α(A) ⊨ ⊥ then A ⊨ ⊥
◮ Combination of approximations
  ◮ Converge rapidly to a solution if it exists

155 / 1

slide-156
SLIDE 156

Abstraction-Refinement in ATPs

◮ . . .
◮ Inst-Gen: Ganzinger, Korovin
◮ SPASS: targeted decidable fragments – Teucke, Weidenbach
◮ Speculative inferences: Bonacina, Lynch, de Moura
◮ SMT: conflict and model-based instantiation – de Moura, Ge; Reynolds, Tinelli, . . .
◮ AVATAR: a new architecture for first-order theorem provers – Voronkov; Reger, Suda, . . .

156 / 1

slide-157
SLIDE 157

Over-Approximating Abstractions

Over-approximation abstractions:

◮ Subsumption abstraction ◮ Generalisation abstraction ◮ Argument filtering abstraction ◮ Signature grouping abstraction

157 / 1

slide-158
SLIDE 158

Over-Approximation Procedure

[Diagram: over-approximation procedure. Concrete axioms A are abstracted by αs into abstract axioms Âs; an ATP checks Âs against the conjecture C. SAT ⇒ disproved. UNSAT ⇒ take the unsatisfiable core Âs_uc, retrieve the corresponding concrete axioms γs(Âs_uc), and check them: UNSAT ⇒ proved; SAT ⇒ refine the abstraction to α′s(A) and repeat.]

158 / 1

slide-159
SLIDE 159

Over-Approximation Procedure


Subsumption-Based Abstraction
◮ Partition clauses based on joint literals.
◮ An abstract clause represents each partition and subsumes all clauses in it.
  { ℓ1 ∨ ℓ2 ∨ ℓ3,  ℓ1 ∨ ℓ3 ∨ ℓ4,  ℓ1 ∨ ℓ6 ∨ ℓ4 }  ↦  ℓ1
  { ℓ2 ∨ ℓ7 ∨ ℓ6,  ℓ2 ∨ ℓ8 ∨ ℓ5 }  ↦  ℓ2
Subsumption-Based Refinement
◮ Sub-partition the previous collections based on a new joint literal.
  { ℓ1 ∨ ℓ2 ∨ ℓ3,  ℓ1 ∨ ℓ3 ∨ ℓ4 }  ↦  ℓ1 ∨ ℓ3
  { ℓ1 ∨ ℓ6 ∨ ℓ4 }  ↦  ℓ1 ∨ ℓ4

159 / 1

slide-160
SLIDE 160

Over-Approximation Procedure


Argument Filtering Abstraction
◮ Remove certain arguments of signature symbols.
  P(x, f(x, g(y))) ∨ ¬P(c, x)  ↦  P̄ ∨ ¬P̄
  ¬P(g(f(x, y)), g(y))         ↦  ¬P̄
  P(c, x)                      ↦  P̄
Argument Filtering Refinement
◮ Restore arguments of abstract symbols.
  P̄ ∨ ¬P̄  ↦  P(x, f̄) ∨ ¬P(c, x)
  ¬P̄      ↦  ¬P(ḡ, ḡ)
  P̄       ↦  P(c, x)

160 / 1

slide-161
SLIDE 161

Over-Approximation Procedure


Signature Grouping Abstraction
◮ Abstract the signature by grouping symbols of the same type.
  R(x, y) ∨ Q(x),  ¬S(c, c) ∨ Q(y),  ¬R(c, c) ∨ P(y),  ¬Q(c),  ¬P(c)
  ↦  T1(x, y) ∨ T2(x),  ¬T1(c, c) ∨ T2(y),  ¬T2(c)
Signature Grouping Refinement
◮ Concretise abstract symbols.
  T1(x, y) ∨ T2(x),  ¬T1(c, c) ∨ T2(y),  ¬T2(c)
  ↦  R(x, y) ∨ T2(x),  ¬S(c, c) ∨ T2(y),  ¬R(c, c) ∨ T2(y),  ¬T2(c)

161 / 1

slide-162
SLIDE 162

Generalisation abstraction

◮ Strengthening abstraction function αs.

◮ Partition axioms A = ∪i Ai; abstract axiom: αs(Ai) ⊨ Ai

¬Q(x, a)   Negated conjecture

S(f(x))    S(h(x, y))
Q(z, x) ∨ R(x) ∨ P(x, z)    Q(f(x), a) ∨ R(g(x))
¬P(x, h(y, a)) ∨ R(y)    ¬P(f(x), g(z)) ∨ R(h(a, z))
¬R(f(y))    ¬R(h(f(x), g(y)))

slide-163
SLIDE 163

Generalisation abstraction

◮ Strengthening abstraction function αs.
◮ Partition axioms A = ∪i Ai; abstract axiom: αs(Ai) ⊨ Ai

¬Q(x, a)   Negated conjecture

S(f(x))    S(h(x, y))                                   ↦  S(x0)
Q(z, x) ∨ R(x) ∨ P(x, z)    Q(f(x), a) ∨ R(g(x))        ↦  Q(x0, x1) ∨ R(x2)
¬P(x, h(y, a)) ∨ R(y)    ¬P(f(x), g(z)) ∨ R(h(a, z))    ↦  ¬P(x0, x1) ∨ R(x2)
¬R(f(y))    ¬R(h(f(x), g(y)))                           ↦  ¬R(x0)

163 / 1

slide-164
SLIDE 164

Generalisation abstraction

◮ Strengthening abstraction function αs.

◮ Partition axioms A = ∪i Ai; abstract axiom: αs(Ai) ⊨ Ai

¬Q(x, a)   Negated conjecture
S(x0)    Q(x0, x1) ∨ R(x2)    ¬P(x0, x1) ∨ R(x2)    ¬R(x0)

164 / 1

slide-165
SLIDE 165

Generalisation abstraction refinement

◮ Weakening abstraction refinement.

◮ Sub-partition groups of concrete axioms involved in an abstract proof.

¬Q(x, a)   Negated conjecture
S(x0)    ¬P(x0, x1) ∨ R(x2)
Q(z, x) ∨ R(x) ∨ P(x, z)    Q(x0, x1) ∨ R(g(x))    ¬R(f(y))    ¬R(h(f(x), g(y)))

165 / 1

slide-166
SLIDE 166

Generalisation abstraction for termination

Consider the following set of clauses:
  S = { p(g(x), g(x)) ∨ q(f(g(x))),  g(f(f(x))) ≃ g(f(x)) }
A generalisation abstraction of S:
  α(S) = { p(x, x) ∨ q(f(x)),  g(f(x)) ≃ g(x) }
Superposition is not applicable after the generalisation abstraction, and therefore S is satisfiable.

166 / 1

slide-167
SLIDE 167

Over-approximation

Over-approximation abstractions:

◮ Subsumption abstraction ◮ Generalisation abstraction ◮ Argument filtering abstraction ◮ Signature grouping abstraction

Combinations of these abstractions

Combinations of these abstractions:
◮ --abstr ref [sig;subs;arg filter]
◮ abstractions can enable further abstractions: e.g., argument filtering can enable signature grouping, which can enable subsumption
Targeted abstractions:
◮ abstractions can target fragments, e.g., EPR
◮ block superposition inferences

167 / 1

slide-168
SLIDE 168

Under-Approximation

◮ Weakening abstraction function
  ◮ Removing irrelevant axioms using methods like SInE or MaLARea
  ◮ Using ground instances of concrete axioms
◮ Strengthening abstraction refinement
  ◮ Turning a model I into a countermodel
  ◮ Add concrete axioms
  ◮ Generate and add ground instances of axioms

168 / 1

slide-169
SLIDE 169

Under-Approximation

[Diagram: under-approximation procedure. Concrete axioms A are abstracted by αw into abstract axioms Âw; an ATP checks Âw against the conjecture C. UNSAT ⇒ proved. SAT with a model I ⊨ Âw ∧ ¬C ⇒ refine: find a set Ă of concrete axioms falsified by I; if Ă ≠ ∅, set Âw := Âw ∪ Ă and repeat; if Ă = ∅, disproved.]

slide-170
SLIDE 170

Under-Approximation


Weakening Abstraction Function
◮ Using ground instances of concrete axioms (instantiation abstraction).
◮ Removing irrelevant axioms (deletion abstraction).

slide-171
SLIDE 171

Under-Approximation


Strengthening Abstraction Refinement
◮ Generate and add ground instances of axioms
◮ Add concrete axioms

171 / 1


slide-173
SLIDE 173

Combined Approximations

[Diagram: the combined procedure: the under-approximation loop as above, with the over-approximation procedure embedded as the inner ATP (ATPS).]

Shared abstractions.

173 / 1

slide-174
SLIDE 174

Implementation & Experiments

◮ Abstraction-refinement implemented in iProver v2.8
◮ Strategies: combinations of atomic abstractions, e.g. --abstr ref [subs;arg filter;sig]
◮ SInE as an under-approximating abstraction

174 / 1

slide-175
SLIDE 175

The Most Effective Strategies

Table: SC = Skolem and constant, SS = Skolem and split symb.

Depth  Tolerance  Abstractions        Signature  Until SAT  Solutions
1      1.0        sig, subs, arg-fil  SS         true       1001
1      2.0        subs, sig, arg-fil  SC         false      42
2      1.0        subs, sig, arg-fil  SC         false      23
1      4.0        arg-fil, sig, subs  SS         true       5
1      1.0        subs, sig, arg-fil  SC SS      false      4
1      1.0        subs, sig, arg-fil             false      2
2      1.0        sig                 SC         false      2
1      8.0        subs, sig, arg-fil             false      2
1      1.0        arg-fil, subs, sig  SS         false      2
2      1.0        arg-fil, sig, subs  SS         true       2
2      1.0        arg-fil                        false      1
2      1.0        subs, sig                      false      1
Total                                                       1087

175 / 1

slide-176
SLIDE 176

CASC-26

Table: CASC-26 LTB comparison (out of 1500 problems)

       Vampire 4.0  Vampire 4.2  MaLARea  iProver 2.8  iProver 2.6  E
LTB    1156         1144         1131     1087         777          683

176 / 1

slide-177
SLIDE 177

Abstraction-refinement current work

◮ Abstractions targeted for specific theories
◮ Goal-directed abstractions
◮ Reuse of abstractions
◮ Different combination schemes / machine learning

177 / 1

slide-178
SLIDE 178

Conclusions

Instantiation-based theorem proving for first-order logic:

◮ Modular combination of SAT/SMT and first-order reasoning
◮ Combination of proof search and model search
◮ Abstraction-refinement for large/complex problems

Further directions:

◮ The quest of combining first-order and theories: highly undecidable
◮ Combination with SMT approaches to quantifier instantiation
◮ Abstraction-refinement as a generalisation of instantiation-based reasoning?

178 / 1

slide-179
SLIDE 179

Extra: efficient datastructures and indexes

179 / 1

slide-180
SLIDE 180

Indexing

Why indexing:

◮ A single subsumption check is NP-hard.
◮ We can have 100,000 clauses in our search space.
◮ Applied naively between all pairs of clauses, we would need 10,000,000,000 subsumption checks!

180 / 1

slide-181
SLIDE 181

Indexing

Why indexing:

◮ A single subsumption check is NP-hard.
◮ We can have 100,000 clauses in our search space.
◮ Applied naively between all pairs of clauses, we would need 10,000,000,000 subsumption checks!

Indexes in iProver:

◮ non-perfect discrimination trees for unification and matching
◮ compressed feature-vector indexes for subsumption, subsumption resolution, dismatching constraints

181 / 1

slide-182
SLIDE 182

Unification: Discrimination trees

[Discrimination tree: from the root ε, edges are labelled with symbols, with all variables collapsed to ∗. E.g. the path f · g · ∗ · a stores f(g(x), a); the path f · g · ∗ · h · ∗ stores f(g(x), h(x)) and f(g(y), h(x)); the path g · a stores g(a).]
Efficient filtering of unification, matching and generalisation candidates.

182 / 1
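A non-perfect discrimination tree can be sketched as a trie over the preorder symbol string of a term, with every variable collapsed to the single symbol ∗. A minimal Python illustration; retrieval here only matches skeletons exactly, whereas real retrieval also branches on ∗ for unification and matching:

```python
def flatten(term):
    """Preorder symbol string of a term; every variable becomes '*'
    (non-perfect discrimination: variables are not distinguished).
    Variables are strings; compound terms/constants are tuples."""
    if isinstance(term, str):
        return ['*']
    out = [term[0]]
    for arg in term[1:]:
        out += flatten(arg)
    return out

class DTree:
    def __init__(self):
        self.children, self.terms = {}, []

    def insert(self, term):
        node = self
        for sym in flatten(term):
            node = node.children.setdefault(sym, DTree())
        node.terms.append(term)

    def retrieve(self, query):
        """Terms stored under exactly the query's skeleton."""
        node = self
        for sym in flatten(query):
            if sym not in node.children:
                return []
            node = node.children[sym]
        return node.terms

index = DTree()
for t in [('f', ('g', 'x'), ('a',)),
          ('f', ('g', 'x'), ('h', 'x')),
          ('f', ('g', 'y'), ('h', 'x')),
          ('g', ('a',))]:
    index.insert(t)
print(index.retrieve(('f', ('g', 'z'), ('h', 'w'))))
# [('f', ('g', 'x'), ('h', 'x')), ('f', ('g', 'y'), ('h', 'x'))]
```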

slide-183
SLIDE 183

Subsumption: Feature vector index

Subsumption is very expensive and the usual indexing techniques are complicated. The feature-vector index [Schulz] works well for subsumption and many other operations.
Design efficient filters based on “features of clauses”:

◮ clause C cannot subsume any clause whose number of literals is strictly smaller than C’s

183 / 1

slide-184
SLIDE 184

Subsumption: Feature vector index

Subsumption is very expensive and the usual indexing techniques are complicated. The feature-vector index [Schulz] works well for subsumption and many other operations.
Design efficient filters based on “features of clauses”:

◮ clause C cannot subsume any clause whose number of literals is strictly smaller than C’s
◮ clause C cannot subsume any clause whose number of positive literals is strictly smaller than C’s

184 / 1

slide-185
SLIDE 185

Subsumption: Feature vector index

Subsumption is very expensive and the usual indexing techniques are complicated. The feature-vector index [Schulz] works well for subsumption and many other operations.
Design efficient filters based on “features of clauses”:

◮ clause C cannot subsume any clause whose number of literals is strictly smaller than C’s
◮ clause C cannot subsume any clause whose number of positive literals is strictly smaller than C’s
◮ clause C cannot subsume any clause in which some symbol f occurs fewer times than in C

185 / 1

slide-186
SLIDE 186

Subsumption: Feature vector index

Subsumption is very expensive and the usual indexing techniques are complicated. The feature-vector index [Schulz] works well for subsumption and many other operations.
Design efficient filters based on “features of clauses”:

◮ clause C cannot subsume any clause whose number of literals is strictly smaller than C’s
◮ clause C cannot subsume any clause whose number of positive literals is strictly smaller than C’s
◮ clause C cannot subsume any clause in which some symbol f occurs fewer times than in C
◮ . . .

186 / 1


slide-189
SLIDE 189

Feature vector index

Fix: a list of features:

1. number of literals
2. number of occurrences of f
3. number of occurrences of g

With each clause associate a feature vector: a numeric vector of feature values.
Example: the feature vector of C = p(f(f(x))) ∨ ¬p(g(y)) is fv(C) = [2, 2, 1].
Arrange feature vectors in a trie data structure, similar to a discrimination tree.

189 / 1

slide-190
SLIDE 190

Feature vector index

Fix: a list of features:

1. number of literals
2. number of occurrences of f
3. number of occurrences of g

With each clause associate a feature vector: a numeric vector of feature values.
Example: the feature vector of C = p(f(f(x))) ∨ ¬p(g(y)) is fv(C) = [2, 2, 1].
Arrange feature vectors in a trie data structure, similar to a discrimination tree.
For retrieving all candidates which can be subsumed by C we only need to traverse vectors that are component-wise greater than or equal to fv(C).

190 / 1
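The feature vectors and the component-wise filter are easy to sketch directly. A Python illustration using the slide's features (number of literals, occurrences of f, occurrences of g); the term encoding is our own:

```python
def count(term, sym):
    """Number of occurrences of symbol `sym` in a term
    (variables are strings, compound terms are tuples)."""
    if isinstance(term, str):
        return 0
    return (term[0] == sym) + sum(count(a, sym) for a in term[1:])

def fv(clause, symbols):
    """Feature vector: [#literals, #occurrences of each symbol].
    A literal is a (polarity, atom) pair."""
    return [len(clause)] + [sum(count(atom, s) for (_, atom) in clause)
                            for s in symbols]

def may_subsume(fv_c, fv_d):
    """Necessary condition: C can subsume D only if fv(C) <= fv(D)
    component-wise; everything failing this test is filtered out."""
    return all(a <= b for a, b in zip(fv_c, fv_d))

# C = p(f(f(x))) ∨ ¬p(g(y)) from the slide
C = [(True, ('p', ('f', ('f', 'x')))), (False, ('p', ('g', 'y')))]
print(fv(C, ['f', 'g']))  # [2, 2, 1]
```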

slide-191
SLIDE 191

Compressed feature vector index [iProver]

The signature-based features are the most useful but also expensive. Example: if a signature contains 1000 symbols and we use all symbols as features, then the feature vector of every clause will have length 1000.

191 / 1

slide-192
SLIDE 192

Compressed feature vector index [iProver]

The signature-based features are the most useful but also expensive. Example: if a signature contains 1000 symbols and we use all symbols as features, then the feature vector of every clause will have length 1000. Basic idea: for each clause, most features will be 0.

192 / 1

slide-193
SLIDE 193

Compressed feature vector index [iProver]

The signature-based features are the most useful but also expensive. Example: if a signature contains 1000 symbols and we use all symbols as features, then the feature vector of every clause will have length 1000. Basic idea: for each clause, most features will be 0. Compress the feature vector: use a list of pairs [(p1, v1), . . . , (pn, vn)] where the pi are non-zero positions and the vi are the values starting at these positions. Sequential positions with the same value are combined. iProver uses the compressed feature-vector index for forward and backward subsumption, subsumption resolution and dismatching constraints.

193 / 1
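The compression described above can be sketched in a few lines: zero entries are dropped, and runs of consecutive equal values are merged into a single (position, value) pair. A minimal Python sketch:

```python
def compress(vec):
    """Compress a sparse feature vector into (position, value) pairs:
    zeros are dropped; runs of consecutive positions with the same
    value are merged into one pair starting the run."""
    out, run_end = [], -2
    for i, v in enumerate(vec):
        if v == 0:
            continue
        if out and i == run_end + 1 and out[-1][1] == v:
            run_end = i                # extend the current run
        else:
            out.append((i, v))         # start a new run
            run_end = i
    return out

print(compress([0, 0, 3, 3, 3, 0, 1]))  # [(2, 3), (6, 1)]
```

For clauses over a large signature most entries are 0, so the compressed form is far shorter than the full 1000-entry vector.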