Deduction and Induction: A Match Made in Heaven or a Deal with the Devil? - PowerPoint PPT Presentation



slide-1
SLIDE 1

Deduction and Induction

A Match Made in Heaven

Stephan Schulz

The Inference Engine Machine Learning

slide-2
SLIDE 2

Deduction and Induction

A Match Made in Heaven

Stephan Schulz

The Inference Engine Machine Learning

or a Deal with the Devil?
slide-3
SLIDE 3

Agenda

◮ Search and choice points in saturating theorem proving
◮ Basic questions about learning
◮ Learning from performance data

◮ Classification and heuristic selection
◮ Parameters for clause selection

◮ Learning from proofs and search graphs

◮ Proof extraction
◮ Learning clause evaluations (?)

◮ Conclusion

2

slide-4
SLIDE 4

Theorem Proving: Big Picture

∀X : human(X) → mortal(X)
∀X : philosopher(X) → human(X)
philosopher(socrates)
?⊨ mortal(socrates)

Real World Problem Formalized Problem

ATP

Proof Search

Proof Countermodel Timeout


3

slide-5
SLIDE 5

Contradiction and Saturation

◮ Proof by contradiction

◮ Assume negation of conjecture
◮ Show that axioms and negated conjecture imply falsity

◮ Saturation

◮ Convert problem to Clause Normal Form
◮ Systematically enumerate logical consequences of axioms and negated conjecture
◮ Goal: Explicit contradiction (empty clause)

◮ Redundancy elimination

◮ Use contracting inferences to simplify or eliminate some clauses

Formula set Equi- satisfiable clause set

Clausifier

4

slide-6
SLIDE 6

Contradiction and Saturation

◮ Proof by contradiction

◮ Assume negation of conjecture
◮ Show that axioms and negated conjecture imply falsity

◮ Saturation

◮ Convert problem to Clause Normal Form
◮ Systematically enumerate logical consequences of axioms and negated conjecture
◮ Goal: Explicit contradiction (empty clause)

◮ Redundancy elimination

◮ Use contracting inferences to simplify or eliminate some clauses

Search control problem: How and in which order do we enumerate consequences?

Formula set Equi- satisfiable clause set

Clausifier

4

slide-7
SLIDE 7

Proof Search and Choice Points

◮ First-order logic is semi-decidable

◮ Provers search for proof in infinite space
◮ . . . of possible derivations
◮ . . . of possible consequences

◮ Major choice points of Superposition calculus:

◮ Term ordering (which terms are bigger)
◮ (Negative) literal selection
◮ Selection of clauses for inferences (with the given clause algorithm)

5

slide-8
SLIDE 8

Term Ordering and Literal Selection

◮ Negative Superposition with selection

C ∨ s ≃ t    D ∨ u ≄ v
──────────────────────
(C ∨ D ∨ u[p←t] ≄ v)σ

◮ if σ = mgu(u|p, s)

◮ and (s ≃ t)σ is ≻-maximal in (C ∨ s ≃ t)σ
◮ and s is ≻-maximal in (s ≃ t)σ
◮ and u ≄ v is selected in D ∨ u ≄ v
◮ and u is ≻-maximal in (u ≄ v)σ

◮ Choice points:

◮ ≻ is a ground-total rewrite ordering

◮ Consistent throughout the proof search
◮ I.e. in practice determined up-front

◮ Any negative literal can be selected

◮ Current practice: Fixed scheme picked up-front

6
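The side condition σ = mgu(u|p, s) in the superposition rule relies on first-order unification. The following is a textbook sketch, assuming terms are represented as nested tuples like ("f", "X") and variables are capitalized strings; it is illustrative only, not E's shared-term implementation:

```python
def mgu(s, t, subst=None):
    """Most general unifier of two terms, or None if none exists.
    Terms: nested tuples ("f", arg1, ...); variables: capitalized strings."""
    def walk(term, sub):
        # follow variable bindings to their current value
        while isinstance(term, str) and term[0].isupper() and term in sub:
            term = sub[term]
        return term

    def occurs(var, term, sub):
        # occurs check: does var appear inside term under sub?
        term = walk(term, sub)
        if term == var:
            return True
        return not isinstance(term, str) and any(
            occurs(var, a, sub) for a in term[1:])

    if subst is None:
        subst = {}
    s, t = walk(s, subst), walk(t, subst)
    if s == t:
        return subst
    if isinstance(s, str) and s[0].isupper():
        return None if occurs(s, t, subst) else {**subst, s: t}
    if isinstance(t, str) and t[0].isupper():
        return None if occurs(t, s, subst) else {**subst, t: s}
    if isinstance(s, str) or isinstance(t, str):
        return None                     # distinct constants / constant vs. compound
    if s[0] != t[0] or len(s) != len(t):
        return None                     # function symbol or arity clash
    for a, b in zip(s[1:], t[1:]):
        subst = mgu(a, b, subst)
        if subst is None:
            return None
    return subst
```

The occurs check makes X fail to unify with f(X), which keeps the substitution finite.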

slide-9
SLIDE 9

The Given-Clause Algorithm

U

(unprocessed clauses)

g P

(processed clauses)

g=☐ ?

◮ Aim: Move everything from U to P

7

slide-10
SLIDE 10

The Given-Clause Algorithm

U

(unprocessed clauses) Generate

g P

(processed clauses)

g=☐ ?

◮ Aim: Move everything from U to P
◮ Invariant: All generating inferences with premises from P have been performed

7

slide-11
SLIDE 11

The Given-Clause Algorithm

U

(unprocessed clauses) Generate Simplifiable? Simplify

g P

(processed clauses)

g=☐ ?

◮ Aim: Move everything from U to P
◮ Invariant: All generating inferences with premises from P have been performed
◮ Invariant: P is interreduced

7

slide-12
SLIDE 12

The Given-Clause Algorithm

U

(unprocessed clauses) Generate Simplifiable? Cheap Simplify Simplify

g P

(processed clauses)

g=☐ ?

◮ Aim: Move everything from U to P
◮ Invariant: All generating inferences with premises from P have been performed
◮ Invariant: P is interreduced
◮ Clauses added to U are simplified with respect to P
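The loop and its invariants can be sketched as follows. The clause representation (frozensets of literals) and the `generate`, `simplify`, and `evaluate` callbacks are illustrative assumptions, not E's actual interfaces; the toy `resolvents` function instantiates the sketch for propositional binary resolution.

```python
import heapq

def given_clause_loop(axioms, generate, simplify, evaluate, max_iters=10000):
    """Minimal sketch of the given-clause loop. `generate(g, P)` performs
    all inferences between g and P, `simplify(c, P)` returns a simplified
    clause or None if redundant, `evaluate(c)` is the selection heuristic."""
    U, count = [], 0                    # unprocessed clauses (priority queue)
    P = set()                           # processed clauses
    for c in axioms:
        heapq.heappush(U, (evaluate(c), count, c)); count += 1
    for _ in range(max_iters):
        if not U:
            return "saturated"          # consequences exhausted, no proof
        _, _, g = heapq.heappop(U)      # choice point: clause selection
        g = simplify(g, P)
        if g is None:
            continue                    # g was redundant, discard it
        if not g:
            return "proof"              # empty clause: explicit contradiction
        P.add(g)
        # restore the invariant: all inferences with premises from P done
        for c in generate(g, P):
            heapq.heappush(U, (evaluate(c), count, c)); count += 1
    return "timeout"

def resolvents(g, P):
    """Toy propositional resolution: literals are ints, negation is -l."""
    return [frozenset((g - {l}) | (c2 - {-l}))
            for c2 in P for l in g if -l in c2]
```

With `evaluate=len` this is pure symbol counting; on the Socrates example (clauses ¬H∨M, ¬P∨H, P, ¬M) the loop derives the empty clause.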

7

slide-13
SLIDE 13

Choice Point Clause Selection

U

(unprocessed clauses) Generate Simplifiable? Cheap Simplify Simplify

g P

(processed clauses)

g=☐ ?

U

(unprocessed clauses)

8

slide-14
SLIDE 14

Choice Point Clause Selection

U

(unprocessed clauses) Generate Simplifiable? Cheap Simplify Simplify

g P

(processed clauses)

g=☐ ?

U

(unprocessed clauses)

Choice Point

8

slide-15
SLIDE 15

Induction for Deduction

◮ Question 1: What to learn from?

◮ Performance data (prover is a black box)
◮ Proofs (only final result of search is visible)
◮ Proof search graphs (most of search is visible)

◮ Question 2: What to learn?

◮ Here: Learn strategy selection
◮ Here: Learn parameterization for clause selection heuristics
◮ Here: Learn new clause evaluation functions
◮ . . .

?

9

slide-16
SLIDE 16

Automatic Strategy Selection

10

slide-17
SLIDE 17

Strategy Selection

Definition: A strategy is a collection of all search control parameters

◮ Term ordering
◮ Literal selection scheme
◮ Clause selection heuristic
◮ . . . (minor parameters)

11

slide-18
SLIDE 18

Strategy Selection

Definition: A strategy is a collection of all search control parameters

◮ Term ordering
◮ Literal selection scheme
◮ Clause selection heuristic
◮ . . . (minor parameters)
◮ Observation: Different problems are simple for different strategies
◮ Question: Can we determine a good heuristic (or set of heuristics) up-front?

◮ Original: Manually coded automatic modes

◮ Based on developer intuition/insight/experience
◮ Limited success, high maintenance

◮ State of the art: Automatic generation of automatic modes

11

slide-19
SLIDE 19

“Learning” Heuristic Selection

TPTP problem library

12

slide-20
SLIDE 20

“Learning” Heuristic Selection

TPTP problem library

12

slide-21
SLIDE 21

“Learning” Heuristic Selection

TPTP problem library

Feature-based classification

12

slide-22
SLIDE 22

“Learning” Heuristic Selection

TPTP problem library

Feature-based classification

Assign strategies to classes based on collected performance data from previous experiments

  • Simplest: Always pick best strategy in class

  • If no data, pick globally best

12

slide-23
SLIDE 23

“Learning” Heuristic Selection

TPTP problem library

Feature-based classification

Example features

  • Number of clauses
  • Arity of symbols
  • Unit/Horn/Non-Horn

Assign strategies to classes based on collected performance data from previous experiments

  • Simplest: Always pick best strategy in class

  • If no data, pick globally best
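The selection scheme just described is a plain table lookup with a fallback. A minimal sketch, assuming `perf_db` maps a feature class (here a tuple of discretized features, a stand-in for the real feature vector) to collected per-strategy solution counts:

```python
def pick_strategy(problem_features, perf_db, global_best):
    """Feature-based strategy selection: pick the best-performing
    strategy recorded for this problem's feature class, falling back
    to the globally best strategy when no data exists for the class."""
    class_data = perf_db.get(problem_features)
    if not class_data:
        return global_best              # no data: pick globally best
    # simplest scheme: always pick the best strategy in class
    return max(class_data, key=class_data.get)
```

Strategy names and the database shape are hypothetical; the point is only the "best in class, else global best" decision rule from the slide.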

12

slide-24
SLIDE 24

Auto Mode Performance

[Plot: problems solved vs. run time, E 1.8 Auto compared to E 1.8 Best]

TPTP 5.6.0 CNF&FOF problems

13

slide-25
SLIDE 25

A Caveat

TPTP problem library

Feature-based classification

Example features

  • Number of clauses
  • Arity of symbols
  • Unit/Horn/Non-Horn

Assign strategies to classes based on collected performance data from previous experiments

  • Simplest: Always pick best strategy in class

  • If no data, pick globally best

14

slide-26
SLIDE 26

A Caveat

TPTP problem library

Feature-based classification

Example features

  • Number of clauses
  • Arity of symbols
  • Unit/Horn/Non-Horn

Assign strategies to classes based on collected performance data from previous experiments

  • Simplest: Always pick best strategy in class

  • If no data, pick globally best

Features based on developer…

  • …intuition
  • …insight
  • …experience

14

slide-27
SLIDE 27

Current Work: Learning Classification

◮ Characterize problems by performance vectors

◮ Which strategy solved the problem how fast?

◮ Unsupervised clustering of problems based on performance

◮ Each cluster contains problems on which the same strategies perform well

◮ Feature extraction: Try to find characterization of clusters

◮ E.g. based on feature set
◮ E.g. using nearest-neighbour approaches
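The clustering step can be illustrated with a toy pure-Python k-means over performance vectors (one entry per strategy, e.g. solution time or a large penalty for a timeout); problems landing in one cluster are those the same strategies solve well. This is a sketch only; real experiments would use a robust clustering library.

```python
import math, random

def kmeans(vectors, k, iters=20, seed=0):
    """Cluster per-problem performance vectors into k groups."""
    rnd = random.Random(seed)
    centers = rnd.sample(vectors, k)          # initial centers: random problems
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            # assign each problem to its nearest cluster center
            i = min(range(k), key=lambda i: math.dist(v, centers[i]))
            clusters[i].append(v)
        # move each center to the mean of its cluster (keep it if empty)
        centers = [
            tuple(sum(x) / len(c) for x in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return clusters
```

With two strategies, the vectors (0, 10), (1, 9) and (10, 0), (9, 1) separate into two clusters: problems fast for strategy one, and problems fast for strategy two.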

My Bachelor Student Ayatallah just started work on this topic - results in 6 months

[Handwritten annotation, reconstructed where legible:] The goals are a) a better auto mode (that is the practical goal) and b) to better understand which features influence search (that is the theoretical goal). Literature would be good here, e.g. the E paper (E 1.8 and "brainiac"), and something on clustering. Quite good for a start! Thanks.

15

slide-28
SLIDE 28

Learning parameterization for clause selection heuristics

16

slide-29
SLIDE 29

Reminder: Choice Point Clause Selection

U

(unprocessed clauses) Generate Simplifiable? Cheap Simplify Simplify

g P

(processed clauses)

g=☐ ?

U

(unprocessed clauses)

Choice Point

17

slide-30
SLIDE 30

Basic Approaches to Clause Selection

◮ Symbol counting

◮ Pick smallest clause in U
◮ |{f(X) = a, P(a) = $true, g(Y) = f(a)}| = 10

◮ FIFO

◮ Always pick oldest clause in U

◮ Flexible weighting

◮ Symbol counting, but give different weight to different symbols
◮ E.g. lower weight to symbols from goal!
◮ E.g. higher weight for symbols in inference positions

◮ Combinations

◮ Interleave different schemes
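Symbol counting is simple enough to sketch directly. Assuming terms as nested tuples (e.g. ("f", "X") for f(X)) and clauses as collections of equational literals, the weight of the example clause set above comes out to 10:

```python
def symbol_count(term):
    """Count symbol occurrences in a term: each function symbol,
    constant, and variable occurrence contributes 1."""
    if isinstance(term, str):
        return 1                        # variable or constant symbol
    return 1 + sum(symbol_count(t) for t in term[1:])

def clause_weight(clause):
    """Symbol-counting clause evaluation: sum over all literals,
    where each literal is an (lhs, rhs) pair of terms."""
    return sum(symbol_count(l) + symbol_count(r) for l, r in clause)
```

The representation is an assumption for illustration; flexible weighting would simply replace the constant 1 per symbol with a per-symbol weight table.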

18

slide-31
SLIDE 31

Given-Clause Selection in E (1)

◮ Domain Specific Language (DSL) for clause selection scheme
◮ Arbitrary number of priority queues
◮ Each queue ordered by:

◮ Unparameterized priority function
◮ Parameterized heuristic evaluation function

◮ Clauses picked using weighted round-robin scheme

◮ Example (5 queues):
(1*ConjectureRelativeSymbolWeight(SimulateSOS, 0.5, 100, 100, 100, 100, 1.5, 1.5, 1),
 4*ConjectureRelativeSymbolWeight(ConstPrio, 0.1, 100, 100, 100, 100, 1.5, 1.5, 1.5),
 1*FIFOWeight(PreferProcessed),
 1*ConjectureRelativeSymbolWeight(PreferNonGoals, 0.5, 100, 100, 100, 100, 1.5, 1.5, 1),
 4*Refinedweight(SimulateSOS, 3, 2, 2, 1.5, 2))
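The weighted round-robin mechanism itself is easy to sketch. In the following, each (weight, evaluation function) pair gets its own priority queue, and queues take turns in proportion to their weights; the evaluation functions are plain Python callables, a stand-in for E's parameterized DSL functions:

```python
import heapq, itertools

class RoundRobinSelector:
    """Weighted round-robin clause selection over several priority queues."""

    def __init__(self, weighted_evals):
        self.evals = [f for _, f in weighted_evals]
        self.heaps = [[] for _ in weighted_evals]
        # weights (2, 1) give the cyclic pick schedule [0, 0, 1]
        self.schedule = itertools.cycle(
            [i for i, (w, _) in enumerate(weighted_evals) for _ in range(w)])
        self.tick = itertools.count()   # insertion order, breaks eval ties
        self.picked = set()
        self.size = 0

    def insert(self, clause):
        t = next(self.tick)
        self.size += 1
        for heap, evalfn in zip(self.heaps, self.evals):
            heapq.heappush(heap, (evalfn(clause), t, clause))

    def pop(self):
        if self.size == 0:
            raise IndexError("pop from empty selector")
        for i in self.schedule:
            while self.heaps[i]:
                clause = heapq.heappop(self.heaps[i])[2]
                if clause not in self.picked:   # skip copies picked via other queues
                    self.picked.add(clause)
                    self.size -= 1
                    return clause
```

With a symbol-counting queue at weight 2 and a FIFO queue (constant evaluation, ties broken by insertion order) at weight 1, two small clauses are picked for every old clause.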

19

slide-32
SLIDE 32

Given-Clause Selection in E (2)

◮ Example clause selection heuristic

(1*ConjectureRelativeSymbolWeight(SimulateSOS, 0.5, 100, 100, 100, 100, 1.5, 1.5, 1),
 4*ConjectureRelativeSymbolWeight(ConstPrio, 0.1, 100, 100, 100, 100, 1.5, 1.5, 1.5),
 1*FIFOWeight(PreferProcessed),
 1*ConjectureRelativeSymbolWeight(PreferNonGoals, 0.5, 100, 100, 100, 100, 1.5, 1.5, 1),
 4*Refinedweight(SimulateSOS, 3, 2, 2, 1.5, 2))

◮ Infinitely many possibilities

◮ Several integer and floating point parameters per evaluation function
◮ Arbitrary combinations of individual evaluation functions

20

slide-33
SLIDE 33

Given-Clause Selection in E (2)

◮ Example clause selection heuristic

(1*ConjectureRelativeSymbolWeight(SimulateSOS, 0.5, 100, 100, 100, 100, 1.5, 1.5, 1),
 4*ConjectureRelativeSymbolWeight(ConstPrio, 0.1, 100, 100, 100, 100, 1.5, 1.5, 1.5),
 1*FIFOWeight(PreferProcessed),
 1*ConjectureRelativeSymbolWeight(PreferNonGoals, 0.5, 100, 100, 100, 100, 1.5, 1.5, 1),
 4*Refinedweight(SimulateSOS, 3, 2, 2, 1.5, 2))

◮ Infinitely many possibilities

◮ Several integer and floating point parameters per evaluation function
◮ Arbitrary combinations of individual evaluation functions

How do we find good clause selection heuristics (without relying on developer intuition, insight, experience)?

20

slide-34
SLIDE 34

Genetic Algorithms

◮ Optimization based on evolving population of individuals

◮ Optimization is organized in generations
◮ In each generation, individuals compete to reproduce

◮ Each individual is a candidate solution (i.e. search heuristic)

◮ Individuals are assigned a fitness score based on performance
◮ More fit individuals are more likely to reproduce into the next generation

◮ The next generation:

◮ Mutation - randomly modify individual
◮ Crossover - create new individual from two parents
◮ Survivors
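A generic GA skeleton in the shape described here, with tournament selection, crossover, mutation, and a surviving elite; the operators are caller-supplied callables, so nothing in this sketch is specific to clause selection heuristics or E's DSL:

```python
import random

def evolve(fitness, random_individual, mutate, crossover,
           pop_size=20, generations=30, tournament=5, seed=0):
    """Evolve a population and return the fittest final individual."""
    rnd = random.Random(seed)
    pop = [random_individual(rnd) for _ in range(pop_size)]
    for _ in range(generations):
        scored = [(fitness(ind), ind) for ind in pop]
        best = max(scored, key=lambda s: s[0])[1]

        def select():
            # tournament selection: fittest of `tournament` random picks
            return max(rnd.sample(scored, tournament), key=lambda s: s[0])[1]

        nxt = [best]                    # survivor: keep the current best
        while len(nxt) < pop_size:
            child = crossover(select(), select(), rnd)
            nxt.append(mutate(child, rnd))
        pop = nxt
    return max(pop, key=fitness)
```

A toy instantiation (maximizing a one-dimensional function) shows the mechanics; real individuals would be encoded heuristics and fitness would count solved problems.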

21

slide-35
SLIDE 35

Applying Genetic Algorithms to Clause Selection

◮ Encoding: DSL translated into S-Expressions
◮ Mutation: Randomly modify parameters of one heuristic
◮ Crossover:

◮ Compose individual by randomly inserting evaluation functions from both parents
◮ If the same generic evaluation function occurs in both, randomly exchange parameters

◮ Fitness: How many medium difficulty problems are solved

◮ . . . on smallish sample set
◮ . . . with short time limit

◮ Selection: Tournament selection (n ≈ 5)

22

slide-36
SLIDE 36

Evolution in Action

[Plot: fitness (solved problems) over generations]

23

slide-37
SLIDE 37

(Very) Preliminary Results

◮ Evolution finds good clause selection heuristics from random initial population
◮ Convergence in ≈ 200 generations
◮ Time per generation ≈ 45 CPU hours
◮ . . . ≈ 40 minutes on 24 core server
◮ Best evolved heuristic beats best conventional heuristic
◮ Evaluation on 15758 problems from TPTP 6.0.0
◮ 30 second time limit, 2.6GHz Intel Xeon machines, enough memory
◮ Evolved: 8814 solutions found
◮ Manual: 8750 solutions found
◮ Unique solutions: 466 evolved vs. 386 manual

24

slide-38
SLIDE 38

Current Work: Diversity Beats Ferocity

25


slide-47
SLIDE 47

Current Work: Diversity Beats Ferocity

◮ Idea: Modify fitness function

◮ Problems are prey
◮ Individual heuristics are predators
◮ If several predators catch the same prey, they have to share the benefit
◮ ⇒ problems solved by no or few heuristics are more valuable
◮ ⇒ Force diversity of the ecosystem
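The shared-benefit idea translates directly into a fitness function: a problem solved by k heuristics is worth 1/k to each of them, so rarely-solved problems count for more and the population is pushed toward diversity. A minimal sketch, assuming `solves` maps each heuristic to the set of problems it solves:

```python
def shared_fitness(solves):
    """Predator/prey fitness sharing: return {heuristic: fitness}."""
    solvers_per_problem = {}
    for probs in solves.values():
        for p in probs:
            solvers_per_problem[p] = solvers_per_problem.get(p, 0) + 1
    # each solved problem contributes 1/k, k = number of solvers
    return {h: sum(1.0 / solvers_per_problem[p] for p in probs)
            for h, probs in solves.items()}
```

A heuristic that solves one unique problem thus outscores one that only solves a problem everyone else solves too.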

26

slide-48
SLIDE 48

Current Work: Diversity Beats Ferocity

◮ Idea: Modify fitness function

◮ Problems are prey
◮ Individual heuristics are predators
◮ If several predators catch the same prey, they have to share the benefit
◮ ⇒ problems solved by no or few heuristics are more valuable
◮ ⇒ Force diversity of the ecosystem

My Bachelor Student Ahmed just started work on this topic - results in 6 months

26

slide-49
SLIDE 49

Proof Extraction and Learning

27

slide-50
SLIDE 50

Learning from Proofs and Proof Search Graphs

◮ Intuition: Previous proof searches are useful to guide new proof attempts

◮ Naive approach:

◮ Clauses in the proof tree are positive examples
◮ (All other clauses are negative examples)

◮ Initial attempts

◮ DISCOUNT (Schulz 1995, Schulz & Denzinger 1996) - UEQ, patterns
◮ E (Schulz 2000, 2001) - CNF, patterns
◮ Overall, modest successes
◮ Mostly with positive examples only - compare Otter’s hints

[Proof graph: clause nodes c0 ... c30]

28

slide-51
SLIDE 51

Problems and Solutions

◮ Problem: Search protocol size

◮ Initial approach: Store all intermediate steps
◮ Bad time and space performance
◮ Borderline impossible in 2000, still hard today

◮ Problem: Not all examples represent search decisions

◮ Many intermediate results
◮ Also: Vastly unbalanced ratio of positive/negative examples

◮ Common solution:

◮ Internal proof object (re-)construction
◮ Compact representation of the search graph
◮ Actually evaluated and picked clauses are recorded
◮ Minimal overhead (0.24%) in time
◮ Small overhead in memory (due to structure sharing and early discarding of many redundant clauses)

29

slide-52
SLIDE 52

Proof Generation with Limited Archiving

◮ DISCOUNT loop: Only clauses in P are used for inferences
◮ U is subject to simplification, but is passive
◮ Only clauses in P need to be available in the proof tree

U

(unprocessed clauses) Generate Simplifiable? Cheap Simplify Simplify

g P

(processed clauses)

g=☐ ?

30

slide-53
SLIDE 53

Proof Generation with Limited Archiving

◮ DISCOUNT loop: Only clauses in P are used for inferences
◮ U is subject to simplification, but is passive
◮ Only clauses in P need to be available in the proof tree

◮ Backward simplification is rare

◮ Only clauses in P can be backwards-simplified (and P is small)
◮ Heuristically, newer clauses are larger (and big clauses rarely simplify small clauses)

U

(unprocessed clauses) Generate Simplifiable? Cheap Simplify Simplify

g P

(processed clauses) g=☐ ?

U

(unprocessed clauses)

30

slide-54
SLIDE 54

Proof Generation with Limited Archiving

◮ DISCOUNT loop: Only clauses in P are used for inferences
◮ U is subject to simplification, but is passive
◮ Only clauses in P need to be available in the proof tree

◮ Backward simplification is rare

◮ Only clauses in P can be backwards-simplified (and P is small)
◮ Heuristically, newer clauses are larger (and big clauses rarely simplify small clauses)

◮ Solution: Non-destructive backwards-simplification
◮ Clauses in P are archived on simplification
◮ Simplified new clause is built from fresh copy

U

(unprocessed clauses) Generate Simplifiable? Cheap Simplify Simplify

g P

(processed clauses) g=☐ ?

U

(unprocessed clauses)

30

slide-55
SLIDE 55

Proof Generation

U

(unprocessed clauses) Generate Simplifiable? Cheap Simplify Simplify

g P

(processed clauses)

g=☐ ?

31

slide-56
SLIDE 56

Proof Generation

U

(unprocessed clauses) Generate Simplifiable? Cheap Simplify Simplify

g P

(processed clauses)

g=☐ ?

A

(archive) 31

slide-57
SLIDE 57

The Structure of Proofs: Commutative Rings

cnf(c_0_0, axiom, (multiply(X1,add(X2,X3))=add(multiply(X1,X2),multiply(X1,X3))), file('/Users/schulz/EPROVER/TPTP_6.0.0_FLAT/Axioms/RNG005-0.ax', distribute1)).
cnf(c_0_10, hypothesis, (multiply(X1,add(X2,X1))=add(X1,multiply(X1,X2))), inference(rw,[status(thm)],[inference(pm,[status(thm)],[c_0_0, c_0_1]), c_0_2])).
cnf(c_0_17, hypothesis, (multiply(X1,add(X1,X2))=add(X1,multiply(X1,X2))), inference(pm,[status(thm)],[c_0_0, c_0_1])).
cnf(c_0_1, hypothesis, (multiply(X1,X1)=X1), file('/Users/schulz/EPROVER/TPTP_6.0.0_FLAT/RNG008-7.p', boolean_ring)).
cnf(c_0_12, hypothesis, (add(X1,multiply(X1,additive_identity))=X1), inference(rw,[status(thm)],[inference(pm,[status(thm)],[c_0_10, c_0_5]), c_0_1])).
cnf(c_0_13, hypothesis, (multiply(add(X1,X2),X2)=add(X2,multiply(X1,X2))), inference(rw,[status(thm)],[inference(pm,[status(thm)],[c_0_6, c_0_1]), c_0_2])).
cnf(c_0_16, hypothesis, (add(X1,multiply(additive_identity,X1))=X1), inference(rw,[status(thm)],[inference(pm,[status(thm)],[c_0_13, c_0_5]), c_0_1])).
cnf(c_0_21, hypothesis, (multiply(add(X1,X2),X1)=add(X1,multiply(X2,X1))), inference(pm,[status(thm)],[c_0_6, c_0_1])).
cnf(c_0_2, axiom, (add(X1,X2)=add(X2,X1)), file('/Users/schulz/EPROVER/TPTP_6.0.0_FLAT/Axioms/RNG005-0.ax', commutativity_for_addition)).
cnf(c_0_24, plain, (add(X1,add(X2,X3))=add(X3,add(X1,X2))), inference(pm,[status(thm)],[c_0_2, c_0_3])).
cnf(c_0_3, axiom, (add(X1,add(X2,X3))=add(add(X1,X2),X3)), file('/Users/schulz/EPROVER/TPTP_6.0.0_FLAT/Axioms/RNG005-0.ax', associativity_for_addition)).
cnf(c_0_11, plain, (add(X1,add(additive_inverse(X1),X2))=X2), inference(rw,[status(thm)],[inference(pm,[status(thm)],[c_0_3, c_0_4]), c_0_5])).
cnf(c_0_27, hypothesis, (add(X1,add(X1,X2))=X2), inference(rw,[status(thm)],[inference(pm,[status(thm)],[c_0_3, c_0_25]), c_0_5])).
cnf(c_0_28, hypothesis, (add(X1,multiply(X1,X2))=add(X1,multiply(X2,X1))), inference(rw,[status(thm)],[inference(rw,[status(thm)],[inference(rw,[status(thm)],[inference(rw,[status(thm)],[inference(pm,[status(thm)],[c_0_13, c_0_26]), c_0_17]), c_0_10]), c_0_3]), c_0_27])).
cnf(c_0_4, axiom, (add(X1,additive_inverse(X1))=additive_identity), file('/Users/schulz/EPROVER/TPTP_6.0.0_FLAT/Axioms/RNG005-0.ax', right_additive_inverse)).
cnf(c_0_14, hypothesis, (multiply(additive_inverse(X1),additive_identity)=additive_identity), inference(rw,[status(thm)],[inference(pm,[status(thm)],[c_0_11, c_0_12]), c_0_4])).
cnf(c_0_15, plain, (additive_inverse(additive_inverse(X1))=X1), inference(rw,[status(thm)],[inference(pm,[status(thm)],[c_0_11, c_0_4]), c_0_7])).
cnf(c_0_19, hypothesis, (multiply(additive_identity,additive_inverse(X1))=additive_identity), inference(rw,[status(thm)],[inference(pm,[status(thm)],[c_0_11, c_0_16]), c_0_4])).
cnf(c_0_20, hypothesis, (add(X1,multiply(X1,additive_inverse(X1)))=additive_identity), inference(rw,[status(thm)],[inference(pm,[status(thm)],[c_0_17, c_0_4]), c_0_18])).
cnf(c_0_25, hypothesis, (add(X1,X1)=additive_identity), inference(rw,[status(thm)],[inference(rw,[status(thm)],[inference(pm,[status(thm)],[c_0_21, c_0_4]), c_0_22]), c_0_23])).
cnf(c_0_5, axiom, (add(additive_identity,X1)=X1), file('/Users/schulz/EPROVER/TPTP_6.0.0_FLAT/Axioms/RNG005-0.ax', left_additive_identity)).
cnf(c_0_6, axiom, (multiply(add(X1,X2),X3)=add(multiply(X1,X3),multiply(X2,X3))), file('/Users/schulz/EPROVER/TPTP_6.0.0_FLAT/Axioms/RNG005-0.ax', distribute2)).
cnf(c_0_7, axiom, (add(X1,additive_identity)=X1), file('/Users/schulz/EPROVER/TPTP_6.0.0_FLAT/Axioms/RNG005-0.ax', right_additive_identity)).
cnf(c_0_23, hypothesis, (multiply(additive_inverse(X1),X1)=X1), inference(rw,[status(thm)],[inference(rw,[status(thm)],[inference(pm,[status(thm)],[c_0_11, c_0_20]), c_0_7]), c_0_15])).
cnf(c_0_26, hypothesis, (add(X1,add(X2,X1))=X2), inference(rw,[status(thm)],[inference(pm,[status(thm)],[c_0_24, c_0_25]), c_0_7])).
cnf(c_0_8, negated_conjecture, (multiply(b,a)!=c), file('/Users/schulz/EPROVER/TPTP_6.0.0_FLAT/RNG008-7.p', prove_commutativity)).
cnf(c_0_30, negated_conjecture, ($false), inference(cn,[status(thm)],[inference(rw,[status(thm)],[inference(rw,[status(thm)],[c_0_8, c_0_29]), c_0_9])])).
cnf(c_0_9, negated_conjecture, (multiply(a,b)=c), file('/Users/schulz/EPROVER/TPTP_6.0.0_FLAT/RNG008-7.p', a_times_b_is_c)).
cnf(c_0_18, hypothesis, (multiply(X1,additive_identity)=additive_identity), inference(pm,[status(thm)],[c_0_14, c_0_15])).
cnf(c_0_22, hypothesis, (multiply(additive_identity,X1)=additive_identity), inference(pm,[status(thm)],[c_0_19, c_0_15])).
cnf(c_0_29, hypothesis, (multiply(X1,X2)=multiply(X2,X1)), inference(rw,[status(thm)],[inference(pm,[status(thm)],[c_0_27, c_0_28]), c_0_27])).

32

slide-58
SLIDE 58

The Structure of Proofs: Commutative Rings

cnf(c_0_0, axiom, (multiply(X1,add(X2,X3))=add(multiply(X1,X2),multiply(X1,X3)))).
cnf(c_0_10, hypothesis, (multiply(X1,add(X2,X1))=add(X1,multiply(X1,X2)))).
cnf(c_0_17, hypothesis, (multiply(X1,add(X1,X2))=add(X1,multiply(X1,X2)))).
cnf(c_0_1, hypothesis, (multiply(X1,X1)=X1)).
cnf(c_0_12, hypothesis, (add(X1,multiply(X1,additive_identity))=X1)).
cnf(c_0_13, hypothesis, (multiply(add(X1,X2),X2)=add(X2,multiply(X1,X2)))).
cnf(c_0_16, hypothesis, (add(X1,multiply(additive_identity,X1))=X1)).
cnf(c_0_21, hypothesis, (multiply(add(X1,X2),X1)=add(X1,multiply(X2,X1)))).
cnf(c_0_2, axiom, (add(X1,X2)=add(X2,X1))).
cnf(c_0_24, plain, (add(X1,add(X2,X3))=add(X3,add(X1,X2)))).
cnf(c_0_3, axiom, (add(X1,add(X2,X3))=add(add(X1,X2),X3))).
cnf(c_0_11, plain, (add(X1,add(additive_inverse(X1),X2))=X2)).
cnf(c_0_27, hypothesis, (add(X1,add(X1,X2))=X2)).
cnf(c_0_28, hypothesis, (add(X1,multiply(X1,X2))=add(X1,multiply(X2,X1)))).
cnf(c_0_4, axiom, (add(X1,additive_inverse(X1))=additive_identity)).
cnf(c_0_14, hypothesis, (multiply(additive_inverse(X1),additive_identity)=additive_identity)).
cnf(c_0_15, plain, (additive_inverse(additive_inverse(X1))=X1)).
cnf(c_0_19, hypothesis, (multiply(additive_identity,additive_inverse(X1))=additive_identity)).
cnf(c_0_20, hypothesis, (add(X1,multiply(X1,additive_inverse(X1)))=additive_identity)).
cnf(c_0_25, hypothesis, (add(X1,X1)=additive_identity)).
cnf(c_0_5, axiom, (add(additive_identity,X1)=X1)).
cnf(c_0_6, axiom, (multiply(add(X1,X2),X3)=add(multiply(X1,X3),multiply(X2,X3)))).
cnf(c_0_7, axiom, (add(X1,additive_identity)=X1)).
cnf(c_0_23, hypothesis, (multiply(additive_inverse(X1),X1)=X1)).
cnf(c_0_26, hypothesis, (add(X1,add(X2,X1))=X2)).
cnf(c_0_8, negated_conjecture, (multiply(b,a)!=c)).
cnf(c_0_30, negated_conjecture, ($false)).
cnf(c_0_9, negated_conjecture, (multiply(a,b)=c)).
cnf(c_0_18, hypothesis, (multiply(X1,additive_identity)=additive_identity)).
cnf(c_0_22, hypothesis, (multiply(additive_identity,X1)=additive_identity)).

32

slide-59
SLIDE 59

The Structure of Proofs: Commutative Rings

[Proof graph: clause nodes c0 ... c30]

32

slide-60
SLIDE 60

The Structure of Proofs: Commutative Rings

[Proof graph: clause nodes c0 ... c30]

32

slide-61
SLIDE 61

The Structure of Proofs: Commutative Rings

[Search graph: clause nodes c0 ... c234]

32

slide-62
SLIDE 62

Classification of Search Decisions

◮ Proof state at success:

◮ All proof clauses are in P ∪ A
◮ Clauses in U never contribute
◮ All clauses in P ∪ A have been selected for processing
◮ Positive examples: Proof clauses
◮ Negative examples: Non-proof clauses

U

(unprocessed clauses) Generate Simplifiable? Cheap Simplify Simplify

g P

(processed clauses)

g=☐ ?

A

(archive)

33

slide-63
SLIDE 63

Classification of Search Decisions

◮ Proof state at success:

◮ All proof clauses are in P ∪ A
◮ Clauses in U never contribute
◮ All clauses in P ∪ A have been selected for processing
◮ Positive examples: Proof clauses
◮ Negative examples: Non-proof clauses

Idea: Apply Machine Learning
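The labeling step at success is a small set computation. A sketch, with `processed`, `archive`, and `proof_clauses` as plain collections standing in for P, A, and the extracted proof object:

```python
def training_examples(processed, archive, proof_clauses):
    """Label search decisions at success: every clause in P ∪ A was
    selected for processing; those in the proof are positive examples,
    the rest negative. Clauses still in U are ignored, since they never
    contributed to a search decision."""
    selected = set(processed) | set(archive)
    positives = selected & set(proof_clauses)
    negatives = selected - positives
    return positives, negatives
```

This is exactly the split a learned clause evaluation would be trained on.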

U

(unprocessed clauses) Generate Simplifiable? Cheap Simplify Simplify

g P

(processed clauses)

g=☐ ?

A

(archive)

33

slide-64
SLIDE 64

Example: Proof Objects RNG008-7

[Proof object clause listing elided: c0 ... c30]

34

slide-65
SLIDE 65

Example: Proof Objects RNG008-7

[Proof object clause listing elided: c0 ... c40]

34

slide-66
SLIDE 66

Example: Proof Objects RNG008-7

[Proof object clause listing elided: c0 ... c234]

34

slide-67
SLIDE 67

Example: Proof Objects RNG008-7

[Proof object clause listing elided: c0 ... c245]

34

slide-68
SLIDE 68

Some Initial Results

◮ Training examples can be cheaply extracted
◮ Ratio of utilized to useless given clauses (GCU-ratio) is a good predictor of heuristic performance (Schulz/Möhrmann, IJCAR 2016)
◮ Positive training examples can be automatically written into a watch list and used as hints
◮ Clauses on the watchlist are preferred over all other clauses
◮ First experiments
◮ Reproving with much better GCU-ratio (and much faster)
◮ Some improvement even for related problems
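Both ideas are simple to sketch. The GCU-ratio compares utilized to useless given clauses, and a watch-list preference can be modeled as a two-component evaluation whose first component sorts watch-list members before everything else; the function names and the tuple encoding are illustrative assumptions, not E's actual implementation.

```python
def gcu_ratio(given_clauses, proof_clauses):
    """Ratio of utilized to useless given clauses: of the clauses
    selected for processing, how many ended up in the proof vs. not."""
    used = sum(1 for c in given_clauses if c in proof_clauses)
    return used / max(1, len(given_clauses) - used)

def watchlist_eval(clause, base_weight, watchlist):
    """Hypothetical evaluation: (0, w) for watch-list members, (1, w)
    otherwise, so watch-list clauses compare smaller (= preferred)
    regardless of their base weight."""
    return (0 if clause in watchlist else 1, base_weight(clause))
```

A perfect search would select only proof clauses and approach an infinite GCU-ratio; in practice most given clauses are useless.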

35

slide-69
SLIDE 69

Open Questions

◮ Abstractions

◮ Are concrete function symbols relevant?
◮ Is the concrete term structure relevant?

◮ Learning methods

◮ Folding architecture networks?
◮ Feature-based numerical methods?
◮ Pattern-based learning?
◮ Deep learning with convoluted networks?

◮ Trade-offs

◮ Power vs. convenience
◮ Speed vs. quality
◮ Online vs. offline costs

?

36

slide-70
SLIDE 70

Open Questions

◮ Abstractions

◮ Are concrete function symbols relevant?
◮ Is the concrete term structure relevant?

◮ Learning methods

◮ Folding architecture networks?
◮ Feature-based numerical methods?
◮ Pattern-based learning?
◮ Deep learning with convoluted networks?

◮ Trade-offs

◮ Power vs. convenience
◮ Speed vs. quality
◮ Online vs. offline costs

?

Work in Progress

36

slide-71
SLIDE 71

(Nearly) The End

37

slide-72
SLIDE 72

Conclusion

◮ Controlling proof search for theorem provers is a rich application for machine learning techniques
◮ Inductive techniques can be applied at several different levels of search control

◮ Explicit proofs can be generated efficiently

◮ . . . and mined for training examples!

◮ Proofs are beautiful and informative

◮ Learning from proofs may be the future

38

slide-73
SLIDE 73

Conclusion

◮ Controlling proof search for theorem provers is a rich application for machine learning techniques
◮ Inductive techniques can be applied at several different levels of search control

◮ Explicit proofs can be generated efficiently

◮ . . . and mined for training examples!

◮ Proofs are beautiful and informative

◮ Learning from proofs may be the future

Thank you!

Questions?

38

slide-74
SLIDE 74

Image Credit

◮ Clipart via http://openclipart.org
◮ Hieronymus Bosch via https://commons.wikimedia.org

39