SLIDE 1

Whipping Satallax

A sadistic approach to internal guidance

Michael Färber, Chad Brown

6 April 2016

Michael Färber, Chad Brown Whipping Satallax 1/21

SLIDE 2

Introduction FEMaLeCoP Satallax Evaluation

SLIDE 3

Introduction

SLIDE 4

Introduction

Chad Brown a.k.a. Marquis de Sade

Figure 1: An American in Prague (Američan v Praze).

SLIDE 5

Introduction

120 days of learning – a play in 3 acts

Protagonists

  • Josef Urban
  • Cezary Kaliszyk
  • Daniel Kühlwein
  • Chad Brown

Projects

  • MaLeS: Machine Learning of Strategies, which invents ATP strategies automatically
  • MaLeCoP & FEMaLeCoP: (Fairly Efficient) Machine Learning Connection Prover

  • Satallax: an ATP for higher-order logic

SLIDE 6

FEMaLeCoP

SLIDE 7

FEMaLeCoP

FEMaLeCoP = leanCoP + fast ML

The three steps to learning

  1. Record which contrapositives (clause + literal) are useful in which prover state
  2. Create an efficient classifier from the learnt data
  3. Rank future choices using the classifier

What to influence?

tableau extension step: choice of contrapositive

How to characterise prover state?

symbols of previously chosen literals on active path
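The state characterisation above can be sketched as follows. This is only an illustration: the nested-tuple term representation, the convention that variables start with an upper-case letter, and the names `symbols` and `path_features` are assumptions, not FEMaLeCoP's actual data structures.

```python
# A literal is modelled as a nested tuple: (symbol, arg1, arg2, ...).
# Assumption: variables are strings starting with an upper-case letter,
# constants are any other strings.

def symbols(term):
    """Collect the predicate, function and constant symbols of a term."""
    if isinstance(term, str):
        # Variables contribute no symbol; a constant is its own symbol.
        return set() if term[:1].isupper() else {term}
    head, *args = term
    found = {head}
    for arg in args:
        found |= symbols(arg)
    return found

def path_features(active_path):
    """Features of a prover state: symbols of all literals on the active path."""
    features = set()
    for literal in active_path:
        features |= symbols(literal)
    return features

# Active path with the literals p(f(X), a) and q(b).
path = [("p", ("f", "X"), "a"), ("q", "b")]
features = path_features(path)
```

Each contrapositive that turns out to be useful would then be recorded together with the feature set of the state in which it was chosen.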

SLIDE 8

FEMaLeCoP

Ranking

Naive Bayes

Find the contrapositive l (label) with maximal probability of being useful in conjunction with the path symbols f (features):

$$r(l, f) = P(l) \prod_i P(f_i \mid l)$$

In practice (simplified)

$$r(l, f) = \log D_l + \sum_i \log(\mathrm{idf}(f_i)) \, c(l, f_i)$$

$$c(l, f) = \begin{cases} \sigma & \text{if } D_{l,f} = 0 \\ \log \dfrac{D_{l,f}}{D_l} & \text{otherwise} \end{cases}$$

D_l is the number of occurrences of l, and D_{l,f} is the number of co-occurrences of l with f.
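A minimal sketch of the simplified ranking, assuming a particular definition of idf (number of examples divided by the number of examples containing the feature) and an arbitrary value for σ; neither is given on the slide, and the class name `NaiveBayesRanker` is hypothetical.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesRanker:
    def __init__(self, sigma=-5.0):
        self.sigma = sigma                    # value for unseen (label, feature) pairs; an assumption
        self.label_count = Counter()          # D_l
        self.cooccur = defaultdict(Counter)   # D_{l,f}
        self.feature_count = Counter()        # examples containing f, used for idf
        self.examples = 0

    def learn(self, label, features):
        """Record one useful label together with the features of its state."""
        self.examples += 1
        self.label_count[label] += 1
        for f in set(features):
            self.cooccur[label][f] += 1
            self.feature_count[f] += 1

    def idf(self, f):
        # Assumed definition: total examples / examples containing f.
        return self.examples / max(1, self.feature_count[f])

    def rank(self, label, features):
        """r(l, f) = log D_l + sum_i log(idf(f_i)) * c(l, f_i), for a learnt label."""
        d_l = self.label_count[label]
        score = math.log(d_l)
        for f in set(features):
            d_lf = self.cooccur[label][f]
            c = self.sigma if d_lf == 0 else math.log(d_lf / d_l)
            score += math.log(self.idf(f)) * c
        return score

ranker = NaiveBayesRanker()
ranker.learn("c1", {"p", "q"})
ranker.learn("c1", {"p"})
ranker.learn("c2", {"q", "r"})

# Pick the learnt label that ranks highest for a state with feature "p".
best = max(["c1", "c2"], key=lambda l: ranker.rank(l, {"p"}))
```

In this toy data, "c1" always co-occurred with "p" while "c2" never did, so "c1" is preferred for a state whose only feature is "p".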

SLIDE 9

Satallax

SLIDE 10

Satallax

Satallax 101

Basic procedure

  • Based on given clause algorithm
  • Uses SAT solver to find contradictions among active clauses

Vocabulary

  • Priority queue: holds proof commands such as Formula Processing, Mating, Confrontation, ...
  • The priority is determined by a set of flags, which together form a mode
  • A set of modes with runtime weights is called a strategy (MaLeS is used to find modes and strategies)

SLIDE 11

Satallax

ML-ATP questions

Questions

  • Where to influence proof search?
  • How to characterise prover state?

Point of influence

  • More than 90% of the commands on the priority queue are ProcessProp commands, which store only a term
  • Influence the priority of commands (taking care not to influence it too much, for fairness towards other commands)
  • Difference to FEMaLeCoP: also remember intermediate facts → “lemma learning”

SLIDE 12

Satallax

Collecting training data

When to record data?

  • Recording data during proof search can considerably hurt the success rate
  • Solution: save data only once a proof has been found

What data to save?

  • Conjecture (if given)
  • Axioms (problem premises)
  • Processed terms + their priorities
  • Refutation terms (set of terms actually used for the proof)

SLIDE 13

Satallax

Training data postprocessing

Positive / negative examples

  • Positive examples: Processed terms ∩ refutation terms
  • Negative examples: All other processed terms
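The split into positive and negative examples is plain set arithmetic. A small sketch, with terms stood in for by strings and the function name `split_examples` chosen for illustration:

```python
def split_examples(processed, refutation):
    """Positive: processed terms used in the refutation; negative: all other processed terms."""
    positive = processed & refutation
    negative = processed - positive
    return positive, negative

processed_terms = {"t1", "t2", "t3", "t4"}
refutation_terms = {"t2", "t4"}
positive, negative = split_examples(processed_terms, refutation_terms)
```

Every processed term ends up in exactly one of the two sets.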

Options

  • Discard terms with fresh variables
  • Normalise all symbols in terms, e.g. (a + b) + c = a + (b + c) becomes c1(c1(c2, c3), c4) = c1(c2, c1(c3, c4))

  • Normalise only fresh variables
  • Only keep axiom terms (to measure “premise selection effect”)
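The symbol normalisation option can be sketched as a renaming in order of first occurrence. The nested-tuple term representation and the decision to leave `=` untouched are assumptions chosen so that the result matches the example in the list above; Satallax's actual term representation differs.

```python
def normalise(term, keep=frozenset({"="}), names=None):
    """Rename symbols to c1, c2, ... in order of first occurrence (pre-order traversal).

    Terms are nested tuples (symbol, arg1, ...) or plain strings.
    Symbols listed in `keep` (assumed: equality) are left unchanged.
    """
    if names is None:
        names = {}

    def rename(sym):
        if sym in keep:
            return sym
        if sym not in names:
            names[sym] = f"c{len(names) + 1}"
        return names[sym]

    if isinstance(term, str):
        return rename(term)
    head, *args = term
    new_head = rename(head)  # rename the head before descending into the arguments
    return (new_head, *[normalise(arg, keep, names) for arg in args])

# (a + b) + c = a + (b + c)
assoc = ("=", ("+", ("+", "a", "b"), "c"), ("+", "a", ("+", "b", "c")))
normalised = normalise(assoc)
```

After normalisation, two terms that differ only in the names of their symbols map to the same training example.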

Possible features

  • Axioms
  • Symbols of processed terms

SLIDE 14

Satallax

Naive Bayes classification with monoid occurrences

Problem

  • Only positive examples à la FEMaLeCoP give bad results
  • How to integrate negative examples? Multiple classifiers, . . . ?

Solution

  • Generalised classifier to store term occurrences as monoid types
  • Allows easy extension of the classifier to different kinds of occurrences (e.g. neutral examples) while keeping performance high

In Code

  • Before: lbl_no : ('l, int) Hashtbl.t
  • After: lbl_no : ('l, LabelNo.t) Hashtbl.t, where LabelNo is a monoid

SLIDE 15

Satallax

Monoids

Commutative monoid

A commutative monoid is a pair (M, +) with a neutral element 0 ∈ M such that, for all a, b, c ∈ M:

  • (a + b) + c = a + (b + c)
  • a + 0 = a
  • a + b = b + a

Monoids as label occurrences

  • 0 represents the non-occurrence of a label.
  • + combines label occurrences.
  • Commutativity of +: order of learnt labels does not matter.

Pair monoid for positive/negative examples

Let M = (ℕ × ℕ, +_M), with neutral element 0_M = (0, 0) and +_M pairwise addition. The first and second components of a pair store the positive and negative label occurrences, respectively.
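The pair monoid and the label table can be sketched in a few lines. This is an illustration in Python rather than the OCaml of the actual implementation; `PairCount`, `learn` and the constants are hypothetical names, and a plain dictionary stands in for the hash table `lbl_no`.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PairCount:
    """Element of the pair monoid: (positive occurrences, negative occurrences)."""
    pos: int = 0
    neg: int = 0

    def __add__(self, other):
        # Pairwise addition: associative and commutative, with PairCount(0, 0) as neutral element.
        return PairCount(self.pos + other.pos, self.neg + other.neg)

ZERO = PairCount()          # non-occurrence of a label
POSITIVE = PairCount(1, 0)  # one positive occurrence
NEGATIVE = PairCount(0, 1)  # one negative occurrence

def learn(table, label, occurrence):
    """Combine a new occurrence with whatever is already stored for the label."""
    table[label] = table.get(label, ZERO) + occurrence

lbl_no = {}
for label, occurrence in [("t1", POSITIVE), ("t2", NEGATIVE),
                          ("t1", POSITIVE), ("t1", NEGATIVE)]:
    learn(lbl_no, label, occurrence)
```

Because `learn` relies only on the neutral element and on `+`, a different occurrence type (for example one that also counts neutral examples) could be swapped in without touching the table code.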

SLIDE 16

Satallax

The core ranking formula

Pair monoid ranking

$$r(l) = \frac{|p - n|}{p + n} \, (\sigma_p p + \sigma_n n)$$

  • p, n: number of positive/negative occurrences of l
  • σ_p = 1, σ_n = −1
  • |p − n| / (p + n): “confidence”; the less controversial a label, the higher its influence
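The ranking formula translates directly into code. In this sketch, returning 0 for a label that has never been seen (p + n = 0) is an assumption, since the slide does not say how that case is handled.

```python
def rank(p, n, sigma_p=1.0, sigma_n=-1.0):
    """Pair monoid ranking: confidence |p - n| / (p + n) times the weighted occurrence sum."""
    total = p + n
    if total == 0:
        return 0.0  # assumption: labels without any occurrence are treated as neutral
    confidence = abs(p - n) / total
    return confidence * (sigma_p * p + sigma_n * n)

mostly_positive = rank(3, 1)   # confidence 0.5, weighted sum  2
mostly_negative = rank(1, 3)   # confidence 0.5, weighted sum -2
controversial = rank(5, 5)     # confidence 0
```

A label seen equally often as a positive and as a negative example gets rank 0 and therefore does not influence the priority, however often it occurred.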

What about features?

Using features did not increase the success rate, but incurred a performance decrease.

SLIDE 17

Satallax

Tuning of guidance parameters

Off-line tuning via training data

  • Rank all examples with classifier
  • For every positive example, sum up the number of preceding negative examples

  • Find guidance values with minimal sum
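The off-line tuning criterion can be sketched as a cost function plus a search over candidate parameter values. The exhaustive search over a candidate list, the function names, and the toy scoring function are assumptions for illustration; the slide does not say how the minimum is found.

```python
def preceding_negatives(scored_examples):
    """Sum, over all positive examples, of the number of negatives ranked before them.

    scored_examples: iterable of (score, is_positive); a higher score is ranked earlier.
    """
    ranked = sorted(scored_examples, key=lambda e: e[0], reverse=True)
    negatives_seen = 0
    total = 0
    for _, is_positive in ranked:
        if is_positive:
            total += negatives_seen
        else:
            negatives_seen += 1
    return total

def tune(candidates, score_fn, examples):
    """Return the candidate parameter value with the minimal cost on the training examples."""
    def cost(params):
        return preceding_negatives((score_fn(params, x), is_pos) for x, is_pos in examples)
    return min(candidates, key=cost)

mixed = preceding_negatives([(0.9, True), (0.8, False), (0.7, True), (0.1, False)])
worst = preceding_negatives([(0.9, False), (0.8, False), (0.7, True), (0.6, True)])

# Toy search: one weight w, score = w * x, with a positive example at x = 1 and a negative at x = 2.
best_weight = tune([1.0, -1.0], lambda w, x: w * x, [(1.0, True), (2.0, False)])
```

A cost of 0 means every positive example is ranked ahead of every negative one.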

Particle Swarm Optimization

  • Run the ATP with different parameters and modify them automatically depending on how many problems are solved
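A compact sketch of particle swarm optimisation, using the common inertia-weight update rule. The coefficient values, swarm size, and the toy objective `toy_solved` are assumptions; in the setting described above, the objective would be the number of problems the ATP solves with the given guidance parameters, which is far more expensive to evaluate.

```python
import random

def pso(objective, dim, particles=8, iterations=30, seed=0,
        inertia=0.7, cognitive=1.5, social=1.5, lo=-1.0, hi=1.0):
    """Maximise `objective` over `dim` real parameters with a basic particle swarm."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best_pos = [p[:] for p in pos]              # best position seen by each particle
    best_val = [objective(p) for p in pos]
    g = max(range(particles), key=lambda i: best_val[i])
    global_pos, global_val = best_pos[g][:], best_val[g]
    for _ in range(iterations):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull towards the particle's own best and the swarm's best position.
                vel[i][d] = (inertia * vel[i][d]
                             + cognitive * r1 * (best_pos[i][d] - pos[i][d])
                             + social * r2 * (global_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val > best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val > global_val:
                    global_pos, global_val = pos[i][:], val
    return global_pos, global_val

def toy_solved(params):
    """Stand-in for 'number of problems solved'; peaks at (0.3, -0.2)."""
    x, y = params
    return -((x - 0.3) ** 2 + (y + 0.2) ** 2)

best_params, best_score = pso(toy_solved, dim=2)
```

Each objective evaluation corresponds to a full ATP run over the problem set, which is why the off-line criterion is the cheaper way to obtain starting values.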

Outcome

Off-line tuning is fast for finding initial values, but PSO is more reliable.

SLIDE 18

Evaluation

SLIDE 19

Evaluation


On-line learning

Learn from the data after each successful proof and use it in all subsequent proof attempts (1x fold)

Off-line learning

Try all problems and save the training data, then try all unsolved problems with guidance from the training data (2x map)
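The "1x fold" and "2x map" descriptions can be made concrete with a toy prover. Everything about `prove` here is hypothetical: it stands in for a guided Satallax run and simply succeeds when the problem's difficulty is at most one more than the amount of knowledge available.

```python
from functools import reduce

def prove(problem, knowledge):
    """Hypothetical prover: returns training data on success, None on failure."""
    name, difficulty = problem
    return frozenset({name}) if difficulty <= 1 + len(knowledge) else None

def online(problems):
    """One fold: knowledge grows after every successful proof."""
    def step(state, problem):
        knowledge, solved = state
        data = prove(problem, knowledge)
        if data is None:
            return state
        return knowledge | data, solved + [problem[0]]
    return reduce(step, problems, (frozenset(), []))[1]

def offline(problems):
    """Two maps: an unguided pass, then a guided pass over the unsolved problems."""
    first = list(map(lambda p: prove(p, frozenset()), problems))
    knowledge = frozenset().union(*[d for d in first if d is not None])
    second = list(map(lambda pd: pd[1] if pd[1] is not None else prove(pd[0], knowledge),
                      zip(problems, first)))
    return [p[0] for p, d in zip(problems, second) if d is not None]

problems = [("d", 3), ("a", 1), ("b", 2), ("c", 1)]
solved_online = online(problems)
solved_offline = offline(problems)
```

In this toy run the on-line variant misses "d" because it is attempted before anything has been learnt, while the off-line variant retries it with the data from the first pass. The two passes of the off-line variant have no dependencies between problems, so each can be distributed over several cores.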

SLIDE 20

Evaluation

Results

Test set

THF version of Flyspeck from Cezary, with 14185 problems

Satallax without guidance

  • 1s, auto strategy: 2717 problems
  • 2s, auto strategy: 3394 problems
  • 2s, auto strategy restricted to 1s modes: 2845 problems

Satallax with guidance

  • On-line learning (1s): 3374 problems
  • Off-line learning (1s): 3428 problems
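To put the figures above in proportion, the relative gain over the unguided 1s run can be computed directly from the reported counts. The variable names are illustrative; the numbers are the ones listed above.

```python
total_problems = 14185
baseline_1s = 2717  # Satallax without guidance, 1s, auto strategy

guided = {"on-line learning (1s)": 3374, "off-line learning (1s)": 3428}

# Relative gain over the unguided 1s baseline, as a fraction.
gains = {name: (solved - baseline_1s) / baseline_1s for name, solved in guided.items()}

# Share of the whole test set that is solved.
shares = {name: solved / total_problems for name, solved in guided.items()}

for name in guided:
    print(f"{name}: +{gains[name]:.1%} over baseline, {shares[name]:.1%} of all problems")
```

Both guided 1s runs solve roughly a quarter more problems than the unguided 1s run. Compared with the unguided 2s run (3394 problems), off-line learning solves slightly more and on-line learning slightly fewer, in half the time limit.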

SLIDE 21

Evaluation

Conclusion

When to use internal guidance?

  • Satallax could be used to continually improve itself in an ITP setting with on-line learning
  • When run on multiple cores, off-line learning is a fast alternative

Future work

  • Negative examples in FEMaLeCoP via the new NB classifier with monoids

  • Integrate internal guidance in ITP
  • Use more training data for classifier (features . . . ?)
  • Different features, e.g. TPTP
