Forgetting to learn logic programs. Andrew Cropper, University of Oxford (PowerPoint presentation)




SLIDE 1

Forgetting to learn logic programs

Andrew Cropper University of Oxford

SLIDE 2

Program induction/synthesis: Examples + Background knowledge → Learner

SLIDE 3

Program induction/synthesis: Examples + Background knowledge → Learner → Computer program

SLIDE 4

Examples

input → output
dog → g
sheep → p
chicken → ?

SLIDE 5

Examples

input → output
dog → g
sheep → p
chicken → ?

Background knowledge: head, tail, empty

SLIDE 6

Examples

input → output
dog → g
sheep → p
chicken → ?

Background knowledge: head, tail, empty

def f(a):
    t = tail(a)
    if empty(t):
        return head(a)
    return f(t)

SLIDE 7

Examples

input → output
dog → g
sheep → p
chicken → n

Background knowledge: head, tail, empty

def f(a):
    t = tail(a)
    if empty(t):
        return head(a)
    return f(t)
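The slide's induced program runs as ordinary Python once the background predicates head, tail, and empty are given small implementations. The helper bodies below are illustrative assumptions; the talk only names the predicates.

```python
# Illustrative implementations of the background knowledge.
# The talk names head, tail, and empty; these bodies are assumptions.
def head(a):
    return a[0]

def tail(a):
    return a[1:]

def empty(a):
    return len(a) == 0

# The induced program from the slide: returns the last element of the input.
def f(a):
    t = tail(a)
    if empty(t):
        return head(a)
    return f(t)

print(f("dog"), f("sheep"), f("chicken"))  # g p n
```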

SLIDE 8

Examples

input → output
dog → g
sheep → p
chicken → n

Background knowledge: head, tail, empty

f(A,B):-tail(A,C),empty(C),head(A,B).
f(A,B):-tail(A,C),f(C,B).

SLIDE 9

Background knowledge defines the hypothesis space

SLIDE 10

Where does background knowledge come from?

SLIDE 11

Hand-crafted rules [almost every approach]

SLIDE 12

Unsupervised learning: ALPS [Dumančić et al., IJCAI 2019], Playgol [Cropper, IJCAI 2019]

SLIDE 13

Supervised multi-task learning: use knowledge gained from solving one problem to help solve a different problem. Metabias [Lin et al., ECAI 2014], Dreamcoder [Ellis et al., NIPS 2018]

SLIDE 14

Why does it work? We increase branching but reduce depth
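The trade-off can be made concrete with a toy calculation (the numbers below are illustrative, not from the experiments): a naive enumeration explores roughly b^d candidate programs for branching factor b and search depth d, so a smaller exponent can outweigh a larger base.

```python
# Toy illustration (made-up numbers): brute-force program search explores
# roughly b**d candidates, where b is the branching factor (usable
# predicates) and d is the size of the program that must be found.
def search_space(b, d):
    return b ** d

# Without reuse: small branching, but the target needs a deep program.
before = search_space(10, 6)   # 1,000,000

# With learned programs added to the background knowledge: branching
# doubles, but the target is now expressible as a shallow program.
after = search_space(20, 3)    # 8,000

print(before, after, before // after)  # reuse wins by a factor of 125
```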

SLIDE 15
SLIDE 16

Problem: big branching factor

SLIDE 17

Idea: forget things

SLIDE 18
SLIDE 19
SLIDE 20

ILP problem Given:

  • background knowledge B
  • positive examples E+
  • negative examples E-
SLIDE 21

ILP problem Return: a hypothesis H that, together with B, entails E+ and does not entail E-
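One concrete reading of this acceptance condition: treat a candidate hypothesis as an executable function and check that it covers every positive example and no negative one. The sketch below is hypothetical and simulates entailment by simply running the hypothesis; a real ILP system checks logical entailment against B.

```python
# Hypothetical sketch of the ILP acceptance condition: H must cover
# every positive example (E+) and no negative example (E-).
# Entailment is simulated here by running H on input/output pairs.
def consistent(hypothesis, pos_examples, neg_examples):
    covers = lambda inp, out: hypothesis(inp) == out
    return (all(covers(i, o) for i, o in pos_examples)
            and not any(covers(i, o) for i, o in neg_examples))

# Target concept from the earlier slides: last element of a string.
last = lambda s: s[-1]
pos = [("dog", "g"), ("sheep", "p")]
neg = [("dog", "d")]
print(consistent(last, pos, neg))  # True
```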

SLIDE 22

Forgetting problem. Given: background knowledge B. Return: B’ ⊂ B from which you can still learn the target hypothesis
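A forgetting function is just a map from B to a subset B'. The policy below, keeping only clauses reused often enough in earlier tasks, is an illustrative assumption, not the paper's definition:

```python
# Hypothetical forgetting function: keep only background clauses that
# were reused at least `min_uses` times when solving earlier tasks.
# (Illustrative policy, not the paper's exact method.)
def forget(background, usage_counts, min_uses=1):
    return [c for c in background if usage_counts.get(c, 0) >= min_uses]

B = ["head", "tail", "empty", "droplast", "reverse"]
counts = {"head": 5, "tail": 7, "empty": 4, "reverse": 0}
print(forget(B, counts))  # ['head', 'tail', 'empty']
```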

SLIDE 23

Why? Reduce branching and sample complexity

SLIDE 24
SLIDE 25
SLIDE 26
SLIDE 27

How? Forgetgol, a multi-task ILP system based on Metagol, which takes a forgetting function as input

SLIDE 28

Forgetgol Continually expands and shrinks its hypothesis space

SLIDE 29

Syntactical forgetting (lossless)

  • 1. Unfold each induced clause to remove invented predicate symbols
  • 2. Check whether a syntactically duplicate clause already exists
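The duplicate check in step 2 can be sketched by normalising variable names before comparing clauses, so that clauses identical up to variable renaming collide. This is an illustrative sketch, not the system's implementation:

```python
import re

# Hypothetical sketch of the syntactic duplicate check: normalise
# Prolog variables (uppercase identifiers) to V0, V1, ... in order of
# first appearance, so renamed copies of a clause compare equal.
def normalise(clause):
    mapping = {}
    def rename(match):
        var = match.group(0)
        if var not in mapping:
            mapping[var] = f"V{len(mapping)}"
        return mapping[var]
    return re.sub(r"\b[A-Z]\w*", rename, clause)

seen = set()
for clause in ["f(A,B):-tail(A,C),f(C,B).",
               "f(X,Y):-tail(X,Z),f(Z,Y)."]:  # same clause, renamed
    key = normalise(clause)
    print(clause, "duplicate" if key in seen else "new")
    seen.add(key)
```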

SLIDE 30

Statistical forgetting Assigns a cost to each clause based on:

  • 1. How difficult it was to learn
  • 2. How likely it is to be reused
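The shape of such a cost can be sketched as learning effort divided by estimated reuse probability; the formula and numbers below are illustrative assumptions, not the paper's exact scoring.

```python
# Illustrative sketch of statistical forgetting: each learned clause
# gets a cost combining (1) how hard it was to learn and (2) how
# unlikely it is to be reused; high-cost clauses are forgotten.
def clause_cost(learn_time, reuse_count, total_reuses):
    reuse_prob = (reuse_count + 1) / (total_reuses + 1)  # smoothed estimate
    return learn_time / reuse_prob

# Hypothetical clauses: (seconds to learn, times reused so far).
clauses = {"last": (2.0, 9), "droplast": (5.0, 0), "swap": (1.0, 3)}
total = sum(r for _, r in clauses.values())
costs = {name: clause_cost(t, r, total) for name, (t, r) in clauses.items()}

# Forget everything above an (arbitrary) cost threshold.
keep = {name for name, c in costs.items() if c < 10.0}
print(sorted(keep))  # ['last', 'swap']
```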
SLIDE 31

Does it work?

  • Q. Can forgetting improve learning performance?
SLIDE 32

We compare:

  • Metabias: remember everything
  • Metagol: remember nothing
  • Forgetgolsyn: syntactical forgetting
  • Forgetgolstat: statistical forgetting

SLIDE 33

Robot planning

SLIDE 34
SLIDE 35
SLIDE 36
SLIDE 37
SLIDE 38
SLIDE 39
SLIDE 40

What happened? Metabias rarely induces a program with more than two clauses because of program reuse

SLIDE 41

Lego building

SLIDE 42
SLIDE 43
SLIDE 44

What happened? Less reuse (and greater search depth) so forgetting has more effect

SLIDE 45

Conclusions Forgetting can improve learning performance when given >10,000 tasks, but, surprisingly, not by much

SLIDE 46

Limitations and future work

  • Better forgetting methods
  • Larger and more diverse datasets
  • Concept shift
  • Recency
  • Other program induction systems