Learning higher-order logic programs, by Andrew Cropper, Rolf Morel, and Stephen Muggleton (PowerPoint presentation)



slide-1
SLIDE 1

Learning higher-order logic programs

Andrew Cropper, Rolf Morel, and Stephen Muggleton

slide-2
SLIDE 2

Program induction/synthesis

Examples + Background knowledge → Learner

slide-3
SLIDE 3

Program induction/synthesis

Examples + Background knowledge → Learner → Computer program

slide-4
SLIDE 4

Examples

input      output
dog        g
sheep      p
chicken    ?

slide-5
SLIDE 5

Examples

input      output
dog        g
sheep      p
chicken    ?

Background knowledge: head, tail, empty

slide-6
SLIDE 6

Examples

input      output
dog        g
sheep      p
chicken    ?

Background knowledge: head, tail, empty

def f(a):
    t = tail(a)
    if empty(t):
        return head(a)
    return f(t)

slide-7
SLIDE 7

Examples

input      output
dog        g
sheep      p
chicken    n

Background knowledge: head, tail, empty

def f(a):
    t = tail(a)
    if empty(t):
        return head(a)
    return f(t)
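The program on this slide can be run once the background knowledge is given concrete definitions. The sketch below is only an illustration: the bodies of `head`, `tail`, and `empty` are assumptions (the slides name them but do not define them), written here for Python strings.

```python
# Hypothetical definitions of the background knowledge for Python strings.
# The slides only name head, tail, and empty; these bodies are assumed.
def head(a):
    return a[0]

def tail(a):
    return a[1:]

def empty(a):
    return len(a) == 0

# The learned program from the slide: return the last element of the input.
def f(a):
    t = tail(a)
    if empty(t):
        return head(a)
    return f(t)

print(f("dog"), f("sheep"), f("chicken"))
```

Under these assumptions the program reproduces the two given examples and predicts `n` for `chicken`.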

slide-8
SLIDE 8

Examples

input      output
dog        g
sheep      p
chicken    n

Background knowledge: head, tail, empty

f(A,B):-tail(A,C),empty(C),head(A,B).
f(A,B):-tail(A,C),f(C,B).

slide-9
SLIDE 9

input      output
dbu        cat
eph        dog
hpptf      ?

slide-10
SLIDE 10

input      output
dbu        cat
eph        dog
hpptf      goose

slide-11
SLIDE 11

% base case
f(A,B):- empty(A), empty(B).

% inductive case
f(A,B):- head(A,C), char_to_int(C,D), prec(D,E), int_to_char(E,F),
         head(B,F), tail(A,G), tail(B,H), f(G,H).

slide-12
SLIDE 12

% list manipulation
f(A,B):- empty(A), empty(B).
f(A,B):- head(A,C), f1(C,F), head(B,F), tail(A,G), tail(B,H), f(G,H).

% cool stuff
f1(A,B):- char_to_int(A,C), prec(C,D), int_to_char(D,B).

slide-13
SLIDE 13

f(A,B):- empty(A), empty(B).
f(A,B):- head(A,C), f1(C,F), head(B,F), tail(A,G), tail(B,H), f(G,H).
f1(A,B):- char_to_int(A,C), prec(C,D), int_to_char(D,B).

slide-14
SLIDE 14

Idea: learn higher-order programs

slide-15
SLIDE 15

map([],[],_F).
map([A|As],[B|Bs],F):- call(F,A,B), map(As,Bs,F).

slide-16
SLIDE 16

f(A,B):- map(A,B,f1).
f1(A,B):- char_to_int(A,C), prec(C,D), int_to_char(D,B).

slide-17
SLIDE 17

f(A,B):- map(A,B,f1).
f1(A,B):- char_to_int(A,C), prec(C,D), int_to_char(D,B).

From 12 to 6 literals
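To make the comparison concrete, here is a rough Python analogue of the two decryption programs. It is a sketch under assumptions: `char_to_int` and `int_to_char` are modelled with Python's `ord` and `chr`, and `prec` as subtracting one; the function names `f_first_order` and `f_higher_order` are hypothetical.

```python
# Assumed readings of the background relations (not defined in the slides).
def char_to_int(c):
    return ord(c)

def prec(n):
    return n - 1

def int_to_char(n):
    return chr(n)

# f1 from the slides: shift one character back by one position.
def f1(c):
    return int_to_char(prec(char_to_int(c)))

# First-order style: explicit recursion over the list, as on slide 12.
def f_first_order(a):
    if not a:
        return ""
    return f1(a[0]) + f_first_order(a[1:])

# Higher-order style: the recursion is delegated to map, as on slide 16.
def f_higher_order(a):
    return "".join(map(f1, a))

for word in ["dbu", "eph", "hpptf"]:
    print(word, f_first_order(word), f_higher_order(word))
```

Both versions compute the same function; the higher-order one simply has less to learn, because the list traversal is supplied by `map`.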

slide-18
SLIDE 18

Why?

Search complexity is b^n, where
  b is the number of background relations
  n is the size of the program

Idea: increase branching to reduce depth

slide-19
SLIDE 19

Fragment         Complexity
First-order      6^12 = 2,176,782,336

slide-20
SLIDE 20

Fragment         Complexity
First-order      6^12 = 2,176,782,336
Higher-order     7^6  = 117,649          (+1 because of map)

slide-21
SLIDE 21

Fragment         Complexity
First-order      6^12 = 2,176,782,336
Higher-order     7^6  = 117,649
Higher-order*    4^6  = 4,096            (if we do not give head, tail, empty)
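The figures in this table follow directly from the b^n estimate. A small sketch that recomputes them (the helper name `search_space` is hypothetical):

```python
# Search-space estimate b^n: b background relations, program size n.
def search_space(b, n):
    return b ** n

rows = [
    ("First-order", 6, 12),    # 6 relations, 12 literals
    ("Higher-order", 7, 6),    # map added as a 7th relation, 6 literals
    ("Higher-order*", 4, 6),   # head, tail, empty withheld
]

for name, b, n in rows:
    print(f"{name}: {b}^{n} = {search_space(b, n):,}")
```

Adding `map` raises the branching factor from 6 to 7, but halving the program size shrinks the estimate by more than four orders of magnitude.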

slide-22
SLIDE 22

How? Extend Metagol [Cropper and Muggleton, 2016]

slide-23
SLIDE 23

Metagol

  • Proves examples using a Prolog meta-interpreter
  • Extracts a logic program from the proof
  • Uses metarules to guide the search

slide-24
SLIDE 24

Metarule

P(A,B) ← Q(A,C), R(C,B)

P, Q, and R are second-order variables
A, B, and C are first-order variables

slide-25
SLIDE 25

Examples

input      output
1          3
2          4
3          ?

slide-26
SLIDE 26

Examples

input      output
1          3
2          4
3          ?

Background knowledge: succ/2

Metarule

P(A,B) ← Q(A,C),R(C,B)

slide-27
SLIDE 27

Examples

input      output
1          3
2          4
3          ?

Background knowledge: succ/2

Metarule

P(A,B) ← Q(A,C),R(C,B)

target(A,B) ← succ(A,C),succ(C,B)

Substitution: P/target, Q/succ, R/succ

slide-28
SLIDE 28

Examples

input      output
1          3
2          4
3          5

Background knowledge: succ/2

Metarule

P(A,B) ← Q(A,C),R(C,B)

target(A,B) ← succ(A,C),succ(C,B)

Substitution: P/target, Q/succ, R/succ
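The role of the metarule can be illustrated with a brute-force sketch: try every substitution for Q and R and keep those that cover the examples. This is not Metagol's meta-interpreter, only a functional approximation. `succ` comes from the slides; `pred` is a hypothetical extra relation added here so that the search has something to reject.

```python
from itertools import product

# Background knowledge read as functions. succ is from the slides;
# pred is a hypothetical addition, used only to make the search non-trivial.
bk = {
    "succ": lambda x: x + 1,
    "pred": lambda x: x - 1,
}

examples = [(1, 3), (2, 4)]

def chain(q, r):
    # P(A,B) <- Q(A,C), R(C,B), read functionally as B = R(Q(A)).
    return lambda a: r(q(a))

def learn(bk, examples):
    # Enumerate substitutions for Q and R; keep those covering every example.
    found = []
    for q_name, r_name in product(bk, repeat=2):
        p = chain(bk[q_name], bk[r_name])
        if all(p(a) == b for a, b in examples):
            found.append((q_name, r_name))
    return found

solutions = learn(bk, examples)
q_name, r_name = solutions[0]
target = chain(bk[q_name], bk[r_name])
print(solutions, target(3))
```

Only the substitution Q/succ, R/succ survives, and the resulting program maps 3 to 5, matching the slide.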

slide-29
SLIDE 29

Examples

input      output
[1,2,3]    [c,d,e]
[2,3,4]    ?
[3,4,5]    ?

slide-30
SLIDE 30

Examples

input      output
[1,2,3]    [c,d,e]
[2,3,4]    ?
[3,4,5]    ?

Background knowledge: succ/2, int_to_char/2, map/3

Metarules

P(A,B) ← Q(A,C),R(C,B)
P(A,B) ← Q(A,B,R)

slide-31
SLIDE 31

← f([1,2,3],[c,d,e])

negated example (i.e. a goal)

slide-32
SLIDE 32

← f([1,2,3],[c,d,e])
P(A,B) ← Q(A,B,R)

metarule

slide-33
SLIDE 33

← f([1,2,3],[c,d,e])
P(A,B) ← Q(A,B,R)

resolution {P/f}

slide-34
SLIDE 34

← f([1,2,3],[c,d,e])
P(A,B) ← Q(A,B,R)
← Q([1,2,3],[c,d,e],R)

new goal

slide-35
SLIDE 35

← Q([1,2,3],[c,d,e],R)

slide-36
SLIDE 36

← Q([1,2,3],[c,d,e],R)

succ/2 int_to_char/2 map/3

slide-37
SLIDE 37

← Q([1,2,3],[c,d,e],R)

map/3

resolution {Q/map}

slide-38
SLIDE 38

← Q([1,2,3],[c,d,e],R)
← map([1,2,3],[c,d,e],R)

map/3

slide-39
SLIDE 39

← map([1,2,3],[c,d,e],R)

succ/2 int_to_char/2 map/3

slide-40
SLIDE 40

← map([1,2,3],[c,d,e],R)

succ/2 int_to_char/2 map/3

← map([1,2,3],[c,d,e],succ)
← map([1,2,3],[c,d,e],int_to_char)
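A quick check suggests why neither candidate goal succeeds on its own. The sketch below assumes `int_to_char` maps 1 to `a`, 2 to `b`, and so on; the slides do not define it, but this reading is consistent with the solutions shown later in the deck.

```python
def succ(x):
    return x + 1

def int_to_char(n):
    # Assumption: 1 -> 'a', 2 -> 'b', ... (not defined in the slides).
    return chr(ord("a") + n - 1)

goal_in = [1, 2, 3]
goal_out = ["c", "d", "e"]

candidates = {"succ": succ, "int_to_char": int_to_char}

# For each candidate R, test whether map(goal_in, goal_out, R) would hold.
results = {
    name: list(map(fn, goal_in)) == goal_out
    for name, fn in candidates.items()
}
print(results)
```

With only compiled background predicates available for R, plain Metagol cannot prove the example in a single `map` call, so it has to chain several.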

slide-41
SLIDE 41

f(A,B):-f1(A,C),f3(C,B).
f1(A,B):-f2(A,C),f2(C,B).
f2(A,B):-map(A,B,succ).
f3(A,B):-map(A,B,int_to_char).

Metagol solution

slide-42
SLIDE 42

f(A,B):- map(A,C,succ), map(C,D,succ), map(D,B,int_to_char).

Metagol unfolded solution

slide-43
SLIDE 43

MetagolHO

Allows interpreted background knowledge

ibk(
  [map,[A|As],[B|Bs],F],       % head
  [[F,A,B],[map,As,Bs,F]]      % body
).

slide-44
SLIDE 44

Examples

input      output
[1,2,3]    [c,d,e]
[2,3,4]    ?
[3,4,5]    ?

BK: succ/2, int_to_char/2

Interpreted BK: map/3

Metarules

P(A,B) ← Q(A,C),R(C,B)
P(A,B) ← Q(A,B,R)

slide-45
SLIDE 45

← f([1,2,3],[c,d,e])

negated example (i.e. a goal)

slide-46
SLIDE 46

← f([1,2,3],[c,d,e])
P(A,B) ← Q(A,B,R)

metarule

slide-47
SLIDE 47

← f([1,2,3],[c,d,e])
P(A,B) ← Q(A,B,R)

resolution {P/f}

slide-48
SLIDE 48

← f([1,2,3],[c,d,e])
P(A,B) ← Q(A,B,R)
← Q([1,2,3],[c,d,e],R)

new goal

slide-49
SLIDE 49

← Q([1,2,3],[c,d,e],R)

slide-50
SLIDE 50

← Q([1,2,3],[c,d,e],R)

map([A|As],[B|Bs],R) ← …

interpreted BK

slide-51
SLIDE 51

← Q([1,2,3],[c,d,e],R)

map([A|As],[B|Bs],R) ← …

resolution {Q/map}

slide-52
SLIDE 52

← Q([1,2,3],[c,d,e],R)

map([A|As],[B|Bs],R) ← …

←R(1,c), R(2,d), R(3,e)

map decomposes goal into subgoals
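The decomposition step can be sketched as follows. This is only an illustration of the idea, not the meta-interpreter itself: given the two lists of a `map` goal with an unknown R, it returns the pairs that R must relate. The helper name `decompose_map` is hypothetical.

```python
def decompose_map(xs, ys):
    # map(xs, ys, R) holds exactly when R(a, b) holds for each aligned pair,
    # so the unknown R inherits one subgoal per list position.
    if len(xs) != len(ys):
        return None  # lists of different length: the map goal cannot be proved
    return list(zip(xs, ys))

subgoals = decompose_map([1, 2, 3], ["c", "d", "e"])
print(subgoals)
```

Each pair is a new example for the still-unnamed relation R, which is what lets the learner invent a predicate for it.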

slide-53
SLIDE 53

←R(1,c), R(2,d), R(3,e)

slide-54
SLIDE 54

←R(1,c), R(2,d), R(3,e)
S(A,B) ← T(A,C),U(C,B)

metarule

resolution {R/S}

slide-55
SLIDE 55

←R(1,c), R(2,d), R(3,e)
S(A,B) ← T(A,C),U(C,B)
←T(1,C1),U(C1,c), T(2,C2),U(C2,d), T(3,C3),U(C3,e)

decomposes problem again

slide-56
SLIDE 56

←R(1,c), R(2,d), R(3,e)
S(A,B) ← T(A,C),U(C,B)
←T(1,C1),U(C1,c), T(2,C2),U(C2,d), T(3,C3),U(C3,e)

and the proof continues …
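One way to picture how the proof continues is a search for a chain of background relations that covers all three subgoals. The sketch below uses plain enumeration by increasing chain length, which is a stand-in for, not a reproduction of, Metagol's meta-interpretive search. `int_to_char` is again assumed to map 1 to `a`; `run`, `covers`, and `find_chain` are hypothetical helper names.

```python
from itertools import product

def succ(x):
    return x + 1

def int_to_char(n):
    # Assumption: 1 -> 'a', 2 -> 'b', ... (not defined in the slides).
    return chr(ord("a") + n - 1)

BK = {"succ": succ, "int_to_char": int_to_char}
subgoals = [(1, "c"), (2, "d"), (3, "e")]

def run(names, value):
    # Apply the named background functions left to right.
    for name in names:
        value = BK[name](value)
    return value

def covers(names, pairs):
    # A chain covers the subgoals if it maps every input to its output.
    # Ill-typed chains (e.g. succ applied to a character) are rejected.
    try:
        return all(run(names, a) == b for a, b in pairs)
    except (TypeError, ValueError):
        return False

def find_chain(pairs, max_len=4):
    # Shortest chains first, mirroring a preference for small programs.
    for length in range(1, max_len + 1):
        for names in product(BK, repeat=length):
            if covers(names, pairs):
                return list(names)
    return None

print(find_chain(subgoals))
```

The chain it finds corresponds to nesting the chain metarule twice, which is the shape of the invented predicates in the MetagolHO solution.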

slide-57
SLIDE 57

f(A,B):-map(A,B,f1).
f1(A,B):-succ(A,C),f2(C,B).
f2(A,B):-succ(A,C),int_to_char(C,B).

MetagolHO solution

slide-58
SLIDE 58

f(A,B):- map(A,B,f1).
f1(A,B):- succ(A,C), succ(C,D), int_to_char(D,B).    % invented

MetagolHO unfolded solution
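For comparison, the two unfolded solutions can be written side by side in Python. This is a sketch under the same assumption as before, that `int_to_char` maps 1 to `a`; the names `metagol_f` and `metagolho_f` are hypothetical labels for the two programs.

```python
def succ(x):
    return x + 1

def int_to_char(n):
    # Assumption: 1 -> 'a', 2 -> 'b', ... (not defined in the slides).
    return chr(ord("a") + n - 1)

# Metagol unfolded solution: three passes over the list, one map per step.
def metagol_f(xs):
    c = list(map(succ, xs))
    d = list(map(succ, c))
    return list(map(int_to_char, d))

# MetagolHO unfolded solution: one pass, with the invented predicate f1
# passed to map as an argument.
def f1(x):
    return int_to_char(succ(succ(x)))

def metagolho_f(xs):
    return list(map(f1, xs))

for xs in ([1, 2, 3], [2, 3, 4], [3, 4, 5]):
    print(xs, metagol_f(xs), metagolho_f(xs))
```

The two programs agree on every input; the difference is that the second composes the steps inside an invented predicate rather than across whole lists.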

slide-59
SLIDE 59

Decryption example

input      output
dbu        cat
eph        dog
hpptf      ?

slide-60
SLIDE 60

f(A,B):-f1(A,B),f5(A,B).
f1(A,B):-head(A,C),f2(C,B).
f2(A,B):-head(B,C),f3(A,C).
f3(A,B):-char_to_int(A,C),f4(C,B).
f4(A,B):-prec(A,C),int_to_char(C,B).
f5(A,B):-tail(A,C),f6(C,B).
f6(A,B):-tail(B,C),f(A,C).

Metagol

7 clauses and 21 literals

slide-61
SLIDE 61

f(A,B):-map(A,B,f1).
f1(A,B):-char_to_int(A,C),f2(C,B).
f2(A,B):-prec(A,C),int_to_char(C,B).

MetagolHO

3 clauses and 8 literals

slide-62
SLIDE 62

Does it help in practice?

Q. Can learning higher-order programs improve learning performance?

slide-63
SLIDE 63

Robot waiter

slide-64
SLIDE 64

Chess

slide-65
SLIDE 65

Input                             Output
[alice,bob,charlie]               [alic,bo,charli]
[inductive,logic,programming]     [inductiv,logi,programmin]
[ferrara,orleans,london,kyoto]    [ferrar,orlean,londo,kyot]

Droplasts

slide-66
SLIDE 66

f(A,B):-map(A,B,f1).
f1(A,B):-f2(A,C),f3(C,B).
f2(A,B):-f3(A,C),tail(C,B).
f3(A,B):-reduceback(A,B,concat).

MetagolHO solution

slide-67
SLIDE 67

f(A,B):-map(A,B,f1).
f1(A,B):-f2(A,C),tail(C,D),f2(D,B).    % invented droplast
f2(A,B):-reduceback(A,B,concat).       % invented reverse

MetagolHO unfolded solution
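The unfolded droplasts program can be mimicked in Python as a sanity check. Two assumptions are made here: `reduceback(A,B,concat)` is modelled simply as list reversal (the slide labels the invented predicate "reverse", but does not define `reduceback`), and words are represented as Python strings rather than Prolog lists.

```python
def reduceback_concat(xs):
    # Assumption: reduceback with concat behaves as reversal.
    return xs[::-1]

def tail(xs):
    return xs[1:]

# f2: the invented reverse.
def f2(xs):
    return reduceback_concat(xs)

# f1: the invented droplast (reverse, drop the first element, reverse back).
def f1(xs):
    return f2(tail(f2(xs)))

# f: apply droplast to every word in the input list.
def f(words):
    return list(map(f1, words))

print(f(["alice", "bob", "charlie"]))
```

Under these assumptions the program reproduces the Droplasts examples, for instance turning `alice` into `alic`.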

slide-68
SLIDE 68

Input                             Output
[alice,bob,charlie]               [alic,bo]
[inductive,logic,programming]     [inductiv,logi]
[ferrara,orleans,london,kyoto]    [ferrar,orlean,londo]

Double droplasts

slide-69
SLIDE 69

f(A,B):-f1(A,C),f2(C,B).
f1(A,B):-map(A,B,f2).
f2(A,B):-f3(A,C),f4(C,B).
f3(A,B):-f4(A,C),tail(C,B).
f4(A,B):-reduceback(A,B,concat).

MetagolHO solution

slide-70
SLIDE 70

f(A,B):-map(A,C,f1),f1(C,B).    % map(A,C,f1) uses f1 as a term; f1(C,B) uses f1 as a predicate symbol
f1(A,B):-f2(A,C),tail(C,D),f2(D,B).
f2(A,B):-reduceback(A,B,concat).

MetagolHO unfolded solution
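The double use of `f1` can be seen in a Python sketch of this program. As before, `reduceback(A,B,concat)` is assumed to behave as reversal, and words are modelled as strings; slicing works on both strings and lists, which is what lets the same `f1` act on a word and on the list of words.

```python
def reduceback_concat(xs):
    # Assumption: reduceback with concat behaves as reversal.
    return xs[::-1]

def tail(xs):
    return xs[1:]

def f2(xs):
    return reduceback_concat(xs)

# f1: droplast, usable on a single word or on a list of words.
def f1(xs):
    return f2(tail(f2(xs)))

def f(words):
    shortened = list(map(f1, words))  # f1 passed as an argument (a term)
    return f1(shortened)              # f1 called directly (a predicate symbol)

print(f(["alice", "bob", "charlie"]))
```

The inner use drops the last letter of each word; the outer use drops the last word of the list.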

slide-71
SLIDE 71

Conclusions

  • Inducing higher-order programs can reduce program size and sample complexity and improve learning performance
  • Can decompose problems through predicate invention

slide-72
SLIDE 72

Limitations

  • Inefficient search
  • Which metarules?
  • Which higher-order definitions?

slide-73
SLIDE 73

Thank you

Cropper, A., Morel, R., and Muggleton, S. Learning higher-order logic programs. Machine Learning. 2019.

Metagol system: https://github.com/metagol/metagol