Machine Reading and Reasoning with Neural Program Interpreters



slide-1
SLIDE 1

Machine Reading and Reasoning with Neural Program Interpreters

Sebastian Riedel

Machine Reading

Bloomsbury AI

@riedelcastro

slide-2
SLIDE 2

Collaborators

Tim Rocktäschel (now at Oxford)
Matko Bosnjak (UCL)
Pontus Stenetorp (UCL)
Johannes Welbl (UCL)
Jason Naradowsky (Johns Hopkins University)

slide-3
SLIDE 3

“Should we separate meaning from language?”

Chris Manning @AKBC 2013

[Diagram labels: Language, Meaning, Information Need]

[Maybe not?]

slide-4
SLIDE 4

End-to-End Reading and Comprehension

(Hermann et al., 2015; Seo et al., 2016; Rajpurkar et al., 2016; Weissenborn, 2016; …)

[Diagram labels: Language, Information Need]

slide-5
SLIDE 5

Machine Reading

Nikola Tesla … In January 1880, two of Tesla's uncles put together enough money to help him leave Gospić for Prague where he was to study. Unfortunately, he arrived too late to enroll at Charles-Ferdinand University; he never studied Greek, a required subject; and he was illiterate in Czech, another required subject. Tesla did, however, attend lectures at the university, although, as an auditor, he did not receive grades for the courses.

Question: What city did Tesla move to in 1880?
Answer: Prague

Question: Why was he unable to enroll at the university?
Answer: arrived too late to enroll

slide-6
SLIDE 6

How to read and reason?


slide-7
SLIDE 7

Machine Reading and Reasoning

Question: Which medical specialty deals with pituitary ACTH hypersecretion?
Answer: Endocrinology

Pituitary ACTH hypersecretion ... is a form of hyperpituitarism characterized by an abnormally high level of ACTH produced by the anterior pituitary …

A major organ of the endocrine system, the anterior pituitary is the glandular, anterior lobe that ...

The endocrine system is ... ... The field of study dealing with the endocrine system and its disorders is endocrinology, a branch of internal medicine.

slide-8
SLIDE 8

Machine Reading and Reasoning

Isabel uploaded 2 pictures from her phone and 4 from her camera to facebook. She sorted the pics into 3 different albums with the same amount of pics in each album.

Question: How many pictures were in each of the albums?
Answer: 2

slide-9
SLIDE 9

Can we learn this end-to-end?


slide-10
SLIDE 10

Part 1: A Read & Reason Dataset


Pontus Stenetorp Johannes Welbl

slide-11
SLIDE 11


slide-12
SLIDE 12

A Single Instance

Question: What is the nationality of Jamie Burnett?

Candidates: Scotland (correct), China (incorrect), …

Jamie Burnett (born 16 September 1975) is a professional snooker player from Hamilton, South Lanarkshire … He began the 2014/2015 season with a quarter-final showing at the Yixing Open …

… Hamilton is a town in South Lanarkshire, in the central Lowlands of Scotland …

The Yixing Open was a professional minor-ranking snooker tournament that took place at the Yixing Sports Centre in Yixing, China.

slide-13
SLIDE 13

Dataset Construction Method

[Figure: Unlabelled Text + Knowledge Base → Dataset Construction Method → Multihop Dataset]

[Conditionally Accepted to TACL]

slide-14
SLIDE 14

Dataset Construction Method

[Figure: a graph linking Documents and Entities through "described in" and "mentions" edges.]

Documents: Jamie Burnett, Hamilton, Yixing Open
Entities: Scotland, China
KB Triple: (Jamie Burnett, citizenship, Scotland)
Instance: What is the nationality of Jamie Burnett? Candidates: Scotland, China
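The construction idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the `MENTIONS` table, `find_chain` and `build_instance` are hypothetical names, and the toy data is only loosely based on the Jamie Burnett example. The sketch starts from a KB triple and searches for a chain of documents that leads from the subject's document to one mentioning the answer.

```python
from collections import deque

# Hypothetical toy data: document title -> entities mentioned in that document.
MENTIONS = {
    "Jamie Burnett": {"Jamie Burnett", "Hamilton", "Yixing Open"},
    "Hamilton": {"Hamilton", "Scotland"},
    "Yixing Open": {"Yixing Open", "China"},
}


def find_chain(subject, answer, mentions):
    """Breadth-first search from the subject's document to a document
    that mentions the answer. Returns the list of document titles or None."""
    queue = deque([[subject]])
    seen = {subject}
    while queue:
        path = queue.popleft()
        doc = path[-1]
        if answer in mentions.get(doc, set()):
            return path
        # Follow mentioned entities that have a document of their own.
        for entity in sorted(mentions.get(doc, set())):
            if entity in mentions and entity not in seen:
                seen.add(entity)
                queue.append(path + [entity])
    return None


def build_instance(triple, mentions, candidates):
    """Turn a (subject, relation, answer) triple into a multi-hop instance."""
    subject, relation, answer = triple
    chain = find_chain(subject, answer, mentions)
    if chain is None:
        return None
    return {
        "query": (relation, subject),
        "answer": answer,
        "candidates": candidates,
        "support_chain": chain,
    }


instance = build_instance(
    ("Jamie Burnett", "citizenship", "Scotland"), MENTIONS, ["Scotland", "China"]
)
print(instance["support_chain"])
```

In this toy setting the chain passes through the Hamilton document, which is the hop a reader has to make to connect Jamie Burnett with Scotland.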

slide-15
SLIDE 15

Dataset Construction Method


slide-16
SLIDE 16

Dataset Construction Method

[Figure: Unlabelled Text + Knowledge Base → Dataset Construction Method → WikiHop]

slide-17
SLIDE 17

Dataset Construction Method

[Figure: Unlabelled Text + Knowledge Base → Dataset Construction Method → MedHop]

slide-18
SLIDE 18

Baseline Results

Accuracy [%] (values read from the bar chart)

Model         WikiHop   MedHop
Random        11.5      13.9
Max-Mention   10.6      9.5
TF-IDF        25.6      9
BiDAF         42.9      47.8
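To make the Max-Mention baseline concrete, here is a minimal Python sketch. It assumes the baseline simply picks the candidate that is mentioned most often in the support documents and ignores the question. The function names and the shortened `DOCS` snippets are illustrative, not taken from the talk.

```python
# Shortened support documents from the Jamie Burnett example (illustrative).
DOCS = [
    "Jamie Burnett is a professional snooker player from Hamilton, South Lanarkshire.",
    "Hamilton is a town in South Lanarkshire, in the central Lowlands of Scotland.",
    "The Yixing Open took place at the Yixing Sports Centre in Yixing, China.",
]


def mention_counts(candidates, documents):
    """Count case-insensitive occurrences of each candidate in the documents."""
    return {
        c: sum(doc.lower().count(c.lower()) for doc in documents)
        for c in candidates
    }


def max_mention(candidates, documents):
    """Return the most frequently mentioned candidate.

    Ties go to the candidate listed first, because max() keeps the first
    maximal element it sees.
    """
    counts = mention_counts(candidates, documents)
    return max(candidates, key=lambda c: counts[c])


print(mention_counts(["Scotland", "China"], DOCS))
print(max_mention(["Scotland", "China"], DOCS))
```

On this example both candidates are mentioned once, so the prediction depends only on candidate order. A baseline like this cannot use the chain of documents at all.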

slide-19
SLIDE 19

Reduction to Traditional Machine Comprehension

Question: Which medical specialty deals with pituitary ACTH hypersecretion?
Answer: Endocrinology

Pituitary ACTH hypersecretion ... is a form of hyperpituitarism characterized by an abnormally high level of ACTH produced by the anterior pituitary …

A major organ of the endocrine system, the anterior pituitary is the glandular, anterior lobe that ...

The endocrine system is ... ... The field of study dealing with the endocrine system and its disorders is endocrinology, a branch of internal medicine.

slide-20
SLIDE 20

Reduction to Traditional Machine Comprehension

Question: Which medical specialty deals with pituitary ACTH hypersecretion?
Answer: Endocrinology

A major organ of the endocrine system, the anterior pituitary is the glandular, anterior lobe that ...

Pituitary ACTH hypersecretion ... is a form of hyperpituitarism characterized by an abnormally high level of ACTH produced by the anterior pituitary …

The endocrine system is ... ... The field of study dealing with the endocrine system and its disorders is endocrinology, a branch of internal medicine.

slide-21
SLIDE 21

Does BiDAF Aggregate?

Question: Which medical specialty deals with pituitary ACTH hypersecretion?
Answer: Endocrinology

A major organ of the endocrine system, the anterior pituitary is the glandular, anterior lobe that ...

Pituitary ACTH hypersecretion ... is a form of hyperpituitarism characterized by an abnormally high level of ACTH produced by the anterior pituitary …

The endocrine system is ... ... The field of study dealing with the endocrine system and its disorders is endocrinology, a branch of internal medicine.

slide-22
SLIDE 22

Removing Relevant Documents, Keep Answer Documents

Accuracy [%] (values read from the bar chart)

Model           WikiHop   MedHop
BiDAF           54.5      33.7
BiDAF doc-rem   44.6      30.4

Chart annotation: 10%

slide-23
SLIDE 23

Part 2: Learning to Read and Calculate


Tim Rocktäschel Matko Bosnjak Jason Naradowsky

slide-24
SLIDE 24

Machine Reading and Reasoning: Math

Isabel uploaded 2 pictures from her phone and 4 from her camera to facebook. She sorted the pics into 3 different albums with the same amount of pics in each album.

Question: How many pictures were in each of the albums?
Answer: 2

slide-25
SLIDE 25

Differentiable Program Interpreters

[Figure: Reader Model and Program Interpreter (Code, Stack, Heap), applied to the problem text below. Stack contents shown: 3 4 2.]

Isabel uploaded 2 pictures from her phone and 4 from her camera to facebook. She sorted the pics into 3 different albums with the same amount of pics in each album. How many pictures were in each of the albums?

Bosnjak et al. ICML 2017

slide-26
SLIDE 26

Differentiable Program Interpreters

[Figure: Reader Model and Program Interpreter (Code, Stack, Heap), as on the previous slide. Stack states shown: 3 4 2; 4 2 3.]

Bosnjak et al. ICML 2017

slide-27
SLIDE 27

Differentiable Program Interpreters

[Figure: Reader Model and Program Interpreter (Code, Stack, Heap), as on the previous slide. Stack states shown: 3 4 2; 4 2 3; 6 3.]

Bosnjak et al. ICML 2017

slide-28
SLIDE 28

Differentiable Program Interpreters

[Figure: Reader Model and Program Interpreter (Code, Stack, Heap), as on the previous slide. Stack states shown: 3 4 2; 4 2 3; 6 3; 3 6.]

Bosnjak et al. ICML 2017

slide-29
SLIDE 29

Differentiable Program Interpreters

[Figure: Reader Model and Program Interpreter (Code, Stack, Heap), as on the previous slide. Stack states shown: 3 4 2; 4 2 3; 6 3; 3 6; 2.]

Bosnjak et al. ICML 2017
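The sequence of stack contents in this figure (3 4 2, then 4 2 3, 6 3, 3 6 and finally 2) can be reproduced with an ordinary, non-differentiable stack machine. The sketch below is only an illustration. The slides show the stack contents but not the operations, so the Forth-style names `-rot`, `+`, `swap` and `/` are an assumption chosen to reproduce those states, with the leftmost number treated as the top of the stack.

```python
def run(stack, program):
    """Run a tiny Forth-like program and return every intermediate stack.

    stack[0] is the top of the stack, matching the order read off the slides.
    """
    stack = list(stack)
    trace = [list(stack)]
    for op in program:
        if op == "-rot":
            # Move the top element below the next two: [a, b, c] -> [b, c, a].
            stack = [stack[1], stack[2], stack[0]] + stack[3:]
        elif op == "swap":
            # Exchange the two topmost elements.
            stack = [stack[1], stack[0]] + stack[2:]
        elif op == "+":
            # Replace the two topmost elements by their sum.
            stack = [stack[1] + stack[0]] + stack[2:]
        elif op == "/":
            # Second element divided by the top; integer division is enough here.
            stack = [stack[1] // stack[0]] + stack[2:]
        else:
            raise ValueError(f"unknown operation: {op}")
        trace.append(list(stack))
    return trace


# Assumed program: the operations are not named on the slides.
for state in run([3, 4, 2], ["-rot", "+", "swap", "/"]):
    print(state)
```

The final state holds the answer to the album question, (2 + 4) / 3 = 2. The differentiable interpreter in the talk replaces such discrete states and operations with continuous relaxations so that gradients can flow through the execution.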

slide-30
SLIDE 30

Training

[Figure: Reader Model and Program Interpreter (Code, Stack, Heap), as on the previous slides. Stack states shown: 3 4 2; 4 2 3; 6 3; 3 6; 2.]

Bosnjak et al. ICML 2017

slide-31
SLIDE 31

Training

[Figure: Reader Model and Program Interpreter (Code, Stack, Heap), as on the previous slides. Stack states shown: 3 4 2; 4 2 3; 6 3; 3 6; 2.]

Bosnjak et al. ICML 2017

slide-32
SLIDE 32

Training

[Figure: Reader Model and Program Interpreter (Code, Stack, Heap), as on the previous slides. Stack states shown: 3 4 2; 4 2 3; 6 3; 3 6; 2.]

Bosnjak et al. ICML 2017

slide-33
SLIDE 33

Training

[Figure: Reader Model and Program Interpreter (Code, Stack, Heap), as on the previous slides. Stack states shown: 3 4 2; 4 2 3; 6 3; 3 6; 2. The program sketch below appears next to the Code block.]

Bosnjak et al. ICML 2017

def solve(x): {+|-|%|*} solve(y)
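The slot in the sketch above leaves the choice of operator to be learned. One common way to make such a choice trainable, shown here as a rough Python illustration rather than the method of the paper, is to keep a softmax distribution over the operators and return the expected result. The operator set, the use of `/` for division and the names `softmax` and `soft_apply` are assumptions made for this example.

```python
import math

# Assumed operator set for the slot; the slide's own symbols are left untouched.
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_apply(logits, a, b):
    """Expected result of applying an operator drawn from softmax(logits).

    The output is a smooth function of the logits, so a loss on the result
    can be back-propagated to the operator choice.
    """
    weights = softmax(logits)
    return sum(w * op(a, b) for w, op in zip(weights, OPS.values()))


# Undecided slot: every operator contributes equally.
print(soft_apply([0.0, 0.0, 0.0, 0.0], 6, 3))
# Slot that has (almost) committed to division.
print(soft_apply([-50.0, -50.0, -50.0, 50.0], 6, 3))
```

With uniform logits the result is the average of all four outcomes, which matches no real program. Training has to push the distribution towards a single operator.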

slide-34
SLIDE 34

Results on Benchmark (Accuracy)

Bosnjak et al. ICML 2017

Accuracy (values read from the bar chart)

Model               Accuracy
Roy & Roth (2015)   55.5
Ours                96
Seq2Seq             95

Example: How many pictures were in each of the albums? (2 + 4) / 3

Note: Seq2Seq solves a simpler problem.

slide-35
SLIDE 35

Limitations

Continuous relaxations are difficult to train
    Hard to learn long programs
    Hard to learn with recursive function calls

Need better gradients in the presence of discrete variables

slide-36
SLIDE 36

Part 3: Learning to Aggregate


Tim Rocktäschel

slide-37
SLIDE 37

Neural Theorem Provers

Question: Which medical specialty deals with pituitary ACTH hypersecretion?

Pituitary ACTH hypersecretion ... is a form of hyperpituitarism characterized by an abnormally high level of ACTH produced by the anterior pituitary …

A major organ of the endocrine system, the anterior pituitary is the glandular, anterior lobe that ...

The endocrine system is ... ... The field of study dealing with the endocrine system and its disorders is endocrinology, a branch of internal medicine.

slide-38
SLIDE 38

Neural Theorem Provers

[Figure labels: Agent, Program Interpreter, Reader, Code; documents as on slide 37.]

Question: Which specialty deals with pituitary ACTH hypersecretion?

Rocktäschel and Riedel NIPS 2017

slide-39
SLIDE 39

Neural Theorem Provers

[Figure labels: Agent, Program Interpreter, Reader, Code; documents as on slide 37.]

Question: Which specialty deals with pituitary ACTH hypersecretion?
Question: What is ACTH hypersecretion created by?

Rocktäschel and Riedel NIPS 2017

slide-40
SLIDE 40

Neural Theorem Provers

[Figure labels: Agent, Program Interpreter, Reader, Code; documents as on slide 37.]

Question: Which specialty deals with pituitary ACTH hypersecretion?
Question: What is the pituitary a part of?

Rocktäschel and Riedel NIPS 2017

slide-41
SLIDE 41

Neural Theorem Provers

[Figure labels: Agent, Program Interpreter, Reader, Code; documents as on slide 37.]

Question: Which specialty deals with pituitary ACTH hypersecretion?
Question: What is the pituitary a part of?
Question: What topic covers the endocrine system?

Rocktäschel and Riedel NIPS 2017

slide-42
SLIDE 42

Vectors Correspond to Interpretable Rules

[Figure labels: Agent, Program Interpreter, Reader, Code; documents as on slide 37.]

Question: Which specialty deals with pituitary ACTH hypersecretion?
Question: What is the pituitary a part of?
Question: What topic covers the endocrine system?

Rule: X deals with Y if Y is produced by Z, Z is a part of U, and X deals with U

Rocktäschel and Riedel NIPS 2017

slide-43
SLIDE 43

Catch: Currently only Works on Relational Data

createdBy(hypersecretion, anterior pituitary)
dealsWith(hypersecretion, X)
partOf(endocrine system, anterior pituitary)
dealsWith(endocrine system, endocrinology)

[Figure labels: Agent, Program Interpreter, Reader, Code.]

Question: Which specialty deals with pituitary ACTH hypersecretion?
Question: What is the pituitary a part of?
Question: covers(endocrine system, X)

Rule: X deals with Y if Y is produced by Z, Z is a part of U, and X deals with U

Rocktäschel and Riedel NIPS 2017

slide-44
SLIDE 44

Catch: Currently only Works on Relational Data

createdBy(hypersecretion, anterior pituitary)
dealsWith(hypersecretion, X)
partOf(endocrine system, anterior pituitary)
dealsWith(endocrine system, endocrinology)

Rocktäschel and Riedel NIPS 2017

Amounts to a differentiable version of the backward chaining algorithm used in Prolog
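For readers unfamiliar with backward chaining, the following Python sketch shows the ordinary symbolic version on a cleaned-up variant of the running example. It is a toy prover written for illustration, not the authors' code. The predicate names, the argument order (for instance `partOf(part, whole)`) and the depth limit are assumptions chosen for this sketch and differ slightly from the facts as printed on the slide. Strings that start with an upper-case letter are treated as variables.

```python
import itertools

# Assumed facts; argument order is this sketch's own convention.
FACTS = [
    ("producedBy", "acth_hypersecretion", "anterior_pituitary"),
    ("partOf", "anterior_pituitary", "endocrine_system"),
    ("dealsWith", "endocrinology", "endocrine_system"),
]

# dealsWith(X, Y) :- producedBy(Y, Z), partOf(Z, U), dealsWith(X, U).
RULES = [
    (("dealsWith", "X", "Y"),
     [("producedBy", "Y", "Z"), ("partOf", "Z", "U"), ("dealsWith", "X", "U")]),
]

_fresh = itertools.count()


def is_var(term):
    return isinstance(term, str) and term[:1].isupper()


def walk(term, subst):
    """Follow variable bindings until a constant or an unbound variable."""
    while is_var(term) and term in subst:
        term = subst[term]
    return term


def unify(a, b, subst):
    """Unify two atoms; return an extended substitution or None on failure."""
    if len(a) != len(b):
        return None
    subst = dict(subst)
    for x, y in zip(a, b):
        x, y = walk(x, subst), walk(y, subst)
        if x == y:
            continue
        if is_var(x):
            subst[x] = y
        elif is_var(y):
            subst[y] = x
        else:
            return None
    return subst


def rename(rule):
    """Give the rule's variables fresh names so separate uses do not clash."""
    n = next(_fresh)

    def ren(atom):
        return tuple(f"{t}_{n}" if is_var(t) else t for t in atom)

    head, body = rule
    return ren(head), [ren(atom) for atom in body]


def prove(goals, subst, depth):
    """Yield every substitution that proves all goals within the depth limit."""
    if not goals:
        yield subst
        return
    first, rest = goals[0], goals[1:]
    for fact in FACTS:
        s = unify(first, fact, subst)
        if s is not None:
            yield from prove(rest, s, depth)
    if depth > 0:
        for rule in RULES:
            head, body = rename(rule)
            s = unify(first, head, subst)
            if s is not None:
                yield from prove(body + rest, s, depth - 1)


query = ("dealsWith", "Specialty", "acth_hypersecretion")
answers = [walk("Specialty", s) for s in prove([query], {}, depth=2)]
print(answers)
```

In this symbolic version `unify` either succeeds or fails outright. The depth limit is there because the rule is recursive and would otherwise expand forever.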

slide-45
SLIDE 45

Supports Soft Unification

createdBy(hypersecretion, anterior pituitary)
dealsWith(hypersecretion, X)
partOf(endocrine system, anterior pituitary)
dealsWith(endocrine system, endocrinology)

[Figure labels: Agent, Program Interpreter, Reader, Code.]

Question: Which specialty deals with pituitary ACTH hypersecretion?
Question: What is the pituitary a part of?
Question: covers(endocrine system, X)

Rocktäschel and Riedel NIPS 2017

Rule: X deals with Y if Y is produced by Z, Z is a part of U, and X deals with U
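Soft unification lets the rule's "produced by" match the fact's "createdBy" even though the symbols differ. A minimal Python sketch of the idea follows. It assumes symbols are compared through vector embeddings with an RBF-style kernel and that a proof is scored by its weakest unification step. The two-dimensional embeddings, the `mu` parameter and the function names are invented for illustration.

```python
import math

# Hypothetical 2-d embeddings; in a trained model these would be learned.
EMB = {
    "producedBy": (1.0, 0.0),
    "createdBy": (0.9, 0.1),
    "partOf": (0.0, 1.0),
}


def soft_unify(a, b, mu=1.0):
    """Similarity in (0, 1] between two symbols; identical vectors score 1."""
    dist = math.dist(EMB[a], EMB[b])
    return math.exp(-dist / (2 * mu ** 2))


def proof_score(unifications):
    """Score a proof by its weakest unification step (min aggregation)."""
    return min(soft_unify(a, b) for a, b in unifications)


print(soft_unify("producedBy", "createdBy"))   # close symbols, high score
print(soft_unify("producedBy", "partOf"))      # distant symbols, lower score
print(proof_score([("producedBy", "createdBy"), ("partOf", "partOf")]))
```

Because the score is a smooth function of the embeddings, a loss on the final proof score can adjust which symbols count as similar.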

slide-46
SLIDE 46

Results on Benchmark (Rank of Correct Answer)

[Bar chart: ComplEx vs. NTP on Countries S3, UMLS and Nations. Values shown: 89, 99, 77.3, 86, 96, 48.4.]

Comparable to or better than baselines, and interpretable

Rocktäschel and Riedel NIPS 2017

slide-47
SLIDE 47

Interpretability: Learnt Rules

if X is located in Y and Y is located in Z then X is located in Z
if X expels diplomats of Y then X shows negative behaviour towards Y
if X interacts with Y and Y interacts with Z then X interacts with Z

slide-48
SLIDE 48

Limitations

Scalability
    Currently only works for KBs with < 10k facts
    Small proof depth

Still requires relational representation

slide-49
SLIDE 49

Conclusion

Great progress in end-to-end reading comprehension

Reasoning (aggregation, calculation etc.) end-to-end is still very challenging

Our approaches
    create datasets
    cast reasoning as program learning and execution
    are end-to-end differentiable (can be trained on downstream loss)
    are inspired by and tied to traditional symbolic formalisms (Forth, Prolog/Datalog)
    learn models that are interpretable
    allow injection of prior knowledge