Question asking as program induction
Anselm Rothe, with Todd Gureckis and Brenden Lake

“Computers are useless. They can only give you answers.”
(attributed to) Pablo Picasso
What does it take to build a machine that asks good questions?
Anselm Rothe - Question asking as program induction 5
- Representing questions as programs that, when executed on the state of the world, output an answer

Key ingredients
- Generativity
- Compositionality
- Informativeness
- Simplicity
HUMAN QUESTIONS

We need a task that allows people to intuitively ask interesting questions and is still amenable to formal modeling.

[Figure: Battleship-style task. A generative model (world model) draws random sample gameboards with three possible ships (1x blue, 1x red, 1x purple) on a 6x6 grid (columns A–F, rows 1–6). The partially revealed gameboard is the current data/context, an ambiguous context. Goal: identify the hidden gameboard!]

Rothe, Lake, & Gureckis 2016, CogSci; Rothe, Lake, & Gureckis 2018, Computational Brain & Behavior
People were dropped into the middle of a game and given the ‘magic’ opportunity to ask whatever they wanted*

* only one-word-answer questions, no combination of questions

Text box (“type in your question”): Is the red ship horizontal? |
Example questions from people in one context:
- At what location is the top left part of the purple ship?
- What is the location of one purple tile?
- Is the blue ship horizontal?
- Is the red ship 2 tiles long?
- Is the purple ship horizontal?
- Is the red ship horizontal?
[Figure: the 18 partially revealed gameboards (Trials 1–18) used as contexts.]

- 40 MTurk participants
- 605 human questions
- 15% of participants’ questions were asked in only a single context
- Our model needs the ability to generate novel questions
Human questions share compositional components:

“How long is the blue ship?”
“Does the blue ship have 3 tiles?”
“Are there any ships with 4 tiles?”
“Is the blue ship less than 4 tiles?”
“Are all 3 ships the same size?”
“Does the red ship have more tiles than the blue ship?”

Shared components: size, blue, ship, 3, red, more, 4, less
COMPOSITIONALITY IN QUESTION STRUCTURE

- Questions are represented as programs that, when executed on the state of the world, output an answer

Building programs from reusable parts (size, orientation, more, equal, blue, red):

(size Blue)
(> (size Blue) (size Red))
(= (size Blue) (size Red))
(= (orientation Blue) (orientation Red))   “Are the blue ship and the red ship parallel?”
How many ships are three tiles long?
(+ (map (lambda x (= (size x) 3)) (set Blue Red Purple)))

Are any ships three tiles long?
(> (+ (map (lambda x (= (size x) 3)) (set Blue Red Purple))) 0)

Are all ships three tiles long?
(= (+ (map (lambda x (= (size x) 3)) (set Blue Red Purple))) 3)
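As a concrete illustration, programs like the ones above can be executed in Python against a minimal world state. The dictionary fields and helper names below are illustrative, not the authors' actual implementation:

```python
# A minimal sketch of "questions as programs": each question is an
# expression that, evaluated on the state of the world, yields an answer.
# The world representation below is made up for illustration.

world = {
    "Blue":   {"size": 3, "orient": "H"},
    "Red":    {"size": 2, "orient": "V"},
    "Purple": {"size": 4, "orient": "H"},
}

def size(ship):
    return world[ship]["size"]

ships = ["Blue", "Red", "Purple"]

# (+ (map (lambda x (= (size x) 3)) (set Blue Red Purple)))
# "How many ships are three tiles long?"
how_many_size3 = sum(size(x) == 3 for x in ships)

# (> (+ (map ...)) 0) -- "Are any ships three tiles long?"
any_size3 = how_many_size3 > 0

# (= (+ (map ...)) 3) -- "Are all ships three tiles long?"
all_size3 = how_many_size3 == 3

print(how_many_size3, any_size3, all_size3)  # 1 True False
```

The same counting subexpression yields a number, a yes/no about existence, or a yes/no about universality, depending on what it is composed with.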
A GRAMMAR OF QUESTIONS

A → B (boolean) | N (number) | C (color) | O (orientation) | L (location)
B → TRUE | FALSE | (not B) | (and B B) | (or B B) | (= B B) | (= N N) | (= O O) | (= setN) | (> N N) | (touch S S)ᵇ
N → 0 | ... | 10 | (+ N N) | (+ B B) | (+ setN) | (+ setB) | (- N N) | (size S)ᵇ | (row L) | (col L)
C → S (ship color) | Water | (color L)ᵇ
S → Blue | Red | Purple | x (λ-bound variable)
O → H | V | (orient S)ᵇ
L → A1 | ... | F6 | (topleft S)ᵇ | (bottomright S)ᵇ | (draw setL)*
setB → (map fxB setS)
setN → (map fxN setS)
setS → (set Blue Red Purple)
setL → (set A1 ... F6) | (shipTiles S)ᵇ | (map fxL setS)*
fxB → (λ x B)
fxN → (λ x N)
fxL → (λ x L)

Rothe, Lake, & Gureckis 2017, NIPS
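A grammar like this supports generativity directly: sampling repeatedly expands nonterminals until only terminals remain. A runnable sketch over a tiny illustrative subset of the rules (this is not the full grammar, and the uniform rule choice is an assumption):

```python
import random

# A tiny illustrative subset of the question grammar: each nonterminal
# maps to its possible right-hand sides (mixes of terminals and
# nonterminals). Terminals are any symbols without a rule.
RULES = {
    "B": [["TRUE"], ["(", "not", "B", ")"],
          ["(", "=", "N", "N", ")"], ["(", ">", "N", "N", ")"]],
    "N": [["3"], ["4"], ["(", "size", "S", ")"]],
    "S": [["Blue"], ["Red"], ["Purple"]],
}

def sample(symbol, rng):
    """Recursively expand a symbol into a flat list of tokens."""
    if symbol not in RULES:
        return [symbol]  # terminal token
    expansion = rng.choice(RULES[symbol])
    return [tok for part in expansion for tok in sample(part, rng)]

rng = random.Random(0)
for _ in range(3):
    print(" ".join(sample("B", rng)))
```

Every sample is a syntactically valid boolean question program, including questions never seen in the human data.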
Generating questions
- Drawing samples from the grammar
- Evolutionary search with a cost/fitness function over the question space defined by the grammar
[Figure: example contexts paired with the human questions asked in them; a new context marked “?”.]

Using a genetic algorithm with EIG (expected information gain) as the fitness function to search for the “best question” for a given context
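Under the hypothesis-space view of the task, EIG has a direct form: the prior entropy over candidate hidden boards minus the expected posterior entropy after hearing the question's answer. A sketch on a toy hypothesis space (the boards and questions here are illustrative, not the actual task's space):

```python
from collections import defaultdict
from math import log2

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * log2(p) for p in probs if p > 0)

def eig(question, hypotheses):
    """Expected information gain of a question, uniform prior over boards.

    `question` maps a candidate hidden board to the answer it would get.
    """
    prior = 1.0 / len(hypotheses)
    by_answer = defaultdict(int)
    for h in hypotheses:
        by_answer[question(h)] += 1
    h_prior = entropy([prior] * len(hypotheses))
    h_post = 0.0
    for n in by_answer.values():
        p_answer = n * prior
        # Posterior after this answer: uniform over the n consistent boards.
        h_post += p_answer * entropy([1.0 / n] * n)
    return h_prior - h_post

# Toy hypothesis space: blue ship size in {2,3,4} x orientation in {H,V}.
boards = [{"size": s, "orient": o} for s in (2, 3, 4) for o in ("H", "V")]

print(eig(lambda b: b["orient"] == "H", boards))  # splits 3/3: 1 bit
print(eig(lambda b: b["size"], boards))           # splits 2/2/2: log2(3) bits
```

Questions whose answers split the hypothesis space more evenly (or into more parts) score higher, which is what the evolutionary search optimizes.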
The search finds questions with high EIG but extreme complexity, e.g. (EIG = 5.38):

(- (- (+ (+ (- (- (+ (size Purple) (colL (topleft Red))) (size Blue)) (- (+ (size Blue) (size Red)) (colL (topleft Red)))) (colL (bottomright Purple))) (+ (+ (colL (topleft Red)) (+ (- (- (+ (size Purple) (colL (topleft Red))) (size Blue)) (- (+ (size Blue) (size Red)) (colL (topleft Blue)))) (colL (topleft Red)))) (+ (- (- (+ (size Purple) (colL (topleft Red))) (size Blue)) (- (+ (size Blue) (size Red)) (colL (topleft Red)))) (colL (topleft Red))))) (size Red)) (- (+ (size Blue) (size Blue)) (colL (topleft Red))))
What features are relevant for people to ask a question?

- θ1 Informativeness: informative questions
- θ2 Complexity: short questions

To predict the probability p(question) of question x being asked, over the space of questions defined by the grammar, combine the features of x via a weighted sum (an energy):

E(x) = θ1 f1(x) + θ2 f2(x) + ... + θK fK(x)

where fk(x) is the value of feature k for question x.
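One standard way to turn such an energy into p(question), consistent with the weighted-sum form above, is a Boltzmann distribution, p(x) ∝ exp(−E(x)). A sketch with illustrative feature values and weights (not the fitted parameters):

```python
from math import exp

def energy(features, thetas):
    """E(x) = theta_1*f_1(x) + ... + theta_K*f_K(x)."""
    return sum(t * f for t, f in zip(thetas, features))

def question_probs(feature_table, thetas):
    """p(x) proportional to exp(-E(x)) over a finite question space."""
    unnorm = {q: exp(-energy(f, thetas)) for q, f in feature_table.items()}
    z = sum(unnorm.values())
    return {q: u / z for q, u in unnorm.items()}

# Illustrative features per question: f1 = -EIG (negated so lower
# energy means more informative), f2 = program length.
features = {
    "(size Blue)":       (-1.58, 2),
    "(= (size Blue) 3)": (-1.00, 4),
    "(orient Blue)":     (-1.00, 2),
}
thetas = (1.0, 0.5)  # weights on informativeness and complexity
probs = question_probs(features, thetas)
```

With these made-up numbers the short, informative (size Blue) gets the highest probability, matching the informative-but-simple pattern the features are meant to capture.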
Model comparison (out-of-sample predictions):

Model                  Features   Log-likelihood
Full                   all        −1400.06
Information-agnostic   not θ1     −1464.65
Complexity-agnostic    not θ2     −22993.38
Type-agnostic          not θ3
MODEL OR HUMAN?

Are all the ships horizontal?
(all (map (lambda x (== H (orient x))) (set Blue Red Purple)))

Are any of the ship sizes greater than 2?
(any (map (lambda x (> (size x) 2)) (set Blue Red Purple)))

How many ships are 4 tiles long?
(++ (map (lambda x (== (size x) 4)) (set Blue Red Purple)))

Average rank correlation ρ = .64
[Figure: per-context scatter plots (contexts 5–18) of model score (negative energy, i.e., unnormalized log probability) against empirical human question frequency; per-context rank correlations range from ρ = 0.45 to ρ = 0.85.]
You may now generate your questions
What does it take to build a machine that asks good questions?

We represent questions as programs that, when executed on the state of the world, output an answer. We achieve generativity through compositionality. Good, human-like questions are informative but simple.

Key ingredients
- Generativity
- Compositionality
- Informativeness
- Simplicity
[Figure: model fitting, with parameters θ fit so the model's simulated data matches the human data.]