Coarse-to-Fine Decoding for Neural Semantic Parsing
July 16, 2018
Li Dong and Mirella Lapata
Semantic Parsing
2 / 27
Mapping natural language to structured representations
Human-friendly -> Computer-friendly
all flights from dallas before 10am
-> Semantic Parser ->
(lambda $0 e (and (flight $0) (from $0 dallas:ci) (< (departure_time $0) 1000:ti)))
Example from ATIS (Kwiatkowski et al., 2011)
Sequence decoder (Jia and Liang, 2016; Dong and Lapata, 2016; Ling et al., 2016; Iyer et al., 2017)
Syntactically-constrained decoder (Dong and Lapata, 2016; Xiao et al., 2016; Alvarez-Melis and Jaakkola, 2017; Yin and Neubig, 2017; Cheng et al., 2017; Krishnamurthy et al., 2017; Rabinovich et al., 2017; Xu et al., 2017)
Neural Semantic Parsing
3 / 27
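[Figure: attention-based encoder-decoder. An LSTM encoder reads the input utterance and, through an attention layer, an LSTM decoder generates the structured representation.]
Input utterance: what microsoft jobs do not require a bscs?
Structured representation: answer(J,(company(J,'microsoft'),job(J),not((req_deg(J,'bscs')))))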
This Work
4 / 27
Input: all flights from dallas before 10am
Meaning sketch: (lambda#2 (and flight@1 from@2 (< departure_time@1 ?)))
Meaning sketch + low-level details (e.g., arguments and variable names):
(lambda $0 e (and (flight $0) (from $0 dallas:ci) (< (departure_time $0) 1000:ti)))
Python code example & SQL example
Meaning Sketch
5 / 27
Python code example
Input: if length of bits is lesser than integer 3 or second element of bits is not equal to string 'as',
Sketch: if len ( NAME ) < NUMBER or NAME [ NUMBER ] != STRING :
Code: if len(bits) < 3 or bits[1] != 'as':
SQL example
Input: What record company did conductor Mikhail Snitko record for after 1996?
Sketch: WHERE > AND =
SQL: SELECT Record Company WHERE (Year of Recording > 1996) AND (Conductor = Mikhail Snitko)
Disentangle high-level from low-level semantics
Model meaning at different levels of granularity
More compact meaning representation
Length: 21.1 → 9.2 (on ATIS)
Explicit sharing of coarse structure
For examples that have the same basic meaning
Provide global context to fine meaning decoder
Know what the basic meaning of input looks like
Meaning Sketch
6 / 27
Method
7 / 27
Method
8 / 27
Method
9 / 27
Sketch constrains the decoding output
Example 1: one argument is missing
Example 2: type information
Method
10 / 27
flight@1 -> (flight ): one argument left to fill in
NUMBER -> a numeric token
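The type constraint in the second example can be illustrated with a small filter. This is a minimal sketch, not the authors' decoder: `matches_type`, `constrained_pick`, and the candidate scores are hypothetical names and values chosen for illustration, and the type tests are simplified stand-ins for Python's real lexical rules.

```python
import keyword
import re


def matches_type(sketch_token, candidate):
    """Return True if `candidate` is compatible with the sketch token."""
    if sketch_token == "NUMBER":
        # Simplified numeric test (assumption): digits with an optional fraction.
        return re.fullmatch(r"\d+(\.\d+)?", candidate) is not None
    if sketch_token == "NAME":
        return candidate.isidentifier() and not keyword.iskeyword(candidate)
    if sketch_token == "STRING":
        return (len(candidate) >= 2 and candidate[0] in "'\""
                and candidate[-1] == candidate[0])
    # Any other sketch token (delimiter, operator, keyword) is copied as is.
    return candidate == sketch_token


def constrained_pick(sketch_token, scores):
    """Greedy choice restricted to candidates the sketch token allows."""
    allowed = {tok: s for tok, s in scores.items()
               if matches_type(sketch_token, tok)}
    if not allowed:
        raise ValueError(f"no candidate fits sketch token {sketch_token!r}")
    return max(allowed, key=allowed.get)


# Hypothetical decoder scores for one output position.
scores = {"bits": 0.5, "3": 0.3, "'as'": 0.2}
print(constrained_pick("NUMBER", scores))
```

Even though "bits" has the highest score, a NUMBER slot can only be filled by the numeric candidate.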
𝑦: input, 𝑏: sketch, 𝑧: meaning representation
Training: maximize the log likelihood
Inference: greedy search
Training and Inference
11 / 27
Coarse Meaning Decoder Fine Meaning Decoder
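One way to write out the training and inference steps, assuming the two-stage factorization in which the coarse decoder generates the sketch first and the fine decoder generates the meaning representation conditioned on it. The symbols y (input), b (sketch), and z (meaning representation) follow the slide; the training set symbol \mathcal{D} and the hat notation are introduced here for illustration.

```latex
\begin{gather}
p(z \mid y) = p(z \mid y, b)\, p(b \mid y) \\
p(b \mid y) = \prod_{t=1}^{|b|} p(b_t \mid b_{<t}, y), \qquad
p(z \mid y, b) = \prod_{t=1}^{|z|} p(z_t \mid z_{<t}, y, b) \\
\text{Training:} \quad \max \sum_{(y, b, z) \in \mathcal{D}}
  \log p(b \mid y) + \log p(z \mid y, b) \\
\text{Inference:} \quad \hat{b} = \arg\max_{b'} p(b' \mid y), \qquad
  \hat{z} = \arg\max_{z'} p(z' \mid y, \hat{b})
\end{gather}
```

Both arg max operations are approximated token by token with greedy search.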
Natural language to logical form (Geo/ATIS)
Natural language to source code (Django)
Natural language to SQL (WikiSQL)
Semantic Parsing Tasks
12 / 27
Logical form (Geo/ATIS):
what is the population of the state with the largest area?
(argmax $0 (and (mountain:t $0) (loc:t $0 alaska:s)) (elevation:i $0))
Source code (Django):
if length of bits is lesser than integer 3 or second element of bits is not equal to string 'as',
if len(bits) < 3 or bits[1] != 'as':
SQL (WikiSQL):
What record company did conductor Mikhail Snitko record for after 1996?
Table columns: Pianist | Conductor | Record Company | Year of Recording | Format
SELECT Record Company WHERE (Year of Recording > 1996) AND (Conductor = Mikhail Snitko)
(Zettlemoyer and Collins, 2005; Kwiatkowski et al., 2011; Oda et al., 2015; Zhong et al., 2017)
Natural Language to Logical Form
13 / 27
Sketch: (lambda#2 (and flight@1 from@2 (< departure_time@1 ?)))
Logical form: (lambda $0 e (and (flight $0) (from $0 dallas:ci) (< (departure_time $0) 1000:ti)))
“#”: variable information (e.g., lambda, count, and argmax)
“?”: partial argument information
“@”: arguments of predicate or operator
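The three markers can be illustrated with a small tree transformation. This is a minimal sketch whose rules are inferred from the single example on the slide, not the authors' algorithm: the `BINDERS` set, the function names, and the exact treatment of leaf arguments are assumptions.

```python
import re

# Operators whose leaf arguments are summarized with "#" (assumed set).
BINDERS = {"lambda", "count", "argmax", "argmin"}


def parse(text):
    """Parse an s-expression into nested lists of strings."""
    tokens = re.findall(r"[()]|[^\s()]+", text)

    def read(pos):
        if tokens[pos] == "(":
            node, pos = [], pos + 1
            while tokens[pos] != ")":
                child, pos = read(pos)
                node.append(child)
            return node, pos + 1
        return tokens[pos], pos + 1

    tree, _ = read(0)
    return tree


def sketch(node):
    """Turn a parsed logical form into its coarse sketch string."""
    if isinstance(node, str):
        return "?"  # a leaf argument next to a non-leaf sibling
    head, args = node[0], node[1:]
    leaves = [a for a in args if isinstance(a, str)]
    if len(leaves) == len(args):
        # Only leaf arguments: keep the name and the argument count.
        return f"{head}@{len(args)}"
    if head in BINDERS:
        # Count the leaf (variable) arguments, keep the non-leaf children.
        parts = [f"{head}#{len(leaves)}"]
        parts += [sketch(a) for a in args if not isinstance(a, str)]
    else:
        parts = [head] + [sketch(a) for a in args]
    return "(" + " ".join(parts) + ")"


form = ("(lambda $0 e (and (flight $0) (from $0 dallas:ci) "
        "(< (departure_time $0) 1000:ti)))")
print(sketch(parse(form)))
```

On the ATIS example this reproduces the sketch shown on the slide, up to whitespace.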
Substitute tokens with their token types, except:
Delimiters (e.g., “[”, and “:”)
Operators (e.g., “+”, and “*”)
Built-in keywords (e.g., “True”, and “while”)
Natural Language to Source Code
14 / 27
Sketch: if NAME [ : NUMBER ] . NAME ( ) == STRING :
Code: if s [ : 4 ] . lower ( ) == 'http':
https://docs.python.org/3/library/tokenize.html
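The substitution rule can be tried out with the standard `tokenize` module. This is a minimal sketch rather than the authors' preprocessing script: `code_sketch` is a hypothetical helper, and treating names found in `builtins` (such as `len`) as kept tokens is an assumption made to match the slide examples.

```python
import builtins
import io
import keyword
import tokenize

# Token types that carry no surface text we want in the sketch.
_SKIP = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
         tokenize.DEDENT, tokenize.ENDMARKER, tokenize.COMMENT}


def code_sketch(source):
    """Replace identifiers and literals with their token types."""
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIP:
            continue
        if tok.type == tokenize.NAME:
            # Keywords and built-in names are kept; other names are abstracted.
            keep = keyword.iskeyword(tok.string) or hasattr(builtins, tok.string)
            out.append(tok.string if keep else "NAME")
        elif tok.type == tokenize.NUMBER:
            out.append("NUMBER")
        elif tok.type == tokenize.STRING:
            out.append("STRING")
        else:
            # Delimiters and operators are copied unchanged.
            out.append(tok.string)
    return " ".join(out)


print(code_sketch("if s[:4].lower() == 'http':"))
```

Because `tokenize` works at the lexical level, a single `if` line without a body can be processed on its own.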
WikiSQL (Zhong et al., 2017)
Natural Language to SQL
15 / 27
WHERE > AND =
SELECT Record Company WHERE (Year of Recording > 1996) AND (Conductor = Mikhail Snitko)
SELECT agg_operator agg_column WHERE (cond_column cond_operator cond_value) AND ...
Natural Language to SQL
16 / 27
How many presidents graduated from A?
Table with columns President | College:
SELECT COUNT(President) WHERE (College = A)
Table with columns College | Number of Presidents:
SELECT Number of Presidents WHERE (College = A)
Decoding is table-aware
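The two schemas can be compared with an in-memory SQLite database. This is only an illustration of why the same question needs different queries: the table names `people` and `summary` and all row values are made up, and a `FROM` clause is added because SQLite requires one, while the slide's WikiSQL-style queries leave the table implicit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema 1: one row per president (hypothetical data).
conn.execute('CREATE TABLE people ("President" TEXT, "College" TEXT)')
conn.executemany("INSERT INTO people VALUES (?, ?)",
                 [("P1", "A"), ("P2", "A"), ("P3", "B")])

# Schema 2: one row per college with a precomputed count (hypothetical data).
conn.execute('CREATE TABLE summary ("College" TEXT, "Number of Presidents" INTEGER)')
conn.executemany("INSERT INTO summary VALUES (?, ?)", [("A", 2), ("B", 1)])

# Same question, different query depending on the table.
count_from_rows = conn.execute(
    'SELECT COUNT("President") FROM people WHERE "College" = ?', ("A",)
).fetchone()[0]
value_from_column = conn.execute(
    'SELECT "Number of Presidents" FROM summary WHERE "College" = ?', ("A",)
).fetchone()[0]

print(count_from_rows, value_from_column)
```

The first table needs an aggregation operator, the second needs a plain column lookup, so the decoder has to see the table header to choose.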
[Figure labels: four table-aware question vectors, one per question token]
Table-aware input encoder
Natural Language to SQL
17 / 27
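[Figure: table-aware input encoder. LSTM units encode the input question tokens 𝑦1 to 𝑦4 into vectors 𝒇1 to 𝒇4 and the column names ("college" for Column 1, "number of presidents" for Column 2) into column vectors 𝒅1 and 𝒅2. Question-to-table attention computes, for each question token, an attended column vector, which is concatenated (||) with that token's vector.]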
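Question-to-table attention can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the model from the talk: it uses an unparameterized dot product as the attention score, omits the LSTM layers, and the function names and toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend_to_columns(question_vecs, column_vecs):
    """Concatenate each question vector with its attended column context."""
    dim = len(column_vecs[0])
    augmented = []
    for f in question_vecs:
        # Attention weights of this question token over all columns.
        weights = softmax([dot(f, d) for d in column_vecs])
        # Weighted sum of column vectors.
        context = [sum(w * d[i] for w, d in zip(weights, column_vecs))
                   for i in range(dim)]
        augmented.append(f + context)  # list concatenation, the "||" step
    return augmented


# Toy 2-dimensional vectors for two question tokens and two columns.
out = attend_to_columns([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
print(out[0])
```

Each output vector carries both the token's own encoding and a summary of the columns it matches best.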
SELECT clause
Natural Language to SQL
18 / 27
SELECT agg_operator agg_column WHERE (cond_column cond_operator cond_value) AND ...
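agg_operator ∈ {empty, COUNT, MIN, MAX, SUM, AVG}: softmax classifier over the question vector
agg_column: column pointer over the column vectors
[Figure: the columns "college" (Column 1) and "number of presidents" (Column 2) are encoded as column vectors, and the column pointer selects one of them as agg_column.]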
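The two SELECT decisions can be sketched as one classification and one pointer step. This is a toy illustration, not the trained model: the dot-product scoring, `predict_select`, and all weights and vectors are hypothetical stand-ins for learned parameters.

```python
import math

# "" stands for the empty aggregation operator.
AGG_OPERATORS = ["", "COUNT", "MIN", "MAX", "SUM", "AVG"]


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def predict_select(question_vec, agg_weights, column_vecs, column_names):
    """Pick agg_operator by classification and agg_column by pointing."""
    # Softmax classifier over the six aggregation operators.
    agg_probs = softmax([dot(w, question_vec) for w in agg_weights])
    agg = AGG_OPERATORS[agg_probs.index(max(agg_probs))]
    # Column pointer: a distribution over the table's columns.
    col_probs = softmax([dot(c, question_vec) for c in column_vecs])
    col = column_names[col_probs.index(max(col_probs))]
    return f"SELECT {agg}({col})" if agg else f"SELECT {col}"


# Hypothetical parameters: only the COUNT row responds to the question vector.
agg_weights = [[0.0, 0.0], [2.0, 0.0], [0.0, 0.0],
               [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
print(predict_select([1.0, 0.0], agg_weights,
                     [[0.0, 1.0], [1.0, 0.0]], ["College", "President"]))
```

The operator set is fixed, so it suits a classifier; the column set changes per table, so it suits a pointer.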
WHERE Clause
Natural Language to SQL
19 / 27
Sketch classification: WHERE > AND =
SELECT agg_operator agg_column WHERE (cond_column cond_operator cond_value) AND ...
Question: What record company did conductor Mikhail Snitko record for after 1996 ?
Table columns: Pianist | Conductor | Record Company | Year of Recording | Format
WHERE Clause
Natural Language to SQL
20 / 27
[Figure: sketch-guided WHERE decoding. Sketch classification selects the sketch "WHERE > AND ="; sketch encoding turns it into vectors 𝒘1 and 𝒘2; the decoder (states 𝒊1 to 𝒊4) then fills each condition with a cond_col pointer and a cond pointer, with AND joining the conditions.]
SELECT agg_operator agg_column WHERE (cond_column cond_operator cond_value) AND ...
Question: What record company did conductor Mikhail Snitko record for after 1996 ?
Table columns: Pianist | Conductor | Record Company | Year of Recording | Format
WHERE Clause
Natural Language to SQL
21 / 27
[Figure: the sketch-guided WHERE decoder from the previous slide, with the two pointers annotated.]
cond_col pointer: points to a table column
cond pointer: points to a text span
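How a sketch and two kinds of pointers combine into a WHERE clause can be sketched as follows. This is a minimal illustration, not the neural decoder: `where_operators`, `fill_where`, and the pointer indices are hypothetical and supplied by hand, where the model would predict them.

```python
def where_operators(sketch):
    """Extract the condition operators from a sketch such as 'WHERE > AND ='."""
    tokens = sketch.split()
    if not tokens:
        return []  # no WHERE clause
    malformed = (tokens[0] != "WHERE" or len(tokens) % 2 != 0
                 or any(t != "AND" for t in tokens[2::2]))
    if malformed:
        raise ValueError(f"unexpected sketch: {sketch!r}")
    return tokens[1::2]


def fill_where(sketch, columns, question_tokens, pointers):
    """Fill each sketch operator with a pointed column and a pointed span.

    `pointers` holds one (column_index, (span_start, span_end)) pair per
    operator; span indices are inclusive positions in the question.
    """
    operators = where_operators(sketch)
    if len(operators) != len(pointers):
        raise ValueError("need one (column, span) pointer pair per operator")
    conditions = []
    for op, (col_idx, (start, end)) in zip(operators, pointers):
        value = " ".join(question_tokens[start:end + 1])
        conditions.append(f"({columns[col_idx]} {op} {value})")
    return "WHERE " + " AND ".join(conditions) if conditions else ""


columns = ["Pianist", "Conductor", "Record Company", "Year of Recording", "Format"]
question = ("What record company did conductor Mikhail Snitko "
            "record for after 1996 ?").split()
# Hand-picked pointers: column 3 with token 10, column 1 with tokens 5 to 6.
print(fill_where("WHERE > AND =", columns, question, [(3, (10, 10)), (1, (5, 6))]))
```

The sketch fixes the number of conditions and their operators, so the fine decoder only has to choose a column and a text span for each slot.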
Experimental Results
22 / 27
NL->Code (Django), accuracy (%)
(Ling et al., 2016)        62.3
(Yin and Neubig, 2017)     71.6
OneStage (w/o sketch)      69.5
Coarse2Fine                74.1
Experimental Results
23 / 27
NL->Logical Form, accuracy (%)
System        Geo    ATIS
Seq2Seq       84.6   84.2
Seq2Tree      87.1   84.6
ASN           87.1   85.9
OneStage      85.0   85.3
Coarse2Fine   88.2   87.7
Baselines: (Dong and Lapata, 2016; Rabinovich et al., 2017)
Experimental Results
24 / 27
NL->SQL (WikiSQL), execution accuracy (%)
Aug Pointer Network (Zhong et al., 2017)   53.3
(Zhong et al., 2017)                       59.4
(Xu et al., 2017)                          68.0
OneStage (w/o sketch)                      75.9
Coarse2Fine                                78.5
Sketch accuracy (%)
System        Geo    ATIS   Django   WikiSQL
OneStage      85.4   85.9   73.2     95.4
Coarse2Fine   89.3   88.0   77.4     95.9
Sketch Accuracy
25 / 27
Oracle Meaning Sketch
26 / 27
Accuracy (%)
System            Geo    ATIS   Django   WikiSQL
Coarse2Fine       88.2   87.7   74.1     78.5
+ Oracle Sketch   93.9   95.1   83.0     79.6
Alternative ways of defining meaning sketches
Different levels of granularity
Weakly supervised setting
Meaning sketch reduces search space
Partial annotation
Only annotate meaning sketches for some examples
Future Work
27 / 27
Thanks!
Q&A
Code Available: http://homepages.inf.ed.ac.uk/s1478528