Coarse-to-Fine Decoding for Neural Semantic Parsing (July 16, 2018)



slide-1
SLIDE 1

Coarse-to-Fine Decoding for Neural Semantic Parsing

July 16, 2018

Li Dong and Mirella Lapata

slide-2
SLIDE 2

Semantic Parsing

2 / 27

Mapping natural language to structured representations

Human-friendly -> Computer-friendly

Input utterance: all flights from dallas before 10am

Semantic parser output: (lambda $0 e (and (flight $0) (from $0 dallas:ci) (< (departure_time $0) 1000:ti)))

Example from ATIS (Kwiatkowski et al., 2011)

slide-3
SLIDE 3

Neural Semantic Parsing

3 / 27

Sequence decoder (Jia and Liang, 2016; Dong and Lapata, 2016; Ling et al., 2016; Iyer et al., 2017)

Syntactically-constrained decoder (Dong and Lapata, 2016; Xiao et al., 2016; Alvarez-Melis and Jaakkola, 2017; Yin and Neubig, 2017; Cheng et al., 2017; Krishnamurthy et al., 2017; Rabinovich et al., 2017; Xu et al., 2017)

[Figure: LSTM encoder, attention layer, LSTM decoder]

Input utterance: what microsoft jobs do not require a bscs?

Structured representation: answer(J,(company(J,'microsoft'),job(J),not((req_deg(J,'bscs')))))

slide-4
SLIDE 4

This Work

4 / 27

Input: all flights from dallas before 10am

Meaning sketch: (lambda#2 (and flight@1 from@2 (< departure_time@1 ? ) ) )

Meaning sketch & low-level details (e.g., arguments and variable names): (lambda $0 e (and (flight $0) (from $0 dallas:ci) (< (departure_time $0) 1000:ti)))

slide-5
SLIDE 5

Meaning Sketch

5 / 27

Python code example

Input: if length of bits is lesser than integer 3 or second element of bits is not equal to string 'as' ,
Sketch: if len ( NAME ) < NUMBER or NAME [ NUMBER ] != STRING :
Code: if len(bits) < 3 or bits[1] != 'as':

SQL example

Input: What record company did conductor Mikhail Snitko record for after 1996?
Sketch: WHERE > AND =
SQL: SELECT Record Company WHERE (Year of Recording > 1996) AND (Conductor = Mikhail Snitko)

slide-6
SLIDE 6

Meaning Sketch

6 / 27

Disentangle high-level from low-level semantics
  Model meaning at different levels of granularity

More compact meaning representation
  Length: 21.1 → 9.2 (on ATIS)

Explicit sharing of coarse structure
  For examples that have the same basic meaning

Provide global context to the fine meaning decoder
  It knows what the basic meaning of the input looks like

slide-7
SLIDE 7

Method

7 / 27

slide-8
SLIDE 8

Method

8 / 27

slide-9
SLIDE 9

Method

9 / 27

slide-10
SLIDE 10

Method

10 / 27

Sketch constrains the decoding output

Example 1: one argument is missing
flight@1 -> (flight )

Example 2: type information
NUMBER (a numeric token)

slide-11
SLIDE 11

Training and Inference

11 / 27

𝑦: input, 𝑏: sketch, 𝑧: meaning representation

Models: Coarse Meaning Decoder, Fine Meaning Decoder

Training: maximize the log likelihood

Inference: greedy search
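The equations on this slide did not survive extraction. The block below is one way to write what the labels describe, using the slide's own symbols. It assumes that the coarse decoder models p(b | y), that the fine decoder models p(z | y, b), and that the sketch b is determined by z; the training set name D is an assumption.

```latex
% Factorization into a coarse (sketch) model and a fine model
p(z \mid y) = p(z \mid y, b)\, p(b \mid y)

% Training: maximize the log likelihood over training triples (y, b, z)
\max \sum_{(y, b, z) \in \mathcal{D}} \bigl[ \log p(b \mid y) + \log p(z \mid y, b) \bigr]

% Inference: both argmax steps are approximated with greedy search
\hat{b} = \arg\max_{b'} p(b' \mid y), \qquad
\hat{z} = \arg\max_{z'} p(z' \mid y, \hat{b})
```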

slide-12
SLIDE 12

Semantic Parsing Tasks

12 / 27

Natural language to logical form (Geo/ATIS)
  what is the population of the state with the largest area?
  (argmax $0 (and (mountain:t $0) (loc:t $0 alaska:s)) (elevation:i $0))

Natural language to source code (Django)
  if length of bits is lesser than integer 3 or second element of bits is not equal to string 'as' ,
  if len(bits) < 3 or bits[1] != 'as':

Natural language to SQL (WikiSQL)
  Table columns: Pianist | Conductor | Record Company | Year of Recording | Format
  What record company did conductor Mikhail Snitko record for after 1996?
  SELECT Record Company WHERE (Year of Recording > 1996) AND (Conductor = Mikhail Snitko)

(Zettlemoyer and Collins, 2005; Kwiatkowski et al., 2011; Oda et al., 2015; Zhong et al., 2017)

slide-13
SLIDE 13

Natural Language to Logical Form

13 / 27

Logical form: (lambda $0 e (and (flight $0) (from $0 dallas:ci) (< (departure_time $0) 1000:ti)))

Sketch: (lambda#2 (and flight@1 from@2 (< departure_time@1 ? ) ) )

“#” Variable information (e.g., lambda, count, and argmax)
“?” Partial argument information
“@” Arguments of predicate or operator
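
The three markers can be illustrated with a small Python sketch. This is one reading of the notation on this slide, not the authors' implementation. The names `parse_sexpr`, `sketch`, and `BINDER_VARS` are hypothetical, and the number of variable tokens each binder consumes (2 for lambda, 1 for the others) is an assumption chosen to reproduce the slide's example.

```python
# Assumed number of variable tokens that each binder consumes, e.g.
# "lambda $0 e" drops two tokens and becomes "lambda#2".
BINDER_VARS = {"lambda": 2, "count": 1, "argmax": 1, "argmin": 1}


def parse_sexpr(text):
    """Parse a parenthesized expression into nested Python lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        if tokens[pos] == "(":
            items, pos = [], pos + 1
            while tokens[pos] != ")":
                item, pos = read(pos)
                items.append(item)
            return items, pos + 1
        return tokens[pos], pos + 1

    tree, _ = read(0)
    return tree


def sketch(tree):
    """Turn a parsed logical form into its coarse meaning sketch."""
    pred, args = tree[0], tree[1:]
    if pred in BINDER_VARS:
        # "#": keep only how many variable tokens were omitted.
        n = BINDER_VARS[pred]
        pred, args = f"{pred}#{n}", args[n:]
    elif all(isinstance(a, str) for a in args):
        # "@": a predicate whose arguments are all terminals.
        return f"{pred}@{len(args)}"
    # "?": a terminal argument next to nested structure.
    parts = [sketch(a) if isinstance(a, list) else "?" for a in args]
    return "(" + " ".join([pred] + parts) + ")"


form = "(lambda $0 e (and (flight $0) (from $0 dallas:ci) (< (departure_time $0) 1000:ti)))"
print(sketch(parse_sexpr(form)))
```

Apart from whitespace before the closing parentheses, the printed string matches the sketch shown on the slide.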
slide-14
SLIDE 14

Natural Language to Source Code

14 / 27

Substitute tokens with their token types (https://docs.python.org/3/library/tokenize.html), except:
  Delimiters (e.g., “[”, and “:”)
  Operators (e.g., “+”, and “*”)
  Built-in keywords (e.g., “True”, and “while”)

Code: if s [ : 4 ] . lower ( ) == 'http':
Sketch: if NAME [ : NUMBER ] . NAME ( ) == STRING :
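
The substitution rule can be tried with the standard-library `tokenize` module that the slide links to. The sketch below is illustrative only. The helper names `code_sketch` and `is_builtin_keyword` are hypothetical, and treating every name in the `builtins` module (such as `len`) as a "built-in keyword" is an assumption made so that the output matches the examples in this deck.

```python
import builtins
import io
import keyword
import tokenize

# Token types that are replaced by their type name in the sketch.
TYPE_NAMES = {
    tokenize.NAME: "NAME",
    tokenize.NUMBER: "NUMBER",
    tokenize.STRING: "STRING",
}


def is_builtin_keyword(name):
    # Assumption: language keywords plus names from the builtins module.
    return keyword.iskeyword(name) or hasattr(builtins, name)


def code_sketch(source_line):
    """Replace identifiers and literals with their token types."""
    sketch = []
    reader = io.StringIO(source_line).readline
    for tok in tokenize.generate_tokens(reader):
        if tok.type == tokenize.OP:
            # Delimiters and operators are kept verbatim.
            sketch.append(tok.string)
        elif tok.type == tokenize.NAME and is_builtin_keyword(tok.string):
            sketch.append(tok.string)
        elif tok.type in TYPE_NAMES:
            sketch.append(TYPE_NAMES[tok.type])
        # NEWLINE, ENDMARKER and similar layout tokens are skipped.
    return " ".join(sketch)


print(code_sketch("if s[:4].lower() == 'http':"))
print(code_sketch("if len(bits) < 3 or bits[1] != 'as':"))
```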

slide-15
SLIDE 15

Natural Language to SQL

15 / 27

WikiSQL (Zhong et al., 2017)

Query template: SELECT agg_operator agg_column WHERE (cond_column cond_operator cond_value) AND ...

Example: SELECT Record Company WHERE (Year of Recording > 1996) AND (Conductor = Mikhail Snitko)

Sketch: WHERE > AND =
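
As a small illustration of how the sketch relates to the query template, the Python below keeps only the condition operators of a query. It is a toy sketch under stated assumptions: representing a query as (cond_column, cond_operator, cond_value) triples and the helper names `where_sketch` and `render_sql` are hypothetical and not taken from WikiSQL or the authors' code.

```python
def where_sketch(conditions):
    """Keep only the operators of the WHERE clause.

    conditions: list of (cond_column, cond_operator, cond_value) triples.
    """
    if not conditions:
        return ""
    return "WHERE " + " AND ".join(op for _, op, _ in conditions)


def render_sql(agg_operator, agg_column, conditions):
    """Fill the query template; an empty agg_operator means no aggregation."""
    if agg_operator:
        query = f"SELECT {agg_operator}({agg_column})"
    else:
        query = f"SELECT {agg_column}"
    if conditions:
        query += " WHERE " + " AND ".join(
            f"({col} {op} {val})" for col, op, val in conditions
        )
    return query


conditions = [
    ("Year of Recording", ">", "1996"),
    ("Conductor", "=", "Mikhail Snitko"),
]
print(render_sql("", "Record Company", conditions))
print(where_sketch(conditions))
```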

slide-16
SLIDE 16

Natural Language to SQL

16 / 27

Decoding is table-aware

Question: How many presidents are graduated from A?

Table columns: President | College
SELECT COUNT(President) WHERE (College = A)

Table columns: College | Number of Presidents
SELECT Number of Presidents WHERE (College = A)

slide-17
SLIDE 17

Natural Language to SQL

17 / 27

Table-aware input encoder

[Figure: LSTM units encode the input question and the column names (Column 1: college; Column 2: number of presidents) into vectors; question-to-table attention connects the question vectors to the column vectors. Legend: LSTM units, vectors, attention.]

slide-18
SLIDE 18

Natural Language to SQL

18 / 27

SELECT clause

SELECT agg_operator agg_column WHERE (cond_column cond_operator cond_value) AND ...

agg_operator ∈ {empty, COUNT, MIN, MAX, SUM, AVG}: softmax classifier over the question vector

agg_column: column pointer over the column vectors (Column 1: college; Column 2: number of presidents)
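
The two SELECT predictions can be mimicked with a toy example in plain Python. This is a sketch under strong assumptions, not the model from the talk: scoring by dot product, the weight vectors, and the function name `predict_select` are all hypothetical, and the numbers are made up for illustration.

```python
import math

AGG_OPERATORS = ["empty", "COUNT", "MIN", "MAX", "SUM", "AVG"]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def predict_select(question_vec, operator_weights, column_vecs):
    """Return (agg_operator, index of agg_column).

    Assumption: both the classifier and the pointer score candidates
    with a dot product against the question vector.
    """
    op_probs = softmax([dot(w, question_vec) for w in operator_weights])
    col_probs = softmax([dot(c, question_vec) for c in column_vecs])
    agg_operator = AGG_OPERATORS[op_probs.index(max(op_probs))]
    agg_column = col_probs.index(max(col_probs))
    return agg_operator, agg_column


# Made-up vectors: the second operator (COUNT) and second column score highest.
question_vec = [1.0, 0.0]
operator_weights = [[0.0, 0.0], [2.0, 0.0], [0.0, 0.0],
                    [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
column_vecs = [[0.1, 0.0], [0.9, 0.0]]
print(predict_select(question_vec, operator_weights, column_vecs))
```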

slide-19
SLIDE 19

WHERE Clause

Natural Language to SQL

19 / 27

Sketch classification: WHERE > AND =

SELECT agg_operator agg_column WHERE (cond_column cond_operator cond_value) AND ...

Question: What record company did conductor Mikhail Snitko record for after 1996 ?

Table columns: Pianist | Conductor | Record Company | Year of Recording | Format

slide-20
SLIDE 20

WHERE Clause

Natural Language to SQL

20 / 27

[Figure: sketch-guided WHERE decoding. Stages: sketch classification (WHERE > AND =), sketch encoding, and sketch-guided decoding with a cond_col pointer and a cond pointer.]

SELECT agg_operator agg_column WHERE (cond_column cond_operator cond_value) AND ...

Question: What record company did conductor Mikhail Snitko record for after 1996 ?

Table columns: Pianist | Conductor | Record Company | Year of Recording | Format

slide-21
SLIDE 21

WHERE Clause

Natural Language to SQL

21 / 27

[Figure: sketch-guided WHERE decoding. Stages: sketch classification (WHERE > AND =), sketch encoding, and sketch-guided decoding with a cond_col pointer and a cond pointer.]

cond_col pointer: point to a table column
cond pointer: point to a text span

SELECT agg_operator agg_column WHERE (cond_column cond_operator cond_value) AND ...

Question: What record company did conductor Mikhail Snitko record for after 1996 ?

Table columns: Pianist | Conductor | Record Company | Year of Recording | Format
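
To make the two pointers concrete, the toy Python below fills a WHERE sketch from a column index and a question span per condition. It only illustrates the data flow: in the talk the pointers are predicted by the decoder, whereas here they are given by hand, and the function name `fill_where_sketch` and its argument layout are hypothetical.

```python
def fill_where_sketch(sketch, columns, question_tokens, column_ptrs, span_ptrs):
    """Fill each operator slot of the sketch with a column and a text span.

    column_ptrs: one column index per condition (cond_col pointer).
    span_ptrs: one inclusive (start, end) token span per condition (cond pointer).
    """
    operators = [tok for tok in sketch.split() if tok not in ("WHERE", "AND")]
    if not (len(operators) == len(column_ptrs) == len(span_ptrs)):
        raise ValueError("the sketch fixes the number of conditions")
    conditions = []
    for op, col_idx, (start, end) in zip(operators, column_ptrs, span_ptrs):
        value = " ".join(question_tokens[start:end + 1])
        conditions.append(f"({columns[col_idx]} {op} {value})")
    return "WHERE " + " AND ".join(conditions)


columns = ["Pianist", "Conductor", "Record Company", "Year of Recording", "Format"]
question = "What record company did conductor Mikhail Snitko record for after 1996 ?"
tokens = question.split()

# Hand-picked pointers: column 3 with span "1996", column 1 with span "Mikhail Snitko".
where = fill_where_sketch("WHERE > AND =", columns, tokens, [3, 1], [(10, 10), (5, 6)])
print(where)
```

Because the sketch already fixes the operators and the number of conditions, the decoder in this picture only has to choose a column and a span for each slot.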

slide-22
SLIDE 22

Experimental Results

22 / 27

NL->Code (Django), accuracy

(Ling et al., 2016)       62.3
(Yin and Neubig, 2017)    71.6
OneStage (w/o sketch)     69.5
Coarse2Fine               74.1

slide-23
SLIDE 23

Experimental Results

23 / 27

NL->Logical Form, accuracy

Model          Geo     ATIS
Seq2Seq        84.6    84.2
Seq2Tree       87.1    84.6
ASN            87.1    85.9
OneStage       85.0    85.3
Coarse2Fine    88.2    87.7

Baselines: (Dong and Lapata, 2016; Rabinovich et al., 2017)

slide-24
SLIDE 24

Experimental Results

24 / 27

NL->SQL (WikiSQL), execution accuracy

Aug Pointer Network (Zhong et al., 2017)    53.3
(Zhong et al., 2017)                        59.4
(Xu et al., 2017)                           68.0
OneStage (w/o sketch)                       75.9
Coarse2Fine                                 78.5

slide-25
SLIDE 25

Sketch Accuracy

25 / 27

Model          Geo     ATIS    Django    WikiSQL
OneStage       85.4    85.9    73.2      95.4
Coarse2Fine    89.3    88.0    77.4      95.9

slide-26
SLIDE 26

Oracle Meaning Sketch

26 / 27

Accuracy

Model              Geo     ATIS    Django    WikiSQL
Coarse2Fine        88.2    87.7    74.1      78.5
+ Oracle Sketch    93.9    95.1    83.0      79.6

slide-27
SLIDE 27

Future Work

27 / 27

Alternative ways of defining meaning sketches
  Different levels of granularity

Weakly supervised setting
  Meaning sketch reduces search space

Partial annotation
  Only annotate meaning sketches for some examples

slide-28
SLIDE 28

Thanks!

Q&A

Code Available: http://homepages.inf.ed.ac.uk/s1478528