FANDA: A Novel Approach to Perform Follow-up Query Analysis - PowerPoint PPT Presentation



slide-1
SLIDE 1

FANDA: A Novel Approach to Perform Follow-up Query Analysis

Qian Liu, Bei Chen, Jian-Guang Lou, Ge Jin, Dongmei Zhang

slide-2
SLIDE 2

Introduction

Interaction

User [1]: show the sales of BMW in 2009.
System: SELECT Sales WHERE Brand = BMW and Year = 2009
User [2]: what about profit?
Fused: show the profit of BMW in 2009.
System: SELECT Profit WHERE Brand = BMW and Year = 2009
User [3]: of Benz?
Fused: show the profit of Benz in 2009.
System: SELECT Profit WHERE Brand = Benz and Year = 2009
User [4]: Compare it to Ford.
Fused: Compare the profit of Benz in 2009 to Ford.
System: SELECT Profit WHERE ( Brand = Benz or Brand = Ford ) and Year = 2009

Each turn pairs a Precedent Query and Precedent SQL with a Follow-up Query, its Fused Query, and the Follow-up SQL.

Brand  Sales  Profit  Year
BMW    31020  5000    2009
Ford   25220  3000    2009
Benz   47060  6000    2009

74.58% of queries follow immediately after the question they are related to (Bertomeu et al., 2006).
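As a minimal sketch of what the simplified SQL in this interaction selects, the toy table can be held as a list of rows and filtered in Python. The `select` helper and the row layout are illustrative assumptions, not part of the presented system.

```python
# Toy table from the slide, one dict per row.
TABLE = [
    {"Brand": "BMW", "Sales": 31020, "Profit": 5000, "Year": 2009},
    {"Brand": "Ford", "Sales": 25220, "Profit": 3000, "Year": 2009},
    {"Brand": "Benz", "Sales": 47060, "Profit": 6000, "Year": 2009},
]


def select(column, brands, year):
    """Hypothetical helper: values of `column` for rows matching any brand and the year."""
    return [
        row[column]
        for row in TABLE
        if row["Brand"] in brands and row["Year"] == year
    ]


print(select("Sales", {"BMW"}, 2009))            # User [1]
print(select("Profit", {"BMW"}, 2009))           # User [2]
print(select("Profit", {"Benz"}, 2009))          # User [3]
print(select("Profit", {"Benz", "Ford"}, 2009))  # User [4]
```

Each follow-up changes only one part of the previous query (the selected column or the brand filter), which is why the fused query can reuse everything else.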

slide-3
SLIDE 3

Introduction

SequentialQA

  • Iyyer M, Yih W, Chang M W. Search-based neural structured learning for sequential question answering[C]//ACL 2017

ATIS3

  • Dahl D A, Bates M, Brown M, et al. Expanding the scope of the ATIS task: The ATIS-3 corpus[C]//HLT-ACL 1994
  • Miller S, Stallard D, Bobrow R, et al. A fully statistical approach to natural language interfaces[C]//ACL 1996
  • Zettlemoyer L S, Collins M. Learning context-dependent mappings from sentences to logical form[C]//ACL 2009
  • Suhr A, Iyer S, Artzi Y. Learning to Map Context-Dependent Sentences to Executable Formal Queries[C]//NAACL 2018

Non-sentential Question Resolution

  • Kumar V, Joshi S. Non-sentential Question Resolution using Sequence to Sequence Learning[C]//COLING 2016
  • Kumar V, Joshi S. Incomplete Follow-up Question Resolution using Retrieval based Sequence to Sequence Learning[C]//SIGIR 2017

slide-4
SLIDE 4

Introduction

  • Prior work in context-dependent parsing focuses on specific domains or simple scenarios.
  • Our goal: language understanding in complex scenarios covering diverse domains in NLIDB.
  • Dataset: a new dataset, FollowUp, is presented for research and evaluation.
  • Method: a novel approach is presented for incorporating interaction history into the current sentence.

slide-5
SLIDE 5

Introduction

Dataset

  • 1000 queries over 120 different tables inherited from WikiSQL
  • Annotation with query triples: (Precedent, Follow-up, Fused)
  • Train/Dev/Test: 640/160/200
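
One way to picture an annotated example is as a small record with three text fields. The sketch below is an assumption about layout only: the `QueryTriple` class and its field names are hypothetical, and the sample triple is borrowed from the BMW interaction in the introduction rather than from the FollowUp dataset itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryTriple:
    """Hypothetical container for one (Precedent, Follow-up, Fused) annotation."""
    precedent: str
    follow_up: str
    fused: str


# Illustrative triple taken from the introduction example, not from the dataset.
example = QueryTriple(
    precedent="show the sales of BMW in 2009.",
    follow_up="what about profit?",
    fused="show the profit of BMW in 2009.",
)

# Split sizes as stated on the slide.
SPLIT = {"train": 640, "dev": 160, "test": 200}
total = sum(SPLIT.values())
print(total)  # 1000
```

The three split sizes add up to the 1000 queries reported for the dataset.
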
slide-6
SLIDE 6

Introduction

Method

Sequence to Sequence: Input → Encode → Vector → Decode → Output

  • Learning to encode and decode
  • Non-interpretable
  • Requires lots of training data

Follow-up ANalysis for DAtabase (FANDA): Input → Encode & Predict → Structure → Fusion → Output

  • Learning to encode, fusion with semantic rules
  • Reasons during fusion
  • Cold start with little data
slide-7
SLIDE 7

Follow-up Analysis for Database

  • Anonymization: generate symbol sequence
  • Generation: generate segment sequence
  • Fusion: query fusion at the structure level

slide-8
SLIDE 8

Follow-up Analysis for Database

Anonymization

Symbol  Meaning
Col     Column Name
Val     Cell Value
Agg     Aggregation
Com     Comparison
Dir     Order Direction
Per     Personal Pronoun
Pos     Possessive Pronoun
Dem     Demonstrative

Symbols draw on Table-Related Knowledge and Language-Related Knowledge.

  • Words in an utterance are split into two types: analysis-specific words and rhetorical words.
  • All numbers and dates belong to Val.
  • One analysis-specific word could belong to different symbols, generating several symbol sequences.
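The anonymization step can be sketched as a lookup that maps each word to its candidate symbols and then enumerates every combination. Everything concrete below is an assumption for illustration: the toy lexicons, the ambiguous entry for "most", and the helper names `candidate_symbols` and `symbol_sequences` are not taken from the slides.

```python
from itertools import product

# Assumed toy knowledge; a real system would read Col/Val from the table.
COLUMNS = {"network", "year"}
CELL_VALUES = {"cbc", "tsn"}
LEXICON = {
    "average": {"Agg"},
    "after": {"Com"},
    "most": {"Agg", "Dir"},  # assumed ambiguity, for illustration only
    "it": {"Per"},
    "its": {"Pos"},
    "that": {"Dem"},
}


def candidate_symbols(word):
    """Return every symbol a word could take; an empty set marks a rhetorical word."""
    w = word.lower()
    symbols = set(LEXICON.get(w, ()))
    if w in COLUMNS:
        symbols.add("Col")
    if w in CELL_VALUES or w.isdigit():  # numbers belong to Val
        symbols.add("Val")
    return symbols


def symbol_sequences(utterance):
    """Enumerate all symbol sequences; rhetorical words are kept unchanged."""
    options = []
    for word in utterance.split():
        symbols = candidate_symbols(word)
        options.append(sorted(symbols) if symbols else [word])
    return [list(seq) for seq in product(*options)]


for seq in symbol_sequences("show the most recent year of CBC after 1995"):
    print(" ".join(seq))
```

Because "most" has two candidate symbols in this toy lexicon, the utterance yields two symbol sequences, which is the situation the last bullet describes.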

slide-9
SLIDE 9

Follow-up Analysis for Database

Generation

Segment  Compositional Deduction Rule
Select   [ Agg + [ Val ] ] + Col
Group    Col
Order    [ Dir ] + Col
W1       [ Col ] + [ Com ] + Val
W2       Col + Com + Col
P1       Per
P2       Pos
P3       Dem + Col

  • A symbol does not consider its surrounding context, so the segment structure is designed.
  • A segment is a combination of adjacent symbols, inspired by SQL parameters and common sense.

Precedent Query: Could you tell me the player whose score is larger than 67
Follow Query (1): Who play the same position as him?
Follow Query (2): sort them using their score in ascending order.

Segment labels shown on the example: Select, W1, P3, P1, P1, Order

slide-10
SLIDE 10

Follow-up Analysis for Database

Generation

  • 1. Symbols are combined to generate all possible segment sequences.
  • 2. A ranking model is built to score these segment sequences and pick the best one as output.
  • 3. Intent is introduced to distinguish two scenarios: Refine and Append.
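The first two steps can be sketched as an exhaustive enumeration followed by a scoring function. The patterns below expand the optional parts of the compositional deduction rules, with segment names read as Select, Group, Order, W1, W2 and P1 to P3. Two things are assumptions made only for this sketch: every symbol must be covered by exactly one segment, and the `score` function is a placeholder heuristic standing in for the learned ranking model, which the slides do not specify.

```python
# Concrete symbol patterns per segment, with optional parts expanded.
RULES = {
    "Select": [("Col",), ("Agg", "Col"), ("Agg", "Val", "Col")],
    "Group": [("Col",)],
    "Order": [("Col",), ("Dir", "Col")],
    "W1": [("Val",), ("Com", "Val"), ("Col", "Val"), ("Col", "Com", "Val")],
    "W2": [("Col", "Com", "Col")],
    "P1": [("Per",)],
    "P2": [("Pos",)],
    "P3": [("Dem", "Col")],
}


def segment_sequences(symbols, start=0):
    """Enumerate every way to cover symbols[start:] with adjacent segments."""
    if start == len(symbols):
        return [[]]
    results = []
    for name, patterns in RULES.items():
        for pattern in patterns:
            end = start + len(pattern)
            if tuple(symbols[start:end]) == pattern:
                for rest in segment_sequences(symbols, end):
                    results.append([name] + rest)
    return results


def score(segments):
    """Placeholder for the ranking model (assumption): prefer fewer segments."""
    return -len(segments)


# Symbols for a phrase shaped like "player ... score ... larger than ... 67".
candidates = segment_sequences(["Col", "Col", "Com", "Val"])
best = max(candidates, key=score)
print(len(candidates), best)
```

The enumeration grows quickly with utterance length, which is one reason a ranking step is needed to pick a single segment sequence.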

slide-11
SLIDE 11

Follow-up Analysis for Database

Fusion

Network  Year
TSN      1995
CBC      1995
CFL      1996

  • 1. Conflicting segment pairs do not occur at the same time.
  • 2. Use one sentence to make up for what the other sentence lacks.
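A rough sketch of these two principles on the Network/Year table follows. It assumes, purely for illustration, that two cell values conflict when they belong to the same column, so a follow-up value replaces the conflicting precedent value while everything else in the precedent query is kept. The `column_of` and `fuse` helpers and the token-level treatment are hypothetical simplifications, not the structure-level fusion of the presented approach.

```python
# Toy table from the slide: column name -> cell values.
TABLE = {
    "Network": ["TSN", "CBC", "CFL"],
    "Year": ["1995", "1995", "1996"],
}


def column_of(token):
    """Return the column whose cells contain `token`, or None."""
    for column, values in TABLE.items():
        if token in values:
            return column
    return None


def fuse(precedent_tokens, follow_up_values):
    """Replace precedent values that share a column with a follow-up value."""
    fused = list(precedent_tokens)
    for new_value in follow_up_values:
        column = column_of(new_value)
        if column is None:
            continue  # not a cell value, nothing to resolve in this sketch
        fused = [new_value if column_of(tok) == column else tok for tok in fused]
    return fused


precedent = "In 1995 is there any network named CBC".split()
print(" ".join(fuse(precedent, ["TSN"])))
```

Here "TSN" and "CBC" both come from the Network column, so only one of them survives, while "1995" is carried over from the precedent query because the follow-up says nothing about the year.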

slide-12
SLIDE 12

Follow-up Analysis for Database

Fusion

slide-13
SLIDE 13

Follow-up Analysis for Database

Model Learning

[Figure: bidirectional LSTM-CRF tagger. Input words x_1 ... x_n (for example "show the sum ... average") pass through forward and backward LSTM layers; the LSTM output feeds a CRF layer with transition matrix A, which emits tags such as O, Select_B, Select_I, Refine_I and Append_I. The figure also marks two candidate sets, P and N.]

slide-14
SLIDE 14

Experiments

  • Symbol Acc: Symbol Consistent With Gold Fused Query
  • BLEU: Quality of Output Fused Query
  • Execution Accuracy: Output Query Execution Correctness (Parser using Coarse-to-Fine)

  • SEQ2SEQ: Attention SEQ2SEQ
  • COPYNET: + copy mechanism
  • S2S + ANON: SEQ2SEQ + anonymization
  • COPY + ANON: COPYNET + anonymization
  • CONCAT: Concatenate Precedent Query and Follow-up
  • E2ECR: End to End Coreference Resolution System

Origin: In 1995, is there any network named CBC? Any TSN?
Transform: In Val#1, is there any Col#1 named Val#2? Any Val#3?

Anonymization Dataset
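The Origin/Transform example can be reproduced with a short substitution pass. This is a sketch under stated assumptions: the column and cell-value sets are a hand-written stand-in for the table, every mention receives a fresh index, and the `anonymize` function is hypothetical rather than the preprocessing actually used for the baselines.

```python
import re

# Assumed stand-in for the table behind the example utterance.
COLUMNS = {"network", "year"}
CELL_VALUES = {"cbc", "tsn", "cfl"}


def anonymize(utterance):
    """Replace column names and cell values with indexed Col#k / Val#k tags."""
    counters = {"Col": 0, "Val": 0}

    def replace(match):
        word = match.group(0)
        key = word.lower()
        if key in COLUMNS:
            kind = "Col"
        elif key in CELL_VALUES or key.isdigit():  # numbers count as values
            kind = "Val"
        else:
            return word  # leave ordinary words untouched
        counters[kind] += 1
        return f"{kind}#{counters[kind]}"

    return re.sub(r"\w+", replace, utterance)


print(anonymize("In 1995, is there any network named CBC? Any TSN?"))
```

Punctuation is left in place because the pattern only matches word characters.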

slide-15
SLIDE 15

Experiments

Ablation Results

Model             Symbol Acc  BLEU
FANDA + Pretrain  48.2        59.87
FANDA             47.8        59.02
FANDA - Intent    35.3        55.01
FANDA - Ranking   24.3        52.92

slide-16
SLIDE 16

Experiments

Error Case Analysis

Case                 COPY+ANON  FANDA
Substantial Overlap  √          √ (Segment Type)
No Overlap                      √ (Table Structure)
Ambiguous Overlap               √ (Combination of Symbol)

slide-17
SLIDE 17

Thank you!

Future Work

  • Extending to multi-turn and multi-table settings.
  • Using reinforcement learning.
slide-18
SLIDE 18

Question & Answer

SEQ2SEQ

$S_1 = \mathrm{RNN}(H_n, E_{sos})$
$\alpha_i = S_1^{T} W H_i$
$a_i = \frac{e^{\alpha_i}}{\sum_k e^{\alpha_k}}$
$C = \sum_i a_i \cdot H_i$
$O = F(C, S_1)$
$S_2 = \mathrm{RNN}(S_1, E_o)$
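The attention step of the SEQ2SEQ baseline can be sketched in plain Python. The sketch assumes a bilinear score between the decoder state and each encoder state, followed by a softmax and a weighted sum; the `attention` function, the variable names and the toy numbers are illustrative assumptions, and the recurrent updates themselves are left out.

```python
import math


def attention(s1, W, H):
    """Return (weights, context) for decoder state s1 and encoder states H."""

    def matvec(M, v):
        return [sum(m * x for m, x in zip(row, v)) for row in M]

    # Bilinear score for each encoder state: s1^T W H_i.
    alphas = [sum(s * y for s, y in zip(s1, matvec(W, h))) for h in H]

    # Softmax over the scores (shifted by the max for numerical stability).
    top = max(alphas)
    exps = [math.exp(a - top) for a in alphas]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: weighted sum of the encoder states.
    dim = len(H[0])
    context = [sum(w * h[d] for w, h in zip(weights, H)) for d in range(dim)]
    return weights, context


H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy encoder states
W = [[1.0, 0.0], [0.0, 1.0]]              # toy weight matrix (identity)
s1 = [0.5, -0.5]                          # toy decoder state
weights, context = attention(s1, W, H)
print(weights, context)
```

The weights always sum to one, so the context vector stays within the span of the encoder states regardless of how peaked the scores are.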

slide-19
SLIDE 19

Question & Answer

COPYNET

[Figure: COPYNET decoder. Memory (encoder hidden states), S (decoder hidden state) and A (attention) feed two modes, Generate (over the vocabulary, including <PAD>) and Copy (over the source sentence, for example "show T by C1 about how by C2"); a softmax over both produces the next word, for example "show".]

$\alpha_i = V_a^{T} \tanh(W [H_n, H_i])$
$a_i = \frac{e^{\alpha_i}}{\sum_k e^{\alpha_k}}$
$C = \sum_i a_i \cdot H_i$
$S_1 = \mathrm{RNN}(H_n, [E_s, C])$
$O_G = W_o S_1$
$\beta_j = \tanh(W_C H_j) \, S_1$
$O_C = [\beta_1, \dots, \beta_s]$
$P = \mathrm{Softmax}(O_G, O_C)$
$p(y_t \mid \cdot) = p(y_t, g \mid \cdot) + p(y_t, c \mid \cdot)$
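The way generate and copy probabilities combine can be sketched as a single softmax over both score lists, with the probability of a word summed over the two modes when it appears in both the vocabulary and the source sentence. The function name, the toy vocabulary and the scores below are assumptions for illustration; how the scores are computed from the decoder state is not modeled here.

```python
import math


def copy_generate_distribution(gen_scores, copy_scores, vocab, source_tokens):
    """Combine generate-mode and copy-mode scores into one word distribution."""
    scores = list(gen_scores) + list(copy_scores)
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shifted for stability
    z = sum(exps)

    probs = {}
    tokens = list(vocab) + list(source_tokens)
    for token, e in zip(tokens, exps):
        # A word reachable in both modes gets the sum of both probabilities.
        probs[token] = probs.get(token, 0.0) + e / z
    return probs


vocab = ["show", "by", "<PAD>"]    # toy generate-mode vocabulary
source = ["show", "T", "by", "C1"]  # toy source sentence
probs = copy_generate_distribution(
    [1.0, 0.0, -1.0], [2.0, 0.5, 0.0, 0.5], vocab, source
)
print(max(probs, key=probs.get))
```

Words such as "T" and "C1" are absent from the toy vocabulary, so they can only receive probability through the copy mode.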

slide-20
SLIDE 20

Question & Answer

Score Evolution

  • In Iteration 1, different but similar scores are assigned to all candidates in P and N with random initialization.
  • From Iteration 5 to 21, the score distribution becomes increasingly skewed.
  • From Iteration 13 to 21, the candidate with the highest score remains unchanged, indicating the stability of weakly supervised learning.