Learning to Map Sentences to Logical Form: Structured Classification with Probabilistic Categorial Grammars



SLIDE 1

Learning to Map Sentences to Logical Form: Structured Classification with Probabilistic Categorial Grammars

Luke Zettlemoyer and Michael Collins MIT CSAIL

SLIDE 2

The Problem

Learning to Map Sentences to Logical Form

Texas borders Kansas → borders(texas,kansas)

SLIDE 3

Several potential applications

  • Natural Language Interfaces to Databases
  • Dialogue Systems
  • Machine Translation
SLIDE 4

Some Training Examples

Input:  What states border Texas?
Output: λx.state(x) ∧ borders(x,texas)

Input:  What is the largest state?
Output: argmax(λx.state(x), λx.size(x))

Input:  What states border the largest state?
Output: λx.state(x) ∧ borders(x, argmax(λy.state(y), λy.size(y)))

SLIDE 5

Our Approach

Learn lexical information (syntax/semantics) for words:

  • Texas | syntax = noun phrase (NP) : semantics = texas
  • states | syntax = noun (N) : semantics = λx.state(x)

Learn to parse to logical form:

Input:  What states border Texas?
Output: λx.state(x) ∧ borders(x,texas)

SLIDE 6

Background

  • Combinatory Categorial Grammar (CCG)
  • Lexicon
  • Parsing Rules (Combinators)
  • Probabilistic CCG (PCCG)
SLIDE 7

CCG Lexicon

Words        | Category (Syntax : Semantics)
Texas        | NP : texas
Kansas       | NP : kansas
Kansas City  | NP : kansas_city_MO
borders      | (S\NP)/NP : λx.λy.borders(y,x)
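A lexicon like the one above can be sketched as a small Python mapping from words to (category, semantics) pairs. This is purely illustrative: the variable names and the tuple encoding of logical terms are assumptions, not the implementation behind the slides.

```python
# Hypothetical sketch: a CCG lexicon as a dict from word strings to
# (syntactic category, semantics) pairs. Categories are plain strings;
# semantics are Python lambdas standing in for lambda-calculus terms,
# with logical forms encoded as nested tuples.
lexicon = {
    "Texas":   ("NP", "texas"),
    "Kansas":  ("NP", "kansas"),
    "borders": ("(S\\NP)/NP", lambda x: lambda y: ("borders", y, x)),
}

cat, sem = lexicon["borders"]
# Applying the semantics to both arguments yields the logical form
# borders(kansas, texas):
print(sem("texas")("kansas"))  # ('borders', 'kansas', 'texas')
```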

SLIDE 8

Parsing Rules (Combinators)

  • Application
  • X/Y : f   Y : a   =>   X : f(a)
  • Y : a   X\Y : f   =>   X : f(a)
  • Additional rules
  • Composition
  • Type Raising

Example derivation for "Kansas borders Texas":

borders  (S\NP)/NP : λx.λy.borders(y,x)
Texas    NP : texas
Kansas   NP : kansas

borders Texas          =>  S\NP : λy.borders(y,texas)      (forward application)
Kansas borders Texas   =>  S : borders(kansas,texas)       (backward application)
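The two application combinators can be sketched in Python. Categories are plain strings, and the category matching is deliberately simplified (split at the outermost slash, strip one pair of parentheses), so this is a sketch of the idea, not a general CCG parser.

```python
def forward_apply(left, right):
    r"""X/Y : f  applied to  Y : a  =>  X : f(a)."""
    cat, sem = left
    arg_cat, arg_sem = right
    if "/" in cat and cat.rsplit("/", 1)[1] == arg_cat:
        result = cat.rsplit("/", 1)[0]
        # Strip one pair of outer parentheses, e.g. "(S\NP)" -> "S\NP".
        if result.startswith("(") and result.endswith(")"):
            result = result[1:-1]
        return result, sem(arg_sem)
    return None

def backward_apply(left, right):
    r"""Y : a  followed by  X\Y : f  =>  X : f(a)."""
    arg_cat, arg_sem = left
    cat, sem = right
    if "\\" in cat and cat.rsplit("\\", 1)[1] == arg_cat:
        return cat.rsplit("\\", 1)[0], sem(arg_sem)
    return None

borders = ("(S\\NP)/NP", lambda x: lambda y: ("borders", y, x))
texas = ("NP", "texas")
kansas = ("NP", "kansas")

vp = forward_apply(borders, texas)   # S\NP : λy.borders(y,texas)
s = backward_apply(kansas, vp)       # S : borders(kansas,texas)
print(s)  # ('S', ('borders', 'kansas', 'texas'))
```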

SLIDE 9

CCG Parsing

Parsing "Texas borders Kansas":

Texas    NP : texas
borders  (S\NP)/NP : λx.λy.borders(y,x)
Kansas   NP : kansas

borders Kansas         =>  S\NP : λy.borders(y,kansas)
Texas borders Kansas   =>  S : borders(texas,kansas)

SLIDE 10

Parsing a Question

Parsing "What states border Texas?":

What     S/(S\NP)/N : λf.λg.λx.f(x)∧g(x)
states   N : λx.state(x)
border   (S\NP)/NP : λx.λy.borders(y,x)
Texas    NP : texas

border Texas               =>  S\NP : λy.borders(y,texas)
What states                =>  S/(S\NP) : λg.λx.state(x)∧g(x)
What states border Texas   =>  S : λx.state(x) ∧ borders(x,texas)

SLIDE 11

Probabilistic CCG (PCCG)

Log-linear model:

  • A CCG for parsing
  • Features
  • fi(L,T,S): the number of times lexical item i is used in the parse T that maps sentence S to logical form L
  • A parameter vector θ with an entry for each fi

SLIDE 12

PCCG Distributions

Log-linear model:

  • Defines a joint distribution:

    P(L,T | S; θ) = e^(θ·f(L,T,S)) / Σ_(L',T') e^(θ·f(L',T',S))

  • Parses are a hidden variable:

    P(L | S; θ) = Σ_T P(L,T | S; θ)

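These two distributions can be sketched numerically. All feature vectors and weights below are invented for illustration; each candidate parse T is represented directly by its feature counts, and two parses (T1, T2) share the same logical form L1.

```python
import math

def score(theta, feats):
    # Inner product theta . f(L, T, S).
    return sum(theta.get(k, 0.0) * v for k, v in feats.items())

def p_joint(theta, candidates, t):
    # P(L,T | S; theta): softmax over all candidate (L', T') pairs.
    z = sum(math.exp(score(theta, f)) for f in candidates.values())
    return math.exp(score(theta, candidates[t])) / z

def p_logical_form(theta, candidates, forms, target):
    # P(L | S; theta): marginalize over the hidden parses T.
    return sum(p_joint(theta, candidates, t)
               for t, lf in forms.items() if lf == target)

theta = {"lex:borders": 1.0, "lex:Texas=NP": 0.5}
candidates = {"T1": {"lex:borders": 1, "lex:Texas=NP": 1},
              "T2": {"lex:borders": 1},
              "T3": {}}
forms = {"T1": "L1", "T2": "L1", "T3": "L2"}  # T1, T2 both yield L1

print(round(p_logical_form(theta, candidates, forms, "L1"), 3))  # 0.878
```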
SLIDE 13

Learning

  • Generating Lexical Items
  • Learning a complete PCCG
SLIDE 14

Lexical Generation

Input Training Example

Sentence: Texas borders Kansas
Logical Form: borders(texas,kansas)

Output Lexicon

Words    | Category
Texas    | NP : texas
borders  | (S\NP)/NP : λx.λy.borders(y,x)
Kansas   | NP : kansas
...      | ...

SLIDE 15

GENLEX

  • Input: a training example (Si,Li)
  • Computation:
  • 1. Create all substrings of words in Si
  • 2. Create categories from Li
  • 3. Create lexical entries that are the cross product of these two sets

  • Output: Lexicon Λ
SLIDE 16

Step 1: GENLEX Words

Input Sentence:

Texas borders Kansas

Output Substrings:

Texas
borders
Kansas
Texas borders
borders Kansas
Texas borders Kansas
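Step 1 can be sketched as a few lines of Python that enumerate every contiguous word span:

```python
def all_substrings(sentence):
    # Every contiguous span of words, from single words to the full sentence.
    words = sentence.split()
    return [" ".join(words[i:j])
            for i in range(len(words))
            for j in range(i + 1, len(words) + 1)]

print(all_substrings("Texas borders Kansas"))
# ['Texas', 'Texas borders', 'Texas borders Kansas',
#  'borders', 'borders Kansas', 'Kansas']
```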

SLIDE 17

Step 2: GENLEX Categories

Input Logical Form:

borders(texas,kansas)

Output Categories: ... ... ...

SLIDE 18

Two GENLEX Rules

Input Trigger             | Output Category
a constant c              | NP : c
an arity-two predicate p  | (S\NP)/NP : λx.λy.p(y,x)

Example Input: borders(texas,kansas)
Output Categories: NP : texas, NP : kansas, (S\NP)/NP : λx.λy.borders(y,x)
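These two trigger rules can be sketched in Python for a logical form encoded as a flat tuple (predicate, arg1, arg2). The tuple encoding and category strings are assumptions made for this example.

```python
def genlex_categories(logical_form):
    # Hypothetical encoding: ('borders', 'texas', 'kansas').
    pred, consts = logical_form[0], logical_form[1:]
    # Rule 1: each constant c triggers NP : c.
    cats = [f"NP : {c}" for c in consts]
    # Rule 2: an arity-two predicate p triggers (S\NP)/NP : λx.λy.p(y,x).
    if len(consts) == 2:
        cats.append(f"(S\\NP)/NP : λx.λy.{pred}(y,x)")
    return cats

cats = genlex_categories(("borders", "texas", "kansas"))
print(len(cats))  # 3
```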

SLIDE 19

All of the Category Rules

Input Trigger                              | Output Category
a constant c                               | NP : c
an arity-one predicate p                   | N : λx.p(x)
an arity-one predicate p                   | S\NP : λx.p(x)
an arity-two predicate p                   | (S\NP)/NP : λx.λy.p(y,x)
an arity-two predicate p                   | (S\NP)/NP : λx.λy.p(x,y)
an arity-one predicate p                   | N/N : λg.λx.p(x)∧g(x)
an arity-two predicate p and a constant c  | N/N : λg.λx.p(x,c)∧g(x)
an arity-two predicate p                   | (N\N)/NP : λx.λg.λy.p(y,x)∧g(y)
an arity-one function f                    | NP/N : λg.argmax/min(g(x),λx.f(x))
an arity-one function f                    | S/NP : λx.f(x)

SLIDE 20

Step 3: GENLEX Cross Product

Input Training Example

Sentence: Texas borders Kansas
Logical Form: borders(texas,kansas)

Output Substrings:

Texas
borders
Kansas
Texas borders
borders Kansas
Texas borders Kansas

Output Categories:

NP : texas
NP : kansas
(S\NP)/NP : λx.λy.borders(y,x)

Output Lexicon: GENLEX is the cross product of these two output sets.
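The cross product itself is a one-liner; here is a sketch with the step 1 and step 2 outputs hard-coded:

```python
from itertools import product

substrings = ["Texas", "borders", "Kansas",
              "Texas borders", "borders Kansas", "Texas borders Kansas"]
categories = ["NP : texas", "NP : kansas",
              "(S\\NP)/NP : λx.λy.borders(y,x)"]

# GENLEX pairs every substring with every category.
candidate_lexicon = list(product(substrings, categories))
print(len(candidate_lexicon))  # 18 entries (6 substrings x 3 categories)
```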

SLIDE 21

GENLEX: Output Lexicon

Words                 | Category
Texas                 | NP : texas
Texas                 | NP : kansas
Texas                 | (S\NP)/NP : λx.λy.borders(y,x)
borders               | NP : texas
borders               | NP : kansas
borders               | (S\NP)/NP : λx.λy.borders(y,x)
...                   | ...
Texas borders Kansas  | NP : texas
Texas borders Kansas  | NP : kansas
Texas borders Kansas  | (S\NP)/NP : λx.λy.borders(y,x)

SLIDE 22

A Simple Algorithm

Inputs: Initial lexicon Λ0

The initial lexicon has two types of entries:

  • Domain Independent:

Example:

What | S/(S\NP)/N : λf.λg.λx.f(x)∧g(x)

  • Domain Dependent:

Example:

Texas | NP : texas

SLIDE 23

A Simple Algorithm

Inputs:
  Initial lexicon Λ0
  Training examples E = {(Si, Li) : i = 1...n}
Initialization:
  Create lexicon Λ* = Λ0 ∪ ⋃i=1..n GENLEX(Si, Li)
  Create features f
  Create initial parameters θ0
Computation:
  Estimate parameters θ = STOCGRAD(E, θ0, Λ*)
Output: PCCG (Λ*, θ, f)

SLIDE 24

The Final Algorithm

Inputs: Λ0, E
Initialization: Create Λ*, f, θ0
Computation:
  For t = 1...T
    1. Prune Lexicon:
       • For each (Si, Li) ∈ E
         − Set λ = Λ0 ∪ GENLEX(Si, Li)
         − Calculate the set of highest-scoring correct parses:
           π = MAXPARSE(Si, Li, λ, θt−1)
         − Define λi to be the lexical items in a parse in π
       • Set Λt = Λ0 ∪ ⋃i=1..n λi
    2. Estimate parameters: θt = STOCGRAD(E, θt−1, Λt)
Output: PCCG (ΛT, θT, f)
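The loop structure can be sketched as runnable scaffolding. GENLEX, MAXPARSE, and STOCGRAD are stand-in stubs here (their behaviors are invented purely so the loop executes); only the prune-then-estimate control flow mirrors the algorithm.

```python
def genlex(sentence, logical_form):
    # Stub: pair every word with a single dummy category.
    return {(w, "CAT") for w in sentence.split()}

def maxparse(sentence, logical_form, lexicon, theta):
    # Stub for MAXPARSE: pretend the highest-scoring correct parse
    # uses only the first and last words' entries.
    words = sentence.split()
    return {(words[0], "CAT"), (words[-1], "CAT")}

def stocgrad(examples, theta, lexicon):
    # Stub for a stochastic gradient parameter update.
    return theta

def train(examples, initial_lexicon, theta, T=2):
    lexicon = set(initial_lexicon)
    for t in range(T):
        # 1. Prune: keep only entries used in highest-scoring correct parses.
        used = set()
        for s, lf in examples:
            candidates = initial_lexicon | genlex(s, lf)
            used |= maxparse(s, lf, candidates, theta)
        lexicon = initial_lexicon | used
        # 2. Estimate parameters over the pruned lexicon.
        theta = stocgrad(examples, theta, lexicon)
    return lexicon, theta

lex, theta = train([("Texas borders Kansas", "borders(texas,kansas)")],
                   set(), {})
print(sorted(lex))  # [('Kansas', 'CAT'), ('Texas', 'CAT')]
```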

SLIDE 25

Related Work

  • CHILL (Zelle and Mooney, 1996)
  • learns deterministic parser; assumes semantic lexicon as input (borders | borders(_,_))

  • WOLFIE (Thompson and Mooney, 2002)
  • learns complete lexicon; deterministic parsing
  • COCKTAIL (Tang and Mooney, 2001)
  • best results; statistical parsing; assumes semantic lexicon

SLIDE 26

Experiments

Two database domains:

  • Geo880
    – 600 training examples
    – 280 test examples
  • Jobs640
    – 500 training examples
    – 140 test examples

SLIDE 27

Evaluation

Test for completely correct semantics

  • Precision: # correct / total # parsed
  • Recall: # correct / total # sentences
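As a sketch, with made-up counts (not the paper's numbers):

```python
def precision_recall(n_correct, n_parsed, n_sentences):
    # Precision: correct logical forms out of sentences that parsed at all.
    # Recall: correct logical forms out of all test sentences.
    return n_correct / n_parsed, n_correct / n_sentences

p, r = precision_recall(n_correct=270, n_parsed=280, n_sentences=340)
print(round(p, 3), round(r, 3))  # 0.964 0.794
```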

SLIDE 28

Results

            Geo880                 Jobs640
            Precision   Recall     Precision   Recall
Our Method  96.25       79.29      97.36       79.29
COCKTAIL    89.92       79.40      93.25       79.84

SLIDE 29

Example Learned Lexical Entries

Words        | Category
states       | N : λx.state(x)
major        | N/N : λg.λx.major(x)∧g(x)
population   | N : λx.population(x)
cities       | N : λx.city(x)
river        | N : λx.river(x)
rivers       | N : λx.river(x)
run through  | (S\NP)/NP : λx.λy.traverse(y,x)
the largest  | NP/N : λg.argmax(g,λx.size(x))
the longest  | NP/N : λg.argmax(g,λx.len(x))
the highest  | NP/N : λg.argmax(g,λx.elev(x))
...          | ...

SLIDE 30

Error Analysis

Low recall: GENLEX is not general enough

  • Fails to parse 10% of training examples

Some unparsed examples include:

  • Through which states does the Mississippi run?
  • If I moved to California and learned SQL on Oracle could I find anything for 30000 on Unix?

SLIDE 31

Future Work

  • Improve recall
  • Explore robust parsing techniques for ungrammatical input
  • Develop new domains
  • Integrate with a dialogue system
SLIDE 32

The End

Thanks

SLIDE 33

Convergence

Some Guarantees

  • 1. Prune Lexicon
  • Will not decrease accuracy on the training set
  • 2. Estimate parameters
  • Should increase the likelihood of the training set