Semantic Parsing with Combinatory Categorial Grammars

Semantic Parsing with Combinatory Categorial Grammars (abridged). Yoav Artzi, University of Washington. Based on the ACL 2013 tutorial, with Nicholas FitzGerald and Luke Zettlemoyer. Original tutorial slides available at


  1. CCG Categories
     ADJ : λx.fun(x)
     • Basic building block
     • Capture syntactic and semantic information jointly

  2. CCG Categories
     ADJ : λx.fun(x), where ADJ is the syntax and λx.fun(x) is the semantics
     • Basic building block
     • Capture syntactic and semantic information jointly

  3. CCG Categories: Syntax
     ADJ : λx.fun(x)
     (S\NP)/ADJ : λf.λx.f(x)
     NP : CCG
     • Primitive symbols: N, S, NP, ADJ and PP
     • Syntactic combination operators (/, \)
     • Slashes specify argument order and direction

  4. CCG Categories: Semantics
     ADJ : λx.fun(x)
     (S\NP)/ADJ : λf.λx.f(x)
     NP : CCG
     • λ-calculus expression
     • Syntactic type maps to semantic type

  5. CCG Lexical Entries
     fun ⊢ ADJ : λx.fun(x)
     • Pair words and phrases with meaning
     • Meaning captured by a CCG category

  6. CCG Lexical Entries
     fun ⊢ ADJ : λx.fun(x), where "fun" is the natural language and ADJ : λx.fun(x) is the CCG category
     • Pair words and phrases with meaning
     • Meaning captured by a CCG category

  7. CCG Lexicons
     fun ⊢ ADJ : λx.fun(x)
     is ⊢ (S\NP)/ADJ : λf.λx.f(x)
     CCG ⊢ NP : CCG
     • Pair words and phrases with meaning
     • Meaning captured by a CCG category
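The lexicon on slide 7 can be pictured as a small lookup table. The following is a minimal sketch, not the tutorial's implementation: the `Category` class, the `LEXICON` dictionary and the `lookup` helper are hypothetical names, syntax is kept as a plain string, and Python callables stand in for λ-terms.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Category:
    syntax: str      # e.g. "ADJ" or "(S\\NP)/ADJ"
    semantics: Any   # a constant, or a Python callable standing in for a λ-term


# Each word or phrase maps to a list of categories, so ambiguity can be stored.
LEXICON: Dict[str, List[Category]] = {
    "fun": [Category("ADJ", lambda x: f"fun({x})")],
    "is": [Category("(S\\NP)/ADJ", lambda f: lambda x: f(x))],
    "CCG": [Category("NP", "CCG")],
}


def lookup(phrase: str) -> List[Category]:
    """Return every category the lexicon pairs with this phrase (possibly none)."""
    return LEXICON.get(phrase, [])


for word in "CCG is fun".split():
    print(word, [c.syntax for c in lookup(word)])
```

Storing a list per phrase is a design choice made here so that several competing entries for the same word can coexist.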

  8. Between CCGs and CFGs
                               CFGs            CCGs
     Combination operations    Many            Few
     Parse tree nodes          Non-terminals   Categories
     Syntactic symbols         Few dozen       Handful, but can combine
     Paired with words         POS tags        Categories

  9. Parsing with CCGs
     CCG is fun
     CCG ⊢ NP : CCG
     is ⊢ (S\NP)/ADJ : λf.λx.f(x)
     fun ⊢ ADJ : λx.fun(x)
     is fun ⇒ S\NP : λx.fun(x) (>)
     CCG is fun ⇒ S : fun(CCG) (<)
     Use lexicon to match words and phrases with their categories

  10. CCG Operations
     • Small set of operators
     • Input: 1-2 CCG categories
     • Output: A single CCG category
     • Operate on syntax and semantics together
     • Mirror natural logic operations

  11. CCG Operations: Application
     B : g   A\B : f ⇒ A : f(g) (<)
     A/B : f   B : g ⇒ A : f(g) (>)
     • Equivalent to function application
     • Two directions: forward and backward
       - Determined by slash direction

  12. CCG Operations: Application
     B : g   A\B : f ⇒ A : f(g) (<)
     A/B : f   B : g ⇒ A : f(g) (>)
     In the backward rule, B : g is the argument, A\B : f is the function and A : f(g) is the result
     • Equivalent to function application
     • Two directions: forward and backward
       - Determined by slash direction

  14. Parsing with CCGs
     CCG is fun
     Combine categories using operators
     A/B : f   B : g ⇒ A : f(g) (>)
     is fun ⇒ S\NP : λx.fun(x)

  15. Parsing with CCGs
     CCG is fun
     Combine categories using operators
     B : g   A\B : f ⇒ A : f(g) (<)
     CCG is fun ⇒ S : fun(CCG)
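The two application rules and the "CCG is fun" derivation can be replayed in a few lines. This is a sketch under stated assumptions: `Slash`, `Cat`, `forward_apply` and `backward_apply` are hypothetical names, syntactic categories are modeled as nested `Slash` objects, and Python callables stand in for λ-terms.

```python
from dataclasses import dataclass
from typing import Any, Optional, Union


@dataclass(frozen=True)
class Slash:
    result: "Syn"
    direction: str   # "/" takes its argument on the right, "\\" on the left
    arg: "Syn"


Syn = Union[str, Slash]


@dataclass(frozen=True)
class Cat:
    syn: Syn
    sem: Any


def forward_apply(left: Cat, right: Cat) -> Optional[Cat]:
    """A/B : f   B : g  =>  A : f(g)   (>)"""
    s = left.syn
    if isinstance(s, Slash) and s.direction == "/" and s.arg == right.syn:
        return Cat(s.result, left.sem(right.sem))
    return None


def backward_apply(left: Cat, right: Cat) -> Optional[Cat]:
    """B : g   A\\B : f  =>  A : f(g)   (<)"""
    s = right.syn
    if isinstance(s, Slash) and s.direction == "\\" and s.arg == left.syn:
        return Cat(s.result, right.sem(left.sem))
    return None


ccg = Cat("NP", "CCG")
is_ = Cat(Slash(Slash("S", "\\", "NP"), "/", "ADJ"), lambda f: lambda x: f(x))
fun = Cat("ADJ", lambda x: f"fun({x})")

verb_phrase = forward_apply(is_, fun)          # S\NP : λx.fun(x)
sentence = backward_apply(ccg, verb_phrase)    # S : fun(CCG)
print(sentence.syn, sentence.sem)
```

Returning `None` when the slash direction or argument category does not match keeps the rules total, which is convenient when a parser tries every adjacent pair.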

  16. Parsing with CCGs
     square blue or round yellow pillow
     • Composed adjectives
     • Non-standard coordination

  17. CCG Operations: Composition
     A/B : f   B/C : g ⇒ A/C : λx.f(g(x)) (>B)
     B\C : g   A\B : f ⇒ A\C : λx.f(g(x)) (<B)
     • Equivalent to function composition*
     • Two directions: forward and backward
     * Formal definition of logical composition in supplementary slides

  18. CCG Operations: Composition
     A/B : f   B/C : g ⇒ A/C : λx.f(g(x)) (>B)
     B\C : g   A\B : f ⇒ A\C : λx.f(g(x)) (<B)
     The inputs carry f and g; the output carries f ∘ g
     • Equivalent to function composition*
     • Two directions: forward and backward
     * Formal definition of logical composition in supplementary slides
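The composition rules can be sketched the same way as application. The names below (`Slash`, `Cat`, `forward_compose`, `backward_compose`) are hypothetical, and objects are modeled as plain dictionaries purely for illustration; the example composes two already-shifted adjectives of category N/N.

```python
from typing import Any, Callable, NamedTuple, Optional


class Slash(NamedTuple):
    result: Any
    direction: str   # "/" or "\\"
    arg: Any


class Cat(NamedTuple):
    syn: Any
    sem: Callable


def forward_compose(left: Cat, right: Cat) -> Optional[Cat]:
    """A/B : f   B/C : g  =>  A/C : λx.f(g(x))   (>B)"""
    a, b = left.syn, right.syn
    if (isinstance(a, Slash) and isinstance(b, Slash)
            and a.direction == "/" and b.direction == "/" and a.arg == b.result):
        return Cat(Slash(a.result, "/", b.arg), lambda x: left.sem(right.sem(x)))
    return None


def backward_compose(left: Cat, right: Cat) -> Optional[Cat]:
    """B\\C : g   A\\B : f  =>  A\\C : λx.f(g(x))   (<B)"""
    g, f = left.syn, right.syn
    if (isinstance(g, Slash) and isinstance(f, Slash)
            and g.direction == "\\" and f.direction == "\\" and f.arg == g.result):
        return Cat(Slash(f.result, "\\", g.arg), lambda x: right.sem(left.sem(x)))
    return None


n_over_n = Slash("N", "/", "N")
square = Cat(n_over_n, lambda f: lambda x: f(x) and x["shape"] == "square")
blue = Cat(n_over_n, lambda f: lambda x: f(x) and x["color"] == "blue")
is_pillow = lambda x: x["kind"] == "pillow"

square_blue = forward_compose(square, blue)   # N/N : λf.λx.f(x) ∧ blue(x) ∧ square(x)
print(square_blue.sem(is_pillow)({"kind": "pillow", "shape": "square", "color": "blue"}))
```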

  19. CCG Operations: Type Shifting
     ADJ : λx.g(x) ⇒ N/N : λf.λx.f(x) ∧ g(x)
     PP : λx.g(x) ⇒ N\N : λf.λx.f(x) ∧ g(x)
     AP : λe.g(e) ⇒ S\S : λf.λe.f(e) ∧ g(e)
     AP : λe.g(e) ⇒ S/S : λf.λe.f(e) ∧ g(e)
     • Category-specific unary operations
     • Modify category type to take an argument
     • Helps in keeping a compact lexicon

  20. CCG Operations: Type Shifting
     Input ⇒ Output
     ADJ : λx.g(x) ⇒ N/N : λf.λx.f(x) ∧ g(x)
     PP : λx.g(x) ⇒ N\N : λf.λx.f(x) ∧ g(x)
     AP : λe.g(e) ⇒ S\S : λf.λe.f(e) ∧ g(e)
     AP : λe.g(e) ⇒ S/S : λf.λe.f(e) ∧ g(e)
     • Category-specific unary operations
     • Modify category type to take an argument
     • Helps in keeping a compact lexicon

  21. CCG Operations: Type Shifting
     Input ⇒ Output
     ADJ : λx.g(x) ⇒ N/N : λf.λx.f(x) ∧ g(x)
     PP : λx.g(x) ⇒ N\N : λf.λx.f(x) ∧ g(x)
     AP : λe.g(e) ⇒ S\S : λf.λe.f(e) ∧ g(e)
     AP : λe.g(e) ⇒ S/S : λf.λe.f(e) ∧ g(e) (topicalization)
     • Category-specific unary operations
     • Modify category type to take an argument
     • Helps in keeping a compact lexicon
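All four type-shifting rules share one semantic pattern, λf.λx.f(x) ∧ g(x), and differ only in the syntactic target. A minimal sketch, assuming a hypothetical `SHIFT_TARGETS` table and `type_shift` helper, with string syntax and Python predicates in place of λ-terms:

```python
# Input category -> output categories, as listed on the slide.
SHIFT_TARGETS = {
    "ADJ": ["N/N"],
    "PP": ["N\\N"],
    "AP": ["S\\S", "S/S"],
}


def type_shift(syntax, g):
    """Return every (syntax, semantics) pair obtained by shifting `syntax : g`."""
    shifted = lambda f: lambda x: f(x) and g(x)   # λf.λx.f(x) ∧ g(x)
    return [(target, shifted) for target in SHIFT_TARGETS.get(syntax, [])]


is_blue = lambda x: x["color"] == "blue"
is_pillow = lambda x: x["kind"] == "pillow"

[(new_syntax, blue_modifier)] = type_shift("ADJ", is_blue)
print(new_syntax, blue_modifier(is_pillow)({"kind": "pillow", "color": "blue"}))
```

Because the shift is computed on demand, the lexicon only needs the single ADJ entry for "blue", which is the compactness benefit the slide mentions.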

  22. CCG Operations: Coordination
     and ⊢ C : conj
     or ⊢ C : disj
     • Coordination is special cased
       - Specific rules perform coordination
       - Coordinating operators are marked with special lexical entries

  23. Parsing with CCGs
     square blue or round yellow pillow
     Lexical entries:
       square ⊢ ADJ : λx.square(x)
       blue ⊢ ADJ : λx.blue(x)
       or ⊢ C : disj
       round ⊢ ADJ : λx.round(x)
       yellow ⊢ ADJ : λx.yellow(x)
       pillow ⊢ N : λx.pillow(x)
     Type shifting (ADJ ⇒ N/N):
       square ⇒ N/N : λf.λx.f(x) ∧ square(x)
       blue ⇒ N/N : λf.λx.f(x) ∧ blue(x)
       round ⇒ N/N : λf.λx.f(x) ∧ round(x)
       yellow ⇒ N/N : λf.λx.f(x) ∧ yellow(x)
     Composition (>B):
       square blue ⇒ N/N : λf.λx.f(x) ∧ square(x) ∧ blue(x)
       round yellow ⇒ N/N : λf.λx.f(x) ∧ round(x) ∧ yellow(x)
     Coordination (<Φ>):
       square blue or round yellow ⇒ N/N : λf.λx.f(x) ∧ ((square(x) ∧ blue(x)) ∨ (round(x) ∧ yellow(x)))
     Application (>):
       square blue or round yellow pillow ⇒ N : λx.pillow(x) ∧ ((square(x) ∧ blue(x)) ∨ (round(x) ∧ yellow(x)))

  24. Parsing with CCGs
     square blue or round yellow pillow
     Use lexicon to match words and phrases with their categories

  25. Parsing with CCGs
     square blue or round yellow pillow
     Shift adjectives to combine
     ADJ : λx.g(x) ⇒ N/N : λf.λx.f(x) ∧ g(x)

  27. Parsing with CCGs
     square blue or round yellow pillow
     Compose pairs of adjectives
     A/B : f   B/C : g ⇒ A/C : λx.f(g(x)) (>B)

  28. Parsing with CCGs
     square blue or round yellow pillow
     Coordinate composed adjectives

  29. Parsing with CCGs
     square blue or round yellow pillow
     Apply coordinated adjectives to noun
     A/B : f   B : g ⇒ A : f(g) (>)
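The semantic side of this derivation can be checked mechanically. The sketch below is an illustration only: objects are plain dictionaries, `pred`, `shift`, `compose` and `disj` are hypothetical helpers, and coordination is realized as pointwise disjunction of the two N/N functions, which is an assumption about how the coordination rule is implemented. That choice is logically equivalent to the logical form on the slide, and the code verifies the equivalence on a small set of objects.

```python
from itertools import product


def pred(attr, value):
    """Build a one-place predicate such as λx.square(x) over dict objects."""
    return lambda x: x[attr] == value


square, round_ = pred("shape", "square"), pred("shape", "round")
blue, yellow = pred("color", "blue"), pred("color", "yellow")
pillow = pred("kind", "pillow")


def shift(g):
    """ADJ : λx.g(x)  =>  N/N : λf.λx.f(x) ∧ g(x)"""
    return lambda f: lambda x: f(x) and g(x)


def compose(f, g):
    """Semantics of >B: λx.f(g(x))"""
    return lambda x: f(g(x))


def disj(left, right):
    """Coordinate two N/N functions pointwise (assumed realization of the rule)."""
    return lambda f: lambda x: left(f)(x) or right(f)(x)


square_blue = compose(shift(square), shift(blue))
round_yellow = compose(shift(round_), shift(yellow))
noun = disj(square_blue, round_yellow)(pillow)   # final forward application (>)


def expected(x):
    """The logical form from the last line of the derivation."""
    return pillow(x) and ((square(x) and blue(x)) or (round_(x) and yellow(x)))


objects = [dict(kind=k, shape=s, color=c)
           for k, s, c in product(["pillow", "chair"], ["square", "round"], ["blue", "yellow"])]
matches = [o for o in objects if noun(o)]
print(matches)
```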

  30. Parsing with CCGs
     x (sentence): CCG is fun
     y (derivation):
       CCG ⊢ NP : CCG
       is ⊢ (S\NP)/ADJ : λf.λx.f(x)
       fun ⊢ ADJ : λx.fun(x)
       is fun ⇒ S\NP : λx.fun(x) (>)
       CCG is fun ⇒ S : fun(CCG) (<)
     z (logical form): fun(CCG)
     Lexical ambiguity + many parsing decisions → many potential trees and LFs

  31. Weighted Linear CCGs
     • Given a weighted linear model:
       - CCG lexicon Λ
       - Feature function f : X × Y → ℝ^m
       - Weights w ∈ ℝ^m
     • The best parse is: y* = argmax_y w · f(x, y)
     • We consider all possible parses y for sentence x given the lexicon Λ
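The argmax above is just a dot product followed by a maximum over candidate parses. A minimal sketch, assuming a hypothetical sparse feature function that counts the lexical entries and combinators used in a parse; the candidate parses, feature names and weights below are made up for illustration.

```python
from collections import Counter
from typing import Dict, List


def features(x: str, y: dict) -> Counter:
    """Hypothetical f(x, y): counts of lexical entries and combinators used in y."""
    return Counter(y["steps"])


def score(w: Dict[str, float], x: str, y: dict) -> float:
    """Sparse dot product w · f(x, y)."""
    return sum(w.get(name, 0.0) * count for name, count in features(x, y).items())


def best_parse(x: str, candidates: List[dict], w: Dict[str, float]) -> dict:
    """y* = argmax_y w · f(x, y) over an explicit candidate list."""
    return max(candidates, key=lambda y: score(w, x, y))


candidates = [
    {"lf": "fun(CCG)",
     "steps": ["lex:CCG:NP", "lex:is:(S\\NP)/ADJ", "lex:fun:ADJ", ">", "<"]},
    {"lf": "equals(CCG, fun)",
     "steps": ["lex:CCG:NP", "lex:is:(S\\NP)/NP", "lex:fun:NP", ">", "<"]},
]
w = {"lex:fun:ADJ": 1.0, "lex:fun:NP": -0.5}

best = best_parse("CCG is fun", candidates, w)
print(best["lf"])  # → fun(CCG)
```

In a real parser the candidate set is far too large to enumerate, which is why the search is folded into the chart parser instead.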

  32. Parsing Algorithms
     • Syntax-only CCG parsing has polynomial-time CKY-style algorithms
     • Parsing with semantics requires the entire category as chart signature
       - e.g., ADJ : λx.fun(x)
     • In practice, prune to top-N for each span
       - Approximate, but polynomial time
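Top-N pruning per span can be sketched as a small CKY-style loop. This is an assumption-laden illustration rather than the tutorial's parser: chart items are `(syntax, semantics, score)` triples, slashed categories are nested `(result, slash, argument)` tuples, only the two application rules are implemented, and the lexical scores are invented.

```python
import heapq


def combine(left, right):
    """Forward and backward application over (syntax, semantics, score) items."""
    (ls, lsem, lw), (rs, rsem, rw) = left, right
    out = []
    if isinstance(ls, tuple) and ls[1] == "/" and ls[2] == rs:    # A/B  B  => A
        out.append((ls[0], lsem(rsem), lw + rw))
    if isinstance(rs, tuple) and rs[1] == "\\" and rs[2] == ls:   # B  A\B  => A
        out.append((rs[0], rsem(lsem), lw + rw))
    return out


def cky(words, lexicon, beam=5):
    """Fill the chart bottom-up, keeping only the `beam` best items per span."""
    n = len(words)
    top = lambda items: heapq.nlargest(beam, items, key=lambda item: item[2])
    chart = {(i, i + 1): top(lexicon.get(w, [])) for i, w in enumerate(words)}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = [item
                    for k in range(i + 1, j)
                    for left in chart[i, k]
                    for right in chart[k, j]
                    for item in combine(left, right)]
            chart[i, j] = top(cell)   # the pruning step: approximate but bounded
    return chart[0, n]


LEXICON = {
    "CCG": [("NP", "CCG", 0.0)],
    "is": [((("S", "\\", "NP"), "/", "ADJ"), lambda f: lambda x: f(x), 0.0)],
    "fun": [("ADJ", lambda x: f"fun({x})", 1.0), ("N", "fun", 0.2)],
}

print(cky("CCG is fun".split(), LEXICON))
```

With a beam of 1 the noun reading of "fun" is discarded at the lexical cell, which shows how pruning trades completeness for a bounded chart.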

  33. More on CCGs
     • Generalized type-raising operations
     • Cross composition operations for cross serial dependencies
     • Compositional approaches to English intonation
     • and a lot more ... even Jazz
     [Steedman 1996; 2000; 2011; Granroth and Steedman 2012]

  34. Outline: Parsing | Learning | Modeling
     Parsing:
     • Lambda calculus
     • Parsing with Combinatory Categorial Grammars
     • Linear CCGs
     • Factored lexicons
     Online

  35. Learning
     Data → Learning Algorithm → CCG
     • What kind of data/supervision can we use?
     • What do we need to learn?

  36. Parsing as Structure Prediction
     show me flights to Boston
     Lexical entries:
       show me ⊢ S/N : λf.f
       flights ⊢ N : λx.flight(x)
       to ⊢ PP/NP : λy.λx.to(x, y)
       Boston ⊢ NP : BOSTON
     Derivation:
       to Boston ⇒ PP : λx.to(x, BOSTON) (>)
       to Boston ⇒ N\N : λf.λx.f(x) ∧ to(x, BOSTON) (type shifting)
       flights to Boston ⇒ N : λx.flight(x) ∧ to(x, BOSTON) (<)
       show me flights to Boston ⇒ S : λx.flight(x) ∧ to(x, BOSTON) (>)

  37. Learning CCG
     show me flights to Boston ⇒ S : λx.flight(x) ∧ to(x, BOSTON)
     • Lexicon
     • Combinators (predefined)
     • Weights w

  38. Supervised Data
     show me flights to Boston ⇒ S : λx.flight(x) ∧ to(x, BOSTON)

  39. Supervised Data
     show me flights to Boston ⇒ S : λx.flight(x) ∧ to(x, BOSTON)
     The derivation between the sentence and the logical form is latent

  40. Supervised Data
     Supervised learning is done from pairs of sentences and logical forms
     Show me flights to Boston
       λx.flight(x) ∧ to(x, BOSTON)
     I need a flight from baltimore to seattle
       λx.flight(x) ∧ from(x, BALTIMORE) ∧ to(x, SEATTLE)
     what ground transportation is available in san francisco
       λx.ground_transport(x) ∧ to_city(x, SF)
     [Zettlemoyer and Collins 2005; 2007]

  41. Weak Supervision
     • Logical form is latent
     • "Labeling" requires less expertise
     • Labels don't uniquely determine correct logical forms
     • Learning requires executing logical forms within a system and evaluating the result

  42. Weak Supervision: Learning from Query Answers
     What is the largest state that borders Texas?
     New Mexico
     [Clarke et al. 2010; Liang et al. 2011]

  43. Weak Supervision: Learning from Query Answers
     What is the largest state that borders Texas?
     New Mexico
     Candidate logical forms:
       argmax(λx.state(x) ∧ border(x, TX), λy.size(y))
       argmax(λx.river(x) ∧ in(x, TX), λy.size(y))
     [Clarke et al. 2010; Liang et al. 2011]

  44. Weak Supervision: Learning from Query Answers
     What is the largest state that borders Texas?
     New Mexico
     Candidate logical forms and their execution results:
       argmax(λx.state(x) ∧ border(x, TX), λy.size(y)) ⇒ New Mexico
       argmax(λx.river(x) ∧ in(x, TX), λy.size(y)) ⇒ Rio Grande
     [Clarke et al. 2010; Liang et al. 2011]
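Learning from query answers relies on executing each candidate logical form and comparing the result with the labeled answer. The sketch below illustrates that check on a toy database; the `DB` contents, the rough size values and the helper names are all assumptions made for the example, not data from the tutorial.

```python
# Toy database: sizes are rough illustrative values (areas and lengths), not exact data.
DB = {
    "state": ["New Mexico", "Oklahoma", "Arkansas", "Louisiana", "Ohio"],
    "river": ["Rio Grande", "Brazos"],
    "border": {("New Mexico", "TX"), ("Oklahoma", "TX"),
               ("Arkansas", "TX"), ("Louisiana", "TX")},
    "in": {("Rio Grande", "TX"), ("Brazos", "TX")},
    "size": {"New Mexico": 121000, "Oklahoma": 70000, "Arkansas": 53000,
             "Louisiana": 52000, "Ohio": 45000, "Rio Grande": 1900, "Brazos": 1300},
}


def largest_state_bordering_tx(db):
    """argmax(λx.state(x) ∧ border(x, TX), λy.size(y))"""
    entities = [x for x in db["state"] if (x, "TX") in db["border"]]
    return max(entities, key=db["size"].get)


def largest_river_in_tx(db):
    """argmax(λx.river(x) ∧ in(x, TX), λy.size(y))"""
    entities = [x for x in db["river"] if (x, "TX") in db["in"]]
    return max(entities, key=db["size"].get)


def consistent(candidates, db, answer):
    """Names of the candidate logical forms whose execution matches the answer."""
    return [name for name, lf in candidates.items() if lf(db) == answer]


CANDIDATES = {
    "state_lf": largest_state_bordering_tx,
    "river_lf": largest_river_in_tx,
}

print(consistent(CANDIDATES, DB, "New Mexico"))
```

Only the answer is labeled, so several logical forms could in principle execute to the same result; the check narrows the candidates but does not guarantee a unique correct one.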

  46. Weak Supervision: Learning from Demonstrations
     at the chair, move forward three steps past the sofa
     [Chen and Mooney 2011; Kim and Mooney 2012; Artzi and Zettlemoyer 2013b]

  47. Weak Supervision: Learning from Demonstrations
     at the chair, move forward three steps past the sofa
     Some examples from other domains:
     • Sentences and labeled game states [Goldwasser and Roth 2011]
     • Sentences and sets of physical objects [Matuszek et al. 2012]
     [Chen and Mooney 2011; Kim and Mooney 2012; Artzi and Zettlemoyer 2013b]

  48. Weak Supervision: Learning from Conversation Logs
     SYSTEM: how can I help you? (OPEN_TASK)
     USER: i'd like to fly to new york
     SYSTEM: flying to new york. (CONFIRM: from(fl, ATL)) leaving what city? (ASK: λx.from(fl, x))
     USER: from boston on june seven with american airlines
     SYSTEM: flying to new york. (CONFIRM: to(fl, NYC)) what date would you like to depart boston? (ASK: λx.date(fl, x) ∧ to(fl, BOS))
     USER: june seventh
     [CONVERSATION CONTINUES]
     [Artzi and Zettlemoyer 2011]

  49. Outline: Parsing | Learning | Modeling
     Learning:
     • Structured perceptron
     • A unified learning algorithm
     • Supervised learning
     • Weak supervision
     Online

  50. Structured Perceptron
     • Simple additive updates
       - Only requires efficient decoding (argmax)
       - Closely related to MaxEnt and other feature-rich models
       - Provably finds linear separator in finite updates, if one exists
     • Challenge: learning with hidden variables

  51. Structured Perceptron
     • Simple additive updates
       - Only requires efficient decoding (argmax)
       - Closely related to MaxEnt and other feature-rich models
       - Provably finds linear separator in finite updates, if one exists
     • Challenge: learning with hidden variables
     Derivations in the complete tutorial
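The additive update can be written in a few lines. This is a generic structured perceptron sketch, not the derivation from the complete tutorial: `train`, the candidate function and the one-feature-per-(word, label) feature function are hypothetical, and decoding is a plain `max` over an explicit candidate list.

```python
from collections import defaultdict


def score(w, feats):
    """Sparse dot product between the weights and a feature dictionary."""
    return sum(w[name] * value for name, value in feats.items())


def train(data, candidates, features, epochs=3):
    """On each mistake, add the gold features and subtract the predicted ones."""
    w = defaultdict(float)
    for _ in range(epochs):
        for x, gold in data:
            pred = max(candidates(x), key=lambda y: score(w, features(x, y)))  # decoding
            if pred != gold:
                for name, value in features(x, gold).items():
                    w[name] += value
                for name, value in features(x, pred).items():
                    w[name] -= value
    return w


# Toy task: choose a category for a single word.
candidates = lambda x: ["N", "ADJ"]
features = lambda x, y: {f"{x}:{y}": 1.0}
data = [("fun", "ADJ"), ("pillow", "N")]

w = train(data, candidates, features)
predict = lambda x: max(candidates(x), key=lambda y: score(w, features(x, y)))
print(predict("fun"), predict("pillow"))
```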

  52. Hidden Variable Perceptron
     • No known convergence guarantees
       - Log-linear version is non-convex
     • Simple and easy to implement
       - Works well with careful initialization
     • Modifications for semantic parsing
       - Lots of different hidden information
       - Can add a margin constraint, do probabilistic version, etc.
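When only the logical form is labeled, the derivation that produced it is hidden, so the update compares the best derivation overall with the best derivation that yields the gold logical form. The following is a sketch of that common variant under stated assumptions; the `update` helper, the candidate derivations and the feature names are all hypothetical.

```python
from collections import defaultdict


def score(w, derivation):
    """Sum of the weights of the steps (features) used by a derivation."""
    return sum(w[step] for step in derivation["steps"])


def update(w, candidates, gold_lf):
    """One hidden-variable perceptron step; returns True if the weights changed."""
    pred = max(candidates, key=lambda y: score(w, y))
    if pred["lf"] == gold_lf:
        return False
    good = [y for y in candidates if y["lf"] == gold_lf]
    if not good:
        return False                       # no derivation reaches the gold logical form
    best_good = max(good, key=lambda y: score(w, y))
    for step in best_good["steps"]:
        w[step] += 1.0
    for step in pred["steps"]:
        w[step] -= 1.0
    return True


GOLD = "λx.flight(x) ∧ to(x, BOSTON)"
CANDIDATES = [
    {"lf": "λx.flight(x) ∧ from(x, BOSTON)", "steps": ["lex:to:PP/NP:from", "shift:PP"]},
    {"lf": GOLD, "steps": ["lex:to:PP/NP:to", "shift:PP"]},
    {"lf": GOLD, "steps": ["lex:to:(N\\N)/NP:to"]},
]

w = defaultdict(float)
first = update(w, CANDIDATES, GOLD)    # wrong prediction, weights change
second = update(w, CANDIDATES, GOLD)   # now correct, nothing to do
print(first, second, max(CANDIDATES, key=lambda y: score(w, y))["lf"])
```

Two derivations reach the gold logical form here, and the update rewards whichever one currently scores highest, which is one reason initialization matters for this family of algorithms.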

  53. Unified Learning Algorithm
     • Handle various learning signals
     • Estimate parsing parameters
     • Induce lexicon structure
     • Related to loss-sensitive structured perceptron [Singh-Miller and Collins 2007]
