

  1. Parsing Combinatory Categorial Grammar with Answer Set Programming: Preliminary Report
  Yuliya Lierler (Computer Science Department, University of Kentucky)
  Peter Schüller (KBS Group, Institut für Informationssysteme, Technische Universität Wien)
  WLP, September 30, 2011
  Supported by: CRA/NSF 2010 Computing Innovation Fellowship; Vienna Science and Technology Fund (WWTF) project ICT08-020

  2. Natural Language Parsing
  ◮ Required for transforming natural language into KR language(s)
  ◮ First step: obtaining sentence structure
  ◮ Example: "John saw the astronomer with the telescope." ⇒ two distinct structures, i.e. "structural ambiguity":
    John [saw the astronomer] [with the telescope].
    John saw [the astronomer [with the telescope]].
  ◮ "Wide-coverage parsing" ⇒ parsing unrestricted natural language (e.g., newspaper text)

  3. This Work
  ◮ Goals of this work:
    ◮ wide-coverage parsing
    ◮ obtaining all distinct structures
  ◮ Approach:
    ◮ parsing represented as planning
    ◮ Answer Set Programming for realizing the planning
    ◮ use of ASP with function symbols
    ◮ optimization for best-effort parsing
    ◮ framework using python, gringo, clasp
    ◮ visualization

  4. Planning, Answer Set Programming
  Planning:
  ◮ actions, executability, effects
  ◮ initial and goal state
  ◮ ⇒ find a sequence of actions leading from the initial to the goal state
  Answer Set Programming:
  ◮ declarative programming paradigm
  ◮ logic programming rules and function symbols
  ◮ stable model semantics
  ◮ guess & check, resp. the GENERATE-DEFINE-TEST paradigm

  5. Using ASP for Planning
  ◮ GENERATE all possible action sequences
  ◮ DEFINE action effects starting from the initial state
  ◮ TEST executability
  ◮ TEST goal conditions

  6. Combinatory Categorial Grammar (1)
  ◮ Categories for words and constituents:
    ◮ atomic categories, e.g.: noun N, noun phrase NP, sentence S
    ◮ complex categories specify argument and result, e.g.:
      ◮ S\NP ⇒ expects an NP to the left, result is S
      ◮ (S\NP)/NP ⇒ expects an NP to the right, result is S\NP
  ◮ Given a CCG lexicon ⇒ represent words by their corresponding categories, e.g. for "The dog bit John":
    The := NP/N   dog := N   bit := (S\NP)/NP   John := NP
  ◮ Words may have multiple categories ⇒ all combinations must be handled
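The category notation above can be made concrete with a small sketch (my own illustration, not code from the toolkit): atomic categories as strings, complex categories as nested tuples, mirroring the rfunc/lfunc function symbols that appear later in the ASP encoding.

```python
def fwd(result, arg):
    """Forward-slash category result/arg: expects arg to the right."""
    return ("/", result, arg)

def bwd(result, arg):
    """Backslash category result\\arg: expects arg to the left."""
    return ("\\", result, arg)

def show(cat, top=True):
    """Render a category tuple in the usual slash notation."""
    if isinstance(cat, str):
        return cat
    slash, result, arg = cat
    s = show(result, top=False) + slash + show(arg, top=False)
    return s if top else "(" + s + ")"

# Hypothetical lexicon for "The dog bit John":
lexicon = {
    "The":  fwd("NP", "N"),             # NP/N
    "dog":  "N",
    "bit":  fwd(bwd("S", "NP"), "NP"),  # (S\NP)/NP
    "John": "NP",
}

print(show(lexicon["bit"]))  # prints (S\NP)/NP
```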

  7./8. Combinatory Categorial Grammar (2)
  ◮ Combinators are grammar rules that combine categories:
    ◮ application:   A/B  B  ⇒  A   (>)
    ◮ composition:   A/B  B/C  ⇒  A/C   (>B)
    ◮ type raising:  A  ⇒  B/(B\A)   (>T)
  ◮ Instantiations of combinators are used for parsing, e.g.: NP/N  N  ⇒  NP   (>)
  ◮ Example derivation, resp. parse tree, for "The dog bit John" (The := NP/N, dog := N, bit := (S\NP)/NP, John := NP):
    NP/N  N  ⇒  NP              (>)
    (S\NP)/NP  NP  ⇒  S\NP      (>)
    NP  S\NP  ⇒  S              (<)
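The three combinators can be sketched directly as Python functions over the tuple encoding of categories (an illustration under my own encoding assumption, not the toolkit's implementation):

```python
def forward_application(left, right):
    """> :  A/B  B  =>  A"""
    if isinstance(left, tuple) and left[0] == "/" and left[2] == right:
        return left[1]
    return None

def backward_application(left, right):
    """< :  B  A\\B  =>  A"""
    if isinstance(right, tuple) and right[0] == "\\" and right[2] == left:
        return right[1]
    return None

def forward_composition(left, right):
    """>B :  A/B  B/C  =>  A/C"""
    if (isinstance(left, tuple) and left[0] == "/" and
            isinstance(right, tuple) and right[0] == "/" and
            left[2] == right[1]):
        return ("/", left[1], right[2])
    return None

def forward_type_raising(cat, b):
    """>T :  A  =>  B/(B\\A), for a chosen target category B"""
    return ("/", b, ("\\", b, cat))

# The instantiation from the slide:  NP/N  N  =>  NP
assert forward_application(("/", "NP", "N"), "N") == "NP"
```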

  9./10. Using Planning to Realize CCG (1)
  ◮ State = Abstract Sequence Representation (ASR): an ASR contains categories, numbered from left to right.
    Example: "The dog bit John" with The := NP/N, dog := N, bit := (S\NP)/NP, John := NP is represented by the initial-state ASR
    [NP/N₁, N₂, (S\NP)/NP₃, NP₄]
  ◮ Actions = combinators that operate on a precondition ASR. Combinators yield a single result category, which is numbered like the leftmost precondition category.
    Example: NP/N₁  N₂  ⇒  NP₁   (>)

  11. Using Planning to Realize CCG (2)
  ◮ Action effect = replace the precondition sequence by the result category. Example:
    time step 1: ASR = [NP₁, (S\NP)/NP₃, NP₄]   action (>): (S\NP)/NP₃  NP₄  ⇒  S\NP₃
    time step 2: ASR = [NP₁, S\NP₃]             action (<): NP₁  S\NP₃  ⇒  S₁
    time step 3: ASR = [S₁]
  ◮ Goal state = ASR [S₁]
  ◮ Concurrent execution of actions is possible.
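The state transitions above can be simulated with a few lines of Python (a sketch of the planning view only, not the actual ASP encoding): an ASR is a list of (position, category) pairs, and an action replaces two adjacent entries by the result category at the leftmost position.

```python
def step(asr, i, result):
    """Replace asr[i] and asr[i+1] by result, keeping the left position."""
    pos = asr[i][0]
    return asr[:i] + [(pos, result)] + asr[i + 2:]

# State after the first > action on "The dog bit John":
asr = [(1, "NP"), (3, "(S\\NP)/NP"), (4, "NP")]

asr = step(asr, 1, "S\\NP")   # > : (S\NP)/NP  NP  =>  S\NP
assert asr == [(1, "NP"), (3, "S\\NP")]

asr = step(asr, 0, "S")       # < : NP  S\NP  =>  S
assert asr == [(1, "S")]      # goal state [S1] reached
```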

  12./13. Spurious CCG Parses
  ◮ Redundant parse trees yield the same semantic result. Example, "The dog bit John" with The := NP/N : λα.α, dog := N : d, bit := (S\NP)/NP : λαβ.b(β,α), John := NP : j:
    NP/N  N  ⇒  NP : d                   (>)
    (S\NP)/NP  NP  ⇒  S\NP : λβ.b(β,j)   (>)
    NP  S\NP  ⇒  S : b(d,j)              (<)
  versus
    NP/N  N  ⇒  NP : d                                                (>)
    NP : d  ⇒  S/(S\NP) : λγδ.γ(d,δ)                                  (>T)
    S/(S\NP)  (S\NP)/NP  ⇒  S/NP : λδ.[λαβ.b(β,α)](d,δ) = λδ.b(d,δ)   (>B)
    S/NP  NP  ⇒  S : b(d,j)                                           (>)
  ◮ Such parse trees are called spurious and should be suppressed.
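Why the two parses are spurious duplicates can be checked mechanically: with Python lambdas standing in for the slide's λ-terms (curried here for convenience; an illustration, not the paper's formalism), both derivations compute the same semantics b(d, j).

```python
def b(beta, alpha):
    """Uninterpreted predicate for "bit"."""
    return ("b", beta, alpha)

d, j = "d", "j"                                   # "the dog", "John"
bit = lambda alpha: lambda beta: b(beta, alpha)   # (S\NP)/NP : curried λαβ. b(β, α)

# Derivation 1:  (The dog)  combines with  (bit John):
parse1 = bit(j)(d)                                # S : b(d, j)

# Derivation 2: type-raise "The dog", compose with "bit", apply to "John":
raised = lambda f: f(d)                           # S/(S\NP) : λγ. γ(d)   (>T)
s_slash_np = lambda delta: raised(bit(delta))     # S/NP : λδ. b(d, δ)    (>B)
parse2 = s_slash_np(j)                            # S : b(d, j)           (>)

assert parse1 == parse2 == ("b", "d", "j")        # same semantics => spurious
```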

  14./15. Spurious Parse Normalization
  ASPCCGTK implements known methods for eliminating spurious parses:
  ◮ Allow only one branching direction for functional compositions. For W/X X/Y Y/Z, composing left-to-right (W/X X/Y ⇒ W/Y, then W/Y Y/Z ⇒ W/Z, both by >B) and composing right-to-left (X/Y Y/Z ⇒ X/Z, then W/X X/Z ⇒ W/Z, both by >B) yield the same result W/Z; normalize to a single branching direction.
  ◮ Disallow certain combinations of rule applications. For X/Y Y/Z Z, composing first (X/Y Y/Z ⇒ X/Z by >B, then X/Z Z ⇒ X by >) duplicates applying first (Y/Z Z ⇒ Y by >, then X/Y Y ⇒ X by >); normalize to the application-only derivation.
  ◮ Implemented as executability conditions of actions.
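That both bracketings of the composition chain produce the same category is easy to verify in code (a sketch over the tuple encoding of categories, my own illustration):

```python
def compose(left, right):
    """>B :  A/B  B/C  =>  A/C, on ("/", result, arg) tuples."""
    assert left[0] == "/" and right[0] == "/" and left[2] == right[1]
    return ("/", left[1], right[2])

wx = ("/", "W", "X")   # W/X
xy = ("/", "X", "Y")   # X/Y
yz = ("/", "Y", "Z")   # Y/Z

# Left-branching and right-branching composition agree on W/Z,
# which is why one branching direction can be banned outright:
assert compose(compose(wx, xy), yz) == ("/", "W", "Z")
assert compose(wx, compose(xy, yz)) == ("/", "W", "Z")
```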

  16. ASP Encoding (State Representation)
  ◮ posCat(p, c, t) ⇒ category c is annotated with position p at time t
  ◮ posAdjacent(pL, pR, t) ⇒ position pL is adjacent to position pR at time t
  ◮ categories are represented via the function symbols rfunc and lfunc, and strings
  Example: "The dog bit John." is represented as the EDB
    posCat(1, rfunc("NP","N"), 0).
    posCat(2, "N", 0).
    posCat(3, rfunc(lfunc("S","NP"), "NP"), 0).
    posCat(4, "NP", 0).
    posAdjacent(1, 2, 0).  posAdjacent(2, 3, 0).  posAdjacent(3, 4, 0).
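Generating such an EDB from a tagged sentence is straightforward; the following sketch is modeled on the slide's rfunc/lfunc scheme (it is my own reconstruction, not the toolkit's actual generator code):

```python
def to_term(cat):
    """Turn a category tuple into an rfunc/lfunc term string."""
    if isinstance(cat, str):
        return '"%s"' % cat
    slash, result, arg = cat
    name = "rfunc" if slash == "/" else "lfunc"
    return "%s(%s,%s)" % (name, to_term(result), to_term(arg))

def edb(categories):
    """Emit posCat/3 and posAdjacent/3 facts for time step 0."""
    facts = ["posCat(%d,%s,0)." % (i, to_term(c))
             for i, c in enumerate(categories, start=1)]
    facts += ["posAdjacent(%d,%d,0)." % (i, i + 1)
              for i in range(1, len(categories))]
    return facts

# Categories for "The dog bit John": NP/N, N, (S\NP)/NP, NP
cats = [("/", "NP", "N"), "N", ("/", ("\\", "S", "NP"), "NP"), "NP"]
for fact in edb(cats):
    print(fact)
```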

  17. ASP Encoding (Action Generation)
  ◮ GENERATE part of the encoding for A/B B ⇒ A (>):
    { occurs(ruleFwdAppl, L, R, T) } ←
        posCat(L, rfunc(A, B), T), posCat(R, B, T),
        posAdjacent(L, R, T),
        not ban(ruleFwdAppl, L, T),
        time(T), T < maxsteps.
  ◮ DEFINE part for ban/2 realizes the normalizations

  18. ASP Encoding (Effects)
  ◮ DEFINE part of the encoding for the explicit effect of A/B B ⇒ A (>):
    posCat(L, A, T+1) ←
        occurs(ruleFwdAppl, L, R, T),
        posCat(L, rfunc(A, B), T),
        time(T), T < maxsteps.
  ◮ DEFINE part of the encoding for the implicit effect called "affectedness":
    posAffected(L, T+1) ←
        occurs(Act, L, R, T), binary(Act),
        time(T), T < maxsteps.

  19. ASP Encoding (Inertia and Goal)
  ◮ DEFINE part of the encoding for ASR inertia:
    posCat(P, C, T+1) ←
        posCat(P, C, T),
        not posAffected(P, T+1),
        time(T), T < maxsteps.
  ◮ TEST forbids invalid concurrency
  ◮ TEST enforces reaching the goal state

  20. ASPCCG Toolkit (ASPCCGTK)
  Pipeline: a sentence (string) is tagged by the C&C supertagger, or the words are looked up in a dictionary, yielding a sequence of words with category tags for each word; the parser (ccg.asp + GRINGO + CLASP) computes answer sets; the visualisation step (ccg2idpdraw.asp + GRINGO + CLASP + IDPDraw) renders them.
  ◮ implemented in ASP, controlled by python
  ◮ using/extending the BioASP library from Potassco
  ◮ http://www.kr.tuwien.ac.at/staff/ps/aspccgtk/

  21. Visualisation
  [Figure: rendered parse tree for "The dog bit John": NP/N and N combine into NP (>), (S\NP)/NP and NP into S\NP (>), and NP and S\NP into S (<)]
  ◮ uses IDPDraw
  ◮ in python: convert rfunc(NP, N) into "NP/N"
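The term-to-string conversion mentioned on the slide can be sketched as follows (the toolkit's own python code may differ; this is an illustration over a tuple representation of the ground terms):

```python
def term_to_slash(term):
    """Render ("rfunc", a, b) as "A/B" and ("lfunc", a, b) as "A\\B"."""
    if isinstance(term, str):
        return term

    name, result, arg = term
    slash = "/" if name == "rfunc" else "\\"

    def paren(t):
        # Parenthesize complex subcategories only.
        s = term_to_slash(t)
        return s if isinstance(t, str) else "(" + s + ")"

    return paren(result) + slash + paren(arg)

assert term_to_slash(("rfunc", "NP", "N")) == "NP/N"
assert term_to_slash(("rfunc", ("lfunc", "S", "NP"), "NP")) == "(S\\NP)/NP"
```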

  22./23. Best-Effort Parsing
  ◮ Assume that, in our lexicon, "bit" always requires someone being bitten (i.e., assume there is no intransitive category for "bit").
  ◮ "The dog bit" is then not recognized as a sentence.
  ◮ ASPCCGTK will not find a full parse but provides a best-effort parse (The := NP/N, dog := N, bit := (S\NP)/NP):
    NP/N  N  ⇒  NP                  (>)
    NP  ⇒  S/(S\NP)                 (>T)
    S/(S\NP)  (S\NP)/NP  ⇒  S/NP    (>B)
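The idea of best-effort parsing can be sketched as "reduce as far as possible and report what remains" (a deliberately simplified greedy illustration using only application, without the type raising and composition the toolkit's optimization-based encoding can exploit):

```python
def reduce_once(asr):
    """Apply the first available > or < reduction, or return None."""
    for i in range(len(asr) - 1):
        l, r = asr[i], asr[i + 1]
        if isinstance(l, tuple) and l[0] == "/" and l[2] == r:    # >
            return asr[:i] + [l[1]] + asr[i + 2:]
        if isinstance(r, tuple) and r[0] == "\\" and r[2] == l:   # <
            return asr[:i] + [r[1]] + asr[i + 2:]
    return None

def best_effort(asr):
    """Reduce greedily; the result may fall short of the goal [S]."""
    nxt = reduce_once(asr)
    while nxt is not None:
        asr = nxt
        nxt = reduce_once(asr)
    return asr

# "The dog bit" with only a transitive category for "bit":
asr = [("/", "NP", "N"), "N", ("/", ("\\", "S", "NP"), "NP")]
# The dog combines into NP; the dangling (S\NP)/NP blocks a full parse.
assert best_effort(asr) == ["NP", ("/", ("\\", "S", "NP"), "NP")]
```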

  24. Recent, Ongoing and Future Work
  Recent and ongoing:
  ◮ using the incremental solver ICLINGO
  ◮ performance evaluation on the large corpus CCGbank
  ◮ different encodings (configuration, CYK) (⇒ there the main effort lies in grounding)
  Future:
  ◮ add features to make ASPCCGTK comparable to C&C (probably the most widely used wide-coverage CCG parser)
  ◮ make compatible with Boxer
  ◮ correctness evaluation on a large corpus
