What Do Recurrent Neural Network Grammars Learn About Syntax?
Authors: Adhiguna Kuncoro, Miguel Ballesteros, Lingpeng Kong, Chris Dyer, Graham Neubig, Noah A. Smith
Presented by: Triveni Putti
Paper link: https://arxiv.org/pdf/1611.05774.pdf
‐ Experiments and results
‐ Headedness in phrases
Generating "The hungry cat meows." with a top-down RNNG:

Action       Stack                                    Terminals
NT(S)        (S
NT(NP)       (S (NP
GEN(The)     (S (NP The                               The
GEN(hungry)  (S (NP The hungry                        The hungry
GEN(cat)     (S (NP The hungry cat                    The hungry cat
REDUCE       (S (NP The hungry cat)                   The hungry cat
NT(VP)       (S (NP The hungry cat) (VP               The hungry cat
GEN(meows)   (S (NP The hungry cat) (VP meows         The hungry cat meows
REDUCE       (S (NP The hungry cat) (VP meows)        The hungry cat meows
GEN(.)       (S (NP The hungry cat) (VP meows) .      The hungry cat meows.
REDUCE       (S (NP The hungry cat) (VP meows) .)     The hungry cat meows.
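The transition sequence above can be simulated with a minimal sketch (my own illustration, not the authors' implementation): NT(X) pushes an open nonterminal onto the stack, GEN(w) generates a terminal, and REDUCE pops completed children back to the last open nonterminal and pushes the finished constituent.

```python
def run_rnng(actions):
    """Apply a sequence of RNNG transitions; return (tree, sentence)."""
    stack, terminals = [], []

    def is_open(item):
        # An open nonterminal like "(NP" has no matching close paren yet.
        return item.startswith("(") and not item.endswith(")")

    for act in actions:
        if act.startswith("NT("):
            stack.append("(" + act[3:-1])        # open nonterminal, e.g. "(NP"
        elif act.startswith("GEN("):
            word = act[4:-1]
            stack.append(word)
            terminals.append(word)               # generated words form the sentence
        elif act == "REDUCE":
            children = []
            while not is_open(stack[-1]):
                children.append(stack.pop())     # pop completed children
            open_nt = stack.pop()
            stack.append(open_nt + " " + " ".join(reversed(children)) + ")")
    return " ".join(stack), " ".join(terminals)

actions = ["NT(S)", "NT(NP)", "GEN(The)", "GEN(hungry)", "GEN(cat)", "REDUCE",
           "NT(VP)", "GEN(meows)", "REDUCE", "GEN(.)", "REDUCE"]
tree, sent = run_rnng(actions)
# tree: (S (NP The hungry cat) (VP meows) .)
```

In the real model each of these actions is scored by an LSTM over the stack, buffer, and action history; this sketch only replays the bookkeeping.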
[Figure: on REDUCE, the composition function pops the children of the open NP ("The hungry cat") and replaces them with a single composed NP representation.]
Performance on PTB
+ indicates systems that use additional unparsed data (semi-supervised)
Perplexity
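The language-modeling results are reported in perplexity. As a reminder (a generic sketch, not the paper's evaluation code), perplexity is the exponentiated average negative log-likelihood the model assigns to each word:

```python
import math

def perplexity(word_probs):
    # Exponentiated average negative log-likelihood per word:
    # lower is better; a uniform model over a V-word vocabulary scores V.
    avg_nll = -sum(math.log(p) for p in word_probs) / len(word_probs)
    return math.exp(avg_nll)

# Toy per-word probabilities (not real model output):
ppl = perplexity([0.5, 0.25, 0.125])
```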
Average perplexity of the attention weights: the average number of “choices” for each nonterminal category
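The “choices” interpretation can be made concrete with a small sketch (my own illustration, assuming the attention weights are a normalized distribution): the perplexity of an attention vector is exp(entropy), the effective number of children attended to. A uniform distribution over n children has perplexity n (no headedness); a one-hot distribution has perplexity 1 (a single head).

```python
import math

def attention_perplexity(weights):
    # exp(entropy): the effective number of children the
    # composition function spreads its attention over.
    entropy = -sum(w * math.log(w) for w in weights if w > 0)
    return math.exp(entropy)

uniform = attention_perplexity([0.25, 0.25, 0.25, 0.25])  # 4.0: no headedness
peaked = attention_perplexity([0.97, 0.01, 0.01, 0.01])   # ~1.2: nearly one head
```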
Compared to the uniform distribution baseline (which would indicate no headedness), the learned attention distributions are quite peaked around certain components.
Attention weight vectors for sample PPs
Attention weight vectors for sample NPs:
‐ Simple NPs: rightmost nouns > adjectives > determiners ≈ possessive determiners (6, 7)
‐ Complex NPs: either the first (8) or the last noun (9) can receive high attention; for conjunctions of multiple NPs, the conjunction gets the most attention (10)
Attention weight vectors for sample VPs:
‐ Simple VPs: NP > verb (9); negation is assigned non-trivial weight (7, 8)
‐ Other VPs: for conjunctions of multiple VPs, the conjunction gets the most attention (10)