  1. Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling. Diego Marcheggiani and Ivan Titov, University of Amsterdam / University of Edinburgh. EMNLP 2017, Copenhagen.

  2. Contributions: • Syntactic Graph Convolutional Networks • State-of-the-art semantic role labeling model • English and Chinese. (Example sentence on the slide: "Sequa makes and repairs jet engines.")

  3. Semantic Role Labeling: predicting the predicate-argument structure of a sentence. Example: "Sequa makes and repairs jet engines."

  4.-8. Semantic Role Labeling (animation build): • Discover and disambiguate predicates: make.01, repair.01, engine.01. • Identify arguments and label them with their semantic roles: Sequa is the A0 of make.01 and of repair.01, engines is their A1, and jet is the A1 of engine.01.

  9. Semantic Role Labeling: • Only the head of an argument is labeled. • It is a sequence labeling task for each predicate. • This talk focuses on argument identification and labeling.

  10. Related work: • SRL systems that use syntax with simple NN architectures: [FitzGerald et al., 2015], [Roth and Lapata, 2016]. • Recent models that ignore this linguistic bias: [Zhou and Xu, 2015], [He et al., 2017], [Marcheggiani et al., 2017].

  11. Motivations: some semantic dependencies are mirrored in the syntactic graph. Example: "Sequa makes and repairs jet engines." with syntactic edges SBJ, OBJ, NMOD, COORD, CONJ and ROOT, and semantic edges creator / creation around "makes".

  12. Motivations (continued): but not all of them are mirrored (e.g., the repairer / entity repaired dependencies of "repairs" have no direct syntactic counterpart); the syntax-semantics interface is not trivial.

  13. Outline (Encoding Sentences with Graph Convolutional Networks): • Graph Convolutional Networks (GCNs) [Kipf and Welling, 2017] • Syntactic GCNs • Semantic Role Labeling Model • Experiments • Conclusions.

  14. Graph Convolutional Networks (message passing) [Kipf and Welling, 2017]: an undirected graph.

  15. Graph Convolutional Networks (message passing) [Kipf and Welling, 2017]: update of the blue node in the undirected graph.

  16. Graph Convolutional Networks (message passing) [Kipf and Welling, 2017]: the blue node i is updated as h_i' = ReLU( W_0 h_i + Σ_{j ∈ N(i)} W_1 h_j ), where W_0 h_i is the self-loop term and the sum runs over the neighborhood N(i).

  17. GCNs pipeline [Kipf and Welling, 2017]: the input X = H^(0), the initial feature representation of the nodes, is passed through hidden layers H^(1), H^(2), ...; the output Z = H^(n) is a representation of each node informed by its neighborhood.

  18. GCNs pipeline [Kipf and Welling, 2017] (continued): we extend GCNs to syntactic dependency trees.
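To make the update rule on slide 16 concrete, here is a minimal PyTorch sketch of one GCN layer over an undirected graph. This is not the authors' code: the class name GCNLayer, the edge-list input format, and all dimensions are illustrative assumptions.

import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    # One GCN layer over an undirected graph, following slide 16:
    # h_i' = ReLU( W0 h_i + sum_{j in N(i)} W1 h_j )
    def __init__(self, dim):
        super().__init__()
        self.w_self = nn.Linear(dim, dim)    # W0: self-loop transform
        self.w_neigh = nn.Linear(dim, dim)   # W1: neighbor transform

    def forward(self, h, edges):
        # h: (num_nodes, dim); edges: list of undirected (i, j) index pairs
        src = torch.tensor([i for i, j in edges] + [j for i, j in edges])
        dst = torch.tensor([j for i, j in edges] + [i for i, j in edges])
        msgs = self.w_neigh(h)[src]                    # W1 h_j, one row per message
        out = self.w_self(h).index_add(0, dst, msgs)   # sum messages into receivers
        return torch.relu(out)

# toy usage: 4 nodes with 8-dimensional features
layer = GCNLayer(8)
z = layer(torch.randn(4, 8), edges=[(0, 1), (1, 2), (1, 3)])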

  19. Outline: • Graph Convolutional Networks (GCNs) • Syntactic GCNs • Semantic Role Labeling Model • Experiments • Conclusions.

  20. Example: the sentence "Lane disputed those estimates" with dependency edges SBJ (disputed -> Lane), OBJ (disputed -> estimates), and NMOD (estimates -> those).

  21.-25. Example (animation build): each word's new representation is ReLU(Σ ·) over its incoming messages. Every node sends a message through its self-loop (transformed by W^(1)_self) and along each dependency edge, with a separate matrix per label and direction: W^(1)_subj, W^(1)_obj, W^(1)_nmod for edges traversed along their direction, and W^(1)_subj', W^(1)_obj', W^(1)_nmod' for edges traversed against it.

  26. Example: stacking GCN layers widens the syntactic neighborhood; a second layer (matrices W^(2)) lets information travel two hops along the dependency tree.

  27. Syntactic GCNs: h_v^(k+1) = ReLU( Σ_{u ∈ N(v)} W^(k)_{L(u,v)} h_u^(k) + b^(k)_{L(u,v)} ).

  28. Syntactic GCNs: the sum runs over the syntactic neighborhood N(v).

  29. Syntactic GCNs: each term W^(k)_{L(u,v)} h_u^(k) + b^(k)_{L(u,v)} is a message from u to v.

  30. Syntactic GCNs: messages are direction- and label-specific, and the self-loop is included in N(v).

  31. Syntactic GCNs: • Overparametrized: one matrix for each label-direction pair. • Fix: let the matrix depend only on the direction of the edge, W^(k)_{L(u,v)} = V^(k)_{dir(u,v)}, keeping the biases label-specific.
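A hedged PyTorch sketch of the syntactic GCN layer defined on slides 27-31, again not the released implementation: the direction-specific matrices V_in, V_out, V_self and the per-label biases follow the slides, while the arc-list input format and the biases shared across the two directions of a label are simplifying assumptions.

import torch
import torch.nn as nn

class SyntacticGCNLayer(nn.Module):
    # h_v^(k+1) = ReLU( sum_{u in N(v)} V_dir(u,v) h_u^(k) + b_L(u,v) ),
    # one matrix per direction (along an arc / against it / self-loop)
    # and one bias vector per dependency label.
    def __init__(self, dim, num_labels):
        super().__init__()
        self.v_out = nn.Linear(dim, dim, bias=False)   # along the arc (head -> dependent)
        self.v_in = nn.Linear(dim, dim, bias=False)    # against the arc (dependent -> head)
        self.v_self = nn.Linear(dim, dim, bias=False)  # self loop
        self.bias = nn.Embedding(num_labels + 1, dim)  # label biases; row 0 = self loop
        # NOTE: the paper keeps separate biases for the two directions of a
        # label; they are shared here to keep the sketch short.

    def forward(self, h, arcs):
        # h: (n, dim); arcs: list of (head, dep, label_id) with label_id >= 1
        out = self.v_self(h) + self.bias.weight[0]     # self-loop messages
        recv, msgs = [], []
        for head, dep, lab in arcs:
            b = self.bias.weight[lab]
            msgs.append(self.v_out(h[head]) + b)       # message head -> dependent
            recv.append(dep)
            msgs.append(self.v_in(h[dep]) + b)         # message dependent -> head
            recv.append(head)
        if msgs:
            out = out.index_add(0, torch.tensor(recv), torch.stack(msgs))
        return torch.relu(out)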

  32.-34. Edge-wise gates (animation build): • Not all edges are equally important. • We should not blindly rely on predicted syntax. • Gates decide the "importance" of each message: every message entering a node's ReLU(Σ ·) update is scaled by a gate g. (Figure: "Lane disputed those estimates" with NMOD, SBJ, OBJ edges.)

  35. Edge-wise gates: the gates depend on the nodes and on the edges.
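Extending the SyntacticGCNLayer sketch above with the edge-wise gates of slides 32-35: each message is scaled by a scalar gate computed from the sending node. The parameterization shown here (one linear gate per direction, with label-specific gate biases omitted) is a simplification, not the paper's exact formulation.

class GatedSyntacticGCNLayer(SyntacticGCNLayer):
    # Scales every message by a scalar gate g = sigmoid(w_dir . h_u + b),
    # computed from the sending node, so unreliable predicted edges can be
    # down-weighted.
    def __init__(self, dim, num_labels):
        super().__init__(dim, num_labels)
        self.g_out = nn.Linear(dim, 1)    # one gate parameter vector per direction
        self.g_in = nn.Linear(dim, 1)
        self.g_self = nn.Linear(dim, 1)

    def forward(self, h, arcs):
        gate = torch.sigmoid(self.g_self(h))                  # (n, 1) self-loop gates
        out = gate * (self.v_self(h) + self.bias.weight[0])
        recv, msgs = [], []
        for head, dep, lab in arcs:
            b = self.bias.weight[lab]
            g = torch.sigmoid(self.g_out(h[head]))            # gate for head -> dependent
            msgs.append(g * (self.v_out(h[head]) + b))
            recv.append(dep)
            g = torch.sigmoid(self.g_in(h[dep]))              # gate for dependent -> head
            msgs.append(g * (self.v_in(h[dep]) + b))
            recv.append(head)
        if msgs:
            out = out.index_add(0, torch.tensor(recv), torch.stack(msgs))
        return torch.relu(out)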

  36. Outline: • Graph Convolutional Networks (GCNs) • Syntactic GCNs • Semantic Role Labeling Model • Experiments • Conclusions.

  37. Our Model: • Word representation • Bidirectional LSTM encoder • GCN encoder • Local role classifier.

  38. Word representation: each word is represented by the concatenation of • pretrained word embeddings • (trained) word embeddings • POS-tag embeddings • predicate lemma embeddings • a predicate flag. (Figure: word representations for "Lane disputed those estimates".)

  39. BiLSTM encoder: • Encode each word with its left and right context. • Stacked BiLSTM with J layers, applied on top of the word representations. (Figure: BiLSTM over "Lane disputed those estimates".)
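A sketch of the word representation (slide 38) and stacked BiLSTM encoder (slide 39) in PyTorch; all vocabulary sizes, embedding dimensions, and the layer count J=3 are placeholder assumptions, and a shared index space for the pretrained and trained word embeddings is assumed for brevity.

import torch
import torch.nn as nn

class SentenceEncoder(nn.Module):
    # Word representation: concatenation of a frozen pretrained embedding,
    # a trained word embedding, a POS-tag embedding, a predicate-lemma
    # embedding, and a binary predicate flag; fed to a stacked BiLSTM.
    def __init__(self, pretrained, n_words, n_pos, n_lemmas,
                 d_word=100, d_pos=16, d_lemma=100, d_lstm=256, J=3):
        super().__init__()
        self.pre = nn.Embedding.from_pretrained(pretrained, freeze=True)
        self.word = nn.Embedding(n_words, d_word)
        self.pos = nn.Embedding(n_pos, d_pos)
        self.lemma = nn.Embedding(n_lemmas, d_lemma)
        d_in = pretrained.size(1) + d_word + d_pos + d_lemma + 1   # +1 for the flag
        self.bilstm = nn.LSTM(d_in, d_lstm, num_layers=J,
                              bidirectional=True, batch_first=True)

    def forward(self, words, pos, lemmas, pred_flag):
        # all inputs: (batch, seq_len) integer tensors; pred_flag is 1 at the
        # predicate position and 0 elsewhere
        x = torch.cat([self.pre(words), self.word(words), self.pos(pos),
                       self.lemma(lemmas), pred_flag.unsqueeze(-1).float()],
                      dim=-1)
        states, _ = self.bilstm(x)        # (batch, seq_len, 2 * d_lstm)
        return states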

  40. GCN encoder: K layers of syntactic GCNs (edges such as nsubj, dobj, nmod) on top of the J-layer BiLSTM encoder. • Adds syntactic information. • Skip connections. • Longer dependencies are captured. (Figure: GCN stacked over the BiLSTM for "Lane disputed those estimates".) A sketch of the whole stack follows.
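Finally, composing the pieces: K gated syntactic GCN layers over the BiLSTM states, reusing the SentenceEncoder and GatedSyntacticGCNLayer sketches above. Reading the slide's "skip connections" as residual additions around each GCN layer is an assumption; the slide does not specify their exact form.

import torch
import torch.nn as nn

class SRLEncoder(nn.Module):
    # K gated syntactic GCN layers stacked on the BiLSTM states; the output
    # feeds the local role classifier (not shown).
    def __init__(self, sent_encoder, dim, num_labels, K=2):
        super().__init__()
        self.sent_encoder = sent_encoder      # SentenceEncoder; dim must equal 2 * d_lstm
        self.gcns = nn.ModuleList(
            GatedSyntacticGCNLayer(dim, num_labels) for _ in range(K))

    def forward(self, words, pos, lemmas, pred_flag, arcs):
        # single sentence for simplicity: batch dimension of size 1
        h = self.sent_encoder(words, pos, lemmas, pred_flag)[0]   # (seq_len, dim)
        for gcn in self.gcns:
            h = h + gcn(h, arcs)              # skip (residual) connection
        return h                              # per-word input to the role classifier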
