Language Modelling Makes Sense
Propagating Representations through WordNet for Full-Coverage Word Sense Disambiguation
Daniel Loureiro, Alípio Jorge
ACL – Florence, 31 July 2019
Exploiting the latest Neural Language Models (NLMs) for sense-level representation learning.
Introduction Related Work Our Approach Performance Applications Conclusions
Bag-of-Features Classifiers (SVM) Deep Sequence Classifiers (BiLSTM) Sense-level Representations (k-NN)
(over NLM reprs.)
[Iacobacci et al. (2016)] [Zhong and Ng (2010)] [Luo et al. (2018b)] [Luo et al. (2018a)] [Vial et al. (2018)] [Raganato et al. (2017)] [Peters et al. (2018)] [Melamud et al. (2016)] [Yuan et al. (2016)]
It Makes Sense (IMS) [Zhong and Ng (2010)]:
“glasses”
Bi-directional LSTMs (BiLSTMs):
[Raganato et al. (2017)]
Matching Contextual Word Embeddings:
[Ruder (2018)]
Bootstrap: Annotated Dataset
Propagate: WordNet Ontology
Enrich: WordNet Glosses
Reinforce: Morphological Embeddings
Can your insurance company aid you in reducing administrative costs ? Would it be feasible to limit the menu in order to reduce feeding costs ?
Can your insurance company aid you in reducing administrative costs ?
insurance_company%1:14:00:: aid%2:41:00:: reduce%2:30:00:: administrative%3:01:00:: cost%1:21:00::
Would it be feasible to limit the menu in order to reduce feeding costs ?
cost%1:21:00:: feasible%5:00:00:possible:00 limit%2:30:00:: menu%1:10:00:: reduce%2:30:00:: feeding%1:04:01::
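The annotations above are WordNet sense keys. As a minimal sketch, assuming the standard WordNet layout lemma%ss_type:lex_filenum:lex_id:head_word:head_id, the fields can be unpacked as follows. The `SenseKey` tuple and `parse_sensekey` helper are illustrative names, not part of the authors' code.

```python
from typing import NamedTuple

# WordNet ss_type codes: the first field after '%'.
SS_TYPE_TO_POS = {
    1: "noun",
    2: "verb",
    3: "adjective",
    4: "adverb",
    5: "adjective satellite",
}


class SenseKey(NamedTuple):
    lemma: str
    pos: str
    lex_filenum: int
    lex_id: int
    head_word: str  # only filled for adjective satellites
    head_id: str    # only filled for adjective satellites


def parse_sensekey(key: str) -> SenseKey:
    """Split a sense key such as 'cost%1:21:00::' into its fields."""
    lemma, rest = key.split("%", 1)
    ss_type, lex_filenum, lex_id, head_word, head_id = rest.split(":")
    return SenseKey(
        lemma,
        SS_TYPE_TO_POS[int(ss_type)],
        int(lex_filenum),
        int(lex_id),
        head_word,
        head_id,
    )


for key in ("reduce%2:30:00::", "cost%1:21:00::", "feasible%5:00:00:possible:00"):
    print(parse_sensekey(key))
```

The empty trailing fields in keys like cost%1:21:00:: are why the two final colons appear with nothing between them.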
Sentence 1 (contextual embeddings c_1): insurance_company%1:14:00:: aid%2:41:00:: reduce%2:30:00:: administrative%3:01:00:: cost%1:21:00::
Sentence 2 (contextual embeddings c_2): cost%1:21:00:: feasible%5:00:00:possible:00 limit%2:30:00:: menu%1:10:00:: reduce%2:30:00:: feeding%1:04:01::
Senses seen in both sentences: reduce%2:30:00:: (c_1, c_2), cost%1:21:00:: (c_1, c_2)
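\[ \vec{v}_{\text{reduce\%2:30:00::}} = \frac{1}{n}\left(\vec{c}_1 + \vec{c}_2 + \dots + \vec{c}_n\right) \]
\[ \vec{v}_{\text{cost\%1:21:00::}} = \frac{1}{n}\left(\vec{c}_1 + \vec{c}_2 + \dots + \vec{c}_n\right) \]
where \vec{c}_i is the contextual embedding of the i-th annotated occurrence of that sense and n is its number of occurrences.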
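The averaging step above can be sketched in a few lines. This is a toy illustration under stated assumptions: the 2-dimensional vectors are made up, and `average_sense_embeddings` is a hypothetical helper name. In practice the vectors would come from an NLM encoder.

```python
from collections import defaultdict


def average_sense_embeddings(annotated_tokens):
    """Average contextual vectors per sense key.

    annotated_tokens: iterable of (sensekey, contextual_vector) pairs,
    one pair per annotated token occurrence.
    """
    sums = {}
    counts = defaultdict(int)
    for key, vec in annotated_tokens:
        if key not in sums:
            sums[key] = list(vec)
        else:
            sums[key] = [a + b for a, b in zip(sums[key], vec)]
        counts[key] += 1
    return {key: [x / counts[key] for x in total] for key, total in sums.items()}


# Made-up vectors standing in for the two example sentences.
tokens = [
    ("reduce%2:30:00::", [1.0, 0.0]),  # sentence 1
    ("cost%1:21:00::", [0.0, 2.0]),    # sentence 1
    ("cost%1:21:00::", [2.0, 4.0]),    # sentence 2
    ("reduce%2:30:00::", [3.0, 2.0]),  # sentence 2
]
sense_vecs = average_sense_embeddings(tokens)
print(sense_vecs)
```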
Sensekey → Synset → Synset → Lexname
kid%1:18:00:: (Sensekey) → child.n.01 (Synset) → juvenile.n.01 (Synset) → noun.person (Lexname)
Sensekeys: hamburger%1:13:01:: burger%1:13:00:: hotdog%1:18:00:: potato_chip%1:13:00:: wrap%1:13:00:: sandwich%1:13:00::
Synsets: burger.n.02 hotdog.n.01 sandwich.n.01 chips.n.04 wrap.n.02
Lexname: noun.food
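One way to picture the propagation step is as a fallback up the Sensekey → Synset → parent chain: a sense without annotated examples borrows the vector of the nearest ancestor that has one. The sketch below is an assumption-laden toy, not the authors' implementation. The sensekey-to-synset and synset-to-parent assignments are illustrative only, the vectors are made up, and `propagate` handles a single parent level.

```python
def mean(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def propagate(sense_vecs, key_to_synset, parent_of):
    """Give every sense key a vector, falling back to synset, then parent."""
    # Synset vector: mean of the annotated sense vectors that belong to it.
    members = {}
    for key, vec in sense_vecs.items():
        members.setdefault(key_to_synset[key], []).append(vec)
    node_vecs = {node: mean(vecs) for node, vecs in members.items()}

    # Parent vector (hypernym or lexname): mean of its children's synset vectors.
    children = {}
    for node, vec in list(node_vecs.items()):
        if node in parent_of:
            children.setdefault(parent_of[node], []).append(vec)
    for node, vecs in children.items():
        node_vecs.setdefault(node, mean(vecs))

    # Missing sense keys climb the chain until a node with a vector is found.
    full = dict(sense_vecs)
    for key, synset in key_to_synset.items():
        node = synset
        while key not in full and node is not None:
            if node in node_vecs:
                full[key] = node_vecs[node]
            node = parent_of.get(node)
    return full


# Toy data: only two senses have annotated examples.
sense_vecs = {"hamburger%1:13:01::": [1.0, 1.0], "sandwich%1:13:00::": [3.0, 5.0]}
key_to_synset = {
    "hamburger%1:13:01::": "burger.n.02",
    "burger%1:13:00::": "burger.n.02",
    "sandwich%1:13:00::": "sandwich.n.01",
    "wrap%1:13:00::": "wrap.n.02",
    "potato_chip%1:13:00::": "chips.n.04",
}
parent_of = {  # illustrative links, not checked against WordNet
    "burger.n.02": "noun.food",
    "sandwich.n.01": "noun.food",
    "wrap.n.02": "sandwich.n.01",
    "chips.n.04": "noun.food",
}
full = propagate(sense_vecs, key_to_synset, parent_of)
print(full)
```

In this toy run, burger%1:13:00:: inherits from its synset, wrap%1:13:00:: from the parent synset, and potato_chip%1:13:00:: from the lexname-level average.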
sandwich%1:13:00:: (sandwich.n.01)
Definition: two (or more) slices of bread with a filling between them
Lemmas: sandwich
wrap%1:13:00:: (wrap.n.02)
Definition: a sandwich in which the filling is rolled up in a soft tortilla
Lemmas: wrap, tortilla

sandwich%1:13:00:: (sandwich.n.01)
sandwich - two (or more) slices of bread with a filling between them
wrap%1:13:00:: (wrap.n.02)
wrap, tortilla - a sandwich in which the filling is rolled up in a soft tortilla

sandwich%1:13:00::
sandwich - sandwich - two (or more) slices of bread with a filling between them
wrap%1:13:00::
wrap - wrap, tortilla - a sandwich in which the filling is rolled up in a soft tortilla
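The gloss strings above follow a simple pattern: target lemma, then the synset lemmas, then the definition. A minimal sketch of building such a string and pooling its token vectors into one gloss vector might look like this. `gloss_context`, `mean_pool`, and especially `toy_encoder` are hypothetical names; the toy encoder is a stand-in for a real NLM and produces meaningless 2-dimensional vectors, and mean pooling is an assumed pooling choice.

```python
def gloss_context(target_lemma, synset_lemmas, definition):
    """Build 'lemma - lemma1, lemma2 - definition' as shown on the slide."""
    return f"{target_lemma} - {', '.join(synset_lemmas)} - {definition}"


def mean_pool(token_vectors):
    """Average per-token vectors into a single vector."""
    n = len(token_vectors)
    return [sum(col) / n for col in zip(*token_vectors)]


def toy_encoder(text):
    # Stand-in for an NLM: one 2-d vector per whitespace token
    # (token length, vowel count). Purely illustrative.
    return [
        [float(len(tok)), float(sum(ch in "aeiou" for ch in tok))]
        for tok in text.split()
    ]


context = gloss_context(
    "wrap",
    ["wrap", "tortilla"],
    "a sandwich in which the filling is rolled up in a soft tortilla",
)
gloss_vec = mean_pool(toy_encoder(context))
print(context)
print(gloss_vec)
```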
One contextual embedding c per gloss token: c c c c c c … c c …
d = 1024
d = 2048
Contextual Embeddings aren’t good at preserving morphological relatedness
sandwich%1:13:00:: wrap%1:13:00::
d = 2348
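The dimensionalities on these slides (1024, 2048, 2348) are consistent with concatenating three components: two 1024-dimensional vectors plus a 300-dimensional one, since 2348 - 2048 = 300. The sketch below illustrates that arithmetic only. L2-normalising each component before concatenation is an assumption made here so that no component dominates a later similarity computation, and the constant-valued vectors are placeholders.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def concat_components(*components):
    """Normalise each component, then concatenate them into one vector."""
    out = []
    for component in components:
        out.extend(l2_normalize(component))
    return out


# Placeholder vectors with the dimensionalities implied by the slides.
sense_component = [1.0] * 1024   # averaged contextual embeddings
gloss_component = [0.5] * 1024   # gloss embedding
morph_component = [2.0] * 300    # morphological (subword-based) embedding

enriched = concat_components(sense_component, gloss_component)
reinforced = concat_components(sense_component, gloss_component, morph_component)
print(len(sense_component), len(enriched), len(reinforced))
```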
spectacles%1:06:00:: (v_d, v_l, v_s)
glass%1:27:00:: (v_d, v_l, v_s)
drinking_glass%1:06:00:: (v_d, v_l, v_s)
Standard English WSD Evaluation
F1 on ALL set of the WSD Evaluation Framework (Raganato et al. 2017)
Systems compared (F1 axis: 60 to 80): MFS; IMS (Zhong and Ng, 2010); IMS + Emb. (Iacobacci et al., 2016); BiLSTM (Raganato et al., 2017); BiLSTM VR (Vial et al., 2018); context2vec (Melamud et al., 2016); ELMo k-NN (Peters et al., 2018); BERT k-NN (Adapted Peters et al.); LMMS-BERT (Ours)
Uninformed Sense Matching (matching against 200K+ senses)
Same standard evaluation, but without filtering candidates by lemma or POS
Systems compared (axis: 10 to 80): LMMS 1024, LMMS 2048, LMMS 2348
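The difference between the standard and the uninformed setting can be illustrated with a 1-nearest-neighbour matcher: the standard setting restricts candidates to the senses of the target lemma and POS, while the uninformed setting searches the whole inventory. This is a toy sketch with made-up 2-dimensional vectors, assuming cosine similarity as the matching score. `match_sense` is a hypothetical helper.

```python
import math


def cosine(a, b):
    """Cosine similarity of two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def match_sense(context_vec, sense_vecs, candidates=None):
    """Return the sense key whose vector is closest to the context vector.

    candidates=None means uninformed matching over the full inventory.
    """
    keys = candidates if candidates is not None else list(sense_vecs)
    return max(keys, key=lambda k: cosine(context_vec, sense_vecs[k]))


# Made-up sense inventory and context vector for a token such as "glasses".
sense_vecs = {
    "spectacles%1:06:00::": [1.0, 0.0],
    "drinking_glass%1:06:00::": [0.0, 1.0],
    "cost%1:21:00::": [0.8, 0.6],
}
context = [0.9, 0.5]

informed = match_sense(
    context, sense_vecs,
    candidates=["spectacles%1:06:00::", "drinking_glass%1:06:00::"],
)
uninformed = match_sense(context, sense_vecs)
print(informed, uninformed)
```

With these toy numbers the unrestricted search picks an unrelated sense, which is the kind of error the uninformed setting is designed to expose.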
\[ \mathrm{bias}(s) = \mathrm{sim}\left(\vec{v}_{\text{man}_n^1}, \vec{v}_s\right) - \mathrm{sim}\left(\vec{v}_{\text{woman}_n^1}, \vec{v}_s\right) \]
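The bias score above is a difference of two similarities, so it is positive when a sense sits closer to the first noun sense of "man" and negative when it sits closer to that of "woman". A minimal sketch follows, assuming sim is cosine similarity and using made-up 2-dimensional vectors. `gender_bias` is a hypothetical helper name.

```python
import math


def cosine(a, b):
    """Cosine similarity of two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def gender_bias(sense_vec, man_vec, woman_vec):
    """bias(s) = sim(v_man, v_s) - sim(v_woman, v_s)."""
    return cosine(man_vec, sense_vec) - cosine(woman_vec, sense_vec)


# Made-up vectors: two orthogonal anchor senses and two probe senses.
man_vec, woman_vec = [1.0, 0.0], [0.0, 1.0]
skewed = gender_bias([1.0, 0.0], man_vec, woman_vec)
neutral = gender_bias([1.0, 1.0], man_vec, woman_vec)
print(skewed, neutral)
```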
@danielbloureiro dloureiro@fc.up.pt