Entities as a Window into (Distributional) Semantics


  1. Entities as a Window into (Distributional) Semantics
     Sebastian Padó, Institute for Natural Language Processing
     Collaborations with Abhijeet Gupta (1), Marco Baroni (2), Gemma Boleda (2), Gabriella Lapesa (1), V Thejas (3), Matthijs Westera (2)
     (1) University of Stuttgart, (2) UPF Barcelona, (3) BITS Pilani

  2. RANLP, September 3, 2019

  4. • deal, option are categories (concepts)
       • Listed in a dictionary
     • Macron, Brexit are individual entities/events
       • Listed in an encyclopedia

  5. Model-theoretic semantics
     • Meaning of language units defined relative to a world model (Gamut 1991: Universe U = set of individuals)
     • Proper nouns and other entities:
       • Mapped onto elements of the universe
     • Common nouns, adjectives, and other categories:
       • Mapped onto sets of elements of the universe
     [Figure: universe U with the individuals Brexit, E. Macron, B. Johnson and the sets politician, events]

  6. Model-theoretic semantics
     • Meaning of language units defined relative to a world model (Gamut 1991: Universe U = set of individuals)
     • Proper nouns and other entities:
       • Mapped onto elements of the universe
     • Common nouns, adjectives, and other categories:
       • Mapped onto sets of elements of the universe
     Entities and categories are fundamentally different. What about current NLP?
     [Figure: universe U with the individuals Brexit, E. Macron, B. Johnson and the sets politician, events]
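The entity/category distinction above can be made concrete with a toy world model. The following is only a minimal sketch: the universe, the denotation tables, and the helper `is_instance` are illustrative assumptions, not part of the talk.

```python
# Toy world model: the universe U is a set of individuals (names are illustrative).
U = {"E. Macron", "B. Johnson", "Brexit"}

# Proper nouns and other entity expressions denote single elements of U.
entity_denotation = {
    "Macron": "E. Macron",
    "Johnson": "B. Johnson",
    "Brexit": "Brexit",
}

# Common nouns and other category expressions denote sets of elements of U.
category_denotation = {
    "politician": {"E. Macron", "B. Johnson"},
    "event": {"Brexit"},
}


def is_instance(proper_noun: str, common_noun: str) -> bool:
    """True if the individual named by proper_noun lies in the set denoted by common_noun."""
    return entity_denotation[proper_noun] in category_denotation[common_noun]


print(is_instance("Macron", "politician"))
print(is_instance("Brexit", "politician"))
```

In this picture an entity and a category are objects of different types (an element versus a set), which is the contrast the slide draws before turning to NLP practice.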

  7. Distributional Semantics (DS)
     • Dominant paradigm to acquire lexical information:
       • Learn linear algebra representations of linguistic units from context
     • A.k.a. vector spaces, embeddings, distributed representations
     • Still DS because all use the “distributional hypothesis”: “You shall know a word by the company it keeps” (Firth, Harris, Miller & Charles 1991, etc.)
     [Figure: vector space with the points deal, option, Macron, Johnson, Brexit]

  8. Distributional Semantics (DS)
     • Dominant paradigm to acquire lexical information:
       • Learn linear algebra representations of linguistic units from context
     • A.k.a. vector spaces, embeddings, distributed representations
     • Still DS because all use the “distributional hypothesis”: “You shall know a word by the company it keeps” (Firth, Harris, Miller & Charles 1991, etc.)
     How is this applied to categories / entities in NLP? Split by subcommunity
     [Figure: vector space with the points deal, option, Macron, Johnson, Brexit]
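The distributional hypothesis can be illustrated with a count-based toy model. This is a sketch under stated assumptions: the three-sentence corpus, the window size, and the helper names `cooccurrence_vectors` and `cosine` are made up for illustration, and raw co-occurrence counts stand in for learned embeddings such as Word2Vec.

```python
import math
from collections import Counter


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = {}
    for tokens in sentences:
        for i, target in enumerate(tokens):
            context = vectors.setdefault(target, Counter())
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    context[tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical mini-corpus; real models are trained on billions of tokens.
corpus = [
    "macron met the president in paris".split(),
    "johnson met the president in london".split(),
    "the deal was signed in paris".split(),
]
vecs = cooccurrence_vectors(corpus)

print(cosine(vecs["macron"], vecs["johnson"]))
print(cosine(vecs["macron"], vecs["deal"]))
```

Nothing in this procedure distinguishes the entity names from the category noun: both receive the same kind of vector, built from the same kind of context evidence.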

  9. Computational Lexical Semantics
     • Strong focus on modelling linguistic aspects of meaning: categories and relations among categories
       • Hyponymy/hypernymy (entailment), synonymy, meronymy
       • Also diachronic change
     “Interested in generalizations”
     [Figure: from Clarke 2009]

  10. Semantic Web / Information Extraction
      • Complementary focus on modelling world knowledge aspects of meaning: entities and relations among entities
        • Knowledge bases / knowledge graphs
      “Interested in particularities”

  11. The Current Situation
      • So Distributional Semantics is applied
        • to both entities and categories
        • to learn fairly different things
      • How is this possible?
      • “It just works”
      • DS is a practice without a theory
      [Figure: vector space with the points deal, option, Macron, Johnson, Brexit]

  12. Agenda for this presentation
      • Q: Are there relevant differences in the way we can apply DS to modelling entities and categories?
      • Research strand 1: Knowledge Bases
        • How far can we push DS in learning world knowledge?
      • Research strand 2: The Instantiation Relation
        • How do categories and entities behave distributionally?
      Benefit: insights into capabilities and limits of distributional approaches to meaning

  14. Strand 1: Knowledge Base Completion
      • Challenge: KBs are incomplete [Min et al. 2013, West et al. 2014]
      • Knowledge Base Completion (KBC): Add missing edges to knowledge graph
        • Very active area of research
      • Representation learning
        • Learn embeddings for entities and relations

  15. Entity Embeddings and KBC
      • KBC embeddings can be learned from text, KB, or both
      • Our interest: limits of distributional semantics
        • Focus on text-based embeddings of entities
      • Entities have fine-grained attributes with specific values
      • Research Question: Can all attributes be predicted from vanilla word embeddings? (And if not, why not?)
      Italy (context counts): sunny 30, wine 15, beach 12, Rome 10, Naples 6
      Italy (attributes): Population: 61 million; Area: 301,000 sq.km; Language: Italian; Contained by: Europe; Currency used: Euro

  16. Simple Supervised KBC [Gupta et al. 15,17]
      • Task: Use entity embeddings to predict entity attributes with Multi-Layer Perceptron (MLP)
      • Numeric: predict value(s)
      • Categorical: predict embedding for relatum (Italy, currency, Euro)
      Example (Italy): Population: 61 million; Area: 301,000 sq.km; Language: Italian; Contained by: Europe; Currency used: Euro
      [Figure: two MLPs. Numeric: entity embedding (n) → hidden layer (h, tanh) → output of all numeric attribute values (|N|, σ). Categorical: entity embedding (n) and 1-hot attribute vector (|C|) → hidden layer (h, tanh) → categorical attribute value embedding (n, tanh)]
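The shape of the two predictors described above can be sketched as a forward pass. This is a rough illustration only, not the authors' implementation: the weights are random and untrained, the hidden size and attribute counts are made up, the sigmoid output assumes numeric attributes scaled to [0, 1], and feeding the categorical network the concatenation of entity embedding and 1-hot attribute vector is an assumption about how the two inputs are combined.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random (n_out x n_in) weight matrix with zero bias; untrained, for illustration."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer, activation):
    """Apply one fully connected layer followed by an element-wise activation."""
    weights, bias = layer
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


rng = random.Random(0)
n, h = 300, 50                        # embedding size from the talk; h is hypothetical
num_numeric, num_categorical = 4, 3   # |N| and |C|, made-up values

# Numeric model: entity embedding (n) -> hidden (h, tanh) -> all numeric values (|N|, sigmoid)
numeric_hidden = make_layer(n, h, rng)
numeric_out = make_layer(h, num_numeric, rng)

# Categorical model: embedding + 1-hot attribute (n + |C|) -> hidden (h, tanh) -> relatum embedding (n, tanh)
cat_hidden = make_layer(n + num_categorical, h, rng)
cat_out = make_layer(h, n, rng)


def predict_numeric(entity):
    """Predict all numeric attribute values of an entity at once."""
    return dense(dense(entity, numeric_hidden, math.tanh), numeric_out, sigmoid)


def predict_relatum(entity, attribute_index):
    """Predict the embedding of the relatum for one categorical attribute."""
    one_hot = [1.0 if i == attribute_index else 0.0 for i in range(num_categorical)]
    return dense(dense(entity + one_hot, cat_hidden, math.tanh), cat_out, math.tanh)


italy = [rng.gauss(0.0, 1.0) for _ in range(n)]  # stand-in for a real word embedding
print(len(predict_numeric(italy)), len(predict_relatum(italy, 2)))
```

The categorical output is itself a vector in embedding space, so a prediction such as (Italy, currency, Euro) is read off by looking for the nearest neighbors of the predicted vector.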

  17. Evaluation of Attributes
      • Categorical attributes: Mean Reciprocal Rank (MRR)
        • Mean reciprocal rank of predicted relatum embedding among nearest neighbors of true relatum embedding
      • Numeric attributes: Correlation
        • Spearman correlation between predicted and true rankings of entities w.r.t. attribute
      (Leaving out details here; see papers)
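Both evaluation measures are standard and can be sketched in a few lines. The version below is a generic illustration, not the evaluation code from the papers: the function names are hypothetical, the MRR helper takes already-ranked candidate lists as input, and Spearman correlation is computed as the Pearson correlation of average ranks (undefined when one variable is constant).

```python
import math


def mean_reciprocal_rank(rankings, golds):
    """Average of 1/rank of the gold item in each ranked list (0 if it is absent)."""
    total = 0.0
    for ranked, gold in zip(rankings, golds):
        if gold in ranked:
            total += 1.0 / (ranked.index(gold) + 1)
    return total / len(golds)


def average_ranks(values):
    """1-based ranks in ascending order; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if x or y is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up example: gold relatum is ranked 1st for one entity and 3rd for another.
print(mean_reciprocal_rank([["Euro", "Lira"], ["Yen", "Dollar", "Euro"]], ["Euro", "Euro"]))
# Made-up predicted vs. true attribute values for four entities.
print(spearman([0.2, 0.9, 0.5, 0.4], [10.0, 80.0, 30.0, 40.0]))
```

Because Spearman correlation compares only rankings, a model can score well on a numeric attribute without predicting its absolute values accurately.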

  18. Experimental Setup
      • Embeddings: Google News vectors (Mikolov et al. 2013)
        • Word2Vec skipgram, 300 dimensions
      • Experimental setup: Train/Test on 7 FreeBase domains

      | Domain       | # Entities (train/val/test) | \|C\| | \|N\| |
      |--------------|-----------------------------|-------|-------|
      | Animal       | 279/93/93                   | 22    | 118   |
      | Book         | 16/5/6                      | 8     | 2     |
      | Citytown     | 1783/594/595                | 57    | 62    |
      | Country      | 155/53/51                   | 79    | 698   |
      | Employer     | 720/140/141                 | 50    | 55    |
      | Organization | 187/63/62                   | 36    | 32    |
      | People       | 85/28/29                    | 25    | 76    |
      | Sum          | 3225/976/977                | 277   | 1043  |

  19. Experimental Setup
      • Embeddings: Google News vectors (Mikolov et al. 2013)
        • Word2Vec skipgram, 300 dimensions
      • Experimental setup: Train/Test on 7 FreeBase domains
      Three case studies / observations. (My) explanation to follow

      | Domain       | # Entities (train/val/test) | \|C\| | \|N\| |
      |--------------|-----------------------------|-------|-------|
      | Animal       | 279/93/93                   | 22    | 118   |
      | Book         | 16/5/6                      | 8     | 2     |
      | Citytown     | 1783/594/595                | 57    | 62    |
      | Country      | 155/53/51                   | 79    | 698   |
      | Employer     | 720/140/141                 | 50    | 55    |
      | Organization | 187/63/62                   | 36    | 32    |
      | People       | 85/28/29                    | 25    | 76    |
      | Sum          | 3225/976/977                | 277   | 1043  |

  20. Domain Country: Numeric Attributes

      | Feature                    | Correlation of MLP |
      |----------------------------|--------------------|
      | Geolocation (Lat. / Long.) | 0.93 (best)        |
      | GDP_per_capita             | 0.89               |
      | CO2_emissions_per_capita   | 0.88               |
      | …                          | …                  |
      | GDP_nominal                | 0.78               |
      | …                          | …                  |
      | Date_founded               | 0.54               |
      | Religion_percentage        | 0.42 (worst)       |

      • Attributes differ greatly in difficulty
      • Geographical attributes easy (Louwerse et al. 2009)

  21. Geolocation: The Good
      [Figure: map of actual vs. predicted locations for A Hong Kong, B Bangladesh, C Cocos Islands, D Eritrea, E Latvia, F Belarus, G Iran]

  22. Geolocation: The Bad
      [Figure: maps of actual vs. predicted locations for A New Caledonia, B Cocos Islands, C Cook Islands, D Mauritius, E Niue, F Tuvalu, G Vanuatu]

  23. Domain Country: GDP

      | Feature                    | Correlation of MLP |
      |----------------------------|--------------------|
      | Geolocation (Lat. / Long.) | 0.93 (best)        |
      | GDP_per_capita             | 0.89               |
      | CO2_emissions_per_capita   | 0.88               |
      | …                          | …                  |
      | GDP_nominal                | 0.78               |
      | …                          | …                  |
      | Date_founded               | 0.54               |
      | Religion_percentage        | 0.42 (worst)       |

      • Even very similar attributes differ substantially (?)

  24. Domain Country: Difficult Attributes

      | Feature                    | Correlation of MLP |
      |----------------------------|--------------------|
      | Geolocation (Lat. / Long.) | 0.93 (best)        |
      | GDP_per_capita             | 0.89               |
      | CO2_emissions_per_capita   | 0.88               |
      | …                          | …                  |
      | GDP_nominal                | 0.78               |
      | …                          | …                  |
      | Date_founded               | 0.54               |
      | Religion_percentage        | 0.42 (worst)       |

      • The most difficult attributes appear to be very specific

  25. Contextual Support
      • Our KBC task = learn mappings from context-derived embedding space to attribute space
      1. Attribute must correlate with prominent context cues
      2. Entities with similar values of attribute must co-occur with similar context cues
      [Figure: Switzerland, China, Luxembourg placed by GDP per capita]

  26. Contextual Support
      • Our KBC task = learn mappings from (BOW) embedding space to attribute space
      1. Attribute must correlate with prominent context cues
      2. Entities with similar values of attribute must co-occur with similar context cues
      The extent to which this holds: degree of contextual support
      [Figure: China, Germany, Luxembourg placed by GDP per capita]
