Entity Representation and Retrieval Laura Dietz University of New - - PowerPoint PPT Presentation

SLIDE 1

Entity Representation and Retrieval

Laura Dietz

University of New Hampshire

Alexander Kotov

Wayne State University

Edgar Meij

Bloomberg

SIGIR 2018 Tutorial on Utilizing KGs for Text-centric IR

SLIDE 2

Knowledge Graph Fragment

SLIDE 3

Entity Retrieval

Besides documents, users often search for concrete or abstract entities/objects (e.g. people, products, organizations, books)

Users are willing to express these information needs more elaborately than with a few keywords [Balog et al., SIGIR’08]

Entities (or entity cards) provide immediate answers to such queries → natural units for organizing search results

Knowledge graphs are built around entities → Entity Retrieval from Knowledge Graph(s) (ERKG)

SLIDE 4

Entity Retrieval Tasks

Entity Search: simple queries aimed at finding a particular entity or an entity which is an attribute of another entity

◮ “Ben Franklin”
◮ “Einstein Relativity theory”
◮ “England football player highest paid”

List Search: descriptive queries with several relevant entities

◮ “US presidents since 1960”
◮ “animals lay eggs mammals”
◮ “Formula 1 drivers that won the Monaco Grand Prix”

Question Answering: queries are questions in natural language

◮ “Who founded Intel?”
◮ “For which label did Elvis record his first album?”

SLIDE 5

Entity Retrieval from Knowledge Graph(s) (ERKG)

Evolution of entity retrieval tasks:

◮ Expert search at TREC 2005–2008 enterprise track: find experts knowledgeable about a given topic
◮ Entity ranking track at INEX 2007–2009: find Wikipedia pages of entities with a given target type
◮ Related entity search at TREC 2009–2011 entity track: find Web pages of entities related to a given entity in a certain way

Can be used for entity linking: fragment of text as query, list of linked entities as result

Can be combined with methods using KGs for ad-hoc or Web search (part 3 of this tutorial)

SLIDE 6

Why ERKG?

Unique IR problem: there are no documents. Entities in a KG have no textual representation apart from their names

Challenging IR problem: knowledge graphs are best suited for structured graph pattern-based SPARQL queries, not for traditional IR models

SLIDE 7

Research Challenges in ERKG

ERKG requires accurate interpretation of unstructured textual queries and matching them with entity semantics:

1. How to design entity representations that capture the semantics of entity properties and relations to other entities?
2. How to semantically match unstructured queries with structured entity representations?
3. How to account for entity types in retrieval?

SLIDE 8

Architecture of ERKG Methods

[Tonon, Demartini et al., SIGIR’12]

SLIDE 9

Outline

Entity representation
Entity retrieval
Entity set expansion
Entity ranking

SLIDE 10

Structured Entity Documents

Build a textual representation (i.e. a “document”) for each entity from all triples in which it appears as the subject (or object)

SLIDE 11

Predicate Folding

Simple approach: each predicate corresponds to one entity document field

Problem: there can be a very large number of predicates → optimization of field importance weights is computationally intractable

Predicate folding: group predicates into a small set of predefined categories → entity documents with a smaller number of fields

◮ by predicate type (attributes, incoming/outgoing links) [Pérez-Agüera et al., SemSearch 2010]
◮ by predicate importance (determined based on predicate popularity) [Blanco et al., ISWC 2011]

The number and type of fields depend on the retrieval task
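
Predicate folding by predicate type can be sketched as follows. This is a minimal illustration, not the implementation from the cited papers: the toy triples, the `is_literal` flag, and the field names `attributes`, `outgoing_links` and `incoming_links` are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical toy triples: (subject, predicate, object, object_is_literal).
TRIPLES = [
    ("dbr:Barack_Obama", "foaf:name", "Barack Obama", True),
    ("dbr:Barack_Obama", "dbo:birthDate", "1961-08-04", True),
    ("dbr:Barack_Obama", "dbo:spouse", "dbr:Michelle_Obama", False),
    ("dbr:Michelle_Obama", "dbo:spouse", "dbr:Barack_Obama", False),
]


def fold_predicate(subject, obj, is_literal, entity):
    """Map a triple to one of three folded fields, based on predicate type."""
    if subject == entity:
        return "attributes" if is_literal else "outgoing_links"
    if obj == entity:
        return "incoming_links"
    return None  # the triple does not mention the entity


def build_entity_document(entity, triples):
    """Collect the values of every triple involving `entity` into folded fields."""
    doc = defaultdict(list)
    for s, _predicate, o, is_literal in triples:
        field = fold_predicate(s, o, is_literal, entity)
        if field == "incoming_links":
            doc[field].append(s)  # the linking entity is the subject
        elif field is not None:
            doc[field].append(o)
    return dict(doc)


doc = build_entity_document("dbr:Barack_Obama", TRIPLES)
```

However many distinct predicates the graph contains, the resulting document has at most three fields, so only three field weights need to be tuned.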

SLIDE 12

Predicate Folding Example

SLIDE 13

2-field Entity Document

[Neumayer, Balog et al., ECIR’12]

Each entity is represented as a two-field document:

title: object values belonging to predicates ending with “name”, “label” or “title”
content: object values for the 1000 most frequent predicates, concatenated together into a flat text representation

This simple scheme is effective for entity retrieval
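
A rough sketch of the two-field scheme is given below. It assumes triples are plain `(subject, predicate, object)` tuples and that a title value may also appear in the content field when its predicate is among the most frequent ones; the source does not specify either detail, and the function names are hypothetical.

```python
from collections import Counter

TITLE_SUFFIXES = ("name", "label", "title")


def top_predicates(triples, k):
    """Return the k most frequent predicates in the collection."""
    counts = Counter(p for _, p, _ in triples)
    return {p for p, _ in counts.most_common(k)}


def two_field_document(entity, triples, k=1000):
    """Build a {title, content} document for `entity` from its outgoing triples."""
    frequent = top_predicates(triples, k)
    title, content = [], []
    for s, p, o in triples:
        if s != entity:
            continue
        if p.lower().endswith(TITLE_SUFFIXES):
            title.append(o)
        if p in frequent:  # assumption: title values are not excluded from content
            content.append(o)
    return {"title": " ".join(title), "content": " ".join(content)}


# Hypothetical toy collection.
triples = [
    ("ex:Ben_Franklin", "rdfs:label", "Benjamin Franklin"),
    ("ex:Ben_Franklin", "ex:occupation", "printer"),
    ("ex:Ben_Franklin", "ex:birthPlace", "Boston"),
    ("ex:Albert_Einstein", "rdfs:label", "Albert Einstein"),
]
doc = two_field_document("ex:Ben_Franklin", triples)
```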

SLIDE 14

2-field Entity Document Example

SLIDE 15

3-field Entity Document

[Zhiltsov and Agichtein, CIKM’13]

Each entity is represented as a three-field document:

names: literals of foaf:name, rdfs:label predicates along with tokens extracted from entity URIs
attributes: literals of all other predicates
outgoing links: names of entities in the object position

This scheme is effective for entity retrieval

SLIDE 16

3-field Entity Document Example

SLIDE 17

5-field Entity Document

[Zhiltsov, Kotov et al., SIGIR’15]

Each entity is represented as a five-field document:

names: labels or names of entities
attributes: all entity properties, other than names
categories: classes or groups to which the entity has been assigned
similar entity names: names of the entities that are very similar or identical to a given entity
related entity names: names of entities in the object position

This flexible scheme is effective for a variety of tasks: entity search, list search, question answering

SLIDE 18

5-field Entity Document Example

SLIDE 19

Challenges related to Entity Representations

Vocabulary mismatch between the descriptions of relevant entities and the query terms that can be used to search for them

Associations between words and entities depend on the context:

◮ Germany should be returned for queries related to World War II and 2006 Soccer World Cup

Real-life events change the descriptions of entities:

◮ Ferguson, Missouri before and after August 2014

SLIDE 20

Dynamic Entity Representation

[Graus, Tsagkias et al., WSDM’16]

Idea: create static entity representations using knowledge bases and leverage different social media sources to dynamically update them

Represent entities as fielded documents, in which each field corresponds to a different source

Tweak the weights of different fields over time

SLIDE 21

Static Sources

SLIDE 22

Dynamic Sources

SLIDE 23

Outline

Entity representation
Entity retrieval
Entity set expansion
Entity ranking

SLIDE 24

Methods for ERKG

ERKG has been addressed in a probabilistic generative framework: P(e|q) ∝ P(q|e)P(e)

Besides keywords q_w, query q implicitly or explicitly contains target entity type(s) q_t, which can be incorporated into entity retrieval models

SLIDE 25

Incorporating Entity Types

Two ways to combine term-based similarity P(q_w|e) and type-based similarity P(q_t|e):

Filtering [Bron et al., CIKM’10]: P(q|e) = P(q_w|e) P(q_t|e)

Interpolation [Balog et al., TOIS’11; Kaptein et al., AI’13; Pehcevski et al., IR’10; Raviv et al., JIWES’12]: P(q|e) = (1 − λ_t) P(q_w|e) + λ_t P(q_t|e)
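
The practical difference between the two combinations shows up when an entity has no type match. The sketch below uses made-up probabilities and a hypothetical λ_t = 0.25; only the two formulas come from the slide.

```python
def filtering_score(p_terms, p_types):
    """Filtering: P(q|e) = P(q_w|e) * P(q_t|e)."""
    return p_terms * p_types


def interpolation_score(p_terms, p_types, lambda_t):
    """Interpolation: P(q|e) = (1 - lambda_t) * P(q_w|e) + lambda_t * P(q_t|e)."""
    return (1.0 - lambda_t) * p_terms + lambda_t * p_types


# Hypothetical candidates: entity -> (P(q_w|e), P(q_t|e)).
candidates = {
    "entity_a": (0.20, 0.00),  # strong term match, no type match
    "entity_b": (0.05, 0.60),  # weak term match, strong type match
}

filtered = {e: filtering_score(pw, pt) for e, (pw, pt) in candidates.items()}
interpolated = {
    e: interpolation_score(pw, pt, lambda_t=0.25) for e, (pw, pt) in candidates.items()
}
```

Under filtering, `entity_a` drops to a score of zero because its type probability is zero, whereas interpolation still gives it credit for the term match.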

SLIDE 26

Term-based Similarity

Possible options for P(q_w|e):

unigram bag-of-words models for structured document retrieval:

◮ Mixture of Language Models (MLM) [Ogilvie and Callan, SIGIR’03]
◮ BM25 for multi-field documents (BM25F) [Robertson et al., CIKM’04]
◮ Probabilistic Retrieval Model for Semi-structured Data (PRMS) [Kim and Croft, ECIR’09]

term dependence (bigram) models:

◮ Sequential Dependence Model (SDM) [Metzler and Croft, SIGIR’05]

term dependence models for structured document retrieval:

◮ Fielded Sequential Dependence Model (FSDM) [Zhiltsov et al., SIGIR’15]
◮ Parameterized Fielded Sequential Dependence Model (PFSDM) [Nikolaev et al., SIGIR’16]

SLIDE 27

Fielded Sequential Dependence Model

[Zhiltsov, Kotov et al., SIGIR’15]

Idea: account both for phrases (bigrams) and document structure

Document score is a linear combination of matching functions for unigrams and bigrams in each document field:

$$P_\Lambda(D|Q) \stackrel{rank}{=} \lambda_T \sum_{q_i \in Q} \tilde{f}_T(q_i, D) + \lambda_O \sum_{q_i, q_{i+1} \in Q} \tilde{f}_O(q_i, q_{i+1}, D) + \lambda_U \sum_{q_i, q_{i+1} \in Q} \tilde{f}_U(q_i, q_{i+1}, D)$$

MLM is a special case of FSDM, when λ_T = 1, λ_O = 0, λ_U = 0

SLIDE 33

FSDM ranking function

FSDM matching function for unigrams:

$$\tilde{f}_T(q_i, D) = \log \sum_j w_j^T P(q_i|\theta_D^j) = \log \sum_j w_j^T \frac{tf_{q_i,D^j} + \mu_j \frac{cf^j_{q_i}}{|C_j|}}{|D^j| + \mu_j}$$

Example: apollo astronauts [category] who walked on the moon [category]

Parameters:

1. Field importance weights for unigrams and bigrams
2. Relative importance weights of matching unigrams and bigrams
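
The unigram matching function can be sketched in a few lines: a weighted sum over fields of Dirichlet-smoothed term probabilities, followed by a log. The data layout (token lists per field, a `collection` dictionary holding per-field collection frequencies and lengths) and all numbers are assumptions for illustration, not the authors' implementation.

```python
import math


def unigram_potential(term, doc_fields, collection, field_weights, mu):
    """Log of the field-weighted, Dirichlet-smoothed probability of `term`.

    doc_fields:    field -> list of tokens in that field of the entity document
    collection:    field -> {"cf": term -> collection frequency, "length": |C_j|}
    field_weights: field -> w_j (unigram field weight)
    mu:            field -> Dirichlet prior mu_j
    """
    total = 0.0
    for field, weight in field_weights.items():
        tokens = doc_fields.get(field, [])
        stats = collection[field]
        background = stats["cf"].get(term, 0) / stats["length"]
        smoothed = (tokens.count(term) + mu[field] * background) / (len(tokens) + mu[field])
        total += weight * smoothed
    # A term unseen in every field of the collection gives log(0); return -inf instead.
    return math.log(total) if total > 0 else float("-inf")


# Hypothetical entity document and collection statistics.
doc_fields = {"names": ["neil", "armstrong"], "categories": ["apollo", "astronauts"]}
collection = {
    "names": {"cf": {"apollo": 1}, "length": 100},
    "categories": {"cf": {"apollo": 10}, "length": 100},
}
weights = {"names": 0.2, "categories": 0.8}
mu = {"names": 10.0, "categories": 10.0}

score = unigram_potential("apollo", doc_fields, collection, weights, mu)
```

Because the category field carries most of the weight here, a match on “apollo” in the categories dominates the score, which mirrors the annotated example query.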

SLIDE 37

Limitation of FSDM

Same field weights for all query unigrams and all query bigrams

Example: capitals [category] in Europe [attribute] which were host cities of summer Olympic games [category]

SLIDE 38

Parametric extension of FSDM

[Nikolaev, Kotov et al., SIGIR’16]

Idea: calculate field weight for each unigram and bigram based on features:

$$w^T_{q_i,j} = \sum_k \alpha^U_{j,k} \phi_k(q_i, j)$$

φ_k(q_i, j) is the k-th feature value for unigram q_i in field j

α^U_{j,k} are feature weights that are learned by coordinate ascent to maximize the target retrieval metric
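
As a small worked sketch, the per-concept field weight is a dot product between the learned feature weights of a field and the feature values of the concept in that field. The feature values and α weights below are invented for illustration; in the model itself the α weights are learned by coordinate ascent.

```python
def field_weight(feature_values, alphas):
    """w_{q_i,j} = sum_k alpha_{j,k} * phi_k(q_i, j) for one concept and one field."""
    return sum(alpha * phi for alpha, phi in zip(alphas, feature_values))


# Hypothetical features for the unigram "capitals", per field:
# [posterior probability of the field, is proper noun?, intercept]
features = {
    "names": [0.1, 1.0, 1.0],
    "categories": [0.6, 0.0, 1.0],
}
# Hypothetical learned feature weights alpha_{j,k}, per field.
alphas = {
    "names": [0.5, 0.3, 0.1],
    "categories": [0.9, 0.0, 0.05],
}

weights = {field: field_weight(features[field], alphas[field]) for field in features}
```

Unlike FSDM, where one weight per field is shared by all query unigrams, each query concept obtains its own weight per field here.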

SLIDE 39

Features

Features grouped by source, with the CT column value (UG, BG) in brackets:

Collection statistics
◮ Posterior probability P(E_j|κ) [UG, BG]
◮ Top SDM score of the j-th field when κ is used as a query [BG]

Stanford POS Tagger
◮ Is concept κ a proper noun? [UG]
◮ Is κ a plural non-proper noun? [UG, BG]
◮ Is κ a superlative adjective? [UG]

Stanford Parser
◮ Is κ part of a noun phrase? [BG]
◮ Is κ the only singular non-proper noun in a noun phrase? [UG]

Intercept [UG, BG]

SLIDE 40

Entity Linking in ERKG

[Hasibi et al., ICTIR’16]

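Idea: linked entities as additional feature function in FSDM

$$P_\Lambda(D|Q) \stackrel{rank}{=} \lambda_T \sum_{q_i \in Q} \tilde{f}_T(q_i, D) + \lambda_O \sum_{q_i, q_{i+1} \in Q} \tilde{f}_O(q_i, q_{i+1}, D) + \lambda_U \sum_{q_i, q_{i+1} \in Q} \tilde{f}_U(q_i, q_{i+1}, D) + \lambda_E \sum_{e \in E(Q)} \tilde{f}_E(e, D)$$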

SLIDE 41

Type-based Similarity

[Garigliotti and Balog, ICTIR’17]

If target type(s) q_t are provided with the query, the distribution of types for entity e is estimated as:

$$P(t|\Theta_e) = \frac{n(t, e) + \mu P(t)}{\sum_{t'} n(t', e) + \mu}$$

With both Θ_q and Θ_e in place, type-based similarity between q and e is estimated as:

$$P(q_t|e) = z \left( \max_{e'} KL(\Theta_q \| \Theta_{e'}) - KL(\Theta_q \| \Theta_e) \right)$$
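
The two estimates can be sketched as follows. The sketch assumes that z acts as a normalizer so that similarities sum to one over the candidate set, and that the background distribution P(t) is positive for every type; both are assumptions of this example, as are the toy counts.

```python
import math


def type_distribution(type_counts, background, mu):
    """Dirichlet-smoothed P(t|Theta_e) over the types in `background`."""
    total = sum(type_counts.values())
    return {
        t: (type_counts.get(t, 0) + mu * p_t) / (total + mu)
        for t, p_t in background.items()
    }


def kl_divergence(p, q):
    """KL(p || q); terms with p(t) = 0 contribute nothing."""
    return sum(p_t * math.log(p_t / q[t]) for t, p_t in p.items() if p_t > 0)


def type_similarity(theta_q, thetas_by_entity):
    """P(q_t|e) proportional to max_e' KL(q || e') - KL(q || e)."""
    divergences = {e: kl_divergence(theta_q, th) for e, th in thetas_by_entity.items()}
    worst = max(divergences.values())
    raw = {e: worst - d for e, d in divergences.items()}
    z = sum(raw.values())  # assumed normalizer
    return {e: (r / z if z > 0 else 0.0) for e, r in raw.items()}


# Hypothetical background type distribution and entity type counts n(t, e).
background = {"Person": 0.5, "City": 0.5}
thetas = {
    "entity_a": type_distribution({"Person": 2}, background, mu=2.0),
    "entity_b": type_distribution({"City": 2}, background, mu=2.0),
}
theta_q = {"Person": 1.0, "City": 0.0}  # the query asks for a Person
similarities = type_similarity(theta_q, thetas)
```

The entity whose type distribution diverges most from the query receives a similarity of zero, and all others are scored by how much closer they are.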

SLIDE 42

Entity Type Representation

[Garigliotti and Balog, ICTIR’17]

(a) all assigned types
(b) most general types
(c) most specific types

SLIDE 43

Type-based Similarity

If no target type(s) are provided with the query, they can be inferred using:

Type-centric approach [Balog and Neumayer, CIKM’12]: build a document for each type by concatenating the descriptions of all entities that belong to it

$$P(q|t) = \prod_{i=1}^{|q|} P(w_i|\theta_t) = \prod_{i=1}^{|q|} \left( (1 - \lambda) \sum_{e: t \in e_t} P(w_i|e_d) P(e|t) + \lambda P(w_i) \right)$$

Entity-centric approach [Balog and Neumayer, CIKM’12]: aggregate retrieval scores and type distributions of top retrieved entities

$$P(q|t) = \sum_{e: t \in e_t} P(q|e) P(e|t)$$
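
The entity-centric aggregation can be sketched as below. The sketch assumes a uniform P(e|t) over the entities of each type and restricts the sum to the top-k retrieved entities; the retrieval scores and type assignments are made up.

```python
from collections import defaultdict


def entity_centric_type_scores(entity_scores, entity_types, top_k=10):
    """P(q|t) = sum over top-k entities of type t of P(q|e) * P(e|t)."""
    # Assumption: P(e|t) is uniform over the entities assigned to type t.
    type_sizes = defaultdict(int)
    for types in entity_types.values():
        for t in types:
            type_sizes[t] += 1

    ranked = sorted(entity_scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    scores = defaultdict(float)
    for entity, p_q_e in ranked:
        for t in entity_types.get(entity, ()):
            scores[t] += p_q_e / type_sizes[t]
    return dict(scores)


# Hypothetical retrieval scores P(q|e) for the query "Who founded Intel?".
entity_scores = {"Intel": 0.5, "Gordon_Moore": 0.3, "Robert_Noyce": 0.2}
entity_types = {
    "Intel": ["Company"],
    "Gordon_Moore": ["Person"],
    "Robert_Noyce": ["Person"],
}
type_scores = entity_centric_type_scores(entity_scores, entity_types)
```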

SLIDE 44

Type-based Similarity (cont.)

Type ranking [Garigliotti et al., SIGIR’17]: combines scores of entity- and type-centric approaches with taxonomy and type label features

Head-modifier approach [Ma et al., WWW’18]: query and type names are phrases, which consist of a head word (h_q and h_t) and a set of modifiers (M_q and M_t) (e.g. “Italian Nobel prize winners”, “Musicians who appeared in the Blues Brothers movies”)

$$P(q|t) = P(h_t|h_q)^{\alpha_1} P(M_t|h_q)^{\alpha_2} P(h_t|M_q)^{\alpha_3} P(M_t|M_q)^{\alpha_4}$$

SLIDE 45

MRF-based Combined Model

[Raviv et al., JIWES’12]

Entity name E_N, description E_D and types E_T can be combined into a Markov Random Field-based retrieval model:

$$P(E|Q) = \lambda_{E_N} P(E_N|Q) + \lambda_{E_D} P(E_D|Q) + \lambda_{E_T} P(E_T|Q)$$

SLIDE 46

Outline

Entity representation
Entity retrieval
Entity set expansion
Entity ranking

SLIDE 47

Combining IR and Structured Search

[Tonon, Demartini et al., SIGIR’12]

Maintain an inverted index for entity representations and a triple store for entity relations

Hybrid approach: IR models for initial entity retrieval and SPARQL queries for expansion

SLIDE 48

Pipeline

SLIDE 49

Result Expansion Strategies

Follow predicates leading to other entities

Follow predicates leading to entity attributes

Explore entity neighbors and the neighbors of neighbors
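
The expansion strategies amount to a bounded graph traversal from the initially retrieved entities. The sketch below replaces the triple store and SPARQL queries with an in-memory adjacency dictionary, purely for illustration; the predicate names and the `hops` parameter are assumptions.

```python
def expand(seeds, edges, predicates, hops=1):
    """Expand a seed set by following the given predicates for up to `hops` steps.

    edges: entity -> list of (predicate, object) pairs (a stand-in for a triple store)
    """
    seen = set(seeds)
    frontier = list(seeds)
    for _ in range(hops):
        next_frontier = []
        for entity in frontier:
            for predicate, obj in edges.get(entity, []):
                if predicate in predicates and obj not in seen:
                    seen.add(obj)
                    next_frontier.append(obj)
        frontier = next_frontier
    return seen


# Hypothetical graph fragment.
edges = {
    "A": [("owl:sameAs", "B"), ("dbo:wikiPageRedirects", "C"), ("ex:other", "D")],
    "B": [("owl:sameAs", "E")],
}
one_hop = expand(["A"], edges, {"owl:sameAs"}, hops=1)
two_hops = expand(["A"], edges, {"owl:sameAs"}, hops=2)
```

Setting `hops=2` corresponds to exploring neighbors of neighbors; restricting `predicates` keeps the expansion from pulling in loosely related entities.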

SLIDE 50

Predicates to Follow

SLIDE 51

Outline

Entity representation
Entity retrieval
Entity set expansion
Entity ranking

SLIDE 52

Learning-to-Rank Entities

[Dali and Fortuna, WWW’11]

Potential features:

◮ Popularity and importance of Wikipedia page: # of accesses from logs, # of edits, page length
◮ RDF features: # of triples in which E is the subject, the object, or the subject with a literal object; # of categories the Wikipedia page for E belongs to; size of the biggest/smallest/median category
◮ HITS scores and PageRank of the Wikipedia page and of E in the RDF graph
◮ # of hits from search engine API for the top 5 keywords from the abstract of the Wikipedia page for E
◮ Count of entity name in Google N-grams

SLIDE 53

Feature Importance

Features approximating entity importance from the Wikipedia page (hub and authority scores, PageRank) are effective

PageRank and HITS scores on the RDF graph are not effective (outperformed by simpler RDF features)

Google N-grams is an effective proxy for entity popularity, cheaper than a search engine API

Feature combinations improve both robustness and accuracy of ranking

SLIDE 54

Knowledge Graph as Tensor

For a knowledge graph with n distinct entities and m distinct predicates, we construct a tensor X of size n × n × m, where X_ijk = 1 if the k-th predicate holds between the i-th entity and the j-th entity, and X_ijk = 0 otherwise

Each k-th frontal tensor slice X_k is an adjacency matrix for the k-th predicate
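
A minimal sketch of this construction, using nested Python lists rather than a tensor library and made-up triples, might look like this. Indexing entities and predicates in sorted order is an arbitrary choice for the example.

```python
def build_tensor(triples):
    """Return frontal slices such that slices[k][i][j] == X_ijk, plus the index maps."""
    entities = sorted({e for s, _, o in triples for e in (s, o)})
    predicates = sorted({p for _, p, _ in triples})
    e_idx = {e: i for i, e in enumerate(entities)}
    p_idx = {p: k for k, p in enumerate(predicates)}
    n, m = len(entities), len(predicates)

    # One n x n adjacency matrix per predicate, initialised to zeros.
    slices = [[[0] * n for _ in range(n)] for _ in range(m)]
    for s, p, o in triples:
        slices[p_idx[p]][e_idx[s]][e_idx[o]] = 1
    return slices, e_idx, p_idx


# Hypothetical triples.
triples = [
    ("Intel", "foundedBy", "Gordon_Moore"),
    ("Gordon_Moore", "bornIn", "San_Francisco"),
]
slices, e_idx, p_idx = build_tensor(triples)
```

For realistic graphs the tensor is extremely sparse, so a sparse representation per slice would be the practical choice; dense lists are used here only to keep the indices visible.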

SLIDE 55

RESCAL Tensor Factorization

[Nickel et al., ICML’11, WWW’12]

Given r, the number of latent factors, factorize each X_k:

$$X_k \approx A R_k A^T, \quad k = 1, \dots, m$$

where A is a dense n × r matrix of latent embeddings for entities, and R_k is an r × r matrix of latent factors
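
Once A and R_k are available, a single entry of the reconstructed slice is the bilinear form a_i^T R_k a_j. The sketch below only evaluates that form on hand-picked toy values; it does not perform the factorization itself, and the vectors and matrix are invented.

```python
def rescal_score(a_i, r_k, a_j):
    """Reconstructed entry for (entity i, predicate k, entity j): a_i^T R_k a_j."""
    return sum(
        a_i[p] * r_k[p][q] * a_j[q]
        for p in range(len(a_i))
        for q in range(len(a_j))
    )


# Hypothetical 2-dimensional embeddings (rows of A) and one predicate matrix R_k.
a_subject = [1.0, 0.0]
a_object = [0.0, 1.0]
r_k = [
    [0.0, 1.0],
    [0.0, 0.0],
]

forward = rescal_score(a_subject, r_k, a_object)
backward = rescal_score(a_object, r_k, a_subject)
```

Because R_k need not be symmetric, the score depends on the direction of the relation, which matters for predicates that are not symmetric.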

SLIDE 56

KG entity embedding methods

Idea: represent KG entities and relations as dense real-valued vectors (i.e. embeddings) and predict the relation between entities e_s and e_o in a KG based on f(e_s, e_o, Θ)

Interaction-based methods

◮ RESCAL [Nickel et al., ICML’11]: $w_k^T (e_s \otimes e_o)$
◮ LFM [Jenatton et al., NIPS’12]: $e_s^T W_p e_o$
◮ HolE [Nickel et al., AAAI’16]: $\sigma(p^T (e_s \star e_o))$

Neural network-based methods

◮ ER-MLP [Dong et al., KDD’14]: $w^T g(C^T [e_s; p; e_o])$
◮ NTN [Socher et al., NIPS’13]: $w_p^T g(e_s^T W_p^{[1:k]} e_o + C_p^T [e_s; e_o])$
◮ ConvE [Dettmers et al., AAAI’18]: $g(\mathrm{vec}(g([\bar{e}_s; \bar{p}] * \omega)) W) e_o$

Distance-based methods

◮ Unstructured [Bordes et al., AAAI’11]: $-\|e_s - e_o\|_2^2$
◮ SE [Bordes et al., AAAI’11]: $-\|W_{e_s} e_s - W_{e_o} e_o\|_1$
◮ TransE [Bordes et al., NIPS’13]: $-\|e_s + p - e_o\|_{1/2}$

$\otimes$, $\star$, $*$, $\bar{\cdot}$, $[\cdot; \cdot]$ and vec denote the tensor product, cross-correlation, convolution, 2D reshaping, vector concatenation and tensor vectorization operators
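
As one concrete instance of a distance-based scoring function, TransE can be sketched as below. The embeddings are hand-picked toy vectors rather than trained ones, and the `norm` parameter name is an assumption of this example.

```python
import math


def transe_score(e_s, p, e_o, norm=1):
    """TransE score: -||e_s + p - e_o|| under the L1 or L2 norm (higher is better)."""
    diffs = [s + r - o for s, r, o in zip(e_s, p, e_o)]
    if norm == 1:
        return -sum(abs(d) for d in diffs)
    return -math.sqrt(sum(d * d for d in diffs))


# Hypothetical embeddings: the relation vector translates the subject onto the object.
subject = [1.0, 0.0]
relation = [0.0, 1.0]
true_object = [1.0, 1.0]
other_object = [0.0, 0.0]

good = transe_score(subject, relation, true_object)
bad_l1 = transe_score(subject, relation, other_object, norm=1)
bad_l2 = transe_score(subject, relation, other_object, norm=2)
```

A triple that holds should score close to zero, while implausible triples receive increasingly negative scores.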

SLIDE 57

Interpretable KG Entity Embeddings

[Jameel et al., SIGIR’17]

Salient properties of entities are modeled as hyperplanes that separate entities that have a property in their descriptions from the ones that do not

Normals of separating hyperplanes point to the regions where entities with a salient property occur

SLIDE 58

Utilizing entity embeddings for entity re-ranking

[Zhiltsov and Agichtein, CIKM’13]

1. Retrieve initial set of entities
2. Re-rank retrieved entities using similarity metrics to top-k retrieved entities in low-dimensional space as features:

◮ cosine similarity: $\cos(e, e_{top})$
◮ Euclidean distance: $\|e - e_{top}\|_2$
◮ heat kernel: $e^{-\|e - e_{top}\|_2^2 / \sigma}$
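
The three similarity features can be computed directly from a candidate embedding and the embedding of a top-ranked entity. This is a sketch with toy vectors; the exact form of the heat kernel (squared Euclidean distance divided by σ) is an assumption of this example, as is returning 0 for the cosine of a zero vector.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


def euclidean(u, v):
    """Euclidean (L2) distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def heat_kernel(u, v, sigma):
    """exp(-||u - v||^2 / sigma); equals 1 for identical vectors."""
    return math.exp(-(euclidean(u, v) ** 2) / sigma)


def rerank_features(e, e_top, sigma=1.0):
    """Feature vector for re-ranking a candidate against one top-ranked entity."""
    return [cosine(e, e_top), euclidean(e, e_top), heat_kernel(e, e_top, sigma)]


features = rerank_features([3.0, 4.0], [0.0, 0.0], sigma=25.0)
```

These values are then fed, together with the initial retrieval score, into the re-ranking model as additional features.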

SLIDE 59

Ranking KG Entities using Top Documents

[Schuhmacher, Dietz et al., CIKM’15]

Aim: complex entity-focused informational queries (e.g. “Argentine British relations”)

SLIDE 60

Takeaway messages

Use dynamic entity representations built from different sources (not only KG)

Use retrieval models that account for query unigrams and bigrams (FSDM and PFSDM) rather than bag-of-words structured document retrieval models (BM25F and MLM) to obtain candidate entities

Leverage entity links and types in entity retrieval models

Expand candidate entities by following KG links

Re-rank candidate entities by using a variety of features, including the ones based on KG entity embeddings

SLIDE 61

Thank you!
