

slide-1
SLIDE 1

Statistical Relational Learning and Knowledge Graph Reasoning

CSCI 699 JAY PUJARA

slide-2
SLIDE 2

Reminder: Basic problems

  • Who are the entities (nodes) in the graph?
  • What are their attributes and types (labels)?
  • How are they related (edges)?

2

[Figure: schematic graph with entities E1, E2, E3; relations R1, R2, R3; and attributes A1, A2.]

slide-3
SLIDE 3

Motivating Problem: New Opportunities

Internet: massive source of publicly available information

Extraction: cutting-edge IE methods

Knowledge Graph (KG): structured representation of entities, their labels, and the relationships between them

slide-4
SLIDE 4

Motivating Problem: Real Challenges

Internet

Extraction: difficult!

Knowledge Graph: noisy! Contains many errors and inconsistencies

slide-5
SLIDE 5

Graph Construction Issues

Extracted knowledge is:

  • ambiguous:
  • Ex: Beetles, beetles, Beatles
  • Ex: citizenOf, livedIn, bornIn

5

slide-6
SLIDE 6

Graph Construction Issues

Extracted knowledge is:

  • ambiguous
  • incomplete
  • Ex: missing relationships
  • Ex: missing labels
  • Ex: missing entities

6

[Figure: extraction graph fragment with author and co-worker edges, illustrating missing relationships, labels, and entities.]

slide-7
SLIDE 7

Graph Construction Issues

Extracted knowledge is:

  • ambiguous
  • incomplete
  • inconsistent
  • Ex: Cynthia Lennon, Yoko Ono
  • Ex: exclusive labels (alive, dead)
  • Ex: domain-range constraints

7

[Figure: two conflicting spouse edges (to Cynthia Lennon and Yoko Ono).]

slide-8
SLIDE 8

Graph Construction Issues

Extracted knowledge is:

  • ambiguous
  • incomplete
  • inconsistent

8

slide-9
SLIDE 9

NELL: The Never-Ending Language Learner

  • Large-scale IE project (Carlson et al., 2010)
  • Lifelong learning: aims to “read the web”
  • Ontology of known labels and relations
  • Knowledge base contains millions of facts

slide-10
SLIDE 10

Examples of NELL errors

slide-11
SLIDE 11

Kyrgyzstan has many variants:

  • Kyrgystan
  • Kyrgistan
  • Kyrghyzstan
  • Kyrgzstan
  • Kyrgyz Republic

Entity co-reference errors

slide-12
SLIDE 12

Kyrgyzstan is labeled a bird and a country

Missing and spurious labels

slide-13
SLIDE 13

Missing and spurious relations

Kyrgyzstan’s location is ambiguous – Kazakhstan, Russia and US are included in possible locations

slide-14
SLIDE 14

Violations of ontological knowledge

  • Equivalence of co-referent entities (sameAs)
  • SameEntity(Kyrgyzstan, Kyrgyz Republic)
  • Mutual exclusion (disjointWith) of labels
  • MUT(bird, country)
  • Selectional preferences (domain/range) of relations
  • RNG(countryLocation, continent)

Enforcing these constraints requires jointly considering multiple extractions

slide-15
SLIDE 15

Graph Construction approach

  • Graph construction cleans and completes extraction graph
  • Incorporate ontological constraints and relational patterns
  • Discover statistical relationships within knowledge graph

15

slide-16
SLIDE 16

Graph Construction

Probabilistic Models

TOPICS: OVERVIEW, GRAPHICAL MODELS, RANDOM WALK METHODS

16

slide-17
SLIDE 17

Graph Construction

Probabilistic Models

TOPICS: OVERVIEW, GRAPHICAL MODELS, RANDOM WALK METHODS

17

slide-18
SLIDE 18



Voter Party Classification

?

slide-19
SLIDE 19



Statuses & Tweets

Multiple Sources of Information

Voter Party Classification

slide-20
SLIDE 20



Statuses & Tweets Donations

Multiple Sources of Information

Voter Party Classification

slide-21
SLIDE 21



Statuses & Tweets Donations Friends & Followers

Multiple Sources of Information

Voter Party Classification

slide-22
SLIDE 22



Statuses & Tweets Donations Friends & Followers Family

Multiple Sources of Information

Voter Party Classification

slide-23
SLIDE 23

Voter Party Classification

slide-24
SLIDE 24

Voter Party Classification

slide-25
SLIDE 25

Voter Party Classification

$

CarlyFiorinaforVicePresident.com

slide-26
SLIDE 26

Voter Party Classification

$

CarlyFiorinaforVicePresident.com

slide-27
SLIDE 27



Statuses & Tweets Donations Friends & Followers Family

Multiple Sources of Information

Voter Party Classification

slide-28
SLIDE 28

Standard Classification

CarlyFiorinaforVicePresident.com

Bag-of-words features

slide-29
SLIDE 29

Standard Classification

CarlyFiorinaforVicePresident.com

Bag-of-words features

Pr(Y)

slide-30
SLIDE 30

Standard Classification

CarlyFiorinaforVicePresident.com

Bag-of-words features

slide-31
SLIDE 31



Donations Status Updates Friends Family

Multiple Sources of Information

Voter Party Classification

slide-32
SLIDE 32

Collective Classification

Follows

slide-33
SLIDE 33

Collective Classification

Follows

slide-34
SLIDE 34

Collective Classification

Follows Pr(Y)

slide-35
SLIDE 35

Collective Classification

Follows

My label is likely to match that of my follower

slide-36
SLIDE 36

Collective Classification

Follows

Follows(U1, U2) & Votes(U1, P) -> Votes(U2, P)

slide-37
SLIDE 37

Collective Classification

[Figure: users linked by spouse and follower edges.]

slide-38
SLIDE 38

Collective Classification


slide-39
SLIDE 39

Collective Classification

Follows(U1, U2) & Votes(U1, P) -> Votes(U2, P)
Spouse(U1, U2) & Votes(U1, P) -> Votes(U2, P)


slide-40
SLIDE 40

Collective Classification

slide-41
SLIDE 41

Collective Classification

Pr(Y)

slide-42
SLIDE 42

Collective Classification


slide-43
SLIDE 43


Collective Classification

2.0: Follows(U1, U2) & Votes(U1, P) -> Votes(U2, P)
5.0: Spouse(U1, U2) & Votes(U1, P) -> Votes(U2, P)

slide-44
SLIDE 44

Collective Classification


slide-45
SLIDE 45

Collective Classification

€  €   €

spouse spouse colleague colleague spouse friend friend friend friend

? ? ? ? ? ?

slide-46
SLIDE 46

Collective Classification with PSL

/* Local rules */
5.0: Donates(A, P) -> Votes(A, P)
0.3: Mentions(A, “Affordable Health”) -> Votes(A, “Democrat”)
0.3: Mentions(A, “Tax Cuts”) -> Votes(A, “Republican”)
/* Relational rules */
1.0: Votes(A,P) & Spouse(B,A) -> Votes(B,P)
0.3: Votes(A,P) & Friend(B,A) -> Votes(B,P)
0.1: Votes(A,P) & Colleague(B,A) -> Votes(B,P)
/* Range constraint */
Votes(A, “Republican”) + Votes(A, “Democrat”) = 1.0 .
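A minimal sketch (plain Python, with hypothetical people and facts) of how a relational rule such as Votes(A,P) & Spouse(B,A) -> Votes(B,P) gets grounded: every substitution of constants for the logical variables produces one ground rule over concrete atoms.

# Hypothetical toy data: people, parties, and a spouse relation.
people = ["ann", "bob", "carol", "dave"]
parties = ["Democrat", "Republican"]
spouse = {("ann", "bob"), ("bob", "ann"), ("carol", "dave"), ("dave", "carol")}

# Ground the relational rule: Votes(A, P) & Spouse(B, A) -> Votes(B, P)
ground_rules = []
for a in people:
    for b in people:
        if (b, a) in spouse:              # Spouse(B, A) holds in the toy data
            for p in parties:
                ground_rules.append((("Votes", a, p), ("Spouse", b, a), ("Votes", b, p)))

for votes_a, spouse_ba, votes_b in ground_rules:
    print(f"{votes_a} & {spouse_ba} -> {votes_b}")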

slide-47
SLIDE 47

Beyond Pure Reasoning

  • Classical AI approach to knowledge: reasoning

Lbl(Socrates, Man) & Sub(Man, Mortal) -> Lbl(Socrates, Mortal)

47

slide-48
SLIDE 48

Beyond Pure Reasoning

  • Classical AI approach to knowledge: reasoning

Lbl(Socrates, Man) & Sub(Man, Mortal) -> Lbl(Socrates, Mortal)

  • Reasoning difficult when extracted knowledge has errors

48

slide-49
SLIDE 49

Beyond Pure Reasoning

  • Classical AI approach to knowledge: reasoning

Lbl(Socrates, Man) & Sub(Man, Mortal) -> Lbl(Socrates, Mortal)

  • Reasoning difficult when extracted knowledge has errors
  • Solution: probabilistic models

P(Lbl(Socrates, Mortal)|Lbl(Socrates,Man)=0.9)

49

slide-50
SLIDE 50

Logic Refresher: Satisfaction

/* Model Snippet */
Mentions(A, “Affordable Health”) -> Votes(A, “Democrat”)

Affordable Health | Democrat | Logical Satisfaction
TRUE              | TRUE     | satisfied
TRUE              | FALSE    | violated
FALSE             | TRUE     | satisfied
FALSE             | FALSE    | satisfied
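A tiny sketch (plain Python; the rule instance comes from the model snippet above) that checks Boolean satisfaction of the implication for all four assignments and reproduces the table.

def implies(p, q):
    # Material implication: only TRUE -> FALSE violates the rule.
    return (not p) or q

for mentions in (True, False):
    for votes_democrat in (True, False):
        status = "satisfied" if implies(mentions, votes_democrat) else "violated"
        print(f"Mentions={mentions!s:5} Votes Democrat={votes_democrat!s:5} {status}")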

slide-51
SLIDE 51

Logic and Noisy Data

/* Model Snippet */
[1] Mentions(A, “Affordable Health”) -> Votes(A, “Democrat”)
[2] Mentions(A, “Tax Cuts”) -> !Votes(A, “Democrat”)

Affordable Health | Tax Cuts | Democrat | [1] Logical Satisfaction | [2] Logical Satisfaction
TRUE              | TRUE     | TRUE     | satisfied                | violated
TRUE              | TRUE     | FALSE    | violated                 | satisfied

slide-52
SLIDE 52

Logic and Noisy Data

/* Model Snippet */
[1] Mentions(A, “Affordable Health”) -> Votes(A, “Democrat”)
[2] Mentions(A, “Tax Cuts”) -> !Votes(A, “Democrat”)

Affordable Health | Tax Cuts | Democrat | [1] Logical Satisfaction | [2] Logical Satisfaction
TRUE              | TRUE     | TRUE     | satisfied                | violated
TRUE              | TRUE     | FALSE    | violated                 | satisfied

In logic, much as in politics, it is hard to satisfy everyone

slide-53
SLIDE 53

Soft Logic to the Rescue!

/* Model Snippet */
[1] Mentions(A, “Affordable Health”) -> Votes(A, “Democrat”)
[2] Mentions(A, “Tax Cuts”) -> !Votes(A, “Democrat”)

Affordable Health | Tax Cuts | Democrat | [1] Logical Satisfaction | [2] Logical Satisfaction
TRUE              | TRUE     | 0.5      | partially satisfied      | partially satisfied

slide-54
SLIDE 54

What does 0.5 MEAN?

slide-55
SLIDE 55

What does 0.5 mean?

  • Rounding probability:
  • Flip a coin with bias 0.5
  • Heads = TRUE
  • Tails = FALSE
  • Using this method is a ¾-optimal solution to the NP-hard weighted MAX SAT problem [Goemans & Williamson, 1994]

55
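A hedged sketch (plain Python) of the rounding idea: treat the soft truth value as a coin bias and sample a Boolean assignment. This is the randomized-rounding scheme behind the ¾-approximation guarantee cited above; the 0.5 example value is from the earlier slide.

import random

def round_soft_value(truth_value, rng):
    # Flip a coin with bias equal to the soft truth value:
    # heads -> TRUE, tails -> FALSE.
    return rng.random() < truth_value

rng = random.Random(0)
samples = [round_soft_value(0.5, rng) for _ in range(10)]
print(samples)   # a mix of True and False, each drawn with probability 0.5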

slide-56
SLIDE 56

What does the penalty MEAN?

slide-57
SLIDE 57

What does the penalty mean?

P -> Q

/* Soft Logic Penalty */
if P < Q:
    return 0
else:
    return P - Q
slide-58
SLIDE 58

The Penalty: Closed Form

P -> Q

max(0, P-Q)
slide-59
SLIDE 59

[Figure: plot of the soft loss max(0, P-Q) over P, Q in [0, 1]; the loss is 0 whenever Q >= P and grows linearly as Q falls below P.]

The Penalty: Closed Form

P -> Q

max(0, P-Q)

slide-60
SLIDE 60

What does the penalty mean?

/* Model Snippet */
[1] Mentions(A, “Affordable Health”) -> Votes(A, “Democrat”)
[2] Mentions(A, “Tax Cuts”) -> !Votes(A, “Democrat”)

/* Soft Logic Penalty */
if Mentions(A, “Tax Cuts”) < !Votes(A, “Democrat”):
    return 0
else:
    return Mentions(A, “Tax Cuts”) - !Votes(A, “Democrat”)

slide-61
SLIDE 61

Computing the penalty

/* Model Snippet */
[1] Mentions(A, “Affordable Health”) -> Votes(A, “Democrat”)
[2] Mentions(A, “Tax Cuts”) -> !Votes(A, “Democrat”)

Affordable Health | Tax Cuts | Democrat | [1] Penalty | [2] Penalty
1                 | 1        | 0.7      |             |
1                 | 1        | 0.2      |             |

!Q = 1-Q
P -> Q = max(0, P-Q)

slide-62
SLIDE 62

Computing the penalty

/* Model Snippet */
[1] Mentions(A, “Affordable Health”) -> Votes(A, “Democrat”)
[2] Mentions(A, “Tax Cuts”) -> !Votes(A, “Democrat”)

Affordable Health | Tax Cuts | Democrat | [1] Penalty | [2] Penalty
1                 | 1        | 0.7      | 0.3         | 0.7
1                 | 1        | 0.2      | 0.8         | 0.2

!Q = 1-Q
P -> Q = max(0, P-Q)

slide-63
SLIDE 63

Computing the penalty with soft evidence

/* Model Snippet */
[1] Supports(A, “Affordable Health”) -> Votes(A, “Democrat”)
[2] Supports(A, “Tax Cuts”) -> !Votes(A, “Democrat”)

Affordable Health | Tax Cuts | Democrat | [1] Penalty | [2] Penalty
0.4               | 0.1      | 0.65     |             |
0.4               | 0.1      | 0.2      |             |
0.4               | 0.1      | 0.9      |             |

!Q = 1-Q
P -> Q = max(0, P-Q)

slide-64
SLIDE 64

Computing the penalty with soft evidence

/* Model Snippet */
[1] Supports(A, “Affordable Health”) -> Votes(A, “Democrat”)
[2] Supports(A, “Tax Cuts”) -> !Votes(A, “Democrat”)

Affordable Health | Tax Cuts | Democrat | [1] Penalty | [2] Penalty
0.4               | 0.1      | 0.65     |             |
0.4               | 0.1      | 0.2      | 0.2         |
0.4               | 0.1      | 0.9      |             | 0.5

!Q = 1-Q
P -> Q = max(0, P-Q)

slide-65
SLIDE 65

Computing the penalty for arbitrary formulas

!Q = 1-Q
P -> Q = max(0, P-Q)
P & Q = max(0, P+Q-1)
P | Q = min(1, P+Q)
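A minimal sketch (plain Python) of these relaxations; the function names are mine, the formulas are the ones above. The usage lines plug in the hard-evidence example from the earlier table (both Mentions atoms at 1.0, Votes(A, “Democrat”) at 0.7) and recover penalties of 0.3 and 0.7 for rules [1] and [2].

def soft_not(q):
    # !Q = 1 - Q
    return 1.0 - q

def implication_penalty(p, q):
    # Penalty (distance to satisfaction) of P -> Q: max(0, P - Q)
    return max(0.0, p - q)

def soft_and(p, q):
    # Lukasiewicz conjunction: P & Q = max(0, P + Q - 1)
    return max(0.0, p + q - 1.0)

def soft_or(p, q):
    # Lukasiewicz disjunction: P | Q = min(1, P + Q)
    return min(1.0, p + q)

mentions_health, mentions_tax, votes_dem = 1.0, 1.0, 0.7
print(round(implication_penalty(mentions_health, votes_dem), 2))          # rule [1]: 0.3
print(round(implication_penalty(mentions_tax, soft_not(votes_dem)), 2))   # rule [2]: 0.7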

slide-66
SLIDE 66

Underlying Probability Distribution

p(Y | X) = (1 / Z(w, X)) exp[ − Σ_{j=1..m} w_j ( max{ ℓ_j(Y, X), 0 } )^{ρ_j} ],   ρ_j ∈ {1, 2}

Joint probability over soft-truth assignments
Sum over rule penalties
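A small sketch (plain Python, hypothetical weights and penalties) of how the distribution is assembled: every ground rule contributes its weight times a hinge penalty raised to the power 1 or 2, and the unnormalized probability is the exponential of the negated sum.

import math

def unnormalized_prob(ground_rules):
    # ground_rules: list of (weight, penalty, exponent) triples, exponent in {1, 2}.
    total = sum(w * max(penalty, 0.0) ** rho for w, penalty, rho in ground_rules)
    return math.exp(-total)

# Hypothetical ground rules for illustration.
rules = [(5.0, 0.0, 1), (0.3, 0.2, 1), (1.0, 0.1, 2)]
print(unnormalized_prob(rules))   # larger penalties give lower (unnormalized) probability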

slide-67
SLIDE 67

Optimizing PSL models

  • PSL finds optimal assignment for all unknowns
  • Optimal = minimizes the soft-logic penalty
  • Fast, joint convex optimization using ADMM
  • Supports learning rule weights and latent variables
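PSL itself performs this minimization with ADMM-based convex optimization; the sketch below (plain Python, toy evidence values of my own) simply brute-forces the same one-variable objective on a grid to show what is being minimized: the weighted sum of hinge penalties from two local rules of the voter model.

def total_penalty(votes_dem, mentions_health=0.9, mentions_tax=0.4, w1=0.3, w2=0.3):
    # Weighted hinge penalties of:
    #   [1] Mentions(A, "Affordable Health") -> Votes(A, "Democrat")
    #   [2] Mentions(A, "Tax Cuts")          -> !Votes(A, "Democrat")
    p1 = max(0.0, mentions_health - votes_dem)
    p2 = max(0.0, mentions_tax - (1.0 - votes_dem))
    return w1 * p1 + w2 * p2

# A grid search stands in for the convex solver in this one-variable case.
best_value = min((v / 100.0 for v in range(101)), key=total_penalty)
print("MAP-style assignment for Votes(A, 'Democrat'):", best_value)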

slide-68
SLIDE 68

Graph Construction

Probabilistic Models

TOPICS: OVERVIEW, GRAPHICAL MODELS, RANDOM WALK METHODS

69

slide-69
SLIDE 69

Graphical Models: Overview

  • Define joint probability distribution on knowledge graphs
  • Each candidate fact in the knowledge graph is a variable
  • Statistical signals, ontological knowledge and rules parameterize the dependencies between variables

  • Find most likely knowledge graph by optimization/sampling

70

slide-70
SLIDE 70

Motivating Problem (revised)

Internet -> [Large-scale IE] -> (noisy) Extraction Graph -> [Joint Reasoning] -> Knowledge Graph

slide-71
SLIDE 71

Knowledge Graph Identification

Problem: (noisy) Extraction Graph
Solution: Knowledge Graph Identification (KGI)

Extraction Graph -> Knowledge Graph Identification -> Knowledge Graph

Performs graph identification:
  • entity resolution
  • collective classification
  • link prediction
Enforces ontological constraints
Incorporates multiple uncertain sources

slide-72
SLIDE 72

Knowledge Graph Identification

Define a graphical model to perform all three of these tasks simultaneously!

  • Who are the entities (nodes) in the graph?
  • What are their attributes and types (labels)?
  • How are they related (edges)?

73


PUJARA+ISWC13

slide-73
SLIDE 73

Knowledge Graph Identification

P(Who, What, How|Extractions)

74


PUJARA+ISWC13

slide-74
SLIDE 74

Probabilistic Models

  • Use dependencies between facts in KG
  • Probability defined jointly over facts

75


slide-75
SLIDE 75

What determines probability?

  • Statistical signals from text extractors and classifiers

76

slide-76
SLIDE 76

What determines probability?

  • Statistical signals from text extractors and classifiers
  • P(R(John,Spouse,Yoko))=0.75; P(R(John,Spouse,Cynthia))=0.25
  • LevenshteinSimilarity(Beatles, Beetles) = 0.9

77
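A small sketch (plain Python) of one such signal: normalized Levenshtein similarity between two strings. The normalization used here (one minus edit distance divided by the longer length) is my assumption; it yields roughly 0.86 for Beatles/Beetles, in the same ballpark as the 0.9 quoted above.

def levenshtein(a, b):
    # Classic dynamic-programming edit distance.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def levenshtein_similarity(a, b):
    return 1.0 - levenshtein(a, b) / max(len(a), len(b), 1)

print(round(levenshtein_similarity("Beatles", "Beetles"), 2))   # 0.86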

slide-77
SLIDE 77

What determines probability?

  • Statistical signals from text extractors and classifiers
  • Ontological knowledge about domain

78

slide-78
SLIDE 78

What determines probability?

  • Statistical signals from text extractors and classifiers
  • Ontological knowledge about domain
  • Functional(Spouse) & R(A,Spouse,B) -> !R(A,Spouse,C)
  • Range(Spouse, Person) & R(A,Spouse,B) -> Type(B, Person)

79
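A minimal sketch (plain Python, toy facts of my own) of how such ontological knowledge can be checked mechanically: flag candidate facts that violate a functional constraint on Spouse, and propose the label implied by its range restriction.

from collections import defaultdict

# Toy candidate relations and labels (hypothetical).
relations = [("John", "Spouse", "Yoko"), ("John", "Spouse", "Cynthia")]
labels = {("Yoko", "Person"), ("AbbeyRoad", "Album")}

# Functional(Spouse): an entity should have at most one Spouse object.
spouse_objects = defaultdict(set)
for subj, rel, obj in relations:
    if rel == "Spouse":
        spouse_objects[subj].add(obj)
for subj, objs in spouse_objects.items():
    if len(objs) > 1:
        print(f"Functional(Spouse) violated for {subj}: {sorted(objs)}")

# Range(Spouse, Person): the object of a Spouse edge should be labeled Person.
for subj, rel, obj in relations:
    if rel == "Spouse" and (obj, "Person") not in labels:
        print(f"Range(Spouse, Person) implies a missing label: Type({obj}, Person)")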

slide-79
SLIDE 79

What determines probability?

  • Statistical signals from text extractors and classifiers
  • Ontological knowledge about domain
  • Rules and patterns mined from data

80

slide-80
SLIDE 80

What determines probability?

  • Statistical signals from text extractors and classifiers
  • Ontological knowledge about domain
  • Rules and patterns mined from data
  • R(A, Spouse, B) & R(A, Lives, L) -> R(B, Lives, L)
  • R(A, Spouse, B) & R(A, Child, C) -> R(B, Child, C)

81
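A tiny sketch (plain Python, toy facts of my own) of applying a mined rule such as R(A, Spouse, B) & R(A, Lives, L) -> R(B, Lives, L) to propose new candidate edges.

facts = {("John", "Spouse", "Yoko"), ("John", "Lives", "New York")}

def apply_spouse_lives_rule(facts):
    # R(A, Spouse, B) & R(A, Lives, L) -> R(B, Lives, L)
    inferred = set()
    for a, rel1, b in facts:
        if rel1 != "Spouse":
            continue
        for a2, rel2, loc in facts:
            if rel2 == "Lives" and a2 == a:
                inferred.add((b, "Lives", loc))
    return inferred - facts

print(apply_spouse_lives_rule(facts))   # {('Yoko', 'Lives', 'New York')}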

slide-81
SLIDE 81

What determines probability?

  • Statistical signals from text extractors and classifiers
  • P(R(John,Spouse,Yoko))=0.75; P(R(John,Spouse,Cynthia))=0.25
  • LevenshteinSimilarity(Beatles, Beetles) = 0.9
  • Ontological knowledge about domain
  • Functional(Spouse) & R(A,Spouse,B) -> !R(A,Spouse,C)
  • Range(Spouse, Person) & R(A,Spouse,B) -> Type(B, Person)
  • Rules and patterns mined from data
  • R(A, Spouse, B) & R(A, Lives, L) -> R(B, Lives, L)
  • R(A, Spouse, B) & R(A, Child, C) -> R(B, Child, C)

82

slide-82
SLIDE 82

Example: The Fab Four

83

slide-83
SLIDE 83

Illustration of KG Identification

Uncertain Extractions:

.5: Lbl(Fab Four, novel)
.7: Lbl(Fab Four, musician)
.9: Lbl(Beatles, musician)
.8: Rel(Beatles, AlbumArtist, Abbey Road)

PUJARA+ISWC13; PUJARA+AIMAG15

slide-84
SLIDE 84

Illustration of KG Identification

Uncertain Extractions:

.5: Lbl(Fab Four, novel)
.7: Lbl(Fab Four, musician)
.9: Lbl(Beatles, musician)
.8: Rel(Beatles, AlbumArtist, Abbey Road)

[Figure: (annotated) extraction graph with nodes Fab Four, Beatles, and Abbey Road; Lbl edges to musician and novel; and a Rel(AlbumArtist) edge from Beatles to Abbey Road.]

PUJARA+ISWC13; PUJARA+AIMAG15

slide-85
SLIDE 85

Illustration of KG Identification

Ontology:

Dom(albumArtist, musician)
Mut(novel, musician)

Uncertain Extractions:

.5: Lbl(Fab Four, novel)
.7: Lbl(Fab Four, musician)
.9: Lbl(Beatles, musician)
.8: Rel(Beatles, AlbumArtist, Abbey Road)

[Figure: extraction graph with nodes Fab Four, Beatles, and Abbey Road, now annotated with ontology edges Dom and Mut.]

PUJARA+ISWC13; PUJARA+AIMAG15


slide-86
SLIDE 86

Illustration of KG Identification

Ontology:

Dom(albumArtist, musician)
Mut(novel, musician)

Uncertain Extractions:

.5: Lbl(Fab Four, novel)
.7: Lbl(Fab Four, musician)
.9: Lbl(Beatles, musician)
.8: Rel(Beatles, AlbumArtist, Abbey Road)

Entity Resolution:

SameEnt(Fab Four, Beatles)

[Figure: (annotated) extraction graph, now also including a SameEnt edge between Fab Four and Beatles along with the Dom and Mut ontology edges.]

PUJARA+ISWC13; PUJARA+AIMAG15


slide-87
SLIDE 87

Illustration of KG Identification

Ontology:

Dom(albumArtist, musician)
Mut(novel, musician)

Uncertain Extractions:

.5: Lbl(Fab Four, novel)
.7: Lbl(Fab Four, musician)
.9: Lbl(Beatles, musician)
.8: Rel(Beatles, AlbumArtist, Abbey Road)

Entity Resolution:

SameEnt(Fab Four, Beatles)

[Figure: the (annotated) extraction graph and the cleaned graph after Knowledge Graph Identification, in which Beatles and Fab Four are resolved as the same entity, labeled musician, and linked to Abbey Road by Rel(AlbumArtist).]

PUJARA+ISWC13; PUJARA+AIMAG15


slide-88
SLIDE 88

Probabilistic graphical model for KG

Lbl(Fab Four, musician)
Lbl(Beatles, musician)
Rel(Beatles, AlbumArtist, Abbey Road)
Rel(Fab Four, AlbumArtist, Abbey Road)
Lbl(Beatles, novel)
Lbl(Fab Four, novel)

slide-89
SLIDE 89

Defining graphical models

  • Many options for defining a graphical model
  • We focus on two approaches, MLNs and PSL, that use rules
  • MLNs treat facts as Boolean, use sampling for satisfaction
  • PSL infers a “truth value” for each fact via optimization

90

slide-90
SLIDE 90

Rules for KG Model

100: Subsumes(L1,L2) & Label(E,L1) -> Label(E,L2)
100: Exclusive(L1,L2) & Label(E,L1) -> !Label(E,L2)
100: Inverse(R1,R2) & Relation(R1,E,O) -> Relation(R2,O,E)
100: Subsumes(R1,R2) & Relation(R1,E,O) -> Relation(R2,E,O)
100: Exclusive(R1,R2) & Relation(R1,E,O) -> !Relation(R2,E,O)
100: Domain(R,L) & Relation(R,E,O) -> Label(E,L)
100: Range(R,L) & Relation(R,E,O) -> Label(O,L)
10: SameEntity(E1,E2) & Label(E1,L) -> Label(E2,L)
10: SameEntity(E1,E2) & Relation(R,E1,O) -> Relation(R,E2,O)
1: Label_OBIE(E,L) -> Label(E,L)
1: Label_OpenIE(E,L) -> Label(E,L)
1: Relation_Pattern(R,E,O) -> Relation(R,E,O)
1: !Relation(R,E,O)
1: !Label(E,L)

JIANG+ICDM12; PUJARA+ISWC13, PUJARA+AIMAG15

91

slide-91
SLIDE 91

Rules to Distributions

  • Rules are grounded by substituting literals into formulas
  • Each ground rule has a weighted satisfaction derived from the formula’s truth value
  • Together, the ground rules provide a joint probability distribution over knowledge graph facts, conditioned on the extractions

P(G | E) = (1 / Z) exp[ − Σ_{r ∈ R} w_r φ_r(G, E) ]

wr : SameEnt(Fab Four, Beatles) ∧ Lbl(Beatles, musician) ⇒ Lbl(Fab Four, musician)

JIANG+ICDM12; PUJARA+ISWC13

slide-92
SLIDE 92

Probability Distribution over KGs

P(G | E) = (1 / Z) exp[ − Σ_{r ∈ R} w_r φ_r(G) ]

CandLblT(FabFour, novel) ⇒ Lbl(FabFour, novel)
Mut(novel, musician) ∧ Lbl(Beatles, novel) ⇒ ¬Lbl(Beatles, musician)
SameEnt(Beatles, FabFour) ∧ Lbl(Beatles, musician) ⇒ Lbl(FabFour, musician)

slide-93
SLIDE 93

[Figure: factor graph linking the candidate facts Lbl(Fab Four, musician), Lbl(Fab Four, novel), Lbl(Beatles, novel), Lbl(Beatles, musician), and Rel(Beatles, AlbumArtist, Abbey Road) through ground rules φ1..φ5.]

[φ1] CandLblstruct(FabFour, novel) ⇒ Lbl(FabFour, novel)
[φ2] CandRelpat(Beatles, AlbumArtist, AbbeyRoad) ⇒ Rel(Beatles, AlbumArtist, AbbeyRoad)
[φ3] SameEnt(Beatles, FabFour) ∧ Lbl(Beatles, musician) ⇒ Lbl(FabFour, musician)
[φ4] Dom(AlbumArtist, musician) ∧ Rel(Beatles, AlbumArtist, AbbeyRoad) ⇒ Lbl(Beatles, musician)
[φ5] Mut(musician, novel) ∧ Lbl(FabFour, musician) ⇒ ¬Lbl(FabFour, novel)

PUJARA+ISWC13; PUJARA+AIMAG15
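A compact sketch (plain Python) that scores ground rules like [φ1], [φ3], and [φ5] under one soft assignment. The extraction confidence 0.5 for the candidate label comes from the slides; the target truth values below are made-up numbers for illustration, and the penalty is the Lukasiewicz distance to satisfaction used throughout.

def rule_penalty(body_values, head_value):
    # Lukasiewicz conjunction of the body, then distance to satisfaction of body -> head.
    body = max(0.0, sum(body_values) - (len(body_values) - 1))
    return max(0.0, body - head_value)

# Hypothetical soft assignment to the target facts.
lbl_fabfour_novel = 0.1
lbl_fabfour_musician = 0.9
lbl_beatles_musician = 0.9
same_ent = 0.9
mut_novel_musician = 1.0

print(round(rule_penalty([0.5], lbl_fabfour_novel), 2))                                             # [phi1] -> 0.4
print(round(rule_penalty([same_ent, lbl_beatles_musician], lbl_fabfour_musician), 2))               # [phi3] -> 0.0
print(round(rule_penalty([mut_novel_musician, lbl_fabfour_musician], 1.0 - lbl_fabfour_novel), 2))  # [phi5] -> 0.0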

slide-94
SLIDE 94

How do we get a knowledge graph?

Have: P(KG) for all KGs. Need: the best KG.

95

MAP inference: optimizing over distribution to find the best knowledge graph

slide-95
SLIDE 95

Inference and KG optimization

  • Finding the best KG satisfying weighted rules: NP-hard
  • MLNs [discrete]: Monte Carlo sampling methods
  • Solution quality dependent on burn-in time, iterations, etc.
  • PSL [continuous]: optimize convex linear surrogate
  • Fast optimization, ¾-optimal MAX SAT lower bound

96

slide-96
SLIDE 96

Graphical Models Experiments

Data: ~1.5M extractions, ~70K ontological relations, ~500 relation/label types
Task: Collectively construct a KG and evaluate on 25K target facts

Comparisons:
  Extract: average confidences of extractors for each fact in the NELL candidates
  Rules: default, rule-based heuristic strategy used by the NELL project
  MLN (Jiang+, ICDM12): estimates marginal probabilities with MC-SAT
  PSL (Pujara+, ISWC13): convex optimization of continuous truth values with ADMM

Running Time: inference completes in 10 seconds, producing values for 25K facts

JIANG+ICDM12; PUJARA+ISWC13

Method            AUC   F1
Extract           .873  .828
Rules             .765  .673
MLN (Jiang, 12)   .899  .836
PSL (Pujara, 13)  .904  .853

slide-97
SLIDE 97

Graphical Models: Pros/Cons

BENEFITS
  • Define probability distribution over KGs
  • Easily specified via rules
  • Fuse knowledge from many different sources

DRAWBACKS
  • Requires optimization over all KG facts, which can be overkill
  • Dependent on rules from an ontology or expert
  • Requires probabilistic semantics, which may be unavailable