Statistical Relational Learning and Knowledge Graph Reasoning
CSCI 699 JAY PUJARA
Reminder: Basic problems
- Who are the entities (nodes) in the graph?
- What are their attributes and types (labels)?
- How are they related (edges)?
[Figure: example graph with entities E1-E3, attributes A1-A2, and relations R1-R3]
Internet → Extraction → Knowledge Graph (KG)
- Structured representation of entities, their labels, and the relationships between them
- Massive source of publicly available information
- Built with cutting-edge IE methods
Extracted knowledge is:
- Noisy! Contains many errors and inconsistencies
- Difficult!
[Figure: example extractions with conflicting author and spouse relations]
NELL: The Never-Ending Language Learner (Carlson et al., 2010)
- "Reads the web" to extract labels and relations
- Contains millions of facts
Examples of NELL errors:
- Kyrgyzstan has many name variants
- Kyrgyzstan is labeled both a bird and a country
- Kyrgyzstan's location is ambiguous: Kazakhstan, Russia, and the US are all included as possible locations
Enforcing these constraints requires jointly considering multiple extractions.
TOPICS: OVERVIEW, GRAPHICAL MODELS, RANDOM WALK METHODS
Multiple Sources of Information:
- Statuses & Tweets
- Donations (e.g., a $ donation to CarlyFiorinaforVicePresident.com)
- Friends & Followers
- Family
Bag-of-words features (e.g., the domain CarlyFiorinaforVicePresident.com from a donation record) feed a local, per-user prediction Pr(Y).
Follows: my label is likely to match that of my follower.
Follows(U1, U2) & Votes(U1, P) -> Votes(U2, P)
Other relations (e.g., spouse) carry the same signal with different strengths, expressed as rule weights:
2.0: Follows(U1, U2) & Votes(U1, P) -> Votes(U2, P)
5.0: Spouse(U1, U2) & Votes(U1, P) -> Votes(U2, P)
[Figure: social network with spouse, colleague, and friend edges]
/* Local rules */
5.0: Donates(A, P) -> Votes(A, P)
0.3: Mentions(A, "Affordable Health") -> Votes(A, "Democrat")
0.3: Mentions(A, "Tax Cuts") -> Votes(A, "Republican")
/* Relational rules */
1.0: Votes(A,P) & Spouse(B,A) -> Votes(B,P)
0.3: Votes(A,P) & Friend(B,A) -> Votes(B,P)
0.1: Votes(A,P) & Colleague(B,A) -> Votes(B,P)
/* Range constraint */
Votes(A, "Republican") + Votes(A, "Democrat") = 1.0 .
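As a minimal sketch (not the actual PSL implementation), the weighted rules above can be scored under soft logic, where a rule's penalty is its distance to satisfaction. All truth values below are invented for illustration.

```python
# Sketch: scoring weighted soft-logic rules. Truth values are hypothetical.

def implication_penalty(body: float, head: float) -> float:
    """Distance to satisfaction of a soft rule body -> head: max(0, body - head)."""
    return max(0.0, body - head)

# Hypothetical soft truth values for one person A with P = "Democrat"
donates = 0.9          # Donates(A, P)
mentions_health = 0.7  # Mentions(A, "Affordable Health")
votes_democrat = 0.6   # Votes(A, "Democrat")

# Weighted penalties for the two local rules that involve these atoms
p_donate = 5.0 * implication_penalty(donates, votes_democrat)
p_mention = 0.3 * implication_penalty(mentions_health, votes_democrat)
```

A higher-weight rule (5.0 vs. 0.3) contributes a much larger penalty for a comparable violation, so inference prefers assignments that satisfy it.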
Lbl(Socrates, Man) & Sub(Man, Mortal) -> Lbl(Socrates, Mortal)

With uncertain inputs, we want the conditional probability:
P(Lbl(Socrates, Mortal) | Lbl(Socrates, Man) = 0.9)
/* Model Snippet */
Mentions(A, "Affordable Health") -> Votes(A, "Democrat")

Affordable Health | Democrat | Logical Satisfaction
TRUE              | TRUE     | satisfied
TRUE              | FALSE    | violated
FALSE             | TRUE     | satisfied
FALSE             | FALSE    | satisfied
/* Model Snippet */
[1] Mentions(A, "Affordable Health") -> Votes(A, "Democrat")
[2] Mentions(A, "Tax Cuts")

Affordable Health | Tax Cuts | Democrat | [1] Satisfaction | [2] Satisfaction
TRUE              | TRUE     | TRUE     | satisfied        | violated
TRUE              | TRUE     | FALSE    | violated         | satisfied
In logic, much as in politics, it is hard to satisfy everyone
/* Model Snippet */
[1] Mentions(A, "Affordable Health") -> Votes(A, "Democrat")
[2] Mentions(A, "Tax Cuts")

Affordable Health | Tax Cuts | Democrat | [1] Satisfaction | [2] Satisfaction
TRUE              | TRUE     | 0.5      | ?                | ?

With a soft truth value like 0.5, Boolean satisfaction is no longer well defined.
Finding the optimal assignment is an NP-hard weighted MAX-SAT problem [Goemans & Williamson, 1994].
[Figure: soft loss as a function of Q for P = 1, 0.6, 0.2; the loss has a closed form]
/* Model Snippet */
[1] Mentions(A, "Affordable Health") -> Votes(A, "Democrat")
[2] Mentions(A, "Tax Cuts")

/* Soft Logic Penalty */
if Mentions(A, "Tax Cuts") < !Votes(A, "Democrat"):
    return 0
else:
    return Mentions(A, "Tax Cuts") - !Votes(A, "Democrat")
/* Model Snippet */
[1] Mentions(A, "Affordable Health") -> Votes(A, "Democrat")
[2] Mentions(A, "Tax Cuts")

Affordable Health | Tax Cuts | Democrat | [1] Penalty | [2] Penalty
1                 | 1        | 0.7      | 0.3         | 0.7
1                 | 1        | 0.2      | 0.8         | 0.2

!Q = 1 - Q
penalty(P -> Q) = max(0, P - Q)
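The penalty table above can be recomputed directly. Rule [2] is truncated on the slide; following the penalty pseudocode, its penalty is assumed here to compare Mentions(A, "Tax Cuts") against !Votes(A, "Democrat") = 1 - Votes(A, "Democrat").

```python
# Sketch: recomputing the soft-logic penalty table.
# Rule [1]: Mentions(A, "Affordable Health") -> Votes(A, "Democrat")
# Rule [2]: truncated on the slide; head assumed to be !Votes(A, "Democrat").

def penalty(p: float, q: float) -> float:
    """Penalty (distance to satisfaction) of the soft implication p -> q."""
    return max(0.0, p - q)

rows = [(1.0, 1.0, 0.7),   # (Affordable Health, Tax Cuts, Democrat)
        (1.0, 1.0, 0.2)]

table = [(penalty(ah, dem), penalty(tc, 1.0 - dem)) for ah, tc, dem in rows]
# Rounded, this reproduces the table: [(0.3, 0.7), (0.8, 0.2)]
```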
/* Model Snippet */
[1] Supports(A, "Affordable Health") -> Votes(A, "Democrat")
[2] Supports(A, "Tax Cuts")

Affordable Health | Tax Cuts | Democrat | [1] Penalty | [2] Penalty
0.4               | 0.1      | 0.65     |             |
0.4               | 0.1      | 0.2      | 0.2         |
0.4               | 0.1      | 0.9      |             | 0.5

!Q = 1 - Q
penalty(P -> Q) = max(0, P - Q)
!Q = 1 - Q
penalty(P -> Q) = max(0, P - Q)
P & Q = max(0, P + Q - 1)
P | Q = min(1, P + Q)
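A minimal sketch of the Łukasiewicz soft-logic operators listed above (truth values live in [0, 1]):

```python
# Lukasiewicz operators over soft truth values in [0, 1].

def l_not(q: float) -> float:
    return 1.0 - q

def l_and(p: float, q: float) -> float:
    return max(0.0, p + q - 1.0)

def l_or(p: float, q: float) -> float:
    return min(1.0, p + q)

def implication_penalty(p: float, q: float) -> float:
    """Distance to satisfaction of p -> q."""
    return max(0.0, p - q)
```

On Boolean inputs (0.0 / 1.0) these reduce exactly to classical negation, conjunction, and disjunction.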
\[
P(Y \mid X) = \frac{1}{Z(w, X)} \exp\left[-\sum_{j=1}^{m} w_j \left(\max\{\ell_j(Y, X),\, 0\}\right)^{p_j}\right], \qquad p_j \in \{1, 2\}
\]

- Joint probability over soft-truth assignments of the variables
- Sum over rule penalties
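The distribution can be sketched numerically: each grounded rule j contributes a term w_j · max(ℓ_j, 0)^{p_j} to the energy, with p_j in {1, 2}. The weights and penalty values below are invented for illustration, and the normalizer Z(w, X) is omitted.

```python
# Sketch of the unnormalized hinge-loss density (Z omitted).
import math

def unnormalized_density(penalties, weights, exponents):
    """exp(-sum_j w_j * max(l_j, 0)**p_j), with p_j in {1, 2}."""
    energy = sum(w * max(l, 0.0) ** p
                 for l, w, p in zip(penalties, weights, exponents))
    return math.exp(-energy)

# Two rules: one linear (p=1) and one squared (p=2) hinge
d_low = unnormalized_density([0.1, 0.2], [1.0, 2.0], [1, 2])
d_high = unnormalized_density([0.8, 0.9], [1.0, 2.0], [1, 2])
# Smaller total penalty -> higher (unnormalized) probability
```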
TOPICS: OVERVIEW, GRAPHICAL MODELS, RANDOM WALK METHODS
Graphical models parameterize the dependencies between variables.
Problem: Large-scale IE over the Internet yields a (noisy) extraction graph.
Solution: Knowledge Graph Identification (KGI)
Extraction Graph + Joint Reasoning = Knowledge Graph
Performs graph identification:
- Enforces ontological constraints
- Incorporates multiple uncertain sources
Define a graphical model to perform all three of these tasks simultaneously!
- Who are the entities (nodes) in the graph?
- What are their attributes and types (labels)?
- How are they related (edges)?

P(Who, What, How | Extractions)

PUJARA+ISWC13
Uncertain Extractions:
.5: Lbl(Fab Four, novel)
.7: Lbl(Fab Four, musician)
.9: Lbl(Beatles, musician)
.8: Rel(Beatles, AlbumArtist, Abbey Road)

Ontology:
Dom(albumArtist, musician)
Mut(novel, musician)

Entity Resolution:
SameEnt(Fab Four, Beatles)

[Figure: (annotated) extraction graph over Fab Four, Beatles, and Abbey Road with Lbl, Rel(AlbumArtist), Dom, Mut, and SameEnt edges; after Knowledge Graph Identification, Fab Four and Beatles are resolved to a single musician entity with Rel(AlbumArtist, Abbey Road)]

PUJARA+ISWC13; PUJARA+AIMAG15
After KGI:
Lbl(Fab Four, musician) ✓
Lbl(Beatles, musician) ✓
Rel(Beatles, AlbumArtist, Abbey Road) ✓
Rel(Fab Four, AlbumArtist, Abbey Road) ✓
Lbl(Beatles, novel) ✗
Lbl(Fab Four, novel) ✗
/* Ontology rules */
100: Subsumes(L1,L2) & Label(E,L1) -> Label(E,L2)
100: Exclusive(L1,L2) & Label(E,L1) -> !Label(E,L2)
100: Inverse(R1,R2) & Relation(R1,E,O) -> Relation(R2,O,E)
100: Subsumes(R1,R2) & Relation(R1,E,O) -> Relation(R2,E,O)
100: Exclusive(R1,R2) & Relation(R1,E,O) -> !Relation(R2,E,O)
100: Domain(R,L) & Relation(R,E,O) -> Label(E,L)
100: Range(R,L) & Relation(R,E,O) -> Label(O,L)
/* Entity resolution rules */
10: SameEntity(E1,E2) & Label(E1,L) -> Label(E2,L)
10: SameEntity(E1,E2) & Relation(R,E1,O) -> Relation(R,E2,O)
/* Source rules */
1: Label_OBIE(E,L) -> Label(E,L)
1: Label_OpenIE(E,L) -> Label(E,L)
1: Relation_Pattern(R,E,O) -> Relation(R,E,O)
/* Priors */
1: !Relation(R,E,O)
1: !Label(E,L)
JIANG+ICDM12; PUJARA+ISWC13, PUJARA+AIMAG15
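A few of the KGI rules can be grounded on the Fab Four / Beatles example as a sketch. The truth-value assignment below is hypothetical (not the inferred optimum); weights follow the rule list (100 for ontology, 10 for entity resolution, 1 for sources).

```python
# Sketch: total weighted penalty of a few grounded KGI rules under a
# hypothetical soft truth assignment.

def rule_penalty(body_atoms, head: float) -> float:
    """Lukasiewicz penalty of (b1 & b2 & ...) -> head."""
    body = max(0.0, sum(body_atoms) - (len(body_atoms) - 1))
    return max(0.0, body - head)

# Candidate assignment of soft truth values to KG atoms (hypothetical)
lbl_fab_musician = 0.9
lbl_fab_novel = 0.1
lbl_beatles_musician = 0.9
rel_album_artist = 0.9   # Relation(AlbumArtist, Beatles, AbbeyRoad)

total = 0.0
# Domain(AlbumArtist, musician) & Relation(...) -> Label(Beatles, musician)
total += 100 * rule_penalty([1.0, rel_album_artist], lbl_beatles_musician)
# Exclusive(novel, musician) & Label(FabFour, novel) -> !Label(FabFour, musician)
total += 100 * rule_penalty([1.0, lbl_fab_novel], 1.0 - lbl_fab_musician)
# SameEntity(FabFour, Beatles) & Label(Beatles, musician) -> Label(FabFour, musician)
total += 10 * rule_penalty([1.0, lbl_beatles_musician], lbl_fab_musician)
# Source rule with confidence .5 -> Label(FabFour, novel)
total += 1 * rule_penalty([0.5], lbl_fab_novel)
```

Inference searches for the assignment of atom values that minimizes this total weighted penalty.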
Each weighted rule penalizes deviation from the formula's truth value; together the rules define a distribution over knowledge graph facts, conditioned on the extractions:

\[
P(G \mid E) \;\propto\; \exp\left[-\sum_{r \in R} w_r \, \phi_r(G, E)\right]
\]

Example ground rules (each with weight w_r):
CandLblT(FabFour, novel) ⇒ Lbl(FabFour, novel)
Mut(novel, musician) ∧ Lbl(Beatles, novel) ⇒ ¬Lbl(Beatles, musician)
SameEnt(FabFour, Beatles) ∧ Lbl(Beatles, musician) ⇒ Lbl(FabFour, musician)

JIANG+ICDM12; PUJARA+ISWC13
[Figure: ground graphical model linking Lbl(Fab Four, musician), Lbl(Fab Four, novel), Lbl(Beatles, novel), Lbl(Beatles, musician), and Rel(Beatles, albumArtist, Abbey Road) through potentials φ1-φ5]

[φ1] CandLblstruct(FabFour, novel) ⇒ Lbl(FabFour, novel)
[φ2] CandRelpat(Beatles, AlbumArtist, AbbeyRoad) ⇒ Rel(Beatles, AlbumArtist, AbbeyRoad)
[φ3] SameEnt(Beatles, FabFour) ∧ Lbl(Beatles, musician) ⇒ Lbl(FabFour, musician)
[φ4] Dom(AlbumArtist, musician) ∧ Rel(Beatles, AlbumArtist, AbbeyRoad) ⇒ Lbl(Beatles, musician)
[φ5] Mut(musician, novel) ∧ Lbl(FabFour, musician) ⇒ ¬Lbl(FabFour, novel)

PUJARA+ISWC13; PUJARA+AIMAG15
Have: P(KG) for all possible KGs. Need: the best KG.
MAP inference: optimizing over the distribution to find the best knowledge graph.
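MAP inference can be sketched on a toy model with one open variable q = Votes(A, "Democrat"). Maximizing the exponential-family density is the same as minimizing total weighted penalty; here a simple grid search stands in for the convex optimization (e.g., ADMM) used at scale, and the evidence values and weights are invented.

```python
# Sketch: MAP inference as weighted-penalty minimization, via grid search.

def total_penalty(q: float) -> float:
    pen_for = 2.0 * max(0.0, 0.8 - q)              # evidence for "Democrat" (0.8), weight 2
    pen_against = 1.0 * max(0.0, q - (1.0 - 0.6))  # evidence against (0.6), weight 1
    return pen_for + pen_against

# Search q in {0.00, 0.01, ..., 1.00} for the minimum-penalty assignment
best_q = min((i / 100 for i in range(101)), key=total_penalty)
```

The stronger rule wins: the minimizer is q = 0.8, where the weight-2 rule is fully satisfied and only the weaker rule pays a penalty.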
Data: ~1.5M extractions, ~70K ontological relations, ~500 relation/label types
Task: Collectively construct a KG and evaluate on 25K target facts
Comparisons:
- Extract: average confidences of extractors for each fact in the NELL candidates
- Rules: default, rule-based heuristic strategy used by the NELL project
- MLN (Jiang+, ICDM12): estimates marginal probabilities with MC-SAT
- PSL (Pujara+, ISWC13): convex optimization of continuous truth values with ADMM
Running time: inference completes in 10 seconds, producing values for 25K facts

JIANG+ICDM12; PUJARA+ISWC13
Method           | AUC  | F1
Extract          | .873 | .828
Rules            | .765 | .673
MLN (Jiang, 12)  | .899 | .836
PSL (Pujara, 13) | .904 | .853
BENEFITS:
- Defines a distribution over KGs
- Incorporates different sources
DRAWBACKS:
- Reasons about all KG facts: overkill
- Requires ontological semantics: often unavailable