Applications of Rule Mining in Knowledge Bases, Luis Galárraga



slide-1
SLIDE 1

Applications of Rule Mining in Knowledge Bases

Luis Galárraga

November 3rd, 2014 PIKM, Shanghai


slide-2
SLIDE 2

Knowledge Bases (KBs)

[Figure: example KB graph. Edges: marriedTo(Barack Obama, Michelle); bornOn(Barack Obama, Aug 4, 1961); hasChild from Barack Obama and from Michelle to each of Sasha and Malia.]

slide-3
SLIDE 3

KBs in action


slide-4
SLIDE 4

KBs in action


slide-5
SLIDE 5

Some popular KBs


slide-6
SLIDE 6

Rule Mining in KBs

[Figure: example KB graph. Edges: marriedTo(Barack Obama, Michelle); bornOn(Barack Obama, Aug 4, 1961); hasChild from Barack Obama and from Michelle to each of Sasha and Malia.]

slide-7
SLIDE 7

Rule Mining in KBs

[Figure: the example KB graph with some entities replaced by the variables x, y and z. Edge labels: hasChild, marriedTo, bornOn.]

slide-8
SLIDE 8

Rule Mining in KBs

[Figure: example KB graph. Edges: marriedTo(Barack Obama, Michelle); bornOn(Barack Obama, Aug 4, 1961); hasChild from Barack Obama and from Michelle to each of Sasha and Malia.]

slide-9
SLIDE 9

Rule Mining in KBs

[Figure: the example KB graph generalized to the variables x, y and z. Edge labels: marriedTo, hasChild.]

slide-10
SLIDE 10

Rule Mining in KBs

hasChild(y, x), marriedTo(y, z) => hasChild(z, x)

[Figure: the example KB graph generalized to the variables x, y and z. Edge labels: marriedTo, hasChild.]

slide-11
SLIDE 11

Rule Mining in KBs

KBs are often incomplete

[Figure: Elvis Presley, Priscilla and Lisa Marie. Edge labels: hasChild, marriedTo.]

slide-12
SLIDE 12

Rule Mining in KBs

Rules can be used to make predictions

hasChild(y, x), marriedTo(y, z) => hasChild(z, x)

[Figure: Elvis Presley, Priscilla and Lisa Marie. Edge labels: hasChild, marriedTo, and a predicted edge marked "hasChild?".]
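As a minimal sketch of how the rule produces such a prediction, the snippet below stores a toy KB as a set of (subject, relation, object) triples and applies the rule to it. The triple representation and the name `apply_rule` are assumptions made for illustration; they are not taken from AMIE.

```python
# Toy KB as a set of (subject, relation, object) triples (assumed layout).
kb = {
    ("Elvis Presley", "hasChild", "Lisa Marie"),
    ("Elvis Presley", "marriedTo", "Priscilla"),
}

def apply_rule(kb):
    """Apply hasChild(y, x), marriedTo(y, z) => hasChild(z, x).

    Returns the predicted facts that are not already in the KB.
    """
    predictions = set()
    for (y, r1, x) in kb:
        if r1 != "hasChild":
            continue
        for (y2, r2, z) in kb:
            if r2 == "marriedTo" and y2 == y:
                fact = (z, "hasChild", x)
                if fact not in kb:
                    predictions.add(fact)
    return predictions

predicted = apply_rule(kb)
print(predicted)
```

On this toy KB the only new fact is the hasChild edge from Priscilla to Lisa Marie, which is the edge marked with a question mark on the slide.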

slide-13
SLIDE 13

Rule Mining in KBs

Missing information is counter-evidence under the Closed World Assumption

[Figure: Elvis Presley, Priscilla and Lisa Marie. Edge labels: hasChild, hasChild, isMarriedTo.]

slide-14
SLIDE 14

Rule Mining in KBs

KBs operate under the Open World Assumption

[Figure: Elvis Presley, Priscilla and Lisa Marie. Edge labels: hasChild, hasChild, isMarriedTo.]

slide-15
SLIDE 15

Partial Completeness Assumption (PCA)

[Figure: Michelle with hasChild edges to Sasha and Malia.]

slide-16
SLIDE 16

Partial Completeness Assumption (PCA)

[Figure: Michelle with hasChild edges to Sasha and Malia, plus a third hasChild edge.]
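The three assumptions differ only in how they treat a fact that is absent from the KB. The sketch below makes that difference explicit. The function name `status`, the triple layout and the entity "Someone Else" are hypothetical and used only for illustration.

```python
# Toy KB as (subject, relation, object) triples (assumed layout).
kb = {
    ("Michelle", "hasChild", "Sasha"),
    ("Michelle", "hasChild", "Malia"),
    ("Elvis Presley", "hasChild", "Lisa Marie"),
    ("Elvis Presley", "isMarriedTo", "Priscilla"),
}

def status(kb, s, r, o, assumption):
    """Classify the fact r(s, o) as 'true', 'false' or 'unknown'."""
    if (s, r, o) in kb:
        return "true"
    if assumption == "CWA":
        # Closed world: anything not stated is false.
        return "false"
    if assumption == "OWA":
        # Open world: anything not stated is merely unknown.
        return "unknown"
    if assumption == "PCA":
        # Partial completeness: if the KB knows some r-object for s,
        # assume it knows all of them, so a missing one is false.
        knows_some = any(s2 == s and r2 == r for (s2, r2, _) in kb)
        return "false" if knows_some else "unknown"
    raise ValueError(f"unknown assumption: {assumption}")

for a in ("CWA", "OWA", "PCA"):
    print(a, status(kb, "Priscilla", "hasChild", "Lisa Marie", a))
print("PCA", status(kb, "Michelle", "hasChild", "Someone Else", "PCA"))
```

Under the PCA, Priscilla's missing child stays unknown because the KB lists no children for her, while a third child for Michelle is treated as false because the KB already lists some of her children.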

slide-17
SLIDE 17

PCA for Rule Mining

hasChild(y, x), marriedTo(y, z) => hasChild(z, x)

[Figure: the Obama example. Edge labels: marriedTo and hasChild; Michelle has hasChild edges to Sasha and Malia.]

slide-18
SLIDE 18

PCA for Rule Mining

hasChild(y, x), marriedTo(y, z) => hasChild(z, x)

[Figure: the Obama example. Edge labels: marriedTo and hasChild; Michelle has hasChild edges to Sasha and Malia.]

Hits: 2. Misses: none so far.

slide-19
SLIDE 19

PCA for Rule Mining

hasChild(y, x), marriedTo(y, z) => hasChild(z, x)

[Figure: Prince Charles and Camilla with a marriedTo edge; hasChild edges involving Prince William, Tom and Laura.]

Hits: 2. Misses: none so far.

slide-20
SLIDE 20

PCA for Rule Mining

hasChild(y, x), marriedTo(y, z) => hasChild(z, x)

[Figure: Prince Charles and Camilla with a marriedTo edge; hasChild edges involving Prince William, Tom and Laura, plus one additional hasChild edge.]

Hits: 2. Misses: none so far.

slide-21
SLIDE 21

PCA for Rule Mining

hasChild(y, x), marriedTo(y, z) => hasChild(z, x)

[Figure: Prince Charles and Camilla with a marriedTo edge; hasChild edges involving Prince William, Tom and Laura.]

Hits: 2. Misses: 1.

slide-22
SLIDE 22

PCA for Rule Mining

hasChild(y, x), marriedTo(y, z) => hasChild(z, x)

[Figure: Elvis Presley, Priscilla and Lisa Marie. Edge labels: hasChild, marriedTo.]

Hits: 2. Misses: 1.

slide-23
SLIDE 23

PCA for Rule Mining

hasChild(y, x), marriedTo(y, z) => hasChild(z, x)

[Figure: Elvis Presley, Priscilla and Lisa Marie. Edge labels: hasChild, hasChild, marriedTo.]

Hits: 2. Misses: 1.

slide-24
SLIDE 24

PCA for Rule Mining

hasChild(y, x), marriedTo(y, z) => hasChild(z, x)

[Figure: Elvis Presley, Priscilla and Lisa Marie. Edge labels: hasChild, hasChild, marriedTo.]

Hits: 2. Misses: 1.

Standard confidence counts it as a miss

slide-25
SLIDE 25

PCA for Rule Mining

hasChild(y, x), marriedTo(y, z) => hasChild(z, x)

[Figure: Elvis Presley, Priscilla and Lisa Marie. Edge labels: hasChild, hasChild, marriedTo.]

Hits: 2. Misses: 1.

Standard Confidence: 2/4 = 50%
PCA Confidence: 2/3 = 66.67%
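The two figures can be reproduced on a toy KB that combines the three examples from the preceding slides. This is a sketch under assumptions: the triple layout, the function names and the exact edge directions (for instance marriedTo stored only from Barack Obama to Michelle) are chosen to match the counts on the slides and are not taken from AMIE's code.

```python
# Toy KB combining the Obama, Prince Charles and Elvis examples (assumed edges).
kb = {
    ("Barack Obama", "marriedTo", "Michelle"),
    ("Barack Obama", "hasChild", "Sasha"),
    ("Barack Obama", "hasChild", "Malia"),
    ("Michelle", "hasChild", "Sasha"),
    ("Michelle", "hasChild", "Malia"),
    ("Prince Charles", "marriedTo", "Camilla"),
    ("Prince Charles", "hasChild", "Prince William"),
    ("Camilla", "hasChild", "Tom"),
    ("Camilla", "hasChild", "Laura"),
    ("Elvis Presley", "marriedTo", "Priscilla"),
    ("Elvis Presley", "hasChild", "Lisa Marie"),
}

def confidences(kb):
    """Standard and PCA confidence of
    hasChild(y, x), marriedTo(y, z) => hasChild(z, x)."""
    has_child = {(s, o) for (s, r, o) in kb if r == "hasChild"}
    married = {(s, o) for (s, r, o) in kb if r == "marriedTo"}
    # Every (z, x) pair the rule body predicts.
    preds = {(z, x) for (y, x) in has_child for (y2, z) in married if y == y2}
    hits = preds & has_child
    # PCA: a prediction counts only if z has at least one known child.
    known_parents = {s for (s, _) in has_child}
    pca_preds = {(z, x) for (z, x) in preds if z in known_parents}
    return len(hits) / len(preds), len(hits) / len(pca_preds)

std_conf, pca_conf = confidences(kb)
print(f"standard = {std_conf:.2%}, PCA = {pca_conf:.2%}")
```

The prediction about Priscilla is in the standard denominator but not in the PCA denominator, because the KB lists no children for her. The prediction about Camilla is in both, because the KB already lists children for her.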

slide-26
SLIDE 26

AMIE: Association Rule Mining Under Incomplete Evidence

  • AMIE is a system that learns closed Horn rules
  • It performs exhaustive search based on:
    – Minimum support threshold
    – Mining operators
    – Monotonicity of support for pruning
    – Optimized in-memory database
    – Confidence gain is used to prune the output.

hasChild(y, x), marriedTo(y, z) => hasChild(z, x)

Luis Galárraga, Christina Teflioudi, Katja Hose, Fabian Suchanek. AMIE: Association Rule Mining Under Incomplete Evidence in Ontological Knowledge Bases. In WWW, 2013. Best student paper award.
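To illustrate why monotonicity of support allows pruning, the sketch below counts support as the number of distinct head pairs for which head and body both hold, and shows that adding body atoms never increases it. The atom encoding `(relation, subject_var, object_var)`, the helper names and the toy KB are assumptions for illustration, not AMIE's implementation.

```python
# Toy KB as (subject, relation, object) triples (assumed layout).
kb = {
    ("Barack Obama", "marriedTo", "Michelle"),
    ("Barack Obama", "hasChild", "Sasha"),
    ("Barack Obama", "hasChild", "Malia"),
    ("Michelle", "hasChild", "Sasha"),
    ("Michelle", "hasChild", "Malia"),
    ("Prince Charles", "marriedTo", "Camilla"),
    ("Prince Charles", "hasChild", "Prince William"),
    ("Camilla", "hasChild", "Tom"),
    ("Camilla", "hasChild", "Laura"),
    ("Elvis Presley", "marriedTo", "Priscilla"),
    ("Elvis Presley", "hasChild", "Lisa Marie"),
}

def bindings(atoms, kb, env=None):
    """Yield every variable assignment that satisfies all atoms in the KB."""
    env = env or {}
    if not atoms:
        yield dict(env)
        return
    (rel, s_var, o_var), rest = atoms[0], atoms[1:]
    for (s, r, o) in kb:
        if r != rel:
            continue
        new_env = dict(env)
        # setdefault binds a free variable or returns its existing value.
        if new_env.setdefault(s_var, s) != s or new_env.setdefault(o_var, o) != o:
            continue
        yield from bindings(rest, kb, new_env)

def support(head, body, kb):
    """Number of distinct head instantiations where head and body hold."""
    _, hs, ho = head
    return len({(env[hs], env[ho]) for env in bindings([head] + body, kb)})

head = ("hasChild", "z", "x")
s0 = support(head, [], kb)
s1 = support(head, [("marriedTo", "y", "z")], kb)
s2 = support(head, [("marriedTo", "y", "z"), ("hasChild", "y", "x")], kb)
print(s0, s1, s2)
```

Because each added atom can only shrink the set of satisfying head pairs, a candidate whose support falls below the threshold can be discarded together with all of its refinements.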

slide-27
SLIDE 27

[Figure: the head atom hasChild(z, x).]

hasChild(y, x), marriedTo(y, z) => hasChild(z, x)

slide-28
SLIDE 28

[Figure: the head atom hasChild(z, x).]

slide-29
SLIDE 29

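[Figure: the head atom hasChild(z, x).]

Add dangling atom (OD)

[Figure: hasChild(z, x) extended with a new atom ?r that introduces the fresh variable y. Candidate relations for ?r: marriedTo, influences, ….]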

slide-30
SLIDE 30

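[Figure: the head atom hasChild(z, x).]

Add dangling atom (OD)

[Figure: hasChild(z, x) extended with a new atom ?r that introduces the fresh variable y. Candidate relations for ?r: marriedTo, influences, …. Result shown: hasChild(z, x) with a marriedTo atom over y and z.]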

slide-31
SLIDE 31

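[Figure: the head atom hasChild(z, x).]

Add dangling atom (OD)

[Figure: hasChild(z, x) extended with a new atom ?r that introduces the fresh variable y. Candidate relations for ?r: marriedTo, influences, …. Result shown: hasChild(z, x) with a marriedTo atom over y and z.]

Add closing atom (OC)

[Figure: the rule extended with a new atom ?r over two existing variables. Candidate relations for ?r: hasChild, supervises, ….]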

slide-32
SLIDE 32

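[Figure: the head atom hasChild(z, x).]

Add dangling atom (OD)

[Figure: hasChild(z, x) extended with a new atom ?r that introduces the fresh variable y. Candidate relations for ?r: marriedTo, influences, …. Result shown: hasChild(z, x) with a marriedTo atom over y and z.]

Add closing atom (OC)

[Figure: the rule extended with a new atom ?r over two existing variables. Candidate relations for ?r: hasChild, supervises, …. Result shown: hasChild(z, x) with marriedTo over y and z and hasChild over y and x.]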

slide-33
SLIDE 33

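[Figure: the head atom hasChild(z, x).]

Add dangling atom (OD)

[Figure: hasChild(z, x) extended with a new atom ?r that introduces the fresh variable y. Candidate relations for ?r: marriedTo, influences, …. Result shown: hasChild(z, x) with a marriedTo atom over y and z.]

Add closing atom (OC)

[Figure: the rule extended with a new atom ?r over two existing variables. Candidate relations for ?r: hasChild, supervises, …. Result shown: hasChild(z, x) with marriedTo over y and z and hasChild over y and x.]

hasChild(y, x), marriedTo(y, z) => hasChild(z, x)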
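The two refinement steps above can be sketched as candidate generators. This is a simplified illustration: a rule is a list of atoms with the head first, the relation vocabulary `RELATIONS` is a hypothetical toy list, and no support filtering is applied to the candidates.

```python
RELATIONS = ["hasChild", "marriedTo"]  # toy vocabulary (assumed)

def variables(rule):
    """Variables of a rule in order of first appearance."""
    seen = []
    for (_, s, o) in rule:
        for v in (s, o):
            if v not in seen:
                seen.append(v)
    return seen

def add_dangling_atom(rule, fresh):
    """OD: add an atom sharing one variable with the rule and
    introducing one fresh variable."""
    for rel in RELATIONS:
        for v in variables(rule):
            for atom in ((rel, v, fresh), (rel, fresh, v)):
                yield rule + [atom]

def add_closing_atom(rule):
    """OC: add an atom whose two variables already occur in the rule."""
    vs = variables(rule)
    for rel in RELATIONS:
        for a in vs:
            for b in vs:
                atom = (rel, a, b)
                if a != b and atom not in rule:
                    yield rule + [atom]

head = ("hasChild", "z", "x")
after_od = [head, ("marriedTo", "y", "z")]
after_oc = [head, ("marriedTo", "y", "z"), ("hasChild", "y", "x")]

od_candidates = list(add_dangling_atom([head], fresh="y"))
oc_candidates = list(add_closing_atom(after_od))
print(after_od in od_candidates, after_oc in oc_candidates)
```

Starting from the head atom alone, one dangling atom followed by one closing atom reaches the body of the running example rule.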

slide-34
SLIDE 34

AMIE: Association Rule Mining Under Incomplete Evidence

[Figure: AMIE architecture. Inputs: an RDF KB and a minimum support threshold. Components: a tailored in-memory DB and a concurrent mining implementation.]

slide-35
SLIDE 35

AMIE: Association Rule Mining Under Incomplete Evidence

[Figure: AMIE architecture. Inputs: an RDF KB and a minimum support threshold. Components: a tailored in-memory DB and a concurrent mining implementation.]

PCA Confidence used to rank rules

slide-36
SLIDE 36

AMIE: Association Rule Mining Under Incomplete Evidence

Some rules mined by AMIE on YAGO:

isMarriedTo(x, y) ∧ livesIn(x, z) => livesIn(y, z)
isCitizenOf(x, y) => livesIn(x, y)
hasAdvisor(x, y) ∧ graduatedFrom(x, z) => worksAt(y, z)
hasWonPrize(x, Gottfried Wilhelm Leibniz Prize) => livesIn(x, Germany)

slide-37
SLIDE 37

AMIE: Association Rule Mining Under Incomplete Evidence

Dataset              Facts   Runtime    Rules
YAGO2                1M      3.62min    138
YAGO2 (const)        1M      17.76min   18K
DBpedia (2 atoms)    6.7M    2.89min    6.9K

AMIE finds rules in medium-size ontologies in a few minutes.

slide-38
SLIDE 38

AMIE: Association Rule Mining Under Incomplete Evidence

PCA confidence is better for prediction than standard confidence.

slide-39
SLIDE 39

Rules for Ontology Schema Alignment

Rule mining can be used for data integration

[Figure: two KBs side by side. KB 1: Barack Obama with hasChild edges to Sasha and Malia. KB 2: Sasha, Malia and President Obama with two parent edges and a sibling edge.]

slide-40
SLIDE 40

Rules for Ontology Schema Alignment

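Use instance alignments to align the schemas

[Figure: two KBs side by side. KB 1: Barack Obama with hasChild edges to Sasha and Malia. KB 2: Sasha, Malia and President Obama with two parent edges and a sibling edge. Three sameAs links connect the matching entities across the KBs.]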

slide-41
SLIDE 41

Rules for Ontology Schema Alignment

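hasChild(x, y) <=> parent(y, x)

[Figure: two KBs side by side. KB 1: Barack Obama with hasChild edges to Sasha and Malia. KB 2: Sasha, Malia and President Obama with two parent edges and a sibling edge. Three sameAs links connect the matching entities across the KBs.]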

slide-42
SLIDE 42

Rules for Ontology Schema Alignment

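hasChild(y, x), hasChild(y, z) => sibling(x, z)

[Figure: two KBs side by side. KB 1: Barack Obama with hasChild edges to Sasha and Malia. KB 2: Sasha, Malia and President Obama with two parent edges and a sibling edge. Three sameAs links connect the matching entities across the KBs.]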

slide-43
SLIDE 43

Rules for Ontology Schema Alignment

Run AMIE on the coalesced KBs

[Figure: a single merged graph over Barack Obama, Sasha and Malia with hasChild, parent and sibling edges, given to AMIE as input.]

Rules produced by AMIE:

hasChild <=> parent⁻¹
hasChild(y, x), hasChild(y, z) => sibling(x, z)
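A minimal sketch of the coalescing step: entities of KB 2 are rewritten to their KB 1 counterparts through the sameAs links, the triples are merged, and the confidence of each direction of hasChild(x, y) <=> parent(y, x) is measured. The function names, the `same_as` dictionary and the direction of the edges in the toy data are assumptions for illustration.

```python
kb1 = {
    ("Barack Obama", "hasChild", "Sasha"),
    ("Barack Obama", "hasChild", "Malia"),
}
kb2 = {
    ("Sasha", "parent", "President Obama"),
    ("Malia", "parent", "President Obama"),
    ("Sasha", "sibling", "Malia"),
}
# sameAs links, KB 2 entity -> KB 1 entity. Sasha and Malia carry the
# same name in both toy KBs, so only one explicit entry is needed here.
same_as = {"President Obama": "Barack Obama"}

def coalesce(kb1, kb2, same_as):
    """Merge two KBs after rewriting KB 2 entities via sameAs."""
    def canon(e):
        return same_as.get(e, e)
    return kb1 | {(canon(s), r, canon(o)) for (s, r, o) in kb2}

def pairs(kb, rel, inverse=False):
    """Subject/object pairs of a relation, optionally inverted."""
    return {(o, s) if inverse else (s, o) for (s, r, o) in kb if r == rel}

def confidence(a, b):
    """Standard confidence of a(x, y) => b(x, y) over sets of pairs."""
    return len(a & b) / len(a)

merged = coalesce(kb1, kb2, same_as)
child = pairs(merged, "hasChild")
parent_inv = pairs(merged, "parent", inverse=True)
forward = confidence(child, parent_inv)   # hasChild(x, y) => parent(y, x)
backward = confidence(parent_inv, child)  # parent(y, x) => hasChild(x, y)
print(forward, backward)
```

When both directions hold with high confidence, the two relations can be treated as equivalent up to inversion.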

slide-44
SLIDE 44

ROSA rules

ROSA rules are a class of cross-ontology alignments:

r(x, y) => r'(x, y)                  R-subsumption
r(x, y) <=> r'(x, y)                 R-equivalence
type(x, C) => type(x, C')            C-subsumption
r1(x, y), r2(y, z) => r'(x, z)       2-hops translation
r(x, z), r(y, z) => r'(x, y)         Triangle alignment
r1(x, y), r2(x, V) => r'(x, y)       Specific R-subsumption
r(x, V) => r'(x, V')                 Attribute-Value translation
r1(x, V1), r2(x, V2) => r'(x, V')    2-values translation

Luis Galárraga, Nicoleta Preda, Fabian Suchanek. Mining Rules to Align Knowledge Bases. In Automated Knowledge Base Construction Workshop (AKBC), 2013.

slide-45
SLIDE 45

Rule Mining for canonicalization of relations

Open KBs express relations in multiple ways

[Figure: Barack Obama linked to Harvard Law School and Columbia University. Edge labels: "is a graduate of" (one edge), "earned degree from" (two edges).]

slide-46
SLIDE 46

Rule Mining for canonicalization of relations

Problem for query answering

[Figure: Barack Obama linked to Harvard Law School and Columbia University. Edge labels: "is a graduate of" (one edge), "earned degree from" (two edges).]

slide-47
SLIDE 47

Rule Mining for canonicalization of relations

Query: Barack Obama is a graduate of?

[Figure: Barack Obama linked to Harvard Law School and Columbia University. Edge labels: "is a graduate of" (one edge), "earned degree from" (two edges).]

slide-48
SLIDE 48

Rule Mining for canonicalization of relations

Use rule mining to find equivalent relations

is a graduate of <=> earned degree from

[Figure: Barack Obama linked to Harvard Law School and Columbia University. Edge labels: "is a graduate of" (one edge), "earned degree from" (two edges). AMIE outputs the equivalence above.]

Luis Galárraga, Geremy Heitz, Kevin Murphy, Fabian Suchanek. Canonicalizing Open Knowledge Bases. In CIKM, 2014
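The sketch below shows how a mined equivalence between two relation phrases could improve query answering. It is an illustration under assumptions: the assignment of edges to universities, the threshold `MIN_CONF` and the function names are hypothetical, and standard confidence stands in for whatever measure the actual system uses.

```python
# Open KB triples; which phrase links to which university is assumed.
triples = {
    ("Barack Obama", "is a graduate of", "Harvard Law School"),
    ("Barack Obama", "earned degree from", "Harvard Law School"),
    ("Barack Obama", "earned degree from", "Columbia University"),
}

def pairs(rel):
    """Subject/object pairs stated with the given relation phrase."""
    return {(s, o) for (s, r, o) in triples if r == rel}

def confidence(r1, r2):
    """Standard confidence of r1(x, y) => r2(x, y)."""
    a, b = pairs(r1), pairs(r2)
    return len(a & b) / len(a)

MIN_CONF = 0.5  # hypothetical threshold
r1, r2 = "is a graduate of", "earned degree from"
# Treat the phrases as equivalent if both directions are confident enough.
equivalent = confidence(r1, r2) >= MIN_CONF and confidence(r2, r1) >= MIN_CONF

def answer(subject, rel):
    """Answer a query, expanding rel to its equivalent phrases."""
    rels = {r1, r2} if equivalent and rel in (r1, r2) else {rel}
    return {o for (s, r, o) in triples if s == subject and r in rels}

schools = answer("Barack Obama", "is a graduate of")
print(sorted(schools))
```

Without the equivalence, the query returns only the object stated with the phrase "is a graduate of". With it, both universities are returned.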

slide-49
SLIDE 49

Research outlook

  • Numerical correlations

    export(x, y), import(x, z) => cad(x, 1.2 * (z - y))

  • Probabilistic model to learn confidence of predictions
    – Multiple rules can predict a fact
    – Integrate soft and hard constraints
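A rule with a numerical head such as the one above could be checked against data as sketched here. The entities and all figures are hypothetical values invented only to exercise the rule, and the tolerance-based comparison is an assumption about how a numerical prediction might be matched.

```python
import math

# Hypothetical figures, invented only to exercise the rule.
facts = {
    "A": {"export": 100.0, "import": 150.0, "cad": 60.0},
    "B": {"export": 80.0, "import": 70.0, "cad": -12.0},
    "C": {"export": 50.0, "import": 90.0, "cad": 10.0},
}

def rule_holds(f, rel_tol=1e-6):
    """Check export(x, y), import(x, z) => cad(x, 1.2 * (z - y))."""
    predicted = 1.2 * (f["import"] - f["export"])
    return math.isclose(predicted, f["cad"], rel_tol=rel_tol)

hits = [name for name, f in facts.items() if rule_holds(f)]
confidence = len(hits) / len(facts)
print(hits, confidence)
```

Here the rule matches the stored value for A and B but not for C, so its confidence on this toy data is two out of three.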