A Short Introduction to Probabilistic Soft Logic (PowerPoint presentation)

slide-1
SLIDE 1

A Short Introduction to Probabilistic Soft Logic

Angelika Kimmig, Stephen H. Bach, Matthias Broecheler, Bert Huang and Lise Getoor NIPS Workshop on Probabilistic Programming 2012 http://psl.umiacs.umd.edu


slide-2
SLIDE 2

Probabilistic Soft Logic (PSL)

Declarative language to specify graphical models

  • Logical atoms with soft truth values in [0,1]
  • Dependencies as weighted first-order rules
  • Support for similarity functions and aggregation
  • Linear (in)equality constraints

Efficient MPE inference: continuous convex optimization (new approach, NIPS 12)

[Broecheler et al, UAI 10]

slide-9
SLIDE 9

Applications

  • Collective classification
  • Ontology alignment
  • Entity resolution
  • Link prediction
  • Trust in social networks
  • Social group modeling
  • Personalized medicine
  • ...


slide-10
SLIDE 10

Ontology Alignment

[Figure: two example ontologies side by side. Left ontology concepts: Organization, Employees, Developer, Staff, Customers, Service & Products, Hardware, IT Services, Sales Person, Software; relations: work for, provides, buys, helps, interacts, sells to, develops. Right ontology concepts: Company, Employee, Technician, Accountant, Customer, Products & Services, Hardware, Consulting, Sales, Software Dev; relations: works for, develop, buys, helps, interacts with, sells.]

Alignment cues: similar names, similar ranges, similar subconcepts
slide-14
SLIDE 14

Trust Modeling

[Figure: a social network over ann, bob, carla, dan, emma, and fred, with trust values 0.9, 0.7, 0.8, 0.5, 0.2, 0.8, 0.9, 0.5, 0.4 on its edges.]

trusts(X,Y) ∧ trusts(Y,Z) → trusts(X,Z)

slide-18
SLIDE 18

Voter Opinion Modeling

[Figure: the social network over ann, bob, carla, dan, emma, and fred, with edges labeled spouse, colleague, and friend; the preferences of three voters are unknown (?).]

slide-21
SLIDE 21

PSL Program

friend(carla,emma) = 0.9
friend(bob,dan) = 0.4
spouse(ann,bob) = 1.0
prefers(ann, ) = 0.8
...

0.3: lives(A,S) ∧ majority(S,P) → prefers(A,P)
0.8: spouse(B,A) ∧ prefers(B,P) → prefers(A,P)
0.1: similarAge(B,A) ∧ prefers(B,P) → prefers(A,P)
0.4: prefers(A,P) → prefersAvg({A.friend},P)
partial-functional: prefers

[Figure: the social network over nodes a–f with spouse, colleague, and friend edges; three preferences unknown (?).]

slide-23
SLIDE 23

Constraints

partial-functional: prefers

prefers(A, ) + prefers(A, ) ≤ 1.0

[Figure: the social network from the previous slide.]

slide-25
SLIDE 25

0.3: lives(A,S) ∧ majority(S,P) → prefers(A,P)

Local Rules

9

c

b spouse

spouse colleague colleague spouse friend friend

friend

friend

?

a

d

e

f

? ?

slide-26
SLIDE 26

Propagation Rules

0.8: spouse(B,A) ∧ prefers(B,P) → prefers(A,P)

[Figure: the social network from the previous slide.]

slide-27
SLIDE 27

Similarity Rules

0.1: similarAge(B,A) ∧ prefers(B,P) → prefers(A,P)

similarAge: a similarity function with range [0,1]

[Figure: the social network from the previous slide.]

slide-28
SLIDE 28

Sets

0.4: prefers(A,P) → prefersAvg({A.friend},P)

{A.friend}: all X such that friend(A,X); the aggregate's truth value ≔ average truth value of prefers(X,P) over that set

[Figure: the social network from the previous slide.]
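The aggregate can be read as a derived atom whose value is an average. A minimal sketch in Python; the evidence values and the helper name `prefers_avg` are illustrative assumptions, not taken from the slides:

```python
# toy evidence (illustrative truth values, not from the slides)
friend = {("ann", "bob"): 0.9, ("ann", "carla"): 0.6}
prefers = {("bob", "p"): 0.8, ("carla", "p"): 0.4}

def prefers_avg(a, party):
    """Truth value of prefersAvg({a.friend}, party): the average of
    prefers(x, party) over all x with friend(a, x) > 0."""
    vals = [prefers.get((x, party), 0.0)
            for (y, x), v in friend.items() if y == a and v > 0]
    return sum(vals) / len(vals) if vals else 0.0

# prefers_avg("ann", "p") averages 0.8 and 0.4
```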

slide-31
SLIDE 31

PSL Program

  • Ground atoms = random variables
  • Soft truth value assignments
  • Assignments that satisfy more rules are more likely
  • Constraints rule out unwanted assignments

slide-32
SLIDE 32

Probabilistic Model

f(I) = (1/Z) · exp( − Σ_{r∈P} Σ_{g∈G(r)} w_r · (d_g(I))^k )

where I is an interpretation, P is the set of rules, G(r) is the set of groundings of rule r, w_r is rule r's weight, d_g(I) is ground rule g's distance from satisfaction given I, and k ∈ {1,2}. The normalization constant is

Z = ∫ exp( − Σ_{r∈P} Σ_{g∈G(r)} w_r · (d_g(J))^k ) dJ

with the integral taken over all interpretations J.
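Numerically, the unnormalized density is just the exponential of the negative weighted sum of distances. A minimal sketch; the weights and distances below are illustrative assumptions:

```python
import math

def unnormalized_density(ground_rules, k=1):
    """exp(-sum over ground rules of w_r * d_g(I)^k),
    for a list of (weight, distance_from_satisfaction) pairs."""
    return math.exp(-sum(w * d ** k for w, d in ground_rules))

# two interpretations of the same grounded program (weights 0.8 and 0.1):
f1 = unnormalized_density([(0.8, 0.0), (0.1, 0.1)])  # small distances
f2 = unnormalized_density([(0.8, 0.5), (0.1, 0.0)])  # one large weighted distance
# f1 > f2: interpretations with smaller weighted distances are more probable
```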

slide-39
SLIDE 39

Distance from Satisfaction

body → head is satisfied ⇔ truth value of body ≤ truth value of head

Łukasiewicz t-norm (and co-norm):
I(v1 ∧ v2) = max{0, I(v1) + I(v2) − 1}
I(v1 ∨ v2) = min{I(v1) + I(v2), 1}
I(¬v1) = 1 − I(v1)

dr(I) = max{0, I(body) − I(head)}
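The Łukasiewicz connectives and the distance from satisfaction are one-liners; a minimal sketch (function names are my own):

```python
def luk_and(a, b):
    """Lukasiewicz conjunction: I(v1 AND v2) = max{0, I(v1) + I(v2) - 1}."""
    return max(0.0, a + b - 1.0)

def luk_or(a, b):
    """Lukasiewicz disjunction: I(v1 OR v2) = min{I(v1) + I(v2), 1}."""
    return min(a + b, 1.0)

def luk_not(a):
    """Negation: I(NOT v1) = 1 - I(v1)."""
    return 1.0 - a

def distance_from_satisfaction(body, head):
    """d_r(I) = max{0, I(body) - I(head)}; zero iff I(body) <= I(head)."""
    return max(0.0, body - head)
```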

slide-42
SLIDE 42

Distance from Satisfaction

similarAge(bob,ann) ∧ prefers(bob, ) → prefers(ann, )

I(v1 ∧ v2) = max{0, I(v1) + I(v2) − 1}
dr(I) = max{0, I(body) − I(head)}

similarAge = 0.8, prefers(bob, ) = 0.5, prefers(ann, ) = 0.5: max{0, 0.8+0.5−1} = 0.3 ≤ 0.5, so d = 0.0
similarAge = 0.8, prefers(bob, ) = 0.5, prefers(ann, ) = 0.2: 0.3 > 0.2, so d = 0.1
similarAge = 0.8, prefers(bob, ) = 0.9, prefers(ann, ) = 0.2: max{0, 0.8+0.9−1} = 0.7 > 0.2, so d = 0.5
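The three worked cases above can be checked in a few lines; this sketch just re-implements the two formulas from the slide:

```python
def luk_and(a, b):
    # Lukasiewicz conjunction: max{0, a + b - 1}
    return max(0.0, a + b - 1.0)

def distance(body, head):
    # distance from satisfaction of a rule body -> head
    return max(0.0, body - head)

# (similarAge(bob,ann), prefers(bob, .), prefers(ann, .)) for the three cases
cases = [(0.8, 0.5, 0.5), (0.8, 0.5, 0.2), (0.8, 0.9, 0.2)]
ds = [distance(luk_and(sim, pb), pa) for sim, pb, pa in cases]
# ds is approximately [0.0, 0.1, 0.5], matching the slide
```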
slide-47
SLIDE 47

Tasks

  • MPE inference: prefers(bob, ) ≥ prefers(bob, ) ?
  • Computing marginals: P(prefers(bob, ) ≥ 0.8) ?
  • Weight learning
  • Structure learning

slide-51
SLIDE 51

Geometric Intuition: MPE Inference

[Figure: the feasible region over (x1, x2, x3), highlighting the most likely interpretations given x1 = 0.]

d1(I) = max{0, I(x1) − I(x2)}
d2(I) = max{0, I(x2) − I(x3)}
I(x1) + I(x3) ≤ 1

slide-52
SLIDE 52

MPE Inference

  • Convex optimization problem
  • New solver [Bach et al, NIPS 12]
  • Consensus optimization
  • Linear time in practice
  • Closed-form solutions for subproblems

slide-53
SLIDE 53

Consensus Optimization [Bach et al, NIPS 12]

[Figure: rules r1(V'), r2(V''), r3(V''') connected to the original random variables v1, v2, v3, v4.]

  • Each rule works with local copies of its random variables
  • Per rule: optimize truth values of the copies plus agreement with the original variables
  • Update each original variable to the average of its copies
  • Iterate; new: fast solutions for the per-rule subproblems
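The loop above can be sketched as a small consensus-ADMM solver for hinge potentials with exponent k = 1. This is an illustrative sketch under stated assumptions, not the PSL implementation: the potential encoding (weight, variable indices, linear coefficients c, offset d0, with distance max(0, c·z + d0)), the closed-form subproblem solver, and the toy one-atom program are all my own choices for the example.

```python
import numpy as np

def solve_subproblem(w, c, d0, a, rho):
    """argmin_z  w * max(0, c.z + d0) + (rho/2) * ||z - a||^2   (closed form, k = 1)."""
    if c @ a + d0 <= 0.0:                 # hinge inactive at the unconstrained optimum
        return a.copy()
    z = a - (w / rho) * c                 # hinge active: linear region
    if c @ z + d0 >= 0.0:
        return z
    # otherwise the optimum lies on the hyperplane c.z + d0 = 0: project a onto it
    return a - ((c @ a + d0) / (c @ c)) * c

def mpe_admm(potentials, n_vars, rho=1.0, iters=500):
    """Consensus ADMM: rules optimize local copies, copies are averaged back."""
    x = np.full(n_vars, 0.5)                                  # consensus variables
    z = [x[idx].copy() for _, idx, _, _ in potentials]        # local copies per rule
    u = [np.zeros(len(idx)) for _, idx, _, _ in potentials]   # dual variables
    counts = np.zeros(n_vars)
    for _, idx, _, _ in potentials:
        counts[idx] += 1
    counts = np.maximum(counts, 1)
    for _ in range(iters):
        for i, (w, idx, c, d0) in enumerate(potentials):
            z[i] = solve_subproblem(w, c, d0, x[idx] - u[i], rho)
        acc = np.zeros(n_vars)                                # consensus step:
        for i, (_, idx, _, _) in enumerate(potentials):       # average the copies
            acc[idx] += z[i] + u[i]
        x = np.clip(acc / counts, 0.0, 1.0)                   # keep truth values in [0,1]
        for i, (_, idx, _, _) in enumerate(potentials):
            u[i] += z[i] - x[idx]                             # dual update
    return x

# toy program over one unknown atom x0:
#   weight 1.0: evidence 0.8 should propagate,  d = max(0, 0.8 - x0)
#   weight 0.3: weak prior towards 0,           d = max(0, x0)
potentials = [
    (1.0, np.array([0]), np.array([-1.0]), 0.8),
    (0.3, np.array([0]), np.array([1.0]), 0.0),
]
x = mpe_admm(potentials, n_vars=1)
# analytically, the two hinges balance at x0 = 0.8
```

Each subproblem here has a closed-form solution (the three cases in `solve_subproblem`), which is what makes the per-iteration cost linear in the number of ground rules.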
slide-60
SLIDE 60

Geometric Intuition: Marginals

[Figure: the same feasible region over (x1, x2, x3); the probability is the integral of the density f over the slab 0.4 ≤ x2 ≤ 0.6.]

P(0.4 ≤ x2 ≤ 0.6) ?

d1(I) = max{0, I(x1) − I(x2)}
d2(I) = max{0, I(x2) − I(x3)}
I(x1) + I(x3) ≤ 1

slide-61
SLIDE 61

Computing Marginals

Histogram sampling using a hit-and-run Monte Carlo scheme

[Figure: histogram of sampled truth values for an atom xi.]

[Broecheler and Getoor, NIPS 10]

slide-62
SLIDE 62

Hit-and-Run

[Figure: from the current point p, choose a random direction d, then sample the next point q along the feasible segment through p in direction d.]
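A hit-and-run step can be sketched for the simpler case of uniform sampling over a polytope {x : Ax ≤ b}; sampling from the PSL density would additionally draw t from the 1-D restriction of f along the chord instead of uniformly. The example region and starting point below are illustrative assumptions:

```python
import numpy as np

def hit_and_run_uniform(A, b, x0, n_samples, seed=0):
    """Uniform samples from {x : A @ x <= b} via hit-and-run, from interior point x0."""
    rng = np.random.default_rng(seed)
    A, b = np.asarray(A, float), np.asarray(b, float)
    x = np.asarray(x0, dtype=float)
    samples = []
    for _ in range(n_samples):
        d = rng.normal(size=x.size)          # random direction (p -> d on the slide)
        d /= np.linalg.norm(d)
        tmin, tmax = -np.inf, np.inf         # feasible segment along x + t*d
        for a_i, b_i in zip(A, b):
            s, r = a_i @ d, b_i - a_i @ x
            if s > 1e-12:
                tmax = min(tmax, r / s)
            elif s < -1e-12:
                tmin = max(tmin, r / s)
        x = x + rng.uniform(tmin, tmax) * d  # next point q, uniform on the segment
        samples.append(x.copy())
    return np.array(samples)

# example region: the unit square intersected with x0 + x1 <= 1 (a triangle)
A = [[-1, 0], [0, -1], [1, 0], [0, 1], [1, 1]]
b = [0, 0, 1, 1, 1]
samples = hit_and_run_uniform(A, b, x0=[0.2, 0.2], n_samples=5000)
```

Every sample stays feasible by construction, since the step length is restricted to the chord that the direction cuts through the polytope.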

slide-67
SLIDE 67

Probabilistic Soft Logic (PSL)

Declarative language to specify graphical models

  • Logical atoms with soft truth values in [0,1]
  • Dependencies as weighted first-order rules
  • Support for similarity functions and aggregation
  • Linear (in)equality constraints

Inference

  • MPE: consensus optimization
  • Marginals: hit-and-run histogram sampling


slide-68
SLIDE 68

Thank you!

http://psl.umiacs.umd.edu
