Fast Learning of Relational Dependency Networks



SLIDE 1

Fast Learning of Relational Dependency Networks

SLIDE 2

Relational Dependency Networks

Neville, J. & Jensen, D. (2007), 'Relational Dependency Networks', Journal of Machine Learning Research 8, 653--692.

  • Structure: directed graph, cycles are allowed.
  • Parents of node = Markov blanket of node.
  • Parameter = distribution of child given parents.
  • Accommodates relational autocorrelations.

[Figure: dependency network over the nodes CoffeeDr(A), Friend(A,B), gender(A), and gender(B), with A in Person and B in Person.]

SLIDE 3

Overview

Task: learn a relational dependency network (structure + parameters).

  • Previous approaches: multiple discriminative models, independently learned (one for each predicate).
  • Our new approach: learn a single generative model (a Bayesian network) with fast learning, e.g., 1 min for 1M records, then convert the Bayesian network to a relational dependency network with a new closed-form transformation method.

SLIDE 4

From BN Structure To DN Structure

  • Solid arrows = Bayesian network
  • Solid + dashed arrows = dependency network

Heckerman, D.; Chickering, D. M.; Meek, C.; Rounthwaite, R.; Kadie, C. & Kaelbling, P. (2000), 'Dependency Networks for Inference, Collaborative Filtering, and Data Visualization', Journal of Machine Learning Research 1, 49--75.

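[Figure: graph over CoffeeDr(A), Friend(A,B), gender(A), and gender(B), with solid Bayesian network arrows and added dashed dependency network arrows.]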
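The structure conversion can be illustrated with a minimal sketch that computes each node's Markov blanket (parents, children, and the children's other parents) from a list of Bayesian network edges. The edge list below is an assumption: it is read off the conditional probabilities used in the talk's worked example, not recovered from the figure, and it treats each first-order term as a plain graph node. The function name markov_blankets is hypothetical.

```python
from collections import defaultdict

# Assumed BN edges (parent -> child), inferred from the conditional
# probabilities in the worked example; the slide's figure is not recoverable.
BN_EDGES = [
    ("gender(B)", "gender(A)"),
    ("Friend(A,B)", "gender(A)"),
    ("gender(A)", "CoffeeDr(A)"),
]


def markov_blankets(edges):
    """Map each node to its Markov blanket in the DAG given by `edges`."""
    parents, children = defaultdict(set), defaultdict(set)
    for parent, child in edges:
        parents[child].add(parent)
        children[parent].add(child)

    nodes = set(parents) | set(children)
    blankets = {}
    for node in nodes:
        # Parents and children of the node ...
        blanket = set(parents[node]) | set(children[node])
        # ... plus the other parents of each child (co-parents).
        for child in children[node]:
            blanket |= parents[child]
        blanket.discard(node)
        blankets[node] = blanket
    return blankets


if __name__ == "__main__":
    # In the dependency network, each node's parents are its Markov blanket.
    for node, blanket in sorted(markov_blankets(BN_EDGES).items()):
        print(node, "<-", sorted(blanket))
```

Under these assumed edges, gender(A) ends up with gender(B), Friend(A,B), and CoffeeDr(A) as dependency network parents, so the child CoffeeDr(A) contributes an extra (dashed) arrow back to gender(A).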

SLIDE 5

From BN Parameters to DN Parameters

  • Log-linear model for the probability of a target instance given its Markov blanket.
  • Example: predict the gender of Sam, given that
      • 40% of Sam’s friends are women, and
      • Sam is a coffee drinker.


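BN parameters + Markov blanket → DN parameter:

\[
P(\text{target} = \text{value} \mid \text{Markov blanket}) \;\propto\; \exp\Bigl\{ \sum_{\text{target instance} \,+\, \text{children}} \;\; \sum_{\text{parent values } PV,\ \text{child values } CV} \ln P(CV \mid PV) \cdot \text{frequency}(CV, PV) \Bigr\}
\]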

SLIDE 6

Example

  • Predict the gender of Sam, given that
      • 40% of Sam’s friends are women, and
      • Sam is a coffee drinker:

P(g(A) = W | g(B) = W, F(A,B) = T) = 0.55
P(g(A) = M | g(B) = M, F(A,B) = T) = 0.63
P(cd(A) = T | g(A) = M) = 0.6
P(cd(A) = T | g(A) = W) = 0.8

[Figure: network fragment over CoffeeDr(sam), Friend(sam,B), gender(sam), and gender(B).]

Child Value    Parent State               CP     log(CP)   Rel. Freq.   log(CP) * Freq.
g(sam) = W     g(B) = W, F(sam,B) = T     0.55   -0.60     0.40         -0.24
g(sam) = W     g(B) = M, F(sam,B) = T     0.37   -0.99     0.60         -0.60
cd(sam) = T    g(sam) = W                 0.80   -0.22     1.00         -0.22
cd(sam) = F    g(sam) = W                 0.20   -1.61     0.00          0.00
Sum                                                                     -1.06

EXP(Sum) ∝ P(gender(sam) = W | MB)
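The arithmetic of this example can be checked with a short sketch. The rows for gender(sam) = W are taken from the table. The rows for gender(sam) = M are not on the slide: they are an assumption, built from the listed probabilities and their complements, and are included only to show how the proportional scores might be normalized over the two values. The helper name log_score is hypothetical.

```python
import math


def log_score(rows):
    """Sum ln(CP) * relative frequency over (CP, frequency) rows."""
    # Rows with zero frequency contribute nothing to the sum.
    return sum(freq * math.log(cp) for cp, freq in rows if freq > 0)


# Rows from the slide's table for the hypothesis gender(sam) = W.
rows_w = [
    (0.55, 0.40),  # g(sam)=W | g(B)=W, F(sam,B)=T
    (0.37, 0.60),  # g(sam)=W | g(B)=M, F(sam,B)=T
    (0.80, 1.00),  # cd(sam)=T | g(sam)=W
    (0.20, 0.00),  # cd(sam)=F | g(sam)=W
]

# ASSUMPTION: rows for gender(sam) = M, derived from the listed
# probabilities and their complements; not shown in the talk.
rows_m = [
    (0.45, 0.40),  # g(sam)=M | g(B)=W, F(sam,B)=T  (1 - 0.55)
    (0.63, 0.60),  # g(sam)=M | g(B)=M, F(sam,B)=T
    (0.60, 1.00),  # cd(sam)=T | g(sam)=M
    (0.40, 0.00),  # cd(sam)=F | g(sam)=M  (1 - 0.6)
]

score_w = log_score(rows_w)
score_m = log_score(rows_m)

# Normalize exp(score) over the two candidate values of gender(sam).
p_w = math.exp(score_w) / (math.exp(score_w) + math.exp(score_m))

if __name__ == "__main__":
    print("log score for W:", round(score_w, 2))
    print("normalized P(gender(sam) = W | MB):", round(p_w, 3))
```

The log score for W rounds to -1.06, matching the Sum row of the table. The normalized probability depends on the assumed M rows and should be read as illustrative only.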

SLIDE 7

Evaluation Metrics

 Running time  Conditional Log Likelihood (CLL)

 How confident we are with the prediction

 Area Under Precision-Recall Curve (PR)

 For skewed distributions.

 Results are averaged over 5-fold cross-validation, over all

two-class predicates in the dataset.

 Comparison Methods: RDN-Boost, MLN-Boost.

Natarajan, S.; Khot, T.; Kersting, K.; Gutmann, B. & Shavlik, J. W. (2012), 'Gradient-based boosting for statistical relational learning: The relational dependency network case', Machine Learning 86(1), 25--56.
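As a rough illustration of the two accuracy metrics, the sketch below computes an average conditional log-likelihood and a step-wise area under the precision-recall curve (average precision) for a two-class predicate. The talk does not say how either quantity was computed, so both function definitions are assumptions, ties between scores are not handled specially, and the scores and labels are toy values, not results from the evaluation.

```python
import math


def conditional_log_likelihood(true_value_probs):
    """Average log-probability the model assigns to the observed values."""
    return sum(math.log(p) for p in true_value_probs) / len(true_value_probs)


def average_precision(scores, labels):
    """Step-wise area under the precision-recall curve."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    positives = sum(labels)
    hits = 0
    area = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            # Precision at the rank where recall increases.
            area += hits / rank
    return area / positives


# Toy data (illustrative only): predicted P(value = true) and 0/1 labels.
scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]

# Probability assigned to the value that was actually observed.
true_value_probs = [s if y else 1.0 - s for s, y in zip(scores, labels)]

cll = conditional_log_likelihood(true_value_probs)
ap = average_precision(scores, labels)

if __name__ == "__main__":
    print("CLL:", cll)
    print("PR area (average precision):", ap)
```

A CLL closer to 0 means the model puts more probability on the observed values, while the precision-recall area focuses on how well the positive class is ranked, which is why it suits skewed distributions.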

SLIDE 8

Accuracy Comparison

[Figure: two bar charts, CLL and PR, comparing RDN_Boost, MLN_Boost, and RDN_Bayes on UW, Mondial, Hepatitis, Muta, and MovieLens(0.1M).]

SLIDE 9

Learning Time Comparison

Dataset           # Predicates   # tuples     RDN_Boost     MLN_Boost      RDN_Bayes
UW                14             612          15±0.3        19±0.7         1±0.0
Mondial           18             870          27±0.9        42±1.0         102±6.9
Hepatitis         19             11,316       251±5.3       230±2.0        286±2.9
Mutagenesis       11             24,326       118±6.3       49±1.3         1±0.0
MovieLens(0.1M)   7              83,402       44±4.5 min    31±1.87 min    1±0.0
MovieLens(1M)     7              1,010,051    >24 hours     >24 hours      10±0.1

  • Standard deviations are shown.
  • Units are seconds unless otherwise stated.


SLIDE 10

RDN-Bayes uses more relevant predicates and more first-order variables

  • Our best predicate for each database:

Database      Target Predicate   # extra predicates   # extra first-order variables   CLL-diff
Mondial       religion           11                   1                               0.58
IMDB          gender             6                    2                               0.30
UW-CSE        student            4                    1                               0.50
Hepatitis     sex                4                    2                               0.20
Mutagenesis   ind1               5                    1                               0.56
MovieLens     gender             1                    1                               0.26


SLIDE 11

Structure Comparison Example IMDB


Model       Target      Markov Blanket
RDN-Boost   gender(U)   Occupation(U), Age(U)
RDN-Bayes   gender(U)   Occupation(U), Age(U), Rating(U,M), RunningTime(M), CastMember(M,X), AGender(X)

[Figure: database schema with the column groups (UserID, Occupation, Age, gender), (UserID, MovieID, Rating), (MovieID, Time), (ActorID, MovieID), and (ActorID, AGender), annotated with RDN-Boost and RDN-Bayes.]

SLIDE 12

Conclusions

  • Basic idea: convert Bayesian networks to relational dependency networks.
      • Fast BN learning ⇒ fast DN learning.
      • Dependency networks ⇒ inference with cyclic dependencies/autocorrelations.
  • New log-linear model for converting BN parameters to DN parameters.
      • I.e., define the probability of a node given its Markov blanket and the Bayes net model.
  • Empirical evaluation
      • Scales very well with the number of records.
      • Competitive accuracy with functional gradient boosting.


SLIDE 13

There’s More

  • Empirical comparisons
      • counts instead of frequencies
      • weight learning
      • more on MLN-Boost
  • Theorems about dependency network consistency


SLIDE 14

The End

Any questions?
