Fast Learning of Relational Dependency Networks

Relational Dependency Networks
• Structure: directed graph; cycles are allowed.
• Parents of node = Markov blanket of node.
• Parameter = distribution of child given parents.
• Accommodates relational autocorrelations.
[Figure: example network over gender(A), gender(B), Friend(A,B) and CoffeeDr(A), with A and B in Person.]
Neville, J. & Jensen, D. (2007), 'Relational Dependency Networks', Journal of Machine Learning Research 8, 653--692.

Overview
Task: learn relational dependency network structure + parameters.
• Previous approaches: multiple discriminative models, independently learned (one for each predicate).
• Our new approach: a single generative model.
  • Fast learning of a Bayesian network, e.g., 1 min for 1M records.
  • New closed-form transformation method: convert the Bayesian network to a relational dependency network.

From BN Structure To DN Structure
• Solid arrows = Bayesian network.
• Solid + dashed arrows = dependency network.
[Figure: graph over gender(B), Friend(A,B), gender(A) and CoffeeDr(A).]
Heckerman, D.; Chickering, D. M.; Meek, C.; Rounthwaite, R.; Kadie, C. & Kaelbling, P. (2000), 'Dependency Networks for Inference, Collaborative Filtering, and Data Visualization', Journal of Machine Learning Research 1, 49--75.
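The structure conversion can be sketched in a few lines, assuming the standard definition of a Markov blanket (parents, children, and the children's other parents). The edge list below is an assumption read off the conditional probabilities quoted in the worked example of this deck, not a copy of the figure, and `markov_blankets` is a hypothetical helper name.

```python
from collections import defaultdict


def markov_blankets(bn_edges):
    """Map each node to its Markov blanket: parents, children, and co-parents."""
    parents = defaultdict(set)
    children = defaultdict(set)
    nodes = set()
    for parent, child in bn_edges:
        parents[child].add(parent)
        children[parent].add(child)
        nodes.update((parent, child))

    blankets = {}
    for node in nodes:
        blanket = set(parents[node]) | set(children[node])
        for child in children[node]:
            blanket |= parents[child]  # co-parents of each child
        blanket.discard(node)
        blankets[node] = blanket
    return blankets


# Assumed Bayesian network edges (parent, child); illustrative only.
bn_edges = [
    ("gender(B)", "gender(A)"),
    ("Friend(A,B)", "gender(A)"),
    ("gender(A)", "CoffeeDr(A)"),
]

# In the dependency network, the parents of a node are its Markov blanket.
dn_parents = markov_blankets(bn_edges)
for node in sorted(dn_parents):
    print(node, "<-", sorted(dn_parents[node]))
```

Under these assumed edges, gender(A) gains CoffeeDr(A) as a dependency-network parent in addition to its two Bayesian-network parents, which is the kind of added (dashed) arrow the slide describes.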

From BN Parameters to DN Parameters
• Log-linear model for the probability of a target instance given its Markov blanket.
• Example: predict the gender of Sam, given that 40% of Sam's friends are women and Sam is a coffee drinker.
\[ P(\text{target} = \text{value} \mid \text{Markov blanket}) \propto \exp\Big\{ \sum_{\text{target instance} + \text{children}} \ \sum_{\text{parent values } PV,\ \text{child values } CV} \ln P(CV \mid PV) \cdot \text{frequency}(CV, PV) \Big\} \]
Slide callouts: the left-hand side is the DN parameter, P(CV | PV) is the BN parameter, and the conditioning set is the Markov blanket.

Example
Predict the gender of Sam, given that 40% of Sam's friends are women and Sam is a coffee drinker:
Child Value    Parent State                  CP     log(CP)   Rel. Freq.   log(CP) * Freq.
g(sam) = W     g(B) = W, F(sam,B) = T        0.55   -0.60     0.40         -0.24
g(sam) = W     g(B) = M, F(sam,B) = T        0.37   -0.99     0.60         -0.60
cd(sam) = T    g(sam) = W                    0.80   -0.22     1.00         -0.22
cd(sam) = F    g(sam) = W                    0.20   -1.61     0.00          0.00
Sum                                                                        -1.06
EXP(Sum) ∝ P(gender(sam) = W | MB)
BN parameters:
P(g(A) = W | g(B) = W, F(A,B) = T) = 0.55
P(g(A) = M | g(B) = M, F(A,B) = T) = 0.63
P(cd(A) = T | g(A) = M) = 0.6
P(cd(A) = T | g(A) = W) = 0.8
[Figure: network over gender(B), Friend(sam,B), gender(sam) and CoffeeDr(sam).]
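The arithmetic in this example can be replayed with a short script. This is a sketch under stated assumptions: gender is treated as binary, so P(g(sam) = M | g(B) = W, F = T) = 0.45 is obtained as the complement of the quoted 0.55, and the unnormalized scores for W and M are normalized against each other to get a probability. The variable and function names are illustrative.

```python
import math

# Relative frequency of each friend gender in Sam's Markov blanket.
friend_freq = {"W": 0.40, "M": 0.60}

# P(g(sam) = child | g(B) = parent, F(sam,B) = T), keyed as (child, parent).
# 0.45 and 0.37 are complements of the quoted 0.55 and 0.63 (binary gender assumed).
p_gender = {
    ("W", "W"): 0.55, ("M", "W"): 0.45,
    ("W", "M"): 0.37, ("M", "M"): 0.63,
}

# P(cd(sam) = T | g(sam)); Sam is a coffee drinker, so this term has frequency 1.
p_coffee_true = {"W": 0.80, "M": 0.60}


def log_score(value):
    """Sum of ln(CP) * frequency over the target node and its child node."""
    score = sum(
        freq * math.log(p_gender[(value, friend_gender)])
        for friend_gender, freq in friend_freq.items()
    )
    score += 1.0 * math.log(p_coffee_true[value])
    # The cd(sam) = F row has frequency 0 and contributes nothing.
    return score


scores = {value: log_score(value) for value in ("W", "M")}
total = sum(math.exp(s) for s in scores.values())
posterior = {value: math.exp(s) / total for value, s in scores.items()}

print(round(scores["W"], 2))  # matches the Sum row of the table, -1.06
print(posterior)
```

The score for W reproduces the -1.06 in the table; the normalized probability for W comes out only slightly above one half, because the coffee-drinking evidence for W is nearly offset by the majority of male friends.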

Evaluation Metrics
• Running time.
• Conditional Log Likelihood (CLL): how confident we are in the prediction.
• Area Under Precision-Recall Curve (PR): for skewed distributions.
• Results are averaged over 5-fold cross-validation, over all two-class predicates in the dataset.
• Comparison methods: RDN-Boost, MLN-Boost.
Natarajan, S.; Khot, T.; Kersting, K.; Gutmann, B. & Shavlik, J. W. (2012), 'Gradient-based boosting for statistical relational learning: The relational dependency network case', Machine Learning 86(1), 25-56.
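For readers unfamiliar with the two accuracy metrics, here is a minimal sketch on made-up toy data. It assumes CLL is reported as the mean log-probability assigned to the true label, and it uses average precision as one common estimate of the area under the precision-recall curve; the deck does not say which estimator was used, and the function names are illustrative.

```python
import math


def conditional_log_likelihood(probs_of_true_label):
    """Mean log-probability the model assigned to each instance's true label."""
    return sum(math.log(p) for p in probs_of_true_label) / len(probs_of_true_label)


def average_precision(scores, labels):
    """Average precision over a ranking by score (ties are not treated specially)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(labels)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this recall point
    return total / n_pos


# Toy predictions (hypothetical): P(label = 1) for four test instances.
pred_pos = [0.9, 0.8, 0.3, 0.2]
labels = [1, 0, 1, 0]

probs_true = [p if y == 1 else 1.0 - p for p, y in zip(pred_pos, labels)]
cll = conditional_log_likelihood(probs_true)
ap = average_precision(pred_pos, labels)
print(cll, ap)
```

A CLL closer to 0 means the model put more probability on the true labels. Average precision rewards ranking the rare positive instances first, which is why a precision-recall measure suits skewed class distributions.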

Accuracy Comparison
[Figure: two bar charts, PR and CLL, comparing RDN_Boost, MLN_Boost and RDN_Bayes on UW, Mondial, Hepatitis, Muta and MovieLens(0.1M).]

Learning Time Comparison
Dataset           # Predicates   # tuples    RDN_Boost     MLN_Boost      RDN_Bayes
UW                14             612         15±0.3        19±0.7         1±0.0
Mondial           18             870         27±0.9        42±1.0         102±6.9
Hepatitis         19             11,316      251±5.3       230±2.0        286±2.9
Mutagenesis       11             24,326      118±6.3       49±1.3         1±0.0
MovieLens(0.1M)   7              83,402      44±4.5 min    31±1.87 min    1±0.0
MovieLens(1M)     7              1,010,051   >24 hours     >24 hours      10±0.1
• Standard deviations are shown.
• Units are seconds unless otherwise stated.

RDN-Bayes uses more relevant predicates and more first-order variables
• Our best predicate for each database:
Database      Target Predicate   # extra predicates   # extra first-order variables   CLL-diff
Mondial       religion           11                   1                               0.58
IMDB          gender             6                    2                               0.30
UW-CSE        student            4                    1                               0.50
Hepatitis     sex                4                    2                               0.20
Mutagenesis   ind1               5                    1                               0.56
MovieLens     gender             1                    1                               0.26

Structure Comparison Example: IMDB
Model       Target      Markov Blanket
RDN-Boost   gender(U)   Occupation(U), Age(U)
RDN-Bayes   gender(U)   Occupation(U), Age(U), Rating(U,M), RunningTime(M), CastMember(M,X), AGender(X)
[Figure: database schema diagram with attributes UserID, Occupation, Age, gender, MovieID, Rating, Time, ActorID and AGender, marking the parts used by RDN-Boost and by RDN-Bayes.]

Conclusions
• Basic idea: convert Bayesian networks to relational dependency networks.
  • Fast BN learning ⇒ fast DN learning.
  • Dependency networks ⇒ inference with cyclic dependencies/autocorrelations.
• New log-linear model for converting BN parameters to DN parameters.
  • I.e., define the probability of a node given its Markov blanket and the Bayes net model.
• Empirical evaluation:
  • Scales very well with the number of records.
  • Accuracy competitive with functional gradient boosting.

There's More
• Empirical comparisons:
  • counts instead of frequencies
  • weight learning
  • more on MLN-Boost
• Theorems about dependency network consistency

The End
• Any questions?
