Using Binary Decision Diagrams to Enumerate Inductive Logic Programming Solutions

Hikaru Shindo*, Masaaki Nishino**, Akihiro Yamamoto*
* Graduate School of Informatics, Kyoto University
** NTT Communication Science Laboratories
September 4, 2018


  1. Using Binary Decision Diagrams to Enumerate Inductive Logic Programming Solutions
     Hikaru Shindo*, Masaaki Nishino**, Akihiro Yamamoto*
     * Graduate School of Informatics, Kyoto University
     ** NTT Communication Science Laboratories
     September 4, 2018

  2. Abstract
     • The basic formalization of ILP allows many potential solutions, and we might miss important solutions. ⇒ Enumeration is a fundamental technique for avoiding such misses.
     • A Binary Decision Diagram (BDD) is a directed acyclic graph that compactly represents a Boolean function.
     • We propose an efficient algorithm for enumerating solutions of an Inductive Logic Programming problem with Binary Decision Diagrams.
     • Key idea: We use a Binary Decision Diagram for enumeration.
     • We show how to recursively build a Binary Decision Diagram that represents the set of solutions.

  3. Table of contents
     1. Introduction
     2. Binary Decision Diagram and Enumeration of Solutions
     3. Applications
     4. Experiments
     5. Conclusion and Future work

  4. Introduction

  5. Motivation
     • An ILP system generates solutions for given positive examples and negative examples. From the viewpoint of logic, many candidate solutions might be generated.
     • Every ILP system chooses some appropriate solutions based on some criteria or its search method.
     Example: $E^+ = \{p(a)\}$, $E^- = \{p(b)\}$, $B = \{\}$ ⇒ $\Sigma = \{p(a)\}$, $\Sigma = \{p(x) \leftarrow q(x),\ q(a)\}$, ...
     We call a solution of an ILP problem a hypothesis.

  6. Fundamental idea: Enumeration of hypotheses
     Enumeration of hypotheses means keeping all hypotheses.
     Merits of the enumeration:
     1. Preventing hypothesis omission: The importance of a hypothesis depends on the case, so algorithms that give only one hypothesis may not return the best hypothesis.
     2. Hypothesis selection: Users can select a hypothesis or compare some hypotheses using an evaluation function.
     3. Online learning: We can efficiently perform online learning, i.e., updating the current set of hypotheses when new examples are added.

  7. Approach
     • We assume that a finite set of clauses that can be an element of hypotheses is given explicitly.
     • Even in that finite space, naively enumerating all hypotheses is impractical because the number of candidate hypotheses is enormous.
     • To treat such large-scale sets of hypotheses, we use Binary Decision Diagrams (BDDs), which give a compressed representation of hypotheses for enumeration.
     • In this work, we developed an efficient recursive algorithm for constructing a BDD.

  8. Contribution
     • An efficient algorithm for enumerating hypotheses using BDDs.
     • The class of ILP problems to which we can apply our algorithm.
     • An efficient algorithm to get the best hypothesis with an evaluation function.
     • We empirically show that our method can be applied to real data.

  9. Binary Decision Diagram and Enumeration of Solutions

  10. Binary Decision Diagrams
      A Binary Decision Diagram (BDD) is a directed acyclic graph that represents a Boolean function.
      [Figure: BDD that represents $F(x_0, x_1, x_2) = (x_0 \wedge x_1) \vee x_2$]
      Binary operations between BDDs can be executed efficiently. For example, given two BDDs representing logical functions $F$ and $G$, the BDD representing $H = F \wedge G$ can be computed in time polynomial in the sizes of the BDDs for $F$ and $G$ (at most proportional to their product).
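A minimal sketch of how a binary operation between BDDs can be computed, assuming a hand-rolled manager. The class `TinyBDD` and all of its method names are hypothetical and are not taken from the slides or from any BDD library. It builds the BDD for F(x0, x1, x2) = (x0 ∧ x1) ∨ x2 under the variable order x0 < x1 < x2 and memoises `apply` on pairs of nodes, which is what bounds the work by the product of the operand sizes.

```python
import operator


class TinyBDD:
    """Hypothetical, minimal reduced ordered BDD manager (illustration only)."""

    def __init__(self):
        self.nodes = {0: None, 1: None}   # ids 0 and 1 are the terminals
        self.unique = {}                  # (var, low, high) -> node id
        self.cache = {}                   # (op, f, g) -> result id

    def mk(self, var, low, high):
        if low == high:                   # both branches agree: no test needed
            return low
        key = (var, low, high)
        if key not in self.unique:        # share structurally identical nodes
            self.unique[key] = len(self.nodes)
            self.nodes[self.unique[key]] = key
        return self.unique[key]

    def var(self, index):
        return self.mk(index, 0, 1)

    def top(self, f):
        # Terminals sort after every variable.
        return self.nodes[f][0] if f > 1 else float("inf")

    def cofactors(self, f, var):
        if self.top(f) != var:            # f does not test var at its root
            return f, f
        _, low, high = self.nodes[f]
        return low, high

    def apply(self, op, f, g):
        if f <= 1 and g <= 1:             # two terminals: evaluate directly
            return int(op(bool(f), bool(g)))
        key = (op, f, g)
        if key not in self.cache:         # memoise on the pair of nodes
            var = min(self.top(f), self.top(g))
            f0, f1 = self.cofactors(f, var)
            g0, g1 = self.cofactors(g, var)
            self.cache[key] = self.mk(var, self.apply(op, f0, g0),
                                      self.apply(op, f1, g1))
        return self.cache[key]

    def evaluate(self, f, assignment):
        while f > 1:                      # follow one root-to-terminal path
            var, low, high = self.nodes[f]
            f = high if assignment[var] else low
        return bool(f)

    def size(self, f):
        seen, stack = set(), [f]
        while stack:                      # count reachable non-terminal nodes
            n = stack.pop()
            if n > 1 and n not in seen:
                seen.add(n)
                stack.extend(self.nodes[n][1:])
        return len(seen)


bdd = TinyBDD()
x0, x1, x2 = bdd.var(0), bdd.var(1), bdd.var(2)
F = bdd.apply(operator.or_, bdd.apply(operator.and_, x0, x1), x2)
print(bdd.size(F))  # 3 decision nodes, one per variable
```

The sketch omits terminal short-circuits and complement edges that production BDD packages use; it is only meant to show the memoised recursion.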

  11. Inductive Logic Programming
      In Inductive Logic Programming (ILP), all data, background knowledge, and hypotheses are represented by first-order logic.
      ILP Problem
      Input: Finite sets $E^+$, $E^-$, and $B$ of ground atoms
      Output: A set of definite clauses $\Sigma$ such that
        1. for all $A \in E^+$, $\Sigma \cup B \models A$
        2. for all $A \in E^-$, $\Sigma \cup B \not\models A$
      Example: $E^+ = \{p(a)\}$, $E^- = \{p(b)\}$, $B = \{\}$; $\Sigma = \{p(a)\}$, $\{p(x) \leftarrow q(x),\ q(a)\}$, ...
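The two output conditions can be checked mechanically for the toy example. The following is a sketch under simplifying assumptions that are not part of the slides: atoms are unary and written as `(predicate, argument)` pairs, the only variable is the string `"x"`, and the constants are just `a` and `b`, so entailment of a ground atom reduces to membership in the least model computed by naive forward chaining. All names (`ground`, `consequences`, `is_solution`) are hypothetical.

```python
CONSTANTS = ["a", "b"]   # assumed universe of constants for the toy example

# A clause is (head, body); a fact has an empty body.
P_A = (("p", "a"), ())
P_B = (("p", "b"), ())
Q_A = (("q", "a"), ())
Q_B = (("q", "b"), ())
RULE = (("p", "x"), (("q", "x"),))   # p(x) <- q(x)


def ground(clause):
    """Yield the ground instances of a clause whose only variable is 'x'."""
    head, body = clause
    for c in CONSTANTS:
        def sub(atom):
            return (atom[0], c if atom[1] == "x" else atom[1])
        yield sub(head), tuple(sub(b) for b in body)


def consequences(sigma, background=()):
    """Least model of sigma together with background, by forward chaining."""
    facts = set(background)
    rules = [g for clause in sigma for g in ground(clause)]
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in facts and all(b in facts for b in body):
                facts.add(head)
                changed = True
    return facts


def is_solution(sigma, e_pos, e_neg, background=()):
    """True if every positive and no negative example is entailed."""
    model = consequences(sigma, background)
    return all(a in model for a in e_pos) and not any(a in model for a in e_neg)


E_POS, E_NEG = [("p", "a")], [("p", "b")]
print(is_solution([P_A], E_POS, E_NEG))          # True
print(is_solution([RULE, Q_A], E_POS, E_NEG))    # True
print(is_solution([RULE, Q_A, Q_B], E_POS, E_NEG))  # False: p(b) is derived
```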

  12. Using BDDs for enumerating ILP solutions
      • To enumerate ILP hypotheses with BDDs, we introduce Boolean variables, because a BDD is a representation of a Boolean function.
      • Hypothesis space $H$ is a finite set of clauses that can be an element of the hypothesis. We assume that $H$ is given explicitly. For each clause $C \in H$, we introduce a propositional variable $v_{C \in \Sigma}$ that becomes true if and only if clause $C \in \Sigma$. For readability, we write $[C \in \Sigma]$ instead of $v_{C \in \Sigma}$:
        $$C \in \Sigma \iff [C \in \Sigma] = T. \tag{1}$$
      • Boolean variables make the hypothesis enumeration problem equivalent to the problem of identifying a Boolean function.

  13. Building a BDD that represents hypotheses
      We define $F_A$ as a BDD that represents the Boolean function that becomes true if and only if $\Sigma \cup B \models A$.
      Then, a BDD that represents the set of hypotheses is
      $$\bigwedge_{A \in E^+} F_A \;\wedge \bigwedge_{A \in E^-} \neg F_A.$$
      Example
      Given: $E^+ = \{p(a)\}$, $E^- = \{p(b)\}$, $B = \{\}$
      The BDD to be built: $F_{p(a)} \wedge \neg F_{p(b)}$
      [Figure: the BDDs for $F_{p(a)}$ and $\neg F_{p(b)}$ and their conjunction]

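  14. Solving ILP problem on the BDD
      $I_C$: the BDD that represents the Boolean variable $[C \in \Sigma]$.
      $BK_A$: the BDD that represents a constant that becomes true if and only if $A \in B$.
      Then $F_A$ for $A \in E^+ \cup E^-$ is recursively defined as
      $$F_A = BK_A \vee \bigvee_{\substack{C \in H \\ \exists\theta\; C\theta = A \leftarrow B_1 \wedge \dots \wedge B_n}} \Bigl( I_C \wedge \bigwedge_{i=1}^{n} F_{B_i} \Bigr). \tag{2}$$
      The right side of equation (2) represents the fact that $\Sigma \cup B \models A$ if
        1. $A \in B$, or
        2. $A$ is deduced by a substitution, i.e., some clause $C \in \Sigma$ has an instance $C\theta$ with head $A$ whose body atoms $B_1, \dots, B_n$ are all entailed.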

  15. Solving ILP problem on the BDD
      Example
      Introduced variables: ⓪ $[p(a) \in \Sigma]$, ① $[p(b) \in \Sigma]$, ② $[q(a) \in \Sigma]$, ③ $[q(b) \in \Sigma]$, ④ $[p(x) \leftarrow q(x) \in \Sigma]$
      $$F_{p(a)} = I_{p(a)} \vee \bigl( I_{p(x) \leftarrow q(x)} \wedge F_{q(a)} \bigr)$$
      $$F_{p(b)} = I_{p(b)} \vee \bigl( I_{p(x) \leftarrow q(x)} \wedge F_{q(b)} \bigr)$$
      [Figure: each formula drawn as BDDs, ⓪ ∨ (④ ∧ ②) for $F_{p(a)}$ and ① ∨ (④ ∧ ③) for $F_{p(b)}$]

  16. Solving ILP problem on the BDD
      Problem: $E^+ = \{p(a)\}$, $E^- = \{p(b)\}$, $B = \{\}$,
      $$H = \{\, p(a),\ p(b),\ q(a),\ q(b),\ p(x) \leftarrow q(x) \,\}.$$
      Introduced variables: ⓪ $[p(a) \in \Sigma]$, ① $[p(b) \in \Sigma]$, ② $[q(a) \in \Sigma]$, ③ $[q(b) \in \Sigma]$, ④ $[p(x) \leftarrow q(x) \in \Sigma]$
      [Figure: BDD for $F_{p(a)} \wedge \neg F_{p(b)}$]
      Enumerated hypotheses: $\Sigma = \{p(a)\}$, $\Sigma = \{q(a),\ p(x) \leftarrow q(x)\}$, ...
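To make the recursive definition of F_A concrete on this toy problem, here is a sketch that evaluates F_A as a plain truth value for one choice of Σ and then enumerates all 2^5 subsets of H by brute force. This is a stand-in for the BDD construction, not the authors' algorithm: it shows which set the BDD is meant to represent. The encoding (atoms as `(predicate, argument)` pairs, `"x"` as the only variable) and the names `match`, `F`, and `enumerate_hypotheses` are assumptions made for illustration.

```python
from itertools import product

# Toy hypothesis space; list index i is the Boolean variable [C_i in Sigma].
H = [
    (("p", "a"), ()),                 # 0: p(a)
    (("p", "b"), ()),                 # 1: p(b)
    (("q", "a"), ()),                 # 2: q(a)
    (("q", "b"), ()),                 # 3: q(b)
    (("p", "x"), (("q", "x"),)),      # 4: p(x) <- q(x)
]


def match(head, body, atom):
    """Return the instantiated body if the head matches the ground atom."""
    pred, arg = head
    if pred != atom[0]:
        return None
    if arg == "x":                    # substitution theta = {x -> atom[1]}
        return tuple((p, atom[1] if a == "x" else a) for p, a in body)
    return body if arg == atom[1] else None


def F(atom, sigma, background=frozenset()):
    """Truth value of F_A when exactly the clauses indexed by sigma are chosen."""
    if atom in background:            # BK_A
        return True
    for i, (head, body) in enumerate(H):
        inst = match(head, body, atom)
        # I_C and F_B1 and ... and F_Bn for a matching clause C
        if i in sigma and inst is not None and all(
                F(b, sigma, background) for b in inst):
            return True
    return False


def enumerate_hypotheses(e_pos, e_neg):
    """All subsets of H that entail every positive and no negative example."""
    found = []
    for bits in product([0, 1], repeat=len(H)):
        sigma = frozenset(i for i, b in enumerate(bits) if b)
        if all(F(a, sigma) for a in e_pos) and not any(F(a, sigma) for a in e_neg):
            found.append(sigma)
    return found


solutions = enumerate_hypotheses([("p", "a")], [("p", "b")])
print(len(solutions))                 # 7
print(frozenset({0}) in solutions)    # True: Sigma = {p(a)}
print(frozenset({2, 4}) in solutions) # True: Sigma = {q(a), p(x) <- q(x)}
```

The recursion in `F` terminates here because the toy space contains no recursive clauses; the brute-force loop is exponential in |H|, which is the cost the BDD representation is meant to avoid.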

  17. Applications

  18. Search for the best hypothesis
      Example
      Introduced variables: ⓪ $[p(a) \in \Sigma]$, ① $[p(b) \in \Sigma]$, ② $[q(a) \in \Sigma]$, ③ $[q(b) \in \Sigma]$, ④ $[p(x) \leftarrow q(x) \in \Sigma]$
      The hypothesis with the minimum number of atoms: $\Sigma_{\mathrm{best}} = \{p(a)\}$
      This corresponds to the minimum-weight path colored red.
      [Figure: BDD for $F_{p(a)} \wedge \neg F_{p(b)}$ with weighted edges and the minimum-weight path highlighted]
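A sketch of the minimum-weight path search on a BDD, under stated assumptions. The node table below was derived by hand for F_p(a) ∧ ¬F_p(b) with the variable order 0 < 1 < 2 < 3 < 4 and is not copied from the slide's figure. The weights (1 per fact, 2 for the rule, i.e. the number of atoms in each clause) are likewise an assumption about the evaluation function. The names `NODES`, `WEIGHT`, and `best` are hypothetical.

```python
import math
from functools import lru_cache

# Variables: 0: p(a), 1: p(b), 2: q(a), 3: q(b), 4: p(x) <- q(x).
# node name -> (variable, low child, high child); "T"/"F" are the terminals.
NODES = {
    "n0": (0, "n1a", "n1b"),
    "n1a": (1, "n2", "F"),    # p(a) absent: q(a) and the rule are required
    "n1b": (1, "n3b", "F"),   # p(a) present: only p(b) must stay underivable
    "n2": (2, "F", "n3a"),
    "n3a": (3, "n4a", "F"),
    "n3b": (3, "T", "n4b"),
    "n4a": (4, "F", "T"),
    "n4b": (4, "T", "F"),
}
WEIGHT = [1, 1, 1, 1, 2]      # assumed cost: number of atoms in each clause


@lru_cache(maxsize=None)
def best(node):
    """Return (cost, chosen variables) of the cheapest path from node to T."""
    if node == "T":
        return 0, ()
    if node == "F":
        return math.inf, ()
    var, low, high = NODES[node]
    low_cost, low_set = best(low)          # variable set to false: no cost
    high_cost, high_set = best(high)
    high_cost += WEIGHT[var]               # variable set to true: pay weight
    if low_cost <= high_cost:
        return low_cost, low_set
    return high_cost, (var,) + high_set


print(best("n0"))  # (1, (0,)): the hypothesis {p(a)} with weight 1
```

Because each node is visited once, the search is linear in the number of BDD nodes. Variables skipped along a path are don't-cares and are left out of the hypothesis, which is the cheapest choice when all weights are non-negative.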

  19. Experiments

  20. Classification of natural numbers
      When $n$ is even,
      $$E^+ = \{e(0),\ e(s^2(0)),\ \dots,\ e(s^n(0))\}, \quad E^- = \{e(s(0)),\ e(s^3(0)),\ \dots,\ e(s^{n+1}(0))\}.$$
      When $n$ is odd,
      $$E^+ = \{e(0),\ e(s^2(0)),\ \dots,\ e(s^{n+1}(0))\}, \quad E^- = \{e(s(0)),\ e(s^3(0)),\ \dots,\ e(s^n(0))\}.$$
      Example
      In the case of $n = 1$, $E^+$, $E^-$, $B$, and $H$ are, respectively, $E^+ = \{e(0),\ e(s^2(0))\}$, $E^- = \{e(s(0))\}$, $B = \emptyset$, and
      $$H = \left\{ \begin{array}{ll} e(0), & e(x), \\ e(s(0)), & e(s(x)), \\ e(s^2(0)), & e(s^2(x)), \\ e(s^2(x)) \leftarrow e(x), & e(s(x)) \leftarrow e(x), \\ e(s^2(x)) \leftarrow e(s(x)), & e(s^2(x)) \leftarrow e(s(x)) \wedge e(x) \end{array} \right\}.$$
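Both parity cases amount to the same rule: take the numerals 0 through n + 1 and put the even ones in E+ and the odd ones in E−. A small sketch that generates the example sets as strings follows; the function names `numeral` and `even_examples` are hypothetical and the string encoding of terms is only for display.

```python
def numeral(k):
    """Return the term s(s(...s(0)...)) with k applications of s."""
    term = "0"
    for _ in range(k):
        term = f"s({term})"
    return term


def even_examples(n):
    """Positive and negative examples of the natural number problem for n."""
    e_pos = [f"e({numeral(k)})" for k in range(0, n + 2, 2)]   # even numerals
    e_neg = [f"e({numeral(k)})" for k in range(1, n + 2, 2)]   # odd numerals
    return e_pos, e_neg


print(even_examples(1))
# (['e(0)', 'e(s(s(0)))'], ['e(s(0))'])
```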

  21. Results
      Table 1: The results of the natural number problem

      n   variables   nodes   hypotheses          BDD construction time   best hypothesis search time
      1   10          8       28                  7.56 msec               1.02 msec
      2   19          14      192                 9.63 msec               0.68 msec
      3   36          27      1.25 × 10^7         1.90 × 10 msec          0.62 msec
      4   69          42      1.31 × 10^13        3.08 × 10 msec          1.16 msec
      5   134         69      4.82 × 10^32        7.00 × 10 msec          1.48 msec
      6   263         101     9.77 × 10^63        3.50 × 10^2 msec        2.21 msec
      7   520         156     2.26 × 10^141       1.68 × 10^3 msec        1.68 msec
      8   1033        219     1.80 × 10^308 +     1.20 × 10^4 msec        2.66 msec

  22. Classification of real data
      (1) Soybean(small) [1] and (2) Shuttle Landing Control [2] from the UCI Machine Learning Repository [3]. Target concept: D1, no_auto respectively.
      Table 2: The results of real data problem. Columns: Problem, variables, nodes, hypotheses, BDD construction time. Rows: Soybean, Shuttle. Cell values in source order (row and column placement is not recoverable from the extracted text): 2243, 2345, 117, 13495 msec, 788498, 3.30 msec, 1.80 × 10^308 +, 6.76 × 10^10.
      One of the best hypotheses found in the Soybean(small) problem is
      $$\Sigma_{\mathrm{best}} = \{\, class(x, D1) \leftarrow stem\_canker(x, above\_soil) \,\}.$$
      [1] https://archive.ics.uci.edu/ml/datasets/soybean+(small)
      [2] https://archive.ics.uci.edu/ml/datasets/Shuttle+Landing+Control
      [3] http://archive.ics.uci.edu/ml/index.php

  23. Conclusion and Future work

  24. Conclusion and Future work
      Conclusion
      • We proposed a BDD-based method to enumerate hypotheses of an ILP problem.
      • We showed that users can get the best hypothesis with respect to an evaluation function from the constructed BDD.
      Future Work
      • Enumerating hypotheses that have some errors
      • Combination with other ILP approaches
      • Enumeration with other data structures

  25. Requirements
      Hypothesis space is a finite set of clauses that can be an element of the hypothesis. We assume that the hypothesis space is given explicitly, and it satisfies the following two requirements.
      Requirement 1: The hypothesis space does not contain any mutually recursive clauses.
      Requirement 2: The hypothesis space is variable-bounded.
