Graph-based Learning
Larry Holder
Computer Science and Engineering
University of Texas at Arlington
Person
ID   Last   First   Age  Income
P1   Doe    John    30   80000
P2   Doe    Sally   29   90000
P3   Smith  Robert  35   100000

Married
Person1  Person2
P1       P2
P3       P7

RichCouple(X,Y) :- Person(X,LastX,FirstX,AgeX,IncX) & Person(Y,LastY,FirstY,AgeY,IncY) & Married(X,Y) & (IncX + IncY) > 150000.
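As a rough illustration of how the RichCouple rule reads against the two tables, the sketch below evaluates the rule body in Python. The dictionary and list layout and the function name `rich_couples` are assumptions made for this example; they are not part of any ILP system.

```python
# Person and Married relations copied from the tables; the layout is an assumption.
person = {
    "P1": ("Doe", "John", 30, 80000),
    "P2": ("Doe", "Sally", 29, 90000),
    "P3": ("Smith", "Robert", 35, 100000),
}
married = [("P1", "P2"), ("P3", "P7")]


def rich_couples(person, married, threshold=150000):
    """Return the (X, Y) pairs that satisfy the body of the RichCouple rule."""
    result = []
    for x, y in married:
        # Person(X, ...) and Person(Y, ...) must both hold; P7 has no Person row.
        if x in person and y in person:
            inc_x, inc_y = person[x][3], person[y][3]
            if inc_x + inc_y > threshold:
                result.append((x, y))
    return result


print(rich_couples(person, married))
```

Only the pair (P1, P2) qualifies here: their incomes sum to 170000, while P7 does not appear in the Person table.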
Inductive Logic Programming (ILP)
Inokuchi, Washio and Motoda, 2003
Kuramochi and Karypis, 2001
Yan and Han, 2002
Discovery
Clustering
Supervised learning
[Figure: the relational data as a graph. Each row becomes a Person vertex with Last, First, Age and Income attributes (Doe, John, 30, 80000; Doe, Sally, 29, 90000; Smith, Robert, 35, 100000), and Married edges connect Person vertices.]
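One way to picture the table-to-graph conversion is the sketch below. It assumes each Person row becomes a vertex labeled "Person" with one labeled edge per attribute, and each Married row becomes an edge between two Person vertices. The function name `tables_to_graph` and the vertex numbering are hypothetical choices for this example.

```python
def tables_to_graph(person_rows, married_rows):
    """Turn Person and Married rows into labeled vertices and labeled edges."""
    vertices = {}        # vertex id -> label
    edges = []           # (source id, target id, edge label)
    person_vertex = {}   # person ID such as "P1" -> vertex id
    next_id = 1
    for pid, last, first, age, income in person_rows:
        person_vertex[pid] = next_id
        vertices[next_id] = "Person"
        next_id += 1
        attributes = (("Last", last), ("First", first), ("Age", age), ("Income", income))
        for edge_label, value in attributes:
            # Each attribute value gets its own vertex, linked by a labeled edge.
            vertices[next_id] = str(value)
            edges.append((person_vertex[pid], next_id, edge_label))
            next_id += 1
    for p1, p2 in married_rows:
        # Skip Married rows that mention a person with no Person row (P7 here).
        if p1 in person_vertex and p2 in person_vertex:
            edges.append((person_vertex[p1], person_vertex[p2], "Married"))
    return vertices, edges


person_rows = [
    ("P1", "Doe", "John", 30, 80000),
    ("P2", "Doe", "Sally", 29, 90000),
    ("P3", "Smith", "Robert", 35, 100000),
]
married_rows = [("P1", "P2"), ("P3", "P7")]
vertices, edges = tables_to_graph(person_rows, married_rows)
print(len(vertices), len(edges))
```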
Input is a labeled (vertices and edges) directed graph
A substructure is a connected subgraph
An instance of a substructure is an isomorphic subgraph
Input graph is compressed by replacing instances with a vertex representing the substructure
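The compression step can be sketched as follows. This is a minimal illustration, not the actual implementation: it assumes the instances are already known and do not overlap, and the function name `compress` and the small example graph are made up for this sketch.

```python
def compress(vertices, edges, instances, label="S1"):
    """Replace each instance (a set of vertex ids) with one new vertex."""
    new_vertices = dict(vertices)
    owner = {}  # original vertex id -> id of the vertex that replaces it
    next_id = max(vertices) + 1
    for instance in instances:
        for v in instance:
            owner[v] = next_id
            del new_vertices[v]
        new_vertices[next_id] = label
        next_id += 1
    new_edges = []
    for src, dst, edge_label in edges:
        s, d = owner.get(src, src), owner.get(dst, dst)
        # Edges inside a single instance are absorbed by the new vertex.
        if src in owner and dst in owner and s == d:
            continue
        # Edges crossing the instance boundary are re-attached to the new vertex.
        new_edges.append((s, d, edge_label))
    return new_vertices, new_edges


# Hypothetical graph: a triangle object on a square object, which is on a circle object.
vertices = {1: "object", 2: "triangle", 3: "object", 4: "square", 5: "object", 6: "circle"}
edges = [(1, 2, "shape"), (3, 4, "shape"), (1, 3, "on"), (5, 6, "shape"), (3, 5, "on")]
compressed_vertices, compressed_edges = compress(vertices, edges, [{1, 2, 3, 4}])
print(compressed_vertices, compressed_edges)
```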
[Figure: Input Database (objects R1, C1, T1-T4, S1-S4); Substructure S1 (graph form), with triangle and square vertices attached by shape edges; Compressed Database, with instances replaced by S1 vertices alongside R1 and C1]
Substructures:
triangle (4), square (4), circle (1), rectangle (1)
[Figure: input graph with one circle, one rectangle, four triangles and four squares]
Substructures:
[Figure: larger candidate substructures combining triangle, square, circle and rectangle, shown against the same input graph]
sample.g:
v 1 object
v 2 object
v 3 object
v 4 object
v 5 object
v 6 object
v 7 object
v 8 object
v 9 object
v 10 object
v 11 triangle
v 12 triangle
v 13 triangle
v 14 triangle
v 15 square
v 16 square
v 17 square
v 18 square
v 19 circle
v 20 rectangle
e 1 11 shape
e 2 12 shape
e 3 13 shape
e 4 14 shape
e 5 15 shape
e 6 16 shape
e 7 17 shape
e 8 18 shape
e 9 19 shape
e 10 20 shape
e 1 5 on
e 2 6 on
e 3 7 on
e 4 8 on
e 5 10 on
e 9 10 on
e 10 2 on
e 10 3 on
e 10 4 on
[Figure: the input graph encoded by sample.g, with objects R1, C1, T1-T4 and S1-S4]
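To make the sample.g encoding concrete, the sketch below rebuilds the same `v` and `e` statements, parses them, counts the shape labels, and lists the instances of a "triangle on square" pattern. The parser is a simplified assumption that handles only these two statement types, and names such as `parse_graph` and `shape_of` are hypothetical.

```python
from collections import Counter

# Rebuild the text of sample.g compactly (same statements as the listing).
SHAPES = ["triangle"] * 4 + ["square"] * 4 + ["circle", "rectangle"]
ON = [(1, 5), (2, 6), (3, 7), (4, 8), (5, 10), (9, 10), (10, 2), (10, 3), (10, 4)]
lines = [f"v {i} object" for i in range(1, 11)]
lines += [f"v {i + 10} {shape}" for i, shape in enumerate(SHAPES, start=1)]
lines += [f"e {i} {i + 10} shape" for i in range(1, 11)]
lines += [f"e {a} {b} on" for a, b in ON]
sample_g = "\n".join(lines)


def parse_graph(text):
    """Parse 'v <id> <label>' and 'e <source> <target> <label>' statements."""
    vertices, edges = {}, []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "v":
            vertices[int(parts[1])] = parts[2]
        elif parts[0] == "e":
            edges.append((int(parts[1]), int(parts[2]), parts[3]))
    return vertices, edges


vertices, edges = parse_graph(sample_g)

# Shape of each object, read off the 'shape' edges.
shape_of = {src: vertices[dst] for src, dst, label in edges if label == "shape"}
shape_counts = Counter(shape_of.values())

# Instances of the pattern: a triangle object 'on' a square object.
instances = [
    (a, b)
    for a, b, label in edges
    if label == "on" and shape_of[a] == "triangle" and shape_of[b] == "square"
]
print(dict(shape_counts))
print(instances)
```

The counts agree with the single-vertex substructures listed earlier (four triangles, four squares, one circle, one rectangle), and the pattern has four instances in this graph.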
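[Figure: two labeled graphs to be matched, one with vertices 1 and 2 and the other with vertices 3, 4 and 5; vertex labels A and B, edge labels a and b]
Search tree of partial mappings, with the cost after each mapping (λ = vertex left unmapped):
∅
  (1,3) 1:  (2,4) 7   (2,5) 6   (2,λ) 10
  (1,4) 0:  (2,3) 3   (2,5) 6   (2,λ) 9
  (1,5) 1:  (2,3) 7   (2,4) 7   (2,λ) 10
  (1,λ) 1:  (2,3) 9   (2,4) 10  (2,5) 9   (2,λ) 11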
Least-cost match is {(1,4), (2,3)}
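The search over partial mappings can be illustrated with a best-first expansion over the costs from the tree. This sketch does not compute the match costs itself: it assumes the numbers from the slide are cumulative costs that never decrease along a path, and it writes λ as `None`. The function name `best_first_match` is hypothetical.

```python
import heapq
from itertools import count

# Costs copied from the search tree; each first-level mapping has a cost
# and a dict of second-level mappings with their total costs.
tree = {
    (1, 3): (1, {(2, 4): 7, (2, 5): 6, (2, None): 10}),
    (1, 4): (0, {(2, 3): 3, (2, 5): 6, (2, None): 9}),
    (1, 5): (1, {(2, 3): 7, (2, 4): 7, (2, None): 10}),
    (1, None): (1, {(2, 3): 9, (2, 4): 10, (2, 5): 9, (2, None): 11}),
}


def best_first_match(tree):
    """Expand the cheapest partial mapping first; stop at the first complete one."""
    tie = count()  # tie-breaker so the heap never has to compare mappings
    heap = [(cost, next(tie), [pair], children) for pair, (cost, children) in tree.items()]
    heapq.heapify(heap)
    expanded = 0
    while heap:
        cost, _, mapping, children = heapq.heappop(heap)
        if children is None:
            # Complete mapping: with non-decreasing costs it is the cheapest overall.
            return cost, mapping, expanded
        expanded += 1
        for pair, child_cost in children.items():
            heapq.heappush(heap, (child_cost, next(tie), mapping + [pair], None))
    return None


print(best_first_match(tree))
```

With these costs, all four first-level mappings are expanded before the complete mapping {(1,4), (2,3)} with cost 3 is reached.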
k partial mappings considered
DNA
[Figure: portion of a DNA molecule drawn as a graph of atoms (O, P, OH, C, N, CH2) and bonds]
1 | P
2 |
n
[Figure: graph grammar productions for non-terminals S2 (over a, b and S3) and S3 (over c, d, e, f)]
Replace instances of right-hand side with new vertex labeled with non-terminal on left-hand side
[Figure: input graph containing the groups a c b, a d b, a e b and a f b, four instances of the x, q, z, y substructure, and vertices r and k]
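[Figure: first iteration. Production S1 with right-hand side x, q, z, y; in the graph, the groups a c b, a d b, a e b and a f b and the vertices r and k remain, with instances replaced by S1 vertices]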
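[Figure: later iteration. Productions S2 (over a, b, S3 and S2) and S3 (over c, d, e, f); the compressed graph contains r, k, S2 and S1 vertices]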
[Figure: Input examples and the learned Hypothesis, a substructure with triangle and square vertices attached by shape edges]
$$\text{Error} = \frac{\#\text{PosEgsNotCovered} + \#\text{NegEgsCovered}}{\#\text{PosEgs} + \#\text{NegEgs}}$$
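Read as a fraction of misclassified examples (positive examples the hypothesis fails to cover plus negative examples it covers, divided by all examples), the error measure can be computed as in the sketch below. The function name `hypothesis_error` and the example counts are hypothetical.

```python
def hypothesis_error(pos_not_covered, neg_covered, num_pos, num_neg):
    """Fraction of examples the hypothesis gets wrong."""
    total = num_pos + num_neg
    if total == 0:
        raise ValueError("at least one example is required")
    return (pos_not_covered + neg_covered) / total


# Hypothetical counts: 6 positive and 4 negative examples,
# 1 positive example missed and 1 negative example covered.
print(hypothesis_error(pos_not_covered=1, neg_covered=1, num_pos=6, num_neg=4))
```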
[Figure: graph representation of a compound in the toxicology data. A compound vertex contains atom vertices (element c, type 22, charge), belongs to groups (six_ring, halide10), and is linked to test results (ashby_alert, ames, cytogen_ca; p = positive). Further panels show substructures over compound vertices with drosophila_slrl, chromaberr, and has_group amine.]
Protein data
DNA data
Toxicology (cancer) data
Earthquake data
Aircraft Safety and Reporting System
[Figure: web graph of web_page vertices connected by hyperlink edges, starting from a home page]