Graph-based Learning
Larry Holder
School of Electrical Engineering and Computer Science
Washington State University
Person
ID   Last   First   Age  Income
P1   Doe    John    30   80000
P2   Doe    Sally   29   90000
P3   Smith  Robert  35   100000

Married
Person1  Person2
P1       P2
P3       P7

RichCouple(X,Y) :- Person(X,LastX,FirstX,AgeX,IncX) & Person(Y,LastY,FirstY,AgeY,IncY) & Married(X,Y) & (IncX + IncY) > 150000.
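
To make the RichCouple rule concrete, here is a minimal Python sketch that checks it against the Person and Married rows above. The function name rich_couples and the tuple layout are assumptions made for illustration. A married pair is skipped when either person has no Person row, because the rule requires both Person facts to hold.

```python
# Rows copied from the Person and Married tables on the slide.
PERSON = [
    ("P1", "Doe", "John", 30, 80000),
    ("P2", "Doe", "Sally", 29, 90000),
    ("P3", "Smith", "Robert", 35, 100000),
]
MARRIED = [("P1", "P2"), ("P3", "P7")]


def rich_couples(person_rows, married_rows, threshold=150000):
    """Return married pairs (X, Y) whose combined income exceeds threshold."""
    income = {pid: inc for pid, _last, _first, _age, inc in person_rows}
    return [
        (x, y)
        for x, y in married_rows
        # Both Person(X, ...) and Person(Y, ...) must exist for the rule to fire.
        if x in income and y in income and income[x] + income[y] > threshold
    ]


couples = rich_couples(PERSON, MARRIED)
print(couples)
```

With these rows only (P1, P2) qualifies: 80000 + 90000 exceeds 150000, while P7 has no Person row.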
Inductive Logic Programming (ILP)
Inokuchi, Washio and Motoda, 2003
Kuramochi and Karypis, 2001
Yan and Han, 2002
Discovery
Clustering
Supervised learning
[Figure: the relational data as a graph. Each Person vertex has Last, First, Age and Income edges to its values (Doe, John, 30, 80000; Doe, Sally, 29, 90000; Smith, Robert, 35, 100000); Married edges connect Person vertices.]
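
The mapping from relational rows to a labeled graph can be sketched as follows. This is an illustrative Python sketch, not the presentation's own code: the function name tables_to_graph, the list-based graph encoding, and the choice to create a bare Person vertex for an ID that appears only in Married (such as P7) are all assumptions.

```python
PERSON = [
    ("P1", "Doe", "John", 30, 80000),
    ("P2", "Doe", "Sally", 29, 90000),
    ("P3", "Smith", "Robert", 35, 100000),
]
MARRIED = [("P1", "P2"), ("P3", "P7")]


def tables_to_graph(person_rows, married_rows):
    """Build (vertices, edges): vertices[i] is a label, edges are (src, dst, label)."""
    vertices = []
    edges = []
    person_vertex = {}  # person ID -> index of its Person vertex

    def add_vertex(label):
        vertices.append(label)
        return len(vertices) - 1

    def vertex_for(pid):
        # Assumption: IDs seen only in Married still get a Person vertex.
        if pid not in person_vertex:
            person_vertex[pid] = add_vertex("Person")
        return person_vertex[pid]

    for pid, last, first, age, income in person_rows:
        p = vertex_for(pid)
        attributes = (("Last", last), ("First", first), ("Age", age), ("Income", income))
        for name, value in attributes:
            # One value vertex per attribute, linked by an edge named after the column.
            edges.append((p, add_vertex(str(value)), name))

    for a, b in married_rows:
        edges.append((vertex_for(a), vertex_for(b), "Married"))
    return vertices, edges


vertices, edges = tables_to_graph(PERSON, MARRIED)
print(len(vertices), "vertices,", len(edges), "edges")
```

Attribute values get their own vertices rather than being shared, so the two "Doe" values stay separate; sharing them would be an equally valid design with different substructure counts.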
Input is a labeled (vertices and edges) directed graph
A substructure is a connected subgraph
An instance of a substructure is an isomorphic subgraph
Input graph compressed by replacing instances with vertex representing substructure
[Figure: Input Database (objects R1, C1, T1-T4, S1-S4); Substructure S1 (graph form), with shape edges to triangle and square vertices; Compressed Database (R1, C1 and the S1 instances)]
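
The compression step, replacing each instance of a substructure with a single vertex, can be sketched in Python as below. This is a hedged illustration: the function name compress, the dict and tuple graph encoding, and the small example graph are assumptions, and the sketch presumes the instances are already known and do not overlap (finding them is the subgraph isomorphism part, which is not shown).

```python
def compress(vertices, edges, instances, label):
    """Replace each instance (a set of vertex IDs) with one new vertex.

    vertices: dict of vertex ID -> label
    edges:    list of (source ID, target ID, label)
    """
    owner = {}         # old vertex ID -> ID of the vertex that replaces it
    new_vertices = {}
    next_id = max(vertices) + 1
    for instance in instances:
        new_vertices[next_id] = label
        for v in instance:
            owner[v] = next_id
        next_id += 1
    # Vertices outside every instance are kept unchanged.
    for v, lab in vertices.items():
        if v not in owner:
            new_vertices[v] = lab
    new_edges = []
    for src, dst, lab in edges:
        s, d = owner.get(src, src), owner.get(dst, dst)
        if src in owner and dst in owner and s == d:
            continue  # edge lies inside one instance, so it is absorbed
        new_edges.append((s, d, lab))  # boundary edges are rerouted
    return new_vertices, new_edges


# Hypothetical fragment: a triangle on a square, which sits on a rectangle.
vertices = {1: "object", 5: "object", 10: "object",
            11: "triangle", 15: "square", 20: "rectangle"}
edges = [(1, 11, "shape"), (5, 15, "shape"), (10, 20, "shape"),
         (1, 5, "on"), (5, 10, "on")]
small_vertices, small_edges = compress(vertices, edges, [{1, 5, 11, 15}], "S1")
print(small_vertices)
print(small_edges)
```

Here the four vertices of the triangle-on-square instance collapse into one S1 vertex, and the "on" edge from the square to the rectangle is rerouted to start at S1.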
Substructures: triangle (4), square (4), circle (1), rectangle (1)
[Figure: input graph of triangle, square, circle and rectangle objects]
Substructures:
[Figure: substructures combining triangle, square, circle and rectangle vertices, shown with the input graph]
sample.g:

    v 1 object
    v 2 object
    v 3 object
    v 4 object
    v 5 object
    v 6 object
    v 7 object
    v 8 object
    v 9 object
    v 10 object
    v 11 triangle
    v 12 triangle
    v 13 triangle
    v 14 triangle
    v 15 square
    v 16 square
    v 17 square
    v 18 square
    v 19 circle
    v 20 rectangle
    e 1 11 shape
    e 2 12 shape
    e 3 13 shape
    e 4 14 shape
    e 5 15 shape
    e 6 16 shape
    e 7 17 shape
    e 8 18 shape
    e 9 19 shape
    e 10 20 shape
    e 1 5 on
    e 2 6 on
    e 3 7 on
    e 4 8 on
    e 5 10 on
    e 9 10 on
    e 10 2 on
    e 10 3 on
    e 10 4 on

[Figure: the example input, with objects R1, C1, T1-T4 and S1-S4]
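
The "v" and "e" lines of sample.g can be read with a few lines of Python. The sketch below is illustrative only: parse_graph is a hypothetical helper, it handles just the two line types that appear in sample.g ("v id label" and "e source target label"), and it uses a short excerpt of the file rather than the whole listing.

```python
# Excerpt of sample.g: one triangle object on one square object.
SAMPLE = """\
v 1 object
v 5 object
v 11 triangle
v 15 square
e 1 11 shape
e 5 15 shape
e 1 5 on
"""


def parse_graph(text):
    """Parse 'v' and 'e' lines into (vertices, edges)."""
    vertices = {}  # vertex ID -> label
    edges = []     # (source ID, target ID, label)
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] == "v" and len(fields) == 3:
            vertices[int(fields[1])] = fields[2]
        elif fields[0] == "e" and len(fields) == 4:
            edges.append((int(fields[1]), int(fields[2]), fields[3]))
        else:
            # Assumption: any other line shape is treated as an error.
            raise ValueError(f"unrecognized line: {line!r}")
    return vertices, edges


vertices, edges = parse_graph(SAMPLE)
print(vertices)
print(edges)
```

The resulting dict and edge list are a convenient starting point for counting labels or searching for repeated substructures.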
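[Figure: two graphs to match. The first has vertices 1 and 2 (vertex labels A, B; edge labels a, b); the second has vertices 3, 4 and 5 (vertex labels A, B; edge labels a, b).]

Search tree of partial mappings, with the cost after each mapping (λ denotes an unmatched vertex):

    ∅
      (1,3) 1:  (2,4) 7   (2,5) 6   (2,λ) 10
      (1,4) 0:  (2,3) 3   (2,5) 6   (2,λ) 9
      (1,5) 1:  (2,3) 7   (2,4) 7   (2,λ) 10
      (1,λ) 1:  (2,3) 9   (2,4) 10  (2,5) 9   (2,λ) 11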
Least-cost match is {(1,4), (2,3)}
k partial mappings considered
[Figure: DNA, drawn as a chemical structure graph over O, P, OH, C, N and CH2]
P1 | P2 | ... | Pn
[Figure: graph grammar with productions for S2 (over a, b and S3) and S3 (over c, d, e, f)]
Replace instances of right-hand side with new vertex labeled with non-terminal on left-hand side
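[Figure: input graph with chains a c b, a d b, a e b and a f b, four copies of the x, q, z, y structure, and vertices r and k]
[Figure: the same graph with the x, q, z, y structures replaced by S1 vertices, alongside the S1 production]
[Figure: the graph after further replacement with S2 and S3, alongside the S2 and S3 productions; r, k and S1 remain]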
[Figure: Input graph and Hypothesis substructure (shape edges to triangle and square vertices)]
Error = (#PosEgsNotCovered + #NegEgsCovered) / (#PosEgs + #NegEgs)
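
As a small worked example of the error measure, the Python sketch below computes it from four counts. The function name hypothesis_error and the example counts are made up for illustration; only the formula itself comes from the slide.

```python
def hypothesis_error(pos_not_covered, neg_covered, num_pos, num_neg):
    """Fraction of examples a hypothesis gets wrong.

    Errors are positive examples it fails to cover plus negative
    examples it covers, divided by the total number of examples.
    """
    total = num_pos + num_neg
    if total == 0:
        raise ValueError("at least one example is required")
    return (pos_not_covered + neg_covered) / total


# Hypothetical counts: 4 positive examples (1 missed), 4 negative (1 covered).
error = hypothesis_error(pos_not_covered=1, neg_covered=1, num_pos=4, num_neg=4)
print(error)  # 0.25
```

A hypothesis that covers every positive example and no negative example has error 0.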
[Figure: graph of a compound from the toxicology data. The compound vertex has contains edges to atom vertices (element c, type 22, charge), in_group edges (six_ring, halide10), ashby_alert edges, and test-result edges such as ames, di227 and cytogen_ca (p = positive). Smaller graphs show compound vertices with has_group edges to amine and positive results for tests such as chromaberr and drosophila_slrl.]
Protein data
DNA data
Toxicology (cancer) data
Earthquake data
Aircraft Safety and Reporting System
[Figure: web graph with web_page vertices, hyperlink edges and a home vertex …]