Distributed, Egocentric Representations of Graphs for Detecting Critical Structures. Ruo-Chun Tzeng (1), Shan-Hung Wu (2). (1) Microsoft Inc, Taiwan; (2) National Tsing Hua University, Taiwan. International Conference on Machine Learning, 2019.


SLIDE 1

Distributed, Egocentric Representations of Graphs for Detecting Critical Structures

Ruo-Chun Tzeng (1), Shan-Hung Wu (2)

(1) Microsoft Inc, Taiwan; (2) National Tsing Hua University, Taiwan

International Conference on Machine Learning, 2019

Ruo-Chun Tzeng and Shan-Hung Wu Ego-CNN ICML’19 1 / 9

SLIDE 4

Goal

To learn representations of graphs by using convolutions

...while keeping the nice properties of CNNs on images:

Filters detect location-independent patterns

Filters at a deep layer have enlarged receptive fields

...and being able to detect critical structures

SLIDE 5

What are the critical structures?

Local-scale critical structures, e.g., Alkane vs Alcohol

Global-scale critical structures, e.g., Hydrocarbon

SLIDE 6

SOTA: Graph Attention Networks (GAT) [1]

The (1-head) GAT learns an attention score $\alpha_{ij}$ for each edge $(i, j)$:

$$h_i^{(l)} = \sigma\Big(\sum_{j \in N_i} \alpha_{ij} W h_j^{(l-1)} + b\Big)$$

The α’s explicitly point out the critical structures

When jointly learned with a task, $\alpha_{ij}$ denotes the contribution of edge $(i, j)$ to the model prediction

[1] P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Liò, Y. Bengio. Graph attention networks. ICLR’18.
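The layer above can be sketched in NumPy. The slide does not say how the scores are computed, so the block below follows the scoring rule from the GAT paper (a LeakyReLU over a learned vector `a`, then a softmax over each node's neighbors). Assumptions that are not on the slide: σ is taken to be ReLU, embeddings are stored as row vectors (so `H @ W` plays the role of $W h_j$), and all names and shapes are illustrative.

```python
import numpy as np

def gat_layer(H, adj, W, a, b, slope=0.2):
    """One 1-head GAT layer (sketch).
    H: (N, D) embeddings h^(l-1); adj: (N, N) 0/1 adjacency with self-loops;
    W: (D, F) filter; a: (2F,) attention vector; b: (F,) bias.
    Returns the new embeddings (N, F) and the attention scores alpha (N, N)."""
    Z = H @ W                                    # row j holds W h_j
    F = Z.shape[1]
    # e_ij = LeakyReLU(a^T [z_i || z_j]) splits into a term for i and a term for j
    e = (Z @ a[:F])[:, None] + (Z @ a[F:])[None, :]
    e = np.where(e > 0, e, slope * e)
    e = np.where(adj > 0, e, -np.inf)            # only neighbors j in N_i compete
    e = e - e.max(axis=1, keepdims=True)         # numerical stability
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    H_next = np.maximum(alpha @ Z + b, 0.0)      # sigma = ReLU (assumption)
    return H_next, alpha

# Toy path graph 0 - 1 - 2 with self-loops (hypothetical data)
rng = np.random.default_rng(0)
adj = np.array([[1, 1, 0], [1, 1, 1], [0, 1, 1]])
H = rng.normal(size=(3, 4))
H_next, alpha = gat_layer(H, adj, rng.normal(size=(4, 2)), rng.normal(size=4), np.zeros(2))
```

Each row of `alpha` sums to one over that node's neighbors, which is what lets the scores be read as per-edge contributions.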

SLIDE 7

Drawback: limited learning ability

However, the (1-head) GAT suffers from limited learning ability:

$$h_i^{(l)} = \sigma\Big(\sum_{j \in N_i} \alpha_{ij} W h_j^{(l-1)} + b\Big)$$

A filter $W$ scans one node at a time, so it does not capture the interactions between nodes

Not a serious problem for node-level tasks (e.g., classification)

But it may severely degrade the performance on graph-level tasks
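One way to see the limitation: because $W$ is applied to each neighbor separately and the results are summed, $\sum_j \alpha_{ij} W h_j = W \sum_j \alpha_{ij} h_j$, so the layer only sees a weighted average of the neighborhood. The small check below illustrates this with made-up numbers. It holds the attention scores fixed for simplicity, which is an assumption (in a trained GAT they depend on the embeddings), and it uses a row-vector convention (`h @ W`).

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))           # hypothetical filter
alpha = np.array([0.5, 0.5])          # attention scores, held fixed for illustration

# Two different neighborhoods that share the same weighted mean embedding [2, 2, 1]
nbrs_a = np.array([[1.0, 0.0, 2.0], [3.0, 4.0, 0.0]])
nbrs_b = np.array([[2.0, 2.0, 1.0], [2.0, 2.0, 1.0]])

def aggregate(nbrs):
    # sum_j alpha_j * (W h_j): the filter sees one neighbor at a time
    return sum(a * (h @ W) for a, h in zip(alpha, nbrs))

out_a, out_b = aggregate(nbrs_a), aggregate(nbrs_b)
same_output = np.allclose(out_a, out_b)   # the two neighborhoods are indistinguishable
```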

SLIDE 8

A new way: Ego-CNN

Idea: to learn critical structures just like image-based CNNs

1-head GAT (ICLR’18):

$$h_i^{(l)} = \sigma\Big(\sum_{j \in N_i} \alpha_{ij} W h_j^{(l-1)} + b\Big)$$

Traditional Convolution:

$$h_i^{(l)} = \sigma\Big(W \circledast \big[h_j^{(l-1)}\big]_{j \in N_i} + b\Big)$$

Ego-Convolution (ours):

$$h_i^{(l)} = \sigma\Big(W \circledast \big[h_j^{(l-1)}\big]_{j \in N_i} + b\Big)$$

1. For each node i, a filter W is applied to all nodes in the neighborhood $N_i$ of i

2. Use common visualization techniques (e.g., deconv) to backtrack critical structures
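A minimal NumPy sketch of a filter applied to a whole neighborhood at once. It assumes each node has a fixed-size, ordered neighborhood of K nodes, that ⊛ acts as an elementwise product summed over the K × D block of neighbor embeddings, and that σ is ReLU. None of these details are stated on the slide, and the function and argument names are hypothetical.

```python
import numpy as np

def ego_convolution(H, nbr_idx, W, b):
    """H: (N, D) embeddings from layer l-1.
    nbr_idx: (N, K) integer array; row i lists the K ordered nodes of N_i.
    W: (F, K, D) bank of F filters, each spanning a whole neighborhood.
    b: (F,) bias. Returns (N, F) embeddings for layer l."""
    E = H[nbr_idx]                               # (N, K, D): stacked neighbor embeddings
    Z = np.einsum("nkd,fkd->nf", E, W) + b       # one inner product per node and filter
    return np.maximum(Z, 0.0)                    # sigma = ReLU (assumption)

# Toy example: 5 nodes, K = 3 neighbors each, D = 4 input features, F = 2 filters
rng = np.random.default_rng(0)
H = rng.normal(size=(5, 4))
nbr_idx = np.array([[0, 1, 2], [1, 0, 2], [2, 1, 3], [3, 2, 4], [4, 3, 2]])
W = rng.normal(size=(2, 3, 4))
b = np.zeros(2)
out = ego_convolution(H, nbr_idx, W, b)
```

Unlike a per-node filter, each of the F filters here has separate weights for every position in the neighborhood, so it can respond to combinations of neighbor embeddings.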

SLIDE 10

Challenge: variable-sized Ni makes W ill-defined

$$h_i^{(l)} = \sigma\Big(W \circledast \big[h_j^{(l-1)}\big]_{j \in N_i} + b\Big)$$

For images, $N_i$ can be easily defined

E.g., the K × K pixel block centered at i

But how to define $N_i$ for graphs?

Solution: the nodes that are most salient to the given task in an ego-network centered at i

1. First layer: set $N_i^{(1)}$ as the top K unique nodes in Weisfeiler-Lehman labeling

2. Deep layers: $N_i^{(l)} = N_i^{(l-1)}$ (just like image-based CNNs)
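The first-layer selection could be sketched as follows. The slide does not specify how the ranking works, so everything here is an assumption made for illustration: Weisfeiler-Lehman labels are computed by a few rounds of color refinement starting from node degrees, candidates are ordered by hop distance from the center and then by descending label, and neighborhoods smaller than K are padded with `None`. The function names are hypothetical.

```python
from collections import deque

def wl_labels(adj, rounds=2):
    """Weisfeiler-Lehman color refinement on an adjacency-list graph."""
    labels = {v: len(adj[v]) for v in adj}                  # start from degrees
    for _ in range(rounds):
        sigs = {v: (labels[v], tuple(sorted(labels[u] for u in adj[v]))) for v in adj}
        ranking = {s: r for r, s in enumerate(sorted(set(sigs.values())))}
        labels = {v: ranking[sigs[v]] for v in adj}
    return labels

def top_k_neighborhood(adj, center, k, labels):
    """Pick K nodes around `center`: nearer hops first, then higher WL label."""
    dist = {center: 0}
    queue = deque([center])
    while queue:                                            # breadth-first search
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    ordered = sorted(dist, key=lambda v: (dist[v], -labels[v], v))
    return ordered[:k] + [None] * max(0, k - len(ordered))  # pad small graphs

# Toy graph: triangle 0-1-2 with a tail 2-3-4
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}
labels = wl_labels(adj)
n3 = top_k_neighborhood(adj, 3, 3, labels)
n4 = top_k_neighborhood(adj, 4, 6, labels)
```

Since the neighborhoods are computed once and reused at deeper layers, this step runs as preprocessing and every filter sees a fixed K-node input.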

SLIDE 12

Improved learning ability on graph classification

In Ego-CNN, a filter W at layer l can detect node interaction patterns within l-hop ego-networks

[Figure panels: 1st layer, 2nd, 5th]

Graph classification benchmark datasets

With K = 16, Ego-CNN is comparable to the state of the art
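The growth of the receptive field with depth can be checked with a few lines of Python. The sketch assumes the neighborhoods are fixed across layers and include the center node itself. The path graph and the helper name are made up for illustration.

```python
def receptive_fields(nbrs, num_layers):
    """nbrs[i] is the fixed neighborhood N_i (including i itself).
    Returns a list where entry l-1 maps each node i to the set of input
    nodes that its layer-l embedding depends on."""
    fields = [{i: set(n) for i, n in nbrs.items()}]          # layer 1 sees N_i
    for _ in range(num_layers - 1):
        prev = fields[-1]
        # layer l sees everything that its neighbors saw at layer l-1
        fields.append({i: set().union(*(prev[j] for j in n)) for i, n in nbrs.items()})
    return fields

# Path graph 0 - 1 - 2 - 3 - 4 with 1-hop neighborhoods
nbrs = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2, 3], 3: [2, 3, 4], 4: [3, 4]}
fields = receptive_fields(nbrs, num_layers=2)
```

On this toy graph the middle node covers its 1-hop ego-network after the first layer and the whole path after the second.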

SLIDE 13

Ego-CNN can learn critical structures WITHOUT α

Backtracking W with common CNN visualization techniques (e.g., deconv) reveals critical structures

Local-Scale: Alkane vs Alcohol    Global-Scale: Symmetric vs Asymmetric
(a) C14H29OH                      (c) Symmetric Isomer
(b) C82H165OH                     (d) Asymmetric Isomer

Table: Visualization of the critical structures detected by Ego-CNN

SLIDE 15

More benefits... and let’s chat at Poster #22

Ego-CNN can detect self-similar patterns

I.e., the same patterns that exist at different zoom levels

Such patterns commonly exist in social networks

How? By simply tying the weights (W’s) across different layers

For more details, please go to Poster #22
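Weight tying might look like the sketch below, where a single filter bank is reused at every layer. This is an illustration under assumptions rather than the authors' implementation: the layer is a simplified neighborhood convolution with ReLU, the number of filters F must equal the input dimension D so that each layer's output can feed the same filters again, and all names are hypothetical.

```python
import numpy as np

def ego_conv(H, nbr_idx, W, b):
    # Simplified neighborhood convolution: each filter spans all K neighbor embeddings
    return np.maximum(np.einsum("nkd,fkd->nf", H[nbr_idx], W) + b, 0.0)

def tied_ego_cnn(H, nbr_idx, W, b, num_layers):
    """Apply one shared filter bank W of shape (F, K, D) at every layer."""
    assert W.shape[0] == W.shape[2], "weight tying needs F == D"
    outputs = []
    for _ in range(num_layers):
        H = ego_conv(H, nbr_idx, W, b)       # same W and b at each depth
        outputs.append(H)
    return outputs

# Toy example: 5 nodes, K = 3, D = F = 4, three tied layers
rng = np.random.default_rng(0)
H0 = rng.normal(size=(5, 4))
nbr_idx = np.array([[0, 1, 2], [1, 0, 2], [2, 1, 3], [3, 2, 4], [4, 3, 2]])
W = rng.normal(size=(4, 3, 4))
outputs = tied_ego_cnn(H0, nbr_idx, W, np.zeros(4), num_layers=3)
```

Because the same filters are applied to progressively larger ego-networks, a pattern they respond to at one layer is also the pattern they look for at the next zoom level.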
