Network Embedding: Introductory Talk by Akash Anil, Research Scholar (PowerPoint PPT Presentation)



SLIDE 1

Network Embedding

Introductory Talk by

Akash Anil

Research Scholar OSINT Lab

  • Dept. of Computer Science & Engineering

Indian Institute of Technology Guwahati

September 11, 2019

Akash Anil Network Embedding September 11, 2019 1 / 25

SLIDE 2

Table of contents

1

Network Embedding

2

Word Embedding

3

Skip-Gram Based Neural Node Embedding

4

DeepWalk

5

Node2Vec

6

Limitation

7

VERSE

8

References


SLIDE 4

Network Embedding

Network Embedding

Suppose G(V, E) represents a network. Network embedding then refers to generating low-dimensional network features corresponding to nodes, edges, substructures, or the whole graph [1].

Figure: Different taxonomies of network embedding. Picture source: Cai et al. [1], https://arxiv.org/pdf/1709.07604.pdf


SLIDE 6

Network Embedding Applications

Applications

1 Automatic feature-vector generation helps in solving traditional problems on graphs, e.g. node classification, relation prediction, clustering, etc.

2 Recent uses

SLIDE 7

Network Embedding: Why Neural Networks?

Why neural-network-based network embedding?

Traditional approaches based on matrix factorization (e.g. SVD) do not scale to networks with a large number of nodes. Recent advances in unsupervised word embedding use a single-layer neural network (e.g. Word2Vec [3]).

SLIDE 8

Word Embedding

Word Embedding (Word2Vec)
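The slide itself carries only a figure, but the core of Word2Vec's skip-gram variant can be sketched in a few lines: slide a window over the corpus and emit (center, context) training pairs, which the single-layer network then learns to predict. The function and toy corpus below are illustrative, not from the talk.

```python
def skipgram_pairs(tokens, window=2):
    """Emit (center, context) training pairs within a symmetric window."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the center word itself
                pairs.append((center, tokens[j]))
    return pairs

corpus = "the quick brown fox".split()
print(skipgram_pairs(corpus, window=1))
# -> [('the', 'quick'), ('quick', 'the'), ('quick', 'brown'),
#     ('brown', 'quick'), ('brown', 'fox'), ('fox', 'brown')]
```

The same pair-extraction step is what DeepWalk and Node2Vec later reuse, with random-walk node sequences taking the place of sentences.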

SLIDE 9

Skip-Gram Based Neural Node Embedding

A generalized framework used for Node Embedding

Figure: Skip-gram model for node embedding. A one-hot input vector x_j (V-dim) feeds an N-dim hidden layer through the weight matrix W_{V×N}; a shared output matrix W'_{V×N} produces the C context predictions y_{1j}, y_{2j}, …, y_{Cj}. The input node sequences are drawn from G(V, E).

SLIDE 10

Skip-Gram Based Neural Node Embedding

A generalized framework used for Node Embedding

SLIDE 11

DeepWalk

DeepWalk [4]

DeepWalk is the first network embedding model exploiting neural networks, and it is scalable to large real-world networks.

SLIDE 12

DeepWalk

Designing DeepWalk Model

1 Generate node sequences (the input corpus) using truncated random walks.

2 Repeat the random walk from each source node 80 times for convergence.

3 Supply the node sequences as input to the skip-gram model.

4 Maximize the probability of the neighborhood of each given node.
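The corpus-generation steps above can be sketched as follows. This is an illustrative sketch, not DeepWalk's reference code; the adjacency list is made up, and the defaults follow the numbers quoted on the slide (length-80 truncated walks, 80 walks per source node).

```python
import random

def random_walk(adj, start, length, rng):
    """One truncated random walk over an adjacency-list graph."""
    walk = [start]
    while len(walk) < length:
        nbrs = adj[walk[-1]]
        if not nbrs:          # dead end: truncate the walk early
            break
        walk.append(rng.choice(nbrs))
    return walk

def build_corpus(adj, walks_per_node=80, walk_length=80, seed=0):
    """Repeat walks from every source node to form the skip-gram corpus."""
    rng = random.Random(seed)
    corpus = []
    for _ in range(walks_per_node):
        for node in adj:
            corpus.append(random_walk(adj, node, walk_length, rng))
    return corpus

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
corpus = build_corpus(adj, walks_per_node=2, walk_length=5)
print(len(corpus))  # 2 walks per node x 3 nodes = 6 walks
```

Each walk in `corpus` then plays the role of a "sentence" for the skip-gram model of the previous slides.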

SLIDE 13

DeepWalk Results

Multi-label Classification for Blog-Catalog Data

SLIDE 14

DeepWalk Results

Multi-label Classification for Flickr Data

SLIDE 15

DeepWalk Results

Limitations of DeepWalk

DeepWalk relies on a rigid notion of network neighborhood, i.e. local characteristics, and fails to capture proximities of different semantics.

SLIDE 16

Node2Vec

Node2Vec [2]

Uses 2nd-order random walks to generate the corpus. Presents a semi-supervised model that balances the trade-off between capturing local and global network characteristics. A scalable model applicable to any type of graph, e.g. (un)directed, (un)weighted, etc.

SLIDE 17

Node2Vec

Designing Node2Vec (I)

Suppose a random walker has just traversed edge (t, v) ∈ E and now rests at node v. To choose the next node x originating from v, Node2Vec sets the unnormalized transition probability to

π_vx = α_pq(t, x) · w_vx,  where

α_pq(t, x) = 1/p  if d_tx = 0
           = 1    if d_tx = 1
           = 1/q  if d_tx = 2

and d_tx is the shortest-path distance between nodes t and x.
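The piecewise bias α_pq above can be written directly as a function; p, q and the distances are as defined on the slide, while the example parameter values are made up:

```python
def alpha(p, q, d_tx):
    """Node2Vec search bias alpha_pq(t, x) for d_tx in {0, 1, 2}."""
    if d_tx == 0:        # x == t: walk returns to the previous node
        return 1.0 / p
    if d_tx == 1:        # x is a common neighbour of t and v
        return 1.0
    if d_tx == 2:        # x moves the walk away from t
        return 1.0 / q
    raise ValueError("d_tx must be 0, 1, or 2 for a neighbour of v")

# Unnormalized transition probability: pi_vx = alpha(p, q, d_tx) * w_vx
print(alpha(p=4, q=0.5, d_tx=0))  # 0.25 -> high p discourages backtracking
print(alpha(p=4, q=0.5, d_tx=2))  # 2.0  -> low q encourages outward moves
```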

SLIDE 19

Node2Vec

Designing Node2Vec (II)

p is treated as the return parameter; q is treated as the in-out parameter. A high q gives BFS-like behaviour, and a low q gives DFS-like behaviour. BFS helps capture local proximities between nodes; DFS helps capture global proximities. Node sequences are generated using truncated random walks of length 80, with the random walk repeated 10 times from each node. Node2Vec sets the hyper-parameters p and q using a 10% sample of the dataset. The node sequences are supplied as input to the skip-gram model, which maximizes the neighborhood probability.
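Putting the pieces together, one second-order walk step can be sketched as below for an unweighted graph (w_vx = 1): the walker sits at v having arrived from t, and samples the next node x with probability proportional to α_pq(t, x). The graph and parameter values are illustrative only.

```python
import random

def step(adj, t, v, p, q, rng):
    """Sample the next node of a second-order walk at v, arrived from t."""
    candidates, weights = [], []
    for x in adj[v]:
        if x == t:
            w = 1.0 / p          # d_tx = 0: return to previous node
        elif x in adj[t]:
            w = 1.0              # d_tx = 1: common neighbour of t and v
        else:
            w = 1.0 / q          # d_tx = 2: move away from t
        candidates.append(x)
        weights.append(w)
    return rng.choices(candidates, weights=weights, k=1)[0]

adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
rng = random.Random(42)
# With high p and low q the step is strongly biased away from t=0,
# i.e. toward node 3 (DFS-like behaviour).
print(step(adj, t=0, v=1, p=10.0, q=0.1, rng=rng))
```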

SLIDE 20

Node2Vec results

Efficiency of Node2Vec on multi-label classification

SLIDE 21

Limitation

Limitations of DeepWalk and Node2Vec

Both models fail to capture the different types of similarity naturally observed in real-world networks.

SLIDE 23

VERSE

Versatile Graph Embeddings (VERSE) [5]

Proposes a model capable of capturing different types of similarity distributions. Uses state-of-the-art similarity measures to instantiate the model.

SLIDE 24

VERSE

Designing VERSE

Select a similarity measure, such as Personalized PageRank, SimRank, etc., and generate the graph similarity distribution SimG.

Initialize the embedding space with random weights; it induces the distribution SimE.

Minimize the KL-divergence between the two distributions:

Σ_{v∈V} KL(SimG(v, ·) || SimE(v, ·))
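For a single node v, the objective above reduces to a discrete KL-divergence, which a short sketch makes concrete. The two example distributions below are made up, not from the paper:

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

sim_g = [0.7, 0.2, 0.1]   # e.g. Personalized PageRank scores from v
sim_e = [0.5, 0.3, 0.2]   # e.g. softmax over embedding dot products
print(round(kl_divergence(sim_g, sim_e), 4))  # 0.0851
```

VERSE minimizes the sum of such terms over all nodes v ∈ V, pulling each embedding-induced distribution SimE(v, ·) toward its graph counterpart SimG(v, ·).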

SLIDE 25

VERSE results

Efficiency of VERSE on multi-class classification for the Co-cit data

SLIDE 26

References

References I

[1] Hongyun Cai, Vincent W. Zheng, and Kevin Chang. A comprehensive survey of graph embedding: problems, techniques and applications. TKDE, 2018.

[2] Aditya Grover and Jure Leskovec. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '16, pages 855–864, New York, NY, USA, 2016. ACM.

SLIDE 27

References

References II

[3] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems, pages 3111–3119, 2013.

[4] Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. DeepWalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 701–710. ACM, 2014.

SLIDE 28

References

References III

[5] Anton Tsitsulin, Davide Mottin, Panagiotis Karras, and Emmanuel Müller. VERSE: Versatile graph embeddings from similarity measures. In Proceedings of the 2018 World Wide Web Conference, WWW '18, pages 539–548, Republic and Canton of Geneva, Switzerland, 2018. International World Wide Web Conferences Steering Committee.

SLIDE 29

Thank You