Graph-based semi-supervised learning for complex networks, Leto Peel (PowerPoint PPT presentation)



SLIDE 1

Graph-based semi-supervised learning for complex networks

Leto Peel Université catholique de Louvain @PiratePeel

SLIDE 2

Here is a network

social networks food webs internet protein interactions

SLIDE 3

social networks age, sex, ethnicity, race, etc. food webs feeding mode, species body mass, etc. internet data capacity, physical location, etc. protein interactions molecular weight, association with cancer, etc.

Network nodes can have properties or attributes (metadata)

Metadata values Metadata unknown

SLIDE 4

social networks age, sex, ethnicity, race, etc. food webs feeding mode, species body mass, etc. internet data capacity, physical location, etc. protein interactions molecular weight, association with cancer, etc.

Network nodes can have properties or attributes (metadata)

Metadata values Metadata unknown

Can we predict the unknown metadata values?

SLIDE 5

Now, let's talk about supervised learning...

X: input (feature vector)   Y: output (discrete label)

Training: learn the classifier f from labelled pairs {(X, Y)}train

Inference: predict Ỹ = f(Xtest) → classification

SLIDE 6

(diagram: the learned classifier f)

SLIDE 7

SLIDE 8

Now, let's talk about semi-supervised learning...

X: input (feature vector)   Y: output (discrete label)

Training: learn the classifier f from {(X, Y)}train together with the unlabelled Xtest, i.e. use all available data for training the classifier

Inference: predict Ỹ = f(Xtest) → classification

SLIDE 9

Construct a graph based on similarity in X and propagate label information around the graph

Graph-based semi-supervised learning
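This classic pipeline can be sketched in a few lines of NumPy. The helper name `knn_similarity_graph`, the toy data, and the clamped averaging loop are my own illustration of the idea, not code from the talk:

```python
import numpy as np

def knn_similarity_graph(X, k=2):
    """Symmetric k-nearest-neighbour graph built from feature vectors X."""
    n = len(X)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances
    np.fill_diagonal(d, np.inf)
    A = np.zeros((n, n))
    for i in range(n):
        for j in np.argsort(d[i])[:k]:
            A[i, j] = A[j, i] = 1.0
    return A

# Toy data: two well-separated clusters, one labelled node in each
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
known = {0: 0, 3: 1}                     # node index -> known class
A = knn_similarity_graph(X, k=2)

# Propagate labels: average over neighbours, clamping the known labels
F = np.zeros((len(X), 2))
for i, c in known.items():
    F[i, c] = 1.0
P = A / A.sum(axis=1, keepdims=True)     # row-normalised transition matrix
for _ in range(50):
    F = P @ F
    for i, c in known.items():           # clamp the labelled nodes
        F[i] = 0.0
        F[i, c] = 1.0
print(F.argmax(axis=1))                  # → [0 0 0 1 1 1]
```

Each unlabelled node ends up with the label of the cluster it sits in, because label mass can only flow along similarity edges.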

SLIDE 10

Semi-supervised learning in complex networks

Metadata values Metadata unknown

SLIDE 11

Semi-supervised learning in complex networks

Metadata values Metadata unknown assortative

SLIDE 12

Semi-supervised learning in complex networks

Metadata values Metadata unknown assortative disassortative

SLIDE 13

Semi-supervised learning in complex networks

Metadata values Metadata unknown assortative disassortative mixed

SLIDE 14

Semi-supervised learning in relational networks

Metadata values Metadata unknown assortative disassortative mixed

SLIDE 15

Semi-supervised learning in relational networks

Metadata values Metadata unknown assortative disassortative mixed

SLIDE 16

Semi-supervised learning in relational networks

Metadata values Metadata unknown assortative disassortative mixed

SLIDE 17

Naive application of label propagation does not work if we don't know how classes interact

SLIDE 18

Naive application of label propagation does not work if we don't know how classes interact Solution: Construct a similarity graph based on the relational network

SLIDE 19

Structurally equivalent nodes

Lorrain & White, Structural equivalence of individuals in social networks. J. Math. Sociol., 1971
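A minimal NumPy check of this definition (the helper name and toy graph are mine): two nodes are structurally equivalent when they link to exactly the same set of other nodes.

```python
import numpy as np

def structurally_equivalent(A, u, v):
    """True if u and v link to exactly the same other nodes (ignoring the u-v tie)."""
    nu, nv = A[u].copy(), A[v].copy()
    nu[[u, v]] = 0        # discount self-loops and the mutual tie
    nv[[u, v]] = 0
    return np.array_equal(nu, nv)

# Toy graph: nodes 0 and 1 both link only to 2 and 3; node 2 also links to 4
A = np.array([[0, 0, 1, 1, 0],
              [0, 0, 1, 1, 0],
              [1, 1, 0, 0, 1],
              [1, 1, 0, 0, 0],
              [0, 0, 1, 0, 0]])
print(structurally_equivalent(A, 0, 1))  # → True
print(structurally_equivalent(A, 2, 3))  # → False (only node 2 links to node 4)
```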

SLIDE 20

Common neighbours

Cosine similarity is a measure of how structurally equivalent two nodes are → "cosine label propagation"
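One way to sketch this (helper name and toy graph are mine): take cosine similarity between rows of the adjacency matrix, so two nodes with identical neighbourhoods score 1 even when they are not adjacent.

```python
import numpy as np

def neighbourhood_cosine(A):
    """Cosine similarity between node neighbourhoods (the rows of A)."""
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    norms[norms == 0] = 1.0              # guard against isolated nodes
    An = A / norms
    return An @ An.T

# 4-cycle: nodes 0 and 2 share both neighbours but are not adjacent
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
S = neighbourhood_cosine(A)
print(S[0, 2])   # ≈ 1.0: structurally equivalent
print(S[0, 1])   # = 0.0: no common neighbours
```

Propagating labels over S instead of A is what makes label propagation usable when classes are disassortative.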

SLIDE 21

Neighbours of neighbours

The set of neighbours of a node's neighbours contains all structurally equivalent nodes → "two-step label propagation"
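A small NumPy demonstration of why two steps help (toy graph mine): in a bipartite graph, one propagation step sends label mass to the opposite class, while two steps (A @ A counts length-2 paths) send it to structurally equivalent same-class nodes.

```python
import numpy as np

# Complete bipartite toy graph: nodes 0,1 on one side, 2,3 on the other
A = np.array([[0, 0, 1, 1],
              [0, 0, 1, 1],
              [1, 1, 0, 0],
              [1, 1, 0, 0]], dtype=float)
A2 = A @ A                            # counts paths of length two
f = np.array([1.0, 0.0, 0.0, 0.0])    # label mass placed on node 0

# One step crosses to the other side; two steps reach the structurally
# equivalent node 1 on node 0's own side.
print(A @ f)    # → [0. 0. 1. 1.]
print(A2 @ f)   # → [2. 2. 0. 0.]
```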

SLIDE 22

Why are paths of length 2 important?

bipartite / disassortative → negative auto-correlation (Gallagher et al., Using ghost edges for classification in sparsely labeled networks, KDD 2008); presence of triangles in assortative relations

SLIDE 23

Why are paths of length 2 important?

Label propagation is an eigenvector problem: the propagation operator has eigenvalues in [−1, 1], from most positive (assortative structure) to most negative (disassortative structure)
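This eigenvalue range is easy to verify numerically (toy graph mine): the symmetrically normalised operator D^(−1/2) A D^(−1/2) always has eigenvalues in [−1, 1], and a bipartite graph attains −1.

```python
import numpy as np

# 4-cycle (bipartite), with symmetric normalisation D^{-1/2} A D^{-1/2}
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
d = A.sum(axis=1)
L = A / np.sqrt(np.outer(d, d))
vals = np.sort(np.linalg.eigvalsh(L))
print(vals)     # eigenvalues lie in [-1, 1]; here -1, 0, 0, 1
```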

SLIDE 24

Why are paths of length 2 important?

Label propagation is an eigenvector problem: the operator has eigenvalues in [−1, 1]. When we consider even path lengths using L² (or A² in the case of cosine LP), the eigenvectors remain unchanged, but the eigenvalues become all positive
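A quick numerical check of this claim (toy graph mine): squaring the operator keeps its eigenvectors and squares its eigenvalues, so the most-negative (disassortative) mode becomes just as dominant as the most-positive one.

```python
import numpy as np

A = np.array([[0, 1, 0, 1],      # 4-cycle: bipartite / disassortative
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
d = A.sum(axis=1)
L = A / np.sqrt(np.outer(d, d))          # D^{-1/2} A D^{-1/2}
vals = np.sort(np.linalg.eigvalsh(L))
vals2 = np.sort(np.linalg.eigvalsh(L @ L))
print(vals)      # contains -1: the bipartite (disassortative) mode
print(vals2)     # all non-negative: the -1 eigenvalue has become +1
```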

SLIDE 25

SLIDE 26

Gratuitous Comp. Sci. “My curve is better than your curve” slide

SLIDE 27

Take home messages...

1) Complex networks are not (necessarily) the same as similarity graphs

  • we should adapt our methods accordingly
SLIDE 28

Take home messages...

1) Complex networks are not (necessarily) the same as similarity graphs

  • we should adapt our methods accordingly

2) Machine Learning for Complex Networks does not require representing nodes as feature vectors

  • use Network Science!
SLIDE 29

Advertisement

Applications now open! http://wwcs2019.org/ February 4-8th 2019 Zakopane, Poland

SLIDE 30

For more information...

Peel, Graph-based semi-supervised learning for relational networks. SIAM International Conference on Data Mining, 2017 https://arxiv.org/abs/1612.05001

Contact: leto.peel@uclouvain.be @PiratePeel

SLIDE 31

F = αLF + (1 − α)B (L: linear operator; α: regularisation parameter; F: predicted labels; B: known labels)

SLIDE 32

F = αLF + (1 − α)B (L: linear operator; α: regularisation parameter; F: predicted labels; B: known labels)

L: N × N (graph connectivity)

B: N × C, with entry 1 (or 0) if we know the node belongs to the class (or not), and 1/C otherwise

Initialise F = B
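The construction of B can be sketched directly from this description (the helper name `make_B` is mine):

```python
import numpy as np

def make_B(n, C, known):
    """Known-label matrix B: one-hot rows for labelled nodes, 1/C elsewhere."""
    B = np.full((n, C), 1.0 / C)
    for node, c in known.items():
        B[node] = 0.0
        B[node, c] = 1.0
    return B

# 4 nodes, 2 classes, nodes 0 and 3 labelled with classes 0 and 1
B = make_B(n=4, C=2, known={0: 0, 3: 1})
F = B.copy()     # initialise F = B
print(B)         # rows: [1, 0], [0.5, 0.5], [0.5, 0.5], [0, 1]
```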

SLIDE 33

F = αLF + (1 − α)B: the αLF term enforces smoothness, the (1 − α)B term enforces consistency with the known labels

SLIDE 34

L = D^(−1/2) A D^(−1/2), not I − D^(−1/2) A D^(−1/2), since we require the "smoothest" eigenvector to be dominant (associated with the largest eigenvalue). Zhou et al., Learning with local and global consistency, NIPS 2003

SLIDE 35

Solve F = αLF + (1 − α)B using the power method (iterate until convergence). Zhou et al., Learning with local and global consistency, NIPS 2003
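A minimal sketch of this iteration, assuming the Zhou et al. update F ← αLF + (1 − α)B with L = D^(−1/2) A D^(−1/2); the toy path graph and α value are my own choices:

```python
import numpy as np

# Path graph 0-1-2-3 with the two end nodes labelled with opposite classes
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
d = A.sum(axis=1)
L = A / np.sqrt(np.outer(d, d))            # D^{-1/2} A D^{-1/2}
B = np.array([[1.0, 0.0], [0.5, 0.5],
              [0.5, 0.5], [0.0, 1.0]])     # known labels at nodes 0 and 3
alpha = 0.9                                # regularisation parameter

F = B.copy()                               # initialise F = B
for _ in range(200):                       # power-method iteration
    F = alpha * (L @ F) + (1 - alpha) * B
print(F.argmax(axis=1))    # → [0 0 1 1]
```

Since αL has spectral radius below 1, the iteration converges to the fixed point F = (1 − α)(I − αL)^(−1) B, and each unlabelled node takes the label of its nearer labelled end.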