Algorithmic Challenges in Link Streams: the case of clique computations


SLIDE 1

Introduction Maximal cliques in link streams Link stream edition problems

Algorithmic Challenges in Link Streams: the case of clique computations

Clémence Magnien, work in collaboration with Tiphaine Viard, Matthieu Latapy, Phan Thi Ha Duong, Binh-Minh Bui-Xuan, Pierre Meyer

ComplexNetworks(.fr) LIP6 (CNRS, Sorbonne Université) first.last@lip6.fr

July 9th, 2018

  • C. Magnien

1/21

SLIDE 2

Outline

1. Introduction
2. Maximal cliques in link streams
  • Maximal ∆-cliques in instantaneous link streams
  • Maximal cliques in link streams with durations
3. Link stream edition problems

SLIDE 3

Link streams

Models of temporal interactions
L = (T, V, E)
  • T = [α; ω]
  • V: set of nodes
  • E ⊆ T × V ⊗ V: set of links
One link = (t, uv)
Two cases of interest
  • instantaneous link streams
  • link streams with durations

[Figure: example link stream on nodes a, b, c, d, drawn along a time axis]

SLIDE 5

Link streams

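Models of temporal interactions
L = (T, V, E)
  • T = [α; ω]
  • V: set of nodes
  • E ⊆ T × V ⊗ V: set of links
One link = (b, e, uv) in a link stream with durations (b: beginning, e: end)
Two cases of interest
  • instantaneous link streams
  • link streams with durations

[Figure: example link stream on nodes a, b, c, d, drawn along a time axis]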
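As an illustration of the model L = (T, V, E), here is a minimal Python sketch of a container for both cases. The class name `LinkStream` and its methods are hypothetical, not taken from the talk; the only modelling assumption is that an instantaneous link (t, uv) is stored as a link with duration whose beginning and end coincide.

```python
from dataclasses import dataclass, field


@dataclass
class LinkStream:
    """L = (T, V, E) with T = [alpha, omega]; all names are illustrative."""
    alpha: float
    omega: float
    nodes: set = field(default_factory=set)
    # Each link is stored as (b, e, u, v).
    # An instantaneous link (t, uv) is stored with b == e == t (assumption).
    links: list = field(default_factory=list)

    def add_link(self, b, e, u, v):
        if not (self.alpha <= b <= e <= self.omega):
            raise ValueError("link must lie inside T = [alpha, omega]")
        self.nodes.update((u, v))
        self.links.append((b, e, u, v))

    def links_at(self, t):
        """Unordered pairs uv that are linked at time t."""
        return {frozenset((u, v)) for b, e, u, v in self.links if b <= t <= e}


stream = LinkStream(alpha=0, omega=10)
stream.add_link(2, 2, "a", "b")  # instantaneous link (t, uv)
stream.add_link(1, 4, "b", "c")  # link with duration (b, e, uv)
```

Storing pairs as `frozenset` reflects that links are undirected (uv = vu).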

SLIDE 6

Definitions

Extensions of graph definitions
  • Paths
  • (Strongly) connected components
  • Betweenness centrality
  • Cores and shells
  • . . .
Extensions of algorithms?

SLIDE 9

Clique (in a graph)

X ⊆ V
  • Induced subgraph: all possible links exist
  • Maximal clique: not included in any other clique

SLIDE 13

∆-clique in instantaneous link streams

(X, [b; e]) ⊆ V × T
  • Induced sub-stream: all possible links exist all the time
  • All the time: at least once every ∆
  • Maximal if it is not included in any other ∆-clique
Examples for ∆ = 3:

[Figure: two example link streams on nodes a, b, c, drawn along a time axis]

Signatures of distributed applications, meetings, . . .
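The definition above can be turned into a small membership test. This is a sketch under one stated assumption: "at least every ∆" is read as "for every pair of nodes in X, the first link occurs at most ∆ after b, the last link at most ∆ before e, and consecutive occurrences are at most ∆ apart". The function name and the (t, u, v) link encoding are illustrative.

```python
from itertools import combinations


def is_delta_clique(links, nodes, b, e, delta):
    """Check whether (nodes, [b, e]) is a Delta-clique of an instantaneous stream.

    links: iterable of (t, u, v) triples, undirected.
    """
    for u, v in combinations(sorted(nodes), 2):
        times = sorted(t for t, x, y in links
                       if {x, y} == {u, v} and b <= t <= e)
        if not times:
            return False  # the pair never interacts inside [b, e]
        # b, the link times, and e must never be more than delta apart.
        checkpoints = [b] + times + [e]
        if any(t2 - t1 > delta for t1, t2 in zip(checkpoints, checkpoints[1:])):
            return False
    return True


links = [(1, "a", "b"), (3, "a", "b"), (5, "a", "b"),
         (2, "a", "c"), (4, "a", "c"),
         (2, "b", "c"), (5, "b", "c")]
```

With ∆ = 3, `is_delta_clique(links, {"a", "b", "c"}, 1, 5, 3)` holds, while the same node set over [0, 9] fails because a and b do not interact after time 5.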

SLIDE 14

Cliques in link streams with duration

(X, [b; e]) ⊆ V × T
  • Induced sub-stream: all possible links exist all the time
  • Maximal if it is not included in any other clique

[Figure: example link stream on nodes a, b, c, d, time axis from 2 to 8]

SLIDE 17

Enumerate maximal ∆-cliques in a link stream

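Naive algorithm
  • Queue Q
  • For all (t, uv) ∈ E, ({u, v}, [t, t]) is a ∆-clique → Q
  • While Q ≠ ∅: pop C from Q
      • if a node or time can be added → Q
      • otherwise C is maximal

[Figure: from a clique (X, b, e), the algorithm tries to add a node u, giving (X ∪ {u}, b, e); to extend the end time to e' > e, giving (X, b, e'); and to extend the start time to b' < b, giving (X, b', e). New cliques are added to the set of discovered cliques.]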
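The queue-based enumeration can be sketched in Python as follows. This is an illustrative reading of the slides, not the authors' code: the time extension uses the rule "for each pair take the latest (resp. earliest) occurrence in [b, e], keep the earliest (resp. latest) of these, and add (resp. subtract) ∆", clipped to T = [alpha, omega]. All function and variable names are hypothetical, and the example links were chosen to be consistent with the configurations listed on the "Configuration space" slide, which is an assumption since the figure itself is not available.

```python
from collections import defaultdict, deque
from itertools import combinations


def pair_times(times, u, v, b, e):
    """Sorted link times between u and v inside [b, e]."""
    return [t for t in times[frozenset((u, v))] if b <= t <= e]


def is_delta_clique(times, X, b, e, delta):
    for u, v in combinations(sorted(X), 2):
        ts = pair_times(times, u, v, b, e)
        if not ts:
            return False
        pts = [b] + ts + [e]
        if any(t2 - t1 > delta for t1, t2 in zip(pts, pts[1:])):
            return False
    return True


def maximal_delta_cliques(links, delta, alpha, omega):
    """links: list of (t, u, v). Returns a set of (frozenset X, b, e)."""
    times = defaultdict(list)
    nodes = set()
    for t, u, v in sorted(links):
        times[frozenset((u, v))].append(t)
        nodes.update((u, v))

    queue, seen, maximal = deque(), set(), set()

    def push(X, b, e):
        if (X, b, e) not in seen:
            seen.add((X, b, e))
            queue.append((X, b, e))

    for t, u, v in links:
        push(frozenset((u, v)), t, t)  # each link is a trivial Delta-clique

    while queue:
        X, b, e = queue.popleft()
        is_max = True
        # Try to add a node over the same interval.
        for w in nodes - X:
            if is_delta_clique(times, X | {w}, b, e, delta):
                push(X | {w}, b, e)
                is_max = False
        per_pair = [pair_times(times, u, v, b, e)
                    for u, v in combinations(sorted(X), 2)]
        # Extend to the right: earliest of the latest occurrences, plus delta.
        new_e = min(min(ts[-1] for ts in per_pair) + delta, omega)
        if new_e > e:
            push(X, b, new_e)
            is_max = False
        # Extend to the left: latest of the earliest occurrences, minus delta.
        new_b = max(max(ts[0] for ts in per_pair) - delta, alpha)
        if new_b < b:
            push(X, new_b, e)
            is_max = False
        if is_max:
            maximal.add((X, b, e))
    return maximal


example = [(3, "a", "b"), (6, "a", "b"), (5, "a", "c"), (4, "b", "c")]
result = maximal_delta_cliques(example, delta=3, alpha=0, omega=10)
```

On this example the sketch returns four maximal ∆-cliques: ({a, b}, [0, 9]), ({a, c}, [2, 8]), ({b, c}, [1, 7]) and ({a, b, c}, [2, 7]). The `seen` set avoids re-exploring a configuration reached through several extension orders.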

SLIDE 21

Time extension

[Figure: example link stream on nodes a, b, c, d, time axis from 2 to 6, with ∆ = 4]

  • For all links of the clique: take the latest occurrence
  • Take the earliest such occurrence
  • Add ∆

SLIDE 22

Sketch of proof (1)

1. Initially, all elements of Q are ∆-cliques
2. One step transforms a ∆-clique into (several) ∆-cliques
3. The output contains only maximal ∆-cliques

SLIDE 23

Sketch of proof (2)

All maximal ∆-cliques of L are in the output
Let C = (X, [b, e]) be an arbitrary maximal ∆-clique.
(s, uv): earliest link of C
  • C0 = ({u, v}, [s, s])
  • C1 = ({u, v}, [s, s + ∆])
  • . . . (add nodes)
  • Ck = (X, [s, s + ∆])
  • . . . (increase time on the right)
  • Ce = (X, [s, e])
  • C = (X, [b, e])

SLIDE 24

Complexity

O(2^n n^2 m^3 + 2^n n^3 m^2)
Interesting observations
  • No relation between n and m
  • Small n, large m → reasonable running time
  • 2^n: all subsets of nodes
  • In practice: only subsets of nodes linked at the same time → running time increases with ∆

SLIDE 29

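Link streams with durations: same algorithm, except time extension

[Figure: example link stream on nodes a, b, c, d, time axis from 2 to 8]

  • Extend time to the earliest link end
  • No time extension to the left
  • Small complexity gain: from O(2^n n^2 m^3 + 2^n n^3 m^2) to O(2^n n^2 m^2 log m + 2^n n^3 m^2)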
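A minimal sketch of the "earliest link end" rule for link streams with durations, under the assumption that (X, [b, e]) is already a clique, so each pair of nodes has a link interval that covers [b, e]. The function name and the (start, end, u, v) encoding are hypothetical.

```python
from itertools import combinations


def extend_end(links, nodes, b, e):
    """Largest end time e' such that (nodes, [b, e']) is still a clique.

    links: iterable of (start, end, u, v) with durations, undirected.
    Assumes nodes has at least two elements and (nodes, [b, e]) is a clique.
    """
    ends = []
    for u, v in combinations(sorted(nodes), 2):
        covering = [end for start, end, x, y in links
                    if {x, y} == {u, v} and start <= b and e <= end]
        if not covering:
            raise ValueError("(nodes, [b, e]) is not a clique")
        ends.append(max(covering))
    # The clique stops as soon as one of its links stops.
    return min(ends)


links = [(1, 8, "a", "b"), (2, 6, "a", "c"), (3, 7, "b", "c")]
new_end = extend_end(links, {"a", "b", "c"}, 3, 4)
```

Here the link between a and c ends first, so the clique on {a, b, c} starting at time 3 can be extended up to time 6.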

SLIDE 30

Running times

Algorithms
  • ∆-cliques in instantaneous link streams
  • cliques in link streams with durations
  • Bron-Kerbosch algorithm [HMNS, 2017]

[Figure: running-time plots for three datasets (Emails, Highschool, Museum), each comparing the curves "Delta-cliques", "Himmel et al." and "Durations"]

Our algorithm is the fastest for many relevant values of ∆

SLIDE 31

Case studies

Physical proximity in high school
  • Detected structures not observable in the aggregated graph (e.g., students from different classes meeting before class starts)
IP traffic (bipartite)
  • Dataset too large to compute all maximal cliques
  • Sampling strategy for finding balanced cliques
  • Correlation between cliques and malevolent activity

SLIDE 33

Link Stream Edition problems

Sparse Split Link Stream Edition Problem
  • Given a link stream L and an integer k: is it possible to transform L into a clique (+ isolated vertices) in k editions?
  • Well studied in graphs, NP-complete
  • Possible to adapt existing algorithms → kernel algorithm
Bi-sparse split link stream edition

[Figure: example on nodes a, b, c, d, e, f]

  • Showed that the problem is fixed-parameter tractable and proposed an algorithm

SLIDE 34

Conclusion and perspectives

Maximal cliques in link streams
  • Algorithms and code
  • Application to: social interactions description, IP traffic

Sparse-split and Bi-Sparse-split
  • Algorithms

Currently
  • clique edge cover problem
  • betweenness centrality
  • strongly connected components
  • many graph problems

Link stream description and understanding
  • anomaly detection

SLIDE 35

Configuration space

[Figure: configuration space explored on a three-node example (nodes a, b, c, time axis from 2 to 6). Each configuration is a node set with an interval b;e.]
  • a,b: 3;3, 0;3, 3;6, 6;6, 6;9, 3;9, 0;6, 0;9
  • a,c: 5;5, 5;8, 2;5, 2;8
  • b,c: 4;4, 4;7, 1;4, 1;7
  • a,b,c: 3;6, 2;5, 4;7, 2;6, 3;7, 2;7

SLIDE 36

Enumerate maximal cliques in link streams

[Himmel, Molter, Niedermeier, Sorge 2016, 2017]
Adaptation of the Bron-Kerbosch algorithm to maximal ∆-cliques

SLIDE 37

Enumerate maximal cliques in graphs

R: clique (not maximal)
P ∪ X: all vertices adjacent to all vertices of R
Compute all maximal cliques ⊇ R containing no vertex in X:

Bron-Kerbosch algorithm
  if P ∪ X = ∅ → R is maximal
  for each v ∈ P
      Bron-Kerbosch(P ∩ N(v), R ∪ {v}, X ∩ N(v))
      P ← P \ {v}
      X ← X ∪ {v}
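The pseudocode above translates almost line by line into Python. This sketch is the basic version without pivoting; the `neighbors` dictionary and the `out` accumulator are illustrative choices, not part of the slides.

```python
def bron_kerbosch(P, R, X, neighbors, out):
    """Append to `out` every maximal clique that contains R,
    uses only extra vertices from P, and contains no vertex of X."""
    if not P and not X:
        out.append(set(R))  # nothing can extend R: it is maximal
        return
    for v in list(P):  # iterate over a snapshot, since P shrinks below
        bron_kerbosch(P & neighbors[v], R | {v}, X & neighbors[v],
                      neighbors, out)
        P = P - {v}
        X = X | {v}


# Small graph: triangle a-b-c plus the edge c-d.
neighbors = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
cliques = []
bron_kerbosch(set(neighbors), set(), set(), neighbors, cliques)
```

The initial call uses P = V and R = X = ∅; on this graph the maximal cliques are {a, b, c} and {c, d}.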

SLIDE 38

Enumerate maximal cliques in link streams

[Himmel, Molter, Niedermeier, Sorge 2016, 2017]
(R, I): time-maximal clique
P, X: sets of (v, I′) such that (R ∪ {v}, I′) is a time-maximal clique

[Figure: diagram with labels R, P + X, V]

For each (v, I′) ∈ P
  • add v to R
  • restrict time to I′
  • update P and X

SLIDE 41

Example : k-core

k-core in a graph: largest induced subgraph such that all nodes have degree ≥ k.

[Figure: example link stream on nodes a, b, c, d, time axis from 2 to 8]

Possible to compute the graph k-core at each relevant time step
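As a sketch of this idea, the snippet below computes the k-core of the graph induced at a given time t by repeatedly removing nodes of degree below k. It assumes links with durations stored as (b, e, u, v); the function names are hypothetical, and choosing which time steps are "relevant" is left out.

```python
def k_core(edges, k):
    """Nodes of the k-core of an undirected graph given as (u, v) pairs."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    changed = True
    while changed:  # peel low-degree nodes until the graph is stable
        changed = False
        for u in list(neighbors):
            if len(neighbors[u]) < k:
                for v in neighbors.pop(u):
                    neighbors[v].discard(u)
                changed = True
    return set(neighbors)


def k_core_at(links, t, k):
    """k-core of the graph formed by the links present at time t."""
    return k_core([(u, v) for b, e, u, v in links if b <= t <= e], k)


links = [(0, 10, "a", "b"), (0, 10, "b", "c"),
         (0, 10, "a", "c"), (0, 3, "c", "d")]
```

At t = 2 the 2-core is {a, b, c}: node d has a single neighbor and is peeled off first.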

SLIDE 42

Path from (α, u) to (ω, v)

Sequence (u_0, u_1, t_0), (u_1, u_2, t_1), . . . , (u_{k−1}, u_k, t_{k−1}) such that
  • u_0 = u, u_k = v
  • (t_i, u_i, u_{i+1}) ∈ E
  • t_i ≤ t_{i+1}, t_0 ≥ α, t_{k−1} ≤ ω

[Figure: example link stream on nodes a, b, c, d, time axis from 2 to 8]

  • Not possible to consider graphs induced by time instants
  • Extensions from graph algorithms exist, but are not direct
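The path definition can be checked mechanically. The following sketch validates a candidate sequence against the three conditions, for an instantaneous link stream given as (t, u, v) triples with undirected links. The function name is illustrative, and treating the empty sequence as "not a path" is a choice made here, not something stated on the slide.

```python
def is_temporal_path(steps, links, u, v, alpha, omega):
    """steps: [(u_0, u_1, t_0), (u_1, u_2, t_1), ...]; links: (t, x, y) triples."""
    if not steps:
        return False
    link_set = {(t, frozenset((x, y))) for t, x, y in links}
    # Endpoints: u_0 = u and u_k = v.
    if steps[0][0] != u or steps[-1][1] != v:
        return False
    # Time bounds: t_0 >= alpha and t_{k-1} <= omega.
    if steps[0][2] < alpha or steps[-1][2] > omega:
        return False
    for i, (x, y, t) in enumerate(steps):
        if (t, frozenset((x, y))) not in link_set:
            return False  # (t_i, u_i u_{i+1}) must be a link of the stream
        if i + 1 < len(steps):
            next_x, _, next_t = steps[i + 1]
            if next_x != y or next_t < t:
                return False  # steps must chain and times must not decrease
    return True


links = [(1, "a", "b"), (3, "b", "c"), (2, "c", "d")]
```

In this stream, a reaches c through b (times 1 then 3), but the sequence a, b, c, d is not a path because the link between c and d occurs at time 2, before the link between b and c.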
