Core Decomposition of Uncertain Graphs


slide-1
SLIDE 1

Core Decomposition of Uncertain Graphs

Francesco Bonchi (Yahoo Labs, Barcelona) Francesco Gullo (Yahoo Labs, Barcelona) Andreas Kaltenbrunner (Barcelona Media) Yana Volkovich (Cornell Tech, Barcelona Media)

slide-2
SLIDE 2

Introduction

Core Decomposition of Uncertain Graphs

slide-6
SLIDE 6

Dense subgraphs

• finding dense subgraphs is a fundamental primitive in many graph problems
• different definitions of dense subgraphs: cliques, n-cliques, n-clans, k-plexes, k-cores, etc.
• most of them are computationally prohibitive: NP-hard or at least quadratic

slide-7
SLIDE 7

k-core decomposition

• core decomposition is particularly appealing:
  • it can be computed in linear time
  • it relates to many definitions of dense subgraphs

slide-8
SLIDE 8

k-core decomposition

• G = (V, E) is an undirected graph
• the k-core of G is a maximal subgraph H = (C, E|C) such that ∀v ∈ C : deg_H(v) ≥ k

slide-10
SLIDE 10

k-core decomposition

• G = (V, E) is an undirected graph
• the k-core of G is a maximal subgraph H = (C, E|C) such that ∀v ∈ C : deg_H(v) ≥ k

[Figure: example graph, k = 1]

slide-12
SLIDE 12

k-core decomposition

• G = (V, E) is an undirected graph
• the k-core of G is a maximal subgraph H = (C, E|C) such that ∀v ∈ C : deg_H(v) ≥ k

[Figure: example graph, k = 2]

slide-13
SLIDE 13

k-core decomposition

• G = (V, E) is an undirected graph
• the k-core of G is a maximal subgraph H = (C, E|C) such that ∀v ∈ C : deg_H(v) ≥ k
• the core index of a vertex v is the highest order of a core that contains v

[Figure: example graph with vertex core indices 2, 2, 2, 1, 1, 1]
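
The definition of core index can be illustrated with a small peeling sketch: repeatedly remove a vertex of minimum remaining degree and record the largest minimum degree seen so far. This is a deliberately simple quadratic version for readability, not the linear-time bucket-based algorithm; the function name `core_indices` and the example edge list are hypothetical.

```python
from collections import defaultdict

def core_indices(edges):
    """Return {vertex: core index} for an undirected simple graph.

    Simple peeling: always remove a vertex of minimum remaining degree.
    Picking the minimum by linear scan makes this quadratic; a linear-time
    variant would keep vertices in degree buckets instead.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:                      # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.get)
        k = max(k, degree[v])           # core order never decreases while peeling
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return core

# Hypothetical example: a triangle (a, b, c) with a path c-d-e-f attached.
example_edges = [("a", "b"), ("b", "c"), ("a", "c"),
                 ("c", "d"), ("d", "e"), ("e", "f")]
example_cores = core_indices(example_edges)
```

In this hypothetical graph the triangle vertices get core index 2 and the path vertices get core index 1.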

slide-14
SLIDE 14

Introduction

Core Decomposition of Uncertain Graphs

slide-15
SLIDE 15

Uncertain graphs

• Many real-life networks are associated with uncertainty:
  • data collection process
  • employed machine-learning methods
  • privacy-preserving reasons
• biological networks, protein-interaction networks
• social networks

slide-17
SLIDE 17

Uncertain graphs

• Edges in an uncertain graph are associated with a probability of existence
• An uncertain graph is a generative model for deterministic graphs

[Figure: example uncertain graph with edge probabilities 0.5, 0.2, 0.7, 0.5, 0.4, 0.1]

slide-20
SLIDE 20

Uncertain graphs

• Let G = (V, E, p) be an uncertain graph, where p : E → (0, 1] is a function that assigns a probability of existence to each edge.

[Figure: example uncertain graph with edge probabilities 0.5, 0.2, 0.7, 0.5, 0.4, 0.1, …]
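
Viewing an uncertain graph as a generative model can be sketched as follows, assuming that edges exist independently of one another (an assumption of this sketch, not stated on the slide). Each draw yields one deterministic graph, often called a possible world. The function names and the edge labels are hypothetical.

```python
import random

def sample_possible_world(edge_probs, rng):
    """Draw one deterministic graph: keep each edge independently with p(e)."""
    return [e for e, p in edge_probs.items() if rng.random() < p]

def world_probability(edge_probs, world):
    """Probability of observing exactly the edge set `world`,
    under the independence assumption."""
    present = set(world)
    prob = 1.0
    for e, p in edge_probs.items():
        prob *= p if e in present else 1.0 - p
    return prob

# Hypothetical two-edge uncertain graph.
toy_graph = {("a", "b"): 0.5, ("b", "c"): 0.2}
one_world = sample_possible_world(toy_graph, random.Random(0))
```

With |E| edges there are 2^|E| possible worlds, which is why enumerating them quickly becomes impractical.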

slide-22
SLIDE 22

Introduction

Core Decomposition of Uncertain Graphs

We want to extend the graph tool of core decomposition to the context of uncertain graphs.

slide-26
SLIDE 26

Complications

• The fact that core decomposition can be performed in linear time in deterministic graphs does not guarantee efficiency in uncertain graphs.
• Are any two vertices connected?
  • in a deterministic graph: a simple scan of the graph
  • in an uncertain graph: computing the probability that two vertices are connected is a #P-complete problem
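
Since the exact connection probability is hard to compute, one common workaround is to estimate it by sampling. The sketch below is such a Monte Carlo estimator, not a method from this talk; it assumes independent edges, and the function names, sample count, and seed are hypothetical choices.

```python
import random

def connected(edges, s, t):
    """Deterministic check: a depth-first scan from s looking for t."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        if u == t:
            return True
        for w in adj.get(u, []):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return False

def estimate_connection_probability(edge_probs, s, t, samples=10000, seed=0):
    """Fraction of sampled deterministic graphs in which s and t are connected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        world = [e for e, p in edge_probs.items() if rng.random() < p]
        hits += connected(world, s, t)
    return hits / samples
```

For a hypothetical path a-b-c with both edge probabilities equal to 0.5, the exact answer is 0.5 * 0.5 = 0.25, and the estimate should land close to that value.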

slide-28
SLIDE 28

Probabilistic (k,η)-cores

• uncertain graph G = (V, E, p)
• threshold of uncertainty η ∈ [0, 1]

The probabilistic (k,η)-core of G is a maximal subgraph H = (C, E|C, p) such that ∀v ∈ C : Pr[deg_H(v) ≥ k] ≥ η

[Figure: vertex v with three incident edges of probabilities 0.5, 0.4, 0.1; d_v = 3]

e.g. Pr[deg(v) ≥ 2] = Pr[deg(v) = 2] + Pr[deg(v) = 3]
= (0.1*0.5*0.6 + 0.1*0.4*0.5 + 0.5*0.4*0.9) + (0.5*0.1*0.4)
= 0.23 + 0.02 = 0.25

slide-29
SLIDE 29

Probabilistic (k,η)-cores

• uncertain graph G = (V, E, p)
• threshold of uncertainty η ∈ [0, 1]

The probabilistic (k,η)-core of G is a maximal subgraph H = (C, E|C, p) such that ∀v ∈ C : Pr[deg_H(v) ≥ k] ≥ η

[Figure: vertex v with three incident edges of probabilities 0.5, 0.4, 0.1; d_v = 3]

The probability Pr[deg(v) ≥ k] is monotonically non-increasing in k

slide-32
SLIDE 32

Probabilistic (k,η)-cores

• The η-degree of any vertex v ∈ V is defined as η-deg(v) = max { k ∈ [0..d_v] | Pr[deg(v) ≥ k] ≥ η }
• We use the η-degree to define the (k,η)-core decomposition in a similar manner as the degree in the deterministic case.

[Figure: vertex v with three incident edges of probabilities 0.5, 0.4, 0.1; d_v = 3]

η = 0.02   η-deg = 3
η = 0.25   η-deg = 2
η = 0.73   η-deg = 1
η = 1      η-deg = 0
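
The four η-degree values for the example vertex can be reproduced with a short dynamic program over the incident edges. This is one standard way to compute the degree distribution under independent edges and is offered as a sketch, not necessarily the authors' procedure. The function names and the tolerance `tol` (which only absorbs floating-point rounding at the threshold) are assumptions of the sketch.

```python
def degree_tail_probabilities(probs):
    """Return [Pr[deg >= 0], ..., Pr[deg >= d]] for a vertex whose d incident
    edges exist independently with the given probabilities."""
    dist = [1.0]  # dist[j] = Pr[exactly j of the edges processed so far exist]
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for j, q in enumerate(dist):
            new[j] += q * (1.0 - p)     # this edge is absent
            new[j + 1] += q * p         # this edge is present
        dist = new
    tails = []
    running = 0.0
    for q in reversed(dist):            # accumulate from the largest degree down
        running += q
        tails.append(running)
    tails.reverse()
    return tails

def eta_degree(probs, eta, tol=1e-9):
    """Largest k with Pr[deg >= k] >= eta (up to the rounding tolerance tol)."""
    tails = degree_tail_probabilities(probs)
    return max(k for k, t in enumerate(tails) if t >= eta - tol)

# The example vertex v with edge probabilities 0.5, 0.4, 0.1.
v_edges = [0.5, 0.4, 0.1]
v_tails = degree_tail_probabilities(v_edges)
```

The table takes O(d_v^2) time per vertex, compared with the 2^(d_v) edge subsets that a direct enumeration would visit.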

slide-33
SLIDE 33

Computing probabilistic cores

• We have proven the existence and uniqueness of the (k,η)-core decomposition of G.

slide-34
SLIDE 34

Computing probabilistic cores

• Since naïve computation of η-degrees leads to exponential time complexity, we defined a dynamic-programming method for (k,η)-core decomposition.

slide-35
SLIDE 35

Computing probabilistic cores

• We have shown the running time of (k,η)-core decomposition is O(m∆), where
  • m is the number of edges
  • ∆ is the maximum η-degree over all vertices
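
A peeling procedure analogous to the deterministic one can be sketched by replacing the degree with the η-degree. The version below recomputes each affected η-degree from scratch after every removal, so it is a naive illustration that does not reach the O(m∆) bound and is not the authors' implementation. It assumes independent edges; the names `eta_degree`, `eta_core_indices`, and `tol` are hypothetical.

```python
def eta_degree(probs, eta, tol=1e-9):
    """Largest k with Pr[deg >= k] >= eta for independent incident edges."""
    dist = [1.0]  # dist[j] = Pr[exactly j of the edges processed so far exist]
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for j, q in enumerate(dist):
            new[j] += q * (1.0 - p)
            new[j + 1] += q * p
        dist = new
    tail = 0.0
    for k in range(len(dist) - 1, -1, -1):
        tail += dist[k]                 # tail = Pr[deg >= k]
        if tail >= eta - tol:
            return k
    return 0

def eta_core_indices(edge_probs, eta):
    """Return {vertex: largest k such that the vertex is in the (k, eta)-core}."""
    adj = {}
    for (u, v), p in edge_probs.items():
        adj.setdefault(u, {})[v] = p
        adj.setdefault(v, {})[u] = p
    remaining = set(adj)

    def current(v):
        # eta-degree of v restricted to the vertices not yet peeled
        return eta_degree([p for w, p in adj[v].items() if w in remaining], eta)

    deg = {v: current(v) for v in adj}
    core, k = {}, 0
    while remaining:
        v = min(remaining, key=deg.get)
        k = max(k, deg[v])
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                deg[w] = current(w)     # naive recomputation after the removal
    return core
```

When every edge has probability 1 and η = 1, this reduces to the deterministic core decomposition, which gives a quick sanity check.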

slide-36
SLIDE 36

Computing probabilistic cores

• We have derived a fast-to-compute lower bound on the η-degree to speed up (k,η)-core computations.

slide-37
SLIDE 37

Applications

1. Task-driven team formation
2. Influence-maximization problem

slide-40
SLIDE 40
1. Task-driven team formation problem

• A collaboration graph:
  • vertices are individuals
  • edges exhibit a probabilistic topic model representing the topic(s) of past collaborations
• A query is a pair ⟨T,Q⟩:
  • T is a set of terms describing a new task
  • Q is a set of vertices
• The goal is to find an answer set of vertices A, such that A ⊇ Q is a good team for the task described by T.

slide-44
SLIDE 44
2. Influence-maximization problem

• Independent cascade (IC) model:
  • Links have associated probability;
  • Every active node v has a single chance of activating each currently inactive neighbor w with probability p_vw

[Figure: example graph with vertices w, x, v, y, u, z and edge probabilities 0.5, 0.2, 0.7, 0.5, 0.4, 0.1]
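
One run of the independent cascade model can be simulated round by round, as in the sketch below. The adjacency format `out_probs[v] = {w: p_vw}` and the function name `simulate_ic` are hypothetical choices made for this illustration.

```python
import random

def simulate_ic(out_probs, seeds, rng):
    """Run one independent cascade from `seeds` and return the set of nodes
    that are active when the process stops.

    out_probs[v] maps each out-neighbor w of v to the probability p_vw.
    """
    active = set(seeds)
    frontier = list(seeds)              # nodes activated in the previous round
    while frontier:
        next_frontier = []
        for v in frontier:
            # v gets exactly one attempt on each currently inactive neighbor
            for w, p in out_probs.get(v, {}).items():
                if w not in active and rng.random() < p:
                    active.add(w)
                    next_frontier.append(w)
        frontier = next_frontier
    return active
```

Because a node leaves the frontier after its round, it never retries a failed activation, which matches the "single chance" rule.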

slide-48
SLIDE 48
2. Influence-maximization problem

• Finding a set S of vertices, |S| = s, that maximizes the expected spread σ(S) is an NP-hard problem
• The Greedy algorithm iteratively adds the vertex bringing the largest marginal gain in the objective function.
• We reduce the input graph G by some rule and run the Greedy algorithm.
• On deterministic graphs, the k-core index is a direct indicator of the expected spread of any vertex (experimentally observed).
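
The Greedy step can be sketched with a Monte Carlo estimate of σ(S) under the independent cascade model. This is a plain illustration of the textbook greedy scheme and omits the graph-reduction rule, which the slide does not specify. The function names, the adjacency format `out_probs[v] = {w: p_vw}`, and the default `runs` and `seed` are hypothetical.

```python
import random

def simulate_ic(out_probs, seeds, rng):
    """One independent-cascade run; returns the final set of active nodes."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        next_frontier = []
        for v in frontier:
            for w, p in out_probs.get(v, {}).items():
                if w not in active and rng.random() < p:
                    active.add(w)
                    next_frontier.append(w)
        frontier = next_frontier
    return active

def expected_spread(out_probs, seeds, runs, rng):
    """Monte Carlo estimate of sigma(S): average number of active nodes."""
    total = sum(len(simulate_ic(out_probs, seeds, rng)) for _ in range(runs))
    return total / runs

def greedy_seeds(out_probs, s, runs=200, seed=0):
    """Pick s seeds, each time adding the node with the largest estimated spread
    of the enlarged set (equivalently, the largest estimated marginal gain)."""
    rng = random.Random(seed)
    nodes = sorted(set(out_probs) | {w for nbrs in out_probs.values() for w in nbrs})
    chosen = []
    for _ in range(min(s, len(nodes))):
        best, best_spread = None, -1.0
        for v in nodes:
            if v in chosen:
                continue
            spread = expected_spread(out_probs, chosen + [v], runs, rng)
            if spread > best_spread:
                best, best_spread = v, spread
        chosen.append(best)
    return chosen
```

Each greedy round evaluates every remaining candidate with `runs` simulations, which is why shrinking the candidate graph beforehand can pay off.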

slide-50
SLIDE 50

Influence-maximization experiment

• Small directed graph from Twitter with influence probabilities learned from past propagations of URLs (|V| = 21882, |E| = 372005).

The (k,η)-cores-based method outperforms all the baselines

slide-51
SLIDE 51

Conclusions

• We have extended the graph tool of core decomposition to the context of uncertain graphs.

slide-52
SLIDE 52

Conclusions

• We have defined the (k,η)-core concept, and devised efficient algorithms for computing a (k,η)-core decomposition.

slide-53
SLIDE 53

Conclusions

• We have extensively evaluated our definitions and methods on a number of real-world datasets and applications.

slide-54
SLIDE 54

Conclusions

• We plan to investigate the relationship between (k,η)-cores and other definitions of (probabilistic) dense subgraphs.

slide-55
SLIDE 55

Questions?