Using Partial Probes to Infer Network States Pavan Rangudu , Bijaya - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Using Partial Probes to Infer Network States

Pavan Rangudu◦, Bijaya Adhikari∗, B. Aditya Prakash∗, Anil Vullikanti∗◦

∗Department of Computer Science, Virginia Tech

◦NDSSL, Biocomplexity Institute, Virginia Tech

Contact: badityap@cs.vt.edu

slide-2
SLIDE 2

Motivation

  • Network nodes and links fail dynamically
  • Networks not known fully because of privacy constraints
  • Our focus: if some failed nodes are known, can we infer the states of the remaining nodes?

Figures: Node failures in the Internet; traffic jam in a road network

Prior work fails to address the problem directly.

slide-3
SLIDE 3

Our model

  • Graph G(V, E) with a set I ⊆ V of failed nodes
  • Geographically correlated failure model [Agarwal et al., 2013]
  • Single seed of the failure, with probability ps(v) of node v being the seed
  • Correlated failure model: F(u|v) denotes the probability that node u fails given that v has failed
  • Assume independence, i.e., F(u1, u2|v) = F(u1|v) · F(u2|v)
  • Motivation: attacks or natural disasters in infrastructure networks
  • Probes: a subset Q ⊆ I of failed nodes is known
  • Objective: find the set I − Q

Figure: A toy road network with node failures
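The generative model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sample_failure`, the dictionary layout of `p_seed` and `F`, and the toy path graph are hypothetical, and the probe probability `gamma` is assumed to act independently on each failed node (the slides introduce γ only later, in the data cost).

```python
import random

def sample_failure(nodes, p_seed, F, gamma, rng):
    """Draw one seed, one failure set I and one probe set Q.

    p_seed[s] : probability that s is the seed
    F[s][u]   : probability that u fails given that s has failed
    gamma     : ASSUMED per-node probability that a failed node is probed
    """
    # Single seed, chosen according to the seed distribution.
    seed = rng.choices(nodes, weights=[p_seed[s] for s in nodes], k=1)[0]
    # Independence assumption: each node fails on its own given the seed.
    failed = {u for u in nodes if rng.random() < F[seed][u]}
    # Probes reveal a subset Q of the failed set I.
    probed = {u for u in failed if rng.random() < gamma}
    return seed, failed, probed

# Toy instance (hypothetical): a path of 4 nodes, failure probability
# decaying linearly with hop distance from the seed.
nodes = [0, 1, 2, 3]
p_seed = {s: 0.25 for s in nodes}
F = {s: {u: 1.0 - abs(s - u) / 3 for u in nodes} for s in nodes}

rng = random.Random(0)
samples = [sample_failure(nodes, p_seed, F, 0.5, rng) for _ in range(100)]
```

Since F[s][s] = 1 in this toy instance, the seed always belongs to the sampled failure set, and every probe set is a subset of it.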

slide-4
SLIDE 4

Our approach: Minimum Description Length

  • Model cost L(|Q|, |I|, I) has three components:

$$L(|Q|, |I|, I) = L(|Q|) + L(|I| \mid |Q|) + L(I \mid |Q|, |I|)$$

  • $L(|Q|) = -\log \Pr(|Q|)$, by using the Shannon-Fano code
  • $L(|I| \mid |Q|) = -\log \frac{\Pr(|Q| \mid |I|)\,\Pr(|I|)}{\Pr(|Q|)}$
  • $L(I \mid |Q|, |I|) = -\log \Pr(I \mid |Q|, |I|) = -\log \Pr(I \mid |I|)$
  • Data cost: description of $Q^+ = I \setminus Q$ (assuming no observation errors)

$$L(Q^+ \mid I) = -\log\bigl(\gamma^{|Q|}(1-\gamma)^{|Q^+|}\bigr) = -|Q|\log(\gamma) - (|I| - |Q|)\log(1-\gamma)$$

slide-5
SLIDE 5

Problem Description

Model Cost

$$L(|Q|, |I|, I) = L(|Q|) + L(|I| \mid |Q|) + L(I \mid |Q|, |I|)$$

$$= -\log\binom{|I|}{|Q|} - |Q|\log(\gamma) - (|I| - |Q|)\log(1-\gamma) - \log\Bigl(\sum_{s \in V} p_s(s) \prod_{v \in I} F(v \mid s) \prod_{v' \notin I} \bigl(1 - F(v' \mid s)\bigr)\Bigr)$$

*after algebra
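The "after algebra" step can be sketched as follows. The derivation assumes that each failed node is probed independently with probability γ, so that |Q| given |I| is binomially distributed; that assumption is inferred from the form of the data cost and is not stated explicitly on the slides.

```latex
\begin{align*}
L(|Q|) + L(|I| \mid |Q|)
  &= -\log \Pr(|Q|) - \log \frac{\Pr(|Q| \mid |I|)\,\Pr(|I|)}{\Pr(|Q|)} \\
  &= -\log \Pr(|Q| \mid |I|) - \log \Pr(|I|), \\
-\log \Pr(|Q| \mid |I|)
  &= -\log \binom{|I|}{|Q|} - |Q| \log(\gamma) - (|I| - |Q|) \log(1 - \gamma), \\
-\log \Pr(|I|) + L(I \mid |Q|, |I|)
  &= -\log \bigl( \Pr(|I|)\,\Pr(I \mid |I|) \bigr) = -\log \Pr(I), \\
\Pr(I)
  &= \sum_{s \in V} p_s(s) \prod_{v \in I} F(v \mid s)
     \prod_{v' \notin I} \bigl( 1 - F(v' \mid s) \bigr).
\end{align*}
```

Summing the second and third lines with the fourth gives the three-term model cost: the binomial term, the two γ terms, and the log-probability of the failure set.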

Problem Formulation

Given G, ps, F(·), Q, find I that minimizes the total MDL cost:

$$L(|Q|, |I|, I, Q) = -\log\binom{|I|}{|Q|} - \log\Bigl(\sum_{s \in V} p_s(s) \prod_{v \in I} F(v \mid s) \prod_{v' \notin I} \bigl(1 - F(v' \mid s)\bigr)\Bigr) - 2|Q|\log(\gamma) - 2(|I| - |Q|)\log(1-\gamma)$$

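As a rough illustration, the total MDL cost can be evaluated directly from its formula. The sketch below assumes natural logarithms (the slides do not fix the base), a dictionary layout `F[s][v]` for F(v | s), and hypothetical function names; it multiplies probabilities naively, so a larger graph would likely need log-space arithmetic to avoid underflow.

```python
import math

def log_prob_failure_set(I, nodes, p_seed, F):
    """log of sum_s p_s(s) * prod_{v in I} F(v|s) * prod_{v' not in I} (1 - F(v'|s))."""
    total = 0.0
    for s in nodes:
        term = p_seed[s]
        for v in nodes:
            term *= F[s][v] if v in I else 1.0 - F[s][v]
        total += term
    return math.log(total) if total > 0 else -math.inf

def total_mdl_cost(I, Q, nodes, p_seed, F, gamma):
    """Total cost L(|Q|, |I|, I, Q); needs Q to be a subset of I and 0 < gamma < 1."""
    if not Q <= I:
        raise ValueError("Q must be a subset of I")
    i, q = len(I), len(Q)
    return (-math.log(math.comb(i, q))
            - log_prob_failure_set(I, nodes, p_seed, F)
            - 2 * q * math.log(gamma)
            - 2 * (i - q) * math.log(1 - gamma))

# Tiny hypothetical instance: two nodes, uniform seed distribution.
nodes = [0, 1]
p_seed = {0: 0.5, 1: 0.5}
F = {0: {0: 1.0, 1: 0.5}, 1: {0: 0.5, 1: 1.0}}
cost = total_mdl_cost({0}, {0}, nodes, p_seed, F, gamma=0.5)
```

In this instance Pr(I = {0}) = 0.5 · (1.0 · 0.5) + 0.5 · (0.5 · 0) = 0.25, so the cost is log 4 + 2 log 2 = 4 log 2 in nats.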
slide-6
SLIDE 6

Algorithm Greedy

Input: Instance (V, Q, p, P, γ)
Output: Solution Î that minimizes L(|Q|, |Î|, Î, Q)

for each s ∈ V do
    for each k ∈ [|Q|, |V|] do
        Is(k) ← top k − |Q| nodes in V \ Q with highest weight f(s, v)
        Is(k) ← Is(k) ∪ Q
    end for
end for
S ← {Is(k) : s ∈ V, k ∈ [|Q|, |V|]}
Î ← arg min over I ∈ S of L(|Q|, |I|, I, Q)
return Î
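A minimal Python sketch of Algorithm Greedy follows. The slides do not define the weight f(s, v), so the sketch assumes f(s, v) = F(v | s) as a stand-in; the names `mdl_cost` and `greedy`, the `F[s][v]` layout, and the toy path graph are likewise hypothetical. The cost is recomputed from scratch for every candidate, so this version does not attempt to match the running time stated in the analysis.

```python
import math

def mdl_cost(I, Q, nodes, p_seed, F, gamma):
    """Total MDL cost of a candidate failure set I (natural logs assumed)."""
    pr_I = sum(
        p_seed[s] * math.prod(F[s][v] if v in I else 1.0 - F[s][v] for v in nodes)
        for s in nodes
    )
    if pr_I <= 0.0:
        return math.inf  # candidate is impossible under the model
    i, q = len(I), len(Q)
    return (-math.log(math.comb(i, q)) - math.log(pr_I)
            - 2 * q * math.log(gamma) - 2 * (i - q) * math.log(1 - gamma))

def greedy(nodes, Q, p_seed, F, gamma, weight=None):
    """For every seed s and size k, try Q plus the top k - |Q| other nodes."""
    if weight is None:
        # ASSUMPTION: stand-in for the weight f(s, v) of the slides.
        weight = lambda s, v: F[s][v]
    others = [v for v in nodes if v not in Q]
    best_I, best_cost = None, math.inf
    for s in nodes:
        ranked = sorted(others, key=lambda v: weight(s, v), reverse=True)
        for k in range(len(Q), len(nodes) + 1):
            candidate = set(Q) | set(ranked[:k - len(Q)])
            cost = mdl_cost(candidate, Q, nodes, p_seed, F, gamma)
            if cost < best_cost:
                best_I, best_cost = candidate, cost
    return best_I, best_cost

# Toy instance (hypothetical): path of 5 nodes, two probed failures.
nodes = [0, 1, 2, 3, 4]
p_seed = {s: 0.2 for s in nodes}
F = {s: {v: 1.0 - abs(s - v) / 4 for v in nodes} for s in nodes}
Q = {1, 2}
best_I, best_cost = greedy(nodes, Q, p_seed, F, gamma=0.5)
```

Every candidate contains Q by construction, and the candidate with k = |Q| is Q itself, so the returned cost is never worse than the cost of declaring that only the probed nodes failed.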

slide-7
SLIDE 7

Analysis of Greedy

Theorem: (Additive Approximation)

Let I∗ be the set minimizing the MDL cost, and let I denote the solution computed by Algorithm Greedy. Then, L(|Q|, |I|, I, Q) ≤ L(|Q|, |I∗|, I∗, Q) + log(n), where n is the number of seed nodes.

Running time

Algorithm Greedy runs in O(|V|³) time

slide-8
SLIDE 8

Experiments

  • Baseline: local improvement algorithm LocalSearch
  • Datasets
      • Synthetic grid
          • 60 × 60 grid
          • Uniform seed probability ps(·)
          • Conditional failure probability distribution using the model of [Agarwal et al., 2013]: F(v | s) = 1 − d(s, v), where d(·) is the (normalized) distance
      • Real datasets: seed and conditional failure probability distributions computed from data
          • JAM data from WAZE for Boston: road network with 2650 nodes
          • WEATHER data from WAZE for Boston: road network with 1520 nodes
          • POWER-GRID: network of 24 nodes from electric disturbance events
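The synthetic grid setup could be built roughly as below. The slides say only that d(·) is a normalized distance, so this sketch assumes Euclidean distance between grid coordinates divided by the grid diagonal; the function name `grid_failure_model` is hypothetical. F is returned as a function rather than a table, since a full table for 3600 nodes would have about 13 million entries.

```python
import math

def grid_failure_model(rows, cols):
    """Nodes, uniform seed distribution and F(v | s) = 1 - d(s, v) on a grid.

    ASSUMPTION: d is Euclidean distance normalized by the grid diagonal.
    Requires a grid with more than one node so the diagonal is non-zero.
    """
    nodes = [(r, c) for r in range(rows) for c in range(cols)]
    max_dist = math.hypot(rows - 1, cols - 1)
    p_seed = {s: 1.0 / len(nodes) for s in nodes}

    def F(v, s):
        # Failure probability of v given that seed s has failed.
        return 1.0 - math.dist(s, v) / max_dist

    return nodes, p_seed, F

nodes, p_seed, F = grid_failure_model(60, 60)
```

Under this normalization a seed fails with probability 1, and the node diagonally opposite a corner seed fails with probability 0.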

slide-9
SLIDE 9

WAZE dataset

Visualization of Waze dataset. Partitions in the 119 × 78 grid represent nodes in our network.

slide-10
SLIDE 10

Takeaways

Figure: Results for the JAM dataset. Panels "Algorithm LocalSearch" and "Algorithm Greedy" plot precision, recall, F1 score and MDL cost ratio against gamma. A third panel, "Comparison of the MDL costs", plots the MDL cost ratio L(I, Q)/L(I*, Q) against gamma for Quick, Local and Greedy.

  • Our MDL based approach helps identify missing failures
  • Promising approach for other problems with missing information
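The precision, recall and F1 scores reported for the JAM dataset can be computed from an inferred failure set and the ground truth as sketched below. The slides do not say whether the probed nodes Q are excluded before scoring; this sketch assumes they are, so that only the inferred missing failures are evaluated. The function name and the example sets are hypothetical.

```python
def precision_recall_f1(predicted, truth):
    """Precision, recall and F1 of a predicted set against a ground-truth set."""
    tp = len(predicted & truth)  # correctly inferred failures
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

# Hypothetical example: true failures, probes, and an inferred failure set.
true_I = {1, 2, 3, 4, 5}
Q = {1, 2}
inferred_I = {1, 2, 3, 4, 6}

# ASSUMPTION: score only the nodes that were not already revealed by probes.
precision, recall, f1 = precision_recall_f1(inferred_I - Q, true_I - Q)
```

Here two of the three inferred missing failures are correct and two of the three true missing failures are recovered, so all three scores equal 2/3.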