1. Using Partial Probes to Infer Network States
Pavan Rangudu◦, Bijaya Adhikari∗, B. Aditya Prakash∗, Anil Vullikanti∗◦
∗ Department of Computer Science, Virginia Tech
◦ NDSSL, Biocomplexity Institute, Virginia Tech
Contact: badityap@cs.vt.edu

2. Motivation
• Network nodes and links fail dynamically
• Networks are not fully known because of privacy constraints
• Our focus: if some failed nodes are known, can we infer the states of the remaining nodes?
Figures: node failures in the internet; traffic jam in a road network
Prior works fail to address this problem directly.

3. Our model
• Graph G(V, E) with a set I ⊆ V of failed nodes
• Geographically correlated failure model [Agarwal et al., 2013]
• Single seed of the failure, with probability p_s(v) of node v being the seed
• Correlated failure model: F(u | v) denotes the probability that node u fails given that v has failed
• Assume independence, i.e., F(u_1, u_2 | v) = F(u_1 | v) · F(u_2 | v) (see the sampling sketch after this slide)
• Motivation: attacks or natural disasters in infrastructure networks
• Probes: a subset Q ⊆ I of failed nodes is known
• Objective: find the set I − Q
Figure: A toy road network with node failures
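A minimal sketch of sampling from this failure model as stated on the slide: draw a single seed from p_s(·), then fail every other node independently with probability F(u | s). The names `sample_failure`, `seed_prob`, and the dictionary encoding of F are illustrative choices, not from the paper.

```python
import random

def sample_failure(nodes, seed_prob, F):
    """Sample one failure cascade under the correlated failure model.

    nodes     : list of node ids (V)
    seed_prob : dict v -> p_s(v), probability that v is the single seed
    F         : dict (u, v) -> F(u | v), probability u fails given seed v
    Returns the seed s and the failed set I.
    """
    # Draw the single seed s according to p_s(.)
    s = random.choices(nodes, weights=[seed_prob[v] for v in nodes], k=1)[0]
    # Given the seed, failures of the remaining nodes are independent
    failed = {s}
    for u in nodes:
        if u != s and random.random() < F[(u, s)]:
            failed.add(u)
    return s, failed
```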

4. Our approach: Minimum Description Length
Model cost $L(|Q|, |I|, I)$ has three components:
$$L(|Q|, |I|, I) = L(|Q|) + L\big(|I| \mid |Q|\big) + L\big(I \mid |Q|, |I|\big)$$
• $L(|Q|) = -\log \Pr(|Q|)$, by using the Shannon-Fano code
• $L\big(|I| \mid |Q|\big) = -\log \Pr\big(|I| \mid |Q|\big)$
• $L\big(I \mid |Q|, |I|\big) = -\log \Pr\big(I \mid |Q|, |I|\big)$
Data cost: description of $Q^{+} = I \setminus Q$ (assuming no observation errors)
• $L(Q^{+} \mid I) = -\log\big(\gamma^{|Q|} (1-\gamma)^{|Q^{+}|}\big) = -|Q|\log(\gamma) - (|I| - |Q|)\log(1-\gamma)$

5. Problem Description
Model cost (after algebra):
$$L(|Q|, |I|, I) = L(|Q|) + L\big(|I| \mid |Q|\big) + L\big(I \mid |Q|, |I|\big) = -\log\binom{|I|}{|Q|} - |Q|\log(\gamma) - (|I| - |Q|)\log(1-\gamma) - \log\Big( \sum_{s \in V} p_s(s) \prod_{v \in I} F(v \mid s) \prod_{v' \notin I} \big(1 - F(v' \mid s)\big) \Big)$$
Problem Formulation: Given $G$, $p_s$, $F(\cdot)$, and $Q$, find $I$ that minimizes the total MDL cost:
$$L\big(|Q|, |I|, I, Q\big) = -\log\binom{|I|}{|Q|} - \log\Big( \sum_{s \in V} p_s(s) \prod_{v \in I} F(v \mid s) \prod_{v' \notin I} \big(1 - F(v' \mid s)\big) \Big) - 2|Q|\log(\gamma) - 2(|I| - |Q|)\log(1-\gamma)$$
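A minimal sketch of the total MDL objective above, under stated assumptions: `mdl_cost` and its arguments are illustrative names, F is assumed defined for every ordered pair with F(s | s) = 1, and 0 < γ < 1 so the log terms are finite.

```python
from math import comb, log

def mdl_cost(I, Q, nodes, seed_prob, F, gamma):
    """Total MDL cost L(|Q|, |I|, I, Q) of a candidate failed set I (Q subset of I).

    seed_prob : dict v -> p_s(v)
    F         : dict (u, v) -> F(u | v), with F[(s, s)] = 1.0 assumed
    gamma     : probe rate, 0 < gamma < 1
    """
    # -log C(|I|, |Q|): cost of describing which members of I were probed
    cost = -log(comb(len(I), len(Q)))
    # -log sum_s p_s(s) * prod_{v in I} F(v|s) * prod_{v' not in I} (1 - F(v'|s))
    likelihood = 0.0
    for s in nodes:
        term = seed_prob[s]
        for v in nodes:
            term *= F[(v, s)] if v in I else (1.0 - F[(v, s)])
        likelihood += term
    cost += -log(likelihood)
    # gamma terms appear in both the model cost and the data cost, hence the factor 2
    cost += -2 * len(Q) * log(gamma) - 2 * (len(I) - len(Q)) * log(1.0 - gamma)
    return cost
```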

6. Algorithm Greedy
Input: Instance (V, Q, p, P, γ)
Output: Solution Î that minimizes L(|Q|, |Î|, Î, Q)
1: for each s ∈ V do
2:   for each k ∈ [|Q|, |V|] do
3:     I_s(k) ← top k − |Q| nodes in V \ Q with highest weight f(s, v)
4:     I_s(k) ← I_s(k) ∪ Q
5:   end for
6: end for
7: S ← { I_s(k) : s ∈ V, k ∈ [|Q|, |V|] }
8: Î ← arg min_{I ∈ S} L(|Q|, |I|, I, Q)
9: Return Î
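A small Python sketch of the pseudocode above. It is parameterized by a ranking weight and a cost function so it stays self-contained; plugging in the `mdl_cost` sketch from the previous slide and weight(s, v) = F(v | s) would be one concrete instantiation, which is my assumption rather than the paper's exact choice of f(s, v).

```python
def greedy(nodes, Q, weight, cost):
    """Greedy enumeration of candidate failed sets, following the pseudocode above.

    nodes  : list of all nodes V
    Q      : set of probed (known-failed) nodes
    weight : weight(s, v), per-seed ranking weight f(s, v)
    cost   : cost(I), total MDL cost L(|Q|, |I|, I, Q) of a candidate set I
    """
    Q = set(Q)
    others = [v for v in nodes if v not in Q]
    candidates = []
    for s in nodes:
        # Rank the unprobed nodes once per candidate seed s
        ranked = sorted(others, key=lambda v: weight(s, v), reverse=True)
        for k in range(len(Q), len(nodes) + 1):
            # I_s(k): the top (k - |Q|) ranked nodes, plus all probed nodes Q
            candidates.append(set(ranked[:k - len(Q)]) | Q)
    # Line 8 of the pseudocode: pick the candidate with minimum MDL cost
    return min(candidates, key=cost)
```

Note that this sketch recomputes the cost of each candidate from scratch for clarity; the O(|V|^3) running time claimed on the next slide presumably relies on updating the cost incrementally as k grows.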

7. Analysis of Greedy
Theorem (Additive Approximation): Let I* be the set minimizing the MDL cost, and let I denote the solution computed by Algorithm Greedy. Then L(|Q|, |I|, I, Q) ≤ L(|Q|, |I*|, I*, Q) + log(n), where n is the number of seed nodes.
Running time: Algorithm Greedy runs in O(|V|^3) time.

8. Experiments
• Baseline: local improvement algorithm LocalSearch
• Datasets:
  • Synthetic grid
    • 60 × 60 grid
    • Uniform seed probability p_s(·)
    • Conditional failure probability distribution using the model of [Agarwal et al., 2013]: F(v | s) = 1 − d(s, v), where d(·) is the (normalized) distance (see the construction sketch after this slide)
  • Real datasets: seed and conditional failure probability distributions computed from data
    • JAM data from WAZE for Boston: road network with 2650 nodes
    • WEATHER data from WAZE for Boston: road network with 1520 nodes
    • POWER-GRID: network of 24 nodes from electric disturbance events
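A small sketch of how the synthetic-grid distributions might be built. The slide only says "normalized distance", so the use of Manhattan distance normalized by the grid diameter, and the function name `grid_failure_model`, are assumptions for illustration.

```python
def grid_failure_model(rows=10, cols=10):
    """Synthetic-grid model: uniform seeds and F(v | s) = 1 - d(s, v),
    with d taken here as Manhattan distance normalized by the grid diameter.

    The experiments use a 60 x 60 grid; a dense F table at that size is large,
    so the default is kept small for quick tests.
    """
    nodes = [(r, c) for r in range(rows) for c in range(cols)]
    diameter = float((rows - 1) + (cols - 1))  # maximum Manhattan distance
    F = {}
    for s in nodes:
        for v in nodes:
            d = (abs(s[0] - v[0]) + abs(s[1] - v[1])) / diameter
            F[(v, s)] = 1.0 - d  # closer to the seed => more likely to fail
    seed_prob = {v: 1.0 / len(nodes) for v in nodes}  # uniform p_s(.)
    return nodes, seed_prob, F
```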

9. WAZE dataset
Figure: Visualization of the WAZE dataset. Partitions in the 119 × 78 grid represent nodes in our network.

10. Takeaways
Figure: Results for the JAM dataset. Panels plot precision, recall, F1 score, and MDL cost ratio (L(I, Q)/L(I*, Q)) against γ (0.1 to 1.0) for Greedy and LocalSearch, plus a comparison of the MDL costs of the two algorithms.
• Our MDL-based approach helps identify missing failures
• Promising approach for other problems with missing information
