Bayesian Inference and Traffic Analysis (PowerPoint presentation, Carmela Troncoso and George Danezis)

  1. Bayesian Inference and Traffic Analysis
      Carmela Troncoso, George Danezis
      September-November 2008, Microsoft Research Cambridge / KU Leuven (COSIC)

  2. Anonymous Communications
      “Tell me who your friends are...” => anonymous communications to hide communication partners
      High-latency systems (e.g. anonymous remailers) use mixes [Chaum 81] to hide the input/output relationship
     [Diagram: a message relayed through a chain of three mixes]

  3. Anonymous Communications
      Attacks on mix networks:
         Restricted routes [Dan03]
         Bridging and fingerprinting [DanSyv08]
      Social information:
         Disclosure Attack [Kes03]
         Statistical Disclosure Attack [Dan03]
         Perfect Matching Disclosure Attacks [Tron08]
      Heuristics and specific models

  4. Mix networks and traffic analysis
      Determine the probability distribution over inputs for each output
     [Diagram: senders A, B, C route messages through MIX1, MIX2 and MIX3 to receivers Q, R, S; with no constraints the distribution over (A, B, C) is (3/8, 3/8, 1/4) at Q and at R, and (1/4, 1/4, 1/2) at S; intermediate links carry "A or B" with (1/2, 1/2) and "A or B or C" with (1/4, 1/4, 1/2)]

  5. Mix networks and traffic analysis
      Constraints, e.g. path length = 2
     [Diagram: the same network; under the length-2 constraint the distribution over (A, B, C) becomes (1/4, 1/4, 1/2) at Q and at R, and (1/2, 1/2, 0) at S]
      Non-trivial to compute given the observation!
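
The distributions on slides 4 and 5 can be checked by brute force on such a toy network: enumerate every assignment of senders to receivers that is consistent with the constraints and count how often each pairing occurs. The sketch below is not part of the presentation; the constraint function is a hypothetical stand-in for the length-2 restriction (here simply "C cannot reach S"), chosen because it reproduces the slide-5 distributions.

```python
from itertools import permutations
from collections import Counter

senders = ["A", "B", "C"]
receivers = ["Q", "R", "S"]

def consistent(assignment):
    # Hypothetical stand-in for the length-2 constraint of slide 5:
    # sender C cannot reach receiver S.
    return assignment["C"] != "S"

# Enumerate every one-to-one assignment of senders to receivers and keep
# those consistent with the constraint.
valid = []
for receiver_order in permutations(receivers):
    assignment = dict(zip(senders, receiver_order))
    if consistent(assignment):
        valid.append(assignment)

# Marginal probability that sender s communicated with receiver r,
# assuming all consistent assignments are a priori equally likely.
counts = Counter((s, a[s]) for a in valid for s in senders)
for (s, r), c in sorted(counts.items()):
    print(f"Pr({s} -> {r}) = {c / len(valid):.2f}")
```

Running this prints the (1/4, 1/4, 1/2) distributions for Q and R and (1/2, 1/2, 0) for S, matching slide 5; the point of the following slides is that such enumeration becomes infeasible for realistic networks.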

  6. “The real thing”
     [Diagram: many senders, a network of mixes (threshold = 3), many receivers]
      How to compute the probabilities systematically?

  7. Mix networks and traffic analysis
      Find the “hidden state” of the mixes
     [Diagram: senders A, B, C, mixes M1, M2, M3, receivers Q, R, S; what is Pr(HS | O, C)?]
      Prior information C:
        Pr(HS | O, C) = K · Pr(O | HS, C) · Pr(HS | C),  with  K = 1 / Σ_HS Pr(HS, O | C)
      The sum over HS is too large to enumerate!

  8. Mix networks and traffic analysis
      “Hidden state” + observation = paths
     [Diagram: A, B, C through M1, M2, M3 to Q, R, S]
        P1: A → M1 → M2 → M3 → R
        P2: B → M1 → M3 → Q
        P3: C → M2 → S
      Pr(HS | O, C) = K · Pr(O | HS, C) · Pr(Paths | C)
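
A minimal sketch of how a hidden state could be represented as a set of paths, using the three example paths on this slide (the class layout is an assumption, not the authors' data structure):

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Path:
    """One path of the hidden state: a sender, its ordered mixes, a receiver."""
    sender: str
    mixes: Tuple[str, ...]
    receiver: str

# The hidden state implied by the observation on slide 8.
hidden_state = [
    Path("A", ("M1", "M2", "M3"), "R"),
    Path("B", ("M1", "M3"), "Q"),
    Path("C", ("M2",), "S"),
]
```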

  9. Bayesian Inference
      Actually… we want marginal probabilities
     [Diagram: the slide-4 network with its per-receiver distributions over (A, B, C)]
      Pr(A → Q | O, C) = Σ_j I_{A→Q}(HS_j) · Pr(HS_j | O, C), summing over all hidden states HS_j
      But… we cannot obtain them directly

  10. Bayesian Inference - sampling
      If we obtain samples HS_1, HS_2, HS_3, HS_4, …, HS_j ~ Pr(HS | O, C)
        indicator (A → Q)?   0  1  0  1  …  1
        Pr(A → Q | O, C) ≈ Σ_i I_{A→Q}(HS_i) / j
      Markov Chain Monte Carlo methods
        Metropolis-Hastings algorithm
      Pr(HS | O, C) = K · Pr(O | HS, C) · Pr(Paths | C)
      What does Pr(Paths | C) look like?
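
A sketch of the estimator above: given samples of the hidden state, the marginal is just the average of the indicator over the samples (the `sends_to` accessor is a hypothetical stand-in for looking up a sender's receiver in one sample):

```python
def estimate_marginal(samples, sender, receiver, sends_to):
    """Monte Carlo estimate of Pr(sender -> receiver | O, C): the fraction of
    sampled hidden states in which `sender`'s path ends at `receiver`."""
    hits = sum(1 for hs in samples if sends_to(hs, sender) == receiver)
    return hits / len(samples)
```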

  11. Probabilistic model – Basic Constraints
      Users decide independently:  Pr(Paths | C) = Π_x Pr(P_x | C)
      Length restrictions, with any distribution
        e.g. uniform(L_min, L_max):  Pr(L = l | C) = 1 / (L_max − L_min + 1)
      Node choice restrictions
        Choose l out of the N_mix nodes available:  Pr(M_x | L = l, C) = 1 / P(N_mix, l)
        Choose from a set:  I_set(M_x)
      Pr(P_x | C) = Pr(L = l | C) · Pr(M_x | L = l, C) · I_set(M_x)
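
A sketch of Pr(P_x | C) under these basic constraints, assuming the uniform length distribution 1/(L_max − L_min + 1) and an ordered choice of distinct mixes; function and parameter names are mine, not the authors'.

```python
from math import perm  # perm(n, l) = n! / (n - l)!, the number of ordered choices

def prob_path(path_mixes, n_mix, l_min, l_max, allowed_mixes=None):
    """Pr(P_x | C) = Pr(L = l | C) * Pr(M_x | L = l, C) * I_set(M_x)."""
    l = len(path_mixes)
    if not (l_min <= l <= l_max):
        return 0.0
    # I_set(M_x): 1 if every mix on the path belongs to the allowed set.
    if allowed_mixes is not None and not set(path_mixes) <= set(allowed_mixes):
        return 0.0
    pr_length = 1.0 / (l_max - l_min + 1)   # uniform choice of the length l
    pr_mixes = 1.0 / perm(n_mix, l)         # ordered choice of l distinct mixes
    return pr_length * pr_mixes
```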

  12. Probabilistic model – Basic Constraints
      Unknown destinations
     [Diagram: a path whose final hops, and destination, lie outside the observation window (here L_max = 3)]
      Pr(P_x | C) = Σ_{l = L_obs}^{L_max} Pr(L = l | C) · Pr(M_x | L = l, C) · I_set(M_x)
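
A sketch of the sum above for a path whose destination is not observed; the three factors are passed in as callbacks so that the slide-11 model (or any other) can be plugged in, and all names are hypothetical.

```python
def prob_partial_path(l_obs, l_max, pr_length, pr_mixes_given_length, i_set):
    """Sum Pr(L = l | C) * Pr(M_x | L = l, C) * I_set(M_x) over all total
    lengths l that are at least the observed length l_obs."""
    return sum(pr_length(l) * pr_mixes_given_length(l) * i_set()
               for l in range(l_obs, l_max + 1))
```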

  13. Probabilistic model – More Constraints
      Bridging: known nodes  I_bridging(M_x)
      Non-compliant clients (with probability p_cp)
        Do not respect the length restrictions (L_min,cp , L_max,cp)
        Choose l out of the N_mix nodes available, allowing repetitions:
          Pr(M_x | L = l, C, I_cp(Path_x)) = 1 / P_r(N_mix, l)
      Pr(Paths | C) = Π_x Pr(P_x | C)
                    = Π_{i ∈ P_cp} p_cp · Pr(P_i | C, I_cp) · Π_{j ∉ P_cp} (1 − p_cp) · Pr(P_j | C)
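
A sketch of the compliant / non-compliant mixture above: the two per-path models are passed in as functions, and the flag saying which paths are non-compliant is treated as part of the hidden state (all names are hypothetical).

```python
def prob_paths_mixture(paths, p_cp, prob_compliant, prob_noncompliant):
    """Pr(Paths | C) as a product over paths: a non-compliant path contributes
    p_cp * Pr(P_i | C, I_cp), a compliant one (1 - p_cp) * Pr(P_j | C)."""
    total = 1.0
    for path, noncompliant in paths:          # paths: [(path, bool), ...]
        if noncompliant:
            total *= p_cp * prob_noncompliant(path)
        else:
            total *= (1.0 - p_cp) * prob_compliant(path)
    return total
```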

  14. Probabilistic model – More Constraints
      Social network information
        Assuming we know the sending profiles Pr(Sen_x → Rec_x):
        Pr(P_x | C) = Pr(L = l | C) · Pr(M_x | L = l, C) · I_set(M_x) · Pr(Sen_x → Rec_x)
      Other constraints:
        Unknown origin
        Dummies
        Other mixing strategies
        …

  15. Markov Chain Monte Carlo
      Sample from a distribution that is difficult to sample from directly:
        Pr(HS | O, C) = Pr(O | HS, C) · Pr(HS | C) / Σ_HS Pr(HS, O | C) = K · Pr(O | HS, C) · Pr(Paths | C)
      Key advantages:
        Requires only a generative model (we know how to compute it!)
        Good estimation of errors
        No false positives or negatives
        Systematic

  16. Metropolis-Hastings Algorithm
      Constructs a Markov chain with stationary distribution Pr(HS | O, C)
      From the current state HS_current, draw a candidate HS_candidate ~ Q(HS_candidate | HS_current)
      1. Compute  α = [Pr(HS_candidate) · Q(HS_current | HS_candidate)] / [Pr(HS_current) · Q(HS_candidate | HS_current)]
      2. If α ≥ 1:  HS_current ← HS_candidate
         else draw u ~ U(0, 1):
           if u ≤ α:  HS_current ← HS_candidate
           else:      HS_current stays HS_current
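
A generic Metropolis-Hastings sketch of the algorithm on this slide, working in log space to avoid underflow. The target only needs to be known up to the constant K, which cancels in the acceptance ratio; the function names are placeholders, not the authors' implementation.

```python
import math
import random

def metropolis_hastings(log_target, propose, log_q, initial, n_steps):
    """log_target(x): log of the unnormalised target, e.g. log Pr(O | HS, C) Pr(Paths | C)
    propose(x):    draws a candidate from Q(. | x)
    log_q(x, y):   log Q(x | y), the proposal density of x given y
    Returns the list of visited states (the Markov chain)."""
    current = initial
    chain = [current]
    for _ in range(n_steps):
        candidate = propose(current)
        # Acceptance ratio: Pr(cand) Q(curr | cand) / (Pr(curr) Q(cand | curr))
        log_alpha = (log_target(candidate) + log_q(current, candidate)
                     - log_target(current) - log_q(candidate, current))
        if log_alpha >= 0 or random.random() < math.exp(log_alpha):
            current = candidate      # accept the candidate
        chain.append(current)        # on rejection the current state repeats
    return chain
```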

  17. Our sampler: Q transition
      Target: Pr(HS | O, C) = Pr(Paths | C) / Z
      Proposal Q(Paths_candidate | Paths_current) and Q(Paths_current | Paths_candidate) enter the acceptance ratio:
        [Pr(Paths_candidate) · Q(Paths_current | Paths_candidate)] / [Pr(Paths_current) · Q(Paths_candidate | Paths_current)]
      Transition Q: a swap operation
     [Diagram: two paths that meet at mix M3 have their continuations (towards Q and R) exchanged]
      More complicated transitions for non-compliant clients
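
One plausible reading of the swap transition, as a sketch only (the authors' exact operator is not spelled out in the transcript): pick two paths that traverse a common mix and exchange everything after that mix, so the same messages still enter and leave every mix and the proposal stays consistent with the observation.

```python
import random

def swap_proposal(paths):
    """Propose a new set of paths by swapping the continuations of two paths
    after a mix they share. `paths` is a list of (sender, [mixes], receiver)."""
    candidates = []
    for i in range(len(paths)):
        for j in range(i + 1, len(paths)):
            for mix in set(paths[i][1]) & set(paths[j][1]):
                candidates.append((i, j, mix))
    if not candidates:
        return paths                         # no shared mix: propose no change
    i, j, mix = random.choice(candidates)
    s_i, m_i, r_i = paths[i]
    s_j, m_j, r_j = paths[j]
    cut_i, cut_j = m_i.index(mix) + 1, m_j.index(mix) + 1
    proposal = list(paths)
    proposal[i] = (s_i, m_i[:cut_i] + m_j[cut_j:], r_j)
    proposal[j] = (s_j, m_j[:cut_j] + m_i[cut_i:], r_i)
    return proposal
```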

  18. Iterations
      (Sampler recap: target Pr(HS | O, C) = Pr(Paths | C) / Z, proposal Q(Paths_candidate | Paths_current))
      Consecutive samples are dependent
     [Diagram: the chain of states Paths_i → … → Paths_j; intermediate, correlated states are discarded]
      Keep only sufficiently separated samples
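
A trivial sketch of the thinning this slide describes; the burn-in and spacing values are hypothetical defaults, not taken from the presentation.

```python
def thin(chain, burn_in=1000, keep_every=10):
    """Drop a burn-in prefix and keep only every `keep_every`-th state so the
    retained samples are approximately independent."""
    return chain[burn_in::keep_every]
```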

  19. Error estimation
      Samples Paths_1, Paths_2, Paths_3, Paths_4, …
        indicator I_{A→Q}(Paths_i):  1  0  1  0  →  Pr(A → Q)
      Error estimation
        Bernoulli likelihood:  Pr[Paths_1, Paths_2, Paths_3, … | Pr(A → Q)]
        Prior Beta(1, 1) ~ uniform
        Posterior Pr[Pr(A → Q) | Paths_1, Paths_2, Paths_3, …]:
          Pr(A → Q) ~ Beta( Σ_i I_{A→Q}(Paths_i) + 1 ,  Σ_i (1 − I_{A→Q}(Paths_i)) + 1 )
      Confidence intervals
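
A sketch of the Beta error estimate above. The posterior mean and standard deviation follow from the standard Beta(a, b) formulas; exact credible intervals would come from Beta quantiles (e.g. scipy.stats.beta.ppf), which this stdlib-only sketch omits.

```python
def beta_posterior(indicators):
    """Indicator samples treated as Bernoulli draws with a Beta(1, 1) prior:
    the posterior for Pr(A -> Q) is Beta(hits + 1, misses + 1)."""
    hits = sum(indicators)
    misses = len(indicators) - hits
    a, b = hits + 1, misses + 1
    mean = a / (a + b)
    std = (a * b / ((a + b) ** 2 * (a + b + 1))) ** 0.5
    return mean, std

# The four samples on slide 19 (indicators 1, 0, 1, 0) give Beta(3, 3):
print(beta_posterior([1, 0, 1, 0]))   # mean 0.5, std ~ 0.19
```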

  20. Evaluation
      1. Create an instance of a network
      2. Run the sampler
      3. Choose a target sender and a receiver
      4. Estimate the probability  Pr(Sen → Rec) ≈ Σ_j I_{Sen→Rec}(Paths_j) / j
      5. Check whether Sen actually chose Rec as receiver:  I_{Sen→Rec}(network)
      6. Choose a new network and go to 2
      Events should happen with the estimated probability:
        E( I_{Sen→Rec}(network) ) = Pr(Sen → Rec) ≈ Σ_j I_{Sen→Rec}(Paths_j) / j
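
The evaluation boils down to a calibration check: events predicted with probability p should occur a fraction p of the time. Below is a sketch of such a check over many (estimate, outcome) pairs, which is what the plots on the next slides visualise; the binning scheme is mine, not the authors'.

```python
def calibration_check(estimates, outcomes, n_bins=10):
    """Bin the estimated probabilities and compare, per bin, the mean estimate
    with the empirical frequency of Sen -> Rec actually being a true pair."""
    bins = [[] for _ in range(n_bins)]
    for p, hit in zip(estimates, outcomes):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, hit))
    for k, b in enumerate(bins):
        if b:
            mean_p = sum(p for p, _ in b) / len(b)
            freq = sum(h for _, h in b) / len(b)
            print(f"bin {k}: predicted {mean_p:.2f}, observed {freq:.2f}")
```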

  21. Results – compliant clients
     [Plot: empirical frequency E(I_{Sen→Rec}(network)) against the estimated probability Σ_j I_{Sen→Rec}(Paths_j) / j]

  22. Results – 50 messages

  23. Results – 10 messages

  24. Results – big networks

  25. Performance – RAM usage

      Nmix  t   Nmsg    Samples  RAM (MB)
      3     3   10      500      16
      3     3   50      500      18
      5     10  100     500      19
      10    20  1 000   1 000    24
      10    20  10 000  1 000    125

      RAM grows with the size of the network and the population
      Results are kept in memory during the simulation
      The number of samples collected increases

  26. Performance – Running time

      Nmix  t   Nmsg   Iterations  Full analysis (min)  One sample (ms)
      3     3   10     6011        2.33                 267.68
      3     3   50     6011        2.55                 306.00
      5     10  100    4011        1.58                 190.35
      10    20  1 000  7011        3.16                 379.76

      Operations should be O(1)
      Writing of the results to a file
      Different numbers of iterations

  27. Conclusions
      Traffic analysis is non-trivial when there are constraints
      Probabilistic model: incorporates most attacks
        Non-compliant clients
      Markov Chain Monte Carlo methods to extract marginal probabilities
      Future work:
        SDA based on Bayesian Inference
        Added value?

  28. Thanks for your attention
      Carmela.Troncoso@esat.kuleuven.be
      Microsoft technical report coming soon…

  29. Bayes theorem
      Pr(O, HS | C) = Pr(HS | O, C) · Pr(O | C)
      Pr(O, HS | C) = Pr(O | HS, C) · Pr(HS | C)
      =>  Pr(HS | O, C) = Pr(O | HS, C) · Pr(HS | C) / Pr(O | C),  with  Pr(O | C) = Σ_HS Pr(O, HS | C)
      Joint probability:  Pr(X, Y) = Pr(X | Y) · Pr(Y) = Pr(Y | X) · Pr(X)
