SLIDE 1

Website fingerprinting on Tor: attacks and defenses

Claudia Diaz KU Leuven

Post-Snowden Cryptography Workshop, Brussels, December 10, 2015

Joint work with: Marc Juarez, Sadia Afroz, Gunes Acar, Rachel Greenstadt, Mohsen Imani, Mike Perry, Matthew Wright

SLIDE 2

Metadata

It's not just about communications content: SIGINT

Time, duration, size, identities, location, pattern

Exposed by default in communications protocols

Bulk collection:
  • Much smaller in size than content
  • Machine readable, cheap to analyze, highly revealing
  • Much lower level of legal protection

Dedicated systems to protect metadata: the Tor network (target of the NSA program "Egotistical Giraffe")

SLIDE 3

Introduction: how does WF work?

[Diagram: the user (Alice) connects through Tor to the web; a local adversary between the user and the Tor network observes the encrypted traffic and tries to infer the page. User = Alice, Webpage = ??]

SLIDE 4

Why is WF so important?

  • Tor is the most advanced anonymity network (according to the NSA)
  • WF allows an adversary to recover users' web browsing history
  • Series of successful attacks
  • Weak adversary model (a local adversary suffices)

[Chart: number of top-conference publications on WF over time]

SLIDE 5

Introduction: assumptions

Client settings and browsing behaviour: which pages the user visits, one page at a time

SLIDE 6

Introduction: assumptions

Adversary: can replicate the client's system configuration, can parse traces (detect the start/end of a page load), and obtains clean traces

SLIDE 7

Introduction: assumptions

Web: no personalisation or staleness

SLIDE 8

Methodology

Based on Wang and Goldberg's methodology:
  • Batches and k-fold cross-validation
  • Fast-Levenshtein attack (SVM)

Comparative experiments:
  • Key: isolate the variable under evaluation (e.g., TBB version)


SLIDE 11

Comparative experiments: example

  • Step 1 (Control): train on data with the default value, test on data with the default value, and measure the control accuracy
  • Step 2 (Test): train on data with the default value, test on data with the value of interest, and measure the test accuracy
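A minimal sketch of this two-step evaluation in Python, with synthetic stand-in data (the feature matrices, label counts, and SVM settings are illustrative assumptions, not the paper's exact pipeline):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic stand-ins for traffic-trace features under two collection settings.
rng = np.random.default_rng(0)
X_default = rng.normal(size=(300, 20))     # traces under the default setting
y_default = rng.integers(0, 10, size=300)  # 10 pages in a toy closed world
X_variant = rng.normal(size=(300, 20))     # traces with the variable changed
y_variant = rng.integers(0, 10, size=300)

clf = SVC()
# Step 1 (Control): k-fold cross-validation on default-setting data only.
control = cross_val_score(clf, X_default, y_default, cv=10).mean()
# Step 2 (Test): train on default-setting data, test on the variant data.
test = clf.fit(X_default, y_default).score(X_variant, y_variant)
print(f"control accuracy: {control:.2%}, test accuracy: {test:.2%}")
```

A large gap between the control and test accuracies indicates that the variable under evaluation (TBB version, location, time gap, etc.) breaks the attack's lab assumptions.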


SLIDE 16

Experiments: multitab browsing

  • Firefox users use 2 or 3 tabs on average
  • Experiment with 2 tabs, opened with a time gap of 0.5s, 3s, or 5s
  • Background page picked at random for a batch
  • Success: detection of either page (see the sketch below)
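A minimal sketch of this success metric (the page names and classifier guesses below are illustrative):

```python
# Success in the multitab setting: a guess counts if it matches either the
# foreground or the background page loaded during the visit.
guesses    = ["page_a", "page_c", "page_d"]   # classifier output per visit
foreground = ["page_a", "page_b", "page_e"]   # page in the active tab
background = ["page_x", "page_c", "page_y"]   # page in the background tab

hits = sum(g in (f, b) for g, f, b in zip(guesses, foreground, background))
print(f"either-page accuracy: {hits / len(guesses):.0%}")  # 2/3 -> 67%
```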

SLIDE 17

Experiments: multitab browsing

Accuracy for different time gaps:

  • Control: 77.08%
  • Test (0.5s): 9.8%
  • Test (3s): 7.9%
  • Test (5s): 8.23%

[Diagram: bandwidth over time for Tab 1 and Tab 2, overlapping after the time gap]


SLIDE 19

Experiments: TBB versions

Coexisting Tor Browser Bundle (TBB) versions. Versions: 2.4.7, 3.5 and 3.5.2.1 (changes in randomized pipelining, etc.)

  • Control (3.5.2.1): 79.58%
  • Test (2.4.7): 66.75%
  • Test (3.5): 6.51%

SLIDE 20

Experiments: network conditions

Setup: VMs in New York, Leuven, and Singapore (KU Leuven and DigitalOcean virtual private servers)

SLIDE 21

Experiments: network conditions

  • Control (Leuven): 66.95%
  • Test (New York): 8.83%

SLIDE 22

Experiments: network conditions

  • Control (Leuven): 66.95%
  • Test (Singapore): 9.33%

SLIDE 23

Experiments: network conditions

  • Test (New York): 76.40%
  • Control (Singapore): 68.53%

SLIDE 24

Experiments: data staleness

[Plot: accuracy (%) vs. time (days). Staleness of our collected data over 90 days (Alexa Top 100)]

Accuracy drops to less than 50% after 9 days.

SLIDE 25

Summary


SLIDE 26

Closed vs Open world

Early WF works considered a closed world of pages users may browse (train and test on that world). In practice, in the Tor case there is an extremely large universe of web pages. How likely is the user (a priori) to visit a target web page?

  • If the adversary has a good prior, the attack becomes a "confirmation attack"
  • BUT it may be hard for the adversary to have a good prior, particularly for less popular pages
  • If the prior is not a good estimate: base rate fallacy, many false positives

"False positives matter a lot" [1]

[1] Mike Perry, "A Critique of Website Traffic Fingerprinting Attacks", Tor Project Blog, 2013. https://blog.torproject.org/blog/critique-website-traffic-fingerprinting-attacks


SLIDE 28

The base rate fallacy: example

Breathalyzer test:
  • 0.88 of truly drunk drivers are identified (true positives)
  • 0.05 false positives

Alice tests positive. What is the probability that she is indeed drunk (the Bayesian detection rate, BDR)? Is it 0.95? Is it 0.88? Something in between? Only about 0.1!
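The answer drops out of Bayes' rule; a worked computation (the 1% base rate comes from slide 30):

```python
# Bayesian detection rate (BDR) for the breathalyzer example:
# BDR = P(drunk | positive) by Bayes' rule.
tpr, fpr, prior = 0.88, 0.05, 0.01  # prior: 1% of drivers are drunk

bdr = (tpr * prior) / (tpr * prior + fpr * (1 - prior))
print(f"BDR = {bdr:.3f}")  # ~0.151; the slide's dot diagram rounds to 7/70 = 0.1
```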

SLIDE 29

The base rate fallacy: example

  • The circumference represents the world of drivers.
  • Each dot represents a driver.

SLIDE 30

The base rate fallacy: example

  • 1% of drivers are driving drunk (base rate or prior).

SLIDE 31

The base rate fallacy: example

  • Of the drunk drivers, 88% are identified as drunk by the test.

SLIDE 32

The base rate fallacy: example

  • Of the sober drivers, 5% are erroneously identified as drunk.

SLIDE 33

The base rate fallacy: example

  • Alice must be within the black circumference (those who tested positive).
  • Ratio of red dots within the black circumference: BDR = 7/70 = 0.1!

SLIDE 34

The base rate fallacy in WF

The base rate must be taken into account. In WF:
  • Blue dots: webpages
  • Red dots: monitored pages
  • What is the base rate?

SLIDE 35

The base rate fallacy in WF

Probability of visiting a monitored page? Experiment:

  • 4 monitored pages
  • Train on Alexa top 100, test on Alexa top 35K
  • Binary classification: monitored / non-monitored

Prior probability of visiting a monitored page:

  • Uniform in 35K
  • Priors estimated from the Active Linguistic Authentication Dataset (ALAD) (3.5%): real-world users (80 users, 40K unique URLs)

A sketch of the BDR computation under these two priors follows below.
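A minimal sketch, plugging the two priors above into Bayes' rule (the TPR/FPR values are illustrative placeholders, not the paper's measured rates):

```python
# BDR = P(monitored | classifier says monitored), by Bayes' rule.
def bdr(tpr, fpr, prior):
    return (tpr * prior) / (tpr * prior + fpr * (1 - prior))

tpr, fpr = 0.9, 0.01               # assumed classifier rates, for illustration
print(bdr(tpr, fpr, 1 / 35_000))   # uniform prior over 35K pages: BDR ~ 0.003
print(bdr(tpr, fpr, 0.035))        # ALAD prior (3.5%): BDR ~ 0.77
```

Even with a strong classifier, a tiny prior drags the BDR toward zero, which is the base rate fallacy at work.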

SLIDE 36

Experiment: BDR in a 35K world

[Chart: BDR as a function of the size of the world, for a uniform prior and for priors of non-popular pages estimated from ALAD]

SLIDE 37

Classify, but verify

A verification step tests the classifier's confidence. The number of false positives is reduced, but the BDR is still very low for non-popular pages. A sketch follows below.
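A minimal sketch of such a verification step, assuming a kNN classifier and a confidence threshold (the synthetic data and the 0.9 threshold are illustrative, not the paper's setup):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 10))    # stand-in traffic features
y_train = rng.integers(0, 4, size=200)  # 4 "monitored page" labels
X_test = rng.normal(size=(50, 10))

clf = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
probs = clf.predict_proba(X_test)                  # per-class confidence
preds = clf.classes_[probs.argmax(axis=1)].astype(object)
preds[probs.max(axis=1) < 0.9] = "unknown"         # verify: reject weak guesses
```

Rejecting low-confidence guesses trades recall for fewer false positives, which is what matters for the BDR.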


SLIDE 41

Cost for the adversary

The adversary's cost will depend on:
  • Number of pages (versions, personalisation)
  • Number of target users (system configuration, location)
  • Training and testing complexities of the classifier

Maintaining a successful WF system is costly.

SLIDE 42

Defenses against WF attacks

  • High-level defenses (randomized pipelining, HTTPOS): ineffective
  • Supersequence approaches, traffic morphing (grouping pages to create anonymity sets): infeasible
  • BuFLO (constant-rate traffic): expensive (bandwidth) and hurts usability (latency)
  • Tamaraw, CS-BuFLO: still expensive (bandwidth) with usability costs (latency)

SLIDE 43

Requirements for defenses

  • Effectiveness
  • Do not increase latency
  • No need to compute or distribute auxiliary information
  • No server-side cooperation needed
  • Bandwidth: some increase is tolerable on the input connections to the network

SLIDE 44

Adaptive padding

  • Based on a proposal by Shmatikov and Wang as a defense against end-to-end traffic confirmation attacks
  • Generates traffic packets at random times, with inter-packet timings following the distribution of general web traffic
  • Does NOT introduce latency: real packets are never delayed
  • Disturbs the key traffic features exploited by classifiers (burst features, total size) in an unpredictable way, different for each visit to the same page

(A sketch of the core loop follows below.)
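A minimal sketch of the core idea, assuming a precomputed histogram of inter-packet delays (the bin values, queue-based interface, and dummy cell are illustrative, not the actual pluggable-transport code):

```python
import queue
import random

HISTOGRAM = {0.01: 40, 0.05: 30, 0.2: 20, 1.0: 10}  # delay bin (s) -> count
DUMMY = b"\x00" * 512                                # fixed-size padding cell

def sample_gap(hist):
    # Draw an inter-packet delay with probability proportional to bin counts.
    bins, counts = zip(*hist.items())
    return random.choices(bins, weights=counts)[0]

def padding_loop(real_packets: queue.Queue, send):
    # Real packets are forwarded the moment they arrive (no added latency);
    # if the link stays idle longer than a sampled gap, a dummy goes out
    # instead, so bursts and total size differ on every visit to a page.
    while True:
        gap = sample_gap(HISTOGRAM)
        try:
            pkt = real_packets.get(timeout=gap)  # real traffic wins the race
        except queue.Empty:
            pkt = DUMMY                          # idle too long: inject padding
        send(pkt)
```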

SLIDE 45

Adaptive padding implementation

  • Implemented as a pluggable transport
  • Implemented at both ends (OP and guard, or bridge)
  • Controlled by the client (OP)
  • Need to obtain the distribution of inter-packet delays: crawl

SLIDE 46

Adaptive padding

SLIDE 47

Modifications to adaptive padding

  • Interactivity: two additional histograms to generate dummies in response to a packet received from the other end
  • Control messages: let the client tell the server the parameters of the padding
  • Soft-stop condition: sampling an "infinity" value (probabilistic), as sketched below
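The soft stop can be sketched as an extra "infinity" bin in the delay histogram of the previous sketch; sampling it ends the padded burst instead of scheduling another dummy (bin weights are illustrative):

```python
import math
import random

# Histogram with an "infinity" bin: sampling it means "stop padding for now".
HISTOGRAM = {0.01: 40, 0.05: 30, 0.2: 20, 1.0: 10, math.inf: 5}

def sample_gap(hist):
    bins, counts = zip(*hist.items())
    return random.choices(bins, weights=counts)[0]

gap = sample_gap(HISTOGRAM)
if math.isinf(gap):
    pass  # soft stop: send no dummy; resume when the next real packet arrives
```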

SLIDE 48

Adaptive padding evaluation

Classifier: kNN (Wang et al.). Experimental setup:

  • Training: Alexa top 100
  • Monitored pages: 10, 25, 80
  • Open world: 5K-30K pages

SLIDE 49

Evaluation results

  • Comparison with other defenses
  • Closed world: 100 pages
  • Ideal attack conditions

SLIDE 50

Evaluation results (realistic)

  • 50 monitored pages
  • 5K pages (open world)
  • k-NN with k=5

SLIDE 51

Conclusions

  • Significant difference between the WF attack under ideal lab conditions and under more realistic conditions
  • Effect of false positives (base rate fallacy)
  • The attack is costly for the adversary
  • Adaptive padding is an effective defense with no latency and moderate bandwidth overhead

SLIDE 52

Questions?

Thank you for your attention.