CrowdSurf: Empowering Transparency in the Web


slide-1
SLIDE 1

CrowdSurf

Empowering Transparency in the Web

Hassan Metwalley Stefano Traverso Marco Mellia Stanislav Miskovic Mario Baldi

25 Aug 2016, ACM SIGCOMM, Florianopolis

slide-2
SLIDE 2

26 August 2016 CrowdSurf - Stefano Traverso 2

Introduction

slide-3
SLIDE 3

Do you know what you HTTP?

slide-4
SLIDE 4

Example

Web tracking

Thousands of Web trackers collect our data

  • Browsing histories
  • Religious, sexual, and political preferences
  • On average, the first tracker is met as soon as the browser starts [1]
  • Some trackers reach 96% of users [1]
  • 71% of websites host at least one tracker [1]

[1] Metwalley, H. et al. “The Online Tracking Horde: A View from Passive Measurements”, TMA 2015

slide-5
SLIDE 5

The Open Question

How can we know, and choose, which services our data is exchanged with, and how?

slide-6
SLIDE 6

Partial solutions

In-network devices

  • Firewalls and proxies
      • Fail in case of encrypted traffic (HTTPS)
      • Lack scalability
      • Managed by third parties

On-client

  • Browser plugins
      • Limited scope
      • No control on device traffic
      • Not transparent

slide-7
SLIDE 7

A New System

Goal: let users regain visibility and control over the information they exchange with Web services

Design Principles

  • Holistic: working in any scenario
  • Client-centric: available on any kind of device
  • Practical, not revolutionary: use existing technology
  • Crowd-sourced: knowledge built on a community of users
  • Automatic: little engagement of the user
  • Privacy-safe: never compromise users’ privacy

slide-8
SLIDE 8

CrowdSurf

slide-9
SLIDE 9

CrowdSurf

Cloud

  • A controller collects information about the services users visit
      • Explicit -> their opinion
      • Implicit -> traffic samples
  • Users’ contributions processed by data analyzers and the advising community
  • Results = suggestions about the reputation of services

Client

  • Users download the suggestions they like
  • The CrowdSurf Layer translates them into rules
  • Rules = actions on users’ traffic
      • Regexp + action
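
A rule, as described above, pairs a regular expression with an action. The sketch below shows one possible way a client could represent and apply such rules. The `Rule` class, the `decide` function, the first-match-wins policy, and the example patterns are all illustrative assumptions, not the CrowdSurf implementation.

```python
import re
from dataclasses import dataclass


# Hypothetical rule format: a regular expression plus an action name.
@dataclass(frozen=True)
class Rule:
    pattern: str  # regular expression matched against the request URL
    action: str   # e.g. "allow", "block", or "log"


def decide(url: str, rules: list[Rule], default: str = "allow") -> str:
    """Return the action of the first rule whose pattern matches the URL."""
    for rule in rules:
        if re.search(rule.pattern, url):
            return rule.action
    return default


# Made-up patterns, for illustration only.
RULES = [
    Rule(r"^https?://([^/]+\.)?acmetrack\.com/", "block"),
    Rule(r"[?&]uid=", "log"),
]

print(decide("http://acmetrack.com/query?key1=X", RULES))  # block
print(decide("http://example.org/page?uid=42", RULES))     # log
print(decide("http://example.org/index.html", RULES))      # allow
```

Evaluating rules in order and stopping at the first match keeps the behavior easy to predict; a real client might instead rank rules by priority or by the reputation of the suggestion they came from.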

slide-10
SLIDE 10

CrowdSurf Controllers

Open Controller

  • Collaborative approach
  • Users improve the wisdom of the system
      • Traffic samples and opinions
      • Build data analyzers and suggestions

Corporate Controller

  • Builds rules directly for employees
  • Employees cannot customize rules
  • All devices follow the same rules

slide-11
SLIDE 11

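The CrowdSurf Layer

[Figure: the CrowdSurf Layer shown in the protocol stack together with HTTP, TLS, and TCP. The Open Controller and the Corporate Controller feed a "Suggestions to Rules" component. The Rule Processor combines Regular Expression Matching with an Action: Redirect, Modify, Allow, Block, or Log and Report. An Anonymization component is also shown.]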

slide-12
SLIDE 12

CrowdSurf in a picture

[Figure labels: Web Services; Open Controller; Corporate Controller; Opinions + Traffic samples; Suggestions; Traffic samples; Rules; Ruled Interaction]

slide-13
SLIDE 13

Proof of Concept

slide-14
SLIDE 14

Prototype

Controller

  • Java-based web service
  • Communicates with CrowdSurf devices
  • Hosts a data analyzer for identification of tracking sites
  • Collects traffic samples
  • Distributes suggestions

Client

  • Implemented as a Firefox plugin
  • Supports block, redirect, log&report

slide-15
SLIDE 15

Example of Data Analyzer: Automatic Tracker Detector

Unsupervised methodology to identify third-party trackers [2]

  • Observation: trackers usually embed UIDs as URL parameters
  • Procedure:
      1. Input: HTTP traffic samples provided by CS users
      2. Take all HTTP queries to third-party services, e.g.
         http://acmetrack.com/query?key1=X&key2=Y
      3. Extract keys (key1, key2) and their values
      4. Check the presence of key values uniquely associated to the users

[2] Metwalley, H. et al. “Unsupervised Detection of Web Trackers”, IEEE Globecom 2015
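
The procedure above can be sketched in a few lines. This is a minimal illustration, not the method published in [2]: it assumes the samples arrive as `(user, url)` pairs, and it treats a key as a candidate UID when every user always sends the same value for it and no two users share a value. The function name, the sample data, and that exact criterion are assumptions.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlsplit


def candidate_uid_keys(samples):
    """Return {host: set of keys} whose values look user-specific.

    samples: iterable of (user, url) pairs for third-party requests.
    """
    # (host, key) -> user -> set of values seen for that user
    seen = defaultdict(lambda: defaultdict(set))
    for user, url in samples:
        parts = urlsplit(url)
        for key, value in parse_qsl(parts.query):
            seen[(parts.hostname, key)][user].add(value)

    result = defaultdict(set)
    for (host, key), per_user in seen.items():
        if len(per_user) < 2:
            continue  # at least two users are needed for a comparison
        # Each user must always send exactly one value for this key...
        stable = all(len(values) == 1 for values in per_user.values())
        # ...and that value must differ from every other user's value.
        first_values = {next(iter(values)) for values in per_user.values()}
        distinct = len(first_values) == len(per_user)
        if stable and distinct:
            result[host].add(key)
    return dict(result)


# Two hypothetical users, two visits each.
samples = [
    ("u1", "http://acmetrack.com/query?sid=a&tmp=m&uid=x"),
    ("u2", "http://acmetrack.com/query?sid=b&tmp=m&uid=y"),
    ("u1", "http://acmetrack.com/query?sid=d&tmp=n&uid=x"),
    ("u2", "http://acmetrack.com/query?sid=e&tmp=n&uid=y"),
]
print(candidate_uid_keys(samples))  # {'acmetrack.com': {'uid'}}
```

Several visits per user are needed for this check to be meaningful: with a single visit, every key trivially has one value per user, so session-like keys would be flagged too.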

slide-16
SLIDE 16

Example of Data Analyzer: Automatic Tracker Detector

http://acmetrack.com/query?sid=X&tmp=Y&uid=Z

[Figure: values of the keys sid, tmp, and uid observed over time in Visit 1, Visit 2, and Visit 3]

Key   Visit 1   Visit 2   Visit 3
sid   a b c     d e f     g h i
tmp   m m m     n n n     p p p
uid   x y z     x y z     x y z

34 new third-party trackers found

slide-17
SLIDE 17

Performance Implications of running CrowdSurf

Different user profiles

Paranoid Profile

  • Blocks
      • adv/tracking
      • JS code
  • Does not report traffic samples

Kid Profile

  • Activates child protection rules
  • Reports traffic to trackers

Corporate Profile

  • Redirects search.google.com to search.bing.com
  • Blocks social networks, e-commerce sites, trackers
  • Reports activity on Dropbox
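
As an illustration of the redirect action in the Corporate profile, the sketch below rewrites the host of a request URL. The `REDIRECTS` table and the `apply_redirect` function are hypothetical names introduced for this example; only the google-to-bing mapping comes from the slide.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical redirect table; the single entry mirrors the Corporate profile.
REDIRECTS = {"search.google.com": "search.bing.com"}


def apply_redirect(url: str, redirects: dict[str, str] = REDIRECTS) -> str:
    """Rewrite the host of `url` if the profile redirects it, else return it unchanged."""
    parts = urlsplit(url)
    target = redirects.get(parts.hostname or "")
    if target is None:
        return url
    # Replacing netloc drops any port or credentials from the original URL.
    return urlunsplit(parts._replace(netloc=target))


print(apply_redirect("http://search.google.com/search?q=sigcomm"))
# http://search.bing.com/search?q=sigcomm
print(apply_redirect("http://example.org/"))
# http://example.org/
```

Matching on the exact host name keeps the example short; a rule-based client would more likely express the same redirect as a regular expression plus a replacement.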

slide-18
SLIDE 18

Impact on Web site loading time

[Figure: Web site loading time for the Kid, Paranoid, and Corporate profiles]

  • Paranoid is 1.07 times faster than baseline
  • Kid is 1.08 times slower
  • Corporate is 1.18 times slower
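
To make these ratios concrete, the snippet below converts them into load times for a hypothetical page. It assumes that "N times faster" means the load time is divided by N and "N times slower" means it is multiplied by N; the 2.0 s baseline is made up and does not come from the measurements.

```python
BASELINE_S = 2.0  # hypothetical page load time without CrowdSurf, in seconds

# Ratios from the slide: "faster" divides the baseline, "slower" multiplies it.
load_times = {
    "Paranoid": BASELINE_S / 1.07,
    "Kid": BASELINE_S * 1.08,
    "Corporate": BASELINE_S * 1.18,
}

for profile, seconds in load_times.items():
    print(f"{profile}: {seconds:.2f} s")
# Paranoid: 1.87 s
# Kid: 2.16 s
# Corporate: 2.36 s
```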

slide-19
SLIDE 19

Conclusion

slide-20
SLIDE 20

Open Problems

  • Lots of details to consider
  • Design/develop/standardize a new network layer
  • Protecting users’ privacy
      • Anonymizing HTTP/S traffic
  • Usability
  • Getting users to join
  • Protection from malicious biases

slide-21
SLIDE 21

Holistic, crowd-sourced system for the auditing of the information we expose in the Web

CrowdSurf

https://www.myermes.com

slide-22
SLIDE 22

Thank you!

slide-23
SLIDE 23

Need a new model that…

  • Enables transparency and visibility: monitor the HTTP traffic before encryption takes place
  • Takes actions: block/manipulate/report transactions to undesired services
  • Under user’s control: automatic, but configurable

slide-24
SLIDE 24

Example of Data Analyzer: Automatic Tracker Detector

Dataset

HTTP trace from ISP running Tstat

  • 10 days of October 2014
  • ~19k monitored users
  • ~240k HTTP transactions per day

Website         Embedded third-party trackers
Portal1         26
News1           13
E-commerce1     12
E-commerce2     9
E-commerce3     4
Portal2         4
Porn            3
Sportnews       1
SearchEngine    1

News1

Third-party tracker    Key
cl.adform.net          xid
atemda.com             bidderuid
x.bidswitch.net        user_id
www.77tracking.com     rand
rack.movad.net         us
ovo01.webtrekk.net     cs2
dis.criteo.com         uid
p.rfihub.com           bk-uuid
ib.adnxs.com           xid

34 new third-party trackers found
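
Once a tracker's identifier key is known, reading the identifier out of a request is a simple lookup. The sketch below uses a few of the host/key pairs listed for News1; the `extract_identifier` function and the example URL (path and values) are made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit

# A few host -> key pairs taken from the News1 list above.
TRACKER_KEYS = {
    "cl.adform.net": "xid",
    "x.bidswitch.net": "user_id",
    "dis.criteo.com": "uid",
}


def extract_identifier(url):
    """Return the value of the tracker's identifier key, or None if absent."""
    parts = urlsplit(url)
    key = TRACKER_KEYS.get(parts.hostname or "")
    if key is None:
        return None  # host is not a known tracker
    values = parse_qs(parts.query).get(key)
    return values[0] if values else None


# Hypothetical request; only the host and key name come from the list.
print(extract_identifier("http://dis.criteo.com/sync?uid=abc123&x=1"))  # abc123
print(extract_identifier("http://example.org/?uid=abc123"))             # None
```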

slide-25
SLIDE 25

Example

A growing business around our data

[3] Metwalley, H. et al. “The Online Tracking Horde: A View from Passive Measurements”, TMA 2015

slide-26
SLIDE 26

Loss of visibility and control

  • HTTPS protects our privacy, but…
  • …prevents third parties from checking what’s going on under the hood of encryption
  • …and severely limits network functions

“Child protection through the use of Internet Watch Foundation blacklists has become ineffective, with just 5% of entries still being blocked when HTTPS is deployed” [2]

[2] Naylor, D. et al. “The Cost of the "S" in HTTPS”, CoNEXT 2014

slide-27
SLIDE 27

Time to collect a dataset

googleanalytics

slide-28
SLIDE 28

Monitoring the Web

[1] Popa, L. et al.,“HTTP As the Narrow Waist of the Future Internet,” ACM HotNets, 2010

HTTP [1] HTTPS/HTTP 2.0

slide-29
SLIDE 29

CrowdSurf Controllers

Open Controller

  • Collaborative approach
  • Users improve the wisdom of the system
      • Traffic samples and opinions
      • Build data analyzers and suggestions

Third-party Controller

  • Suggestions for commercial purposes
  • Opens up a market of suggestions

Corporate Controller

  • Builds rules directly for employees
  • Employees cannot customize rules
  • All devices follow the same rules

slide-30
SLIDE 30

CrowdSurf in a picture

[Figure labels: Web Services; Open controller; Corporate controller; Third-party controller; Data Analyzer; Private User Device; Corporate Device; Traffic samples; Corporate Rules; Web Browsing; Suggestions]