FilterMap: Measuring Censorship Filters at Global Scale
Ram Sundara Raman 1, Adrian Stoll 1, Jakub Dalek 2, Reethika Ramesh 1, Will Scott 3, Roya Ensafi 1
University of Michigan 1, The Citizen Lab 2, Independent 3
24 February 2020
Content Filtering Technologies
○ Intended use - Security
○ Side effect - Censorship, surveillance
advanced features
Netsweeper and Citizen Lab
grade filtering, dynamic categorization of websites
products over several years
block LGBTQ content
Proliferation of Filters
Previous Work
○ Physical access
○ In-country collaborators
Blockpages
○ Trademark of the manufacturing vendor
○ Identity of the deploying actor
censorship filter deployments
consistent and scalable
Objectives
Data Collection
Collect many blockpages from filter deployments
Data Analysis
Identify filters from blockpages
Data Collection
Collect the most comprehensive database of filter blockpages
Data Collection
Censorship measurement techniques frequently observe blockpages
[Diagram: Volunteer and Server. TCP Handshake; GET https://blocked.com; Inject]
Volunteer measurement https://ooni.org/
Challenges
Quack
Remote measurement
VanderSloot et al. [USENIX 2018]
[Diagram: Measurement Machine and Echo Server. TCP Handshake; GET https://blocked.com (Port 7); Inject; echoed GET https://blocked.com; Inject]
Challenges
Hyperquack
New remote measurement
The response to requesting a domain not hosted on the server is predictable
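The observation above means a request naming any domain can be sent to any web server. A minimal sketch of building such a request follows, assuming a plain HTTP/1.1 GET that carries the tested domain in the Host header. The slides do not give Hyperquack's exact request format, so `build_probe` is a hypothetical helper for illustration only.

```python
def build_probe(domain: str, path: str = "/") -> bytes:
    """Build a raw HTTP/1.1 GET whose Host header names `domain`.

    The bytes could be sent to any web server IP, whether or not that
    server hosts `domain` (illustrative helper, not Hyperquack's code).
    """
    if any(c in domain for c in "\r\n "):
        raise ValueError("domain must not contain whitespace")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {domain}",
        "Connection: close",
        "",  # blank line terminates the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


probe = build_probe("www.example1298.com")
print(probe.decode("ascii"))
```

Only the Host header changes between a benign request and a request for a sensitive domain, which is what makes the two server responses comparable.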
[Diagram, animation build: the Measurement Machine sends GET https://www.ndss-symposium.org, GET https://www.usenix.org, and GET https://www.sigsac.org to 46.43.36.222]
Canonical Templates
benign domain patterns (<www>.example1298.<com>)
commonly changing elements e.g. date, domain
save as canonical template
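The template-building steps above can be sketched as follows. This is a minimal illustration, assuming that the changing elements are an HTTP-style date and the requested domain, and that a template is saved only when all normalized responses agree. The names `normalize` and `canonical_template` and the placeholder strings are hypothetical.

```python
import re
from typing import Dict, Optional

# Assumed date format: the HTTP "Date" header style, e.g.
# "Mon, 24 Feb 2020 10:00:00 GMT".
DATE_RE = re.compile(
    r"(Mon|Tue|Wed|Thu|Fri|Sat|Sun), \d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2} GMT"
)


def normalize(response: str, domain: str) -> str:
    """Replace elements that change between requests with placeholders."""
    response = DATE_RE.sub("<DATE>", response)
    return response.replace(domain, "<DOMAIN>")


def canonical_template(responses: Dict[str, str]) -> Optional[str]:
    """Return the shared normalized response, or None if responses differ.

    `responses` maps each benign domain to the raw response it produced.
    """
    normalized = {normalize(body, domain) for domain, body in responses.items()}
    return normalized.pop() if len(normalized) == 1 else None


responses = {
    "www.example1298.com": (
        "HTTP/1.1 301 Moved\r\n"
        "Date: Mon, 24 Feb 2020 10:00:00 GMT\r\n"
        "Location: https://www.example1298.com/\r\n"
    ),
    "www.example4410.com": (
        "HTTP/1.1 301 Moved\r\n"
        "Date: Mon, 24 Feb 2020 10:00:02 GMT\r\n"
        "Location: https://www.example4410.com/\r\n"
    ),
}
template = canonical_template(responses)
print(template)
```

A server whose benign responses do not collapse to a single template yields `None` here and would not be usable as a vantage point under this sketch.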
Censorship Detection
for sensitive keywords
canonical template, then there is censorship
and after to ensure consistency
[Diagram: Measurement Machine and Web Server]
○ TCP Handshake
○ GET https://example{1,2,3}.com, HTTPS reply (e.g., Status Code: 301 Moved): build canonical template of server response
○ GET https://blocked.com, Inject: response different from canonical template means censorship
○ x4
○ GET https://example{1,2,3}.com, HTTPS reply (e.g., Status Code: 301 Moved)
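The comparison shown in the diagram can be sketched as below. This is an illustration under stated assumptions rather than the authors' implementation: `fetch` is a hypothetical callable that returns the already normalized server response for a given domain, the control requests run once before and once after the test request, and the "x4" repetition from the diagram is left out.

```python
from typing import Callable, List


def detect_censorship(
    fetch: Callable[[str], str],
    template: str,
    control_domains: List[str],
    test_domain: str,
) -> str:
    """Classify one test domain as 'censored', 'not censored' or 'inconclusive'.

    `fetch(domain)` is a hypothetical callable returning the normalized
    response of the web server for a request naming `domain`.
    """

    def controls_match() -> bool:
        return all(fetch(d) == template for d in control_domains)

    if not controls_match():  # control requests before the test
        return "inconclusive"
    differs = fetch(test_domain) != template
    if not controls_match():  # control requests after the test
        return "inconclusive"
    return "censored" if differs else "not censored"


def fake_fetch(domain: str) -> str:
    """Stand-in for a real network call, used only for this example."""
    return "BLOCKED" if domain == "blocked.com" else "301 Moved"


controls = ["example1.com", "example2.com", "example3.com"]
print(detect_censorship(fake_fetch, "301 Moved", controls, "blocked.com"))
```

Returning "inconclusive" when a control request deviates keeps an unstable server from being counted as censorship.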
53 million public HTTP hosts
Source - censys.io
Vantage Point Selection
providers
https://corporate.comcast.com/
23.219.228.121
Ethics
website
Measurements
○ HyperQuack and Quack twice a week - November 2018 to January 2019
○ Citizen Lab Global List (~1200 domains) + Alexa Top 1000 domains
○ 3 weeks in October 2018
○ HyperQuack - 9,223 VPs
○ Quack - 33,602 VPs
○ 18,736 domains - Citizen Lab Test List
○ Added OONI data
Data Analysis
Automate the identification of filters from more than a million disrupted responses
Iterative Classification
same content
subset of the HTML page or header
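One way to picture the classification step is shown below. It is a rough sketch, assuming that responses with the same content are grouped first and that each signature is a pattern matching a subset of the HTML page or header. The signature names and patterns are invented for illustration, since the slides do not list real ones.

```python
import re
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical signatures; real vendor or actor signatures are not
# given in the slides.
SIGNATURES = {
    "ExampleFilter": re.compile(r"Server: ExampleFilter/\d+"),
    "ExampleISP": re.compile(r"<title>Blocked by Example ISP</title>"),
}


def classify(pages: List[str]) -> Tuple[Dict[str, int], List[Tuple[str, int]]]:
    """Label groups of identical pages with the first matching signature.

    Returns per-signature response counts and the unlabeled groups,
    largest first, as candidates for writing the next signature.
    """
    labeled: Dict[str, int] = {}
    unlabeled: List[Tuple[str, int]] = []
    for page, count in Counter(pages).most_common():
        for name, pattern in SIGNATURES.items():
            if pattern.search(page):
                labeled[name] = labeled.get(name, 0) + count
                break
        else:
            unlabeled.append((page, count))
    return labeled, unlabeled


pages = (
    ["Server: ExampleFilter/2\r\n\r\n<html>denied</html>"] * 3
    + ["<title>Blocked by Example ISP</title>"]
    + ["<html>unknown page</html>"] * 2
)
labeled, unlabeled = classify(pages)
print(labeled)
print(unlabeled)
```

Each pass leaves a smaller unlabeled set, so adding a signature for the largest remaining group and rerunning gives the iterative behavior the slide title suggests.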
Image Clustering
FilterMap
FilterMap enables a continuous, sustainable, data-driven view of filter deployment
Results
FilterMap creates a map of filter deployments based on the vantage points measured
FilterMap Results
either vendors or actors)
firewalls, ISP and organizational deployments
Commercial Filters
anonymization tools most commonly blocked
FilterMap Results
○ 70 - Latitudinal
○ 20 additional - Longitudinal
Limitations and Future Work
○ Future work - Certificate, TCP/IP header
Implications
circumvent
Summary
technologies for censorship
filter deployments continuously and sustainably
content
https://censoredplanet.org/filtermap
Thank you
Netsweeper
Canadian Filter Vendor
Summary of Data Collection Techniques
Technique | Pros | Cons
OONI | In-depth measurements close to the user (Volunteer -> Site) | Scale, Continuity, Ethics
Quack | Scale - 33,000 vantage points | Only Port 7 measurements
Hyperquack | Port 80 and Port 443 measurements | Can only detect filter if it acts in both directions (MM -> VP)
Blockpages as Identifiers
blockpages
TCP/IP headers, DNS records, certificates
Unexpected Responses
filter blockpages or unexpected responses - Server not found errors, DDoS checks
unexpected responses
The page length metric
OONI
Direct measurement technique
[Diagram: Volunteer and Server. TCP Handshake; GET https://blocked.com; Inject]
Pros
Challenges
Quack
Remote measurement - TCP port 7 (Echo)
[Diagram: Measurement Machine and Echo Server. TCP Handshake; GET https://blocked.com (Port 7); Inject; echoed GET https://blocked.com; Inject]
Pros
Challenges
common Port 80/443
Hyperquack
study
domain not hosted on the server is predictable
Ethics
consent
○ Servers of ISPs
○ Echo servers having Nmap labels such as routers, switches, etc.
community
Ethics
a time, add delays, and use a round-robin schedule
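The rate-limiting idea in this bullet can be sketched as follows. The start of the bullet is cut off in the slide text, so this sketch assumes it means one request at a time, with a delay after each request and a round-robin order over vantage points. The function names, the IP addresses (documentation range), and the delay value are illustrative assumptions.

```python
import time
from typing import Callable, Iterable, Iterator, List, Tuple


def round_robin(
    vantage_points: List[str], domains: List[str]
) -> Iterator[Tuple[str, str]]:
    """Yield (vantage point, domain) pairs, visiting every vantage point
    once before any vantage point receives its next request."""
    for domain in domains:
        for vp in vantage_points:
            yield vp, domain


def run(
    schedule: Iterable[Tuple[str, str]],
    probe: Callable[[str, str], None],
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Send probes sequentially, pausing after each one."""
    for vp, domain in schedule:
        probe(vp, domain)  # one request at a time
        sleep(delay_seconds)  # pause before the next request


sent = []
run(
    round_robin(["198.51.100.1", "198.51.100.2"], ["example1.com", "example2.com"]),
    probe=lambda vp, domain: sent.append((vp, domain)),
    delay_seconds=0.0,
)
print(sent)
```

Cycling over vantage points in the inner loop spreads the load, so no single server sees back-to-back requests.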
Vantage Point Characterization
Iterative Classification Evaluation
FilterMap Results - Data Collection
countries as Quack and OONI
FilterMap Results - Blockpages
were in English
access to content
FilterMap Results - Manufacturing Country
FilterMap Results - Categories
FilterMap Results - Longitudinal
FilterMap Results - Censys