Characterizing Global Web Censorship: Why is it so hard?



SLIDE 1

Characterizing Global Web Censorship: Why is it so hard?

Phillipa Gill The Citizen Lab/Stony Brook University

Work done in collaboration with: Masashi Crete Nishihata, Jakub Dalek, Sharon Goldberg, Adam Senft and Greg Wiseman

Workshop on Active Internet Measurements, CAIDA

Feb. 8, 2012
SLIDE 2

Overview

Large-scale politically driven Internet outages are well known…

  • …but what happens within countries is less well understood

We leverage data gathered by an interdisciplinary group (the OpenNet Initiative) to bootstrap our analysis

  • 77 countries, 286 distinct ISPs, measured from 2007-2012
  • Advantages: context about what, when, and where to measure
  • Disadvantages: dearth of technical data/raw measurements

Our results highlight important challenges for censorship research!


SLIDES 3-13

Background

  • Where censorship can happen (decision flowchart, built up step by step over these slides):

    – Start: DNS reply?
        No → DNS blocking
        Yes → DNS redirect?
    – DNS redirect?
        Yes → DNS blocking
        No → Response to SYN?
    – Response to SYN?
        No → IP blocking
        Yes → Response to HTTP request?
    – Response to HTTP request?
        No → No HTTP Reply
        Yes → What was it?
    – What was it? RST, block page, or infinite HTTP redirect
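The decision flow on this slide can be read as a chain of checks, each one only reached if the previous stage succeeded. The following is a minimal sketch of that reading, not the tool used in the study: the `ProbeResult` fields, the `http_outcome` strings, and the "Not blocked" fallback label are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    """Observations from one field test of one URL (all field names are hypothetical)."""
    dns_replied: bool                   # did the resolver answer at all?
    dns_redirected: bool = False        # did the answer point somewhere unexpected?
    syn_answered: bool = False          # did the TCP SYN get a response?
    http_answered: bool = False         # did anything come back after the HTTP request?
    http_outcome: Optional[str] = None  # "rst", "block_page", "redirect_loop", or "ok"


def classify(result: ProbeResult) -> str:
    """Walk the flowchart top to bottom and return the first matching label."""
    if not result.dns_replied:
        return "DNS blocking"
    if result.dns_redirected:
        return "DNS blocking"
    if not result.syn_answered:
        return "IP blocking"
    if not result.http_answered:
        return "No HTTP reply"
    labels = {
        "rst": "RST",
        "block_page": "Block page",
        "redirect_loop": "Infinite HTTP redirect",
    }
    # Assumption: anything else that came back is treated as a normal page.
    return labels.get(result.http_outcome, "Not blocked")


if __name__ == "__main__":
    print(classify(ProbeResult(dns_replied=True, syn_answered=False)))
```

Ordering the checks this way means a failure is always attributed to the earliest stage that broke, which matches the top-down structure of the flowchart.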

SLIDE 14

Methodology

  • Basic idea: issue requests for a consistent set of sites from the field and from a control location (the lab)
  • Software synchronizes the requests between lab and field
  • Once both lab and field have completed, results are sent back to the lab for further analysis
  • What is tested:
    – Sites that are likely to trigger censorship
    – Determined in collaboration with regional groups
  • Where tests are run:
    – Combination of targeted and opportunistic testing
    – Performed by regional collaborators after an informed-consent meeting
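The lab/field comparison described above might look roughly like the sketch below. It is an illustration under stated assumptions, not the project's software: the function name, the bucket names, and the reduction of each test to a single success flag are all hypothetical simplifications.

```python
from typing import Dict, List


def compare_lab_and_field(lab: Dict[str, bool], field: Dict[str, bool]) -> Dict[str, List[str]]:
    """Sort URLs into buckets by comparing reachability from the lab and the field.

    Each input maps URL -> True if the fetch succeeded at that vantage point.
    """
    buckets: Dict[str, List[str]] = {
        "accessible": [],        # reachable from the field
        "possibly_blocked": [],  # lab succeeded, field failed
        "inconclusive": [],      # lab failed too (site may be down) or has no result
    }
    for url in sorted(field):
        if field[url]:
            buckets["accessible"].append(url)
        elif lab.get(url, False):
            buckets["possibly_blocked"].append(url)
        else:
            buckets["inconclusive"].append(url)
    return buckets


if __name__ == "__main__":
    lab = {"http://a.example/": True, "http://b.example/": True, "http://c.example/": False}
    field = {"http://a.example/": True, "http://b.example/": False, "http://c.example/": False}
    print(compare_lab_and_field(lab, field))
```

The control location matters because a URL that fails in both places says nothing about the field network; only a lab success paired with a field failure is worth analyzing further.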

SLIDE 15

Challenges for censorship research


SLIDES 16-19

  • 1. Variation between countries

[Figure: fraction of blocking results by country (China, Iran, UAE, Yemen, Burma, Vietnam), broken down by blocking type: No DNS Reply, DNS Redirection, No HTTP Reply, RST, Blockpage]

There is no such thing as a “representative” country

SLIDES 20-22

  • 2. Variation between ISPs

Decentralized blocking in UAE

[Figure: fraction of content blocked by year, 2007-2012, for AS 5384 and AS 15802]

“Du” ISP does not censor prior to April 2008

Censorship is a per-ISP property (when censorship is decentralized)

SLIDES 23-25

  • 2. Variation between types of networks

[Figure: Jaccard similarity coefficient by country]

Academic networks block an average of 40% less!

Academic networks are not representative!
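For reference, the Jaccard similarity coefficient plotted here is the size of the intersection of two sets divided by the size of their union. The slides do not say exactly which sets were compared, so the sketch below assumes, purely for illustration, the sets of URLs found blocked on two networks in the same country; the variable names and example hosts are hypothetical.

```python
from typing import Iterable


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections, treated as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(set_a & set_b) / len(union)


if __name__ == "__main__":
    blocked_on_academic = {"a.example", "b.example"}
    blocked_on_isp = {"a.example", "b.example", "c.example", "d.example"}
    print(jaccard(blocked_on_academic, blocked_on_isp))  # 2 shared / 4 total = 0.5
```

A value of 1 means the two networks block exactly the same content, and values near 0 mean their block lists barely overlap.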

SLIDES 26-28

  • 3. Sudden temporal shifts in blocking

Censorship in Burma over time

[Figure: fraction of tests blocked by year, 2009-2012, by theme: Political, Social, Internet, Conflict]

End of military rule in 2011 brought political reforms.

Need to measure over time and correlate with political changes

SLIDES 29-32

  • 4. Stealthy blocking of certain content

Censorship of content in Yemen

[Figure: fraction of block results by theme (Political, Social, Internet, Conflict), broken down by blocking type: No DNS Reply, No HTTP Reply, RST, Blockpage]

Transparent blocking of social and Internet content

“Stealthy” blocking of political and conflict-related content

Measurement needs to be robust to distinguish failure from censorship
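One simple way to make a measurement more robust is to repeat each paired lab/field test at several different times and only trust consistent outcomes. The sketch below is an assumption about how that could be done, not a description of the study's method: the function name, the labels, and the `min_trials` threshold are hypothetical. Note that an "intermittent" result does not rule out censorship, since blocking itself can be inconsistent.

```python
from typing import List


def judge_repeated_trials(lab_ok: List[bool], field_ok: List[bool], min_trials: int = 3) -> str:
    """Label one URL from paired lab/field trials run at several different times."""
    if len(lab_ok) != len(field_ok):
        raise ValueError("lab and field trials must be paired")
    # Only trials where the control fetch worked say anything about the field network.
    usable = [fld for lab, fld in zip(lab_ok, field_ok) if lab]
    if len(usable) < min_trials:
        return "inconclusive"
    if all(usable):
        return "accessible"
    if not any(usable):
        return "consistently failing"  # candidate for censorship, needs closer analysis
    return "intermittent"              # could be a transient failure


if __name__ == "__main__":
    print(judge_repeated_trials([True, True, True], [False, False, False]))
```

Stealthy techniques such as dropped DNS or HTTP replies look the same as an outage in a single test, which is why the sketch refuses to label a URL until enough usable trials agree.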

SLIDES 33-36

  • 5. The type of content tested matters

[Figure: fraction of content blocked by country, local vs. global content]

  • 3-5X more blocking of local content in China/Yemen (most blocked content is political)
  • Less discrepancy in UAE (most blocked content is social)

Need to take an interdisciplinary approach to determine what content to test

SLIDE 37

Challenges for censorship research:

  • 1. Variations between technology used by countries
  • 2. Variations between ISPs and between ISPs and institutions

  • 3. Sudden temporal shifts in blocking
  • 4. Stealthy blocking of certain content
  • 5. Locally relevant content is more likely to be blocked

And more!

  • Maintaining infrastructure across funding cycles and staff turnover
  • Informed consent and preserving user privacy: testing can pose a physical risk!

SLIDE 38

What’s next?

More measurements, taking an interdisciplinary approach to tackle the problem:

  • Rigorous measurements + political context

Data sharing?

  • Short answer: we’re working on it.
  • Longer answer: this project has laid the foundation in terms of unifying the data and removing PII.
    – Anticipate releasing data in the next ~4 months

SLIDE 39

What I hope to get out of this workshop

  • Discuss how existing platforms may be used for censorship research. Particularly interested in:
    – Platforms with visibility into the network edge
    – DNS/BGP measurements
  • Discuss how a large-scale, long-term censorship measurement platform may be built
  • Discuss how we might distinguish transient failures/TCP bugs from actual censorship
