Examining How The Great Firewall Discovers Hidden Circumvention Servers

slide-1
SLIDE 1

Examining How The Great Firewall Discovers Hidden Circumvention Servers

Roya Ensafi, David Fifield, Philipp Winter, Nick Feamster, Nicholas Weaver, and Vern Paxson Oct 29, 2015

slide-2
SLIDE 2

Circumventing Internet Censorship Using Proxies

[Diagram: Internet, Web servers]

slide-3
SLIDE 3
Circumventing Internet Censorship Using Proxies

  • Not everyone can connect to all web servers

[Diagram: Internet, DPI, Web servers]

slide-4
SLIDE 4

Circumventing Internet Censorship Using Proxies

  • Not everyone can connect to all web servers
  • Many use proxy servers to circumvent censorship

[Diagram: Internet, DPI, Proxy server, Web servers]

slide-5
SLIDE 5

Circumventing Internet Censorship Using Proxies

  • Not everyone can connect to all web servers
  • Many use proxy servers to circumvent censorship
  • Governments are getting smarter at detecting proxy servers

[Diagram: Internet, DPI, Proxy server, Web servers]

slide-6
SLIDE 6
Circumventing Internet Censorship Using Proxies

  • Not everyone can connect to all web servers
  • Many use proxy servers to circumvent censorship
  • Governments are getting smarter at detecting proxy servers

[Diagram: Internet, DPI, Proxy server, Web servers]

How do governments find these proxies?

slide-7
SLIDE 7

How GFW Discovers Hidden Circumvention Servers

We focus on the GFW and Tor

  • GFW is a sophisticated censorship system
  • Tor has a long history of being used for circumventing government censorship

slide-8
SLIDE 8

Censorship Arms Race: GFW vs. Tor

  • Tor: Use public Tor network to circumvent GFW

[Timeline axis: Time]

slide-9
SLIDE 9

Censorship Arms Race: GFW vs. Tor

  • Tor: Use public Tor network to circumvent GFW
  • GFW: Download consensus and block relays

[Timeline axis: Time]

slide-10
SLIDE 10

Censorship Arms Race: GFW vs. Tor

  • Tor: Use public Tor network to circumvent GFW
  • GFW: Download consensus and block relays
  • Tor: Introduce private bridges, whose distribution is rate-limited

[Timeline axis: Time]

slide-11
SLIDE 11

Censorship Arms Race: GFW vs. Tor

  • Tor: Use public Tor network to circumvent GFW
  • GFW: Download consensus and block relays
  • Tor: Introduce private bridges, whose distribution is rate-limited
  • GFW: Use DPI to detect Tor TLS handshake

[Timeline axis: Time]

slide-12
SLIDE 12

Fingerprinting the Tor TLS Handshake

  • TLS handshake is unencrypted and leaks information
  • Tor’s use of TLS has some peculiarities
    ○ X.509 certificate lifetimes
    ○ Cipher suites
    ○ Randomly generated server name indication (e.g., www.6qgoz6epdi6im5rvxnlx.com)
  • GFW looks (at least) for cipher suites in the TLS client hello
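A minimal sketch of the cipher-suite extraction that this kind of client hello fingerprinting depends on. It is not the GFW's implementation. The names `client_hello_cipher_suites` and `matches_fingerprint` are illustrative assumptions, the fingerprint list must be supplied by the caller, and the parser assumes the whole client hello sits in one unfragmented TLS record.

```python
import struct
from typing import List


def client_hello_cipher_suites(record: bytes) -> List[int]:
    """Return the cipher suite IDs offered in a TLS ClientHello record.

    Hypothetical helper for illustration only; truncated input may raise
    IndexError or struct.error instead of ValueError.
    """
    # TLS record header: content type (1), version (2), length (2).
    if len(record) < 5 or record[0] != 0x16:
        raise ValueError("not a TLS handshake record")
    body = record[5:]
    # Handshake header: message type (1), length (3); type 1 = ClientHello.
    if len(body) < 4 or body[0] != 0x01:
        raise ValueError("not a ClientHello")
    pos = 4
    pos += 2 + 32                      # client_version + random
    session_id_len = body[pos]
    pos += 1 + session_id_len          # skip the session ID
    (suites_len,) = struct.unpack_from("!H", body, pos)
    pos += 2
    # Each cipher suite is a 2-byte big-endian identifier.
    return list(struct.unpack_from("!%dH" % (suites_len // 2), body, pos))


def matches_fingerprint(record: bytes, fingerprint: List[int]) -> bool:
    """True if the offered suites equal a caller-supplied list, in order."""
    return client_hello_cipher_suites(record) == fingerprint
```

Because the client hello is sent in the clear, a passive observer needs no keys to run a check like this, which is why the cipher suite list works as a fingerprint.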

slide-13
SLIDE 13

Censorship Arms Race: GFW vs. Tor

  • Tor: Use public Tor network to circumvent GFW
  • GFW: Download consensus and block relays
  • Tor: Introduce private bridges, whose distribution is rate-limited
  • GFW: Use DPI to detect Tor TLS handshake
  • Tor: Introduce pluggable transports to hide the handshake, such as obfs2, obfs3

[Timeline axis: Time]

slide-14
SLIDE 14

Tor Pluggable Transport

  • Pluggable transports are drop-in modules for traffic obfuscation
  • Many modules have been written, but we focus on:
    ○ obfs2 (first deployed module)
      ■ First 20 bytes can be used to detect Tor traffic with high confidence
    ○ obfs3 (obfs2’s successor)
      ■ Makes Tor traffic look like a uniformly random byte stream
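To illustrate what "looks like a uniformly random byte stream" means in practice, here is a small byte-entropy estimate. This is only a sketch, not the detection method used by the GFW or described in the talk, and `byte_entropy` is a hypothetical helper name. High entropy alone does not identify obfs3, since any encrypted or compressed payload scores similarly.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte histogram, in bits per byte (0.0 to 8.0).

    Illustrative only: for payloads shorter than 256 bytes the maximum
    reachable value is log2(len(data)), so short samples score below 8.
    """
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

A payload in which every byte value occurs equally often scores 8 bits per byte, while a constant payload scores 0. A classifier that relies on a score like this cannot avoid false positives, which matches the point that detecting pluggable transports is uncertain.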

slide-15
SLIDE 15

Censorship Arms Race: GFW vs. Tor

  • Tor: Use public Tor network to circumvent GFW
  • GFW: Download consensus and block relays
  • Tor: Introduce private bridges, whose distribution is rate-limited
  • GFW: Use DPI to detect Tor TLS handshake, then probe and block bridges
  • Tor: Introduce pluggable transports to hide the handshake, such as obfs2, obfs3

[Timeline axis: Time]

  • Detection of pluggable transports is uncertain
    ○ Implies false positives → collateral damage

slide-16
SLIDE 16

Censorship Arms Race: GFW vs. Tor

  • Tor: Use public Tor network to circumvent GFW
  • GFW: Download consensus and block relays
  • Tor: Introduce private bridges, whose distribution is rate-limited
  • GFW: Use DPI to detect Tor TLS handshake, then probe and block bridges
  • Tor: Introduce pluggable transports to hide the handshake, such as obfs2, obfs3

[Timeline axis: Time]

  • Detection of pluggable transports is uncertain
    ○ Implies false positives → collateral damage

GFW added active probing to complement the DPI fingerprinting

slide-17
SLIDE 17

How does GFW Block Tor Hidden Circumvention Servers?

1. Network monitoring (e.g., switch mirror port)
2. DPI for suspicious traffic (e.g., cipher suite)
3. Actively probing server to verify suspicion
4. Blocking server

slide-18
SLIDE 18

Censorship Arms Race: GFW vs. Tor

  • Tor: Use public Tor network to circumvent GFW
  • GFW: Download consensus and block relays
  • Tor: Introduce private bridges, whose distribution is rate-limited
  • GFW: Use DPI to detect Tor TLS handshake
  • Tor: Introduce pluggable transports to hide the handshake, such as obfs2, obfs3
  • GFW: Use DPI + active probing

[Timeline axis: Time]

slide-19
SLIDE 19

Many Questions about Active Probing are Unanswered!

  • Only two blog posts and Winter’s FOCI’12 paper
  • We lack a comprehensive picture of more complicated questions
  • We want to know:
    ○ Implementation, i.e., how does it block?
    ○ Architecture, i.e., how is a system added to China’s backbone?
    ○ Policy, i.e., what kind of protocols does it block?
    ○ Effectiveness, i.e., what’s the degree of success at discovering Tor bridges?

slide-20
SLIDE 20

Overview of Our Datasets:

[Diagram: Shadow Infrastructure. Clients in China (Unicom ISP, CERNET Network); Amazon AWS bridges (two sets of EC2-Vanilla, EC2-Obfs2, EC2-Obfs3)]

slide-21
SLIDE 21

Overview of Our Datasets:

[Diagram: Sybil Infrastructure. Vanilla Tor client in China; ports 30000 ... 30300 ... 30600; forwarding 600 ports to Tor port]

[Diagram: Shadow Infrastructure. Clients in China (Unicom ISP, CERNET Network); Amazon AWS bridges (two sets of EC2-Vanilla, EC2-Obfs2, EC2-Obfs3)]

slide-22
SLIDE 22

Overview of Our Datasets:

Server Log Analysis: application logs of a web server that also runs a Tor bridge since 2010.

[Diagram: Shadow Infrastructure. Clients in China (Unicom ISP, CERNET Network); Amazon AWS bridges (two sets of EC2-Vanilla, EC2-Obfs2, EC2-Obfs3)]

[Diagram: Sybil Infrastructure. Vanilla Tor client in China; ports 30000 ... 30300 ... 30600; forwarding 600 ports to Tor port]

slide-23
SLIDE 23
Overview of Our Datasets:

  • For the Shadow and the Sybil datasets:
    ○ We had pcap files of both the clients and the bridges.
  • For the Log dataset, we only had application logs.

Dataset   Time span
Shadow    Dec 2014 -- Feb 2015 (3 months)
Sybil     Jan 29, 2015 -- Jan 30, 2015 (20 hours)
Log       Jan 2010 -- Aug 2015 (5 years)

slide-24
SLIDE 24

How to Distinguish Probers from Genuine Clients?

slide-25
SLIDE 25

How to Distinguish Probers from Genuine Clients?

  • Detecting probers in the Sybil dataset is easy; all the probers:
    ○ Visited our vanilla Tor bridge after our client established connections
    ○ Originated from China

slide-26
SLIDE 26

How to Distinguish Probers from Genuine Clients?

  • Detecting probers in the Sybil dataset is easy; all the probers:
    ○ Visited our vanilla Tor bridge after our client established connections
    ○ Originated from China
  • For the other datasets, we adopt an algorithm:
    ○ If Tor’s cipher suites are in the TLS client hello => vanilla bridge probers
    ○ If the first 20 bytes can reveal obfs2 => obfs2 bridge probers
    ○ ...

slide-27
SLIDE 27

How Many Unique Probers did We Find?

slide-28
SLIDE 28
How Many Unique Probers did We Find?

  • Using the Sybil, Shadow, and Log datasets:
    ○ In total, we collected 16,083 unique prober IP addresses

[Venn diagram of prober IPs. Region labels: Sybil 1,090 (22 hours); Shadow 135 (3 months); Log 14,802 (over 5 years); overlap regions 2, 89, 20; GFW’s famous IP: 202.108.181.70]

slide-29
SLIDE 29

Can We Fingerprint Active Probers?

slide-30
SLIDE 30

Can We Fingerprint Active Probers?

  • TCP layer
    ○ TSval slope: timestamp clock rate
    ○ TSval intercept: (rough) system uptime
    ○ GFW likely operates a handful of physical probing systems

[Plots: Shadow exp. with 158 Prober IPs; Sybil exp. with 1,182 Prober IPs; Log dataset with 14,912 Prober IPs]
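The slope and intercept above come from fitting a line to TCP timestamp values (TSval) against capture time. A minimal least-squares sketch follows. It is not the authors' analysis code, and `fit_tsval` is a hypothetical name. It assumes all samples come from one host and ignores 32-bit TSval wraparound.

```python
from typing import Sequence, Tuple


def fit_tsval(times: Sequence[float], tsvals: Sequence[int]) -> Tuple[float, float]:
    """Least-squares line tsval = slope * time + intercept.

    slope     -> timestamp clock rate in ticks per second
    intercept -> TSval at time 0, a rough proxy for uptime if the counter
                 started near zero at boot (an assumption, not a given)
    """
    n = len(times)
    if n < 2 or n != len(tsvals):
        raise ValueError("need at least two matching samples")
    mean_t = sum(times) / n
    mean_v = sum(tsvals) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("all samples share the same capture time")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, tsvals))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t
```

If probes from many different source IP addresses fall on the same few lines, they plausibly share a clock, which is the reasoning behind "a handful of physical probing systems".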

slide-31
SLIDE 31

Can We Fingerprint Active Probers?

  • TCP layer
    ○ Striking pattern in initial sequence numbers (derived from time) of 1,182 probes
    ○ Shared pattern in TSval for all three datasets

[Plot: Initial sequence number]

slide-32
SLIDE 32

What do These Patterns Mean?

  • Active probing connections leak shared state
    ○ ISNs, TSval, source ports, ...
  • GFW likely operates only a few physical systems
  • Thousands of IP addresses are controlled by a central source

slide-33
SLIDE 33

How Quickly do Active Probes Show Up?

slide-34
SLIDE 34

How Quickly do Active Probes Show Up?

  • Sybil dataset shows that the system now works in real time
    ○ Median delay between Tor connection and subsequent probing connection is ~500ms
    ○ 1,182 distinct probes showed up in 22 hours
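A sketch of how a median delay like this could be computed once each Tor connection has been paired with the probe that followed it. The pairing step and the name `median_probe_delay` are assumptions for illustration; the talk does not show the actual analysis code.

```python
from statistics import median
from typing import Iterable, Tuple


def median_probe_delay(pairs: Iterable[Tuple[float, float]]) -> float:
    """Median of (probe_time - connection_time), in seconds.

    Each pair is (connection_time, probe_time) as Unix timestamps.
    Pairs where the probe precedes the connection are skipped, since
    such a probe cannot have been triggered by that connection.
    """
    delays = [probe - conn for conn, probe in pairs if probe >= conn]
    if not delays:
        raise ValueError("no probe followed a connection")
    return median(delays)
```

The median is used rather than the mean so that a few late probes do not dominate the estimate.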

slide-35
SLIDE 35

Is Active Probing Successful?

slide-36
SLIDE 36

Is Active Probing Successful?

slide-37
SLIDE 37
Is Active Probing Successful?

  • Tor clients succeed in connecting roughly every 25 hours
    ○ Might reflect an implementation artifact of the GFW

slide-38
SLIDE 38
Is Active Probing Successful?

  • Tor clients succeed in connecting roughly every 25 hours
    ○ Might reflect an implementation artifact of the GFW
  • obfs2 and obfs3 (~98%) were almost always reachable for clients
    ○ Surprising because GFW can probe and block obfs2 and obfs3

slide-39
SLIDE 39

Takeaway messages

Our results show that the active probing system:

  • Makes use of a large number of IP addresses, clearly centrally controlled
    ○ We cannot just blacklist probers’ IP addresses
  • Operates in real time
  • Probes vanilla, obfs2, and obfs3 bridges

Tor’s pluggable transports led to GFW’s “pluggable censorship”

slide-40
SLIDE 40

Q&A

  • Project page: https://nymity.ch/active-probing/
  • Log and Sybil data sets are available online
  • Contact: rensafi@cs.princeton.edu

slide-41
SLIDE 41

What Are the Characteristics of the Probing System?

  • Sensor responsible for triggering probes operates single-sidedly:
    ○ Client-to-server packets alone (SYN, followed by ACK, then Tor’s TLS client hello) => trigger probe
  • The sensor does not seem to robustly reassemble TCP:
    ○ Fragmented data did not trigger an active probe, which differs from the GFW
  • Traceroute to the sensors suggested:
    ○ Unicom’s sensor appears to operate on the same link as the GFW
    ○ CERNET’s sensor appears one hop closer to our server