NANO: Network Access Neutrality Observatory (PowerPoint PPT presentation)

Mukarram Bin Tariq, Murtaza Motiwala, Nick Feamster, Mostafa Ammar, Georgia Tech
Tuesday, October 7, 2008


slide-1
SLIDE 1

NANO: Network Access Neutrality Observatory

Mukarram Bin Tariq, Murtaza Motiwala, Nick Feamster, Mostafa Ammar Georgia Tech

1 Tuesday, October 7, 2008

slide-2
SLIDE 2

Net Neutrality

slide-6
SLIDE 6

Example: BitTorrent Blocking

http://broadband.mpi-sws.mpg.de/transparency

slide-8
SLIDE 8

Many Forms of Discrimination

Throttling and prioritizing based on destination or service: target domains, applications, or content

Discriminatory peering: resist peering with certain content providers, ...

slide-11
SLIDE 11

Problem Statement

Identify whether a degradation in a service's performance is caused by discrimination by an ISP, and quantify the causal effect.

Existing techniques detect specific ISP methods: TCP RST injection (Glasnost), ToS-bit-based de-prioritization (NVLens).

Goal: establish a causal relationship in the general case, without assuming anything about the ISP's methods.

slide-12
SLIDE 12

Causality: An Analogy from Health

  • Epidemiology: studies causal relationships between risk factors and health outcomes
  • NANO: infers the causal relationship between an ISP and service performance

slide-16
SLIDE 16

Does Aspirin Make You Healthy?

Sample of patients; positive correlation between health and treatment. Can we say that Aspirin causes better health?

Confounding variables correlate with both the cause and the outcome variables and confuse the causal inference; here, sleep duration, diet, other drugs, and gender may all influence both Aspirin use and health.

               Aspirin   No Aspirin
  Healthy        40%        15%
  Not Healthy    10%        35%
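The association can be computed straight from the slide's 2x2 table. A minimal sketch (the percentages are the slide's joint distribution; the variable and function names are mine):

```python
# Joint distribution from the slide's 2x2 table (fractions of all patients).
joint = {
    ("aspirin", "healthy"): 0.40,
    ("no_aspirin", "healthy"): 0.15,
    ("aspirin", "not_healthy"): 0.10,
    ("no_aspirin", "not_healthy"): 0.35,
}

def p_healthy_given(treatment):
    """P(healthy | treatment) = joint probability / marginal of the treatment."""
    marginal = joint[(treatment, "healthy")] + joint[(treatment, "not_healthy")]
    return joint[(treatment, "healthy")] / marginal

# Association: difference in conditional probabilities of being healthy.
alpha = p_healthy_given("aspirin") - p_healthy_given("no_aspirin")
print(round(alpha, 2))  # 0.8 - 0.3 = 0.5
```

A positive α like this shows correlation only; whether it reflects causation is exactly the question the confounders raise.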

slide-19
SLIDE 19

Does an ISP Cause Service Degradation?

Sample of client performances; some correlation between ISP and service performance. Can we say that Comcast is discriminating? Many confounding variables (client setup, time of day, content location) can confuse the inference.

                             Comcast   No Comcast
  BitTorrent download time    5 sec      2 sec

slide-24
SLIDE 24

Causation vs. Association (1)

Baseline Performance Performance with the ISP Causal Effect = E(Real Download time using Comcast) E(Real Download time not using Comcast)

θ = E(G1) − E(G0)

G1, G0: Ground-truth values for performance (aka. Counter-factual values) Problem: Generally, we do not observe both ground truth values for the same clients. Consequently, in situ data sets are not sufficient to directly estimate causal effect.

9 Tuesday, October 7, 2008

slide-31
SLIDE 31

Causation vs. Association (2)

α = E(Y | X = 1) − E(Y | X = 0)

We can observe association in an in situ data set: the observed performance with the ISP versus the observed baseline performance.

In general, α ≠ θ. How do we estimate the causal effect θ?
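The gap between α and θ is easy to reproduce on toy data. In this hypothetical sample (the records and values are made up for illustration), the treatment never changes the outcome, so the true causal effect θ is 0, yet a confounder that drives both treatment and outcome produces a positive association:

```python
# Hypothetical records: (z = confounder, x = treatment, y = outcome).
# y depends only on z, so the true causal effect of x is 0.
data = [
    (1, 1, 5.0), (1, 1, 5.0), (1, 0, 5.0),
    (0, 1, 2.0), (0, 0, 2.0), (0, 0, 2.0),
]

def mean(values):
    return sum(values) / len(values)

# Naive association: compare treated vs. untreated, ignoring z.
alpha = mean([y for z, x, y in data if x == 1]) - \
        mean([y for z, x, y in data if x == 0])
print(alpha)  # 1.0: pure confounding, since x never changes y
```

The treated group happens to contain more high-z records, so α = 1.0 even though θ = 0.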

slide-32
SLIDE 32

Estimating the Causal Effect

Two common approaches:

  • a. Random treatment
  • b. Adjusting for confounding variables

slide-37
SLIDE 37

Random Treatment

Given a population:

  • 1. Treat subjects with Aspirin randomly, irrespective of their health
  • 2. Observe the new outcome and measure the association: α = 0.8 − 0.25 = 0.55
  • 3. For large samples, the association converges to the causal effect θ if the confounding variables (diet, other drugs, etc.) do not change

[Figure: Aspirin-treated group, 4 of 5 healthy (0.8); untreated group, 1 of 4 healthy (0.25)]
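The slide's arithmetic follows directly from the two groups shown. A sketch, encoding the group compositions from the figure:

```python
# Outcomes after random treatment, as shown on the slide:
# True = healthy (H), False = not healthy (!H).
treated = [False, True, True, True, True]   # 4 of 5 healthy
untreated = [False, False, False, True]     # 1 of 4 healthy

def healthy_rate(group):
    return sum(group) / len(group)

# With random assignment (and stable confounders), this association
# converges to the causal effect theta as the sample grows.
alpha = healthy_rate(treated) - healthy_rate(untreated)
print(round(alpha, 2))  # 0.8 - 0.25 = 0.55
```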

slide-40
SLIDE 40

Random Treatment

(How to apply to the ISP Case?)

  • Ask clients to change their ISP to an arbitrary one
  • Difficult to achieve on the Internet: changing ISPs is cumbersome for users, and changing ISPs may change other confounding variables, i.e., the ISP network

slide-46
SLIDE 46

Adjusting for Confounding Variables

  • 1. List confounders (e.g., gender)
  • 2. Collect a data set (an in situ data set)
  • 3. Stratify along the confounding variables' values
  • 4. Measure the association within each stratum
  • 5. If there still is association, then it must be causation

[Figure: population stratified by type (circle vs. square); legend: border = baseline, no border = treated; outcome: healthy (H) vs. not healthy (!H); per-stratum effects combine into an overall effect]
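Steps 3 to 5 can be sketched directly: stratify on the confounder, measure the association inside each stratum, and combine the per-stratum effects. The records below are hypothetical, not the slide's actual circle/square sample:

```python
from collections import defaultdict

# Hypothetical in situ records: (stratum, treated?, healthy?).
records = [
    ("circle", True, True), ("circle", True, True), ("circle", True, False),
    ("circle", False, True), ("circle", False, False),
    ("square", True, False), ("square", True, True),
    ("square", False, False), ("square", False, False),
]

def stratified_effect(records):
    # Group outcomes by stratum and treatment status.
    strata = defaultdict(lambda: {True: [], False: []})
    for stratum, treated, healthy in records:
        strata[stratum][treated].append(healthy)
    rate = lambda group: sum(group) / len(group)
    # Per-stratum association: healthy rate treated minus untreated.
    effects = {s: rate(g[True]) - rate(g[False]) for s, g in strata.items()}
    # Combine per-stratum effects, weighted by stratum size.
    total = len(records)
    overall = sum(e * (len(strata[s][True]) + len(strata[s][False])) / total
                  for s, e in effects.items())
    return overall, effects

theta_hat, per_stratum = stratified_effect(records)
```

Any association that survives within a stratum, where the confounder is held fixed, is attributed to the treatment.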

slide-52
SLIDE 52

Adjusting for Confounding

(How to Apply to the ISP Case?)

Challenges:

  • What is the baseline?
  • What are the confounding variables?
  • Is the list of confounders sufficient?
  • How to collect the data?
  • Can we infer more than the effect, e.g., the discrimination criteria?

slide-59
SLIDE 59

What is the Baseline?

Baseline: service performance when the ISP is NOT used. We need to use some ISP for the comparison; what if the one we use is not neutral?

Solutions:

  • a. Use the average performance over all other ISPs
  • b. Use a lab model
  • c. Use the service providers' model

slide-63
SLIDE 63

Determine Confounding Variables

Using domain knowledge:

  • Client side: client setup (network setup), application (browser, BT client, VoIP client), resources (memory, CPU, utilization)
  • ISP related: not all ISPs are equal; e.g., location
  • Temporal: diurnal cycles, transient failures

slide-67
SLIDE 67

Inferring the Criteria

Label data in two classes: discriminated (−) and non-discriminated (+). Train a decision tree for classification; the resulting rules provide hints about the discrimination criteria.

[Figure: decision tree splitting on Domain (youtube.com) and Content Size (1 MB), with + and − labeled leaves]
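A sketch of the idea with a single-split rule learner standing in for a full decision tree; the flows, domains, and sizes are made up, not NANO's data:

```python
# Hypothetical labeled flows: (domain, content_size_mb, discriminated?).
flows = [
    ("youtube.com", 5.0, True), ("youtube.com", 3.0, True),
    ("youtube.com", 0.5, False), ("example.com", 5.0, False),
    ("example.com", 0.5, False), ("example.com", 2.0, False),
]

def best_rule(flows):
    """Try simple candidate rules of the form "domain matches AND size >=
    threshold"; keep the most accurate one. The winning rule hints at the
    ISP's discrimination criteria."""
    candidates = []
    for domain in {f[0] for f in flows}:
        for threshold in {f[1] for f in flows}:
            correct = sum(
                ((d == domain and s >= threshold) == label)
                for d, s, label in flows)
            candidates.append((correct / len(flows), (domain, threshold)))
    return max(candidates)

accuracy, (domain, threshold) = best_rule(flows)
```

Here the learned rule recovers the planted criterion: large objects from youtube.com are the discriminated class. A real decision tree generalizes this to many attributes and recursive splits.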

slide-71
SLIDE 71

A Simple Simulation

  • Clients use two applications, App1 and App2, to access services
  • Service 1 is slower using App2
  • App is confounding
  • ISPB throttles access to Service 1

Association:
                 Service 1     Service 2
  Baseline         7.68          2.67
  ISPB             8.60          2.70
  Association      0.92 (10%)    0.04 (1%)

Causation:
                 Service 1                  Service 2
                 App1         App2          App1        App2
  Baseline        9.90         2.77          2.61        2.59
  ISPB           11.95         7.95          2.67        2.67
  Causation       2.05 (20%)   5.18 (187%)   0.06 (2%)   0.12 (4%)
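The two tables make the point numerically: the unstratified association for Service 1 (0.92, about 10%) badly understates the effect that appears once the confounding App variable is held fixed. Recomputing the stratified effects from the slide's numbers:

```python
# Mean Service 1 download times from the causation table,
# stratified by the confounding App variable.
baseline = {"App1": 9.90, "App2": 2.77}
isp_b = {"App1": 11.95, "App2": 7.95}

# Per-stratum causal effect: time with ISPB minus baseline time.
effects = {app: isp_b[app] - baseline[app] for app in baseline}
relative = {app: effects[app] / baseline[app] for app in baseline}
# App1: 2.05 absolute (~20%); App2: 5.18 absolute (~187%) --
# both far larger than the unstratified association of 0.92.
```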

slide-77
SLIDE 77

Conclusions

NANO: a black-box approach to infer and quantify discrimination; generically applicable.

Ongoing work, open issues:

  • Privacy: can we do local inference?
  • Deployment: PlanetLab, CPR, real users
  • How much data? Depends on the variance

Contact: mmt@gatech.edu

slide-78
SLIDE 78

Backup

slide-79
SLIDE 79

Variables

  • Causal variables (X): the brand of an ISP; IP address to ISP name mapping
  • Outcome variables (Y): identify the service; the performance of a service (throughput, delay, jitter, loss)
  • Confounding variables (Z): those that correlate with both

slide-81
SLIDE 81

Adjusting for Confounding Variables

  • Treatment (whether using ISP i): Xi
  • Outcome (performance of service j): Yj

Causal effect, summed over strata, where B(z) denotes the boundaries of a stratum z with all things equal:

θij = Σz θij(z)
θij(z) = θij(1; z) − θij(0; z)
θij(x; z) = E(Yj | Xi = x, Z = B(z))

slide-82
SLIDE 82

Sufficient Confounders?

If we have enough variables, then we should be able to predict the performance. Suppose: confounding variables (Z), treatment variable (X), outcome (y).

Predict: ŷ = f(X; Z)
Test for the prediction error: |y − ŷ| / y
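A sketch of this sufficiency check, with a made-up linear model standing in for f and made-up held-out observations:

```python
# Hypothetical model f(X; Z) fitted on confounders plus treatment,
# standing in for whatever regression NANO would actually use.
def f(x, z):
    return 2.0 + 3.0 * z + 1.0 * x

# Hypothetical held-out observations: (x, z, y).
observations = [(1, 1.0, 6.2), (0, 2.0, 7.9), (1, 0.5, 4.4)]

# Relative prediction error |y - y_hat| / y for each observation.
errors = [abs(y - f(x, z)) / y for x, z, y in observations]
mean_error = sum(errors) / len(errors)
# A persistently large error suggests the confounder list is
# insufficient: some variable affecting y is missing.
```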

slide-83
SLIDE 83

Incentivizing

  • Useful diagnostics get a piggybacked ride: NANO will identify transient failures in the elimination process, and a useful diagnostics dashboard for the users

slide-84
SLIDE 84

P2P-ized Architecture

  • Coordination to speed up inference and active learning