Drongo: Speeding Up CDNs with Subnet Assimilation from the Client

CoNEXT ’17, Incheon, South Korea, CDN & Caching Session. Authors: Marc Anthony Warrior, Uri Klarman, Marcel Flores, Aleksandar Kuzmanovic.


slide-1
SLIDE 1

Drongo

Speeding Up CDNs with Subnet Assimilation from the Client

Authors: Marc Anthony Warrior, Uri Klarman, Marcel Flores, Aleksandar Kuzmanovic

CoNEXT ’17, Incheon, South Korea

CDN & Caching Session

slide-2
SLIDE 2

Bird’s Eye View

  • What is Drongo?
  • Why we need Drongo
  • Performance Analysis
  • Thoughts & Conclusions
  • Questions

2

slide-3
SLIDE 3

What is Drongo?

3

slide-4
SLIDE 4

What is Drongo?

It’s a bird!

4

slide-5
SLIDE 5

What is Drongo?

It’s a bird!

5

slide-6
SLIDE 6

What is Drongo?

It’s a bird!

6

slide-7
SLIDE 7

What is Drongo?

It’s a system that allows end-users to enhance the QoS (quality of service) they get from Content Distribution Networks (CDNs)

7

slide-8
SLIDE 8

What is Drongo?

It’s a system that allows end-users to enhance the QoS (quality of service) they get from Content Distribution Networks (CDNs) (in this talk, QoS = latency)

8

slide-9
SLIDE 9

Why Latency?

9

slide-10
SLIDE 10

Why Latency?

  • Latency is time

10

slide-11
SLIDE 11

Why Latency?

  • Latency is time
  • Latency is money

○ Google (Marissa Mayer), Amazon (Greg Linden)
  ■ Web 2.0 Summit, glinden.blogspot.com

11

slide-12
SLIDE 12

Why Latency?

  • Latency is time
  • Latency is money

○ Google (Marissa Mayer), Amazon (Greg Linden)
  ■ Web 2.0 Summit, glinden.blogspot.com

  • Latency is the bottom line

○ “What we have found running our applications at Google is that latency is as important, or more important, for our applications than relative bandwidth,” Amin Vahdat (Google)

12

slide-13
SLIDE 13

Drongo helps you (the end user) lower your own latency!

13

slide-14
SLIDE 14

14

Drongo’s Effect on Latency

[Figure, per provider: Google, Amazon, Alibaba, CDNetworks, ChinaNetCtr, CubeCDN]

slide-15
SLIDE 15

15

Drongo’s Effect on Latency

[Figure, per provider: Google, Amazon, Alibaba, CDNetworks, ChinaNetCtr, CubeCDN]

ONLY client-side changes

slide-16
SLIDE 16

Example Scenario

16

slide-17
SLIDE 17

Provider wants to serve client

17

slide-18
SLIDE 18

Client is far

18

slide-19
SLIDE 19

CDN = more replica locations

19

slide-20
SLIDE 20

DNS Redirection

Which replica serves the client?

20

slide-21
SLIDE 21

Choose the “closest” server

21

slide-22
SLIDE 22

Choose the “closest” server

This choice is nontrivial!

22

slide-23
SLIDE 23

Often Suboptimal Choices!

23

slide-24
SLIDE 24

24

Maybe just a far LDNS...

[Chen - SIGCOMM ’15; Huang - SIGCOMM CCR ’12; Alzoubi - WWW ’13; Rula - SIGCOMM ’14 …]

slide-25
SLIDE 25

Ordinary DNS Query

25

DNS Query: LDNS IP

LDNS IP: somewhere in California

slide-26
SLIDE 26

EDNS0 Client-Subnet extension (ECS)

26

DNS Query: LDNS IP + Client Subnet

LDNS IP: somewhere in California

Client Subnet: actually somewhere in New York
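
As a rough illustration of what the Client-Subnet option adds to a query, the sketch below encodes an ECS option in the wire layout described by RFC 7871 (option code 8, address family, source prefix length, scope prefix length, truncated address). The function name `encode_ecs_option` and the example subnet are illustrative assumptions, not part of Drongo.

```python
import ipaddress
import struct

ECS_OPTION_CODE = 8  # EDNS0 option code assigned to Client Subnet (RFC 7871)


def encode_ecs_option(subnet: str) -> bytes:
    """Encode an EDNS0 Client-Subnet option for a subnet such as '192.0.2.0/24'."""
    network = ipaddress.ip_network(subnet, strict=False)
    family = 1 if network.version == 4 else 2  # 1 = IPv4, 2 = IPv6
    prefix_len = network.prefixlen
    # Only the bytes covered by the prefix are sent; host bits are already zeroed.
    addr_bytes = network.network_address.packed[: (prefix_len + 7) // 8]
    # FAMILY (2 bytes), SOURCE PREFIX-LENGTH (1), SCOPE PREFIX-LENGTH (1, 0 in queries)
    payload = struct.pack("!HBB", family, prefix_len, 0) + addr_bytes
    # OPTION-CODE (2 bytes), OPTION-LENGTH (2 bytes), then the payload
    return struct.pack("!HH", ECS_OPTION_CODE, len(payload)) + payload
```

Subnet assimilation amounts to passing a subnet other than the client's own to an encoder like this one; the rest of the DNS query is unchanged.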

slide-27
SLIDE 27

27

(ECS User)

We used ECS:

slide-28
SLIDE 28

This still happens

28

(ECS User)

We used ECS:

slide-29
SLIDE 29

This still happens

29

… frequently

(ECS User)

We used ECS:

slide-30
SLIDE 30

Really? ...

30

slide-31
SLIDE 31

Really? ... YES! We measured it!

31

slide-32
SLIDE 32

How did we measure it?

32

slide-33
SLIDE 33

How did we measure it?

33

Find subnets directed to different replicas

slide-34
SLIDE 34

Subnet Assimilation

34

DNS Query LDNS IP Client Subnet

slide-35
SLIDE 35

Subnet Assimilation

35

DNS Query LDNS IP Client Subnet Other Subnet

slide-36
SLIDE 36

How did we measure it?

36

Find subnets directed to different replicas

Perform pings and downloads to each replica

slide-37
SLIDE 37

How did we measure it?

37

Find subnets directed to different replicas

Perform pings and downloads to each replica

Identify which subnet resulted in the “best” replica

slide-38
SLIDE 38
1. Get “Default” Choice

38

(use client’s own subnet for ECS)

slide-39
SLIDE 39
2. Traceroute to default choice

39

slide-40
SLIDE 40
3. Get Hop Subnet Choices

40

(use hops’ subnets for ECS)

slide-41
SLIDE 41
4. Measure Latencies

41

slide-42
SLIDE 42
4. Measure Latencies

42

Steps 1-4: a “trial”
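
Steps 1-4 can be sketched as a single function. This is a minimal sketch, not the authors' implementation: the three callables (`resolve_with_ecs`, `traceroute_subnets`, `measure_rtt_ms`) are hypothetical stand-ins for a DNS resolver that accepts an ECS subnet, a traceroute that returns the hops' subnets, and a ping or download timer.

```python
from typing import Callable, Dict, List


def run_trial(
    domain: str,
    client_subnet: str,
    resolve_with_ecs: Callable[[str, str], str],
    traceroute_subnets: Callable[[str], List[str]],
    measure_rtt_ms: Callable[[str], float],
) -> Dict[str, Dict[str, object]]:
    """Run steps 1-4 once and return, per subnet, the replica chosen and its RTT."""
    # Step 1: the "default" choice uses the client's own subnet for ECS.
    default_replica = resolve_with_ecs(domain, client_subnet)
    # Step 2: traceroute to the default replica and collect the hops' subnets.
    hop_subnets = traceroute_subnets(default_replica)
    # Step 3: ask which replica each hop subnet would be directed to.
    choices = {client_subnet: default_replica}
    for subnet in hop_subnets:
        choices[subnet] = resolve_with_ecs(domain, subnet)
    # Step 4: measure latency from the client to every replica that was returned.
    rtt_cache: Dict[str, float] = {}
    results: Dict[str, Dict[str, object]] = {}
    for subnet, replica in choices.items():
        if replica not in rtt_cache:
            rtt_cache[replica] = measure_rtt_ms(replica)
        results[subnet] = {"replica": replica, "rtt_ms": rtt_cache[replica]}
    return results
```

Passing the network operations in as parameters keeps the sketch runnable with fakes; a real client would plug in actual DNS, traceroute, and ping calls.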

slide-43
SLIDE 43

Latency Ratio

43

1 0.6 1.4

Normalize to default choice’s RTT
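
The normalization can be illustrated with a few lines of Python. The RTT values below are hypothetical, chosen only so that the ratios come out to the 1, 0.6, and 1.4 shown on the slide; the names `latency_ratios`, `hop_a`, and `hop_b` are likewise illustrative. A ratio below 1 means a hop subnet's replica was faster than the default choice.

```python
from typing import Dict


def latency_ratios(rtt_ms: Dict[str, float], default_key: str) -> Dict[str, float]:
    """Normalize each replica's RTT to the RTT of the default choice."""
    default_rtt = rtt_ms[default_key]
    if default_rtt <= 0:
        raise ValueError("default RTT must be positive")
    return {key: rtt / default_rtt for key, rtt in rtt_ms.items()}


# Hypothetical RTTs (ms) that reproduce the ratios 1, 0.6, 1.4 from the slide.
ratios = latency_ratios({"default": 50.0, "hop_a": 30.0, "hop_b": 70.0}, "default")
# Entries with a ratio below 1 beat the default choice.
better_than_default = [key for key, ratio in ratios.items() if ratio < 1.0]
```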

slide-44
SLIDE 44

We’re looking for this

44

1 0.6 1.4

slide-45
SLIDE 45

Valley = better choice from hop subnet

[Figure legend: RTT from client to replica (scale 0 ms to 100 ms); traceroute; replica choice for subnet]

45

slide-46
SLIDE 46

Valley = better choice from hop subnet

[Figure legend: RTT from client to replica (scale 0 ms to 100 ms); traceroute; replica choice for subnet]

46

slide-47
SLIDE 47

PlanetLab Sees Valleys!

47

slide-48
SLIDE 48

PlanetLab Sees Valleys!

48

slide-49
SLIDE 49

PlanetLab Sees Valleys!

  • Google: 20.24%
  • Amazon: 14.02%
  • Alibaba: 33.68%
  • CDNetworks: 15.61%
  • ChinaNetCenter: 27.42%
  • CubeCDN: 38.58%

Room for improvement!

49

slide-50
SLIDE 50

5.

50

slide-51
SLIDE 51
5. Use best subnet for ECS

51

slide-52
SLIDE 52
5. Use best subnet for ECS

52

Get best mapping!
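
Step 5 can be sketched as picking the subnet with the lowest latency ratio from a trial. This is only an illustration under assumptions: the function name and the rule of keeping the client's own subnet on ties are not taken from the talk.

```python
from typing import Dict


def best_subnet_for_ecs(ratios_by_subnet: Dict[str, float], client_subnet: str) -> str:
    """Return the subnet whose replica had the lowest latency ratio in this trial."""
    best = min(ratios_by_subnet, key=ratios_by_subnet.get)
    # Keep the client's own subnet unless some hop subnet is strictly better.
    if ratios_by_subnet[best] < ratios_by_subnet.get(client_subnet, 1.0):
        return best
    return client_subnet
```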

slide-53
SLIDE 53

Are Valleys Predictable?

  • Trials are not “fast”

53

slide-54
SLIDE 54

Are Valleys Predictable?

  • Trials are not “fast”
  • We want valleys “on the fly”

54

slide-55
SLIDE 55

Are Valleys Predictable?

  • Trials are not “fast”
  • We want valleys “on the fly”
  • We need to find valley-prone subnets

55

slide-56
SLIDE 56

Testing Persistence

[Timeline ticks: 5, 10, 15, 20]

56

consecutive trials

slide-57
SLIDE 57

Testing Persistence

[Timeline ticks: 5, 10, 15, 20]

Trial A vs. Trial B

57

slide-58
SLIDE 58

Latency Ratio Difference Over Time

58

Latency Ratio = (hop replica RTT) / (default replica RTT)

slide-59
SLIDE 59

Testing Persistence

[Timeline ticks: 5, 10, 15, 20]

Window A vs. Window B

59

slide-60
SLIDE 60

Testing Persistence

[Timeline ticks: 5, 10, 15, 20]

Window A vs. Window C

60

slide-61
SLIDE 61

Testing Persistence

[Timeline ticks: 5, 10, 15, 20]

Window A vs. Window C

61

15 hours

slide-62
SLIDE 62

Latency Ratio Difference Over Time

62

Latency Ratio = (hop replica RTT) / (default replica RTT)

slide-63
SLIDE 63

Latency Ratio Difference Over Time

63

Latency Ratio = (hop replica RTT) / (default replica RTT)

slide-64
SLIDE 64

Latency Ratio Difference Over Time

64

Latency Ratio = (hop replica RTT) / (default replica RTT)

slide-65
SLIDE 65

Latency Ratio Difference Over Time

65

Latency Ratio = (hop replica RTT) / (default replica RTT)

SURPRISE! The Internet is crazy!

slide-66
SLIDE 66

66

Filter: at least one valley

Subnet A: {0,0,0,0,0,V,0,0,0,0,0,0,V}
Subnet B: {0,0,0,0,0,0,0,0,0,0,0,0,0}
Subnet C: {V,V,V,V,0,0,0,0,V,V,V,0,V}
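
The filter can be expressed directly in code. The sketch below copies the three per-subnet sequences from the slide ("V" marks a trial with a valley, 0 a trial without) and keeps only subnets with at least one valley; the function and variable names are illustrative.

```python
from typing import Dict, List

V = "V"  # marker for a trial in which the subnet produced a valley

# Trial outcomes copied from the slide.
observations: Dict[str, List[object]] = {
    "Subnet A": [0, 0, 0, 0, 0, V, 0, 0, 0, 0, 0, 0, V],
    "Subnet B": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "Subnet C": [V, V, V, V, 0, 0, 0, 0, V, V, V, 0, V],
}


def with_at_least_one_valley(obs: Dict[str, List[object]]) -> Dict[str, List[object]]:
    """Drop subnets that never produced a valley in any trial."""
    return {subnet: trials for subnet, trials in obs.items() if V in trials}


kept = with_at_least_one_valley(observations)
```

Subnet B never shows a valley, so it is the only one of the three that the filter removes.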

slide-67
SLIDE 67

67

Filter: at least one valley

Subnet A: {0,0,0,0,0,V,0,0,0,0,0,0,V}
Subnet B: {0,0,0,0,0,0,0,0,0,0,0,0,0}
Subnet C: {V,V,V,V,0,0,0,0,V,V,V,0,V}

slide-68
SLIDE 68

68

Filter: at least one valley

Latency Ratio = (hop replica RTT) / (default replica RTT)

slide-69
SLIDE 69

69

Filter: at least one valley

Latency Ratio = (hop replica RTT) / (default replica RTT)

very flat

slide-70
SLIDE 70

70

Filter: at least one valley

Latency Ratio = (hop replica RTT) / (default replica RTT)

very flat Close to zero

slide-71
SLIDE 71

71

slide-72
SLIDE 72

72

slide-73
SLIDE 73

73

Parameter Exploration

slide-74
SLIDE 74

How deep are the valleys from useful subnets?

74

Vthresh =

slide-75
SLIDE 75

Vthresh

75

[Figure: latency ratios (1, 0.6, 0.9) for replicas A, B, C]

slide-76
SLIDE 76

76

Vthresh

[Figure: latency ratios (1, 0.6, 0.9) for replicas A, B, C]

slide-77
SLIDE 77

How often do valleys occur in useful subnets?

77

Vfreq =

slide-78
SLIDE 78

78

TRAINING WINDOW

slide-79
SLIDE 79

TRIALS

79

slide-80
SLIDE 80

80

Vfreq = 2/5

slide-81
SLIDE 81

81

Vfreq = 2/5

Valley-Prone Subnet

slide-82
SLIDE 82

82

Vfreq = 2/5

Valley-Prone Subnet

slide-83
SLIDE 83

83

Vfreq = 2/5

NOT Valley-Prone Subnet
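
One way to read the two parameters: Vthresh decides whether a trial's latency ratio counts as a deep-enough valley, and Vfreq is the fraction of trials in the training window that must be valleys for a subnet to be called valley-prone. The sketch below follows that reading. Whether the comparisons are strict or inclusive is an assumption here, and the latency ratios and the 0.9 threshold are hypothetical.

```python
from typing import Sequence


def valley_frequency(ratios: Sequence[float], v_thresh: float) -> float:
    """Fraction of trials in the training window with a latency ratio at or below v_thresh."""
    if not ratios:
        raise ValueError("training window is empty")
    valleys = sum(1 for r in ratios if r <= v_thresh)
    return valleys / len(ratios)


def is_valley_prone(ratios: Sequence[float], v_thresh: float, v_freq: float) -> bool:
    """A subnet is valley-prone if valleys occur at least v_freq of the time."""
    return valley_frequency(ratios, v_thresh) >= v_freq


# Hypothetical five-trial windows, checked against Vfreq = 2/5.
prone = is_valley_prone([1.0, 0.8, 1.1, 0.7, 1.0], v_thresh=0.9, v_freq=2 / 5)
not_prone = is_valley_prone([1.0, 0.8, 1.1, 1.0, 1.0], v_thresh=0.9, v_freq=2 / 5)
```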

slide-84
SLIDE 84

Overview of Drongo:

  1. Collect training window

84

slide-85
SLIDE 85

Overview of Drongo:

  1. Collect training window
  2. Count the # of sufficiently deep valleys

85

slide-86
SLIDE 86

Overview of Drongo:

  1. Collect training window
  2. Count the # of sufficiently deep valleys
  3. Apply subnet assimilation if:
     a. Training window is already complete
     b. Both parameters met
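
The overview can be sketched as a single decision function. This is a minimal sketch under stated assumptions, not the authors' code: `training_window` maps each hop subnet to its recorded latency ratios, `window_size` is a hypothetical parameter for how many trials make a complete window, and preferring the subnet with the highest valley frequency when several qualify is an assumed tie-breaking rule.

```python
from typing import Dict, Optional, Sequence


def choose_ecs_subnet(
    training_window: Dict[str, Sequence[float]],
    window_size: int,
    v_thresh: float,
    v_freq: float,
) -> Optional[str]:
    """Return a hop subnet to assimilate, or None to keep the client's own subnet."""
    best_subnet, best_freq = None, 0.0
    for subnet, ratios in training_window.items():
        # 3a: only act on subnets whose training window is complete.
        if len(ratios) < window_size:
            continue
        # 2: count the sufficiently deep valleys in the most recent window.
        recent = ratios[-window_size:]
        freq = sum(1 for r in recent if r <= v_thresh) / window_size
        # 3b: both parameters met (depth via v_thresh, frequency via v_freq).
        if freq >= v_freq and freq > best_freq:
            best_subnet, best_freq = subnet, freq
    return best_subnet
```

Returning None leaves the client on its default mapping, so a subnet that fails either condition changes nothing.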

86

slide-87
SLIDE 87

System Wide Performance

87

slide-88
SLIDE 88

System Wide Performance

88

slide-89
SLIDE 89

System Wide Performance

89

better

slide-90
SLIDE 90

System Wide Performance

90

better

slide-91
SLIDE 91

System Wide Performance

91

better

slide-92
SLIDE 92

System Wide Performance

92

better

slide-93
SLIDE 93

System Wide Performance

93

better

slide-94
SLIDE 94

System Wide Performance

94

better

Vfreq = 1.0

slide-95
SLIDE 95

System Wide Performance

95

better

Vfreq = 1.0, Vthresh = 0.95

slide-96
SLIDE 96

96

Switch Quality

[Figure, per provider (Google, Amazon, Alibaba, CDNetworks, ChinaNetCtr, CubeCDN): Global Params vs. Per Prov. Params]

slide-97
SLIDE 97

Conclusion & Insights

  • CDNs have a lot of room for improvement

97

slide-98
SLIDE 98

Conclusion & Insights

  • CDNs have a lot of room for improvement

  • Clients can help

98

slide-99
SLIDE 99

Conclusion & Insights

  • CDNs have a lot of room for improvement

  • Clients can help
  • Low requirements

99

slide-100
SLIDE 100

Conclusion & Insights

  • CDNs have a lot of room for improvement

  • Clients can help
  • Low requirements
  • Can provide 50% improvement

100

slide-101
SLIDE 101

Questions?

101

slide-102
SLIDE 102

# Clients Affected

better

102

slide-103
SLIDE 103

Per Provider Overall Performance

103

slide-104
SLIDE 104

Performance of Drongo’s choices

104