Anycast for Any Service Michael J. Freedman Karthik - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Anycast for Any Service

Michael J. Freedman

Karthik Lakshminarayanan

David Mazières

http://oasis.coralcdn.org/

slide-2
SLIDE 2

What’s the replica-selection problem?

Client needs to choose a “good” replica server

Performance and cost dependent on replica selection

[Figure: client choosing among mycdn replicas]

slide-3
SLIDE 3

What do we currently do?

slide-4
SLIDE 4

How bad can it get?

slide-5
SLIDE 5

Anycast is the solution

Anycast = automated “good” replica selection

OASIS is a flexible anycast system for multiple services

[Figure: client choosing among mycdn replicas]

slide-6
SLIDE 6

The need for anycast

Internet systems rely on replicated content and services

  • Distributed mirrors: Web servers, FTP servers, …
  • Content Distribution Networks: Akamai, CoralCDN, …
  • Internet Naming Systems: DNS, SFR, DOA, …
  • Distributed File Systems: CFS, Shark, …
  • Routing Overlays: RON, Detour, i3, …
  • Distributed Hash Storage Systems: OpenDHT, …

All could benefit from anycast service

slide-7
SLIDE 7

How should one implement anycast?

slide-8
SLIDE 8

Strawman: probe & find nearest

[Figure: mycdn replicas A, B, C, D, E, I probed via ICMP]

slide-9
SLIDE 9

Strawman: probe & find nearest

[Figure: mycdn replicas A, B, C, D, E, I probed via ICMP; nearest replicas E, I, D]

  • Result highly accurate
  • Lots of probing
  • Slow to compute
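
The strawman can be sketched in a few lines of Python. This is an illustrative sketch only, not OASIS code: `nearest_replicas`, `probe_rtt_ms`, and the RTT values are hypothetical, and a dictionary lookup stands in for real ICMP probes. It shows why the approach is accurate but costly: every request probes every replica.

```python
from typing import Callable, Dict, List


def nearest_replicas(replicas: List[str],
                     probe_rtt_ms: Callable[[str], float],
                     k: int = 3) -> List[str]:
    """Probe every replica and return the k with the lowest RTT."""
    # One probe per replica per request: this is the cost the slide warns about.
    rtts: Dict[str, float] = {r: probe_rtt_ms(r) for r in replicas}
    return sorted(replicas, key=lambda r: rtts[r])[:k]


# Hypothetical RTTs (ms) standing in for ICMP measurements.
FAKE_RTT_MS = {"A": 80.0, "B": 95.0, "C": 120.0, "D": 30.0, "E": 10.0, "I": 20.0}

print(nearest_replicas(list(FAKE_RTT_MS), FAKE_RTT_MS.__getitem__))
```

With these made-up values the three nearest replicas come out as E, I, D, matching the labels in the figure.
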
slide-11
SLIDE 11

Avoid probing on-demand

[Figure: mycdn replicas A, B, C, D, E, I probed via ICMP; nearest replicas E, I, D]

  • Result highly accurate
  • Lots of probing
  • Slow to compute

slide-12
SLIDE 12

Avoid probing on-demand

[Figure: mycdn replicas A, B, C, D, E, I probed via ICMP; nearest replicas E, I, D; client prefix 18.0.0.0/8]

  • Result highly accurate
  • Lots of probing
  • Slow to compute

[IMC05] shows IP prefixes often preserve locality (99% of /24s by stub AS at the same location)

This is the problem Akamai must solve

slide-13
SLIDE 13

What about yourcdn?

[Figure: for prefix 18.0.0.0/8, mycdn’s nearest replicas are E, I, D; yourcdn’s are M, N, O]

  • Result highly accurate
  • Lots of probing
  • Slow to compute
slide-14
SLIDE 14

Idea: Use geographic coordinates

[Figure: prefix 18.0.0.0/8 mapped to (42N,71W), shared by mycdn and yourcdn]

Assume all replicas know geo-coords

  • Result highly accurate
  • Lots of probing
  • Slow to compute
  • Stable across services, time, and failures
  • Amortize costs
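
A minimal sketch of why coordinates amortize cost across services: once a prefix has coordinates, any service can rank its own replicas without new probes. Everything here is an assumption for illustration: the replica names and coordinates are invented, and great-circle (haversine) distance is used as a stand-in for whatever distance metric the real system applies.

```python
import math
from typing import Dict, List, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in degrees; west is negative

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(a: Coord, b: Coord) -> float:
    """Haversine great-circle distance between two coordinates."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def rank_by_distance(client: Coord, replicas: Dict[str, Coord]) -> List[str]:
    """Order a service's replicas from nearest to farthest."""
    return sorted(replicas, key=lambda name: great_circle_km(client, replicas[name]))


# The prefix's coordinates are measured once and shared by every service.
PREFIX_COORDS = {"18.0.0.0/8": (42.0, -71.0)}

# Hypothetical replica locations for two independent services.
MYCDN = {"boston": (42.0, -71.0), "new-york": (41.0, -74.0), "palo-alto": (37.0, -122.0)}
YOURCDN = {"chicago": (42.0, -88.0), "london": (51.5, 0.0)}

client = PREFIX_COORDS["18.0.0.0/8"]
print(rank_by_distance(client, MYCDN))
print(rank_by_distance(client, YOURCDN))
```
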
slide-15
SLIDE 15

OASIS provides…

  • Result highly accurate
  • Stable across time, services, and failures
  • Amortize costs
  • Fast response time
  • Supports flexible anycast policies

Balances tension between:

  • Performance: finding nearest replica
  • Cost: minimizing 95% bandwidth usage

slide-16
SLIDE 16

Outline

  • Architecture and design decisions
  • Detailed design
  • Evaluation
  • Deployment and integration lessons

OASIS deployed since November 2005

Currently in use by 10 services

slide-17
SLIDE 17

Two-tier architecture

Reliable core of hosts that implement anycast

Large set of replicas that assist in measurement

[Figure legend: OASIS core node with DNS and RPC interfaces; mycdn replica with replica proxy]

slide-18
SLIDE 18

Using OASIS via DNS

1. Client issues DNS request for mycdn.nyuld.net
2. OASIS redirects client to nearby application replica

[Figure: client, resolver, OASIS core, and mycdn replicas, with steps 1 and 2 marked]

slide-19
SLIDE 19

Using OASIS via HTTP

1. Client issues HTTP request
2. Web cgi-bin issues RPC to OASIS core
3. Client redirected to nearby application replica

[Figure: client, OASIS core, and mycdn replicas, with steps 1 to 3 marked]

slide-20
SLIDE 20

How does core answer anycast?

[Figure: request (name, IP addr) passes through the service, bucketing, proximity, and replicas tables, via policy, IP prefix, and coords, to produce the response]

slide-21
SLIDE 21

How does core answer anycast?

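[Figure: request (name, IP addr) passes through the service, bucketing, proximity, and replicas tables, via policy, IP prefix, and coords, to produce the response]

Example values from the figure:

  • name: mycdn.nyuld.net (service mycdn)
  • IP addr: 18.71.0.3
  • IP prefix: 18.0.0.0/8
  • replicas: 18.26.4.9, 171.66.3.181, 216.165.109.81
  • response: 18.26.4.9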
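
The bucketing step, which maps a client address to its IP prefix, can be illustrated with a longest-prefix match. This is a hedged sketch using Python's standard `ipaddress` module; the `PrefixTable` class and every prefix other than 18.0.0.0/8 are hypothetical, and the slides do not say how the real table is built or searched.

```python
import ipaddress
from typing import Dict, Optional


class PrefixTable:
    """Maps a client IP address to the longest known prefix that contains it."""

    def __init__(self) -> None:
        self._prefixes: Dict[ipaddress.IPv4Network, str] = {}

    def add(self, prefix: str) -> None:
        self._prefixes[ipaddress.ip_network(prefix)] = prefix

    def bucket(self, client_ip: str) -> Optional[str]:
        addr = ipaddress.ip_address(client_ip)
        matches = [net for net in self._prefixes if addr in net]
        if not matches:
            return None  # unknown prefix: no cached coordinates yet
        # Longest-prefix match: the most specific network wins.
        return self._prefixes[max(matches, key=lambda net: net.prefixlen)]


table = PrefixTable()
table.add("18.0.0.0/8")      # prefix shown on the slide
table.add("18.26.0.0/16")    # hypothetical, more specific prefix
table.add("171.64.0.0/14")   # hypothetical prefix

print(table.bucket("18.71.0.3"))
```

The client address from the slide, 18.71.0.3, falls into 18.0.0.0/8; the prefix is then the key for the proximity lookup.
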

slide-22
SLIDE 22

How to map IP prefix to coords?

proximity table: IP prefix -> (Lat, Lng, RTT distance)

(Lat, Lng) gives the location; RTT distance gives the accuracy

slide-23
SLIDE 23

How to map IP prefix to coords?

Two-pronged approach

  • Find closest replica proxy

[Figure: prefix 18.0.0.0/8; closest replica at (42N,71W)]

proximity table: IP prefix -> (Lat, Lng, RTT distance)

(Lat, Lng) gives the location; RTT distance gives the accuracy

slide-24
SLIDE 24

How to map IP prefix to coords?

Two-pronged approach

  • Find closest replica proxy
  • Use closest replica’s geo-coords + error RTT as location

[Figure: prefix 18.0.0.0/8 recorded as (42N,71W), 6.0 ms, using the closest replica at (42N,71W)]

proximity table: IP prefix -> (Lat, Lng, RTT distance)

(Lat, Lng) gives the location; RTT distance gives the accuracy

slide-25
SLIDE 25

Find replica nearest prefix efficiently

Two-pronged approach

  • Find closest replica proxy with less probing
  • Use closest replica’s geo-coords + error RTT as location

[Figure: “Probe 18.0.0.0/8” request; address 18.168.0.23 in 18.0.0.0/8]

slide-26
SLIDE 26

Find replica nearest prefix efficiently

[Meridian 05]

Two-pronged approach

  • Find closest replica proxy with less probing
  • Use closest replica’s geo-coords + error RTT as location

[Figure: address 18.168.0.23 in 18.0.0.0/8]

slide-27
SLIDE 27

Find replica nearest prefix efficiently

[Meridian 05]

Two-pronged approach

  • Find closest replica proxy with less probing
  • Use closest replica’s geo-coords + error RTT as location

[Figure: address 18.168.0.23; prefix 18.0.0.0/8 recorded as (42N,71W), 6.0 ms]

slide-28
SLIDE 28

Geographic distance vs. RTT

Strong correlation between geographical distance and RTT

slide-29
SLIDE 29

Geographic distance vs. RTT

Strong correlation between geographical distance and RTT

RTT accuracy has real-world meaning

Check if new coordinates improve accuracy vs. old coords

slide-30
SLIDE 30

Geographic distance vs. RTT

Strong correlation between geographical distance and RTT

RTT accuracy has real-world meaning

Check if new coordinates improve accuracy vs. old coords

[Meridian 05]

[Figure: entry 18.0.0.0/8 at (42N,71W), 6.0 ms is replaced by 18.0.0.0/8 at (42N,72W), 3.0 ms]
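
The accuracy check can be sketched as a simple update rule: keep the coordinates measured from the replica with the lowest RTT to the prefix. This is a minimal sketch under assumptions; the `Location` record and `maybe_update` function are hypothetical names, and the slides do not specify the exact comparison the system uses.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Location:
    lat: float
    lng: float
    error_rtt_ms: float  # RTT from the measuring replica to the prefix


def maybe_update(table: Dict[str, Location], prefix: str, candidate: Location) -> bool:
    """Store the candidate only if it is a tighter (lower-RTT) estimate."""
    current = table.get(prefix)
    if current is None or candidate.error_rtt_ms < current.error_rtt_ms:
        table[prefix] = candidate
        return True
    return False


proximity: Dict[str, Location] = {}
maybe_update(proximity, "18.0.0.0/8", Location(42.0, -71.0, 6.0))  # first estimate
maybe_update(proximity, "18.0.0.0/8", Location(42.0, -72.0, 3.0))  # closer replica found
print(proximity["18.0.0.0/8"])
```

The values mirror the slide: the 6.0 ms estimate at (42N,71W) is replaced by the 3.0 ms estimate at (42N,72W), and a later, worse measurement would be ignored.
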

slide-32
SLIDE 32

Geographic distance vs. RTT

Strong correlation between geographical distance and RTT

RTT accuracy has real-world meaning

Check if new coordinates improve accuracy vs. old coords

Useful as a sanity check for network peculiarities

Do multiple results satisfy constraints (e.g., speed of light)?
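
One way to read the speed-of-light constraint: a target measured at a given RTT cannot be farther from the measuring replica than light travels in half that time. The sketch below is an assumption about how such a check could look, not the system's actual test; the function names are hypothetical, and the vacuum speed of light is used as a strict upper bound (signals in fiber are slower).

```python
from typing import Iterable, Tuple

SPEED_OF_LIGHT_KM_PER_MS = 299.792458  # in vacuum; a strict upper bound


def max_distance_km(rtt_ms: float) -> float:
    """Farthest a target can be, given the round-trip time to it."""
    return (rtt_ms / 2.0) * SPEED_OF_LIGHT_KM_PER_MS


def is_plausible(distance_km: float, rtt_ms: float) -> bool:
    """True if the claimed distance does not violate the speed of light."""
    return distance_km <= max_distance_km(rtt_ms)


def all_plausible(measurements: Iterable[Tuple[float, float]]) -> bool:
    """Check several (distance_km, rtt_ms) results against the same constraint."""
    return all(is_plausible(d, r) for d, r in measurements)


print(is_plausible(500.0, 6.0))    # within about 899 km: consistent
print(is_plausible(4000.0, 6.0))   # too far for a 6 ms RTT: suspicious
```
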

slide-33
SLIDE 33

Outline

  • Architecture and design decisions
  • Detailed design
  • Evaluation
  • Deployment and integration lessons

OASIS deployed since November 2005

Currently in use by 10 services

slide-34
SLIDE 34

OASIS core

  • Global membership view
  • Epidemic gossiping
      • Scalable failure detection
      • Spread policies, prefix coords
  • Consistent hashing
      • Divide up responsibility for prefixes

Service replicas

  • Heartbeats to OASIS node
  • Form global Meridian overlay for probing

[Figure: OASIS core with service replicas for mycdn and opendht]

slide-35
SLIDE 35

How to find “nearby” nodes?

[Figure: request (name mycdn.nyuld.net, IP addr) passes through the service, bucketing, proximity, and replicas tables, via policy, IP prefix 18.0.0.0/8, and coords; service mycdn; replicas 18.26.4.9, 171.66.3.181, 216.165.109.81; response 18.26.4.9]

Local info from gossiping (stale data okay)

slide-36
SLIDE 36

How to find “nearby” nodes?

[Figure: request (name mycdn.nyuld.net, IP addr) passes through the service, bucketing, proximity, and replicas tables, via policy, IP prefix 18.0.0.0/8, and coords; service mycdn; replicas 18.26.4.9, 171.66.3.181, 216.165.109.81; response 18.26.4.9]

Local info from gossiping (stale data okay)

Clients react poorly to stale data

slide-37
SLIDE 37

Aggregate replica information

  • Define service’s rendezvous node via consistent hashing
  • Service replicas send keepalives to nearby OASIS nodes
  • Update rendezvous when replicas join, leave, or change load significantly

[Figure: mycdn replicas and OASIS nodes; rendezvous node at H(srv)]

slide-38
SLIDE 38

Aggregate replica information

  • Define service’s rendezvous node via consistent hashing
  • Service replicas send keepalives to nearby OASIS nodes
  • Update rendezvous when replicas join, leave, or change load significantly

Bottleneck?

[Figure: mycdn replicas and OASIS nodes; rendezvous node at H(srv)]

slide-39
SLIDE 39

Aggregate replica information

  • Aggregate over k nodes for scalability
  • Rendezvous nodes gossip liveness state for loose consistency
  • k can be dynamic for better scalability

[Figure: mycdn replicas and OASIS nodes; k rendezvous nodes at H(srv)]
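
Choosing a service's k rendezvous nodes by consistent hashing might look like the following sketch. The details are assumptions: SHA-1 as the hash H, node names as ring positions, and "the k successors of H(srv)" as the selection rule are illustrative choices, not confirmed by the slides.

```python
import bisect
import hashlib
from typing import List


def h(key: str) -> int:
    """Position of a key on the hash ring (SHA-1 is an assumed choice)."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest(), "big")


def rendezvous_nodes(service: str, core_nodes: List[str], k: int = 1) -> List[str]:
    """Return the k core nodes that follow H(service) on the ring."""
    ring = sorted(core_nodes, key=h)
    points = [h(node) for node in ring]
    start = bisect.bisect_right(points, h(service)) % len(ring)  # wrap around
    return [ring[(start + i) % len(ring)] for i in range(min(k, len(ring)))]


# Hypothetical core node names.
CORE = [f"oasis-{i}.example.net" for i in range(8)]
print(rendezvous_nodes("mycdn", CORE, k=3))
```

A useful property of this design is that removing a core node that is not one of the service's rendezvous nodes leaves the service's rendezvous set unchanged, so membership churn only moves the services that were assigned to the departed node.
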

slide-40
SLIDE 40

A client’s view: Finding a nameserver

Core lookup: Contacts 1 of 13 nameservers for .nyuld.net

OASIS “uses itself” to discover replica for service dns

[Figure: client contacting OASIS nodes; rendezvous at H(dns)]

slide-41
SLIDE 41

A client’s view: Finding a nameserver

Core lookup: Contacts 1 of 13 nameservers for .nyuld.net

OASIS “uses itself” to discover replica for service dns

Returns nearby nameservers for subsequent requests

[Figure: client contacting OASIS nodes; rendezvous at H(dns)]

slide-42
SLIDE 42

A client’s view: Finding a replica

Replica lookup: Client contacts nearby nameserver

OASIS discovers replica for service mycdn

Returns nearby replicas for application

[Figure: client, OASIS nodes, and replica R; rendezvous at H(dns) and H(mycdn)]

slide-43
SLIDE 43

Evaluation

Deployed on PlanetLab since November 2005

  • How much end-to-end benefit from OASIS?
  • How accurate is OASIS?
  • Effective for load balancing?
  • What are OASIS’s bandwidth costs?

slide-44
SLIDE 44

E2E download of web page

  • 290% faster than Meridian
  • 500% faster than RR (round-robin)
  • Cached virtual coords highly inaccurate

slide-45
SLIDE 45

Client RTT to chosen replica

Outperforms Meridian 60% of time

slide-46
SLIDE 46

OASIS minimizes bandwidth spikes

95% bandwidth usage per replica (MB)

metric            Germany   NY     TX     CA
Latency Only      0.0       0.0    0.0    23.3
Load + Latency

8 clients in CA repeatedly request 1 MB file

Replicas report load as log (95% bandwidth per 1-min slot)
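
As a rough illustration of the load metric, the sketch below computes a 95th-percentile value over per-slot byte counts and takes its logarithm. The details are assumptions, since the slide gives only the one-line description: the nearest-rank percentile method, the `log(1 + x)` form (used here to avoid `log(0)`), and the function names are all hypothetical.

```python
import math
from typing import List


def percentile_95(samples: List[float]) -> float:
    """Nearest-rank 95th percentile of the samples."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n), 1-based rank
    return ordered[rank - 1]


def reported_load(bytes_per_slot: List[float]) -> float:
    """Load value a replica might report: log of its 95th-percentile usage."""
    return math.log(1.0 + percentile_95(bytes_per_slot))


# Twenty 1-minute slots with one large spike: the spike falls above the
# 95th percentile, so it does not raise the reported load.
slots = [0.0] * 19 + [1_000_000.0]
print(percentile_95(slots), reported_load(slots))
```
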

slide-47
SLIDE 47

OASIS minimizes bandwidth spikes

95% bandwidth usage per replica (MB)

metric            Germany   NY     TX     CA
Latency Only      0.0       0.0    0.0    23.3
Load + Latency    9.2       9.6    11.3   9.0

8 clients in CA repeatedly request 1 MB file

Replicas report load as log (95% bandwidth per 1-min slot)

slide-48
SLIDE 48

Bandwidth costs: OASIS v. on-demand

1-2 orders of magnitude

[Figure axis: # DNS reqs to CoralCDN]

slide-49
SLIDE 49

Outline

  • Architecture and design decisions
  • Detailed design
  • Evaluation
  • Deployment and integration lessons

OASIS deployed since November 2005

Currently in use by 10 services

slide-50
SLIDE 50

Sanity check for network peculiarities

Employ measurement redundancy

Easy visualization significantly helped debugging

slide-51
SLIDE 51

Netops have low tolerance for probing

Probing generates abuse complaints

Your service can get blacklisted!

[Figure: keyword frequency on PlanetLab support lists (9 months, 1820 threads, 4682 msgs)]

slide-52
SLIDE 52

Netops have low tolerance for probing

Be careful what you probe

  • Probe slowly and rarely
  • No random ports or obvious attack vectors (TCP port 22/23)

Be careful whom you probe

  • Check blacklist for netblock and target IP (after traceroute)
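
The two rules above can be combined into a small pre-probe check. This is a sketch only: `may_probe`, the blacklist format (a list of netblocks or single addresses in CIDR notation), and the example addresses are assumptions, and rate limiting ("slowly and rarely") is left out.

```python
import ipaddress
from typing import Iterable

FORBIDDEN_TCP_PORTS = {22, 23}  # named on the slide as obvious attack vectors


def may_probe(target_ip: str, port: int, blacklist: Iterable[str]) -> bool:
    """Return False if the port is forbidden or the target is blacklisted."""
    if port in FORBIDDEN_TCP_PORTS:
        return False
    addr = ipaddress.ip_address(target_ip)
    # Each entry is a netblock ("192.0.2.0/24") or a single IP ("198.51.100.9").
    return not any(addr in ipaddress.ip_network(block) for block in blacklist)


BLACKLIST = ["192.0.2.0/24", "198.51.100.9"]  # documentation-range examples
print(may_probe("192.0.2.7", 80, BLACKLIST))     # inside a blacklisted netblock
print(may_probe("198.51.100.1", 80, BLACKLIST))  # allowed
```
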

slide-53
SLIDE 53

Make it easy to integrate

[Figure: nakika replica, replica proxy, OASIS core node, dns]

listen(7060)

ServiceName nakika
LocalPort 7060
SecretCode 555555

ServiceName nakika
ServiceAlias nakika.net
SortType latencycap
MaxAddrs 2
AddrTTLs 120

slide-54
SLIDE 54

Make it easy to integrate

[Figure: nakika replica, replica proxy, OASIS core node, dns; labels: code, load, cap]

listen(7060)

ServiceName nakika
LocalPort 7060
SecretCode 555555

ServiceName nakika
ServiceAlias nakika.net
SortType latencycap
MaxAddrs 2
AddrTTLs 120

Clients can immediately use nakika.nyuld.net

slide-55
SLIDE 55

Current services using OASIS…

  • Chunkcast block anycast (Berkeley)
  • CoralCDN (NYU)
  • Na Kika content distribution (NYU)
  • OASIS
      • RPC, DNS, HTTP interfaces
  • OCALA overlay convergence (Berkeley)
      • Separate services for client and server IP gateways
  • OpenDHT public DHT service (Berkeley)
  • OverCite distributed library (MIT)
      • Deployed on RON

slide-56
SLIDE 56

Summary

  • OASIS is a general, open anycast service
  • Supports multiple services: more are better
  • Performs accurate server selection
  • Removes all on-demand probing
  • Provides easy integration

Use OASIS for your distributed system!

http://oasis.coralcdn.org/