Optimizing Cost and Performance in Online Service Provider Networks




SLIDE 1

Optimizing Cost and Performance in Online Service Provider Networks

Ming Zhang Microsoft Research

Joint work with Zheng Zhang (Purdue), Albert Greenberg (MSR), Y. Charlie Hu (Purdue), Ratul Mahajan (MSR), and Blaine Christian (Microsoft)

SLIDE 2

OSP network

[Figure: OSP network. Labels: OSP; data centers DC1, DC2, DC3; ISP1 through ISP6; User (IP prefix)]

SLIDE 3

Key factors in OSP traffic engineering

  • Cost
    – Google Search: 5B queries/month
    – MSN Messenger: 330M users/month
    – Traffic volume exceeding a PB/day
  • Performance
    – Directly impacts user experience and revenue
        • Purchases, search queries, ad click-through rates


SLIDE 4

Current TE solution is limited

  • Current practice is mostly manual
    – Incoming: DNS redirection, nearby DC
    – Outgoing: BGP, manually configured
  • Complex TE strategy space
    – (~300K prefixes) x (~10 DCs) x (~10 routes/prefix)
    – Link capacity creates dependencies among prefixes
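
To get a feel for the size of this space, the rough counts above can be multiplied out. The sketch below uses the slide's approximate figures. Two things are assumptions for illustration: that the ~10 routes are available per prefix at each DC, and that each prefix chooses independently, which ignores the link capacity dependencies.

```python
import math

# Approximate figures from the slide.
NUM_PREFIXES = 300_000
NUM_DCS = 10
ROUTES_PER_PREFIX = 10

# Options open to a single prefix: one (DC, route) pair.
# Assumption: ~10 routes are available at each of the ~10 DCs.
options_per_prefix = NUM_DCS * ROUTES_PER_PREFIX

# Individual (prefix, DC, route) combinations to consider.
combinations = NUM_PREFIXES * options_per_prefix

# Assumption: prefixes choose independently, so the number of complete
# strategies is options_per_prefix ** NUM_PREFIXES. That number is far too
# large to print, so report its order of magnitude instead.
log10_strategies = NUM_PREFIXES * math.log10(options_per_prefix)

print(f"options per prefix: {options_per_prefix}")
print(f"(prefix, DC, route) combinations: {combinations:,}")
print(f"complete strategies: about 10^{log10_strategies:.0f}")
```

Even before capacity constraints are considered, exhaustive search over complete strategies is clearly out of reach, which is one way to read the case against manual configuration.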


SLIDE 5

Prior work on TE

  • Intra-domain TE for transit ISPs
    – Balancing load across internal paths
    – Not considering end-to-end performance
  • Route selection for multi-homed stub networks
    – Single site
    – Small number of ISPs


SLIDE 6

Our contributions

  • Formulation of OSP TE problem
  • Design & implementation of Entact
    – A route-injection-based measurement technique
    – An online TE optimization framework
  • Extensive evaluations in MSN
    – 40% cost reduction
    – Low operational overheads


SLIDE 7

Problem formulation

  • INPUT: user prefixes, DCs, & external links
  • OUTPUT: TE strategy, user prefix → (DC, external link)
  • CONSTRAINTS: link capacity, route availability

[Figure: users mapped to DCs and external links. Labels: Users d1, d2, d3; DCs DC1, DC2; Links Link1, Link2, Link3]
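
The formulation above can be sketched as a small data model plus a feasibility check. This is a minimal illustration, not the authors' implementation: the `Link` class, the `is_feasible` function, the demand and capacity numbers, and the choice of which routes are available are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    name: str
    capacity: float  # same unit as prefix demand (hypothetical unit)


def is_feasible(strategy, demand, links, available):
    """Check both constraints from the slide for a candidate TE strategy.

    strategy:  prefix -> (dc, link_name)        (the OUTPUT)
    demand:    prefix -> traffic volume
    links:     link_name -> Link
    available: prefix -> set of (dc, link_name) pairs that have a route
    """
    load = {name: 0.0 for name in links}
    for prefix, (dc, link_name) in strategy.items():
        # Route availability: the chosen pair must have a route to the prefix.
        if (dc, link_name) not in available[prefix]:
            return False
        load[link_name] += demand[prefix]
    # Link capacity: the traffic assigned to each link must fit on it.
    return all(load[name] <= links[name].capacity for name in links)


# Hypothetical inputs, reusing the labels from the figure.
links = {n: Link(n, c) for n, c in [("Link1", 10), ("Link2", 10), ("Link3", 5)]}
demand = {"d1": 6, "d2": 6, "d3": 4}
all_pairs = {("DC1", "Link1"), ("DC1", "Link2"), ("DC2", "Link3")}
available = {"d1": all_pairs, "d2": all_pairs, "d3": {("DC2", "Link3")}}

spread = {"d1": ("DC1", "Link1"), "d2": ("DC1", "Link2"), "d3": ("DC2", "Link3")}
packed = {"d1": ("DC1", "Link1"), "d2": ("DC1", "Link1"), "d3": ("DC2", "Link3")}

spread_ok = is_feasible(spread, demand, links, available)  # fits every link
packed_ok = is_feasible(packed, demand, links, available)  # overloads Link1
```

The second strategy fails only because two prefixes share one link, which is the kind of dependency among prefixes that link capacity creates.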

SLIDE 8

Cost & performance measures

  • Use RTT as the performance measure
    – Many latency-sensitive apps: search, email, maps
    – Apps are chatty: N x RTT quickly gets to 100+ms
  • Transit cost: F(v) = price x v
    – Ignore internal traffic cost
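
The two measures above reduce to simple arithmetic, sketched below. The function names and sample numbers are hypothetical. The `weighted_rtt` helper assumes that the weighted RTT used later in the deck is a traffic-volume-weighted average of per-prefix RTTs; the slides do not spell out that definition.

```python
def transit_cost(price, volume):
    """F(v) = price x v: transit cost of sending volume v over a link."""
    return price * volume


def chatty_latency_ms(round_trips, rtt_ms):
    """Network latency of an exchange that needs N sequential round trips."""
    return round_trips * rtt_ms


def weighted_rtt(rtt_ms, volume):
    """Traffic-weighted average RTT over prefixes (assumed definition).

    rtt_ms: prefix -> RTT in milliseconds
    volume: prefix -> traffic volume
    """
    total = sum(volume.values())
    return sum(rtt_ms[p] * volume[p] for p in rtt_ms) / total


# Hypothetical numbers: five round trips at 25 ms each already exceed 100 ms.
latency = chatty_latency_ms(5, 25)
cost = transit_cost(2.0, 100.0)
wrtt = weighted_rtt({"a": 20, "b": 60}, {"a": 3, "b": 1})
```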

SLIDE 9

Measuring alternative paths with route injection

[Figure: OSP, AS 1, AS 2, AS 3; next hops IP2 and IP3; destination prefix 5.6.7.0/24]

Route injection daemon: 5.6.7.8/32 next-hop=IP3

Routing table
    prefix        next-hop  AS path
  * 5.6.7.0/24    IP2       AS2 AS1
                  IP3       AS3 AS1
  * 5.6.7.8/32    IP3

  • Minimal impact on current traffic
  • Existing approaches are inapplicable
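
Why an injected /32 has minimal impact on current traffic can be illustrated with longest-prefix matching. The sketch below models only the forwarding lookup, not BGP or the route injection daemon itself; the `lookup` helper is hypothetical, and "IP2" and "IP3" are the slide's placeholder next-hop names.

```python
import ipaddress


def lookup(routing_table, destination):
    """Return the next hop of the most specific route covering destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routing_table if addr in net]
    if not matches:
        return None
    # Longest-prefix match: the route with the largest prefix length wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Active route from the slide's routing table.
table = [(ipaddress.ip_network("5.6.7.0/24"), "IP2")]
before = lookup(table, "5.6.7.8")

# Inject a /32 for a single probe target, pointing at the alternative next hop.
table.append((ipaddress.ip_network("5.6.7.8/32"), "IP3"))

after_target = lookup(table, "5.6.7.8")  # probe target takes the alternative path
after_other = lookup(table, "5.6.7.9")   # the rest of the /24 is unaffected
```

Only the single injected address changes path, so the alternative route can be measured while the remaining traffic to the /24 keeps its current next hop.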

SLIDE 10

Selecting desirable strategy

  • M^N strategies for N prefixes and M alternative paths/prefix
  • Only consider optimal strategies
  • Finding “sweet spot” based on desirable cost-performance tradeoff
  • K extra cost for unit latency decrease

[Figure: optimal strategy curve; axes: weighted RTT and cost; sweet spot where slope = -K]
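
One way to read the sweet spot: on a convex curve, the point whose tangent has slope -K is the one that minimizes cost + K x wRTT. The sketch below applies that reading to a handful of made-up points. The `sweet_spot` function, the strategy names, and all numbers are hypothetical, and this is not necessarily how Entact selects the point.

```python
import math


def sweet_spot(strategies, k):
    """Pick the strategy that minimizes cost + k * wrtt.

    strategies: list of (name, cost, wrtt_ms) tuples on the optimal curve
    k:          extra cost one is willing to pay per unit of latency decrease
    """
    if math.isinf(k):
        # Latency is all that matters: lowest wRTT, cheapest among ties.
        return min(strategies, key=lambda s: (s[2], s[1]))
    return min(strategies, key=lambda s: s[1] + k * s[2])


# Hypothetical points on a convex cost/wRTT curve.
curve = [("A", 100, 60), ("B", 130, 50), ("C", 200, 45), ("D", 400, 40)]

lowest_cost = sweet_spot(curve, 0)          # K = 0 ignores latency
balanced = sweet_spot(curve, 10)            # pay up to 10 cost units per msec
best_perf = sweet_spot(curve, math.inf)     # K = infinity ignores cost
```

With these numbers, moving from A to B costs 3 per msec saved and moving from B to C costs 14 per msec saved, so K = 10 stops at B.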

SLIDE 11

Computing optimal strategy

  • P95 cost optimization is complex
    – Optimize short-term cost online
    – Evaluate using P95 cost
  • Reduced to an ILP problem
    – Find a fractional solution
    – Convert to an integer solution
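
Two of the steps above can be sketched briefly. `p95` computes a 95th-percentile value over per-interval traffic samples using the nearest-rank method, which is an assumption about how P95 is evaluated. `round_to_integer` shows one simple way to turn a fractional assignment into an integer one by taking each prefix's largest share; this is an illustrative heuristic, not the conversion the authors describe, and it does not re-check link capacity.

```python
def p95(samples):
    """95th-percentile value of per-interval traffic volumes (nearest rank)."""
    ordered = sorted(samples)
    # Smallest rank with at least 95% of samples at or below it,
    # computed in integers to avoid floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def round_to_integer(fractional):
    """Assign each prefix to the option carrying its largest fraction.

    fractional: prefix -> {option: fraction}, fractions sum to 1 per prefix
    Returns:    prefix -> option
    """
    return {
        prefix: max(shares, key=shares.get)
        for prefix, shares in fractional.items()
    }


# Hypothetical data: 100 samples with volumes 1..100.
charging_volume = p95(list(range(1, 101)))

# Hypothetical fractional solution over two links.
fractional = {
    "d1": {"Link1": 0.7, "Link2": 0.3},
    "d2": {"Link1": 0.4, "Link2": 0.6},
}
assignment = round_to_integer(fractional)
```

Because the top 5% of samples are excluded from the percentile, short bursts do not raise the P95 value, which is part of why optimizing it directly is harder than optimizing a short-term cost.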

SLIDE 12

Finding optimal strategy curve

[Figure: optimal strategy curve; axes: weighted RTT and cost]
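
A curve of this kind can be approximated by filtering candidate strategies down to the ones that no other candidate beats on both measures. The sketch below assumes that "optimal" means non-dominated in cost and weighted RTT. The `optimal_curve` function and the sample points are hypothetical.

```python
def optimal_curve(points):
    """Return the points not dominated in both cost and weighted RTT.

    points: list of (cost, wrtt_ms) pairs. A point is dominated if another
    point is no worse on both measures and strictly better on at least one.
    """
    frontier = []
    best_wrtt = float("inf")
    # Sweep by increasing cost (ties: lower wRTT first). A point survives
    # only if it improves on the best wRTT seen at lower or equal cost.
    for cost, wrtt in sorted(set(points)):
        if wrtt < best_wrtt:
            frontier.append((cost, wrtt))
            best_wrtt = wrtt
    return frontier


# Hypothetical candidate strategies as (cost, wRTT in msec).
candidates = [(100, 60), (130, 50), (150, 55), (200, 45), (120, 70), (400, 40)]
curve = optimal_curve(candidates)
```

Here (150, 55) and (120, 70) drop out because a cheaper candidate already achieves a lower weighted RTT.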

SLIDE 13

Entact architecture

[Figure: Entact architecture. Labels: Netflow data; routing tables; per-prefix traffic volume; n live IPs per prefix; n-1 injected routes per prefix; RTT of n alternative routes per prefix; capacity & price of external links, slope K; optimal TE strategy]

SLIDE 14

Experimental setup

  • MSN: one of the largest OSP networks
    – 11 DCs, 1,000+ external links
  • Assumptions in evaluation
    – Traffic and performance do not change with TE strategies
  • 6K destination prefixes from 2,791 ASes
    – High-volume, single-location, representative

SLIDE 15

Benefits of Entact

[Figure: wRTT (msec) and P95 cost of four strategies: Default, Entact (K=10), LowestCost (K=0), BestPerf (K=∞)]

  • 40% cost reduction
  • Significant cost/perf tradeoff

SLIDE 16

Where does cost reduction come from?

path chosen by Entact   prefixes (%)   wRTT difference (msec)   short-term cost difference
same                    88.2
cheaper & shorter       1.7            -8                       -309
cheaper & longer        5.5            +12                      -560
pricier & shorter       4.6            -15                      +42
pricier & longer        0.1

  • Entact makes “intelligent” performance-cost tradeoff
  • Automation is crucial for handling complexity & dynamics

SLIDE 17

Conclusions

  • TE automation is crucial for large OSP networks
    – Multiple DCs
    – Many external links
    – Dependencies between prefixes
  • Entact -- first online TE scheme for OSP networks
    – 40% cost reduction w/o performance degradation
    – Low operational overhead