
Efficiently Delivering Online Services over Integrated Infrastructure

Efficiently Delivering Online Services over Integrated Infrastructure. Hongqiang Harry Liu, Raajay Viswanathan, Matt Calder, Aditya Akella, Ratul Mahajan, Jitendra Padhye, Ming Zhang


  1. Efficiently Delivering Online Services over Integrated Infrastructure. Hongqiang Harry Liu, Raajay Viswanathan, Matt Calder, Aditya Akella, Ratul Mahajan, Jitendra Padhye, Ming Zhang

  2. Online Services

  3. Online Service Delivery Infrastructure: Proxies, Wide Area Network, Data Centers

  4. Online Service Delivery is Evolving • Traditional infrastructure: multiple owners (operated by different ISPs and by content providers) • Integrated infrastructure: owned by a single entity

  5. Integrated Infrastructure Enables Joint Control of All Decisions: 1. User–proxy mapping 2. Proxy–DC mapping 3. Paths in the wide area network (WAN)

  6. Advantages of Joint Control • Increase efficiency: total traffic without congestion • Improve performance: aggregate end-to-end latency [Figure: example topology with Proxy-1, Proxy-2, Proxy-3, the WAN, DC-1, and DC-2]

  7. Footprint: Jointly Controls the Integrated Infrastructure • Controller inputs: topology, system capacity, UG–proxy latency, user workload • Users are grouped into user groups (UGs) by location and service provider • Control decisions for a user group: UG–proxy mapping, proxy–DC mapping, network paths • Goals: maximize congestion-free traffic, minimize end-to-end latency [Figure: controller overseeing user group UG1, proxies P1, P2, P3, and data centers DC1, DC2]

  8. Outline • Challenges in computing forwarding configuration • Other challenges in realizing Footprint • Evaluation

  9. Computing Configuration: Basic Approach • Estimated user demands for an epoch: $\{d_1, d_2, \ldots, d_n\}$ • Resource capacities: $\{C_1, C_2, \ldots, C_m\}$ • Estimated load from user ‘u’ on resource ‘r’: $n_u^r = d_u \cdot w_u^r$ • Capacity constraint for resource ‘r’: $n^r = \sum_u n_u^r \le C_r$ • Linear program objective: maximize $\sum_u \sum_r n_u^r - \sum_u \sum_r l_u^r n_u^r$, where the subtracted term accounts for latencies [Figure: users with demands $d_1, \ldots, d_n$ and resources with capacities $C_1, \ldots, C_m$; the edge from user u to resource r is labeled with weight $w_u^r$ and latency $l_u^r$]
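
As a rough illustration of the basic model on this slide, the Python sketch below computes per-resource load from demands and split weights, checks the capacity constraint, and evaluates an objective of the form "carried traffic minus a latency term". The function names, the example numbers, and the exact form of the latency term are assumptions made for illustration. The weights are fixed by hand here, whereas the controller would obtain them by solving a linear program.

```python
def loads(demands, weights):
    """Per-user, per-resource load: n[u][r] = d_u * w_u^r."""
    return [[d * w for w in row] for d, row in zip(demands, weights)]


def resource_totals(n, num_resources):
    """Total load on each resource, summed over users."""
    return [sum(row[r] for row in n) for r in range(num_resources)]


def congestion_free(n, capacities):
    """True if every resource's total load is within its capacity."""
    totals = resource_totals(n, len(capacities))
    return all(t <= c for t, c in zip(totals, capacities))


def objective(n, latencies):
    """Carried traffic minus a latency-weighted term (assumed form)."""
    carried = sum(sum(row) for row in n)
    latency_term = sum(
        n_ur * l_ur
        for n_row, l_row in zip(n, latencies)
        for n_ur, l_ur in zip(n_row, l_row)
    )
    return carried - latency_term


# Hypothetical example: two user groups, two resources.
demands = [100.0, 50.0]                    # d_u
weights = [[0.6, 0.4], [0.0, 1.0]]         # w_u^r, each row sums to 1
capacities = [80.0, 100.0]                 # C_r
latencies = [[0.01, 0.03], [0.05, 0.02]]   # l_u^r, arbitrary units

n = loads(demands, weights)
print(resource_totals(n, len(capacities)), congestion_free(n, capacities))
print(objective(n, latencies))
```

Keeping the load computation separate from the feasibility check mirrors the structure of the linear program: the weights are the unknowns, and the capacity inequality is the constraint they must satisfy.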

  10. Does such a simple model suffice? No, because of the nature of traffic from different online applications.

  11. User Traffic Arrives over Sessions • Multiple requests and responses travel over a single TCP session to the proxy • Sessions are long-lived and arrive all through the duration of an epoch, so the number of sessions varies over time [Figure: #sessions on a resource vs. time (s)]

  12. Session Stickiness: sessions stick to proxy and DC • Long-lived TCP sessions • No fresh DNS query in the middle of a session • Old sessions are still forwarded to P1 [Figure: traffic from UG1 is switched from P1 to P2 at t = T; panels plot #sessions on P1 and #sessions on P2 vs. time (s) for non-sticky and for sticky sessions; an overlay link is marked between the proxies]

  13. Challenge: Temporal Variation of Load • Load from a user group to a resource varies gradually because of 1. non-zero session lifetime and 2. session stickiness • Resource capacity constraints should be satisfied during the entire epoch: $\sum_u n_u^r \le C_r$ becomes $\sum_u n_u^r(t) \le C_r$ • Computationally infeasible if $n_u^r(t)$ does not have a closed form - Applications have arbitrary session life distributions

  14. How to guarantee congestion-free delivery for traffic on sessions?

  15. High Fidelity Modeling of Load • $n^r(t) = n^r_{new}(t) + n^r_{old}(t)$: sessions that arrive in the current epoch plus sessions carried over from the previous epoch • $n_{new}$ is proportional to the arrival rate of new sessions • $n_{old}$ always holds the same pattern • $n^r(t) = \lambda^r F(t) + n_0^r G(t)$, where $\lambda^r$ is the arrival rate of sessions (decision variable) and $F$, $G$ are pattern functions derived from the session length distribution [Figure: CDF of session lifetime (s); $n_{new}$ and $n_{old}$ vs. time (s) over 0 to 300 s]
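
The load model on this slide can be sketched numerically. The slide only says that the pattern functions come from the session length distribution; the sketch below assumes one plausible construction, in which F(t) is the integral of the survival function 1 - CDF over [0, t] (expected active new sessions per unit arrival rate) and G(t) is the fraction of old sessions still alive at time t. The function names and the exponential lifetime example are hypothetical.

```python
import math


def new_session_pattern(survival, t, steps=1000):
    """F(t): integral of the survival function over [0, t] (trapezoid rule)."""
    if t <= 0:
        return 0.0
    h = t / steps
    total = 0.5 * (survival(0.0) + survival(t))
    total += sum(survival(k * h) for k in range(1, steps))
    return total * h


def active_sessions(arrival_rate, old_at_start, survival, old_pattern, t):
    """n(t) = arrival_rate * F(t) + old_at_start * G(t)."""
    new_part = arrival_rate * new_session_pattern(survival, t)
    old_part = old_at_start * old_pattern(t)
    return new_part + old_part


# Hypothetical example: exponential lifetimes with a 100 s mean.
MEAN_LIFE = 100.0
survival = lambda x: math.exp(-x / MEAN_LIFE)   # 1 - CDF(x)
old_pattern = survival                          # memoryless case: G(t) = survival(t)

for t in (0, 100, 200, 300):
    print(t, round(active_sessions(10.0, 400.0, survival, old_pattern, t), 1))
```

Because the arrival rate enters only as a multiplier of F(t), the load stays linear in the decision variable even though F and G themselves can be arbitrary curves.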

  16. Discretizing the Temporal Model • Approximate $F(t)$ by a tight piecewise linear upper bound $\bar{F}(t)$, so that $n^r(t) \le \bar{n}^r(t) = \lambda^r \bar{F}(t) + n_0^r G(t)$ • $\bar{n}^r(t)$ has its maximum at one of the corners • Capacity constraints have to be checked only at a fixed set of points • Optimal $\lambda^r$'s are obtained by solving a linear program [Figure: $F(t)$ and $\bar{F}(t)$ vs. time, with corners $t_1$, $t_2$, $t_3$ between 0 and $T$]
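
A minimal sketch of the corner-checking idea on this slide, under stated assumptions: session lifetimes are exponential (so F is concave and G is convex), and the piecewise linear upper bound is built as the minimum of tangent lines to F. That construction is one way to get an upper bound and is not necessarily the one Footprint uses. All names and numbers are hypothetical. With a single resource the linear program reduces to taking the tightest per-corner limit on the arrival rate.

```python
import math

MEAN_LIFE = 100.0   # assumed mean session lifetime (s)
EPOCH = 300.0       # assumed epoch length T (s)


def F(t):
    """New-session pattern for exponential lifetimes (concave)."""
    return MEAN_LIFE * (1.0 - math.exp(-t / MEAN_LIFE))


def F_slope(t):
    """Derivative of F."""
    return math.exp(-t / MEAN_LIFE)


def G(t):
    """Old-session pattern for exponential lifetimes (convex, decreasing)."""
    return math.exp(-t / MEAN_LIFE)


def upper_bound_corners(touch_points, epoch):
    """Corners (t, F_bar(t)) of the min-of-tangents bound on [0, epoch]."""
    # Each tangent line is stored as (slope, intercept).
    lines = [(F_slope(a), F(a) - F_slope(a) * a) for a in touch_points]
    times = [0.0]
    for (s1, c1), (s2, c2) in zip(lines, lines[1:]):
        times.append((c2 - c1) / (s1 - s2))   # where consecutive tangents cross
    times.append(epoch)
    return [(t, min(s * t + c for s, c in lines)) for t in times]


def max_arrival_rate(capacity, old_at_start, corners):
    """Largest rate with rate * F_bar(t) + old * G(t) <= capacity at each corner.

    Assumes old_at_start <= capacity, so the constraint at t = 0 already holds.
    """
    limits = [(capacity - old_at_start * G(t)) / f_bar
              for t, f_bar in corners if f_bar > 0]
    return min(limits)


corners = upper_bound_corners([0.0, 50.0, 100.0, 200.0, 300.0], EPOCH)
rate = max_arrival_rate(capacity=1000.0, old_at_start=400.0, corners=corners)
print(len(corners), rate)
```

Between two corners the bound is a linear function plus a convex one, so it cannot peak in the interior of a segment. That is why checking the corners is enough in this sketch.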

  17. Footprint: System Implementation • Gathering Inputs • Computing Optimal Forwarding • Implementing Computed Configuration

  18. Footprint: Inputs to the Controller • Input data collected every 5 minutes • Inputs: – User group–proxy latency measurements • Piggy-back on end-host applications • Instrumented JavaScript on bing.com webpage [Calder et al., IMC 2015] – User workload • Estimated using observed workload in prior epochs – System health status • From Microsoft internal system monitoring pipelines • Deployed in production

  19. Implementing Computed Configuration • UG–proxy mapping: DNS (BIND) • Proxy–DC mapping: custom software to change configuration • WAN path selection: OpenFlow • Prototyped on a modest-sized testbed

  20. Evaluation 1. Joint Decisions 2. Temporal Modeling

  21. Evaluation Setup • Trace-driven simulations • Data - Taken from production deployment of Footprint - One week's worth of data - Multiple topologies (North America, Europe) • Scale - O(10k) user groups - O(100) routers and links - O(100) proxies - O(10) data centers • Metrics - Efficiency: maximum traffic with no congestion - Performance: aggregated end-to-end latency

  22. Evaluation: Efficiency of Joint Control • Baseline: FastRoute [Flavel et al., NSDI 2015] - UG–proxy: closest proxy decided by anycast routing - Proxy–DC: closest DC based on active measurements - WAN path selection: independent traffic engineering module • Footprint can carry 2x more load because user traffic is diverted to resources with unused capacity [Figure: normalized traffic scale (axis 0 to 2.5) for FastRoute and Footprint]

  23. Evaluation: Latency Improvement • Compare end-to-end latency at 70% capacity of FastRoute • Footprint decreases overall latency by ~60% [Figure: latency (ms, axis 0 to 100) for FastRoute and Footprint, broken down into queuing delay, internal propagation delay, and external delay]

  24. Evaluation: Efficiency of Temporal Modeling • Compare with non-temporal models – JointAverage: $n^r(t) = \lambda \times$ average session length – JointWorst: $n^r(t) = \max_t(\text{\#old sessions}) + \max_t(\text{\#new sessions})$ • More than 50% gains with respect to non-temporal models [Figure: normalized traffic scale (axis 0 to 3) for JointAverage, JointWorst, and Footprint; bar values shown: 2.3, 1.48, 1.18]
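
To make the difference between the three load models concrete, the sketch below computes, for a single resource, the largest session arrival rate each model would admit. It assumes exponential session lifetimes, checks the temporal model on a dense time grid for simplicity, and uses hypothetical names and numbers throughout. It is a toy comparison and does not reproduce the results reported on the slide.

```python
import math


def admissible_rates(capacity, old_at_start, mean_life, epoch, samples=3000):
    """Largest arrival rate each load model would admit on one resource."""
    F = lambda t: mean_life * (1.0 - math.exp(-t / mean_life))  # new-session pattern
    G = lambda t: math.exp(-t / mean_life)                      # old-session pattern
    ts = [epoch * k / samples for k in range(1, samples + 1)]   # 0 < t <= epoch

    # Temporal model: rate * F(t) + old * G(t) <= capacity at every sampled t.
    temporal = min((capacity - old_at_start * G(t)) / F(t) for t in ts)
    # JointWorst: max(#old sessions) + max(#new sessions) <= capacity.
    joint_worst = (capacity - old_at_start * G(0.0)) / max(F(t) for t in ts)
    # JointAverage: rate * average session length <= capacity.
    joint_average = capacity / mean_life
    return {
        "temporal": temporal,
        "joint_worst": joint_worst,
        "joint_average": joint_average,
    }


rates = admissible_rates(capacity=1000.0, old_at_start=400.0,
                         mean_life=100.0, epoch=300.0)
for name, rate in rates.items():
    print(name, round(rate, 2))
```

JointWorst adds two peaks that occur at different times (old sessions peak at the start of the epoch, new sessions at the end), so it can never admit more traffic than the temporal model in this setting.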

  25. Related Work: To coordinate or not to coordinate? [Narayana et al., SIGMETRICS 2012] • Cooperative world vs. single-entity world • Show importance of temporal load modeling

  26. Summary • Joint decision for proxy, DC and WAN path selection • 100% increase in supported users, and 60% reduction in end-to-end latency • High-fidelity temporal models are 50% more efficient than non-temporal models
