
Survivable and Bandwidth-Guaranteed Embedding of Virtual Clusters in Cloud Data Centers

  1. Survivable and Bandwidth-Guaranteed Embedding of Virtual Clusters in Cloud Data Centers
     - IEEE INFOCOM 2017, Datacenter Networks 1
     - Ruozhou Yu, Guoliang Xue, and Xiang Zhang (Arizona State University)
     - Dan Li (Tsinghua University)

  2. Outline
     - Introduction and Motivation
     - System Model and Algorithm Design
     - Performance Evaluation
     - Conclusions

  3. The Cloud Shift
     - Cloud computing seems an omnipotent solution to all kinds of performance requirements (figure label: "The Mighty Cloud")
     - But is it as mighty as it seems?

  4. Inside the Cloud
     - An illusion of infinite computing resources, created by large clusters of interconnected machines in data centers
     - Performance bottleneck: the cloud network!

  5. VM & Bandwidth
     - Traditional approach: network-agnostic VM allocation
     - Recent advance: bandwidth-guaranteed VM allocation, also called Virtual Cluster Embedding (VCE)
       - Existing algorithms can allocate bandwidth-guaranteed VMs while minimizing bandwidth usage, migration cost, etc.
     - But we know that cloud machines do fail, quite often...

  6. Survivable VCE
     - Question: How can we ensure VM availability even when its host machine could fail?
     - Answer: We prepare extra VMs and bandwidth just in case!
     - Question: And how much will that cost us?
     - Answer: No problem! We can minimize that!
     - Question: How are we going to achieve that?
     - Answer: Dynamic programming!

  7. Outline
     - Introduction and Motivation
     - System Model and Algorithm Design
     - Performance Evaluation
     - Conclusions

  8. Network Topology
     - Assumption: the DCN has a tree structure
       - Abstracts many common DCN topologies (FatTree, VL2, etc.)
     - Figure, original FatTree (1 Gbps per link): core switches c1 to c4; aggregation switches a11, a12, a21, a22, a31, a32, a41, a42; edge switches e11, e12, e21, e22, e31, e32, e41, e42
     - Figure, abstract tree: one core node c (4 Gbps per link); aggregation nodes a1 to a4 (2 Gbps per link); edge switches e11 to e42 (1 Gbps per link)

  9. VM Survivability Model
     - Primary VMs: VMs that are active during normal operations
     - Backup VMs: VMs in standby mode, activated when a primary VM's PM fails
       - Each backup VM synchronizes the states of multiple primary VMs
     - Figure: VMs a, b, and c, with an arrow labeled "Migrate"
     - Question: Can we find a bandwidth-guaranteed allocation of both primary and backup VMs that covers an arbitrary single-PM failure with the minimum number of backup VMs?

  10. Dynamic Programming for SVCE
      - Given: topology tree T, request J = <N, B> (N VMs, each with bandwidth guarantee B)
      - Assumption: single PM failure
        - Interpretation: a failure is either within a subtree or outside it, but cannot be both.
      - Key observation: each subtree's ability to provide VMs is independent of the rest of the tree, both during normal operations and during an arbitrary failure
      - Two layers of dynamic programming
        - Outer DP: DP over entire subtrees
        - Inner DP: DP over the first k sub-subtrees of each subtree

  11. DP in Details
      - Outer DP: N_v[n_0, n_1] is the minimum total number of VMs needed in subtree T_v to ensure that
        - T_v can provide at least n_0 VMs when no failure is in T_v;
        - T_v can provide at least n_1 VMs when any PM fails in T_v.
      - Inner DP: N'_v[n_0, n_1, k] is the minimum total number of VMs needed in the first k subtrees of v to ensure that
        - the k subtrees can provide n_0 VMs when no failure is in them;
        - the k subtrees can provide n_1 VMs when any PM fails in them.
      - Alternately update the two tables:
        - N_v[n_0, n_1] depends on N'_v[n_0', n_1', d_v] (d_v is the number of subtrees under v);
        - N'_v[n_0, n_1, k] depends on N_v[n_0'', n_1''] of lower-layer nodes.
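
The alternation between the two tables can be sketched in Python on a small tree. This is a minimal sketch under loud assumptions: bandwidth and VM-slot limits are ignored entirely, and the combination rule in `merge_child` is one reading of the definitions above, not the authors' update equations. `Node`, `outer_table`, and `merge_child` are hypothetical names.

```python
from dataclasses import dataclass, field
from itertools import product
from math import inf


@dataclass
class Node:
    """Tree node (hypothetical structure); a node without children is a PM."""
    children: list = field(default_factory=list)


def merge_child(inner, child, n):
    """Inner DP step: extend the table of the first k-1 subtrees by the k-th.

    Assumption: bandwidth and slot limits are ignored.
    """
    new = {key: inf for key in product(range(n + 1), repeat=2)}
    for (a0, a1), cost_a in inner.items():
        if cost_a == inf:
            continue
        for (b0, b1), cost_b in child.items():
            if cost_b == inf:
                continue
            total = cost_a + cost_b
            # No failure: both parts contribute their no-failure counts.
            got0 = min(n, a0 + b0)
            # Single failure: it hits one part, the other part is intact.
            got1 = min(n, a1 + b0, a0 + b1)
            # "At least" semantics: this choice also covers smaller targets.
            for n0 in range(got0 + 1):
                for n1 in range(got1 + 1):
                    if total < new[(n0, n1)]:
                        new[(n0, n1)] = total
    return new


def outer_table(node, n):
    """Outer DP: table N_v[n0, n1] for the subtree rooted at `node`."""
    if not node.children:
        # A PM offers n0 VMs at cost n0, and nothing once it has failed.
        return {(n0, n1): (n0 if n1 == 0 else inf)
                for n0, n1 in product(range(n + 1), repeat=2)}
    inner = {key: inf for key in product(range(n + 1), repeat=2)}
    inner[(0, 0)] = 0  # zero subtrees host zero VMs
    for child in node.children:
        inner = merge_child(inner, outer_table(child, n), n)
    return inner


sw3 = Node([Node(), Node()])   # one switch with two PMs
table = outer_table(sw3, 2)
print(table[(2, 2)], table[(2, 1)])  # 4 2
```

For a switch with two PMs and N = 2, the sketch gives 4 for [2, 2] and 2 for [2, 1], the same final values that appear in the work-through example slides.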

  12. Work-through Example
      - J = <2, 100 Mbps>
      - Topology: switch SW 3 with children PM 1 and PM 2; links labeled 100 Mbps
      - N_1[n_0, n_1] / N'_3[n_0, n_1, 1] (PM 1) and N_2[n_0, n_1] (PM 2) hold the same values:

            n_0\n_1 | 0 | 1 | 2
            0       | 0 | ∞ | ∞
            1       | 1 | ∞ | ∞
            2       | 2 | ∞ | ∞

      - N'_3[n_0, n_1, 2] / N_3[n_0, n_1] (SW 3); cells marked x are not filled in:

            n_0\n_1 | 0 | 1 | 2
            0       | x | x | x
            1       | x | x | x
            2       | x | x | x

  13. Work-through Example
      - J = <2, 100 Mbps>; PM tables as on slide 12
      - Highlighted: N_3 at n_0 = 2, n_1 = 2, with both PM tables at n_0 = 2, n_1 = 0
      - N'_3[n_0, n_1, 2] / N_3[n_0, n_1]:

            n_0\n_1 | 0 | 1 | 2
            0       | x | x | x
            1       | x | x | x
            2       | x | x | 4

  14. Work-through Example
      - J = <2, 100 Mbps>; PM tables as on slide 12
      - Highlighted: N_3 at n_0 = 2, n_1 = 1, with both PM tables at n_0 = 2, n_1 = 0
      - N'_3[n_0, n_1, 2] / N_3[n_0, n_1]:

            n_0\n_1 | 0 | 1 | 2
            0       | x | x | x
            1       | x | x | x
            2       | x | 4 | 4

  15. Work-through Example
      - J = <2, 100 Mbps>; PM tables as on slide 12
      - Highlighted: N_3 at n_0 = 2, n_1 = 1, with one PM table at n_0 = 1, n_1 = 0 and the other at n_0 = 2, n_1 = 0
      - N'_3[n_0, n_1, 2] / N_3[n_0, n_1]:

            n_0\n_1 | 0 | 1 | 2
            0       | x | x | x
            1       | x | x | x
            2       | x | 3 | 4

  16. Work-through Example
      - J = <2, 100 Mbps>; PM tables as on slide 12
      - Highlighted: N_3 at n_0 = 2, n_1 = 1, with both PM tables at n_0 = 1, n_1 = 0
      - N'_3[n_0, n_1, 2] / N_3[n_0, n_1]:

            n_0\n_1 | 0 | 1 | 2
            0       | x | x | x
            1       | x | x | x
            2       | x | 2 | 4
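
The two final entries of the SW 3 table can be cross-checked by brute force. The sketch below enumerates every placement of VMs on the PMs and keeps the cheapest one that meets both targets. It ignores bandwidth constraints (an assumption), and `min_total_vms` and `max_per_pm` are hypothetical names introduced only for this check.

```python
from itertools import product


def min_total_vms(num_pms, n0, n1, max_per_pm):
    """Smallest total VM count over all placements that offer at least n0 VMs
    with no failure and at least n1 VMs after any single PM failure.

    Assumption: bandwidth is ignored; only VM counts per PM matter.
    """
    best = None
    for placement in product(range(max_per_pm + 1), repeat=num_pms):
        total = sum(placement)
        survivors = total - max(placement)  # worst case: the fullest PM fails
        if total >= n0 and survivors >= n1:
            if best is None or total < best:
                best = total
    return best


print(min_total_vms(2, 2, 2, max_per_pm=2))  # 4
print(min_total_vms(2, 2, 1, max_per_pm=2))  # 2
```

With two PMs, surviving with 2 VMs after any failure forces 2 VMs on each PM (4 in total), while surviving with 1 VM only needs one VM per PM (2 in total).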

  17. Heuristic SVCE
      - Optimal DP time complexity: O(|V| N^6)
        - where |V| is the number of tree nodes and N is the number of requested VMs.
      - Question: Can we find a near-optimal solution in less time?
      - Observation: if we find a normal VCE with N + N' VMs such that each PM hosts at most N' VMs, then we can always recover from any single PM failure.
      - Algorithm: search from N' = 1 to N, each time using an existing VCE algorithm to find a VCE with N' extra VMs in which each PM hosts at most N' VMs.
      - Time complexity: O(N · |V| log |V|)
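
The search loop of the heuristic can be sketched as follows. The slide relies on an existing VCE algorithm that is not specified here, so `greedy_embed` is a toy stand-in that ignores bandwidth and simply fills PMs in order; both function names and the `pm_slots` parameter are hypothetical.

```python
def greedy_embed(num_vms, pm_slots, cap):
    """Toy stand-in for a VCE algorithm (assumption: bandwidth is ignored).

    Fills PMs in order with at most `cap` VMs each. Returns the number of
    VMs per PM, or None if the VMs do not fit.
    """
    placement = []
    remaining = num_vms
    for slots in pm_slots:
        take = min(slots, cap, remaining)
        placement.append(take)
        remaining -= take
    return placement if remaining == 0 else None


def heuristic_svce(n, pm_slots, embed=greedy_embed):
    """Try N' = 1..N and return (N', placement) for the first success."""
    for extra in range(1, n + 1):
        placement = embed(n + extra, pm_slots, extra)
        if placement is not None:
            # Losing any one PM removes at most N' VMs, so N VMs survive.
            assert sum(placement) - max(placement) >= n
            return extra, placement
    return None


print(heuristic_svce(2, [5, 5]))  # (2, [2, 2])
```

On two PMs with 5 slots each and N = 2, N' = 1 fails (3 VMs cannot fit with at most 1 per PM), and N' = 2 succeeds with 2 VMs on each PM.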

  18. Outline
      - Introduction and Motivation
      - System Model and Algorithm Design
      - Performance Evaluation
      - Conclusions

  19. Simulation Setups
      - Tree-structured DCN
        - 4-layer 8-ary (512 PMs, 73 switches)
        - 5 VM slots per PM
        - ToR bandwidth: 1 Gbps; Aggr/Core bandwidth: 10 Gbps
      - Tenant VCs
        - 1000 requests
        - 15 VMs and 300 Mbps per VM, on average
        - Poisson arrivals
      - Comparison:
        - OPT: optimal DP SVCE algorithm
        - HEU: heuristic SVCE algorithm
        - SBS: shadow-based solution (dedicated VC backup)
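
The topology figures on this slide can be reproduced with a few lines of arithmetic. The sketch assumes that "4-layer 8-ary" means three switch layers above one PM layer, with 8 children per switch; that reading is an assumption, and `tree_size` is a hypothetical helper.

```python
def tree_size(arity, layers):
    """Return (number of PMs, number of switches) of a complete tree.

    Assumption: the bottom layer holds PMs and every layer above it holds
    switches, each with `arity` children.
    """
    pms = arity ** (layers - 1)
    switches = sum(arity ** depth for depth in range(layers - 1))
    return pms, switches


pms, switches = tree_size(8, 4)
print(pms, switches)  # 512 73
print(pms * 5)        # 2560 VM slots at 5 slots per PM
```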

  20. Simulation Results: Average VM Usage

  21. Simulation Results: Acceptance Ratio

  22. Simulation Results: Running Time

  23. Outline
      - Introduction and Motivation
      - System Model and Algorithm Design
      - Performance Evaluation
      - Conclusions

  24. Conclusions
      - A first study on Survivable VCE
        - A two-layer optimal DP algorithm
        - A faster near-optimal heuristic algorithm
      - Discussions
        - Extension to tree-like topologies (FatTree, VL2, etc.)
        - Extension to cover a constant number of simultaneous failures
      - Future work
        - SVCE on generic data center topologies (BCube, JellyFish, etc.)
        - Covering link failures in addition to PM failures

  25. Q&A? THANK YOU VERY MUCH!

  26. Hose Model Bandwidth Guarantee
      - Request J = <N, B>
        - N = 7, B = 100 Mbps
      - Figure: nodes a, b, and c, with a link labeled 200 Mbps
      - Number of VMs T_c can offer (bandwidth constrained): n_c ∈ [0, 2] ∪ [5, 7]
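
The feasible range on this slide can be recomputed with a short sketch. It assumes the usual hose-model rule that hosting k of the N VMs below a link reserves min(k, N - k) · B on that link; the slide does not spell the rule out, and `feasible_vm_counts` is a hypothetical helper.

```python
def feasible_vm_counts(n, b_mbps, uplink_mbps):
    """VM counts a subtree can host under the hose model.

    Assumption: placing k of the N VMs below a link needs
    min(k, N - k) * B of bandwidth on that link.
    """
    return [k for k in range(n + 1)
            if min(k, n - k) * b_mbps <= uplink_mbps]


print(feasible_vm_counts(7, 100, 200))  # [0, 1, 2, 5, 6, 7]
```

With N = 7, B = 100 Mbps, and a 200 Mbps link, hosting 3 or 4 VMs would need 300 Mbps, which is why the feasible set splits into two separate ranges.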

  27. DP in Details /2
      - Outer DP update:
        - PM level: [equation image]
        - Switch level: [equation image]
      - Inner DP update:
        - No subtree: [equation image]
        - k-th subtree: [equation image]
      - Annotation labels on the equations: "Bandwidth feasible VMs", "Lower bound of infeasible VMs", "upper bracket bw feasible VM"

