Looking for the perfect VM scheduler (Fabien Hermenier, @fhermeni)

  1. Looking for the perfect VM scheduler @fhermeni Fabien Hermenier fabien.hermenier@nutanix.com — placing rectangles since 2006 https://fhermeni.github.io

  2. 2006 - 2010: PhD, then postdoc. Thesis: "Gestion dynamique des tâches dans les grappes, une approche à base de machines virtuelles" (dynamic task management in clusters, a virtual-machine-based approach). 2011: postdoc, "How to design a better testbed: Lessons from a decade of network experiments". 2011 - 2016: associate professor, VM scheduling and green computing.

  3. Enterprise cloud company, "Going beyond hyperconverged infrastructures": VM scheduling, resource management, virtualization.

  4. Inside a private cloud

  5. Clusters of 2 to x physical servers. Isolated applications: virtual machines, containers. Storage layer: SAN-based (converged infrastructure) or shared over the nodes (hyper-converged infrastructure).

  6. [Diagram: the VM queue, monitoring data and a model feed the scheduler, whose decisions are applied by actuators.]

  7. VM scheduling: find a server for every VM to run on, such that: compatible hardware, enough pCPU, enough RAM, enough storage, enough of whatever else; while minimizing or maximizing something.
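
To make the "find a server for every VM such that …" statement concrete, here is a minimal feasibility check in plain Java. All names are hypothetical (not taken from any scheduler's API), and capacities are in abstract units so an overcommit ratio can be folded in:

```java
import java.util.*;

// Hypothetical data model: not the API of any real scheduler.
record Vm(String name, int cpuUnits, int ramGb, String requiredHw) {}
record Host(String name, int cpuUnits, int ramGb, Set<String> hw) {}

final class PlacementChecker {
    // True when every VM is placed on a compatible host and no host
    // capacity (cpu units, RAM) is exceeded.
    static boolean isValid(Map<Vm, Host> placement, Collection<Host> hosts) {
        Map<Host, Integer> cpuUsed = new HashMap<>();
        Map<Host, Integer> ramUsed = new HashMap<>();
        for (Map.Entry<Vm, Host> e : placement.entrySet()) {
            Vm vm = e.getKey();
            Host h = e.getValue();
            if (!h.hw().contains(vm.requiredHw())) {
                return false; // incompatible hardware
            }
            cpuUsed.merge(h, vm.cpuUnits(), Integer::sum);
            ramUsed.merge(h, vm.ramGb(), Integer::sum);
        }
        for (Host h : hosts) {
            if (cpuUsed.getOrDefault(h, 0) > h.cpuUnits()
                    || ramUsed.getOrDefault(h, 0) > h.ramGb()) {
                return false; // capacity exceeded
            }
        }
        return true;
    }
}
```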

  8. A good VM scheduler provides: bigger business value, same infrastructure.

  9. A good VM scheduler provides: same business value, smaller infrastructure.

  10. "KEEP CALM AND CONSOLIDATE AS HELL": VDI workload on one node: 12+ vCPU per pCPU, 100+ VMs per server.

  11. Static schedulers: consider the VM queue only; deployed everywhere [1,2,3,4]; suffer from fragmentation issues (fine for steady workloads?). Dynamic schedulers: use live-migrations [5] to address fragmentation; costly (storage, migration latency); thousands of articles [10-13]; over-hyped? [9], but used in private clouds [6,7,8].

  12. Placement constraints cover various concerns: performance, security, power efficiency, legal agreements, high-availability, fault-tolerance, … Dimension: spatial or temporal. Enforcement level: hard or soft. Manipulated concepts: state, placement, resource allocation, action schedule, counters, etc.

  13. Discrete constraints: spread(VM[1,2]), ban(VM1, N1), ban(VM2, N2) over nodes N1..N3: a "simple" spatial problem. Continuous constraints [15]: the same spread and ban constraints must also hold during the reconfiguration: a harder scheduling problem (think about how actions interleave).

  14. Hard constraints, e.g. spread(VM[1..50]): must be satisfied; an all-or-nothing approach; not always meaningful. Soft constraints, e.g. mostlySpread(VM[1..50], 4, 6): satisfiable or not; internal or external penalty model [6]; harder to implement and scale; hard to standardise?

  15. High-availability: x-FT means the VMs must survive any crash of x nodes (0-FT, 1-FT, … x-FT). Exact approach: solve n placement problems [17].

  16. The VMware DRS way: slot-based or cluster-based. Slot-based: set aside the x biggest nodes and check the remaining free slots; simple and scalable, but wasteful with heterogeneous VMs.
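
As a rough sketch of the slot-based idea (my own simplification in plain Java, not VMware's actual algorithm): take the largest per-VM reservation as the slot size, set aside the x biggest nodes, and count the slots left on the surviving ones. The oversized slot is exactly what wastes capacity with heterogeneous VMs.

```java
import java.util.*;

// Hypothetical slot-based x-FT admission check, simplified from the DRS idea.
final class SlotBasedFt {
    // vmReservations and nodeCapacities hold [cpu, mem] pairs in abstract units.
    static boolean survivesXCrashes(List<int[]> vmReservations,
                                    List<int[]> nodeCapacities,
                                    int x) {
        // Slot = the biggest reservation on each dimension.
        int slotCpu = vmReservations.stream().mapToInt(r -> r[0]).max().orElse(1);
        int slotMem = vmReservations.stream().mapToInt(r -> r[1]).max().orElse(1);

        // Slots offered by each node, then pessimistically drop the x biggest nodes.
        List<Integer> slotsPerNode = new ArrayList<>();
        for (int[] cap : nodeCapacities) {
            slotsPerNode.add(Math.min(cap[0] / slotCpu, cap[1] / slotMem));
        }
        slotsPerNode.sort(Comparator.reverseOrder());
        int remainingSlots = slotsPerNode.stream().skip(x)
                .mapToInt(Integer::intValue).sum();

        // Every VM must still fit in one slot on the surviving nodes.
        return remainingSlots >= vmReservations.size();
    }
}
```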

  17. The constraint catalog evolves: Dynamic Power Management (DRS 3.1, 2009?), VM-VM affinity (DRS, 2010?), Dedicated instances (EC2, mar. 2011), VM-host affinity (DRS 4.1, apr. 2011), MaxVMsPerServer (DRS 5.1, sep. 2012), … and the constraint needed in 2014 may only show up in 2016.

  18. The objective (provider side): min(x) or max(x).

  19. Atomic objectives: min(penalties), min(Total Cost of Ownership), min(unbalance), …

  20. Composite objectives using weights: min(αx + βy), e.g. min(α·TCO + β·VIOLATIONS). How to estimate the coefficients? Is it useful to model something you don't understand? € as a common quantifier: max(REVENUES).

  21. Optimize or satisfy? Optimize, min(…) or max(…): easy to state, hardly provable, composable through weighting magic. Satisfy, threshold-based: requires domain-specific expertise, verifiable, composable.

  22. Acropolis Dynamic Scheduler [18]: hot spot mitigation. Trigger threshold: 85% on resource demand (CPU and storage-CPU, estimated by machine learning). Maintain: affinity constraints. Minimize: Σ migration cost.
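
Beyond the 85% trigger shown on the slide, the ADS logic is not detailed here, so the following is only an illustration in plain Java of a threshold-based hotspot detector, where the per-node demand would come from a prediction model:

```java
import java.util.*;

// Illustrative hotspot detection: flag nodes whose predicted demand exceeds a
// trigger threshold (85% on the slide). Names and data layout are hypothetical.
final class HotspotDetector {
    static final double TRIGGER = 0.85;

    // predictedDemand and capacity: node -> (resource name -> value),
    // e.g. resources "cpu" and "storage-cpu".
    static List<String> hotspots(Map<String, Map<String, Double>> predictedDemand,
                                 Map<String, Map<String, Double>> capacity) {
        List<String> hot = new ArrayList<>();
        for (String node : predictedDemand.keySet()) {
            Map<String, Double> demand = predictedDemand.get(node);
            Map<String, Double> cap = capacity.get(node);
            boolean overloaded = demand.entrySet().stream()
                    .anyMatch(e -> e.getValue() > TRIGGER * cap.get(e.getKey()));
            if (overloaded) {
                hot.add(node); // candidate for mitigation by migration
            }
        }
        return hot;
    }
}
```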

  23. Adapt the VM placement depending on pluggable expectations: network- and memory-aware migration scheduler, VM-(VM|PM) affinities, resource matchmaking, node state manipulation, counter-based restrictions, energy efficiency, discrete or continuous restrictions.

  24. BtrPlace: interaction through a DSL, an API, or JSON messages. Script: spread(VM[2..3]); preserve(VM1,'cpu',3); offline(@N4);. The resulting reconfiguration plan: 0'00 to 0'02: relocate(VM2,N2); 0'00 to 0'04: relocate(VM6,N2); 0'02 to 0'05: relocate(VM4,N1); 0'04 to 0'08: shutdown(N4); 0'05 to 0'06: allocate(VM1,'cpu',3).
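
The same constraints can be stated programmatically. The sketch below relies on the BtrPlace Java API as documented for the 1.x releases (DefaultModel, ShareableResource, Spread, Preserve, Offline, DefaultChocoScheduler); the cluster size and resource values are made up, so treat it as a starting point rather than a drop-in example:

```java
import java.util.*;
import org.btrplace.model.*;
import org.btrplace.model.constraint.*;
import org.btrplace.model.view.ShareableResource;
import org.btrplace.plan.ReconfigurationPlan;
import org.btrplace.scheduler.choco.DefaultChocoScheduler;

public class SpreadPreserveOffline {
    public static void main(String[] args) {
        Model model = new DefaultModel();
        Mapping map = model.getMapping();

        // 4 nodes and 4 VMs, a small subset of the slide's scenario.
        List<Node> nodes = new ArrayList<>();
        List<VM> vms = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            Node n = model.newNode();
            nodes.add(n);
            map.addOnlineNode(n);
            VM v = model.newVM();
            vms.add(v);
            map.addRunningVM(v, n);
        }

        // A cpu view: default node capacity 8, default VM consumption 1 (arbitrary).
        ShareableResource cpu = new ShareableResource("cpu", 8, 1);
        model.attach(cpu);

        // spread(VM[2..3]); preserve(VM1,'cpu',3); offline(@N4);
        List<SatConstraint> cstrs = new ArrayList<>();
        cstrs.add(new Spread(new HashSet<>(Arrays.asList(vms.get(1), vms.get(2)))));
        cstrs.add(new Preserve(vms.get(0), "cpu", 3));
        cstrs.add(new Offline(nodes.get(3)));

        ReconfigurationPlan plan = new DefaultChocoScheduler().solve(model, cstrs);
        System.out.println(plan); // timed actions, or null if no solution exists
    }
}
```

If a solution exists, the printed plan is a timed action list of the same shape as the one on the slide (relocate, allocate and shutdown actions).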

  25. An open-source Java library for constraint programming: deterministic composition, high-level constraints, the right model for the right problem.

  26. BtrPlace core CSP: models a reconfiguration plan, one transition model per element, action durations as constants. boot(v ∈ V) ≜ D(v) ∈ ℕ, st(v) ∈ [0, H − D(v)], ed(v) = st(v) + D(v), d(v) = ed(v) − st(v), d(v) = D(v), ed(v) < H, d(v) < H, h(v) ∈ {0, …, |N| − 1}. Similarly: relocatable(v ∈ V) ≜ …, shutdown(v ∈ V) ≜ …, suspend(v ∈ V) ≜ …, resume(v ∈ V) ≜ …, kill(v ∈ V) ≜ …, bootable(n ∈ N) ≜ …, haltable(n ∈ N) ≜ …

  27. Views bring additional concerns: new variables and relations. ShareableResource(r) ::= …, Network() ::= …, Power() ::= …, High-Availability() ::= …

  28. Constraints state new relations …

  29. Vector packing problem: items with a finite volume to place inside finite bins; a generalisation of the bin packing problem; the basic model of the infrastructure, with 1 dimension per resource (e.g. cpu and mem on nodes N1, N2); an NP-hard problem.
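
A classic baseline heuristic for this model is first-fit decreasing generalised to several dimensions. The sketch below (plain Java, hypothetical names, not a component of any scheduler) packs VMs onto nodes while respecting their cpu and mem capacities:

```java
import java.util.*;

// First-fit decreasing over a 2-dimensional vector packing instance (cpu, mem).
// Illustrative only: real schedulers add constraints and an action plan on top.
final class VectorPackingFfd {
    record Item(String vm, int cpu, int mem) {}
    record Bin(String node, int cpuCap, int memCap) {}

    static Map<String, String> pack(List<Item> vms, List<Bin> nodes) {
        Map<String, int[]> used = new HashMap<>();       // node -> [cpu, mem] used
        Map<String, String> placement = new HashMap<>(); // vm -> node

        // Sort VMs by decreasing size (here cpu + mem, one of many possible orders).
        List<Item> sorted = new ArrayList<>(vms);
        sorted.sort(Comparator.comparingInt((Item i) -> i.cpu() + i.mem()).reversed());

        for (Item vm : sorted) {
            for (Bin n : nodes) {
                int[] u = used.computeIfAbsent(n.node(), k -> new int[2]);
                if (u[0] + vm.cpu() <= n.cpuCap() && u[1] + vm.mem() <= n.memCap()) {
                    u[0] += vm.cpu();
                    u[1] += vm.mem();
                    placement.put(vm.vm(), n.node());
                    break;
                }
            }
        }
        return placement; // VMs missing from the map could not be placed
    }
}
```

First-fit decreasing is fast but offers no optimality guarantee on vector packing, which is one reason the talk moves to a constraint-programming formulation where the packing is just one constraint among others.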

  30. How to support migrations? During a migration, resources are temporarily used on both the source and the destination nodes.

  31. Migrations are costly. [Plot: migration duration [min.] versus allocated bandwidth [Mbit/s], for workloads 1000*10K, 1000*100K and 1000*200K.]
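
For intuition only, a first-order pre-copy model (my assumption; the slide's plot is measured data, not this formula) relates the migration duration to the allocated bandwidth and the rate at which the VM dirties its memory:

```java
// First-order estimate of a pre-copy live-migration duration.
// Assumption: each round retransmits the pages dirtied during the previous round,
// so the transfer converges only when bandwidth exceeds the dirtying rate.
final class MigrationEstimate {
    static double durationSeconds(double memBytes, double bandwidthBytesPerSec,
                                  double dirtyBytesPerSec) {
        if (bandwidthBytesPerSec <= dirtyBytesPerSec) {
            return Double.POSITIVE_INFINITY; // pre-copy never converges
        }
        // Geometric series: (mem / bw) * (1 + r + r^2 + ...) with r = dirty / bw.
        return memBytes / (bandwidthBytesPerSec - dirtyBytesPerSec);
    }

    public static void main(String[] args) {
        // e.g. 8 GiB of RAM, 500 Mbit/s allocated, 100 Mbit/s of dirtied memory.
        double gib = 8 * 1024.0 * 1024 * 1024;
        double mbit = 1_000_000 / 8.0;
        System.out.printf("~%.0f s%n", durationSeconds(gib, 500 * mbit, 100 * mbit));
    }
}
```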

  32.-36. Dynamic schedulers using vector packing [10,12]. [Animation: VM1..VM6 packed on nodes N1..N4 along cpu and mem, objective min(#onlineNodes) = 3.] Solution #1 empties one node with three migrations (1m, 1m, 2m).

  37.-42. [Animation, same instance.] Solution #2 empties one node with only two migrations (1m, 2m): a lower MTTR, the reconfiguration completes faster.

  43. Dynamic scheduling using vector packing [10, 12]. [Diagram: VM1..VM7 on nodes N1..N5.] Scenario: offline(N2) + no CPU sharing.

  44. Dependency management. [Diagram: initial placement of VM1..VM7 on nodes N1..N5.]

  45. Dependency management: 1) migrate VM2, migrate VM4, migrate VM5. [Diagram: placement after step 1.]

  46. Dependency management: 1) migrate VM2, migrate VM4, migrate VM5; 2) shutdown(N2), migrate VM7. [Diagram: placement after step 2.]

  47. Coarse-grain staging delays actions: stage 1: mig(VM2), mig(VM4), mig(VM5); stage 2: off(N2), mig(VM7).
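
A possible way to compute such stages (illustrative plain Java, not BtrPlace's executor): repeatedly pick every action whose prerequisites are already done, and run each batch as one stage.

```java
import java.util.*;

// Coarse-grain staging: group actions into stages so that an action runs only
// after every action it depends on has completed. Action names are hypothetical.
final class CoarseStaging {
    // deps maps an action to the actions that must complete before it starts.
    static List<List<String>> stages(Map<String, Set<String>> deps) {
        List<List<String>> result = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> remaining = new HashSet<>(deps.keySet());

        while (!remaining.isEmpty()) {
            List<String> stage = new ArrayList<>();
            for (String action : remaining) {
                if (done.containsAll(deps.get(action))) {
                    stage.add(action);
                }
            }
            if (stage.isEmpty()) {
                throw new IllegalStateException("dependency cycle");
            }
            result.add(stage);
            done.addAll(stage);
            remaining.removeAll(stage);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> deps = new HashMap<>();
        deps.put("mig(VM2)", Set.of());
        deps.put("mig(VM4)", Set.of());
        deps.put("mig(VM5)", Set.of());
        deps.put("off(N2)", Set.of("mig(VM2)", "mig(VM4)"));
        deps.put("mig(VM7)", Set.of("mig(VM5)"));
        // Two stages: {mig(VM2), mig(VM4), mig(VM5)} then {off(N2), mig(VM7)}.
        System.out.println(stages(deps));
    }
}
```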

  48. Resource-Constrained Project Scheduling Problem [14]. [Gantt chart: the actions of the plan scheduled per node N1..N5 over the time horizon 0 to 8.]

  49. Resource-Constrained Project Scheduling Problem: one resource per (node × dimension) with a bounded capacity; tasks model the VM lifecycle, their height models a consumption and their width a duration; at any moment, the cumulative task consumption on a resource cannot exceed its capacity; comfortable for expressing continuous optimisation; an NP-hard problem.
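
In a CP solver this rule maps directly onto a cumulative constraint. The sketch below uses the Choco solver, on which the BtrPlace scheduler is built; the horizon, capacity and durations are made up, and the method names follow the Choco 4 API, so check them against the version you use:

```java
import org.chocosolver.solver.Model;
import org.chocosolver.solver.variables.IntVar;
import org.chocosolver.solver.variables.Task;

// One (node x dimension) resource with a bounded capacity, tasks with a height
// (consumption) and a width (duration): a tiny RCPSP instance in Choco.
public class CumulativeSketch {
    public static void main(String[] args) {
        int H = 10;                  // scheduling horizon
        int capacity = 4;            // e.g. cpu capacity of one node
        int[] durations = {3, 4, 2}; // action durations (constants)
        int[] heights = {2, 3, 2};   // cpu consumed while the action runs

        Model m = new Model("rcpsp-sketch");
        Task[] tasks = new Task[durations.length];
        IntVar[] hs = new IntVar[durations.length];
        IntVar[] ends = new IntVar[durations.length];
        for (int i = 0; i < durations.length; i++) {
            IntVar st = m.intVar("st_" + i, 0, H - durations[i]);
            IntVar ed = m.intOffsetView(st, durations[i]); // end = start + duration
            tasks[i] = new Task(st, m.intVar(durations[i]), ed);
            hs[i] = m.intVar(heights[i]);
            ends[i] = ed;
        }
        // At any moment, the cumulated height of running tasks stays <= capacity.
        m.cumulative(tasks, hs, m.intVar(capacity)).post();

        // Minimize the makespan (latest end), akin to minimizing the plan duration.
        IntVar makespan = m.intVar("makespan", 0, H);
        m.max(makespan, ends).post();
        m.setObjective(Model.MINIMIZE, makespan);
        while (m.getSolver().solve()) {
            System.out.println("makespan = " + makespan.getValue());
        }
    }
}
```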

  50. From a theoretical to a practical solution: real action durations may be longer than estimated, so convert the timed plan into an event-based schedule. Timed plan: 0:3 migrate VM4; 0:3 migrate VM5; 0:4 migrate VM2; 3:8 migrate VM7; 4:8 shutdown(N2). Event-based schedule: migrate VM4; migrate VM5; migrate VM2; !migrate(VM2) & !migrate(VM4): shutdown(N2); !migrate(VM5): migrate VM7.
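
One simple way to execute an event-based schedule like this one (an illustration with java.util.concurrent, not how BtrPlace executes its plans) is to chain every action on the completion of the actions it waits on:

```java
import java.util.concurrent.CompletableFuture;

// Event-driven execution of the slide's plan: an action fires as soon as the
// actions it depends on have completed, instead of at a precomputed date.
public class EventBasedPlan {
    static CompletableFuture<Void> action(String name) {
        return CompletableFuture.runAsync(() -> System.out.println("done: " + name));
    }

    public static void main(String[] args) {
        CompletableFuture<Void> migVm4 = action("migrate VM4");
        CompletableFuture<Void> migVm5 = action("migrate VM5");
        CompletableFuture<Void> migVm2 = action("migrate VM2");

        // !migrate(VM2) & !migrate(VM4): shutdown(N2)
        CompletableFuture<Void> offN2 =
                CompletableFuture.allOf(migVm2, migVm4)
                        .thenRun(() -> System.out.println("done: shutdown N2"));

        // !migrate(VM5): migrate VM7
        CompletableFuture<Void> migVm7 =
                migVm5.thenRun(() -> System.out.println("done: migrate VM7"));

        CompletableFuture.allOf(offN2, migVm7).join(); // wait for the whole plan
    }
}
```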
