Checkpoint-Restart for a Network of Virtual Machines

  1. Checkpoint-Restart for a Network of Virtual Machines
     Rohan Garg, Komal Sodha, Zhengping Jin, Gene Cooperman
     College of Computer and Information Science
     Northeastern University, Boston
     Boston, Massachusetts 02115
     {rohgarg, komal, jinzp, gene}@ccs.neu.edu
     September 24, 2013

  2. Outline
     ◮ Motivation
     ◮ Related Work
     ◮ Design and Implementation
       ◮ DMTCP and Plugins
       ◮ Generic Checkpoint-Restart for Virtual Machines
       ◮ Checkpointing a network of VMs
     ◮ Experimental Results
     ◮ Conclusion

  5. Motivation
     ◮ Parallel Computations on the Cloud
     ◮ Not everybody uses MPI: IaaS (Infrastructure as a Service)
     ◮ Flexibility and maintainability
     Imagine if you could...
     ◮ deploy complex software configuration in a secure environment
     ◮ gain high reliability by running within a virtual machine that is set to take snapshots every minute
     ◮ checkpoint a network of virtual machines including the state of a parallel computation

  8. Related Work
     ◮ Virtual Machine checkpointing
       ◮ QEMU, KVM, Xen, VMware: Snapshotting
       ◮ Remus: High Availability on Xen-based servers
       ◮ VM-µCheckpoint: High frequency checkpointing on Xen
       ◮ Emulab: Distributed checkpointing with Xen; record-replay of network packets
       ◮ BlobSeer
     ◮ Checkpoint-restart
       ◮ BLCR: Kernel-space
       ◮ CryoPid2: Process Pods; 32-bit only
       ◮ CRIU: User-space; Linux containers
       ◮ DMTCP: User-space; distributed

  11. DMTCP and Plugins
      DMTCP:
      ◮ Distributed MultiThreaded Checkpointing
      ◮ User-space
      ◮ Transparent checkpointing
      ◮ Distributed processes
      ◮ Wide range of supported applications: MPI, Perl/Python, GDB, X-windows, Matlab, R
      DMTCP Plugins:
      ◮ DMTCP extensions; shared libraries
      ◮ Short, well-defined API
      ◮ Add support to handle the checkpoint-restart of specific resources

  12. DMTCP Plugins: Features
      Two essential features:
      ◮ Wrapper Functions:
        ◮ Interpose on library and system function calls
        ◮ Process the arguments; call the interposed function; and return the (possibly modified) return value
      ◮ DMTCP Events:
        ◮ Notify the plugin of several events: Pre-checkpoint, Post-restart, etc.
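The two features on this slide can be illustrated with a small analogy. DMTCP plugins are shared libraries, so the Python sketch below is not the DMTCP plugin API; it only mimics the pattern. The class `PluginSketch`, the function `open_device`, the `translate` policy, and the event names used as dictionary keys are all hypothetical names chosen for illustration.

```python
import functools


class PluginSketch:
    """Toy stand-in for a plugin: wraps calls and reacts to events (hypothetical)."""

    def __init__(self):
        self.log = []        # record of what the plugin observed
        self.handlers = {}   # event name -> list of callbacks

    def wrap(self, func):
        """Interpose on func: inspect arguments, call it, adjust the result."""
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            self.log.append(("call", func.__name__, args))  # process the arguments
            result = func(*args, **kwargs)                  # call the interposed function
            return self.translate(result)                   # possibly modified return value
        return wrapper

    def translate(self, result):
        # Assumed policy: expose a stable "virtual" handle instead of the real one.
        return ("virtual", result)

    def on(self, event, callback):
        """Register a callback for a named event."""
        self.handlers.setdefault(event, []).append(callback)

    def notify(self, event):
        """Deliver an event to every registered callback."""
        for callback in self.handlers.get(event, []):
            callback()


plugin = PluginSketch()


def open_device(name):
    """Pretend resource allocation returning a 'real' handle."""
    return len(name)


open_device = plugin.wrap(open_device)
plugin.on("pre-checkpoint", lambda: plugin.log.append(("event", "pre-checkpoint")))
plugin.on("post-restart", lambda: plugin.log.append(("event", "post-restart")))

handle = open_device("tap0")
plugin.notify("pre-checkpoint")
plugin.notify("post-restart")
```

The wrapper is the place where a plugin could record how a resource was created, and the event callbacks are where it could save or re-create that resource around a checkpoint.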

  14. Generic Checkpoint-Restart for VMs: Background
      Generic VM Architecture (figure):
      ◮ User Space Memory: Guest VM (user space component); tables (shared w/ kernel space); Async I/O threads; vCPU threads
      ◮ Kernel Space Memory: Kernel Module for VM: VM Shell; tables (shared with user space); Hardware description (peripherals, IRQ, etc.); vCPU0 ... vCPUn (vCPUs for virtual cores)
      Special Cases:
      ◮ Xen, VMware ESXi Server: very thin hypervisor; bare-metal; no host OS
      ◮ QEMU: Software emulation; user-space

  15. Generic Checkpoint-Restart for VMs: Background
      ◮ DMTCP:
        ◮ Handle user-space memory, file descriptors, sockets, etc.
      % dmtcp_checkpoint qemu <args-for-qemu>
      % dmtcp_command --checkpoint
      % dmtcp_restart ckpt-qemu-img.dmtcp

  17. Checkpoint-Restart for KVM: Key Ideas
      DMTCP KVM Plugin:
      ◮ Launch empty VM shell
      ◮ Copy the checkpoint image (they’re just bits) from the old checkpointed VM
      ◮ Restore kernel VM driver parameters
      ◮ Patch kernel VM driver parameters
      [Figure: the generic VM architecture from slide 14, with the kernel module's VM Shell marked "(Empty H/W description)".]
      % dmtcp_checkpoint \
          --with-plugin dmtcp_kvm_plugin.so \
          qemu -enable-kvm <args-for-qemu>
      % dmtcp_command --checkpoint
      % dmtcp_restart ckpt-qemu-img.dmtcp

  19. Challenges for checkpointing a network of VMs
      Challenges:
      ◮ Synchronization between VMs
      ◮ Re-generating the virtual network
      ◮ Saving and restoring in-flight data

  23. Challenges for checkpointing a network of VMs: Solutions
      ◮ Synchronization between VMs
        ◮ DMTCP Co-ordinator
      ◮ Re-generating the virtual network
      ◮ Saving and restoring in-flight data
        ◮ DMTCP TUN/TAP Plugin: Heuristic:
          ◮ Quiesce the user-application threads
          ◮ Wait for a fixed time: assume all packets have arrived
          ◮ Write the checkpoint image (if additional packets continue to arrive, try again)
        ◮ Alternative approach: broadcast a cookie
      % dmtcp_checkpoint \
          --with-plugin dmtcp_kvm_plugin.so \
          --with-plugin dmtcp_tun_plugin.so \
          qemu -enable-kvm <args-for-qemu>
      % dmtcp_command --checkpoint
      % dmtcp_restart ckpt-qemu-img.dmtcp
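The in-flight data heuristic on this slide (quiesce, wait a fixed time, write the image, retry if packets keep arriving) can be sketched as a small loop. This is a simulation under stated assumptions, not the TUN/TAP plugin's code: `poll_packets`, `write_image`, `quiet_period`, `max_attempts`, and the scripted `arrivals` queue are hypothetical, and the application threads are assumed to be quiesced before the function is called.

```python
import time
from collections import deque


def drain_then_checkpoint(poll_packets, write_image, quiet_period=0.01, max_attempts=5):
    """Checkpoint after a fixed wait; try again if packets arrived meanwhile.

    poll_packets() returns the packets received since the previous call.
    write_image(packets) persists a checkpoint that includes those packets.
    Returns the number of attempts that were needed.
    """
    buffered = []
    for attempt in range(1, max_attempts + 1):
        time.sleep(quiet_period)           # fixed wait: assume all packets have arrived
        buffered.extend(poll_packets())    # keep whatever showed up during the wait
        write_image(list(buffered))        # write the checkpoint image
        late = poll_packets()
        if not late:                       # nothing arrived while writing: image is complete
            return attempt
        buffered.extend(late)              # packets kept arriving: try again
    raise RuntimeError("network did not go quiet within max_attempts")


# Scripted traffic for the simulation: one batch is handed out per poll.
arrivals = deque([[b"p1"], [b"p2"], [], []])
images = []


def poll_packets():
    return arrivals.popleft() if arrivals else []


attempts = drain_then_checkpoint(poll_packets, images.append)
```

In this scripted run the first image is stale because a packet arrives while it is being written, so the loop writes a second image that contains both packets. A fixed wait is only a heuristic; the slide's alternative of broadcasting a cookie would replace the timing assumption with an explicit marker.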

  25. Experimental Results: Setup
      ◮ Network of Virtual Machines
        ◮ 12-node cluster (at University of Alabama, Birmingham)
        ◮ Each node: 12-core Intel Xeon (1.6 GHz) server; 24 GB RAM
        ◮ KVM/QEMU with Tap
        ◮ Host OS: 64-bit CentOS; Linux Kernel 2.6.32
        ◮ Guest OS: Ubuntu 12.04 Server
      ◮ Others:
        ◮ Btrfs (nested VMs)
        ◮ DMTCP optimizations
        ◮ Commodity computer

  26. Experimental Results: Scalability
      [Figure: Time (seconds) vs. Number of Nodes; series: Checkpoint, Restart.]
      Checkpoint-restart of HPCC benchmark on a Gigabit Ethernet cluster. (Memory allocated in each case is 1024 MB.)

  28. Experimental Results: Optimizations - I
      ◮ Btrfs filesystem
        ◮ Fast, incremental checkpoints
        ◮ Copy-on-write filesystem
        ◮ Going to be the default filesystem (soon?)
        ◮ Nested VMs
      ◮ DMTCP optimizations
        ◮ Forked checkpointing: copy-on-write: fork a child to write checkpoint; parent continues
        ◮ mmap-based fast restart: on-demand paging from the checkpoint image
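The two DMTCP optimizations named on this slide can be illustrated with operating-system primitives. The sketch below is not DMTCP's implementation; it only shows the idea on a POSIX system, using `os.fork` for the copy-on-write snapshot and `mmap` for on-demand paging. The function names `forked_checkpoint` and `mmap_restart`, the file name `ckpt.img`, and the toy `state` buffer are hypothetical.

```python
import mmap
import os
import tempfile


def forked_checkpoint(state: bytearray, path: str) -> int:
    """Fork a child that writes `state` to `path`; the parent returns at once."""
    pid = os.fork()
    if pid == 0:                       # child: sees a copy-on-write snapshot of memory
        status = 1
        try:
            with open(path, "wb") as f:
                f.write(state)
            status = 0
        finally:
            os._exit(status)           # exit without running the parent's cleanup code
    return pid                         # parent: continues computing immediately


def mmap_restart(path: str) -> mmap.mmap:
    """Map the image read-only; pages are fetched from disk only when touched."""
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


state = bytearray(b"guest-memory-" * 1000)
image_path = os.path.join(tempfile.mkdtemp(), "ckpt.img")

child = forked_checkpoint(state, image_path)
state[0:5] = b"DIRTY"                  # parent keeps mutating; the child's snapshot is unaffected
os.waitpid(child, 0)                   # wait here only so the image can be read back below

restored = mmap_restart(image_path)
first_bytes = restored[:13]            # touching the mapping pages the data in on demand
```

The image on disk reflects the state at the moment of the fork even though the parent modified its buffer afterwards, which is why the parent does not have to pause while the checkpoint is written.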

  29. Experimental Results: Optimizations - II
      [Figure: Time (seconds) vs. Number of Nodes; series: Ckpt w/ Btrfs, Ckpt w/o Btrfs, Restart w/ Btrfs, Restart w/o Btrfs.]
      Snapshotting up to four distributed VMs running HPCC under KVM/QEMU. The Btrfs filesystem is used to snapshot the filesystem using nested VMs. (Memory allocated in each case is 384 MB. The size of the guest filesystem is 2 GB.)

  30. Experimental Results: Optimizations - II
      [Figure: Time (seconds) vs. Number of Nodes; series: Ckpt, Ckpt w/ F/C, Ckpt w/ F/R, Ckpt w/ F/C + F/R.]
      Checkpoint of HPCC benchmark on a Gigabit Ethernet cluster, as influenced by DMTCP’s optional optimizations: forked checkpoint (F/C) and fast restart (F/R). DMTCP’s default gzip compression of checkpoint images is incompatible with DMTCP F/R, and so is not used in those cases. (Memory allocated in each case is 1024 MB.)

  31. Experimental Results: Optimizations - II
      [Figure: Time (seconds) vs. Number of Nodes; series: Restart, Restart w/ F/C, Restart w/ F/R, Restart w/ F/C + F/R.]
      Restart of HPCC benchmark on a Gigabit Ethernet cluster, as influenced by DMTCP’s optional optimizations: forked checkpoint (F/C) and fast restart (F/R). DMTCP’s default gzip compression of checkpoint images is incompatible with DMTCP F/R, and so is not used in those cases. (Memory allocated in each case is 1024 MB.)
