Recall: virtual machines (VMs) Each guest VM runs a complete OS - - PowerPoint PPT Presentation


SLIDE 1

Remus: VM Replication

Jeff Chase, Duke University

SLIDE 2

Recall: virtual machines (VMs)

  • Each guest VM runs a complete OS instance over an isolated “sliver” of host physical memory.

  • Hypervisors support migration and suspend/resume.

    – Both operations require an atomic snapshot (checkpoint) of VM memory state and register contexts.
    – Capture modified pages and write them to snapshot.

[Figure labels: hypervisor (VMM); host; guest; guest kernel]

SLIDE 3

Capturing modified pages

  • How to do it?
  • Recall the earlier “Address Translation Uses” slides.
  • <Discuss.>
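
One possible answer to the question above, offered as a sketch rather than as the slide's intended solution: write-protect every page, let the first write to a page trap, record the page as dirty in the fault handler, and then allow the write. The toy model below simulates that with a per-page `writable` flag standing in for the write-permission bit of a page table entry. The class and method names (`TrackedMemory`, `collect_dirty`) are hypothetical and do not come from any hypervisor.

```python
class TrackedMemory:
    """Toy guest memory with write-protection-based dirty tracking."""

    def __init__(self, num_pages):
        self.pages = [b""] * num_pages
        # Hypothetical stand-in for the write-permission bit in each PTE.
        # All pages start write-protected.
        self.writable = [False] * num_pages
        self.dirty = set()

    def write(self, page_no, data):
        if not self.writable[page_no]:
            # Simulated protection fault on the first write to this page.
            self._on_write_fault(page_no)
        self.pages[page_no] = data

    def _on_write_fault(self, page_no):
        # Fault handler: remember the page, then unprotect it so later
        # writes in the same interval proceed without trapping.
        self.dirty.add(page_no)
        self.writable[page_no] = True

    def collect_dirty(self):
        """Return pages modified since the last call and re-protect them."""
        captured = {p: self.pages[p] for p in sorted(self.dirty)}
        for p in self.dirty:
            self.writable[p] = False
        self.dirty.clear()
        return captured


mem = TrackedMemory(num_pages=8)
mem.write(2, b"aa")
mem.write(5, b"bb")
mem.write(2, b"cc")          # second write to page 2: no new fault
first = mem.collect_dirty()  # pages 2 and 5
mem.write(5, b"dd")
second = mem.collect_dirty() # only page 5
```

Only the first write to a page in each interval pays the cost of a fault, so the overhead scales with the number of distinct pages touched rather than the number of writes.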
SLIDE 4

Remus checkpoints

  • Snapshot the VM, but don’t suspend it.

    – Snapshot periodically as it executes.
    – Snapshot concurrently: keep running while snap is in progress.

  • Migrate the VM, but don’t start the remote copy.

    – Just load the snapshot on the remote host.
    – Transmit “live” incremental checkpoints over the network.
    – Update the remote snapshot/copy/instance in place.
    – Remote host is a warm standby or backup replica.

  • All checkpoints are atomic: they capture a point in time.
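
A minimal sketch of how a warm standby might update its copy in place while keeping each checkpoint atomic. Everything here is an assumption made for illustration: the checkpoint is modeled as a dictionary with `epoch`, `dirty_pages`, and `registers` keys, and `BackupReplica` is a hypothetical name. The point is that the replica's visible state only ever corresponds to a complete checkpoint.

```python
class BackupReplica:
    """Toy standby that applies incremental checkpoints all-or-nothing."""

    def __init__(self):
        self.pages = {}      # page number -> contents at the last checkpoint
        self.registers = {}
        self.epoch = 0       # number of the last checkpoint applied

    def apply(self, checkpoint):
        # Reject gaps or reordering so the state always matches one point in time.
        if checkpoint["epoch"] != self.epoch + 1:
            raise ValueError("checkpoint out of order")
        # Merge into a copy first, then swap, so a failure partway through
        # the merge leaves the previous checkpoint intact.
        merged = dict(self.pages)
        merged.update(checkpoint["dirty_pages"])
        self.pages = merged
        self.registers = dict(checkpoint["registers"])
        self.epoch = checkpoint["epoch"]


backup = BackupReplica()
backup.apply({"epoch": 1,
              "dirty_pages": {0: b"boot", 1: b"data"},
              "registers": {"pc": 100}})
backup.apply({"epoch": 2,
              "dirty_pages": {1: b"data2"},   # only the page that changed
              "registers": {"pc": 180}})
```

The second checkpoint carries a single page, yet the replica ends up with the full state as of epoch 2, which is what makes incremental transfer cheap.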
SLIDE 5

Remus Checkpoints

n Remus divides time into epochs (~25ms) n Performs a checkpoint at the end of each epoch

  • 1. Suspend primary VM
  • 2. Copy all state changes to a buffer in Domain 0
  • 3. Resume primary VM
  • 4. Send asynchronous message to backup containing state changes
  • 5. Backup VM applies state changes

[Figure: periodic checkpoints (changes to VM state) sent from the primary server (Domain 0, primary VM, Xen VMM) to the backup server (Domain 0, backup VM, Xen VMM).]

[Ashraf Aboulnaga RemusDB]
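
The five checkpoint steps listed on this slide can be sketched as a small simulation. This is not Remus code: VM state is modeled as a dictionary, the asynchronous message is modeled as a `deque` that the backup drains later, and all names (`PrimaryVM`, `checkpoint`, `backup_apply`) are hypothetical.

```python
from collections import deque


class PrimaryVM:
    """Toy primary: a key/value state plus the changes made this epoch."""

    def __init__(self):
        self.state = {}
        self.dirty = {}
        self.running = True

    def write(self, key, value):
        if not self.running:
            raise RuntimeError("VM is suspended")
        self.state[key] = value
        self.dirty[key] = value


def checkpoint(primary, channel, epoch):
    primary.running = False            # 1. suspend primary VM
    buffered = dict(primary.dirty)     # 2. copy state changes to a local buffer
    primary.dirty.clear()
    primary.running = True             # 3. resume primary VM
    channel.append((epoch, buffered))  # 4. queue the message; do not wait for it


def backup_apply(channel, backup_state):
    """5. Backup applies queued state changes in order; returns last epoch applied."""
    last = 0
    while channel:
        epoch, changes = channel.popleft()
        backup_state.update(changes)
        last = epoch
    return last


primary = PrimaryVM()
channel = deque()
backup_state = {}

primary.write("x", 1)
primary.write("y", 2)
checkpoint(primary, channel, epoch=1)
primary.write("x", 3)
checkpoint(primary, channel, epoch=2)
primary.write("y", 99)                 # made after the last checkpoint
last_epoch = backup_apply(channel, backup_state)
```

The VM is paused only for the local copy in step 2; the network transfer in step 4 overlaps with execution of the next epoch, which is why the copy goes to a buffer first.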

SLIDE 6

Transparent HA for DBMS

  • RemusDB: efficient and transparent active/standby high availability for DBMS, implemented in the virtualization layer
  • Propagates all changes in VM state from primary to backup
  • High availability with no code changes to the DBMS
  • Completely transparent failover from primary to backup
  • Failover to a warmed-up backup server

[Figure: primary server and backup server, each running the DBMS and its DB inside a VM; changes to VM state flow from primary to backup.]

[Ashraf Aboulnaga RemusDB]

SLIDE 7

Remus

SLIDE 8

Remus Checkpoints

n After a failure, the backup resumes execution from the

latest checkpoint

n Any work done by the primary during epoch C will be lost (unsafe)

n Remus provides a consistent view of execution to clients

n Any network packets sent during an epoch are buffered until the

next checkpoint

n Guarantees that a client will see results only if they are based on

safe execution

n Same principle is also applied to disk writes

8 [Ashraf Aboulnaga RemusDB]
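
The buffering rule on this slide can be sketched as follows. The sketch assumes that "the next checkpoint" means the moment the checkpoint covering those packets is known to be safe on the backup, and it models the network as a plain callback. `OutputBuffer` and its method names are hypothetical.

```python
class OutputBuffer:
    """Toy outbound buffer: hold packets until their epoch is checkpointed."""

    def __init__(self, send):
        self._send = send      # callback that puts a packet on the wire
        self._pending = []     # packets produced during the current epoch

    def transmit(self, packet):
        # The guest believes the packet was sent; it is only queued.
        self._pending.append(packet)

    def checkpoint_committed(self):
        # The state that produced these packets is now safe, so release them.
        released, self._pending = self._pending, []
        for packet in released:
            self._send(packet)

    def primary_failed(self):
        # Unreleased packets came from execution that will be lost. Clients
        # never saw them, so dropping them keeps the external view consistent.
        dropped, self._pending = self._pending, []
        return dropped


wire = []
buf = OutputBuffer(send=wire.append)
buf.transmit("reply-1")
buf.transmit("reply-2")
before_commit = list(wire)   # nothing is visible to clients yet
buf.checkpoint_committed()
after_commit = list(wire)    # both replies released together
buf.transmit("reply-3")
dropped = buf.primary_failed()
```

The cost of this guarantee is added latency on every outbound packet, bounded by roughly one epoch plus the time needed to complete the checkpoint.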

SLIDE 9

Outbound packet buffering

SLIDE 10

Disk (FS) updates

SLIDE 11

Remus implementation

SLIDE 12

Tardigrade (NSDI-15)

SLIDE 13

Remus checkpoint latency

SLIDE 14

Remus overhead

SLIDE 15

Tardigrade

SLIDE 16

Tardigrade