SLIDE 1

Using Application-Driven Checkpointing for Hot Spare High Availability

Antti Kantee, Cubical Solutions Ltd.

SLIDE 2

The Target: 2n hotspare

Antti Kantee <pooka@cubical.fi>, 2004.

imagine some mission critical service

fact: all hardware will break some day

for each server, install a spare server

if something bad happens to the server, the spare will take over

  machine crash

  service crash

service state will be migrated

  not migrating service state: cold spare (easy ...)

adding support should not cripple service

SLIDE 3

The Presentation

problem

solution

boosting performance

implementation

adaption

conclusions

standing ovation

  (or the more likely rotten tomato scene)

SLIDE 4

The Problem

how to preserve state?

classic approach: checkpoint

  usually below process-level ==> transparent to process

problem in classic approach implementations: they rarely apply to networked services

  checkpointing will take very long

  checkpoint will be huge

  feature-support limited: no external communication allowed (fd’s ...), thread-support usually non-existent

SLIDE 5

The Solution

Application-Driven Checkpointing

instead of checkpointing being transparent to the process, do the opposite: leave checkpointing entirely up to application

good:

  checkpoint exactly the right data

  checkpoint at exactly the right time

  possible to get extended feature support

bad:

  need to modify each application separately

SLIDE 6

What is process state?

in other words: what do we want to capture

memory:

  for a C program, this is pretty much WYGIWYG

  loads and stores are directly mapped to memory

  might be more difficult for actual programming languages

"other stuff":

  file descriptors / sockets

  threads

  you name it ...

SLIDE 7

Application-Checkpointing: Naive Approach

simply write out the pieces in the previous two sets (memory and "other stuff")

the application decides what gets stored

  instead of the application deciding what does not get stored

need to figure out some serialization form for the information

  for memory this is pretty easy: (addr, len, content)

  for "other stuff" equally easy, just more laborious

we could just write out everything in the process context when checkpoint() is called

  but that doesn’t perform especially well
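
The (addr, len, content) form for memory can be sketched roughly as follows. This is only an illustration under stated assumptions: the names cpt_region, cpt_write_regions and cpt_read_regions are made up for this sketch and are not part of the framework in the slides, and the restore side assumes the restoring process has the same areas mapped at the same addresses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* One memory area the application chose to checkpoint (hypothetical type). */
struct cpt_region {
    void  *addr;
    size_t len;
};

/* Write each area as an (addr, len, content) record.
 * Returns 0 on success, -1 on a failed write. */
int cpt_write_regions(FILE *out, const struct cpt_region *r, size_t n)
{
    assert(out != NULL && (r != NULL || n == 0));
    for (size_t i = 0; i < n; i++) {
        uint64_t hdr[2] = { (uint64_t)(uintptr_t)r[i].addr, (uint64_t)r[i].len };
        if (fwrite(hdr, sizeof hdr, 1, out) != 1)
            return -1;
        if (r[i].len > 0 && fwrite(r[i].addr, r[i].len, 1, out) != 1)
            return -1;
    }
    return 0;
}

/* Read records back and copy each payload to its recorded address.
 * Assumption: the same areas exist at the same addresses in this process. */
int cpt_read_regions(FILE *in)
{
    uint64_t hdr[2];
    assert(in != NULL);
    while (fread(hdr, sizeof hdr, 1, in) == 1) {
        void *dst = (void *)(uintptr_t)hdr[0];
        if (hdr[1] > 0 && fread(dst, (size_t)hdr[1], 1, in) != 1)
            return -1;                 /* truncated payload */
    }
    return feof(in) ? 0 : -1;          /* stop cleanly at end of file only */
}
```

An application would call cpt_write_regions() from its checkpoint routine with the list of areas it registered, and cpt_read_regions() once on the spare before resuming.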

SLIDE 8

Boosting Performance

two common & cheap solutions

asynchronous

  do not checkpoint in process context

  while the actual cost is still there, the application hopefully does not take such a heavy penalty

incremental

  write out deltas only

  the more you checkpoint, the more you save

  hmm, where have I heard that before?

SLIDE 9

Asynchronous checkpointing

many employ fork()

  get new execution context

  memory "protected" by copy-on-write

ok, that was easy
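
A minimal sketch of the plain fork() approach, assuming a single-threaded process and a checkpoint that is simply one buffer written to a file. The function names checkpoint_async and checkpoint_wait are invented for illustration; the slides do not show the framework's own code for this.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Snapshot len bytes at data into path without blocking the caller.
 * Returns the child's pid (reap it later), or -1 if fork() failed. */
pid_t checkpoint_async(const char *path, const void *data, size_t len)
{
    assert(path != NULL && data != NULL);
    pid_t pid = fork();
    if (pid != 0)
        return pid;              /* parent, or fork error: carry on at once */

    /* Child: sees memory as it was at fork() time (copy-on-write). */
    FILE *out = fopen(path, "wb");
    int ok = out != NULL && fwrite(data, 1, len, out) == len;
    if (out != NULL && fclose(out) != 0)
        ok = 0;
    _exit(ok ? 0 : 1);           /* _exit: do not run the parent's atexit handlers */
}

/* Reap the checkpoint child; returns 0 if the snapshot was written completely. */
int checkpoint_wait(pid_t pid)
{
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The parent may modify the buffer immediately after checkpoint_async() returns; the child still writes the contents as they were when fork() was called.
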
SLIDE 10

Incremental checkpointing

many employ mprotect() and signal handlers

  userspace solution

MMU already tracks modification information

  used by pagedaemon

  wire pages, and pagedaemon no longer needs that info

  asking MMU perhaps not the best option, but it was easy to implement ;-)

  some archs have soft "dirty" bit, not in MMU

SLIDE 11

Pulling memory checkpointing together

two new syscalls: cptctl() and cptfork()

cptctl:

  add/remove checkpoint areas monitored for deltas

  query changes

cptfork:

  mostly same as fork()

  check for modified pages

SLIDE 12

Additional State

we cannot take file descriptors, sockets, signals, threads etc. from a memory dump

  kernel state, including lots of structure linkage, so transfer as opaque data not possible

use a syscall augmentation-style approach:

  for most entities, it is possible to query the current state from the kernel

  when restoring, use normal syscalls to "trick" kernel

so basically handle this entirely in userspace

unfortunately TCP is not supported :(
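
As an illustration of querying state and rebuilding it with normal syscalls, here is a sketch for a regular file descriptor. It assumes the application supplies the path when registering the descriptor (there is no portable way to ask the kernel for it) and that the file is visible under the same path on the spare. The struct and function names are made up for this sketch.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Userspace description of one open file (hypothetical type). */
struct fd_state {
    int   fd;          /* descriptor number the application expects */
    int   flags;       /* access mode and status flags from F_GETFL */
    off_t offset;      /* current file position */
    char  path[256];   /* supplied by the application at registration */
};

/* Query what the kernel will tell us about fd. Returns 0 or -1. */
int fd_capture(int fd, const char *path, struct fd_state *st)
{
    assert(path != NULL && st != NULL);
    st->fd = fd;
    st->flags = fcntl(fd, F_GETFL);
    st->offset = lseek(fd, 0, SEEK_CUR);
    if (st->flags == -1 || st->offset == (off_t)-1 ||
        strlen(path) >= sizeof st->path)
        return -1;
    strcpy(st->path, path);
    return 0;
}

/* Rebuild the descriptor with ordinary syscalls: open, lseek, dup2.
 * Returns the restored descriptor number, or -1 on failure. */
int fd_restore(const struct fd_state *st)
{
    assert(st != NULL);
    /* Never create or truncate the file while restoring. */
    int fd = open(st->path, st->flags & ~(O_CREAT | O_EXCL | O_TRUNC));
    if (fd == -1)
        return -1;
    if (lseek(fd, st->offset, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }
    if (fd != st->fd) {                /* move it to the expected number */
        if (dup2(fd, st->fd) == -1) {
            close(fd);
            return -1;
        }
        close(fd);
    }
    return st->fd;
}
```

The fd_state record itself is plain data, so it can travel in the checkpoint next to the memory records. Sockets need more than this, which is where the TCP limitation shows up.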

SLIDE 13

Dealing with Multithreading

do not record program counter, register values, etc.

treat a thread like any other "additional state"

  record "worker function" address and argument only

for each registered thread, at restore a thread is created and the worker function is called

problem: locking
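
The record-and-recreate idea could be sketched with POSIX threads as below. The names are invented for illustration and are not the framework's hsthreadreg(). The sketch assumes the spare runs the same binary loaded at the same address (so function addresses stay valid) and that each worker can pick up its progress from checkpointed memory reachable through its argument. Build with -pthread where the platform requires it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 16

/* What gets checkpointed per thread: entry point and argument only. */
struct thread_rec {
    void *(*worker)(void *);
    void  *arg;
};

static struct thread_rec registry[MAX_THREADS];
static size_t nregistered;

/* Register a worker thread for checkpointing. Returns 0, or -1 if full. */
int thread_register(void *(*worker)(void *), void *arg)
{
    assert(worker != NULL);
    if (nregistered == MAX_THREADS)
        return -1;
    registry[nregistered].worker = worker;
    registry[nregistered].arg = arg;
    nregistered++;
    return 0;
}

/* At restore: start one fresh thread per record, each from the top of its
 * worker function. Returns the number of threads started. */
size_t threads_restore(pthread_t *out)
{
    size_t i;
    for (i = 0; i < nregistered; i++)
        if (pthread_create(&out[i], NULL, registry[i].worker, registry[i].arg) != 0)
            break;
    return i;
}

/* Example worker: its only state is the counter it is handed, so starting
 * it again from the top continues where the checkpoint left off. */
void *count_worker(void *arg)
{
    int *progress = arg;
    *progress += 1;
    return NULL;
}
```

Because no program counter or registers are saved, a lock held by a thread at checkpoint time has no owner after restore; that is the locking problem noted above.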

SLIDE 14

Additional Support

define spare machine(s)

move snapshots of runtime state to spare machines

  TCP/IP, IP/carrier pigeon, whatever suits you

detect failures

  leave that up to the application to define ;-)

  provide a simple "ping"-approach in the framework

direct network traffic to "spare" after master has crashed and process has been rebuilt
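
The slides do not say how the "ping" approach decides that the master is gone. One common way, shown here purely as an assumed sketch with hypothetical names and parameters, is to count silence: the spare declares failure once no reply has arrived for more than a fixed number of ping intervals.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Spare-side bookkeeping for a ping-style failure detector (hypothetical). */
struct hb_monitor {
    time_t last_seen;     /* when the master last answered a ping */
    int    interval_s;    /* seconds between pings */
    int    max_missed;    /* missed replies tolerated before failover */
};

void hb_init(struct hb_monitor *m, time_t now, int interval_s, int max_missed)
{
    assert(m != NULL && interval_s > 0 && max_missed > 0);
    m->last_seen = now;
    m->interval_s = interval_s;
    m->max_missed = max_missed;
}

/* Call whenever a ping reply arrives from the master. */
void hb_pong(struct hb_monitor *m, time_t now)
{
    m->last_seen = now;
}

/* True once the master has been silent for more than max_missed intervals. */
bool hb_master_failed(const struct hb_monitor *m, time_t now)
{
    return difftime(now, m->last_seen) > (double)m->interval_s * m->max_missed;
}
```

Passing the current time in as a parameter keeps the check independent of how pings are actually transported, in line with leaving the transport up to whatever suits the deployment.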

SLIDE 15

Application Interface to Framework

Philosophy: everything that can be supported application-transparently should be, but it should not prevent any tricks the application might want to pull

generally what needs to be done:

  reserve checkpoint memory with hsmalloc()

  group essential memory into e.g. structs

  register some additional info: hsfdreg(), hsthreadreg()

  sprinkle checkpoints into appropriate places: hscpt()

restore handled in framework also

SLIDE 16

Adapting

kernel portion should be in theory adaptable to other systems

  Linux & FreeBSD & Chorus investigated

userspace library should be portable code as-is

adapting application is an interesting question

  most UNIX programs are stateless

  state tied to TCP

  persistent state dealt with by application-specific methods

tetris was easy to adapt

sqlite almost equally easy

SLIDE 17

Conclusions

transparent checkpointing has problems

application-driven checkpointing ties application semantics to the task of checkpointing

  knowledge can be used in optimizing checkpoint time & place

kernel support provides additional boost

state annoyingly tied to TCP

  but at least with application-driven checkpointing we have a chance to deal with it

adaption effort depends greatly on application