Grid Checkpointing, John Mehnert-Spahn, Heinrich-Heine University

Grid Checkpointing. John Mehnert-Spahn, Heinrich-Heine University Duesseldorf, Germany. XtreemOS Summer School, Günzburg, Germany, 2010. XtreemOS IP project is funded by the European Commission under contract IST-FP6-033576.


SLIDE 1

Grid Checkpointing

John Mehnert-Spahn

Heinrich-Heine University Duesseldorf, Germany
XtreemOS Summer School, Günzburg, Germany, 2010

XtreemOS IP project is funded by the European Commission under contract IST-FP6-033576

SLIDE 2

Overview

  • Checkpointing
  • XtreemGCP
  • Communication channel checkpointing with heterogeneous checkpointers
  • (Adaptive checkpointing: incremental grid checkpointing)

SLIDE 3

Grid Jobs

Figure: Job A running in a VO (virtual organization), with job units A1, A2, A3 and A4; sites shown: Paris, London, Duesseldorf, Barcelona.

SLIDE 4

Faults

Figure: Job A running in a VO (virtual organization), with job units A1, A2, A3 and A4; sites shown: Paris, London, Duesseldorf, Barcelona.

Fault tolerance needed

SLIDE 5

Fault tolerance

  • Replication
  • Forward error recovery
  • Backward error recovery

SLIDE 6

Checkpointing & Restart

  • Checkpointing: The application state is saved periodically to stable storage.
  • Restart: The application is re-established from a recent checkpoint, so it does not fall back to its initial state.

SLIDE 7

Checkpointing & Restart

  • Checkpointing: periodically saving the state of the application to stable storage
  • Restart: after a fault, the application restarts from a checkpoint and does not fall back to its initial state
  • Challenges:
    • Trade-off between costs during fault-free execution and costs at recovery
    • Size of the distributed state may be very large
    • Checkpoint images must be replicated
    • Heterogeneity of checkpointer packages
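The checkpoint and restart cycle described on this slide can be sketched for a single process as follows. This is a minimal illustration, not XtreemGCP code: the function names, the pickle-based image format and the `fail_at` fault injection are all assumptions made for the example. The `interval` parameter is where the trade-off shows up: a small interval costs more during fault-free execution, a large one loses more work at recovery.

```python
import os
import pickle
import tempfile


def save_checkpoint(state, path):
    """Write state to path atomically so a crash never leaves a torn file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as tmp:
        pickle.dump(state, tmp)
        tmp.flush()
        os.fsync(tmp.fileno())  # force the data to disk before publishing it
    os.replace(tmp_path, path)  # atomic rename over the previous checkpoint


def load_checkpoint(path, initial_state):
    """Return the last saved state, or the initial state if none exists."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return initial_state


def run(path, total_steps, interval, fail_at=None):
    """Sum 0..total_steps-1, checkpointing every `interval` steps.

    `fail_at` is a hypothetical fault-injection hook used only for the demo.
    """
    state = load_checkpoint(path, {"step": 0, "total": 0})
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated fault")
        state["total"] += state["step"]
        state["step"] += 1
        if state["step"] % interval == 0:
            save_checkpoint(state, path)
    return state["total"]
```

Calling `run` again with the same path after a fault resumes from the last saved step instead of step 0.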

SLIDE 8

Many checkpointers exist

CoCheck, Condor, DCR, DMTCP & MTCP, BLCR, LAM/MPI & BLCR, zap, CLIP, libckpt, Dynamite, LinuxSSI, Linux-native, OpenVZ, tmPVM, VMware Player, Ckpt, CHPOX, CRAK, UCLiK, Epckpt, MCR, SCore, TICK, VMADump, KMU, CP/R

SLIDE 9

Workflow: Coordinated CP

XtreemGCP checkpointing service

SLIDE 10

XtreemGCP

  • A grid service integrated within AEM (Application Execution Management) that implements job migration and fault tolerance for grid jobs
  • Integrates existing checkpointer packages
  • Supports transparent and application-level checkpointing
  • Security

SLIDES 11 to 17

Grid-Checkpointing Architecture (figure only)

SLIDE 18

Uniform Checkpointer Interface

  • Uniform access to different checkpointer packages, implemented by a translation library (translib, a shared library)
  • Translations
  • function signatures
  • job to Linux process group
  • grid user ID to local user ID
  • callback management
  • checkpoint image dependencies
  • checkpointer-to-checkpointer
  • application-checkpointer compatibility
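To make the idea of a translation library concrete, here is a small sketch of an adapter layer. It is not the translib API: `Checkpointer`, `BackendA`, `BackendB` and `TranslationLibrary` are hypothetical names, and the two backends are stand-ins with deliberately different native signatures rather than wrappers around real packages. The sketch covers three of the listed translations: function signatures, job to process group, and grid user to local user.

```python
from abc import ABC, abstractmethod


class Checkpointer(ABC):
    """Uniform interface that the grid service programs against (hypothetical)."""

    @abstractmethod
    def checkpoint(self, pgid, uid, image_path):
        """Save the process group `pgid` owned by local user `uid`."""


class BackendA(Checkpointer):
    """Stand-in for a package whose native call takes (process group, file)."""

    def checkpoint(self, pgid, uid, image_path):
        return self._native_dump(pgid, image_path)

    def _native_dump(self, process_group, file_name):
        return f"A: dumped pgid {process_group} to {file_name}"


class BackendB(Checkpointer):
    """Stand-in for a package whose native call takes (file, owner, group)."""

    def checkpoint(self, pgid, uid, image_path):
        return self._native_save(image_path, owner=uid, group=pgid)

    def _native_save(self, target, owner, group):
        return f"B: saved pgid {group} of uid {owner} to {target}"


class TranslationLibrary:
    """Maps grid-level names to node-local ones, then calls the backend."""

    def __init__(self, backend, job_unit_to_pgid, grid_user_to_uid):
        self.backend = backend
        self.job_unit_to_pgid = job_unit_to_pgid
        self.grid_user_to_uid = grid_user_to_uid

    def checkpoint(self, job_unit, grid_user, image_path):
        pgid = self.job_unit_to_pgid[job_unit]    # job unit -> process group
        uid = self.grid_user_to_uid[grid_user]    # grid user -> local user
        return self.backend.checkpoint(pgid, uid, image_path)
```

The caller only ever sees `TranslationLibrary.checkpoint(job_unit, grid_user, image_path)`, regardless of which backend runs on the node.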

SLIDE 19

Uniform Checkpointer Interface

  • To what extent must existing checkpointers be adapted to support various checkpointing protocols?
  • We need the following sequences:
    • Checkpoint: Stop, Checkpoint, Resume_cp
    • Restart: Rebuild, Resume_rst
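One way to read these sequences is that a job-level coordinator drives every job unit through one phase before any unit enters the next. The sketch below illustrates that reading. The barrier-per-phase behaviour is an assumption about how a coordinated protocol would use the calls, and `JobUnit`, `run_phase` and `run_sequence` are hypothetical names; the slide itself only lists the phase names.

```python
CHECKPOINT_SEQUENCE = ("stop", "checkpoint", "resume_cp")
RESTART_SEQUENCE = ("rebuild", "resume_rst")


class JobUnit:
    """Hypothetical job-unit checkpointer that only records the calls it gets."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def run_phase(self, phase):
        # A real unit would call into its checkpointer package here.
        self.log.append((phase, self.name))


def run_sequence(units, sequence):
    """Finish each phase on every unit before any unit starts the next one."""
    for phase in sequence:
        for unit in units:
            unit.run_phase(phase)
```

Splitting the stop phase from the checkpoint phase means that, under this reading, no image is written while another unit of the same job is still running.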

SLIDE 20

Uniform Checkpointer Interface

  • Currently supported checkpointer packages:
    • BLCR
    • OpenVZ
    • MTCP
    • LinuxSSI
    • (Linux native)

SLIDE 21

Checkpoint files

  • Must be replicated
  • Must be accessible from each grid node
  • Stored in XtreemFS, which provides:
    • Striping
    • Automatic replication
    • Location-transparent access
    • Access control via XtreemOS user accounts

SLIDES 22 to 25

Coordinated Checkpointing Workflow

Figure components: Job Checkpointer; Job-unit Checkpointer with Translation Library and LinuxSSI checkpointer (LinuxSSI cluster); Job-unit Checkpointer with Translation Library and BLCR (Linux).

Slide 25 adds: job meta-data, job-unit meta-data, checkpointer images; sync/split/replicate.

SLIDES 26 to 28

Independent Checkpointing Workflow

Figure components: Job Checkpointer; Job-unit Checkpointer with Translation Library and LinuxSSI checkpointer (LinuxSSI cluster); Job-unit Checkpointer with Translation Library and BLCR (Linux).

Slide 28 adds: job meta-data, job-unit meta-data, checkpointer images; sync/split/replicate.

SLIDE 29

Independent Restart Workflow

  • During application runtime: receive determinants (create dependency graph)
  • Wrappers for send, recv, etc. (LD_PRELOAD)

Figure components: Job Checkpointer; Job-unit Checkpointer with Translation Library and LinuxSSI checkpointer (LinuxSSI cluster); Job-unit Checkpointer with Translation Library and BLCR (Linux).

SLIDE 30

Independent Restart Workflow

  • Job Checkpointer: calculate recovery line from received determinants
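A recovery line can be computed by rollback propagation over the recorded message dependencies. The sketch below shows the general technique under simplifying assumptions and does not claim to be the XtreemGCP implementation. Each determinant is reduced to a tuple (sender, send interval, receiver, receive interval), interval k is the execution after checkpoint k, and every process is assumed to roll back at least to its latest checkpoint. A message is an orphan when its receipt is kept but its send is undone; the receiver then has to roll back further.

```python
def recovery_line(latest, messages):
    """Return the newest consistent checkpoint index per process.

    latest:   {process: index of its most recent checkpoint}
    messages: list of (sender, send_interval, receiver, recv_interval),
              where interval k is the execution after checkpoint k.
    """
    line = dict(latest)
    changed = True
    while changed:  # repeat until no orphan message remains
        changed = False
        for sender, sent_in, receiver, received_in in messages:
            send_undone = sent_in >= line[sender]        # send is rolled back
            recv_kept = received_in < line[receiver]     # receipt survives
            if send_undone and recv_kept:
                # Orphan message: roll the receiver back to before the receipt.
                line[receiver] = received_in
                changed = True
    return line
```

Because each rollback can create new orphans on other processes, the loop runs to a fixed point; in the worst case this is the domino effect, which ends at the initial state (index 0).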

SLIDE 31

Independent Restart Workflow

  • Restart from CP1; rollback to CP2

Figure components: Job Checkpointer; Job-unit Checkpointer with Translation Library and LinuxSSI checkpointer (LinuxSSI cluster); Job-unit Checkpointer with Translation Library and BLCR (Linux).

SLIDE 32

Measurements

Figure panels: Checkpoint, Restart

SLIDE 33

Callback Management

  • Implemented in the generic part of translib
  • Callbacks are called before and after a checkpoint, and after a restart
  • Common API for application callback registration
  • Useful for:
    • Application-level checkpointing
    • Application-level enhancements/optimizations
    • System-level checkpointing of communication channels
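A callback registry with the three hook points named on this slide could look like the following sketch. The event names and the `CallbackRegistry` class are hypothetical; the actual registration API of translib is not shown in the slides.

```python
from collections import defaultdict

# Hook points taken from the slide; the identifiers themselves are assumptions.
EVENTS = ("pre_checkpoint", "post_checkpoint", "post_restart")


class CallbackRegistry:
    """Keeps application callbacks per event and runs them in registration order."""

    def __init__(self):
        self._callbacks = defaultdict(list)

    def register(self, event, callback):
        if event not in EVENTS:
            raise ValueError(f"unknown event: {event}")
        self._callbacks[event].append(callback)

    def fire(self, event):
        for callback in self._callbacks[event]:
            callback()
```

An application might, for example, register a `pre_checkpoint` callback that flushes its buffers and a `post_restart` callback that reopens its connections.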

SLIDE 34

Workflow: Coordinated CP

Channel checkpointing with heterogeneous checkpointers

SLIDE 35

Consistent Checkpoints

  • In-transit messages
  • Orphan message
  • Lost message

SLIDE 36

Challenges in the grid context

  • Solution: save in-transit messages
  • Marker-based approach

Figure labels: Node A, Node B, Marker
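The marker-based approach can be illustrated with a simulated channel. The sketch assumes a reliable FIFO channel (modelled here as a `deque`) and uses hypothetical function names; it shows the general flushing technique, not the XtreemGCP channel checkpointer. The sender appends a marker at checkpoint time, and the receiver saves everything that arrives before the marker as in-transit messages belonging to the checkpoint.

```python
from collections import deque

MARKER = object()  # sentinel that can never equal an application message


def send_marker(channel):
    """Sender side: everything queued before the marker predates its checkpoint."""
    channel.append(MARKER)


def drain_until_marker(channel):
    """Receiver side: collect in-transit messages up to the marker.

    Messages behind the marker stay in the channel; they were sent after the
    sender's checkpoint and are not part of the saved channel state.
    """
    in_transit = []
    while channel:
        message = channel.popleft()
        if message is MARKER:
            return in_transit
        in_transit.append(message)
    raise RuntimeError("marker not received yet")
```

On restart, the saved in-transit messages would be handed to the receiver before it reads from the re-established channel.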

SLIDE 37

Challenges in the grid context

  • Marker-based approach
  • Challenges:
    • incompatible checkpointers must cooperate
    • migration support
    • transparency (application, checkpointer, operating system)

Figure: Node A (Checkpointer X) to Node B: "This is my marker."

SLIDE 38

Challenges in the grid context

  • Marker-based approach
  • Challenges:
    • incompatible checkpointers must cooperate
    • migration support
    • transparency (application, checkpointer, operating system)

Figure: Node A (Checkpointer X): "This is my marker." Node B (Checkpointer Y): "What's that? A normal packet with no specific meaning."

SLIDE 39

Architecture

SLIDE 40

Grid channel checkpointing

  • Measurements:
    • Message length and sending frequency have no effect

SLIDE 41

Workflow: Coordinated CP

Adaptive checkpointing

SLIDE 42

Adaptive Checkpointing

  • Incremental Checkpointing
    • write-bit
    • reflect dynamic memory layout changes
    • mprotect and JSDL
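Incremental checkpointing saves only the pages modified since the previous checkpoint. The sketch below emulates the write-bit in software: an explicit `write` method marks pages dirty, standing in for what a system-level checkpointer would detect through page protection (for example mprotect plus a fault handler) or hardware dirty bits. The class, the tiny `PAGE_SIZE` and the fixed-size buffer are assumptions for illustration; the sketch does not handle memory layout changes and expects writes to stay within the buffer.

```python
PAGE_SIZE = 4  # unrealistically small so the example is easy to follow


class TrackedMemory:
    """Byte buffer that records which pages were written (a software write-bit)."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.dirty = set()

    def write(self, offset, payload):
        self.data[offset:offset + len(payload)] = payload
        first = offset // PAGE_SIZE
        last = (offset + len(payload) - 1) // PAGE_SIZE
        self.dirty.update(range(first, last + 1))

    def full_checkpoint(self):
        """Save everything and reset the dirty set."""
        self.dirty.clear()
        return bytes(self.data)

    def incremental_checkpoint(self):
        """Save only pages written since the previous checkpoint."""
        pages = {p: bytes(self.data[p * PAGE_SIZE:(p + 1) * PAGE_SIZE])
                 for p in sorted(self.dirty)}
        self.dirty.clear()
        return pages


def restore(full_image, increments):
    """Rebuild memory from a full image plus increments, applied oldest first."""
    data = bytearray(full_image)
    for pages in increments:
        for page, content in pages.items():
            start = page * PAGE_SIZE
            data[start:start + len(content)] = content
    return bytes(data)
```

The restart side has to replay the increments in order on top of the last full image, which is the cost that incremental checkpointing trades for smaller checkpoints.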

SLIDE 43

Adaptive Checkpointing

  • Incremental Checkpointing

SLIDE 44

Adaptive Checkpointing

  • Incremental Checkpointing

Figure panels: Common Checkpoint, Incremental Checkpoint

SLIDE 45

Adaptive Checkpointing

  • Incremental Checkpointing

Figure panels: Common Restart, Incremental Restart

SLIDE 46

Summary

  • XtreemGCP offers migration and fault tolerance in grids by providing checkpointing and restart
  • It is designed for heterogeneous setups and integrates existing checkpointer packages
  • Future work: virtual machine support and adaptive checkpointing

SLIDE 47

Acknowledgment

  • EC for funding XtreemOS
  • XtreemOS-GCP contributors:
    • Heinrich-Heine Universität Düsseldorf: John Mehnert-Spahn, Eugen Feller
    • INRIA, Rennes, France: Christine Morin, Thomas Ropars, Surbi Chitre, Stefania Costache