SLIDE 1

How to backup Ceph at scale

FOSDEM, Brussels, 2018.02.04

SLIDE 2

About me

Bartłomiej Święcki

OVH Wrocław, PL

Current job:

More Ceph awesomeness

SLIDE 3

Speedlight Ceph intro

  • Open-source
  • Network storage
  • Scalable
  • Reliable
  • Self-healing
  • Fast
SLIDE 4

Ceph @ OVH

  • Almost 40 PB of raw HDD storage
  • 150 clusters
  • Mostly RBD images

SLIDE 5

Why do we need Ceph backup?

  • Protection against software bugs
      – Haven’t seen that yet, but better safe than sorry
  • One more layer of protection against disaster
      – Probability spikes at scale (e.g. HDD failures)
      – XFS (used by Ceph) can easily corrupt during power failures
  • Human mistakes – those always happen
      – Ops accidentally removing data
      – Clients removing / corrupting data by mistake
  • Geographically separated backups
      – Not easily available in Ceph (yet)

SLIDE 6

Resource estimation and planning

SLIDE 7

Software selection

  • Compression
  • Deduplication
  • Encryption
  • Speed
  • Work with data streams
  • Support for OpenStack Swift

SLIDE 8

Software selection

  • No perfect match at that time
  • Selected duplicity – already used at OVH
  • Promising alternatives (e.g. Restic)

SLIDE 9

Storage, network

  • Assumed compression and deduplication – 30% of raw data
  • Use existing OVH services – PCA (Swift)
  • Dynamically scale computing resources with OVH Cloud

SLIDE 10

Impact on Ceph infrastructure

20 PB of raw data = ~6.6 PB of actual data without replicas (3× replication)

For daily backup:

  • ~281 TB / h = ~4.7 TB / min = ~0.078 TB / sec
  • ~0.63 Tb/sec of constant traffic
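
A quick sanity check of these figures – a sketch assuming 3× replication and binary unit prefixes, so small rounding differences from the slide are expected:

```python
# Reproduce the slide's bandwidth estimate for a daily full backup.
DATA_PB = 20 / 3                      # ~6.6 PB of data without replicas
tb_per_hour = DATA_PB * 1024 / 24     # spread the full copy over 24 h
tb_per_sec = tb_per_hour / 3600
print(f"{tb_per_hour:.0f} TB/h, {tb_per_hour / 60:.1f} TB/min, "
      f"{tb_per_sec:.3f} TB/s, {tb_per_sec * 8:.2f} Tb/s")
# -> 284 TB/h, 4.7 TB/min, 0.079 TB/s, 0.63 Tb/s
```
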
SLIDE 11

Backup architecture – idea

Ceph cluster (RBD image snapshot) → backup VM running duplicity in a Docker container → PCA (Swift)

SLIDE 12

Implementation challenges

SLIDE 13

Duplicity quirks

  • Can back up files only – exporting the RBD image locally needs temporary storage
  • Files should not be larger than a few hundred MB due to librsync limits – RBD image split into files of up to 256 MB each
  • Cannot back up large images (large >= 500 GB): not enough local storage, timeouts, interruptions – image split into 25 GB chunks, each backed up separately (see the sketch after this list)
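
A minimal sketch of the chunking idea using the librbd Python bindings; the pool ("vms"), image ("disk-1") and snapshot ("backup-snap") names are made up, and the hand-off to duplicity is stubbed out:

```python
# Read an RBD snapshot in 25 GB chunks, presenting each chunk to
# duplicity as a series of 256 MB files (sizes as on the slides).
import rados
import rbd

CHUNK = 25 * 1024**3   # one duplicity backup job per 25 GB chunk
PART = 256 * 1024**2   # files kept small because of librsync limits

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("vms")  # hypothetical pool name
with rbd.Image(ioctx, "disk-1", snapshot="backup-snap", read_only=True) as img:
    size = img.size()
    for chunk_off in range(0, size, CHUNK):
        chunk_end = min(chunk_off + CHUNK, size)
        for part_off in range(chunk_off, chunk_end, PART):
            data = img.read(part_off, min(PART, chunk_end - part_off))
            # ... write `data` out as one 256 MB file for duplicity ...
ioctx.close()
cluster.shutdown()
```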

SLIDE 14

Duplicity + SWIFT overview

Ceph cluster (RBD image snapshot) → backup VM: 25 GB chunk staged on local SSD as 256 MB files → duplicity in a Docker container → PCA (Swift)

SLIDE 15

FUSE to the rescue

  • Expose part of the image through FUSE – see the sketch below
  • Can easily work on part of the image
  • Can expose the image as a list of smaller files
  • No need for local storage, all can be done in memory
  • Restore is a bit more problematic, but possible
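
A minimal sketch of this idea using the fusepy library: a read-only filesystem presenting a fixed-size image as a directory of 256 MB virtual files. The actual RBD read is stubbed out; the mount point, file names and sizes are illustrative:

```python
# Expose an image as a flat directory of 256 MB read-only files so that
# duplicity can back it up without any temporary local copy.
import errno
import stat
from fuse import FUSE, FuseOSError, Operations

PART = 256 * 1024**2           # each virtual file covers 256 MB of the image
IMAGE_SIZE = 25 * 1024**3      # pretend we expose one 25 GB chunk
NPARTS = (IMAGE_SIZE + PART - 1) // PART

class ImagePartsFS(Operations):
    def _index(self, path):
        # Map "/part-00042" to the integer 42, rejecting anything else.
        try:
            idx = int(path.rsplit("-", 1)[1])
        except (IndexError, ValueError):
            raise FuseOSError(errno.ENOENT)
        if not 0 <= idx < NPARTS:
            raise FuseOSError(errno.ENOENT)
        return idx

    def readdir(self, path, fh):
        return [".", ".."] + ["part-%05d" % i for i in range(NPARTS)]

    def getattr(self, path, fh=None):
        if path == "/":
            return {"st_mode": stat.S_IFDIR | 0o555, "st_nlink": 2}
        idx = self._index(path)
        size = min(PART, IMAGE_SIZE - idx * PART)   # last file may be shorter
        return {"st_mode": stat.S_IFREG | 0o444, "st_nlink": 1,
                "st_size": size}

    def read(self, path, length, offset, fh):
        idx = self._index(path)
        # A real implementation would return
        #   rbd_image.read(idx * PART + offset, length)
        return b"\0" * length

FUSE(ImagePartsFS(), "/mnt/rbd-parts", foreground=True, ro=True)
```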

SLIDE 16

Prod impact

  • Throttle the number of simultaneous backups
      – Global limit imposed by our compute resources
      – Limits per cluster
      – Limits per backup VM
      – No simultaneous backups of one RBD image
  • Used locks and semaphores stored in ZooKeeper – see the sketch below
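
A minimal sketch of that throttling with the kazoo ZooKeeper client; the hosts, paths and lease limit are made up, and run_backup stands in for the real work:

```python
# Cap concurrent backups globally with a semaphore and serialize work on
# a single RBD image with a per-image lock, both stored in ZooKeeper.
from kazoo.client import KazooClient

def run_backup():
    print("backing up one RBD image")   # placeholder for the real job

zk = KazooClient(hosts="zk1:2181,zk2:2181,zk3:2181")
zk.start()

global_slots = zk.Semaphore("/backup/slots", max_leases=100)  # compute limit
image_lock = zk.Lock("/backup/locks/vms-disk-1")              # one per image

with global_slots:          # blocks until a global slot is free
    with image_lock:        # blocks if this image is already being backed up
        run_backup()

zk.stop()
```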

SLIDE 17

Scaling issues

  • ZooKeeper does not work well with frequently changing data
  • Lots of issues with Celery workers – memory leaks, ulimit, ping timeouts, rare bugs
  • Issues with Docker – orphaned network interfaces, local storage not removed
  • Duplicity requires lots of CPU to restore a backup (restore 4× slower than backup)

SLIDE 18

Hot / cold backup strategy

SLIDE 19

Backup to Ceph

  • Separate Ceph cluster with a copy of the data
  • Export / import diff a huge advantage – see the sketch below
  • Can use the backup cluster as a hot-swap replacement
  • Reuse the previous backup architecture
  • Can back up the spare cluster as before – cold backup
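
The export / import diff mechanism is what makes this attractive: after the first full copy, only blocks changed since the previous snapshot cross the wire. A minimal sketch of the idea, driving the rbd CLI from Python; the cluster, pool, image and snapshot names are made up for illustration:

```python
# Incrementally sync one RBD image to the backup cluster by piping
# "rbd export-diff" into "rbd import-diff".
import subprocess

export = subprocess.Popen(
    ["rbd", "--cluster", "source", "export-diff",
     "--from-snap", "backup-2018-02-03",       # last snapshot already on backup
     "vms/disk-1@backup-2018-02-04", "-"],     # new snapshot, diff to stdout
    stdout=subprocess.PIPE)

subprocess.run(
    ["rbd", "--cluster", "backup", "import-diff", "-", "vms/disk-1"],
    stdin=export.stdout, check=True)           # apply the delta from stdin

export.stdout.close()
if export.wait() != 0:
    raise RuntimeError("rbd export-diff failed")
```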

SLIDE 20

Ceph on Ceph overview

Source Ceph cluster (RBD image snapshot) → backup container, 25 GB chunks → backup Ceph cluster (RBD image)

SLIDE 21

Advantages

  • Can back up a large cluster in less than 24 h
  • Greatly reduced compute power needed
  • Can recover in minutes, not hours / days

SLIDE 22

OVH Ceph Backups - numbers

SLIDE 23

Global info:

  • 34 clusters with active backup
  • ~9000 backups finished daily
  • ~0.6 PB of data exported daily

SLIDE 24

Large cluster case study:

Weekly backups completed, by method:

  • Duplicity + Swift: 3350
  • Ceph on Ceph: 4776
  • Ceph on Ceph with diff: 33432

SLIDE 25

Large cluster case study:

SLIDE 26

Large cluster case study:

SLIDE 27

To sum up…

  • Backups at scale definitely possible…
  • … but better to start with Ceph-on-Ceph
  • You can get down to a 24 h backup window on highly utilized clusters
  • Alternative storage to Ceph can give even better protection, but will be slow
  • Ceph-on-Ceph as a first line, alternative storage as a second-line backup

SLIDE 28

Image sources

http://alphastockimages.com/
https://www.flickr.com/photos/soldiersmediacenter/4473414070
https://commons.wikimedia.org/wiki/File:Open_Floodgates_-_Beaver_Lake_Dam_-_Northwest_Arkansas,_U.S._-_21_May_2011.jpg
https://commons.wikimedia.org/wiki/File:Hot_Cold_mug.jpg

SLIDE 29

Questions?

bartlomiej.swiecki@corp.ovh.com