 
Single-pass restore after a media failure
Caetano Sauer, Goetz Graefe, Theo Härder

• About 20% of drives fail within their first 4 years
• High failure rate in the first year (factory defects)
• Expected failure rate of 50% by year 6
https://www.backblaze.com/blog/how-long-do-disk-drives-last/

• Bad batches occur even among otherwise reliable products
• More reliability comes at a premium price tag
https://www.backblaze.com/blog/best-hard-drive/

Disk is dead?
Drawbacks of traditional recovery apply (in varying degrees) to all types of media
Sep 2013

Media restore
• Primary storage (latency-optimized): DB, log, replacement device
• Secondary storage (bandwidth-optimized): log archive, backups (full and incremental)
Steps:
1. Restore backups
2. Log replay

How bad can it get?
Scenario:
• Drive: 6 TB; 200 MB/s bandwidth; 4 ms latency
• Backups: full every week; incremental every day
• Workload: 2% of data pages change daily
Backup schedule (each at 11 PM): full on Sun; incremental on Mon, Tue, Wed, Thu, Fri, Sat
Disk failure*: Sun 10 PM
* Subject to Murphy's law

How bad can it get?
1. Restore full backup: 6 TB at 200 MB/s = 9 hours (depends on device)
2. Restore 6 incremental backups: 6 * 16 million pages * 1 ms = 26 hours, assuming 1 ms latency for jump-sequential access (depends on workload, i.e., growth rate & skew, and luck)
3. Log replay with 75% buffer hit ratio = 10 hours, assuming 20 log records per modified page (depends on amount of RAM and luck)
TOTAL = 45 hours
Single-pass restore:

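The estimate above can be reproduced with a few lines of arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes) and takes the page count and the log replay time directly from the slide; the function and parameter names are illustrative only. The slide rounds each step to whole hours, so the per-step values differ slightly while the total agrees.

```python
# Back-of-envelope estimate of traditional restore time.
# All inputs come from the slide; the names are hypothetical.
TB = 10**12
MB = 10**6


def traditional_restore_hours(db_bytes, bandwidth, pages_per_incremental,
                              num_incrementals, page_latency_s, log_replay_h):
    """Return (full, incremental, log replay, total) restore times in hours."""
    full_h = db_bytes / bandwidth / 3600
    incr_h = num_incrementals * pages_per_incremental * page_latency_s / 3600
    return full_h, incr_h, log_replay_h, full_h + incr_h + log_replay_h


full_h, incr_h, replay_h, total_h = traditional_restore_hours(
    db_bytes=6 * TB,
    bandwidth=200 * MB,                 # bytes per second, sequential read
    pages_per_incremental=16_000_000,   # "16 million pages" per incremental
    num_incrementals=6,
    page_latency_s=0.001,               # jump-sequential access
    log_replay_h=10.0,                  # taken from the slide, not derived
)
print(f"full: {full_h:.1f} h, incrementals: {incr_h:.1f} h, "
      f"log replay: {replay_h:.1f} h, total: {total_h:.1f} h")
```

Only the first term depends on device bandwidth alone; the other two scale with the update volume since the last full backup.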
Scope: physiological logging
Fundamental assumption: page-based storage (i.e., virtually all database systems in use today)
• Every log record reflects changes to a delimited region of physical storage
• Page = unit of recovery
• "Recovery independence among objects" (C. Mohan)

Some recent work …

Sorted log archive
[Diagram: log → log archive → sorted log archive; full backup + sorted log archive → merge join → replacement]
Merge log and backup in a single pass!

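The merge join of backup and sorted log can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the backup is read in page-id order, the log archive is sorted by (page id, LSN), and every logged page also exists in the backup. The tuple layouts and the `apply` callback are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def merge_join_restore(backup_pages, sorted_log, apply):
    """Merge-join a full backup with a log archive sorted by page id.

    backup_pages: iterable of (page_id, page) in ascending page_id order
    sorted_log:   iterable of (page_id, lsn, record) sorted by (page_id, lsn)
    apply:        function (page, record) -> page that redoes one log record
    Yields (page_id, restored_page) in page_id order.
    Assumption: every page that appears in the log also exists in the backup.
    """
    groups = groupby(sorted_log, key=itemgetter(0))
    current = next(groups, None)
    for page_id, page in backup_pages:
        if current is not None and current[0] == page_id:
            # Replay all log records of this page, in LSN order.
            for _, _, record in current[1]:
                page = apply(page, record)
            current = next(groups, None)
        yield page_id, page


# Toy example: pages are strings and redo appends the record to the page.
backup = [(1, "a"), (2, "b"), (3, "c")]
log = [(1, 10, "x"), (1, 11, "y"), (3, 12, "z")]
restored = list(merge_join_restore(backup, log, lambda page, rec: page + rec))
print(restored)
```

Both inputs are consumed strictly sequentially, which is why one pass over the backup is enough.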
So what's new?

Partially sorted log archive
External sort has two phases:
1. Run generation during log archiving (normal processing)
2. Merge during restore
[Diagram: log → in-memory sorting → runs in the log archive]

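Run generation can be illustrated with a simple load-sort-store loop. The sketch below is an assumption about how the first phase might look, not the system's actual archiver: the record layout `(lsn, page_id, payload)` and the `run_capacity` parameter are hypothetical, and a real archiver would write each run to the log archive rather than yield it in memory.

```python
def generate_runs(log_records, run_capacity):
    """Partition a log stream into runs, each sorted by (page_id, lsn).

    log_records:  iterable of (lsn, page_id, payload) in LSN order
    run_capacity: number of records that fit in the in-memory sort workspace
    """
    def by_page_then_lsn(record):
        lsn, page_id, _ = record
        return (page_id, lsn)

    workspace = []
    for record in log_records:
        workspace.append(record)
        if len(workspace) == run_capacity:
            # Workspace is full: emit it as one sorted run and start over.
            yield sorted(workspace, key=by_page_then_lsn)
            workspace = []
    if workspace:
        yield sorted(workspace, key=by_page_then_lsn)


log = [(1, 7, "a"), (2, 3, "b"), (3, 7, "c"), (4, 1, "d"), (5, 3, "e")]
runs = list(generate_runs(log, run_capacity=2))
print(runs)
```

Each run is sorted internally, but runs overlap in page-id range, which is why a merge is still required at restore time.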
Single-pass restore
• Runs 1, 2, 3, ..., 1000 are merged into a sorted log
• The sorted log is merged with the full backup to produce the restored DB
• No need for incrementals

Single-pass restore
• Each run (1, 2, 3, ..., 1000) is read through its own buffer in RAM
• Merge fan-in is limited by memory
• But the number of runs can be kept manageable
• The merged log is then merged with the full backup into the restored DB

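A merge with bounded fan-in can be sketched with the standard library's `heapq.merge`. This is an illustrative sketch under assumptions: records are `(lsn, page_id, payload)` tuples, runs are in-memory lists rather than files, and `fan_in` stands for the number of merge buffers that fit in memory. When there are more runs than buffers, intermediate merge steps reduce the run count first.

```python
import heapq


def sort_key(record):
    """Order log records by page id, then by LSN within a page."""
    lsn, page_id, _ = record
    return (page_id, lsn)


def merge_runs(runs, fan_in):
    """Merge sorted runs into one sorted list, at most fan_in inputs per step."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    runs = [list(run) for run in runs]
    while len(runs) > 1:
        # Each step merges groups of up to fan_in runs into larger runs.
        runs = [
            list(heapq.merge(*runs[i:i + fan_in], key=sort_key))
            for i in range(0, len(runs), fan_in)
        ]
    return runs[0] if runs else []


runs = [
    [(2, 3, "b"), (1, 7, "a")],
    [(4, 1, "d"), (3, 7, "c")],
    [(5, 3, "e")],
]
merged = merge_runs(runs, fan_in=2)
print(merged)
```

With a fan-in at least as large as the number of runs, the loop body executes once, which corresponds to the single merge step shown on the slide.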
Run management
Back to example: 6 TB with 2% daily change; 20 log records per page
• Assume average log record size = 128 bytes → log volume: 2 GB per day; 14 GB per week
• Assume initial run size = 100 MB → 20 runs per day
• Assume 1 MB merge buffers (1 MB of RAM to merge 100 MB of log) → 140 MB to merge a whole week's worth of log
Runs as history units:
[Figure: timeline of runs labeled by period, from years (2007, 2008, ..., 2014) through months (Jan, Feb 2015) and weeks (W 1, W 2) to a single day (Mar 2, 2015)]
40 MB to merge 1 year's worth of log (11 monthly + 3 weekly + 6 daily + 20 current-day runs)

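The memory figures follow from counting runs, since each run needs one merge buffer. The sketch below takes the daily log volume, run size, and buffer size from the slide as given inputs; the helper names are hypothetical.

```python
import math


def runs_for(log_mb, run_mb):
    """Number of initial runs produced for a given log volume."""
    return math.ceil(log_mb / run_mb)


def merge_memory_mb(num_runs, buffer_mb=1):
    """Memory needed to merge num_runs runs with one buffer per run."""
    return num_runs * buffer_mb


LOG_PER_DAY_MB = 2000   # 2 GB of log per day, as stated on the slide
RUN_MB = 100            # initial run size

runs_per_day = runs_for(LOG_PER_DAY_MB, RUN_MB)

# Merging a whole week of initial runs directly.
week_runs = 7 * runs_per_day
week_memory_mb = merge_memory_mb(week_runs)

# One year with runs consolidated into history units:
# 11 monthly + 3 weekly + 6 daily + the current day's initial runs.
year_runs = 11 + 3 + 6 + runs_per_day
year_memory_mb = merge_memory_mb(year_runs)

print(runs_per_day, week_memory_mb, year_memory_mb)
```

Consolidating older runs into coarser units keeps the merge fan-in, and therefore the memory requirement, small even as the log history grows.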
Backups reconsidered
How often?
• Monthly, quarterly, annually?
• Log replay bottleneck eliminated
Virtual backups
• New full backup generated by merging old backup and log archive
• Decoupled from DB activity; may run on a remote site
[Diagram: old backup + runs (including a monthly run) → merge → new backup]

Incremental backups
Goal: alleviate the cost of log replay without the overhead of taking a full backup
But does it still make sense? In single-pass restore:
• Very fast log replay (faster than loading scattered pages)
• Log replay can be hidden behind the load of the full backup
• Log records can be aggregated (net change)
• Log archiving with zero interference in transaction processing logic
• Full backups much cheaper to take

Experiments

Hypotheses
1. The cost of run generation during normal processing is negligible, i.e., substantially faster media recovery practically for free
2. Single-pass restore uses very little memory, independent of DB size, i.e., memory is not stolen from the buffer pool
3. Single-pass restore hides log replay costs, i.e., restoring an up-to-date DB takes the same time as loading an outdated full backup

Run generation overhead is negligible
Scenarios:
• B = baseline (no log archiving → no media recovery)
• T = traditional log archiving (process and copy; no sorting)
• S = sorting (run generation)
• S+M = sorting with asynchronous merging
Setup:
• Shore-MT
• 24-core machine
• TPC-C benchmark
• All in memory

Run generation overhead is negligible
Log in DRAM
• Chart annotation: 1.5%
• Slightly higher CPU utilization (OLTP workloads rarely fully utilize the CPU)
Legend: B = baseline; T = traditional log archiving; S = sorting (run generation); M = async. merging

Run generation overhead is negligible
Log in SSD
• No observable difference in throughput
Legend: B = baseline; T = traditional log archiving; S = sorting (run generation); M = async. merging

Single-pass restore uses very little memory
Small memory footprint, independent of database size

Single-pass restore hides log replay
[Chart: restore of outdated (days or weeks) vs. up-to-date backups (TPC-C databases)]

Future work (teaser)

Instant restore
[Diagram labels: log, in-memory sorting + index, log archive (runs), backup, load, DB]
Restore failed device:
• incrementally
• on-demand
• while transactions are running (on buffer pool and healthy/restored storage)

Conclusion
[Timeline figure with one row each for traditional restore, single-pass restore, and instant restore; segments are labeled "R" and "New transactions running"]
Thank you!
http://instantrecovery.github.io