Logging and Recovery (Chapter 18)


  1. Logging and Recovery, Chapter 18

     “If you are going to be in the logging business, one of the things that you have to do is to learn about heavy equipment.”
     - Robert VanNatta, Logging History of Columbia County

     Motivation
     • Atomicity:
       – Transactions may abort (“Rollback”).
     • Durability:
       – What if DBMS stops running? (Causes?)
     • Desired behavior after system restarts:
       – T1, T2 & T3 should be durable.
       – T4 & T5 should be aborted (effects not seen).
     [Figure: timeline of Xacts T1 to T5 relative to the crash]

     Review: The ACID properties
     • Atomicity: All actions in the Xact happen, or none happen.
     • Consistency: If each Xact is consistent, and the DB starts consistent, it ends up consistent.
     • Isolation: Execution of one Xact is isolated from that of other Xacts.
     • Durability: If a Xact commits, its effects persist.
     • The Recovery Manager guarantees Atomicity & Durability.

     Assumptions
     • Concurrency control is in effect.
       – Strict 2PL, in particular.
     • Updates are happening “in place”.
       – i.e. data is overwritten on (deleted from) the disk.
     • A simple scheme to guarantee Atomicity & Durability?

  2. Handling the Buffer Pool
     • Force write to disk at commit?
       – Poor response time.
       – But provides durability.
     • Steal buffer-pool frames from uncommitted Xacts?
       – If not, poor throughput.
       – If so, how can we ensure atomicity?

                    No Steal    Steal
       Force        Trivial
       No Force                 Desired

     Basic Idea: Logging
     • Record REDO and UNDO information, for every update, in a log.
       – Sequential writes to log (put it on a separate disk).
       – Minimal info (diff) written to log, so multiple updates fit in a single log page.
     • Log: An ordered list of REDO/UNDO actions
       – Log record contains: <XID, pageID, offset, length, old data, new data>
       – and additional control info (which we’ll see soon).

     More on Steal and Force
     • STEAL (why enforcing Atomicity is hard)
       – To steal frame F: Current page in F (say P) is written to disk; some Xact holds lock on P.
         • What if the Xact with the lock on P aborts?
         • Must remember the old value of P at steal time (to support UNDOing the write to page P).
     • NO FORCE (why enforcing Durability is hard)
       – What if system crashes before a modified page is written to disk?
       – Write as little as possible, in a convenient place, at commit time, to support REDOing modifications.

     Write-Ahead Logging (WAL)
     • The Write-Ahead Logging Protocol:
       1. Must force the log record for an update before the corresponding data page gets to disk.
       2. Must write all log records for a Xact before commit.
     • #1 guarantees Atomicity.
     • #2 guarantees Durability.
     • Exactly how is logging (and recovery!) done?
       – We’ll study the ARIES algorithms.
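To make the log record layout <XID, pageID, offset, length, old data, new data> concrete, here is a minimal Python sketch. The names `UpdateRecord`, `make_update`, `redo` and `undo` are hypothetical and not part of the slides; a page is modeled as a plain `bytearray`, which is an assumption for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateRecord:
    """One log record: <XID, pageID, offset, length, old data, new data>."""
    xid: str
    page_id: str
    offset: int
    length: int
    old_data: bytes   # before-image, used for UNDO
    new_data: bytes   # after-image, used for REDO


def make_update(xid, page_id, page, offset, new_data):
    """Capture the diff for overwriting page[offset:offset+len(new_data)]."""
    old = bytes(page[offset:offset + len(new_data)])
    return UpdateRecord(xid, page_id, offset, len(new_data), old, bytes(new_data))


def redo(page, rec):
    """Reapply the update by writing the after-image."""
    page[rec.offset:rec.offset + rec.length] = rec.new_data


def undo(page, rec):
    """Roll the update back by writing the before-image."""
    page[rec.offset:rec.offset + rec.length] = rec.old_data


# Hypothetical 8-byte page; T1 overwrites 3 bytes at offset 2.
page = bytearray(b"AAAAAAAA")
rec = make_update("T1", "P5", page, 2, b"xyz")
redo(page, rec)
after_redo = bytes(page)
undo(page, rec)
after_undo = bytes(page)
```

Because the record stores only the changed bytes, many such records fit in one log page, which is the point of logging a diff rather than the whole page.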

  3. WAL & the Log
     • Each log record has a unique Log Sequence Number (LSN).
       – LSNs always increasing.
     • Each data page contains a pageLSN.
       – The LSN of the most recent log record for an update to that page.
     • System keeps track of flushedLSN.
       – The max LSN flushed so far.
     • WAL: Before a page is written,
       – pageLSN ≤ flushedLSN
     [Figure: log records flushed to disk (up to flushedLSN) and the “log tail” in RAM; data pages in the DB each carry a pageLSN]

     Other Log-Related State
     • Transaction Table:
       – One entry per active Xact.
       – Contains XID, status (running/committed/aborted), and lastLSN.
     • Dirty Page Table:
       – One entry per dirty page in buffer pool.
       – Contains recLSN -- the LSN of the log record which first caused the page to be dirty.

     Log Records
     LogRecord fields:
       – prevLSN
       – XID
       – type
       – update records only: pageID, length, offset, before-image, after-image
     Possible log record types:
     • Update
     • Commit
     • Abort
     • End (signifies end of commit or abort)
     • Compensation Log Records (CLRs)
       – for UNDO actions
       – (and some other tricks!)

     Normal Execution of an Xact
     • Series of reads & writes, followed by commit or abort.
       – We will assume that page write is atomic on disk.
         • In practice, additional details to deal with non-atomic writes.
     • Strict 2PL.
     • STEAL, NO-FORCE buffer management, with Write-Ahead Logging.
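The WAL rule pageLSN ≤ flushedLSN can be sketched as a check made just before a data page is written out. The classes `LogManager` and `BufferPool` below are hypothetical and keep everything in memory; "flushing" only advances a counter, and LSNs are spaced by 10 purely for readability.

```python
class LogManager:
    def __init__(self):
        self.records = []       # whole log, in LSN order
        self.next_lsn = 10
        self.flushed_lsn = 0    # max LSN known to be on stable storage

    def append(self, payload):
        """Add a record to the in-memory log tail and return its LSN."""
        lsn = self.next_lsn
        self.next_lsn += 10
        self.records.append((lsn, payload))
        return lsn

    def flush(self, up_to_lsn):
        """Pretend to force the log to disk up to (and including) up_to_lsn."""
        self.flushed_lsn = max(self.flushed_lsn, up_to_lsn)


class BufferPool:
    def __init__(self, log):
        self.log = log
        self.page_lsn = {}   # page_id -> pageLSN of the in-memory copy
        self.on_disk = {}    # page_id -> pageLSN of the copy written to disk

    def update(self, page_id, payload):
        """Log the update first, then stamp the page with the record's LSN."""
        lsn = self.log.append(payload)
        self.page_lsn[page_id] = lsn
        return lsn

    def write_page(self, page_id):
        """Write a (possibly stolen) page, forcing the log first if needed."""
        lsn = self.page_lsn[page_id]
        if lsn > self.log.flushed_lsn:
            self.log.flush(lsn)
        assert lsn <= self.log.flushed_lsn   # WAL: pageLSN <= flushedLSN
        self.on_disk[page_id] = lsn


log = LogManager()
pool = BufferPool(log)
pool.update("P5", "T1 writes P5")   # gets LSN 10
pool.update("P3", "T2 writes P3")   # gets LSN 20
pool.write_page("P5")               # forces the log only up to LSN 10
```

Writing P5 forces the log only as far as that page's pageLSN; the record for P3 can stay in the log tail until P3 is written or its Xact commits.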

  4. Checkpointing
     • Periodically, the DBMS creates a checkpoint, in order to minimize the time taken to recover in the event of a system crash. Write to log:
       – begin_checkpoint record: Indicates when chkpt began.
       – end_checkpoint record: Contains current Xact table and dirty page table. This is a ‘fuzzy checkpoint’:
         • Other Xacts continue to run; so these tables only known to reflect some mix of state after the time of the begin_checkpoint record.
         • No attempt to force dirty pages to disk; effectiveness of checkpoint limited by oldest unwritten change to a dirty page. (So it’s a good idea to periodically flush dirty pages to disk!)
       – Store LSN of chkpt record in a safe place (master record).

     Simple Transaction Abort
     • For now, consider an explicit abort of a Xact.
       – No crash involved.
     • We want to “play back” the log in reverse order, UNDOing updates.
       – Get lastLSN of Xact from Xact table.
       – Can follow chain of log records backward via the prevLSN field.
       – Note: before starting UNDO, could write an Abort log record.
         • Why bother?

     The Big Picture: What’s Stored Where
     • LOG: LogRecords (prevLSN, XID, type, pageID, length, offset, before-image, after-image)
     • DB: Data pages, each with a pageLSN; master record
     • RAM: Xact Table (lastLSN, status); Dirty Page Table (recLSN); flushedLSN

     Abort, cont.
     • To perform UNDO, must have a lock on data!
       – No problem!
     • Before restoring old value of a page, write a CLR:
       – You continue logging while you UNDO!!
       – CLR has one extra field: undonextLSN
         • Points to the next LSN to undo (i.e. the prevLSN of the record we’re currently undoing).
       – CLR contains REDO info
       – CLRs never Undone
         • Undo needn’t be idempotent (>1 UNDO won’t happen)
         • But they might be Redone when repeating history (=1 UNDO guaranteed)
     • At end of all UNDOs, write an “end” log record.
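The abort procedure (follow the prevLSN chain backward, write a CLR before restoring each old value, finish with an “end” record) can be sketched as follows. This is a toy model under stated assumptions: the log is a Python dict keyed by LSN, a page holds a single string value, and the helpers `log_append`, `update` and `abort` are hypothetical names, not from the slides.

```python
def log_append(log, xact_table, xid, rec_type, **fields):
    """Append a record, chaining it to the Xact's previous record via prevLSN."""
    lsn = max(log, default=0) + 10
    log[lsn] = {"type": rec_type, "xid": xid,
                "prevLSN": xact_table.get(xid), **fields}
    xact_table[xid] = lsn            # lastLSN for this Xact
    return lsn


def update(log, xact_table, pages, xid, page, new_value):
    """Log before-image and after-image, then change the page in place."""
    log_append(log, xact_table, xid, "update",
               page=page, before=pages[page], after=new_value)
    pages[page] = new_value


def abort(log, xact_table, pages, xid):
    """Explicit abort, no crash involved."""
    to_undo = xact_table[xid]                    # start from lastLSN
    log_append(log, xact_table, xid, "abort")
    while to_undo is not None:
        rec = log[to_undo]
        if rec["type"] == "update":
            # Write the CLR first; its redo info is the restored old value,
            # and undonextLSN is the prevLSN of the record being undone.
            log_append(log, xact_table, xid, "CLR", page=rec["page"],
                       after=rec["before"], undonextLSN=rec["prevLSN"])
            pages[rec["page"]] = rec["before"]
        to_undo = rec["prevLSN"]
    log_append(log, xact_table, xid, "end")
    del xact_table[xid]


log, xact_table = {}, {}
pages = {"P5": "a", "P3": "b"}
update(log, xact_table, pages, "T1", "P5", "a1")   # LSN 10
update(log, xact_table, pages, "T1", "P3", "b1")   # LSN 20
abort(log, xact_table, pages, "T1")                # LSNs 30 to 60
record_types = [log[lsn]["type"] for lsn in sorted(log)]
```

The CLRs are never undone themselves; if a crash interrupts the abort, their undonextLSN values say where the remaining undo work resumes.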

  5. Transaction Commit
     • Write commit record to log.
     • All log records up to Xact’s lastLSN are flushed.
       – Guarantees that flushedLSN ≥ lastLSN.
       – Note that log flushes are sequential, synchronous writes to disk.
       – Many log records per log page.
     • Make transaction visible
       – Commit() returns, locks dropped, etc.
     • Write end record to log.

     Crash Recovery: Big Picture
     • Start from a checkpoint (found via master record).
     • Three phases. Need to:
       – Figure out which Xacts committed since checkpoint, which failed (Analysis).
       – REDO all actions. (repeat history)
       – UNDO effects of failed Xacts.
     [Figure: log timeline ending at CRASH, marking the oldest log rec. of a Xact active at crash, the smallest recLSN in the dirty page table after Analysis, and the last chkpt, with the spans covered by the A, R and U phases]

     Recovery: The Analysis Phase
     • Reconstruct state at checkpoint.
       – via end_checkpoint record.
     • Scan log forward from begin_checkpoint.
       – End record: Remove Xact from Xact table.
       – Other records: Add Xact to Xact table, set lastLSN=LSN, change Xact status on commit.
       – Update record: If P not in Dirty Page Table,
         • Add P to D.P.T., set its recLSN=LSN.

     Recovery: The REDO Phase
     • We repeat History to reconstruct state at crash:
       – Reapply all updates (even of aborted Xacts!), redo CLRs.
     • Scan forward from log rec containing smallest recLSN in D.P.T. For each CLR or update log rec LSN, REDO the action unless:
       – Affected page is not in the Dirty Page Table, or
       – Affected page is in D.P.T., but has recLSN > LSN, or
       – pageLSN (in DB) ≥ LSN.
     • To REDO an action:
       – Reapply logged action.
       – Set pageLSN to LSN. No additional logging!
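The Analysis and REDO rules can be traced with a small simulation. The sketch below assumes the checkpoint tables were empty, models the log as a list of (LSN, record) pairs, and tracks only pageLSNs instead of real page contents; the function names and the sample log are hypothetical. Treating CLRs like updates when filling the dirty page table is an assumption that goes slightly beyond the slide wording.

```python
def analysis(log, xact_table, dirty_pages):
    """Scan forward, rebuilding the Xact table and the dirty page table."""
    for lsn, rec in log:
        xid = rec["xid"]
        if rec["type"] == "end":
            xact_table.pop(xid, None)            # Xact is completely finished
            continue
        entry = xact_table.setdefault(xid, {"status": "running"})
        entry["lastLSN"] = lsn
        if rec["type"] == "commit":
            entry["status"] = "committed"
        elif rec["type"] == "abort":
            entry["status"] = "aborted"
        if rec["type"] in ("update", "CLR"):     # assumption: CLRs dirty pages too
            dirty_pages.setdefault(rec["page"], lsn)   # recLSN = first dirtier


def redo(log, dirty_pages, page_lsn):
    """Repeat history; return the LSNs that were actually reapplied."""
    redone = []
    if not dirty_pages:
        return redone
    start = min(dirty_pages.values())            # smallest recLSN in the D.P.T.
    for lsn, rec in log:
        if lsn < start or rec["type"] not in ("update", "CLR"):
            continue
        page = rec["page"]
        if page not in dirty_pages:              # page was already written out
            continue
        if dirty_pages[page] > lsn:              # dirtied only by a later record
            continue
        if page_lsn.get(page, 0) >= lsn:         # change already on disk
            continue
        page_lsn[page] = lsn                     # "reapply", then set pageLSN
        redone.append(lsn)
    return redone


sample_log = [
    (10, {"type": "update", "xid": "T1", "page": "P5"}),
    (20, {"type": "update", "xid": "T2", "page": "P3"}),
    (30, {"type": "abort", "xid": "T1"}),
    (40, {"type": "CLR", "xid": "T1", "page": "P5"}),
    (45, {"type": "end", "xid": "T1"}),
    (50, {"type": "update", "xid": "T3", "page": "P1"}),
    (60, {"type": "update", "xid": "T2", "page": "P5"}),
]
xact_table, dirty_pages = {}, {}
analysis(sample_log, xact_table, dirty_pages)
# Assumed on-disk state: P5 was written after LSN 10; P3 and P1 never were.
page_lsn = {"P5": 10, "P3": 0, "P1": 0}
redone = redo(sample_log, dirty_pages, page_lsn)
```

In this run the update at LSN 10 is skipped because the on-disk pageLSN of P5 already covers it, while the CLR at LSN 40 is redone like any other change.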

  6. Recovery: The UNDO Phase
     ToUndo = { l | l is the lastLSN of a “loser” Xact }
     Repeat:
       – Choose largest LSN among ToUndo.
       – If this LSN is a CLR and undonextLSN==NULL
         • Write an End record for this Xact.
       – If this LSN is a CLR, and undonextLSN != NULL
         • Add undonextLSN to ToUndo
         • (Q: what happens to other CLRs?)
       – Else this LSN is an update. Undo the update, write a CLR, add prevLSN to ToUndo.
     Until ToUndo is empty.

     Example of Recovery
       LSN   LOG
       00    begin_checkpoint
       05    end_checkpoint
       10    update: T1 writes P5
       20    update: T2 writes P3
       30    T1 abort
       40    CLR: Undo T1 LSN 10
       45    T1 End
       50    update: T3 writes P1
       60    update: T2 writes P5
             CRASH, RESTART
     [Figure: RAM state at restart: Xact Table (lastLSN, status), Dirty Page Table (recLSN), flushedLSN, ToUndo; arrows in the log show prevLSNs]

     Example: Crash During Restart!
       LSN     LOG
       00,05   begin_checkpoint, end_checkpoint
       10      update: T1 writes P5
       20      update: T2 writes P3
       30      T1 abort
       40,45   CLR: Undo T1 LSN 10, T1 End
       50      update: T3 writes P1
       60      update: T2 writes P5
               CRASH, RESTART
       70      CLR: Undo T2 LSN 60
       80,85   CLR: Undo T3 LSN 50, T3 end
               CRASH, RESTART
       90      CLR: Undo T2 LSN 20, T2 end
     [Figure: RAM state at restart: Xact Table (lastLSN, status), Dirty Page Table (recLSN), flushedLSN, ToUndo; arrows in the log show undonextLSN]

     Additional Crash Issues
     • What happens if system crashes during Analysis? During REDO?
     • How do you limit the amount of work in REDO?
       – Flush asynchronously in the background.
       – Watch “hot spots”!
     • How do you limit the amount of work in UNDO?
       – Avoid long-running Xacts.
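The ToUndo loop can be traced on the example log with a short simulation. This is a sketch under assumptions: the log is a dict keyed by LSN, page contents are not modeled, and new LSNs are simply assigned in steps of 10, so they will not match the numbering in the example exactly. Writing the End record as soon as an undone update has no prevLSN, and following prevLSN through non-update records, are assumptions chosen to agree with the example rather than steps spelled out in the pseudocode. `example_log` and `undo_phase` are hypothetical names.

```python
def example_log():
    """The log from the recovery example, up to the first crash."""
    return {
        10: {"type": "update", "xid": "T1", "page": "P5", "prevLSN": None},
        20: {"type": "update", "xid": "T2", "page": "P3", "prevLSN": None},
        30: {"type": "abort", "xid": "T1", "prevLSN": 10},
        40: {"type": "CLR", "xid": "T1", "page": "P5", "prevLSN": 30,
             "undonextLSN": None},
        45: {"type": "end", "xid": "T1", "prevLSN": 40},
        50: {"type": "update", "xid": "T3", "page": "P1", "prevLSN": None},
        60: {"type": "update", "xid": "T2", "page": "P5", "prevLSN": 20},
    }


def undo_phase(log, losers):
    """losers maps XID -> lastLSN. Appends CLR/end records; returns undone LSNs."""
    to_undo = set(losers.values())
    last_lsn = dict(losers)
    undone = []
    while to_undo:
        lsn = max(to_undo)                    # always undo the latest change first
        to_undo.remove(lsn)
        rec = log[lsn]
        xid = rec["xid"]
        if rec["type"] == "CLR":
            nxt = rec["undonextLSN"]          # skip work that was already undone
        elif rec["type"] == "update":
            clr_lsn = max(log) + 10
            log[clr_lsn] = {"type": "CLR", "xid": xid, "page": rec["page"],
                            "prevLSN": last_lsn[xid],
                            "undonextLSN": rec["prevLSN"]}
            last_lsn[xid] = clr_lsn
            undone.append(lsn)
            nxt = rec["prevLSN"]
        else:
            nxt = rec["prevLSN"]              # assumption: just follow the chain
        if nxt is None:
            end_lsn = max(log) + 10           # nothing left to undo for this Xact
            log[end_lsn] = {"type": "end", "xid": xid, "prevLSN": last_lsn[xid]}
        else:
            to_undo.add(nxt)
    return undone


log = example_log()
undone = undo_phase(log, {"T2": 60, "T3": 50})
new_records = [(log[l]["type"], log[l]["xid"]) for l in sorted(log) if l > 60]
```

If the system crashes again partway through, the CLRs already on the log carry undonextLSN values, so a restarted UNDO phase picks up at LSN 20 for T2 instead of undoing LSN 60 a second time.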
