

SLIDE 1

Data Management Systems

  • Transaction Processing
  • Concurrency control and recovery
  • Transactions
  • Recovery

Gustavo Alonso Institute of Computing Platforms Department of Computer Science ETH Zürich

Transactions - 2PL 1

  • 2 Phase Locking
  • Deadlocks
  • Locking Table
  • Snapshot isolation

SLIDE 2

2 Phase Locking (2PL)

SLIDE 3

2PL Background

  • 2 Phase Locking
  • Widely used for many decades
  • The canonical implementation of concurrency control
  • With important implications for recovery
  • Although alternatives exist (snapshot isolation), the underlying locking mechanisms are still widely used even in those alternative systems
  • How locking is implemented has a huge impact on performance and scalability.

SLIDE 4

Locking in databases

  • Concurrency control in databases is implemented using locking
  • Generally using a lock table (see later)
  • Locks cover tuples, tables, indexes, blocks …
  • Locks can have different semantics that indicate whether they can be shared or are exclusive:
  • Compatibility Matrix
  • Shared lock (read)
  • Exclusive lock (write)
  • Other modes possible


Compatibility matrix (rows: requested lock; columns: lock held):

  request \ held    NL    S     X
  S                 yes   yes   no
  X                 yes   no    no
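The matrix can be made concrete with a small lookup sketch in Python (the mode names NL/S/X follow the slide; the function name and table encoding are illustrative assumptions):

```python
# Lock compatibility sketch: NL = no lock, S = shared, X = exclusive.
# The table answers: can `requested` be granted while `held` is held?
COMPATIBLE = {
    ("NL", "S"): True,  ("NL", "X"): True,   # nothing held: always grant
    ("S", "S"): True,   ("S", "X"): False,   # shared coexists only with shared
    ("X", "S"): False,  ("X", "X"): False,   # exclusive conflicts with everything
}

def can_grant(held, requested):
    return COMPATIBLE[(held, requested)]
```

Two readers can share an item, but as soon as one transaction holds an exclusive lock, every other request conflicts.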
SLIDE 5

2 Phase Locking Protocol

  • 1. Before accessing an object, a transaction must acquire a lock.
  • 2. A transaction acquires a lock only once. Lock upgrades are possible.
  • 3. A transaction is blocked if the lock request cannot be granted according to the compatibility matrix.
  • 4. A transaction goes through two phases:
    1. Growth: Acquire locks, but never release a lock.
    2. Shrink: Release locks, but never acquire a lock.
  • 5. At EOT (commit or abort) all locks must be released.
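The five rules above can be sketched as per-transaction bookkeeping (a minimal Python sketch, not an engine; the class and method names are made up for illustration):

```python
# 2PL bookkeeping for one transaction: once it releases any lock it enters
# the shrink phase and may never acquire another lock (rule 4); at EOT all
# locks are released (rule 5).

class TwoPhaseError(Exception):
    pass

class Transaction2PL:
    def __init__(self):
        self.locks = set()
        self.shrinking = False   # False = growth phase

    def acquire(self, item):
        if self.shrinking:
            raise TwoPhaseError("cannot acquire after releasing (two-phase rule)")
        self.locks.add(item)     # rule 2: acquiring twice is a no-op

    def release(self, item):
        self.shrinking = True    # first release starts the shrink phase
        self.locks.discard(item)

    def end(self):               # rule 5: EOT releases everything
        self.locks.clear()
        self.shrinking = True
```

A transaction that releases a lock and then tries to acquire another one raises `TwoPhaseError`, which is exactly the history shape the protocol forbids.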

SLIDE 6

Example

SLIDE 7

Why two phases?

  • 2PL prevents histories where a transaction must be aborted due to concurrency control issues:
  • An operation is not allowed unless we are sure it is safe
  • Without the two phases, the following can happen:
  • … L1[x]w1[x]U1[x]L2[x]w2[x]L2[y]w2[y]U2[y] …
  • If T1 now accesses y for read or write, it can do so, creating a non-serializable history
  • This cannot happen if the two-phase rule is enforced: if transaction T2 gets a lock released by another transaction T1, T1 cannot later get a lock released by T2, since the two-phase rule would be violated (one of them would be acquiring a lock after having released a lock)

SLIDE 8

2PL results in conflict serializable histories

  • Locks prevent two transactions from modifying the same item at the same time
  • To do an operation, a lock must be held
  • To get a lock, it must be free or shared
  • This guarantees the atomicity of single operations
  • The two-phase rule ensures no cycles can occur in the conflict graph
  • A cycle implies a circular dependency created by conflicting operations
  • T1 -> T2 implies T1 released a lock before T2 got it
  • For a cycle to exist, T2 must also release a lock before T1 acquires it
  • That would violate the two-phase rule
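To make the conflict-graph argument concrete, here is a sketch that builds the graph from a history and checks it for cycles (a Python sketch; encoding a history as (transaction, operation, item) triples is an assumption for illustration):

```python
# Two operations conflict if they come from different transactions, touch
# the same item, and at least one of them is a write. Each conflict adds an
# edge from the earlier transaction to the later one.

def conflict_graph(history):
    edges = set()
    for i, (t1, op1, x1) in enumerate(history):
        for t2, op2, x2 in history[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (op1, op2):
                edges.add((t1, t2))          # t1's operation precedes t2's
    return edges

def has_cycle(edges):
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
    def reachable(src, dst, seen=()):
        return any(n == dst or (n not in seen and reachable(n, dst, seen + (n,)))
                   for n in graph.get(src, ()))
    return any(reachable(t, t) for t in graph)
```

A history where T1 and T2 each overwrite the other's item in opposite orders produces the cycle T1 -> T2 -> T1, which is precisely what the two-phase rule rules out.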

SLIDE 9

2PL and SQL isolation levels

  • Uncommitted read
  • Typically only allowed for read-only transactions (see later)
  • Implemented by not acquiring a lock before reading
  • Read committed
  • Keep write locks until end => read committed
  • Get a short-lived lock for reading (released after reading) => non-repeatable reads
  • Repeatable Read
  • Keep write locks until end
  • Keep read locks until end => what is read will not change
  • Serializable
  • Like repeatable read
  • Read lock on table to prevent phantoms (or cursor locking)

SLIDE 10

What about recovery?

  • Locking affects recovery in terms of deciding what to do when a transaction aborts
  • Basic 2PL does not say anything about aborting or committing transactions:
  • … L1[x]w1[x]U1[x]L2[x]w2[x]L2[y]w2[y]U2[y] …
  • In principle, it is serializable
  • If T1 aborts, recovery using before images would not work, as undoing T1 would undo the changes made by T2 (one could argue that there is no need to recover anything since x has been rewritten, but if T2 then aborts, T2 cannot be recovered by restoring a before image)

SLIDE 11

Strict 2PL

  • Strict 2PL combines 2PL with strict histories:
  • Enforce 2PL
  • Keep read and write locks until the transaction commits or aborts
  • Strict 2PL results in serializable and strict histories
  • 2PL ensures serializability
  • Keeping all locks to the end ensures no uncommitted data can be read or over-written
  • All engines implement Strict 2PL (research papers sometimes propose to weaken the rule, but it is neither a good nor a practical idea)

SLIDE 12

Deadlocks

Step  T1            T2            Comment
1.    BOT
2.    lockX(A)
3.                  BOT
4.                  lockS(B)
5.                  read(B)
6.    read(A)
7.    write(A)
8.    lockX(B)                    T1 must wait for T2
9.                  lockS(A)      T2 must wait for T1
10.   ...           ...           Deadlock

SLIDE 13

Deadlock Detection: in theory

Wait-for Graph

  • T1 → T2 → T3 → T4 → T1
  • Aborting T3 will resolve the cycle
  • Building the graph is expensive in systems running many concurrent transactions

SLIDE 14

Deadlock detection: in practice

  • In practice, no wait-for-graph is built:
  • Transactions/queries have a timer
  • If the timer expires, the transaction/query is aborted
  • Deadlocks are considered rare enough that it is preferable to make a mistake every now and then (aborting a transaction that was fine) rather than implement something as expensive as a wait-for graph
  • This approach has many implications in practice:
  • A long transaction might block queries
  • The queries might abort because their timer expires while waiting for the transaction
  • Badly written applications can create such scenarios
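The timeout approach can be sketched as follows (a Python sketch; the `lock_timeout` and `poll` values are illustrative knobs, not any real system's defaults):

```python
# A waiting transaction is simply aborted when its timer expires, whether
# or not a real deadlock exists. Cheap, but it can abort a transaction that
# was merely waiting behind a slow one.
import time

class TimeoutAbort(Exception):
    pass

def wait_for_lock(is_free, lock_timeout=0.05, poll=0.005):
    """Poll until the lock is free; abort the waiter on timeout."""
    deadline = time.monotonic() + lock_timeout
    while not is_free():
        if time.monotonic() >= deadline:
            raise TimeoutAbort("waited too long: assume deadlock and abort")
        time.sleep(poll)
```

A lock that is never released makes the waiter abort after the timeout; a free lock lets it proceed immediately, without ever building a wait-for graph.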

SLIDE 15

Locking Table

SLIDE 16

Transaction Manager

  • A database engine uses a transaction manager to enforce concurrency control
  • Implemented in many different forms
  • Deeply connected with the recovery manager and I/O system
  • We will cover the classical implementation

[Figure: transactions T1, T2, T3, …, Tn submit operations to the Transaction Manager (TM) and its scheduler, which works with the Recovery Manager, Buffer Manager, and Data Manager on top of the storage system]

SLIDE 17

Basic functions of a Transaction Manager

[Figure: the transaction table maps each transaction Ti via a handler to its lock list (w[x], w[y]) and its log records (Ti, w[x]; Ti, w[y]); a lock hash table maps hashed items to lock headers (X, Y) with chained lock requests from Ti and Tj]

SLIDE 18

Basic transaction Manager

  • Transaction table: list of active transactions in the system, generally maintained by the engine in the common area
  • Transaction handler: pointer to the structures containing all the relevant information related to a transaction (these data structures could be in the private area of a session)
  • Lock table: a hash table containing entries that correspond to active locks. Locks on the same item are captured as a linked list
  • Log: entries that capture the operations performed; they are kept in memory until it is time to write them to disk

SLIDE 19

Basic operations transaction manager

  • Begin transaction:
  • create an entry in the transaction table (no log entry is created unless explicitly requested)
  • Read/write operation:
  • hash the tuple id to find the corresponding entry in the lock table
  • If empty, the lock is granted
  • If there is a list, attach the request at the end of the list (the request is granted if the earlier requests are compatible)
  • Write operation:
  • Create a log entry (with Log Sequence Number (LSN), before and/or after image, transaction id, and a pointer (LSN) to the previous log entry of the same transaction)
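The lock-table lookup described above (hash the tuple id to a bucket, queue the request, grant it if all earlier requests on the item are compatible) can be sketched like this (Python; the bucket count, names, and two-mode compatibility rule are simplifying assumptions):

```python
# A tiny hash-bucketed lock table: each bucket holds (item, txn, mode)
# requests; requests on the same item form an ordered list within the bucket.

NBUCKETS = 64
lock_table = [[] for _ in range(NBUCKETS)]

def compatible(m1, m2):
    return m1 == "S" and m2 == "S"           # only shared locks coexist

def request_lock(item, txn, mode):
    bucket = lock_table[hash(item) % NBUCKETS]
    earlier = [(t, m) for (i, t, m) in bucket if i == item]
    bucket.append((item, txn, mode))          # queue at the end of the list
    return all(compatible(m, mode) for (_, m) in earlier)   # False = block
```

Two shared requests on the same tuple are both granted; a later exclusive request finds incompatible predecessors and must wait, while a request on a different tuple is unaffected.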

SLIDE 20

Basic operations transaction manager

  • Commit transaction:
  • Release locks using the transaction lock list
  • Resume transactions waiting for the locks now released
  • Finalize log entries
  • Write log entries to storage
  • May write actual data modified to storage
  • Abort transaction
  • Release locks using the transaction lock list
  • Resume transactions waiting for the locks now released
  • Use log entries to undo changes (or discard changes)
  • May write log entries to storage

SLIDE 21

Real systems

  • Actual database engines implement this basic design in many different ways:
  • Often, there is an actual lock table similar to the one described
  • Sometimes there is no lock table (e.g., Oracle): locks are stored in the blocks containing the tuples (this makes sense: instead of using the tuple id (block id, offset) to hash into the lock table, use the corresponding block as the lock-table entry for all tuples in that block => no need to keep a separate table)
  • Log records can take many forms (redo, undo, undo/redo, etc.; see later)
  • Lock entries can also vary in their structure
  • Recall the headers in the blocks: they are also used to implement some of these data structures, including information about what is locked

SLIDE 22

Implementation aspects

SLIDE 23

What is a lock? What to lock?

  • We have so far studied locks in a limited context
  • Read/write, shared/exclusive
  • In reality, there are many more modes.
  • Read/write locks are too coarse and may result in more conflicts than necessary
  • Some types of locks (table locks) are too big (they lock many tuples), and having more information helps to reduce conflicts
  • Which lock modes are supported and what can be locked is system specific

SLIDE 24

Hierarchical locking

  • As we have seen for phantoms, locks are obtained not only on tuples but also on other database objects:
  • Tables
  • Blocks
  • Indexes
  • Index blocks
  • Schema
  • All these locks have the same purpose: prevent conflicts when updating and modifying the system.

SLIDE 25

Cursor locking

  • Cursors are used to traverse tables
  • They can also be used (and are used) as a way to implement concurrency control:
  • Get a cursor on a table
  • As the cursor advances, the corresponding locks are obtained on the next tuple and released on the previous one
  • Cursors cannot get ahead of each other
  • Useful for select statements scanning a table and using a complex predicate to decide what to update


EXEC SQL declare new_curs cursor for select * from customer_notansi for update;

https://www.ibm.com/support/knowledgecenter/en/SSGU8G_12.1.0/com.ibm.sqls.doc/ids_sqs_0660.htm

SLIDE 26

Lock conversion

  • SQL manages tables rather than tuples
  • Concurrency control operates at the tuple level
  • An update statement might have to read many tuples until it finds the one it wants to update:
  • Get a read lock on all items in a table
  • Convert to a write lock when the tuple matches the predicate
  • Lock conversion might lead to deadlocks:
  • Get the low-level lock (shared)
  • Another transaction also gets a shared lock
  • When the lock needs to be upgraded, we have to wait
  • This will happen with transactions that are doing similar operations
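The upgrade deadlock can be shown with a tiny model (a Python sketch with illustrative names): an upgrade from S to X is grantable only if no other transaction still holds S on the item, so two transactions that both acquired the shared lock first block each other forever.

```python
# Both transactions took S on x before either tried to upgrade.
holders = {"x": {"T1": "S", "T2": "S"}}

def can_upgrade(item, txn):
    """An S->X upgrade needs every *other* S holder to be gone first."""
    others = {t for t in holders[item] if t != txn}
    return not others    # any other shared holder blocks the upgrade
```

Neither T1 nor T2 can upgrade while the other still holds S: each waits for the other to release, which is the deadlock the slide describes. Update locks (see the SQL Server modes on the next slide) exist precisely to avoid this pattern.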

SLIDE 27

Examples of locks (SQL Server)

  • Shared locks => for reads
  • Update locks => for updates, but they do not allow writing
  • Exclusive locks => for writing
  • Intent locks => used to lock objects higher in the hierarchy to prevent other transactions from preventing operations lower down
  • Schema locks => prevent changes to the schema or access to objects while the schema is changing (dropping a table, dropping a column)
  • Key-range locks => lock a range of tuples to prevent phantoms


https://www.sqlteam.com/articles/introduction-to-locking-in-sql-server

SLIDE 28

Examples of Table locks (Oracle)

  • Row Share: allows concurrent access to the table but does not allow locking the entire table in exclusive mode (allows updates without locking the entire table)
  • Row Exclusive: does not allow either shared or exclusive locks on the table (used when updating, inserting, or deleting)
  • Share: allows concurrent queries but no updates
  • Share Row Exclusive: allows reading the table and allows others to read the table, but they cannot lock the table
  • Exclusive: allows queries but does not allow anything else (no updates, no locking)

SLIDE 29

Snapshot isolation

SLIDE 30

Snapshot isolation

  • Initially used by Oracle, it is a form of multiversion concurrency control (MVCC)
  • MVCC enforces serializability while keeping in mind that there are several versions of each object:
  • Writes generate a new version
  • Reads read from a concrete version of the database
  • Useful for many reasons, among others that it can be implemented without using locks

SLIDE 31

Snapshot isolation

  • When a transaction starts, it receives a timestamp T.
  • Reads:
  • All reads of that transaction only access items that were committed as of time T (it reads the database version as of T).
  • Writes:
  • All writes are carried out in a separate buffer (shadow paging).
  • Writes only become visible after a commit.
  • When a transaction commits, the engine checks for conflicts
  • First-committer-wins rule:
  • Abort T1 if there exists a T2 such that
  • T2 committed after T1 started and before T1 commits
  • T1 and T2 updated the same object
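A minimal sketch of the first-committer-wins check (Python; the integer timestamps and the `last_committed` structure are illustrative assumptions):

```python
# At commit, a transaction aborts if any object in its write set was
# committed by someone else after the transaction's start timestamp.

last_committed = {}          # object -> commit timestamp of latest version
clock = [0]                  # global commit counter

def si_commit(start_ts, write_set):
    for obj in write_set:
        if last_committed.get(obj, -1) > start_ts:
            return False                     # first committer already won: abort
    clock[0] += 1
    for obj in write_set:
        last_committed[obj] = clock[0]
    return True
```

If two transactions that started together both update x, whichever commits first succeeds and the other aborts; a transaction with a disjoint write set is unaffected.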

SLIDE 32

Is it serializable?

  • Two transactions T1 and T2 access two data items x and y
  • r1[x]r1[y] r2[x]r2[y] w1[x] w2[y] c1 c2
  • The history is not serializable: T2 -> T1 -> T2
  • But is it wrong?
  • It depends …
  • Correct from the application perspective:
  • The transactions want to book seats in a theater
  • They look at the same seats x and y and find them empty
  • Each one books a different seat: T1 gets x, and T2 gets y

SLIDE 33

Is it serializable?

  • Incorrect from the database perspective:
  • The transactions are payments against two bank accounts
  • The accounts belong to the same owner
  • The bank allows individual accounts to be negative as long as the total balance of all the accounts with the same owner is positive
  • Constraint: x + y >= 0
  • Both read the total balance and find the payment they want to perform is fine
  • However, if T1 subtracts money from account x and T2 subtracts money from account y, there is no guarantee that the total will still be positive, since neither of them sees the other

SLIDE 34

A controversial matter

  • The behavior of snapshot isolation was the source of many controversies and debates among companies and systems (Oracle implemented it while everybody else implemented traditional serializability)
  • Snapshot isolation has significant performance advantages
  • Today it is used by almost everybody, especially in modern engines
  • It is definitely used in all modern main-memory OLTP databases
  • It also has advantages in distributed settings and with replication (see later in the course)

SLIDE 35

SI and SQL isolation levels (Oracle)

  • Uncommitted Reads:
  • Not supported
  • Read Committed:
  • Called Read Consistency in Oracle (it is more than ANSI Read Committed)
  • Reads from the committed version as of the start (the ANSI level allows reading later versions)
  • Write locks are kept until the end of the transaction
  • Repeatable Reads:
  • Not supported as a separate level (it is given by the definition of snapshot isolation)
  • Serializable:
  • As discussed, also provides repeatable reads

SLIDE 36

SI Implementation

  • Snapshot isolation can be implemented without locks
  • Reads do not wait for writes and vice versa
  • Conflicting writes lead to one transaction being aborted
  • Reads:
  • Get the query/transaction timestamp
  • The database keeps versions of the tuples by adding an LSN to the tuple entry in each block (a max LSN can also be included for the block)
  • Read the tuple with the correct LSN (the highest LSN that is lower than the start timestamp)
  • Writes:
  • Create a new copy of the tuple
  • Check at the end whether the latest version of every written tuple is the one written. If not, somebody committed a new version and the transaction must abort
  • Locks are still used to avoid too many conflicts (table locks)
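The read rule ("highest LSN lower than the start timestamp") can be sketched as follows (Python; the per-tuple version-list layout and the sample data are illustrative assumptions):

```python
# Versioned read sketch: each tuple keeps (lsn, value) versions in LSN
# order; a reader sees the version with the highest LSN below its start
# timestamp, and nothing if no version is old enough.

versions = {"x": [(1, "a"), (4, "b"), (9, "c")]}

def read_version(item, start_ts):
    visible = [(lsn, val) for (lsn, val) in versions[item] if lsn < start_ts]
    return max(visible)[1] if visible else None
```

A reader that started at timestamp 5 sees version 4 of x even if version 9 has since been committed, which is why reads never wait for writes.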

SLIDE 37

SI implementation (Oracle)

  • Transactions create undo records of every modification they make
  • These undo records are kept in an undo segment
  • When a query/transaction starts, it is assigned an SCN (System Change Number)
  • Reads are over blocks with an SCN lower than the start SCN
  • If a block is too new, a copy is made and undo records are applied to recreate a version with a suitable SCN


https://docs.oracle.com/cd/E25054_01/server.1111/e25789/consist.htm

SLIDE 38

Undo Tablespaces and segments

  • Each block is used by a single transaction
  • Segments are organized as a ring
  • Transactions write undo records using blocks assigned to them within a segment; when the segment is full, they continue on a new segment that does not have active transactions
  • Several segments can be active at the same time


https://docs.oracle.com/cd/E25054_01/server.1111/e25789/logical.htm#CHDGJJEJ

SLIDE 39

Read Consistency (Oracle)

Read Committed (SQL ANSI)

  • Only reads committed data
  • Reads are not repeatable
  • May still read data that is inconsistent by reading from two different transactions when it should read from only one:
  • w1[x]c1 r3[x] w2[x]w2[y]c2 r3[y]

Read Consistency (Oracle)

  • Reads data that is “transaction consistent”
  • Repeatable reads
  • Will see a single version of the database
  • Will not see updates done after it started
  • A much stronger guarantee than read committed

SLIDE 40

Flashback queries

  • Undo records have a retention period
  • For undoing active transactions, undo records are needed only until the transaction commits
  • But SI and flashback queries may require those undo records to recreate older versions
  • For SI, undo records only need to be kept until there is no active transaction that would read them
  • For flashback queries, they can be kept much longer (or made persistent)

SLIDE 41

Undo and Recovery

  • As proof of the deep interaction between concurrency control and recovery, we have started discussing the undo records
  • In other systems, these are also called the undo log
  • Used for rolling back active transactions but also for recovery
  • Flashback to an earlier, logically consistent point (user errors)
  • Undoing committed transactions for database recovery
  • We will discuss the undo log, redo log, and how they are used for recovery in the next section.
