ADVANCED DATABASE SYSTEMS Transaction Models & Concurrency - - PowerPoint PPT Presentation

advanced database systems
SMART_READER_LITE
LIVE PREVIEW

ADVANCED DATABASE SYSTEMS Transaction Models & Concurrency - - PowerPoint PPT Presentation

Lect ure # 02 ADVANCED DATABASE SYSTEMS Transaction Models & Concurrency Control @ Andy_Pavlo // 15- 721 // Spring 2019 CMU 15-721 (Spring 2019) 2 Background Transaction Models Concurrency Control Protocols Isolation Levels CMU


slide-1
SLIDE 1

Transaction Models & Concurrency Control

@ Andy_Pavlo // 15- 721 // Spring 2019

ADVANCED DATABASE SYSTEMS

Lect ure # 02

slide-2
SLIDE 2 CMU 15-721 (Spring 2019)

Background Transaction Models Concurrency Control Protocols Isolation Levels

2

slide-3
SLIDE 3 CMU 15-721 (Spring 2019)

CO URSE OVERVIEW

This course is on database systems for modern transaction processing and analytical workloads. The first three weeks are focused on how to ingest new data quickly. We will then discuss how to analyze that data and ask complex questions about it.

3

slide-4
SLIDE 4 CMU 15-721 (Spring 2019)

DATABASE WO RKLOADS

On-Line Transaction Processing (OLTP)

→ Fast operations that only read/update a small amount of data each time.

On-Line Analytical Processing (OLAP)

→ Complex queries that read a lot of data to compute aggregates.

Hybrid Transaction + Analytical Processing

→ OLTP + OLAP together on the same database instance

4

slide-5
SLIDE 5 CMU 15-721 (Spring 2019)

BIFURCATED EN VIRO N M EN T

5

OLAP Data Warehouse OLTP Data Silos

Transactions

slide-6
SLIDE 6 CMU 15-721 (Spring 2019)

BIFURCATED EN VIRO N M EN T

5

Extract Transform Load OLAP Data Warehouse OLTP Data Silos

Transactions

slide-7
SLIDE 7 CMU 15-721 (Spring 2019)

BIFURCATED EN VIRO N M EN T

5

Extract Transform Load OLAP Data Warehouse OLTP Data Silos

Analytical Queries Transactions

slide-8
SLIDE 8 CMU 15-721 (Spring 2019)

BIFURCATED EN VIRO N M EN T

5

Extract Transform Load OLAP Data Warehouse

Analytical Queries Transactions

HTAP Database

slide-9
SLIDE 9 CMU 15-721 (Spring 2019)

WO RKLOAD CH ARACTERIZATIO N

6

Writes Reads Simple Complex

Workload Focus Operation Complexity

OLTP OLAP

Source: Michael Stonebraker

slide-10
SLIDE 10 CMU 15-721 (Spring 2019)

TRAN SACTIO N DEFIN ITIO N

A txn is a sequence of actions that are executed on a shared database to perform some higher-level function. Txns are the basic unit of change in the DBMS. No partial txns are allowed.

7

slide-11
SLIDE 11 CMU 15-721 (Spring 2019)

ACTIO N CLASSIFICATIO N

Unprotected Actions

→ These lack all of the ACID properties except for

  • consistency. Their effects cannot be depended upon.

Protected Actions

→ These do not externalize their results before they are completely done. Fully ACID.

Real Actions

→ These affect the physical world in a way that is hard or impossible to reverse.

8

slide-12
SLIDE 12 CMU 15-721 (Spring 2019)

TRAN SACTIO N M O DELS

Flat Txns Flat Txns + Savepoints Chained Txns Nested Txns Saga Txns Compensating Txns

9

slide-13
SLIDE 13 CMU 15-721 (Spring 2019)

FLAT TRAN SACTIO NS

Standard txn model that starts with BEGIN, followed by one or more actions, and then completed with either COMMIT or ROLLBACK.

10

Txn #1

BEGIN READ(A) COMMIT WRITE(B)

Txn #2

BEGIN READ(A) WRITE(B) ROLLBACK

slide-14
SLIDE 14 CMU 15-721 (Spring 2019)

LIM ITATIO N S O F FLAT TRAN SACTIO NS

The application can only rollback the entire txn (i.e., no partial rollbacks). All of a txn's work is lost is the DBMS fails before that txn finishes. Each txn takes place at a single point in time.

11

slide-15
SLIDE 15 CMU 15-721 (Spring 2019)

LIM ITATIO N S O F FLAT TRAN SACTIO NS

Example #1: Multi-Stage Planning

→ An application needs to make multiple reservations. → All the reservations need to occur or none of them.

Example #2: Bulk Updates

→ An application needs to update one billion records. → This txn could take hours to complete and therefore the DBMS is exposed to losing all of its work for any failure

  • r conflict.

12

slide-16
SLIDE 16 CMU 15-721 (Spring 2019)

TRAN SACTIO N SAVEPO IN TS

Save the current state of processing for the txn and provide a handle for the application to refer to that savepoint. The application can control the state of the txn through these savepoints:

→ ROLLBACK – Revert all changes back to the state of the DB at the savepoint. → RELEASE – Destroys a savepoint previously defined in the txn.

13

slide-17
SLIDE 17 CMU 15-721 (Spring 2019)

TRAN SACTIO N SAVEPO IN TS

14

Txn #1

BEGIN WRITE(A) COMMIT

SAVEPOINT 1

WRITE(B)

ROLLBACK TO 1

WRITE(C) New Savepoint

slide-18
SLIDE 18 CMU 15-721 (Spring 2019)

TRAN SACTIO N SAVEPO IN TS

14

Txn #1

BEGIN WRITE(A) COMMIT

SAVEPOINT 1

WRITE(B)

ROLLBACK TO 1

WRITE(C) A New Savepoint

slide-19
SLIDE 19 CMU 15-721 (Spring 2019)

TRAN SACTIO N SAVEPO IN TS

14

Txn #1

BEGIN WRITE(A) COMMIT

SAVEPOINT 1

WRITE(B)

ROLLBACK TO 1

WRITE(C) A New Savepoint Savepoint#1

slide-20
SLIDE 20 CMU 15-721 (Spring 2019)

TRAN SACTIO N SAVEPO IN TS

14

Txn #1

BEGIN WRITE(A) COMMIT

SAVEPOINT 1

WRITE(B)

ROLLBACK TO 1

WRITE(C) A New Savepoint B Savepoint#1

slide-21
SLIDE 21 CMU 15-721 (Spring 2019)

TRAN SACTIO N SAVEPO IN TS

14

Txn #1

BEGIN WRITE(A) COMMIT

SAVEPOINT 1

WRITE(B)

ROLLBACK TO 1

WRITE(C) A New Savepoint B New Savepoint

X

Savepoint#1

slide-22
SLIDE 22 CMU 15-721 (Spring 2019)

TRAN SACTIO N SAVEPO IN TS

14

Txn #1

BEGIN WRITE(A) COMMIT

SAVEPOINT 1

WRITE(B)

ROLLBACK TO 1

WRITE(C) A New Savepoint B New Savepoint C

X

Savepoint#1

slide-23
SLIDE 23 CMU 15-721 (Spring 2019)

TRAN SACTIO N SAVEPO IN TS

14

Txn #1

BEGIN WRITE(A) COMMIT

SAVEPOINT 1

WRITE(B)

ROLLBACK TO 1

WRITE(C) A New Savepoint B New Savepoint C

X

Savepoint#1

slide-24
SLIDE 24 CMU 15-721 (Spring 2019)

TRAN SACTIO N SAVEPO IN TS

15

Txn #1

BEGIN WRITE(A)

SAVEPOINT 3 SAVEPOINT 1

WRITE(B)

SAVEPOINT 2

WRITE(C)

ROLLBACK TO 3 RELEASE 2

WRITE(D)

slide-25
SLIDE 25 CMU 15-721 (Spring 2019)

TRAN SACTIO N SAVEPO IN TS

15

Txn #1

BEGIN WRITE(A)

SAVEPOINT 3 SAVEPOINT 1

WRITE(B)

SAVEPOINT 2

WRITE(C) A

ROLLBACK TO 3 RELEASE 2

WRITE(D) New Savepoint

slide-26
SLIDE 26 CMU 15-721 (Spring 2019)

TRAN SACTIO N SAVEPO IN TS

15

Txn #1

BEGIN WRITE(A)

SAVEPOINT 3 SAVEPOINT 1

WRITE(B)

SAVEPOINT 2

WRITE(C) A

ROLLBACK TO 3 RELEASE 2

WRITE(D) Savepoint#1 New Savepoint

slide-27
SLIDE 27 CMU 15-721 (Spring 2019)

TRAN SACTIO N SAVEPO IN TS

15

Txn #1

BEGIN WRITE(A)

SAVEPOINT 3 SAVEPOINT 1

WRITE(B)

SAVEPOINT 2

WRITE(C) A B

ROLLBACK TO 3 RELEASE 2

WRITE(D) Savepoint#1 New Savepoint

slide-28
SLIDE 28 CMU 15-721 (Spring 2019)

TRAN SACTIO N SAVEPO IN TS

15

Txn #1

BEGIN WRITE(A)

SAVEPOINT 3 SAVEPOINT 1

WRITE(B)

SAVEPOINT 2

WRITE(C) A B

ROLLBACK TO 3 RELEASE 2

WRITE(D) Savepoint#1 Savepoint#2 New Savepoint

slide-29
SLIDE 29 CMU 15-721 (Spring 2019)

TRAN SACTIO N SAVEPO IN TS

15

Txn #1

BEGIN WRITE(A)

SAVEPOINT 3 SAVEPOINT 1

WRITE(B)

SAVEPOINT 2

WRITE(C) A B C

ROLLBACK TO 3 RELEASE 2

WRITE(D) Savepoint#1 Savepoint#2 New Savepoint

slide-30
SLIDE 30 CMU 15-721 (Spring 2019)

TRAN SACTIO N SAVEPO IN TS

15

Txn #1

BEGIN WRITE(A)

SAVEPOINT 3 SAVEPOINT 1

WRITE(B)

SAVEPOINT 2

WRITE(C) A B C

ROLLBACK TO 3 RELEASE 2

WRITE(D) New Savepoint Savepoint#1 Savepoint#2 Savepoint#3

slide-31
SLIDE 31 CMU 15-721 (Spring 2019)

TRAN SACTIO N SAVEPO IN TS

15

Txn #1

BEGIN WRITE(A)

SAVEPOINT 3 SAVEPOINT 1

WRITE(B)

SAVEPOINT 2

WRITE(C) A B C

ROLLBACK TO 3 RELEASE 2

WRITE(D) New Savepoint Savepoint#1

slide-32
SLIDE 32 CMU 15-721 (Spring 2019)

TRAN SACTIO N SAVEPO IN TS

15

Txn #1

BEGIN WRITE(A)

SAVEPOINT 3 SAVEPOINT 1

WRITE(B)

SAVEPOINT 2

WRITE(C) A B C

ROLLBACK TO 3 RELEASE 2

WRITE(D) New Savepoint D Savepoint#1

slide-33
SLIDE 33 CMU 15-721 (Spring 2019)

TRAN SACTIO N SAVEPO IN TS

15

Txn #1

BEGIN WRITE(A)

SAVEPOINT 3 SAVEPOINT 1

WRITE(B)

SAVEPOINT 2

WRITE(C) A B C

ROLLBACK TO 3 RELEASE 2

WRITE(D) New Savepoint D

???

Savepoint#1

slide-34
SLIDE 34 CMU 15-721 (Spring 2019)

TRAN SACTIO N SAVEPO IN TS

15

Txn #1

BEGIN WRITE(A)

SAVEPOINT 3 SAVEPOINT 1

WRITE(B)

SAVEPOINT 2

WRITE(C) A B C

ROLLBACK TO 3 RELEASE 2

WRITE(D) New Savepoint D Savepoint#1

slide-35
SLIDE 35 CMU 15-721 (Spring 2019)

N ESTED TRAN SACTIO NS

Savepoints organize a transaction as a sequence of actions that can be rolled back individually. Nested txns form a hierarchy of work.

→ The outcome of a child txn depends on the outcome of its parent txn.

16

slide-36
SLIDE 36 CMU 15-721 (Spring 2019)

N ESTED TRAN SACTIO NS

17

Txn #1

BEGIN WRITE(A) BEGIN BEGIN WRITE(C) COMMIT COMMIT WRITE(B) ROLLBACK WRITE(D) BEGIN

slide-37
SLIDE 37 CMU 15-721 (Spring 2019)

N ESTED TRAN SACTIO NS

17

Txn #1

BEGIN WRITE(A) BEGIN BEGIN WRITE(C) COMMIT COMMIT WRITE(B) ROLLBACK WRITE(D) BEGIN

slide-38
SLIDE 38 CMU 15-721 (Spring 2019)

N ESTED TRAN SACTIO NS

17

Txn #1

BEGIN WRITE(A) BEGIN BEGIN WRITE(C) COMMIT COMMIT WRITE(B) ROLLBACK WRITE(D) BEGIN

slide-39
SLIDE 39 CMU 15-721 (Spring 2019)

Sub-Txn #1.1

N ESTED TRAN SACTIO NS

17

Txn #1

BEGIN WRITE(A) BEGIN BEGIN WRITE(C) COMMIT COMMIT WRITE(B) ROLLBACK WRITE(D) BEGIN

slide-40
SLIDE 40 CMU 15-721 (Spring 2019)

Sub-Txn #1.1

N ESTED TRAN SACTIO NS

17

Txn #1

BEGIN WRITE(A) BEGIN BEGIN WRITE(C) COMMIT COMMIT WRITE(B) ROLLBACK WRITE(D) BEGIN

slide-41
SLIDE 41 CMU 15-721 (Spring 2019)

Sub-Txn #1.1

N ESTED TRAN SACTIO NS

17

Txn #1

BEGIN WRITE(A) BEGIN BEGIN WRITE(C) COMMIT COMMIT WRITE(B) ROLLBACK WRITE(D) BEGIN

slide-42
SLIDE 42 CMU 15-721 (Spring 2019)

Sub-Txn #1.1

N ESTED TRAN SACTIO NS

17

Sub-Txn #1.1.1 Txn #1

BEGIN WRITE(A) BEGIN BEGIN WRITE(C) COMMIT COMMIT WRITE(B) ROLLBACK WRITE(D) BEGIN

slide-43
SLIDE 43 CMU 15-721 (Spring 2019)

Sub-Txn #1.1

N ESTED TRAN SACTIO NS

17

Sub-Txn #1.1.1

BEGIN

Txn #1

BEGIN WRITE(A) BEGIN BEGIN WRITE(C) COMMIT COMMIT WRITE(B) ROLLBACK WRITE(D) BEGIN

slide-44
SLIDE 44 CMU 15-721 (Spring 2019)

Sub-Txn #1.1

N ESTED TRAN SACTIO NS

17

Sub-Txn #1.1.1

BEGIN

Txn #1

BEGIN WRITE(A) BEGIN BEGIN WRITE(C) COMMIT COMMIT WRITE(B) ROLLBACK WRITE(D) BEGIN

slide-45
SLIDE 45 CMU 15-721 (Spring 2019)

Sub-Txn #1.1

N ESTED TRAN SACTIO NS

17

Sub-Txn #1.1.1

BEGIN

Txn #1

BEGIN WRITE(A) BEGIN BEGIN WRITE(C) COMMIT COMMIT WRITE(B) ROLLBACK WRITE(D) BEGIN

slide-46
SLIDE 46 CMU 15-721 (Spring 2019)

Sub-Txn #1.1

N ESTED TRAN SACTIO NS

17

Sub-Txn #1.1.1

BEGIN

Txn #1

BEGIN WRITE(A) BEGIN BEGIN WRITE(C) COMMIT COMMIT WRITE(B) ROLLBACK WRITE(D) BEGIN

X X X

slide-47
SLIDE 47 CMU 15-721 (Spring 2019)

Sub-Txn #1.1

N ESTED TRAN SACTIO NS

17

Sub-Txn #1.1.1

BEGIN

Txn #1

BEGIN WRITE(A) BEGIN BEGIN WRITE(C) COMMIT COMMIT WRITE(B) ROLLBACK WRITE(D) BEGIN

X X X ✓

slide-48
SLIDE 48 CMU 15-721 (Spring 2019)

TRAN SACTIO N CH AIN S

Multiple txns executed one after another. Combined COMMIT / BEGIN operation is atomic.

→ No other txn can change the state of the database as seen by the second txn from the time that the first txn commits and the second txn begins.

Differences with savepoints:

→ COMMIT allows the DBMS to free locks. → Cannot rollback previous txns in chain.

18

slide-49
SLIDE 49 CMU 15-721 (Spring 2019)

TRAN SACTIO N CH AIN S

19

Txn #1

BEGIN WRITE(A) COMMIT

Txn #2

BEGIN READ(A) COMMIT

Txn #3

BEGIN WRITE(C) ROLLBACK WRITE(B)

slide-50
SLIDE 50 CMU 15-721 (Spring 2019)

TRAN SACTIO N CH AIN S

19

Txn #1

BEGIN WRITE(A) COMMIT

Txn #2

BEGIN READ(A) COMMIT

Txn #3

BEGIN WRITE(C) ROLLBACK A WRITE(B)

slide-51
SLIDE 51 CMU 15-721 (Spring 2019)

TRAN SACTIO N CH AIN S

19

Txn #1

BEGIN WRITE(A) COMMIT

Txn #2

BEGIN READ(A) COMMIT

Txn #3

BEGIN WRITE(C) ROLLBACK A WRITE(B)

slide-52
SLIDE 52 CMU 15-721 (Spring 2019)

TRAN SACTIO N CH AIN S

19

Txn #1

BEGIN WRITE(A) COMMIT

Txn #2

BEGIN READ(A) COMMIT

Txn #3

BEGIN WRITE(C) ROLLBACK A B WRITE(B)

slide-53
SLIDE 53 CMU 15-721 (Spring 2019)

TRAN SACTIO N CH AIN S

19

Txn #1

BEGIN WRITE(A) COMMIT

Txn #2

BEGIN READ(A) COMMIT

Txn #3

BEGIN WRITE(C) ROLLBACK A B WRITE(B)

slide-54
SLIDE 54 CMU 15-721 (Spring 2019)

TRAN SACTIO N CH AIN S

19

Txn #1

BEGIN WRITE(A) COMMIT

Txn #2

BEGIN READ(A) COMMIT

Txn #3

BEGIN WRITE(C) ROLLBACK A B C

X ✓ ✓

WRITE(B)

slide-55
SLIDE 55 CMU 15-721 (Spring 2019)

BULK UPDATE PRO BLEM

These other txn models are nice, but they still do not solve our bulk update problem. Chained txns seems like the right idea but they require the application to handle failures and maintain its own state.

→ Has to be able to reverse changes when things fail.

20

slide-56
SLIDE 56 CMU 15-721 (Spring 2019)

CO M PEN SATIN G TRAN SACTIO N S

A special type of txn that is designed to semantically reverse the effects of another already committed txn. Reversal has to be logical instead of physical.

→ Example: Decrement a counter by one instead of reverting to the original value.

21

slide-57
SLIDE 57 CMU 15-721 (Spring 2019)

SAGA TRAN SACTIO N S

A sequence of chained txns T1–Tn and compensating txns C1–Cn-1 where one of the following is guaranteed: →The txns will commit in the order T1…Tj,Cj…C1 (where j < n) This allows the DBMS to support long-running, multi-step txns without application-managed logic

22

SAGAS

SIGMOD 1987

slide-58
SLIDE 58 CMU 15-721 (Spring 2019)

SAGA TRAN SACTIO N S

23

Txn #1

BEGIN WRITE(A+1) COMMIT

Txn #2

BEGIN WRITE(B+1) COMMIT

Txn #3

BEGIN WRITE(C+1)

slide-59
SLIDE 59 CMU 15-721 (Spring 2019)

SAGA TRAN SACTIO N S

23

Txn #1

BEGIN WRITE(A+1) COMMIT

Txn #2

BEGIN WRITE(B+1) COMMIT

Txn #3

BEGIN WRITE(C+1)

slide-60
SLIDE 60 CMU 15-721 (Spring 2019)

SAGA TRAN SACTIO N S

23

Txn #1

BEGIN WRITE(A+1) COMMIT

Txn #2

BEGIN WRITE(B+1) COMMIT

Txn #3

BEGIN WRITE(C+1)

Comp Txn #2

BEGIN WRITE(B-1) COMMIT

slide-61
SLIDE 61 CMU 15-721 (Spring 2019)

SAGA TRAN SACTIO N S

23

Txn #1

BEGIN WRITE(A+1) COMMIT

Txn #2

BEGIN WRITE(B+1) COMMIT

Txn #3

BEGIN WRITE(C+1)

Comp Txn #1

BEGIN WRITE(A-1) COMMIT

Comp Txn #2

BEGIN WRITE(B-1) COMMIT

slide-62
SLIDE 62 CMU 15-721 (Spring 2019)

CO N CURREN CY CO N TRO L

The protocol to allow txns to access a database in a multi-programmed fashion while preserving the illusion that each of them is executing alone on a dedicated system.

→ The goal is to have the effect of a group of txns on the database’s state is equivalent to any serial execution of all txns.

Provides Atomicity + Isolation in ACID

24

slide-63
SLIDE 63 CMU 15-721 (Spring 2019)

TXN IN TERN AL STATE

Status

→ The current execution state of the txn.

Undo Log Entries

→ Stored in an in-memory data structure. → Dropped on commit.

Redo Log Entries

→ Append to the in-memory tail of WAL. → Flushed to disk on commit.

Read/Write Set

→ Depends on the concurrency control scheme.

25

slide-64
SLIDE 64 CMU 15-721 (Spring 2019)

CO N CURREN CY CO N TRO L SCH EM ES

Two-Phase Locking (2PL)

→ Assume txns will conflict so they must acquire locks on database objects before they are allowed to access them.

Timestamp Ordering (T/O)

→ Assume that conflicts are rare so txns do not need to first acquire locks on database objects and instead check for conflicts at commit time.

26

slide-65
SLIDE 65 CMU 15-721 (Spring 2019)

TWO - PH ASE LO CKIN G

27

Txn #1

BEGIN COMMIT

LOCK(A) LOCK(B) UNLOCK(A) UNLOCK(B) READ(A) WRITE(B)

Shrinking Phase

LOCK(A) LOCK(B)

Growing Phase

slide-66
SLIDE 66 CMU 15-721 (Spring 2019)

TWO - PH ASE LO CKIN G

27

Txn #2

BEGIN COMMIT

LOCK(B) LOCK(A) WRITE(A) UNLOCK(A) UNLOCK(B) WRITE(B)

Txn #1

BEGIN COMMIT

LOCK(A) LOCK(B) UNLOCK(A) UNLOCK(B) READ(A) WRITE(B) LOCK(A) LOCK(B)

slide-67
SLIDE 67 CMU 15-721 (Spring 2019)

TWO - PH ASE LO CKIN G

27

Txn #2

BEGIN COMMIT

LOCK(B) LOCK(A) WRITE(A) UNLOCK(A) UNLOCK(B) WRITE(B)

Txn #1

BEGIN COMMIT

LOCK(A) LOCK(B) UNLOCK(A) UNLOCK(B) READ(A) WRITE(B) LOCK(A) LOCK(B)

slide-68
SLIDE 68 CMU 15-721 (Spring 2019)

TWO - PH ASE LO CKIN G

27

Txn #2

BEGIN COMMIT

LOCK(B) LOCK(A) WRITE(A) UNLOCK(A) UNLOCK(B) WRITE(B)

Txn #1

BEGIN COMMIT

LOCK(A) LOCK(B) UNLOCK(A) UNLOCK(B) READ(A) WRITE(B) LOCK(A) LOCK(B)

slide-69
SLIDE 69 CMU 15-721 (Spring 2019)

TWO - PH ASE LO CKIN G

27

Txn #2

BEGIN COMMIT

LOCK(B) LOCK(A) WRITE(A) UNLOCK(A) UNLOCK(B) WRITE(B)

Txn #1

BEGIN COMMIT

LOCK(A) LOCK(B) UNLOCK(A) UNLOCK(B) READ(A) WRITE(B) LOCK(A) LOCK(B)

slide-70
SLIDE 70 CMU 15-721 (Spring 2019)

TWO - PH ASE LO CKIN G

27

Txn #2

BEGIN COMMIT

LOCK(B) LOCK(A) WRITE(A) UNLOCK(A) UNLOCK(B) WRITE(B)

Txn #1

BEGIN COMMIT

LOCK(A) LOCK(B) UNLOCK(A) UNLOCK(B) READ(A) WRITE(B) LOCK(A) LOCK(B)

slide-71
SLIDE 71 CMU 15-721 (Spring 2019)

TWO - PH ASE LO CKIN G

27

Txn #2

BEGIN COMMIT

LOCK(B) LOCK(A) WRITE(A) UNLOCK(A) UNLOCK(B) WRITE(B)

Txn #1

BEGIN COMMIT

LOCK(A) LOCK(B) UNLOCK(A) UNLOCK(B) READ(A) WRITE(B) LOCK(A) LOCK(B)

slide-72
SLIDE 72 CMU 15-721 (Spring 2019)

TWO - PH ASE LO CKIN G

Deadlock Detection

→ Each txn maintains a queue of the txns that hold the locks that it waiting for. → A separate thread checks these queues for deadlocks. → If deadlock found, use a heuristic to decide what txn to kill in order to break deadlock.

Deadlock Prevention

→ Check whether another txn already holds a lock when another txn requests it. → If lock is not available, the txn will either (1) wait, (2) commit suicide, or (3) kill the other txn.

28

slide-73
SLIDE 73 CMU 15-721 (Spring 2019)

TIM ESTAM P O RDERIN G

Basic T/O

→ Check for conflicts on each read/write. → Copy tuples on each access to ensure repeatable reads.

Optimistic Currency Control (OCC)

→ Store all changes in private workspace. → Check for conflicts at commit time and then merge.

29

slide-74
SLIDE 74 CMU 15-721 (Spring 2019)

BASIC T/ O

30

Txn #1

BEGIN COMMIT

READ(A) WRITE(B) WRITE(A)

slide-75
SLIDE 75 CMU 15-721 (Spring 2019)

BASIC T/ O

30

Txn #1

BEGIN COMMIT

READ(A) WRITE(B) WRITE(A)

10001

slide-76
SLIDE 76 CMU 15-721 (Spring 2019)

BASIC T/ O

30 Record Read Timestamp Write Timestamp

A B 10000

Txn #1

BEGIN COMMIT

READ(A) WRITE(B) WRITE(A)

10000 10000 10000

10001

slide-77
SLIDE 77 CMU 15-721 (Spring 2019)

BASIC T/ O

30 Record Read Timestamp Write Timestamp

A B 10000

Txn #1

BEGIN COMMIT

READ(A) WRITE(B) WRITE(A)

10000 10000 10000

10001

slide-78
SLIDE 78 CMU 15-721 (Spring 2019)

BASIC T/ O

30 Record Read Timestamp Write Timestamp

A B 10000

Txn #1

BEGIN COMMIT

READ(A) WRITE(B) WRITE(A)

10001 10000 10000

10001

slide-79
SLIDE 79 CMU 15-721 (Spring 2019)

BASIC T/ O

30 Record Read Timestamp Write Timestamp

A B 10000

Txn #1

BEGIN COMMIT

READ(A) WRITE(B) WRITE(A)

10001 10000 10000

10001

slide-80
SLIDE 80 CMU 15-721 (Spring 2019)

BASIC T/ O

30 Record Read Timestamp Write Timestamp

A B 10000

Txn #1

BEGIN COMMIT

READ(A) WRITE(B) WRITE(A)

10001 10001 10000

10001

slide-81
SLIDE 81 CMU 15-721 (Spring 2019)

BASIC T/ O

30 Record Read Timestamp Write Timestamp

A B 10000

Txn #1

BEGIN COMMIT

READ(A) WRITE(B) WRITE(A)

10001 10001 10005

10001

slide-82
SLIDE 82 CMU 15-721 (Spring 2019)

BASIC T/ O

30 Record Read Timestamp Write Timestamp

A B 10000

Txn #1

BEGIN COMMIT

READ(A) WRITE(B) WRITE(A)

10001 10001 10005

10001

slide-83
SLIDE 83 CMU 15-721 (Spring 2019)

O PTIM ISTIC CO N CURREN CY CO N TRO L

Timestamp-ordering scheme where txns copy data read/write into a private workspace that is not visible to other active txns. When a txn commits, the DBMS verifies that there are no conflicts. First proposed in 1981 at CMU by H.T. Kung.

31

ON OPTIMISTIC METHODS FOR CONCURRENCY C CONTROL

ACM T TRANSACTIONS ON DATABASE S SYSTEMS 1981

slide-84
SLIDE 84 CMU 15-721 (Spring 2019)

O PTIM ISTIC CO N CURREN CY CO N TRO L

32

Txn #1

BEGIN

READ(A) WRITE(A) WRITE(B)

Record Value Write Timestamp

B 456 10000 123 A 10000 COMMIT

slide-85
SLIDE 85 CMU 15-721 (Spring 2019)

O PTIM ISTIC CO N CURREN CY CO N TRO L

32

Txn #1

BEGIN

READ(A) WRITE(A) WRITE(B)

Read Phase

Record Value Write Timestamp

B 456 10000 123 A 10000 COMMIT

slide-86
SLIDE 86 CMU 15-721 (Spring 2019)

O PTIM ISTIC CO N CURREN CY CO N TRO L

32

Txn #1

BEGIN

READ(A) WRITE(A) WRITE(B)

Record Value Write Timestamp

B 456 10000 123 A 10000 COMMIT

slide-87
SLIDE 87 CMU 15-721 (Spring 2019)

O PTIM ISTIC CO N CURREN CY CO N TRO L

32

Txn #1

BEGIN

READ(A) WRITE(A) WRITE(B)

Workspace

Record Value Write Timestamp

123 A 10000

Record Value Write Timestamp

B 456 10000 123 A 10000 COMMIT

slide-88
SLIDE 88 CMU 15-721 (Spring 2019)

O PTIM ISTIC CO N CURREN CY CO N TRO L

32

Txn #1

BEGIN

READ(A) WRITE(A) WRITE(B)

Workspace

Record Value Write Timestamp

123 A 10000

Record Value Write Timestamp

B 456 10000 123 A 10000 COMMIT

slide-89
SLIDE 89 CMU 15-721 (Spring 2019)

O PTIM ISTIC CO N CURREN CY CO N TRO L

32

Txn #1

BEGIN

READ(A) WRITE(A) WRITE(B)

Workspace

Record Value Write Timestamp

123 A 10000

Record Value Write Timestamp

B 456 10000 123 A 10000 888

COMMIT

slide-90
SLIDE 90 CMU 15-721 (Spring 2019)

O PTIM ISTIC CO N CURREN CY CO N TRO L

32

Txn #1

BEGIN

READ(A) WRITE(A) WRITE(B)

Workspace

Record Value Write Timestamp

B 456 10000 123 A 10000

Record Value Write Timestamp

B 456 10000 123 A 10000 888

COMMIT

slide-91
SLIDE 91 CMU 15-721 (Spring 2019)

O PTIM ISTIC CO N CURREN CY CO N TRO L

32

Txn #1

BEGIN

READ(A) WRITE(A) WRITE(B)

Workspace

Record Value Write Timestamp

B 456 10000 123 A 10000

Record Value Write Timestamp

B 456 10000 123 A 10000 888

999

COMMIT

slide-92
SLIDE 92 CMU 15-721 (Spring 2019)

O PTIM ISTIC CO N CURREN CY CO N TRO L

32

Txn #1

BEGIN

READ(A) WRITE(A) WRITE(B) VALIDATE PHASE WRITE PHASE

Workspace

Record Value Write Timestamp

B 456 10000 123 A 10000

Record Value Write Timestamp

B 456 10000 123 A 10000 888

999

COMMIT

slide-93
SLIDE 93 CMU 15-721 (Spring 2019)

O PTIM ISTIC CO N CURREN CY CO N TRO L

32

Txn #1

BEGIN

READ(A) WRITE(A) WRITE(B) VALIDATE PHASE WRITE PHASE

Workspace

Record Value Write Timestamp

B 456 10000 123 A 10000

Record Value Write Timestamp

B 456 10000 123 A 10000 888

999

COMMIT

slide-94
SLIDE 94 CMU 15-721 (Spring 2019)

O PTIM ISTIC CO N CURREN CY CO N TRO L

32

Txn #1

BEGIN

READ(A) WRITE(A) WRITE(B) VALIDATE PHASE WRITE PHASE

10001

Workspace

Record Value Write Timestamp

B 456 10000 123 A 10000

Record Value Write Timestamp

B 456 10000 123 A 10000 888

999

COMMIT

slide-95
SLIDE 95 CMU 15-721 (Spring 2019)

O PTIM ISTIC CO N CURREN CY CO N TRO L

32

Txn #1

BEGIN

READ(A) WRITE(A) WRITE(B) VALIDATE PHASE WRITE PHASE

10001

Workspace

Record Value Write Timestamp

B 456 10000 123 A 10000

Record Value Write Timestamp

B 456 10000 123 A 10000 888

999

COMMIT 888 999 10001 10001

slide-96
SLIDE 96 CMU 15-721 (Spring 2019)

O PTIM ISTIC CO N CURREN CY CO N TRO L

32

Txn #1

BEGIN

READ(A) WRITE(A) WRITE(B) VALIDATE PHASE WRITE PHASE

10001

Record Value Write Timestamp

B 456 10000 123 A 10000 COMMIT 888 999 10001 10001

slide-97
SLIDE 97 CMU 15-721 (Spring 2019)

O BSERVATIO N

When there is low contention, optimistic protocols perform better because the DBMS spends less time checking for conflicts. At high contention, the both classes of protocols degenerate to essentially the same serial execution.

33

slide-98
SLIDE 98 CMU 15-721 (Spring 2019)

CO N CURREN CY CO N TRO L EVALUATIO N

Compare in-memory concurrency control protocols at high levels of parallelism.

→ Single test-bed system. → Evaluate protocols using core counts beyond what is available on today's CPUs.

Running in extreme environments exposes what are the main bottlenecks in the DBMS.

34

STARING INTO THE ABYSS: AN EVALUATION OF CONCURRENCY CONTROL W WITH ONE THOUSAND CORES

VLDB 2014

slide-99
SLIDE 99 CMU 15-721 (Spring 2019)

10 0 0 - CO RE CPU SIM ULATO R

DBx1000 Database System

→ In-memory DBMS with pluggable lock manager. → No network access, logging, or concurrent indexes

MIT Graphite CPU Simulator

→ Single-socket, tile-based CPU. → Shared L2 cache for groups of cores. → Tiles communicate over 2D-mesh network.

35

slide-100
SLIDE 100 CMU 15-721 (Spring 2019)

TARGET WO RKLOAD

Yahoo! Cloud Serving Benchmark (YCSB)

→ 20 million tuples → Each tuple is 1KB (total database is ~20GB)

Each transactions reads/modifies 16 tuples. Varying skew in transaction access patterns. Serializable isolation level.

36

slide-101
SLIDE 101 CMU 15-721 (Spring 2019)

CO N CURREN CY CO N TRO L SCH EM ES

37

DL_DETECT NO_WAIT WAIT_DIE 2PL w/ Deadlock Detection 2PL w/ Non-waiting Prevention 2PL w/ Wait-and-Die Prevention TIMESTAMP MVCC OCC Basic T/O Algorithm Multi-Version T/O Optimistic Concurrency Control

slide-102
SLIDE 102 CMU 15-721 (Spring 2019)

CO N CURREN CY CO N TRO L SCH EM ES

37

DL_DETECT NO_WAIT WAIT_DIE 2PL w/ Deadlock Detection 2PL w/ Non-waiting Prevention 2PL w/ Wait-and-Die Prevention

slide-103
SLIDE 103 CMU 15-721 (Spring 2019)

CO N CURREN CY CO N TRO L SCH EM ES

37

TIMESTAMP MVCC OCC Basic T/O Algorithm Multi-Version T/O Optimistic Concurrency Control

slide-104
SLIDE 104 CMU 15-721 (Spring 2019)

READ- O N LY WO RKLOAD

38

slide-105
SLIDE 105 CMU 15-721 (Spring 2019)

WRITE- INTENSIVE / M EDIUM - CO N TEN TIO N

39

slide-106
SLIDE 106 CMU 15-721 (Spring 2019)

WRITE- INTENSIVE / H IGH - CO N TEN TIO N

40

slide-107
SLIDE 107 CMU 15-721 (Spring 2019)

WRITE- INTENSIVE / H IGH - CO N TEN TIO N

40

slide-108
SLIDE 108 CMU 15-721 (Spring 2019)

WRITE- INTENSIVE / H IGH - CO N TEN TIO N

40

slide-109
SLIDE 109 CMU 15-721 (Spring 2019)

BOTTLEN ECKS

Lock Thrashing

→ DL_DETECT, WAIT_DIE

Timestamp Allocation

→ All T/O algorithms + WAIT_DIE

Memory Allocations

→ OCC + MVCC

41

slide-110
SLIDE 110 CMU 15-721 (Spring 2019)

LO CK TH RASH IN G

Each txn waits longer to acquire locks, causing

  • ther txn to wait longer to acquire locks.

Can measure this phenomenon by removing deadlock detection/prevention overhead.

→ Force txns to acquire locks in primary key order. → Deadlocks are not possible.

42

slide-111
SLIDE 111 CMU 15-721 (Spring 2019)

LO CK TH RASH IN G

43

slide-112
SLIDE 112 CMU 15-721 (Spring 2019)

LO CK TH RASH IN G

43

slide-113
SLIDE 113 CMU 15-721 (Spring 2019)

TIM ESTAM P ALLO CATIO N

Mutex

→ Worst option.

Atomic Addition

→ Requires cache invalidation on write.

Batched Atomic Addition

→ Needs a back-off mechanism to prevent fast burn.

Hardware Clock

→ Not sure if it will exist in future CPUs.

Hardware Counter

→ Not implemented in existing CPUs.

44

slide-114
SLIDE 114 CMU 15-721 (Spring 2019)

TIM ESTAM P ALLO CATIO N

45

slide-115
SLIDE 115 CMU 15-721 (Spring 2019)

M EM O RY ALLO CATIO N S

Copying data on every read/write access slows down the DBMS because of contention on the memory controller.

→ In-place updates and non-copying reads are not affected as much.

Default libc malloc is slow. Never use it.

46

slide-116
SLIDE 116 CMU 15-721 (Spring 2019)

O BSERVATIO N

Serializability is useful because it allows programmers to ignore concurrency issues but enforcing it may allow too little parallelism and limit performance. We may want to use a weaker level of consistency to improve scalability.

47

slide-117
SLIDE 117 CMU 15-721 (Spring 2019)

ISO LATIO N LEVELS

Controls the extent that a txn is exposed to the actions of other concurrent txns. Provides for greater concurrency at the cost of exposing txns to uncommitted changes:

→ Dirty Read Anomaly → Unrepeatable Reads Anomaly → Phantom Reads Anomaly

48

slide-118
SLIDE 118 CMU 15-721 (Spring 2019)

AN SI ISO LATIO N LEVELS

SERIALIZABLE

→ No phantoms, all reads repeatable, no dirty reads.

REPEATABLE READS

→ Phantoms may happen.

READ COMMITTED

→ Phantoms and unrepeatable reads may happen.

READ UNCOMMITTED

→ All of them may happen.

49

slide-119
SLIDE 119 CMU 15-721 (Spring 2019)

ISO LATIO N LEVEL H IERARCH Y

50

REPEATABLE READS READ UNCOMMITTED SERIALIZABLE READ COMMITTED

slide-120
SLIDE 120 CMU 15-721 (Spring 2019)

REAL- WO RLD ISO LATIO N LEVELS

51

Default Maximum

Actian Ingres SERIALIZABLE SERIALIZABLE Greenplum READ COMMITTED SERIALIZABLE IBM DB2 CURSOR STABILITY SERIALIZABLE MySQL REPEATABLE READS SERIALIZABLE MemSQL READ COMMITTED READ COMMITTED MS SQL Server READ COMMITTED SERIALIZABLE Oracle READ COMMITTED SNAPSHOT ISOLATION Postgres READ COMMITTED SERIALIZABLE SAP HANA READ COMMITTED SERIALIZABLE VoltDB SERIALIZABLE SERIALIZABLE

Source: Peter Bailis

slide-121
SLIDE 121 CMU 15-721 (Spring 2019)

CRITICISM O F ISO LATIO N LEVELS

The isolation levels defined as part of SQL-92 standard only focused on anomalies that can occur in a 2PL-based DBMS. Two additional isolation levels:

→ CURSOR STABILITY → SNAPSHOT ISOLATION

52

A CRITIQUE OF ANSI SQL ISOLATION LEVELS

SIGMOD 1995

slide-122
SLIDE 122 CMU 15-721 (Spring 2019)

CURSO R STABILITY (CS)

The DBMS’s internal cursor maintains a lock on a item in the database until it moves on to the next item. CS is a stronger isolation level in between REPEATABLE READS and READ COMMITTED that can (sometimes) prevent the Lost Update Anomaly.

53

Source: Jepsen

slide-123
SLIDE 123 CMU 15-721 (Spring 2019)

LO ST UPDATE AN O M ALY

54

Txn #2

BEGIN COMMIT

WRITE(A)

Txn #1

BEGIN COMMIT

READ(A) WRITE(A)

slide-124
SLIDE 124 CMU 15-721 (Spring 2019)

LO ST UPDATE AN O M ALY

54

Txn #2

BEGIN COMMIT

WRITE(A)

Txn #1

BEGIN COMMIT

READ(A) WRITE(A)

slide-125
SLIDE 125 CMU 15-721 (Spring 2019)

LO ST UPDATE AN O M ALY

54

Txn #2

BEGIN COMMIT

WRITE(A)

Txn #1

BEGIN COMMIT

READ(A) WRITE(A)

slide-126
SLIDE 126 CMU 15-721 (Spring 2019)

LO ST UPDATE AN O M ALY

54

Txn #2

BEGIN COMMIT

WRITE(A)

Txn #1

BEGIN COMMIT

READ(A) WRITE(A)

slide-127
SLIDE 127 CMU 15-721 (Spring 2019)

LO ST UPDATE AN O M ALY

54

Txn #2’s write to A will be lost even though it commits after Txn #1. Txn #2

BEGIN COMMIT

WRITE(A)

Txn #1

BEGIN COMMIT

READ(A) WRITE(A)

A cursor lock on A would prevent this problem.

slide-128
SLIDE 128 CMU 15-721 (Spring 2019)

SN APSH OT ISO LATIO N (SI)

Guarantees that all reads made in a txn see a consistent snapshot of the database that existed at the time the txn started.

→ A txn will commit under SI only if its writes do not conflict with any concurrent updates made since that snapshot.

SI is susceptible to the Write Skew Anomaly

55

slide-129
SLIDE 129 CMU 15-721 (Spring 2019)

WRITE SKEW AN O M ALY

56

Txn #1

Change white marbles to black.

Txn #2

Change black marbles to white.

slide-130
SLIDE 130 CMU 15-721 (Spring 2019)

WRITE SKEW AN O M ALY

56

Txn #1

Change white marbles to black.

Txn #2

Change black marbles to white.

slide-131
SLIDE 131 CMU 15-721 (Spring 2019)

WRITE SKEW AN O M ALY

56

Txn #1

Change white marbles to black.

Txn #2

Change black marbles to white.

slide-132
SLIDE 132 CMU 15-721 (Spring 2019)

WRITE SKEW AN O M ALY

56

Txn #1

Change white marbles to black.

Txn #2

Change black marbles to white.

slide-133
SLIDE 133 CMU 15-721 (Spring 2019)

WRITE SKEW AN O M ALY

56

Txn #1

Change white marbles to black.

Txn #2

Change black marbles to white.

slide-134
SLIDE 134 CMU 15-721 (Spring 2019)

WRITE SKEW AN O M ALY

56

Txn #1

Change white marbles to black.

Txn #2

Change black marbles to white.

slide-135
SLIDE 135 CMU 15-721 (Spring 2019)

ISO LATIO N LEVEL H IERARCH Y

57

REPEATABLE READS SNAPSHOT ISOLATION READ UNCOMMITTED CURSOR STABILITY SERIALIZABLE READ COMMITTED

slide-136
SLIDE 136 CMU 15-721 (Spring 2019)

ISO LATIO N LEVEL H IERARCH Y

57

REPEATABLE READS SNAPSHOT ISOLATION READ UNCOMMITTED CURSOR STABILITY SERIALIZABLE READ COMMITTED

slide-137
SLIDE 137 CMU 15-721 (Spring 2019)

PARTIN G TH O UGH TS

Transactions are hard. Transactions are awesome. Things get even more wild when we add more internal components to the DBMS:

→ Indexes → Triggers → Catalogs → Sequences → Materialized Views

58

slide-138
SLIDE 138 CMU 15-721 (Spring 2019)

N EXT CLASS

Multi-Version Concurrency Control

59