Serializable Snapshot Isolation Making ISOLATION LEVEL SERIALIZABLE - - PowerPoint PPT Presentation

serializable snapshot isolation
SMART_READER_LITE
LIVE PREVIEW

Serializable Snapshot Isolation Making ISOLATION LEVEL SERIALIZABLE - - PowerPoint PPT Presentation

Serializable Snapshot Isolation Making ISOLATION LEVEL SERIALIZABLE Provide Serializable Isolation Dan Ports Kevin Grittner MIT Wisconsin Court System Saturday, May 21, 2011 1 Overview Serializable isolation makes it easier to reason


slide-1
SLIDE 1

Serializable Snapshot Isolation

Making ISOLATION LEVEL SERIALIZABLE Provide Serializable Isolation

Dan Ports MIT Kevin Grittner Wisconsin Court System

1 Saturday, May 21, 2011

slide-2
SLIDE 2

Overview

Serializable isolation makes it easier to reason about concurrent transactions In 9.0 and before, SERIALIZABLE was really snapshot isolation – allows anomalies in 9.1: Serializable Snapshot Isolation (SSI)

  • a new way to ensure true serializability

(first implementation in a production DBMS!)

2 Saturday, May 21, 2011

slide-3
SLIDE 3

Agenda

  • What is serializability? Why do we want it?
  • Snapshot isolation vs. serializability
  • Serializable Snapshot Isolation
  • SSI implementation overview
  • Using SSI
  • Performance results

3 Saturday, May 21, 2011

slide-4
SLIDE 4

Transactions

Transactions group related operations:

shouldnʼt see one operation without the others

  • ...even if the system crashes (recoverability)
  • ...even if other transactions are executing

concurrently (isolation)

4 Saturday, May 21, 2011

slide-5
SLIDE 5

Isolation

Serializable isolation: each transaction is guaranteed to behave as though itʼs the only one running

  • makes it easy to reason about each

transactionʼs behavior in isolation Weaker isolation levels:

  • concurrent transactions can

cause anomalous behavior

5 Saturday, May 21, 2011

slide-6
SLIDE 6

Isolation Levels

SQL Standard SERIALIZABLE REPEATABLE READ READ COMMITTED READ UNCOMMITTED t ment s

6 Saturday, May 21, 2011

slide-7
SLIDE 7

Isolation Levels

SQL Standard SERIALIZABLE REPEATABLE READ READ COMMITTED READ UNCOMMITTED t ment s 9.0 snapshot isolation per-statement snapshots

6 Saturday, May 21, 2011

slide-8
SLIDE 8

Isolation Levels

SQL Standard SERIALIZABLE REPEATABLE READ READ COMMITTED READ UNCOMMITTED t ment s 9.0 snapshot isolation per-statement snapshots 9.1 SSI snapshot isolation per-statement snapshots

6 Saturday, May 21, 2011

slide-9
SLIDE 9

Snapshot Isolation

Each transaction sees a “snapshot” of DB taken at its first query

  • implemented using MVCC
  • tuple-level write locks prevent

concurrent modifications Still a weaker isolation level than true serializability!

7 Saturday, May 21, 2011

slide-10
SLIDE 10

Agenda

  • What is serializability? Why do we want it?
  • Snapshot isolation vs. serializability
  • Serializable Snapshot Isolation
  • SSI implementation overview
  • Using SSI
  • Performance results

8 Saturday, May 21, 2011

slide-11
SLIDE 11

Goal: ensure at least one guard always on-duty

guard

  • n-duty?

uty? Alice y Bob y

9 Saturday, May 21, 2011

slide-12
SLIDE 12

BEGIN SELECT count(*) FROM guards WHERE on-duty = y if > 1 { UPDATE guards SET on-duty = n WHERE guard = x } COMMIT

Goal: ensure at least one guard always on-duty

guard

  • n-duty?

uty? Alice y Bob y

9 Saturday, May 21, 2011

slide-13
SLIDE 13

guard

  • n-duty?

uty? Alice y Bob y

10 Saturday, May 21, 2011

slide-14
SLIDE 14

guard

  • n-duty?

uty? Alice y Bob y

BEGIN SELECT count(*) FROM guards WHERE on-duty = y [result = 2]

10 Saturday, May 21, 2011

slide-15
SLIDE 15

guard

  • n-duty?

uty? Alice y Bob y

BEGIN SELECT count(*) FROM guard WHERE on-duty = y [result = 2] BEGIN SELECT count(*) FROM guards WHERE on-duty = y [result = 2]

10 Saturday, May 21, 2011

slide-16
SLIDE 16

guard

  • n-duty?

uty? Alice y Bob y

BEGIN SELECT count(*) FROM guard WHERE on-duty = y [result = 2] BEGIN SELECT count(*) FROM guards WHERE on-duty = y [result = 2] if > 1 { UPDATE guards SET on-duty = n WHERE guard = ʻAliceʼ } COMMIT

10 Saturday, May 21, 2011

slide-17
SLIDE 17

guard

  • n-duty?

uty? Alice y Bob y

BEGIN SELECT count(*) FROM guard WHERE on-duty = y [result = 2] BEGIN SELECT count(*) FROM guards WHERE on-duty = y [result = 2] if > 1 { UPDATE guards SET on-duty = n WHERE guard = ʻAliceʼ } COMMIT

n

10 Saturday, May 21, 2011

slide-18
SLIDE 18

guard

  • n-duty?

uty? Alice y Bob y

BEGIN SELECT count(*) FROM guard WHERE on-duty = y [result = 2] if > 1 { UPDATE guards SET on-duty = n WHERE guards = ʻBobʼ } COMMIT BEGIN SELECT count(*) FROM guards WHERE on-duty = y [result = 2] if > 1 { UPDATE guards SET on-duty = n WHERE guard = ʻAliceʼ } COMMIT

n

10 Saturday, May 21, 2011

slide-19
SLIDE 19

guard

  • n-duty?

uty? Alice y Bob y

BEGIN SELECT count(*) FROM guard WHERE on-duty = y [result = 2] if > 1 { UPDATE guards SET on-duty = n WHERE guards = ʻBobʼ } COMMIT BEGIN SELECT count(*) FROM guards WHERE on-duty = y [result = 2] if > 1 { UPDATE guards SET on-duty = n WHERE guard = ʻAliceʼ } COMMIT

n n

10 Saturday, May 21, 2011

slide-20
SLIDE 20

Serializable means: results equivalent to some serial ordering of the transactions Serialization history graph shows dependencies between transactions

  • A ➔ B (“wr-dependency”)

if B sees a change made by A

  • A ➔ B (“ww-dependency”)

if B overwrites a change by A

  • B ➔ A (“rw-conflict”)

if B doesnʼt see a change made by A

Serializable if no cycle in graph

11 Saturday, May 21, 2011

slide-21
SLIDE 21

guard

  • n-duty?

uty? Alice y Bob y

BEGIN SELECT count(*) FROM guard WHERE on-duty = y [result = 2] if > 1 { UPDATE guards SET on-duty = n WHERE guards = ʻBobʼ } COMMIT BEGIN SELECT count(*) FROM guards WHERE on-duty = y [result = 2] if > 1 { UPDATE guards SET on-duty = n WHERE guard = ʻAliceʼ } COMMIT

n n

12 Saturday, May 21, 2011

slide-22
SLIDE 22

guard

  • n-duty?

uty? Alice y Bob y

BEGIN SELECT count(*) FROM guard WHERE on-duty = y [result = 2] if > 1 { UPDATE guards SET on-duty = n WHERE guards = ʻBobʼ } COMMIT BEGIN SELECT count(*) FROM guards WHERE on-duty = y [result = 2] if > 1 { UPDATE guards SET on-duty = n WHERE guard = ʻAliceʼ } COMMIT

n n

rw-conflict: T1 didnʼt see T2ʼs UPDATE

12 Saturday, May 21, 2011

slide-23
SLIDE 23

guard

  • n-duty?

uty? Alice y Bob y

BEGIN SELECT count(*) FROM guard WHERE on-duty = y [result = 2] if > 1 { UPDATE guards SET on-duty = n WHERE guards = ʻBobʼ } COMMIT BEGIN SELECT count(*) FROM guards WHERE on-duty = y [result = 2] if > 1 { UPDATE guards SET on-duty = n WHERE guard = ʻAliceʼ } COMMIT

n n

rw-conflict: T1 didnʼt see T2ʼs UPDATE rw-conflict: T2 didnʼt see T1ʼs UPDATE

12 Saturday, May 21, 2011

slide-24
SLIDE 24

T2

rw-conflict: T1 didnʼt see T2ʼs UPDATE

T1

rw-conflict: T2 didnʼt see T1ʼs UPDATE

cycle means no serial order exists! T1 before T2 before T1...

13 Saturday, May 21, 2011

slide-25
SLIDE 25

Batch Processing Example

  • control table just holds current batch #
  • receipts table entries tagged w/ batch #

Three transactions:

  • read current batch, insert receipt tagged w/ it
  • increment current batch #
  • read batch, get all receipts for previous batch

Invariant: after we read yesterdayʼs report, no new receipts for yesterday should appear

14 Saturday, May 21, 2011

slide-26
SLIDE 26

T1 T2

SELECT batch FROM control [result = 5/19] UPDATE control SET batch = 5/20 COMMIT INSERT receipt (5/19, …) COMMIT

rw-conflict: T1 didnʼt see T2ʼs UPDATE

15 Saturday, May 21, 2011

slide-27
SLIDE 27

T2

[incr-batch]

T1

[add-receipt]

rw-conflict: T1 didnʼt see T2ʼs UPDATE

Serializable!

Apparent order of execution: T1 before T2 ...but T2 committed before T1. Thatʼs OK!

16 Saturday, May 21, 2011

slide-28
SLIDE 28

T1 T2

SELECT batch FROM control [result = 5/19] UPDATE control SET batch = 5/20 COMMIT INSERT receipt (5/19, …) COMMIT

T3

17 Saturday, May 21, 2011

slide-29
SLIDE 29

T1 T2

SELECT batch FROM control [result = 5/19] UPDATE control SET batch = 5/20 COMMIT INSERT receipt (5/19, …) COMMIT SELECT batch... [result = 5/20] SELECT 5/19 receipts [...]

T3

17 Saturday, May 21, 2011

slide-30
SLIDE 30

T1 T2

SELECT batch FROM control [result = 5/19] UPDATE control SET batch = 5/20 COMMIT INSERT receipt (5/19, …) COMMIT SELECT batch... [result = 5/20] SELECT 5/19 receipts [...]

T3

rw-conflict

17 Saturday, May 21, 2011

slide-31
SLIDE 31

T1 T2

SELECT batch FROM control [result = 5/19] UPDATE control SET batch = 5/20 COMMIT INSERT receipt (5/19, …) COMMIT SELECT batch... [result = 5/20] SELECT 5/19 receipts [...]

T3

rw-conflict wr-dependency

17 Saturday, May 21, 2011

slide-32
SLIDE 32

T1 T2

SELECT batch FROM control [result = 5/19] UPDATE control SET batch = 5/20 COMMIT INSERT receipt (5/19, …) COMMIT SELECT batch... [result = 5/20] SELECT 5/19 receipts [...]

T3

rw-conflict wr-dependency rw-conflict

17 Saturday, May 21, 2011

slide-33
SLIDE 33

T2

[incr-batch]

T1

[add-receipt]

rw-conflict: T1 didnʼt see T2ʼs UPDATE

Not serializable!

Adding the read-only transaction creates a cycle.

T3

[report]

wr-dependency: T3 did see T2ʼs UPDATE rw-conflict: T3 didnʼt see T1ʼs INSERT

18 Saturday, May 21, 2011

slide-34
SLIDE 34

Agenda

  • What is serializability? Why do we want it?
  • Snapshot isolation vs. serializability
  • Serializable Snapshot Isolation
  • SSI implementation overview
  • Using SSI
  • Performance results

19 Saturday, May 21, 2011

slide-35
SLIDE 35

Existing Approaches to Serializability

  • ignore the problem, make the user deal
  • use SELECT FOR UPDATE, LOCK TABLE
  • can be hard to figure out where to put these!
  • run one transaction at a time [not practical]
  • strict two-phase locking
  • acquire lock on every object read or written
  • causes readers to block writers & vice versa

20 Saturday, May 21, 2011

slide-36
SLIDE 36

T1 T2

SELECT batch FROM control [result = 5/19]

21 Saturday, May 21, 2011

slide-37
SLIDE 37

T1 T2

SELECT batch FROM control [result = 5/19]

21 Saturday, May 21, 2011

slide-38
SLIDE 38

T1 T2

SELECT batch FROM control [result = 5/19] UPDATE control SET batch = 5/20 [blocked!]

21 Saturday, May 21, 2011

slide-39
SLIDE 39

T1 T2

SELECT batch FROM control [result = 5/19] UPDATE control SET batch = 5/20 [blocked!] INSERT receipt (5/19, …) COMMIT

21 Saturday, May 21, 2011

slide-40
SLIDE 40

SSI Approach (Almost.)

Actually build the dependency graph!

  • If a cycle is created,

abort some transaction to break it

T2 T1

22 Saturday, May 21, 2011

slide-41
SLIDE 41

SSI Approach (Almost.)

Actually build the dependency graph!

  • If a cycle is created,

abort some transaction to break it

T2 T1

22 Saturday, May 21, 2011

slide-42
SLIDE 42

SSI Approach (Almost.)

Actually build the dependency graph!

  • If a cycle is created,

abort some transaction to break it

T2 T1 T3

22 Saturday, May 21, 2011

slide-43
SLIDE 43

SSI Approach (Almost.)

Actually build the dependency graph!

  • If a cycle is created,

abort some transaction to break it

T2 T1 T3

X

22 Saturday, May 21, 2011

slide-44
SLIDE 44

SSI Approach (Almost.)

Actually build the dependency graph!

  • If a cycle is created,

abort some transaction to break it

T2 T1 T3

22 Saturday, May 21, 2011

slide-45
SLIDE 45

SSI Approach (Almost.)

Actually build the dependency graph!

  • If a cycle is created,

abort some transaction to break it

T2 T1 T3

X

22 Saturday, May 21, 2011

slide-46
SLIDE 46

Serializability theory tells us:

  • every cycle contains two adjacent

rw-conflict edges (where A didnʼt see Bʼs update) So we can just look for those

  • donʼt need to track other types of edges
  • conservative (occasional false positives)

SSI Rule: Donʼt let a transaction have both a rw-conflict in and a rw-conflict out!

[Cahill et al. Serializable Isolation For Snapshot Databases, SIGMOD ʼ08]

23 Saturday, May 21, 2011

slide-47
SLIDE 47

Agenda

  • What is serializability? Why do we want it?
  • Snapshot isolation vs. serializability
  • Serializable Snapshot Isolation
  • SSI implementation overview
  • Using SSI
  • Performance results

24 Saturday, May 21, 2011

slide-48
SLIDE 48

Implementing SSI

Need to keep some extra transaction state

  • mainly: list of rw-conflicts in and out
  • if one transaction has both, abort something
  • note: need to keep lists after xact commits,

until all concurrent transactions commit

But how do we identify a rw-conflict?

25 Saturday, May 21, 2011

slide-49
SLIDE 49

Identifying rw-conflicts

Recall: T1 —> T2 if T2 makes a change, and T1ʼs read doesnʼt see its effects

  • If T2ʼs write happens first:

T1 will see tupleʼs MVCC data and ignore it

  • If T1ʼs read happens first:

use a “lock” to know that T2ʼs write conflicts

xmin xmax data T2 …

26 Saturday, May 21, 2011

slide-50
SLIDE 50

Identifying rw-conflicts

Recall: T1 —> T2 if T2 makes a change, and T1ʼs read doesnʼt see its effects

  • If T2ʼs write happens first:

T1 will see tupleʼs MVCC data and ignore it

  • If T1ʼs read happens first:

use a “lock” to know that T2ʼs write conflicts

xmin xmax data T2 …

T2 not in T1ʼs snapshot => conflict w/ T1

26 Saturday, May 21, 2011

slide-51
SLIDE 51

Tracking Read Dependencies

Acquire a “SIREAD lock” on anything read Check for SIREAD locks on write, flag conflict New lock manager — unlike current locks:

  • no blocking! (just flag a conflict instead)
  • can persist beyond transaction commit
  • multi-granularity (relation, page, tuple); promotion
  • needs predicate locking

27 Saturday, May 21, 2011

slide-52
SLIDE 52

Predicate Locking

Not enough just to lock returned tuples: Really want predicate locking:

“lock everything where x=42” (but not feasible)

Instead: lock corresponding index page

  • if no index, lock entire relation

SELECT FROM... WHERE x=42 [3 results] INSERT INTO… VALUES (x=42) [should conflict; wonʼt]

28 Saturday, May 21, 2011

slide-53
SLIDE 53

Other Features

Deferrable read-only transactions

  • wait until xact can be executed safely

without lock overhead or risk of abort

Dealing with shared memory exhaustion

  • promote locks to coarser granularity
  • reduce information about committed transactions

and push to disk if necessary (SLRU)

29 Saturday, May 21, 2011

slide-54
SLIDE 54

Agenda

  • What is serializability? Why do we want it?
  • Snapshot isolation vs. serializability
  • Serializable Snapshot Isolation
  • SSI implementation overview
  • Using SSI
  • Performance results

30 Saturday, May 21, 2011

slide-55
SLIDE 55

Conflicts may cause transactions to abort

  • source of conflict might not be obvious
  • will usually succeed if retried
  • middleware that automatically retries can help

Performance tips

  • declare transactions READ ONLY if possible
  • donʼt put more into a single transaction than needed
  • donʼt leave connections dangling “idle in transaction”

31 Saturday, May 21, 2011

slide-56
SLIDE 56

Agenda

  • What is serializability? Why do we want it?
  • Snapshot isolation vs. serializability
  • Serializable Snapshot Isolation
  • SSI implementation overview
  • Performance results

32 Saturday, May 21, 2011

slide-57
SLIDE 57

Performance

Two main sources of slowdown

  • How much CPU overhead does the

SIREAD lock manager add?

  • in-memory pgbench: not much slowdown
  • How often are transactions rolled back

because of conflicts?

  • depends heavily on workload

33 Saturday, May 21, 2011

slide-58
SLIDE 58

Measuring Abort Rate

DBT-2 benchmark (OLTP, like TPC-C)

  • modified to retry transactions after

serialization failure Configuration:

  • 16-core Xeon E7310, 1.60GHz, 8 GB RAM
  • 3x 15K drives for data; 1 for log
  • database size ~20 GB

34 Saturday, May 21, 2011

slide-59
SLIDE 59

DBT-2 Performance

Approach: use highest scale factor that gives 90% request latency < 5 seconds

REPEATABLE READ (snapshot isolation):

  • 160 warehouses, 1941 new order transactions/minute
  • 1.5% transactions retried due to serialization failure

SERIALIZABLE (SSI):

  • 157 warehouses, 1923 NOTPM (< 2% slowdown)
  • 3.1% transactions retried due to serialization failure
  • no aborts of read-only transactions
  • 15% abort rate for “delivery” xacts (4% of workload)

35 Saturday, May 21, 2011

slide-60
SLIDE 60

Summary

True serializable transactions are here!

  • avoiding snapshot isolation anomalies

can simplify applications

  • implemented using a novel technique
  • reuses existing snapshot isolation

mechanisms

  • performance cost is reasonable

36 Saturday, May 21, 2011