Transactions and concurrency: Introduction to databases (CSCC43)



SLIDE 1

Transactions and concurrency

Introduction to databases CSCC43 Winter 2013 Ryan Johnson

Thanks to Arnold Rosenbloom and Renee Miller for material in these slides

SLIDE 2

Transactions: an old idea

  • Database models business transactions

– Customer pays, shopkeeper hands over goods
– Bank gives money, debits account
– Traveler reserves seat, airline doesn’t sell it to others
– etc.

  • Key: transactions are all-or-nothing

– Customer pays, shopkeeper keeps goods
      • Called a rip-off or scam (not a transaction)
– Customer pays nothing, shopkeeper hands over goods
      • Usually called theft or robbery (again, not a transaction)

Transactions protect/coordinate business dealings

SLIDE 3

Transactions: a programming model

  • Manage information rather than goods

– Concerned with reads and updates to database objects
  … and how those changes propagate through the system

  • All-or-nothing execution

– On success, commit: apply outstanding changes atomically
– On failure, abort: outstanding changes rolled back (undone)

  • Some finer-grained control

– Savepoint: back up instead of starting over on error
– Nested transactions: compose small-scale operations
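The commit, abort, and savepoint behaviour listed above can be tried out with a small sketch. It assumes Python's standard sqlite3 module and an in-memory database; the account table, its rows, and the savepoint name are hypothetical and chosen only for illustration.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN / COMMIT / ROLLBACK statements below are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('alice', 100), ('bob', 0)")


def balances():
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))


# Abort: every change made since BEGIN is undone.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance - 30 WHERE name = 'alice'")
conn.execute("ROLLBACK")
after_abort = balances()

# Savepoint: back up to the savepoint and keep the earlier debit.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance - 30 WHERE name = 'alice'")
conn.execute("SAVEPOINT before_credit")
conn.execute("UPDATE account SET balance = balance + 999 WHERE name = 'bob'")  # mistake
conn.execute("ROLLBACK TO before_credit")
conn.execute("UPDATE account SET balance = balance + 30 WHERE name = 'bob'")
conn.execute("COMMIT")
after_commit = balances()

print(after_abort)   # balances unchanged by the aborted transaction
print(after_commit)  # debit and corrected credit applied together
```

The mistaken credit is undone without restarting the whole transaction, which is the point of a savepoint.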

SLIDE 4

Transactions vs. other languages

  • C

– Error detection: manual (always check return codes)
– Partial changes: manual undo (very hard!)

  • Java/Python/etc.

– Error detection: try…catch
– Error recovery: manual undo (still very hard!)

  • Transaction

– Error detection: automatic abort unless overridden
– Error recovery: rollback of all partial changes (no confusion)

SLIDE 5

Examples: error detection, recovery

// C language
int do_something(int input) {
    void* ptr = malloc(input);
    int err = do_some_work(ptr);
    if (err) {
        free(ptr);  // never mind
        return err;
    }
    err = do_some_more_work();
    if (err) {
        // undo some_work?
        // free ptr?
        return err;
    }
    return 0;
}

// Java
void do_something(int input) {
    char[] tmp = new char[input];
    do_some_work(tmp);
    try {
        do_some_more_work();
    } catch (Exception e) {
        // undo some_work?
        throw e;
    }
}

// Transactional
void do_something(int input) {
    char[] tmp = new char[input];
    do_some_work(tmp);
    do_some_more_work();
}
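The transactional variant can be made concrete with a minimal sketch, assuming Python's sqlite3 module, whose connection object acts as a context manager that commits when the block finishes and rolls back when it raises. The log table and the fail flag are hypothetical additions used only to trigger the error path.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (step TEXT)")
conn.commit()


def do_some_work(c):
    c.execute("INSERT INTO log VALUES ('some_work')")


def do_some_more_work(c, fail):
    c.execute("INSERT INTO log VALUES ('more_work')")
    if fail:
        raise RuntimeError("do_some_more_work failed")


def do_something(c, fail):
    # "with c:" commits if the block finishes and rolls back if it raises,
    # so no manual undo code is needed here.
    with c:
        do_some_work(c)
        do_some_more_work(c, fail)


def steps():
    return [row[0] for row in conn.execute("SELECT step FROM log ORDER BY rowid")]


try:
    do_something(conn, fail=True)
except RuntimeError:
    pass
after_failure = steps()  # both inserts were rolled back

do_something(conn, fail=False)
after_success = steps()  # both inserts were committed together
```

The exception still reaches the caller; only the partial database changes are undone automatically.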

SLIDE 6

Transaction pros and cons

  • Pro: clean code (previous slides)
  • Pro: well-suited to concurrent execution (next)
  • Pro: transactions compose

– E.g. wrap two smaller transactions in an outer one

  • Con: transaction footprint tends to “snowball”

– E.g. searching a linked list uses only 1-2 nodes at a time … but transaction remembers all nodes (bad for concurrency)

  • Con: some actions can’t be rolled back

– Fire missiles, dispense cash, ship package, overwrite file, …

  • Con: transaction tracks all updates

– Irrelevant: no need to undo (e.g. temp variables)
– Independent: should not undo (e.g. caching)
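The "some actions can't be rolled back" point can be illustrated with a short sketch, again assuming Python's sqlite3 module. The shipped list is a hypothetical stand-in for an irreversible external action such as shipping a package; the orders table and the error are also made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (item TEXT)")
conn.commit()

shipped = []  # hypothetical stand-in for an irreversible external action


def place_order(item):
    with conn:  # commit on success, rollback on exception
        conn.execute("INSERT INTO orders VALUES (?)", (item,))
        shipped.append(item)  # happens outside the DBMS, so rollback cannot undo it
        raise RuntimeError("payment declined")


try:
    place_order("book")
except RuntimeError:
    pass

orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

After the abort the orders table is empty again, but shipped still contains the item: the database undid its own update and nothing else.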

SLIDE 7

Properties of transactions in DBMS

  • The infamous ACID properties

– Atomic (updates applied in all-or-nothing fashion)
– Consistent (committed state always “makes sense”)
– Isolated (transactions can’t see each other’s updates)
– Durable (committed updates never lost)

Note: C and I are closely related
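A small sketch of the A and C properties working together, assuming Python's sqlite3 module. The account table and the CHECK constraint (no negative balances) are hypothetical; the constraint stands in for "makes sense". Isolation and durability are not exercised here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('alice', 50), ('bob', 0)")
conn.commit()


def transfer(src, dst, amount):
    with conn:  # commit on success, rollback on exception
        # Credit first, so a failing debit leaves a partial change to undo.
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
        )


try:
    transfer("alice", "bob", 80)  # would leave alice at -30
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

state = dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))
```

The constraint violation aborts the transaction (consistency), and the credit that had already been applied to bob disappears with it (atomicity).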

SLIDE 8

Concurrent transactions

  • Problem: serial execution is slow

– Imagine a grocery store with only one checkout lane
– Or a bus with only one seat

  • Alternative: concurrent execution

– Allow “non-conflicting” requests to proceed in parallel
– Exact definition of “conflicting” application-dependent
  ... but same basic rules usually apply

SLIDE 9

Dealing with concurrent updates

  • Semantics when transactions overlap?
  • Ideal: coherence

– Reads always return the most recently written value
– Writes always replace older values
– Property of individual objects

  • Ideal: consistency

– Reads and writes appear in program order
– OK to interleave different transactions, but not reorder them
– Property of system as a whole (or at least sets of objects)

SLIDE 10

Example: incoherent execution

Process A / Process B (operations as listed along the time axis):

    x = 10
    x = 20
    x += 5
    print x    “15”
    print x    “20”

– Read stale version of x
– Older version persists after update

A and B not using the same definition of “time”
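One way to reproduce this kind of incoherence is a toy simulation in which each process keeps a private cache that is never invalidated. This is only a sketch: the Process class is hypothetical, and the assignment of operations to A (x = 10, x += 5, first print) and B (x = 20, second print) is an assumption, since the flattened diagram does not show it.

```python
class Process:
    """A process with a private cache that is never invalidated (the bug)."""

    def __init__(self, memory):
        self.memory = memory  # shared dict
        self.cache = {}       # private, possibly stale copies

    def write(self, name, value):
        self.cache[name] = value
        self.memory[name] = value

    def read(self, name):
        # Prefer the cached copy, even if another process has written since.
        if name not in self.cache:
            self.cache[name] = self.memory[name]
        return self.cache[name]


memory = {}
a, b = Process(memory), Process(memory)

a.write("x", 10)
b.write("x", 20)
a.write("x", a.read("x") + 5)  # A reads its stale 10, not B's 20
seen_by_a = a.read("x")
seen_by_b = b.read("x")        # B still sees 20 although A wrote 15 later
```

Each process sees a history that makes sense locally, but the two never agree on which write to x came last.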

SLIDE 11

Example: inconsistent execution

Process A / Process B (operations as listed along the time axis):

    x = 10
    y = 1
    y = 5
    x = 20
    print x,y    “20, 1”
    print x,y    “10, 5”

– y assigned 5 before x assigned 20
– Broken no matter where in time we put B’s updates

Above implies reordering of operations in A or B

SLIDE 12

Serializable execution

  • The observed sequence of reads and writes corresponds to some non-interleaved (serial) execution of the transactions/processes

Process A / Process B (operations as listed along the time axis):

    x = 10
    x = 20
    x += 5
    print x    “15”
    print x    “20”

What change to the schedule serializes A and B?

Always preserves both coherence and consistency
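The definition can be checked mechanically for small cases by running every serial order and comparing the results with what was observed. The sketch below assumes that A performs x = 10, x += 5 and prints 15, while B performs x = 20 and prints 20; that attribution is not explicit in the flattened diagram, and the function names are hypothetical.

```python
from itertools import permutations


def process_a(db, out):
    db["x"] = 10
    db["x"] += 5
    out["A"] = db["x"]  # A's print


def process_b(db, out):
    db["x"] = 20
    out["B"] = db["x"]  # B's print


def serial_outcomes(processes):
    """Run every serial order on a fresh database and collect the results."""
    results = {}
    for order in permutations(processes):
        db, out = {}, {}
        for proc in order:
            proc(db, out)
        names = tuple(p.__name__ for p in order)
        results[names] = (out, db["x"])
    return results


outcomes = serial_outcomes([process_a, process_b])
observed = {"A": 15, "B": 20}
matches = [names for names, (out, _) in outcomes.items() if out == observed]
```

Under that assumed attribution, the printed values match both serial orders, which differ only in the final value of x (20 if A runs first, 15 if B runs first). One candidate answer to the question above is therefore to move B's write so that it no longer falls between A's two operations.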

SLIDE 13

What can go wrong in parallel?

  • Phenomenon: an interleaving of actions (schedule) where “something” might go wrong

– Write-after-write (uncommitted value overwritten)
– Read-after-write (uncommitted write “leaked”)
– Write-after-read (in-use value overwritten)

  • Anomaly: a schedule where “something” did go wrong (the phenomenon affected correctness)

– Two people reserve the same theater seat
– Airfare goes up while you’re trying to buy it

SLIDE 14

Phenomena and related anomalies

P – Write after Write (WAW)
    A – dirty write      w1(x) w2(x) ... c2 c1                           Lost committed write by T1

P – Read after Write (RAW)
    A1 – dirty read      w1(x) r2(x) ... [a1 and c2]                     T2 uses rolled-back value of x
    A2 – read skew       r1(x) w2(x) w2(y) c2 r1(y)                      T1 sees old x and new y

P – Write after Read (WAR)
    A1 – fuzzy read      r1(x) w2(x) ... r1(x)                           Value of x not stable during transaction
    A2 – lost update     r1(x) w2(x) ... w1(x)                           T1’s update depends on stale x
    A3 – phantom         r1(P) w2(add x to P) ... c2                     P as originally read is missing x

P – snapshot-related
    A1 – write skew      r1(x) r2(y) w1(y) w2(x) [c1 and c2]             Two updates cross in flight
    A2 – read-only       r2(x) r1(y) w1(y) c1 r3(x) r3(y) c3 w2(x) c2    T3 sees mix of past/future data
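The schedule notation (r = read, w = write, c = commit, a = abort, followed by the transaction number) is regular enough to scan mechanically. The sketch below is one possible reading of the three basic phenomena, not a definitive detector: it flags an operation when it touches an object that a different, still-unfinished transaction has read or written. The function name and parsing rules are assumptions, and it only handles single-word object names.

```python
import re


def phenomena(schedule):
    """Return the set of WAW/RAW/WAR phenomena found in a schedule string.

    Operations look like r1(x), w2(y), c1 (commit) or a1 (abort).
    """
    found = set()
    history = []      # (kind, txn, obj) for reads/writes seen so far
    finished = set()  # transactions that have committed or aborted
    for kind, txn, obj in re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule):
        if kind in "ca":
            finished.add(txn)
            continue
        for old_kind, old_txn, old_obj in history:
            if old_obj != obj or old_txn == txn or old_txn in finished:
                continue
            if old_kind == "w":
                found.add("WAW" if kind == "w" else "RAW")
            elif kind == "w":
                found.add("WAR")
        history.append((kind, txn, obj))
    return found


dirty_write = phenomena("w1(x) w2(x) ... c2 c1")
dirty_read = phenomena("w1(x) r2(x) ... [a1 and c2]")
# The commit c2 is filled in here as an assumption; the table elides it.
fuzzy_read = phenomena("r1(x) w2(x) c2 r1(x) c1")
serial = phenomena("r1(x) w1(x) c1 r2(x) w2(x) c2")
```

A phenomenon flagged this way only says that something might go wrong; whether it becomes an anomaly depends on what the transactions do with the values.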

SLIDE 15

Examples: phenomena and anomalies

  • Dirty write (WAW)

– You make an offer on a house, but the Realtor waits several days before responding that somebody else made a higher offer.

  • Dirty read (RAW)

– Your friend invites you to go with her to a concert; you buy a ticket; she changes her mind and decides not to go after all. No refunds.

  • Read skew (RAW)

– You park for three minutes in 10-minute parking to hand in an assignment; five minutes down the road you realize that part of the assignment is still in the car, so you turn around and park again to hand in the rest. When you come back three minutes later you have a parking ticket.

  • Fuzzy read (WAR)

– You grab the last tag for a hot item at a Boxing Day doorbuster sale, but the cashier says it’s actually out of stock (has actually happened to me)


SLIDE 16

Examples: phenomena and anomalies

  • Lost update (WAR)

– You save a minor edit to your project report, not realizing that your project partner saved major edits since you last opened the file.

  • Phantom (WAR)

– You buy a new textbook from the bookstore because you couldn’t find a used copy; meanwhile 100 cheap used copies go on sale.

  • Write skew (snapshot)

– You agree to go out with some friends on Friday because your ex-girlfriend is *not* coming; meanwhile, she decides to go because she thinks you won’t be there.

  • Read-only anomaly (snapshot)

– End of month, no money. Rent deduction hits checking, tax return hits savings. You log into online banking, see the tax return but not rent, and assume overdraft protection will kick in when the rent arrives. Bank says the rent arrived first and charges a $20 overdraft fee. The timestamps in the database show the rent arrived before you made your printout.
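The write skew example maps directly onto the schedule r1(x) r2(y) w1(y) w2(x), and a toy model shows why a snapshot-based system lets it through. The SnapshotTxn class below is a hypothetical, heavily simplified sketch (it detects conflicts by comparing values, which a real system would not do); x is "ex is going" and y is "you are going".

```python
class SnapshotTxn:
    """Toy snapshot-isolation transaction over a shared dict (illustrative only)."""

    def __init__(self, db):
        self.db = db
        self.snapshot = dict(db)  # private copy taken when the transaction starts
        self.writes = {}

    def read(self, key):
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # First committer wins: abort if a key we wrote changed since our snapshot.
        for key in self.writes:
            if self.db[key] != self.snapshot[key]:
                raise RuntimeError("write-write conflict on " + key)
        self.db.update(self.writes)


db = {"you_going": False, "ex_going": False}
you, ex = SnapshotTxn(db), SnapshotTxn(db)

you_sees_ex = you.read("ex_going")  # r1(x): False
ex_sees_you = ex.read("you_going")  # r2(y): False
if not you_sees_ex:
    you.write("you_going", True)    # w1(y)
if not ex_sees_you:
    ex.write("ex_going", True)      # w2(x)
you.commit()                        # c1
ex.commit()                         # c2: no conflict, the write sets are disjoint
```

Both transactions commit because neither wrote a key the other wrote, yet the final state has both people going, which neither transaction would have chosen had it seen the other's update.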

SLIDE 17

Weakened consistency models

  • Repeatable read (SQL)

– Allows phantoms

  • Read committed (SQL)

– Allows WAR anomalies

  • Read uncommitted (SQL)

– Allows RAW and WAR anomalies

  • Snapshot isolation (Oracle)

– Allows snapshot anomalies and phantoms

  • Cursor stability (IBM)

– Allows lost updates

  • Eventual consistency (internet)

– Only guarantee: values read were written at some point in the past
– Just about anything else can happen (WAW, lost updates, skew...)