Falcon – built for speed
Ann Harrison, Kevin Lewis
MySQL Users' Conference, April 2009



SLIDE 1

MySQL Users' Conference April 2009

Falcon – built for speed
Ann Harrison, Kevin Lewis

SLIDE 2

If it's so fast, why isn't it done yet?

SLIDE 3

Talk overview

- Falcon at a glance
- Project history
- Multi-threading for the database developer
- Cycle locking

SLIDE 4

Falcon at a glance – read first record

[Architecture diagram: MySQL Server on top of Falcon's Record Cache, Page Cache, Serial Log Windows, Serial Log Files, and Database Tablespaces]

SLIDE 5

Falcon at a glance – read complete

[Architecture diagram: MySQL Server on top of Falcon's Record Cache, Page Cache, Serial Log Windows, Serial Log Files, and Database Tablespaces]

SLIDE 6

Falcon at a glance – read again

[Architecture diagram: MySQL Server on top of Falcon's Record Cache, Page Cache, Serial Log Windows, Serial Log Files, and Database Tablespaces]

SLIDE 7

Falcon at a glance – write new record

[Architecture diagram: MySQL Server on top of Falcon's Record Cache, Page Cache, Serial Log Windows, Serial Log Files, and Database Tablespaces]

SLIDE 8

Falcon at a glance – commit

[Architecture diagram: MySQL Server on top of Falcon's Record Cache, Page Cache, Serial Log Windows, Serial Log Files, and Database Tablespaces]

SLIDE 9

Falcon at a glance – write complete

[Architecture diagram: MySQL Server on top of Falcon's Record Cache, Page Cache, Serial Log Windows, Serial Log Files, and Database Tablespaces]

SLIDE 10

Falcon history

Origin

- Transactional SQL engine for a web-application environment
- Bought by MySQL in 2006

MVCC

- Consistent read
- Versions control write access
- Memory only – no steal

- Indexes and data are separate
- Data encoded on disk and in memory
- Fine-grained multi-threading
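The consistent-read rule above can be sketched as a version chain: each record keeps its versions newest-first, and a reader sees the newest version committed at or before its snapshot. This is a hypothetical Python model, not Falcon's C++ internals; all names are invented.

```python
# Hypothetical MVCC version-chain sketch (names invented, not Falcon's code).
from dataclasses import dataclass
from typing import Optional

@dataclass
class Version:
    value: str
    txn_id: int                      # id of the writing transaction
    committed: bool
    older: "Optional[Version]" = None

def visible_version(chain, snapshot_txn):
    """Walk newest-to-oldest, returning the first version a reader
    with the given snapshot is allowed to see."""
    v = chain
    while v is not None:
        if v.committed and v.txn_id <= snapshot_txn:
            return v
        v = v.older
    return None
```

Because readers only walk the chain and writers only prepend to it, a consistent read never blocks a writer, which is the point of MVCC here.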

SLIDE 11

Falcon Goals circa 2006

- Exploit large memory for more than just a bigger cache
- Use threads and processors for data migration
- Eliminate tradeoffs, minimize tuning
- Scale gracefully to very heavy loads
- Support web applications

SLIDE 12

Web application characteristics

- Large archive of data
- Smaller active set
- High read:write ratio
- Uneven, bursty activity

SLIDE 13

What we did instead

- Enforce a limit on record cache size
- Respond to simple, atypical loads:
  - Autocommit single-record access
  - Repeated “insert ... select”
  - Single-pass read of a large data set
- Challenge InnoDB on DBT2:
  - Large working set
  - Continuous heavy load
- Hired the world's most vicious test designer

SLIDE 14

Record Cache

Record Cache contains:

Committed records with no versions

SLIDE 15

Record Cache

Record Cache contains:

- Committed records with no versions
- New, uncommitted records

SLIDE 16

Record Cache

Record Cache contains:

- Committed records with no versions
- New, uncommitted records
- Records with multiple versions

SLIDE 17

Record Cache cleanup – step 1

Clean up old, committed, single-version records.
Scavenger:
- Runs on schedule or on demand
- Removes the oldest mature records
- Settable limits – start and stop
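The scavenger's start/stop limits can be modeled like this (an illustrative sketch; the cache structure, thresholds, and "mature" flag are assumptions, not Falcon's actual code): once the cache grows past the start limit, the oldest mature records are evicted until the stop limit is reached.

```python
# Illustrative model of scavenging with start/stop limits (names invented).
from collections import OrderedDict

class RecordCache:
    def __init__(self, start_limit, stop_limit):
        self.start_limit = start_limit    # scavenging begins above this size
        self.stop_limit = stop_limit      # scavenging stops at this size
        self.records = OrderedDict()      # insertion order approximates age

    def insert(self, key, mature):
        self.records[key] = {"mature": mature}
        if len(self.records) > self.start_limit:
            self.scavenge()

    def scavenge(self):
        # Evict oldest-first, skipping records that are not yet mature
        # (uncommitted, or still carrying multiple versions).
        for key in list(self.records):
            if len(self.records) <= self.stop_limit:
                break
            if self.records[key]["mature"]:
                del self.records[key]
```

The start/stop gap matters: scavenging down to a lower watermark, rather than to the trigger size, keeps the scavenger from running on every insert.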

SLIDE 18

Record Cache Cleanup – step 2

Clean out record versions too old to be useful.
Prune:
- Remove old, unneeded versions

SLIDE 19

Record Cache Cleanup – step 3

Clean up a cache full of new records.
Chill:
- Copy new record data to the log
- Done by the transaction thread
- Settable start size
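The chill step above can be sketched like this (a hypothetical model; the threshold, the dict-based record, and the offset stub are all invented for illustration): once a transaction's in-cache data passes the start size, the transaction thread writes the data to the log and leaves only a stub behind.

```python
# Sketch of "chill": move new record data to the log, keep a stub in cache.
CHILL_START_SIZE = 64   # illustrative threshold in bytes; settable in Falcon

class SerialLog:
    def __init__(self):
        self.entries = []
    def append(self, data):
        self.entries.append(data)
        return len(self.entries) - 1      # "offset" of the logged data

def maybe_chill(txn_records, log):
    """If the transaction's in-cache data exceeds the start size,
    move record data into the log, leaving offsets behind."""
    total = sum(len(r["data"]) for r in txn_records if r.get("data") is not None)
    if total < CHILL_START_SIZE:
        return False
    for r in txn_records:
        if r.get("data") is not None:
            r["log_offset"] = log.append(r.pop("data"))
    return True
```

The inverse operation (thawing) would read the data back from the log via the stored offset when the record is needed again.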

SLIDE 20

Record Cache Cleanup – step 4

Clean up multiple versions of a single record created by a single transaction.
Remove intermediate versions:
- Created by a single transaction
- Rolled back to a save point
- Repeated updates

SLIDE 21

Record Cache Cleanup – step 5

Clean up records with multiple versions that are still potentially visible.
Backlog:
- Copy the entire record tree to disk
- Expensive
- Not yet working

SLIDE 22

Simple, atypical loads

Challenge:

- Autocommit single-record access
- Record cache is useless
- Record encoding is useless
- Transaction creation/destruction is too expensive

Response:

Reuse read only transactions

Result:

Multi-threaded bookkeeping nightmare
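The reuse idea can be sketched as a free list of transaction objects (a deliberately single-threaded toy; the class and method names are invented). The sketch shows why reuse pays off: after warm-up, no new transaction objects are allocated at all.

```python
# Sketch of reusing read-only transaction objects (names invented).
class Transaction:
    def __init__(self):
        self.snapshot = None
    def reset(self, snapshot):
        self.snapshot = snapshot
        return self

class ReadOnlyTxnPool:
    def __init__(self):
        self.free = []
        self.created = 0      # how many objects were actually allocated

    def begin(self, snapshot):
        if self.free:
            return self.free.pop().reset(snapshot)
        self.created += 1
        return Transaction().reset(snapshot)

    def end(self, txn):
        # In Falcon this bookkeeping is the hard, multi-threaded part;
        # this sketch ignores concurrency entirely.
        self.free.append(txn)
```

The "nightmare" the slide mentions lives in what this sketch leaves out: making `begin`/`end` safe when many connection threads hit the free list at once.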

SLIDE 23

Simple, atypical loads

Challenge:

Repeat “insert ... select...”

Fill cache with old and new records

SLIDE 24

Simple, atypical loads

Challenge:

Repeat “insert ... select...”

Fill cache with old and new records.
First solution:
- Scavenge old records
- Chill new record data

SLIDE 25

Simple, atypical loads

Challenge:

Repeat “insert ... select...”

Fill cache with old and new records.
First solution:
- Scavenge old records
- Chill new records

Second solution:
- Move the record headers out
- Also helps index creation

SLIDE 26

Simple, atypical loads

Single pass read of large data set

- Read more records than fit in the cache
- Never read them over and over
- Caches are useless
- Encoding is overhead

Response:

Make encoding optional?

SLIDE 27

Challenge InnoDB on DBT2

Initial results were not encouraging (2007)

[Chart: DBT2 throughput (transactions) vs. connections, 10 to 200; series: Falcon 2007, InnoDB 2007]

SLIDE 28

Challenge InnoDB on DBT2

But Falcon has improved a lot since April 2007

[Chart: DBT2 throughput (transactions) vs. connections, 10 to 200; series: Falcon 2007, InnoDB 2007, Falcon 2009]

SLIDE 29

Challenge InnoDB on DBT2

So did InnoDB

[Chart: DBT2 throughput (transactions) vs. connections, 10 to 200; series: Falcon 2007, InnoDB 2007, Falcon 2009, InnoDB 2009]

SLIDE 30

Bug trends

SLIDE 31

Multi-threading

Databases are a natural fit for multi-threading

- Connections
- Gophers
- Scavenger
- Disk reader/writer

Except for shared structures:
- Locking blocks parallel operations
- Challenge – sharing without locking

SLIDE 32

Multi-threading

Non-locking operation: purge old record versions.

SLIDE 33

Multi-threading

Non-locking operation: purge old record versions.
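Why purging can avoid locks, in a simplified illustration (names invented): the cleanup thread unlinks a chain's tail with a single pointer store. A concurrent reader either sees the old tail or the cut; both are safe, because purged versions are too old for any active reader to need.

```python
# Simplified illustration of a lock-free unlink (names invented).
class Version:
    def __init__(self, value, older=None):
        self.value = value
        self.older = older

def purge_older_than(v):
    """Cut off everything older than v with one store; no lock taken."""
    v.older = None

def walk(chain):
    out = []
    while chain is not None:
        out.append(chain.value)
        chain = chain.older
    return out
```

A reader that had already stepped onto the old tail still holds a valid reference and finishes its walk; the real subtlety, which the next slides address, is deciding when that tail's memory can finally be freed.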

SLIDE 34

Multi-threading

Locking operation: remove intermediate versions.

SLIDE 35

Multi-threading

Locking operation: remove intermediate versions.
What granularity of lock?

SLIDE 36

Multi-threading – Lock granularity

One per record:

Too many interlocked instructions

One per record group:

Thread reading one record prevents scavenge of another

No answer is right – more options?

SLIDE 37

Cycle locking – read record chain

Before starting to read a record chain, get a shared lock on a “cycle”.
- Transaction A
- Transaction B
- Transaction C

Cycle 1 = 3 shared; Cycle 2 inactive

SLIDE 38

Cycle locking – clean a record chain

Before starting to clean a record chain, get a shared lock on the cycle.
- Transaction A active in Cycle 1
- Transaction B active in Cycle 1
- Transaction C active in Cycle 1
- Scavenger unlinks versions from the record chain and links them to a “to be deleted” list

Cycle 1 = 4 shared; Cycle 2 inactive

SLIDE 39

Cycle locking – records relinked

- Transaction A releases lock
- Transaction B releases lock
- Transaction C still active
- Scavenger releases lock

Cycle 1 = 1 shared; Cycle 2 inactive

SLIDE 40

Cycle locking – swap cycles

- New accesses lock Cycle 2
- Transaction C holds the Cycle 1 lock
- Cycle Manager requests exclusive on Cycle 1 (pumps the cycle)
- Transaction A acquires a Cycle 2 lock

Cycle 1 = 1 shared; Cycle 2 = 1 shared

SLIDE 41

Cycle locking – cleanup phase

- Transaction C releases lock
- Transaction B acquires a Cycle 2 lock
- Cycle Manager gets exclusive on Cycle 1

Cycle 1 = 0 shared, exclusive; Cycle 2 = 2 shared

SLIDE 42

Cycle locking – cleanup complete

- Transaction C acquires a Cycle 2 lock
- Cycle Manager holds exclusive on Cycle 1
- Removes unlinked, unloved, old versions
- When cleanup is done, Cycle Manager releases Cycle 1

Cycle 1 exclusive; Cycle 2 = 2 shared
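The whole sequence above can be modeled as epoch-based reclamation (a simplified, single-process sketch; the class, method names, and locking details are invented, not Falcon's implementation). Readers take a shared hold on the current cycle; the scavenger queues unlinked versions on that cycle; pumping switches new readers to the other cycle, waits for the old one to drain, then reclaims everything retired under it.

```python
# Two-cycle reclamation sketch in the spirit of the slides (names invented).
import threading

class CycleManager:
    def __init__(self):
        self.current = 0
        self.shared = [0, 0]               # shared-lock counts per cycle
        self.retired = [[], []]            # "to be deleted" lists per cycle
        self.mutex = threading.Lock()
        self.drained = threading.Condition(self.mutex)

    def enter(self):
        """Reader side: take a shared hold on the current cycle."""
        with self.mutex:
            c = self.current
            self.shared[c] += 1
            return c

    def leave(self, c):
        with self.mutex:
            self.shared[c] -= 1
            if self.shared[c] == 0:
                self.drained.notify_all()

    def retire(self, version):
        """Scavenger side: queue an unlinked version for deletion."""
        with self.mutex:
            self.retired[self.current].append(version)

    def pump(self):
        """Switch cycles, wait for exclusive access to the old cycle
        (shared count reaches zero), then reclaim its retired versions."""
        with self.mutex:
            old = self.current
            self.current = 1 - old         # new accesses now lock the other cycle
            while self.shared[old] != 0:
                self.drained.wait()
            freed, self.retired[old] = self.retired[old], []
            return freed
```

The attraction over per-record or per-group locks is that readers touch only one counter per chain walk, while the cycle manager pays the synchronization cost once per pump rather than once per record.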

SLIDE 43

Questions