XtraDB 5.7: Key Performance Algorithms (Laurynas Biveinis, Alexey Stroganov)



SLIDE 1

XtraDB 5.7: Key Performance Algorithms

Laurynas Biveinis Alexey Stroganov Percona firstname.lastname@percona.com

SLIDE 2

XtraDB 5.7 Key Performance Algorithms

  • Focus on the buffer pool, flushing, the doublewrite buffer

  • Talk assumes familiarity, but feel free to interrupt
  • What we learned
  • What we did
  • How we did it

SLIDE 3

InnoDB buffer pool

  • Memory cache of disk data pages
  • In-memory data pages accessible through several data structures

  • 1) Page hash for lookup

[Diagram: (space_id; page_id) → fold → hash array → data page lists]
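The lookup path in the diagram can be sketched as a chained hash table. This is a minimal illustration, not InnoDB's code: the `fold` formula, the `Page` and `PageHash` classes, and the cell count are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Page:
    space_id: int
    page_id: int
    data: bytes = b""


def fold(space_id: int, page_id: int) -> int:
    """Illustrative fold: mix both ids into a single integer."""
    return ((space_id << 20) + space_id + page_id) & 0xFFFFFFFF


class PageHash:
    """Hash array whose cells each hold a list (chain) of pages."""

    def __init__(self, n_cells: int = 1024):
        self.cells = [[] for _ in range(n_cells)]

    def _cell(self, space_id: int, page_id: int) -> list:
        return self.cells[fold(space_id, page_id) % len(self.cells)]

    def insert(self, page: Page) -> None:
        self._cell(page.space_id, page.page_id).append(page)

    def lookup(self, space_id: int, page_id: int):
        # Walk the chain in the selected cell and compare the full key.
        for page in self._cell(space_id, page_id):
            if (page.space_id, page.page_id) == (space_id, page_id):
                return page
        return None  # page is not in the buffer pool
```

A lookup that returns `None` corresponds to a page that has to be read from disk into a free page.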

SLIDE 4

InnoDB buffer pool

  • 2) flush list for dirty page management. Dirtying:

[Diagram: INSERT INTO foo VALUES(bar) turns a clean page into a dirty page with LSN = 42. Flush list tail: dirty page (LSN = 25), dirty page (LSN = 32), dirty page (LSN = 42).]

SLIDE 5

InnoDB buffer pool

  • 2) flush list for dirty page management. Flushing:

Flush list head: dirty page (LSN = 5), dirty page (LSN = 7), dirty page (LSN = 12)

Flush up to LSN 10

Flush list head: clean page, clean page, dirty page (LSN = 12)
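The dirtying and flushing behaviour shown on the last two slides can be sketched with a queue ordered by modification LSN. This is a simplified model under stated assumptions: the `FlushList` class and its method names are hypothetical, LSNs are assumed to grow monotonically, and a page keeps the LSN of its first modification.

```python
from collections import deque


class FlushList:
    """Dirty pages ordered by the LSN of their first modification."""

    def __init__(self):
        self.pages = deque()  # left = head (oldest LSN), right = tail (newest)

    def mark_dirty(self, page, lsn):
        # Only the first modification puts a page on the list.
        if all(p != page for p, _ in self.pages):
            self.pages.append((page, lsn))

    def flush_up_to(self, target_lsn):
        """Flush pages from the head while their LSN is <= target_lsn."""
        flushed = []
        while self.pages and self.pages[0][1] <= target_lsn:
            page, _ = self.pages.popleft()
            flushed.append(page)  # the page is clean again after the write
        return flushed
```

With pages at LSN 5, 7, and 12 on the list, `flush_up_to(10)` returns the first two and leaves the LSN 12 page at the head, matching the slide.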

SLIDE 6

InnoDB buffer pool

  • 3) LRU list for deciding which pages to evict
  • Preventing eviction for recently-used pages (making them young):

[Diagram: the LRU list (dirty, clean, dirty, clean, clean) before and after a page access makes the accessed page young.]
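Making a page young can be sketched with an ordered map. This is a deliberately simplified model: the `LRUList` class is hypothetical, and it ignores the midpoint insertion and the old/young sublists that the real LRU list uses.

```python
from collections import OrderedDict


class LRUList:
    """First entry = oldest (next eviction candidate), last = youngest."""

    def __init__(self):
        self.pages = OrderedDict()  # page_id -> dirty flag

    def add(self, page_id, dirty=False):
        self.pages[page_id] = dirty

    def access(self, page_id):
        # "Make young": move the accessed page away from the eviction end.
        self.pages.move_to_end(page_id)

    def eviction_candidate(self):
        return next(iter(self.pages), None)
```

After `access(1)` on a list holding pages 1, 2, 3, page 2 becomes the next eviction candidate.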

SLIDE 7

InnoDB buffer pool

  • 4) free list for having free space in the buffer pool to read currently non-present pages. Reading:

[Diagram: a free list of five free pages; after a page read, four free pages remain and one has become a clean page.]

SLIDE 8

InnoDB buffer pool

  • 3/4) Evicting/flushing pages from the LRU list and putting them on the free list:

[Diagram: before, the LRU list holds five pages (dirty, clean, dirty, clean, clean) and the free list holds four free pages; after, the LRU list holds four pages (two dirty, two clean) and the free list holds five free pages.]
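The interplay between the free list and the LRU list can be sketched as follows. This is a toy model, not the InnoDB implementation: `BufferPool`, `refill_free_list`, and the `writes` log are hypothetical names, and a "flush" is reduced to recording the page id.

```python
from collections import deque


class BufferPool:
    def __init__(self, n_frames):
        self.free = deque(range(n_frames))  # frames holding no page
        self.lru = deque()                  # [frame, page_id, dirty]; left = oldest
        self.writes = []                    # page ids written out (simulated I/O)

    def read_page(self, page_id):
        """Reading a non-present page consumes one free frame."""
        if not self.free:
            self.refill_free_list(1)
        frame = self.free.popleft()
        self.lru.append([frame, page_id, False])
        return frame

    def mark_dirty(self, page_id):
        for entry in self.lru:
            if entry[1] == page_id:
                entry[2] = True

    def refill_free_list(self, n):
        """Evict up to n of the oldest LRU pages, flushing dirty ones first."""
        for _ in range(min(n, len(self.lru))):
            frame, page_id, dirty = self.lru.popleft()
            if dirty:
                self.writes.append(page_id)  # a dirty page must be written before reuse
            self.free.append(frame)
```

Clean pages are evicted without any I/O, while dirty pages cost a write, which is why the state of the LRU tail matters for how quickly free pages can be produced.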

SLIDE 9

The doublewrite buffer

Step 1 (Add): data page → doublewrite buffer in memory

Step 2 (Flush): doublewrite buffer in memory → doublewrite buffer on disk

Step 3 (Write): data page → data file
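The three steps can be sketched with in-memory stand-ins for the files. This is an illustration under assumptions: the `Doublewrite` class, its `capacity` trigger, and the `log` list are hypothetical, and real code would sync the doublewrite area to disk before step 3.

```python
class Doublewrite:
    def __init__(self, capacity):
        self.capacity = capacity
        self.in_memory = []   # doublewrite buffer in memory
        self.on_disk = []     # doublewrite buffer on disk (simulated)
        self.data_file = {}   # page_id -> bytes (simulated data file)
        self.log = []         # order of operations, for inspection

    def add(self, page_id, data):
        # Step 1: add the page to the in-memory doublewrite buffer.
        self.in_memory.append((page_id, data))
        self.log.append(("add", page_id))
        if len(self.in_memory) == self.capacity:
            self.flush()

    def flush(self):
        # Step 2: write the whole batch to the doublewrite area on disk.
        self.on_disk = list(self.in_memory)
        self.log.append(("flush", len(self.on_disk)))
        # Step 3: only then write each page to its place in the data file.
        for page_id, data in self.in_memory:
            self.data_file[page_id] = data
            self.log.append(("write", page_id))
        self.in_memory.clear()
```

Because step 2 always completes before step 3 starts, a torn write to the data file can be repaired from the copy in the doublewrite area.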

SLIDE 10

Buffer pool concurrency

[Diagram: buffer pool structures and the locks protecting them. Structures: flush list, LRU list, free list, page hash, misc. Locks: buffer pool mutex, flush list mutex, page hash latch.]

SLIDE 11

Buffer pool instances

[Diagram: buffer pool instance 0 and buffer pool instance 1, each with its own flush list, LRU list, free list, page hash, and misc. data, and its own buffer pool mutex, flush list mutex, and page hash latch.]
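How pages end up in one instance or another, and why the load per instance can be uneven, can be sketched as below. The mapping function is an assumption made for illustration and is not claimed to be the one the server uses; `instance_for` and `accesses_per_instance` are hypothetical names.

```python
from collections import Counter


def instance_for(space_id, page_id, n_instances):
    """Illustrative mapping of a page to a buffer pool instance."""
    return ((space_id << 20) + space_id + page_id) % n_instances


def accesses_per_instance(accesses, n_instances):
    """Count how many accesses each instance receives."""
    counts = Counter({i: 0 for i in range(n_instances)})
    for space_id, page_id in accesses:
        counts[instance_for(space_id, page_id, n_instances)] += 1
    return counts
```

Each instance only helps concurrency for the pages that map to it, so a workload concentrated on a few pages loads a few instances and leaves the rest idle.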

SLIDE 12

Buffer pool instances

  • Problem: some instances are cold and some are hot
  • “First the accesses to the buffer pools is in no way evenly spread out.”
  • http://bit.ly/bpsplit
  • Six-year-old quote, still just as relevant today

SLIDE 13

Concurrency in XtraDB

  • flush list: flush list mutex
  • page hash: page hash latch
  • LRU list: LRU list mutex
  • free list: free list mutex
  • misc: misc mutex / atomics

Patch contributed to MySQL, and merged in 8.0.0 http://bugs.mysql.com/bug.php?id=75534
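The idea of the split, one lock per structure instead of one buffer pool mutex, can be sketched as follows. This is a toy illustration: `SplitMutexPool` and its methods are hypothetical names, and Python locks stand in for the server's mutexes.

```python
import threading


class SplitMutexPool:
    """Toy buffer pool where every list is protected by its own mutex."""

    def __init__(self):
        self.lru_list, self.lru_mutex = [], threading.Lock()
        self.free_list, self.free_mutex = [], threading.Lock()
        self.flush_list, self.flush_mutex = [], threading.Lock()

    def make_young(self, page):
        with self.lru_mutex:  # touches only the LRU list
            if page in self.lru_list:
                self.lru_list.remove(page)
            self.lru_list.append(page)

    def put_free(self, page):
        with self.free_mutex:  # touches only the free list
            self.free_list.append(page)

    def add_dirty(self, page, lsn):
        with self.flush_mutex:  # touches only the flush list
            self.flush_list.append((page, lsn))
```

A thread holding the LRU list mutex no longer blocks threads that only need the free list or the flush list.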

SLIDE 14

Concurrency solutions are compatible

[Diagram: buffer pool instance 0 and buffer pool instance 1, each with its own flush list mutex, page hash latch, LRU list mutex, free list mutex, and misc mutex / atomics.]

SLIDE 15

Buffer pool mutexes are so 5.5

[Chart annotations: improvement by the buffer pool mutex split; improvement by adaptive flushing]

SLIDE 16

5.6+ changed things

  • In 5.5 and earlier: reduce mutex contention by X%, observe TPS increase by ~X%
  • Changing flushing heuristics is driven by performance stability, not necessarily by peak performance
  • Pre-release Percona Server 5.6: reduce mutex contention by X%, observe TPS increase by ~0%
  • What happened? InnoDB cleaner thread happened

SLIDE 17

Buffer pool / flushing concurrency in 5.5

[Timeline diagram: master thread, query thread 1, and query thread 2 over time. Operations shown: flush list flush, make page young, LRU list flush.]

SLIDE 18

Buffer pool / flushing concurrency in 5.6+

[Timeline diagram: cleaner thread, query thread 1, and query thread 2 over time. Operations shown: flush list flush, make page young, LRU list flush.]

SLIDE 19

Buffer pool / flushing concurrency in 5.6+

  • In 5.6+, code-level changes that make locking finer-grained are still important, but
  • Increasing thread specialization means that…
  • …flushing heuristics, including LRU flushing, are very important now

SLIDE 20

MySQL 5.7 multi-threaded flushing

[Timeline diagram, 0 s to 1 s: a coordinator thread and worker threads #0 and #1 flush the LRU lists and flush lists of buffer pool instances #0, #1, and #2.]

SLIDE 21

MySQL 5.7.11 OLTP_RW

SLIDE 22

PFS data is incomplete

SLIDE 23

MySQL 5.7.11 OLTP_RW

660 pthread_cond_wait, enter (ib0mutex.h:850), buf_dblwr_write_single_page (ib0mutex.h:850), buf_flush_write_block_low (buf0flu.cc:1096), buf_flush_page (buf0flu.cc:1096), buf_flush_single_page_from_LRU (buf0flu.cc:2217), buf_LRU_get_free_block (buf0lru.cc:1401), ...

631 pthread_cond_wait, buf_dblwr_write_single_page (buf0dblwr.cc:1213), buf_flush_write_block_low (buf0flu.cc:1096), buf_flush_page (buf0flu.cc:1096), buf_flush_single_page_from_LRU (buf0flu.cc:2217), buf_LRU_get_free_block (buf0lru.cc:1401), ...

337 pthread_cond_wait, PolicyMutex<TTASEventMutex<GenericPolicy> (ut0mutex.ic:89), get_next_redo_rseg (trx0trx.cc:1185), trx_assign_rseg_low (trx0trx.cc:1278), trx_set_rw_mode (trx0trx.cc:1278), lock_table (lock0lock.cc:4076), ...

SLIDE 24

631 pthread_cond_wait,buf_dblwr_write_single_page

SLIDE 25

Single-page flushing

Query thread needs a free page → Is free page available?

  • Yes: take a free page from the free list
  • No: single-page flush, then single-page doublewrite

Query thread has a free page

SLIDE 26

XtraDB innodb_empty_free_list_algorithm=backoff

Query thread needs a free page → Is free page available?

  • Yes: take a free page from the free list
  • No: wait and check again, instead of doing a single-page flush and single-page doublewrite

Query thread has a free page
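The two flowcharts can be contrasted in one function. This is a sketch under assumptions, not the server code: `get_free_block`, the delay values, and the injected `sleep` parameter are hypothetical, and the single-page flush is reduced to taking the oldest LRU page.

```python
import time
from collections import deque


def get_free_block(free_list, lru_list, algorithm="backoff",
                   sleep=time.sleep, max_delay=0.01):
    """Return a free page, either by single-page flushing or by waiting."""
    delay = 0.0001
    while True:
        if free_list:
            return free_list.popleft()  # "Yes" branch: a free page is available
        if algorithm == "legacy" and lru_list:
            # Single-page flush: the query thread itself evicts the oldest
            # LRU page (the write and doublewrite are omitted here).
            return lru_list.popleft()
        # Backoff: wait for the flusher threads to refill the free list,
        # doubling the delay up to max_delay between checks.
        sleep(delay)
        delay = min(delay * 2, max_delay)
```

With backoff, the query thread never does flushing I/O itself, so the cost moves to waiting; that only pays off if the LRU flushers keep the free list supplied.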

SLIDE 27

MySQL 5.7 multi-threaded flushing

[Timeline diagram, 0 s to 1 s: a coordinator thread and worker threads #0 and #1 flush the LRU lists and flush lists of buffer pool instances #0, #1, and #2. Annotations: free pages; single page flushes!]

SLIDE 28

Percona Server 5.7 multi-threaded flushing

[Timeline diagram, 0 s to 1 s: LRU flusher #0 and LRU flusher #1 each repeatedly flush the LRU list of their own instance (#0 and #1), producing free pages; the coordinator and worker #0 flush flush list instances #0 and #1.]
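A dedicated LRU flusher per buffer pool instance can be sketched with one thread per instance. This is a simplified, single-pass illustration: `Instance`, `lru_flusher`, `run_lru_flushers`, and `free_target` are hypothetical names, and a real flusher would loop continuously and perform I/O.

```python
import threading
from collections import deque


class Instance:
    def __init__(self, n_pages):
        self.lru = deque(range(n_pages))  # left = oldest page
        self.free = deque()
        self.mutex = threading.Lock()


def lru_flusher(instance, free_target):
    """Refill one instance's free list from its LRU list up to free_target."""
    while True:
        with instance.mutex:
            if len(instance.free) >= free_target or not instance.lru:
                return
            instance.free.append(instance.lru.popleft())


def run_lru_flushers(instances, free_target):
    # One flusher thread per instance; no thread waits on another instance.
    threads = [threading.Thread(target=lru_flusher, args=(inst, free_target))
               for inst in instances]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Because each flusher works on its own instance, a slow instance does not delay free page production in the others.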

SLIDE 29

Percona Server 5.7.10-3 OLTP_RW

SLIDE 30

Percona Server 5.7.10-3 OLTP_RW

2678 nanosleep (libpthread.so.0), ..., buf_LRU_get_free_block (buf0lru.cc:1435), ...

867 pthread_cond_wait, ..., log_write_up_to (log0log.cc:1293), ...

396 pthread_cond_wait, ..., mtr_t::s_lock (sync0rw.ic:433), btr_cur_search_to_nth_level (btr0cur.cc:1022), ...

337 libaio::?? (libaio.so.1), LinuxAIOHandler::collect (os0file.cc:2325), ...

240 poll (libc.so.6), ..., Protocol_classic::read_packet (protocol_classic.cc:810), ...

SLIDE 31

2678 nanosleep, …, buf_LRU_get_free_block

SLIDE 32

Percona Server 5.7.10-3 OLTP_RW flushers only

SLIDE 33

Legacy doublewrite buffer: adding pages

SLIDE 34

Percona Server 5.7.10-3 OLTP_RW flushers only

139 libaio::?? (libaio.so.1), LinuxAIOHandler::collect (os0file.cc:2448), LinuxAIOHandler::poll (os0file.cc:2594), ...

56 pthread_cond_wait, ..., buf_dblwr_add_to_batch (buf0dblwr.cc:1111), ..., buf_flush_LRU_list_batch (buf0flu.cc:1555), ..., buf_lru_manager (buf0flu.cc:2334), ...

25 pthread_cond_wait, ..., os_event_wait_low (os0event.cc:534), buf_flush_page_cleaner_worker (buf0flu.cc:3482), ...

21 pthread_cond_wait, ..., PolicyMutex<TTASEventMutex<GenericPolicy> (ut0mutex.ic:89), buf_page_io_complete (buf0buf.cc:5966), fil_aio_wait (fil0fil.cc:5754), io_handler_thread (srv0start.cc:330), ...

8 pthread_cond_timedwait, ..., buf_flush_page_cleaner_coordinator (buf0flu.cc:2726), ...

SLIDE 35

56 pthread_cond_wait, …, buf_dblwr_add_to_batch

SLIDE 36

Legacy doublewrite buffer: flushing buffer

SLIDE 37

Parallel doublewrite buffer: adding pages

SLIDE 38

Parallel doublewrite buffer: flushing buffers

SLIDE 39

Percona Server 5.7.11-4 OLTP_RW flushers only

SLIDE 40

Percona Server 5.7.11-4 OLTP_RW flushers only

112 libaio::?? (libaio.so.1), LinuxAIOHandler::collect (os0file.cc:2455), ..., io_handler_thread (srv0start.cc:330), ...

54 pthread_cond_wait, ..., buf_dblwr_flush_buffered_writes (buf0dblwr.cc:1287), ..., buf_flush_LRU_list (buf0flu.cc:2341), buf_lru_manager (buf0flu.cc:2341), ...

35 pthread_cond_wait, ..., PolicyMutex<TTASEventMutex<GenericPolicy> (ut0mutex.ic:89), buf_page_io_complete (buf0buf.cc:5986), ..., io_handler_thread (srv0start.cc:330), ...

27 pthread_cond_wait, ..., buf_flush_page_cleaner_worker (buf0flu.cc:3489), ...

10 pthread_cond_wait, ..., enter (ib0mutex.h:845), buf_LRU_block_free_non_file_page (ib0mutex.h:845), buf_LRU_block_free_hashed_page (buf0lru.cc:2567), ..., buf_page_io_complete (buf0buf.cc:6070), ..., io_handler_thread (srv0start.cc:330), ...

SLIDE 41

Percona Server 5.7 OLTP_RW

SLIDE 42

Percona Server 5.7 OLTP_RW

SLIDE 43

Summary: 5.7 story

  • I/O-bound workloads: high demand for free pages, provided by LRU batch flushing or single-page flushing
  • Single-page flushes are bad, with and without doublewrite
  • Removed them
  • Made batch LRU flusher truly parallel
  • Doublewrite buffer negates parallel flushing gains
  • Made it parallel too

SLIDE 44


Rate My Session!