Exploring Percona XtraDB Cluster using Performance Schema Krunal - - PowerPoint PPT Presentation

exploring percona xtradb cluster using performance schema
SMART_READER_LITE
LIVE PREVIEW

Exploring Percona XtraDB Cluster using Performance Schema Krunal - - PowerPoint PPT Presentation

Exploring Percona XtraDB Cluster using Performance Schema Krunal Bauskar PXC Product Lead (Percona Inc.) Agenda PXC and Performance Schema Understanding PXC instruments Enabling P_S for monitoring PXC Monitoring threads


slide-1
SLIDE 1

Exploring Percona XtraDB Cluster using Performance Schema

Krunal Bauskar PXC Product Lead (Percona Inc.)

slide-2
SLIDE 2

Agenda

  • PXC and Performance Schema
  • Understanding PXC instruments
  • Enabling P_S for monitoring PXC
  • Monitoring threads
  • Monitoring files
  • Monitoring Stages
  • Monitoring Mutexes/Cond-Variables
slide-3
SLIDE 3

PXC and Performance Schema

  • PXC-5.6 had limited support for PFS
  • Starting PXC-5.7 support has been extended to cover more instruments as

part of PFS.

  • Idea is to use PFS for monitoring and tracking performance issues in PXC-5.7
  • Development to PFS will continue and we plan to make it much more helpful

in upcoming releases.

  • Is PFS alternative to logging?

  • No. It is not. Both has different use-cases but can use-cases can overlap.
slide-4
SLIDE 4

Understanding PXC instruments

  • PFS has concept of instruments. DB internal objects can be called instrument.

Like mutexes, condition variables, files, threads, etc...

  • User can decide what all instruments to monitor based on need.
  • PXC objects to monitor:

○ PXC threads (doesn’t generate event directly) ○ PXC files (doesn’t generate event directly) ○ PXC stages ○ PXC mutexes ○ PXC condition variables

slide-5
SLIDE 5

PXC threads

  • Applier Thread(s): Meant to apply replicated write-sets. There can more than one applier threads.
  • Rollback Thread: If local transaction needs to be force aborted (due conflict with replicated transaction) applier

thread with high priority transaction add this thread to a queue and signal rollback thread. Rollback thread then rolls back the given transaction.

  • IST Receiver/Sender Thread: As name suggest they are responsible for doing IST. Comes into effect only when

there is needed for IST.

  • Gcache File Remove Thread: Dedicated thread meant to get rid of gcache file thereby keeping watch on

disk-space.

  • Receiver Thread: Read the packet received on cluster channel and invoke appropriate action.
  • Service Thread: Handle the administrative work.
  • GComm Thread: Low level gcomm-communication thread.
  • SST Joiner/Donor Thread: Used during SST.
  • Write-set Checkum Thread: if checksum needs to be done
slide-6
SLIDE 6

PXC files

  • Ring Buffer File: GCache file that is used as circular buffer for caching write-sets.
  • GCache Page File: On-demand page store pages created as part of GCache Infrastructure.
  • RecordSet File: Created to temporary cache record-set generated during transaction processing.

(Certification keys and binary log data).

  • Grastate File/Gvwstate file: grastate cache the state of cluster (marked unsafe during normal

cluster operation) and gvwstate caches known good primary component state. [Not monitored through PFS as these are generally static and changes during configuration change only]

slide-7
SLIDE 7

PXC Stages

  • Any PXC action passes through different stages as commented below:

○ Writing/Deleting/Updating Rows: Corresponding DML in progress. ○ Applying write-sets: Applier thread is in process of applying write-set. ○ Commit/Rollback: Commit/Rollback transaction. ○ Replicating Commit: pre-commit action to put write-set on group channel. ○ Replaying: Replaying of locally aborted transaction. ○ Preparing for TO Isolation: Trigger TO isolation. ○ Applier Idle: Applier is waiting for next set of workload/write-sets to process. ○ Rollback/Aborted Idle: Rollback thread waiting watching queue for aborted transaction. ○ Aborter Active: Found a transaction to abort.

slide-8
SLIDE 8

PXC mutex/condition-variables

  • There are tons of them and most of them are self explanatory.

wsrep_ready, wsrep_sst_init, wsrep_sst, wsrep_rollback, wsrep_slave_threads, wsrep_desync, wsrep_sst_thread, wsrep_replaying, etc.. certification_mutex, stats collection mutex, monitoring mutexes, recvbuf mutex, transaction handling mutex, saved state mutexes, etc.. To start with you may need to just keep important mutex say monitoring mutex as they control parallel replication or for that matter rollback mutex as it control rollback activity or replaying mutex as it control replaying of transaction.

slide-9
SLIDE 9

Enabling P_S instruments for PXC

  • Setup_consumer should be enabled based on what user needs or objective is
  • Setup_instrument: select * from setup_instruments where name like '%wsrep%' or name like '%galera%'; Set the

ENABLED and TIME to YES. Check the MySQL guide on how to do this iteratively or startup.

+----------------------------------------------------------+---------+-------+ | NAME | ENABLED | TIMED | +----------------------------------------------------------+---------+-------+ | wait/synch/mutex/sql/LOCK_wsrep_ready | NO | NO | | wait/synch/mutex/sql/LOCK_wsrep_sst | NO | NO | | wait/synch/mutex/sql/LOCK_wsrep_sst_init | NO | NO | | wait/synch/mutex/sql/LOCK_wsrep_rollback | NO | NO | | wait/synch/mutex/sql/LOCK_wsrep_replaying | NO | NO | | wait/synch/mutex/sql/LOCK_wsrep_slave_threads | NO | NO | | wait/synch/mutex/sql/LOCK_wsrep_desync | NO | NO | | wait/synch/mutex/sql/LOCK_wsrep_thd | NO | NO | …... | stage/wsrep/wsrep: replicating commit | NO | NO | | stage/wsrep/wsrep: waiting on replaying | NO | NO | | stage/wsrep/wsrep: in pre-commit stage | NO | NO | | stage/wsrep/wsrep: preparing for TO isolation | NO | NO | | stage/wsrep/wsrep: replaying trx | NO | NO | | stage/wsrep/wsrep: applier idle | NO | NO | | stage/wsrep/wsrep: in rollback thread | NO | NO | | stage/wsrep/wsrep: aborter idle | NO | NO | | stage/wsrep/wsrep: aborter active | NO | NO |

slide-10
SLIDE 10

Monitoring threads

  • Threads

mysql> select * from threads where name like '%wsrep%' or name like '%galera%'; +-----------+-----------------------------------------+------------+----------------+------------------+------------------+----------------+---------------------+

  • -----------------+---------------------+------------------+------------------+------+--------------+---------+-----------------+--------------+

| THREAD_ID | NAME | TYPE | PROCESSLIST_ID | PROCESSLIST_USER | PROCESSLIST_HOST | PROCESSLIST_DB | PROCESSLIST_COMMAND | PROCESSLIST_TIME | PROCESSLIST_STATE | PROCESSLIST_INFO | PARENT_THREAD_ID | ROLE | INSTRUMENTED | HISTORY | CONNECTION_TYPE | THREAD_OS_ID | +-----------+-----------------------------------------+------------+----------------+------------------+------------------+----------------+---------------------+

  • -----------------+---------------------+------------------+------------------+------+--------------+---------+-----------------+--------------+

| 3 | thread/galera/THREAD_galera_service_thd | BACKGROUND | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | YES | YES | NULL | 28425 | | 4 | thread/galera/THREAD_galera_gcommconn | BACKGROUND | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | YES | YES | NULL | 28426 | | 5 | thread/galera/THREAD_galera_receiver | BACKGROUND | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | YES | YES | NULL | 28427 | | 6 | thread/sql/THREAD_wsrep_rollbacker | BACKGROUND | NULL | NULL | NULL | NULL | NULL | NULL | wsrep: aborter idle | NULL | NULL | NULL | YES | YES | NULL | 28428 | | 7 | thread/sql/THREAD_wsrep_applier | BACKGROUND | NULL | NULL | NULL | NULL | NULL | 14 | NULL | NULL | NULL | NULL | YES | YES | NULL | 28429 | | 35 | thread/sql/THREAD_wsrep_sst_donor | BACKGROUND | NULL | NULL | NULL | NULL | NULL | 4 | NULL | NULL | NULL | NULL | YES | YES | NULL | 28791 | +-----------+-----------------------------------------+------------+----------------+------------------+------------------+----------------+---------------------+

  • -----------------+---------------------+------------------+------------------+------+--------------+---------+-----------------+--------------+

6 rows in set (0.01 sec)

  • Thread-ID help us associate the events with respective thread.
  • Above snapshot was captured on DONOR node when SST was in progress.
slide-11
SLIDE 11

Monitoring Files

  • Files

mysql> select * from file_instances where file_name like '%wsrep%' or file_name like '%galera%'; +----------------------------------------------------------------------+--------------------------------------------+------------+ | FILE_NAME | EVENT_NAME | OPEN_COUNT | +----------------------------------------------------------------------+--------------------------------------------+------------+ | /opt/projects/codebase/pxc/installed/pxc57/pxc-node/dn1/galera.cache | wait/io/file/galera/FILE_galera_ringbuffer | 1 | +----------------------------------------------------------------------+--------------------------------------------+------------+ 1 row in set (0.00 sec)

  • I tried executing a bigger workload that causes creation of multiple gcache files.

mysql> select * from file_instances where event_name like '%wsrep%' or event_name like '%galera%'; +----------------------------------------------------------------------------+---------------------------------------------+--------

  • ---+

| FILE_NAME | EVENT_NAME | OPEN_COUNT | +----------------------------------------------------------------------------+---------------------------------------------+--------

  • ---+

| /opt/projects/codebase/pxc/installed/pxc57/pxc-node/dn1/galera.cache | wait/io/file/galera/FILE_galera_ringbuffer | 1 | | /opt/projects/codebase/pxc/installed/pxc57/pxc-node/dn1/gcache.page.000000 | wait/io/file/galera/FILE_galera_gcache_page| 1 | | /opt/projects/codebase/pxc/installed/pxc57/pxc-node/dn1/gcache.page.000001 | wait/io/file/galera/FILE_galera_gcache_page| 1 | +----------------------------------------------------------------------------+--------------------------------------------+---------

  • --+
slide-12
SLIDE 12

Monitoring Files

  • How do you use this data ?

○ For long running transaction often there could be spike in disk usage that is unexplainable because these temporary files are created and removed. You can trace it provided you are monitoring. ○ Gcache file instances are removed/purged once the set criteria is achieved. ○ If you often run long running transaction that can cause these then you can start monitoring and start planning things accordingly.

slide-13
SLIDE 13

Monitoring Stages

  • Understand what all operations are being done. Let work this out through

series of example:

EXAMPLE-1: Tried inserting 3K rows through sysbench. I was assuming it would be done as single commit but it turned out that it was done as 2 commit operations. Further investigation revealed that sysbench is uses 512K buffer for bulk loading.

| 46 | 11 | 11 | stage/wsrep/wsrep: replicating commit | wsrep_hton.cc:362 | 1486180763409000 | 1486180763915000 | 506000 | NULL | NULL | 10 | STATEMENT | | 46 | 12 | 12 | stage/wsrep/wsrep: in pre-commit stage | wsrep_hton.cc:486 | 1486181786634000 | 1486199728611000 | 17941977000 | NULL | NULL | 10 | STATEMENT | | 46 | 14 | 14 | stage/wsrep/wsrep: replicating commit | wsrep_hton.cc:362 | 1486220293439000 | 1486220293718000 | 279000 | NULL | NULL | 13 | STATEMENT | | 46 | 15 | 15 | stage/wsrep/wsrep: in pre-commit stage | wsrep_hton.cc:486 | 1486220421015000 | 1486222241027000 | 1820012000 | NULL | NULL | 13 | STATEMENT |

slide-14
SLIDE 14

Monitoring Stages

EXAMPLE-2: A said workload commit is generally done in x micros-seconds but one of the instance I observed it took x+delta mseconds and commit passed. What went wrong ? Was the commit replayed due to conflicting locks.

|35 |212 |212 | stage/wsrep/wsrep: preparing for TO isolation | wsrep_mysqld.cc:1731 | 64088912225000 | 64090048629000 | 1136404000 | NULL | NULL |211 | STATEMENT| |35 |214 |214 | stage/wsrep/wsrep: replicating commit | wsrep_hton.cc:362| 64142110739000 | 64142111495000 | 756000 | NULL | NULL |213 | STATEMENT| |35 |215 |215 | stage/wsrep/wsrep: in pre-commit stage| wsrep_hton.cc:486| 64142121356000 | 64142656271000 |534915000 | NULL | NULL |213 | STATEMENT| |35 |217 |217 | stage/wsrep/wsrep: replicating commit | wsrep_hton.cc:362| 64147276581000 | 64147277839000 |1258000 | NULL | NULL |216 | STATEMENT| |35 |218 |218 | stage/wsrep/wsrep: in pre-commit stage| wsrep_hton.cc:486| 64147293909000 | 64147978153000 |684244000 | NULL | NULL |216 | STATEMENT| |35 |225 |225 | stage/wsrep/wsrep: replicating commit | wsrep_hton.cc:362| 64169307624000 | 64169308833000 |1209000 | NULL | NULL |224 | STATEMENT| | 7 |1 |1 | stage/wsrep/wsrep: applying write set | wsrep_applier.cc:223 | 70055188484000 | 70055223360000 | 34876000 | NULL | NULL | NULL | NULL | |35 |226 |226 | stage/wsrep/wsrep: in pre-commit stage| wsrep_hton.cc:486| 64169332867000 | 77752698793000 | 13583365926000 | NULL | NULL |224 | STATEMENT| |35 |227 |227 | stage/wsrep/wsrep: replaying trx| wsrep_thd.cc:246 | 77753153241000 | 77753220922000 | 67681000 | NULL | NULL |224 | STATEMENT| |35 |228 |229 | stage/wsrep/wsrep: applying write set | wsrep_applier.cc:223 | 77753247274000 | 77753295894000 | 48620000 | NULL | NULL |229 | STATEMENT| | 7 |3 |3 | stage/wsrep/wsrep: updating rows| log_event.cc:12855 | 77753151479000 | 77753759600000 |608121000 | NULL | NULL |2 | STATEMENT| |35 |230 |230 | stage/wsrep/wsrep: updating rows| log_event.cc:12855 | 77753730247000 | 77754102408000 |372161000 | NULL | NULL |229 | STATEMENT| |35 |231 |231 | stage/wsrep/wsrep: committing | wsrep_applier.cc:288 | 77754150064000 | 77758755995000 | 4605931000 | NULL | NULL |229 | STATEMENT| | 7 |4 |4 | stage/wsrep/wsrep: applier idle | wsrep_applier.cc:255 | 77753839832000 | 77758913293000 | 5073461000 | NULL | NULL |2 | STATEMENT| | 7 |5 |5 | stage/wsrep/wsrep: committing | wsrep_applier.cc:288 | 77758913293000 | 77760707782000 | 1794489000 | NULL | NULL |2 | STATEMENT|

slide-15
SLIDE 15

Monitoring Stages

EXAMPLE-3: Issue conflicting transaction marking current active transaction aborted.

|35 |340 |340 | stage/wsrep/wsrep: preparing for TO isolation | wsrep_mysqld.cc:1731 | 930845989288000 | 930846682499000 | 693211000 | NULL | NULL |339 | STATEMENT| |35 |342 |342 | stage/wsrep/wsrep: replicating commit | wsrep_hton.cc:362| 940967846610000 | 940967847864000 | 1254000 | NULL | NULL |341 | STATEMENT| |35 |343 |343 | stage/wsrep/wsrep: in pre-commit stage| wsrep_hton.cc:486| 940967858337000 | 940969285384000 |1427047000 | NULL | NULL |341 | STATEMENT| | 7 |6 |6 | stage/wsrep/wsrep: applier idle | wsrep_applier.cc:300 |77760884088000 | 951560325752000 | 873799441664000 | NULL | NULL |2 | STATEMENT| | 7 |7 |8 | stage/wsrep/wsrep: applying write set | wsrep_applier.cc:223 | 951560325752000 | 951560395549000 |69797000 | NULL | NULL |8 | STATEMENT| | 6 |1 |1 | stage/wsrep/wsrep: aborter active | wsrep_thd.cc:465 | 951560836089000 | 951561258639000 | 422550000 | NULL | NULL | NULL | NULL | | 7 |9 |9 | stage/wsrep/wsrep: writing rows | log_event.cc:12550 | 951560698287000 | 951561357292000 | 659005000 | NULL | NULL |8 | STATEMENT| | 7 | 10 | 10 | stage/wsrep/wsrep: applier idle | wsrep_applier.cc:255 | 951561414711000 | 951561425594000 |10883000 | NULL | NULL |8 | STATEMENT| | 7 | 11 | 11 | stage/wsrep/wsrep: committing | wsrep_applier.cc:288 | 951561425594000 | 951569105219000 |7679625000 | NULL | NULL |8 | STATEMENT|

slide-16
SLIDE 16

Monitoring Mutexes/Cond-Variables

  • Use-case-1: DDL action will block all other DML actions till it is completed.

○ Most of the time these action are fired by an application and user keep wondering why things are taking time. ○ I tried something similar. Started sysbench workload and started to look at commit-monitor differences for given applier thread. Applier thread was busy applying/committing action and so the difference between 2 commit mutex was present in some range. ○ While sysbench was executing, I introduce a DDL statement (ALTER table using copy) that too a big table that takes time to complete. All DML operations are halted and so there is no action on commit-mutex front too. ○ My Graph suddenly dropped to 0 during this frame.

slide-17
SLIDE 17

Monitoring Mutexes/Cond-Variable

  • Use-case-2: Monitoring replication traffic using recvbuf.

Assume a 2 node cluster and a continuous workload is started on node-2 and node-1 is suppose to see replicated events at regular interval. Suddenly there are no replicated events for given period of time. Was it planned incident or something went wrong ? I can trace graph like these to keep check on replicated traffic.

mysql> select thread_id, event_name, timer_wait/1000000000000 from events_waits_current where event_name like '%COND%recv%' limit 10; +-----------+--------------------------------------------+--------------------------+ | thread_id | event_name | timer_wait/1000000000000 | +-----------+--------------------------------------------+--------------------------+ | 5 | wait/synch/cond/galera/COND_galera_recvbuf | 184.5678 | +-----------+--------------------------------------------+--------------------------+ 1 row in set (0.00 sec)

slide-18
SLIDE 18

Monitoring Mutexes/Cond-Variable

  • Use-case-3: How frequently is my workload being aborted by BFA

mysql> show status like 'wsrep_local_bf_aborts'; +-----------------------+-------+ | Variable_name | Value | +-----------------------+-------+ | wsrep_local_bf_aborts | 7 | +-----------------------+-------+

There were 7 aborts but if I need to understand how frequently then P_S analysis can help me.

mysql> select thread_id, event_name, timer_wait/1000000000000 from events_waits_history_long where event_name like '%COND%rollback%'; +-----------+-----------------------------------------+--------------------------+ | thread_id | event_name | timer_wait/1000000000000 | +-----------+-----------------------------------------+--------------------------+ | 7 | wait/synch/cond/sql/COND_wsrep_rollback | 12.0341 | | 7 | wait/synch/cond/sql/COND_wsrep_rollback | 11.2182 | | 7 | wait/synch/cond/sql/COND_wsrep_rollback | 11.4327 | | 7 | wait/synch/cond/sql/COND_wsrep_rollback | 11.5059 | | 7 | wait/synch/cond/sql/COND_wsrep_rollback | 11.4743 | | 7 | wait/synch/cond/sql/COND_wsrep_rollback | 179.4694 | | 7 | wait/synch/cond/sql/COND_wsrep_rollback | 12.0904 | +-----------+-----------------------------------------+--------------------------+

slide-19
SLIDE 19

Q&A

Do remember to “Rate this Talk”

Write to me at krunal.bauskar@percona.com Or DROP in at Percona Booth