PMM 101
Santa Clara, California | April 23rd – 25th, 2018
Michael Coburn, Product Manager
Your Presenter
- Product Manager for PMM (also Percona Toolkit and Percona
XtraBackup)
- With Percona for 6 years through 6 different roles
- Consultant, Managing Consultant, Principal Architect, Technical Account
Manager, Principal Support Engineer
- http://bit.ly/JoinPerconaLiveSlack
#monitoring-mysql-perf Percona Live App
Why does this talk exist?
- Troubleshooting performance issues can be a bit tricky, especially when
you’re given a broad statement that the database is slow.
- Learn to direct your attention to the correct moving pieces and fix what
needs your attention.
- Learn how all this is done at Percona, what we monitor and track, and the
tools we use.
What is PMM
- Free, Open Source database troubleshooting and performance
optimization platform for MySQL and MongoDB
○ We also support:
■ ProxySQL
■ Amazon RDS MySQL
■ Amazon Aurora MySQL
- Runs in your secure environment (this is not a SaaS product!), on your
equipment
- Secured with SSL between client and server
PMM Distribution Methods
- Docker
○ docker pull percona/pmm-server:latest
- Virtual Appliance
○ Supports VMware, Red Hat Virtualization, Microsoft System Center
○ … and VirtualBox!
- AWS Marketplace
○ Production-ready AMI running in EC2
○ Available since November 2017
AWS Marketplace
- Deploy directly to EC2
- Running CentOS 7
Search for "pmm" or "Percona Monitoring and Management"
https://aws.amazon.com/marketplace/pp/B077J7FYGX
PMM Architecture
- pmm-client
○ mysqld_exporter
○ node_exporter
○ qan-agent
- PMM Server
○ Query Analytics
■ QAN API
■ QAN App
○ Metrics Monitor
■ Prometheus
■ Grafana
■ Consul
PMM Server Components
- Metrics Monitor
○ Prometheus
■ Timeseries database
■ Powerful PromQL query language
○ Grafana
■ Visualization platform
○ Consul
■ Tracks which services are available to be scraped by Prometheus
- Query Analytics
○ View query performance in real-time
○ Aggregated by the queries consuming the most time in MySQL
○ Query drill-down for individual query performance
■ Rows read, Rows scanned, Query time, Query count
■ InnoDB statistics (Percona Server for MySQL only)
pmm-client Components
- pmm-admin
○ Command-line tool for client management
- node_exporter
○ Agent that exports Linux metrics
- mysqld_exporter
○ Agent that exports MySQL server metrics
- qan-agent
○ Agent that collects query metrics from MySQL Slow Log or PERFORMANCE_SCHEMA
Prometheus Data Collection
- Prometheus server asks Consul which services & instances to query
○ by IP address and port
○ Example: curl https://192.168.56.3:42000/metrics
- Prometheus exporter performs data collection upon curl request
- Exporter generates text exposed via web server at :42002/metrics
[root@ps57r ~]# curl -s -k https://10.91.136.33:42002/metrics-hr |grep mysql | head -8
# HELP mysql_exporter_collector_duration_seconds Collector time duration.
# TYPE mysql_exporter_collector_duration_seconds gauge
mysql_exporter_collector_duration_seconds{collector="collect.global_status"} 0.019977679
mysql_exporter_collector_duration_seconds{collector="collect.info_schema.innodb_metrics"} 0.006224816
mysql_exporter_collector_duration_seconds{collector="connection"} 2.1584e-05
# HELP mysql_exporter_hr_last_scrape_error Whether the last scrape of metrics from MySQL resulted in an error (1 for error, 0 for success).
# TYPE mysql_exporter_hr_last_scrape_error gauge
mysql_exporter_hr_last_scrape_error 0
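The exporter output shown in the curl example is plain text, one sample per line, so it is easy to inspect by hand or with a few lines of code. The following is a minimal sketch of reading such lines into name, labels, and value. The `parse_metrics` helper and its regular expressions are illustrative assumptions, not part of PMM or Prometheus, and they ignore edge cases such as escaped quotes and timestamps.

```python
import re

# Sample lines copied from the curl output above (one sample per line).
SAMPLE = """\
# HELP mysql_exporter_collector_duration_seconds Collector time duration.
# TYPE mysql_exporter_collector_duration_seconds gauge
mysql_exporter_collector_duration_seconds{collector="collect.global_status"} 0.019977679
mysql_exporter_collector_duration_seconds{collector="connection"} 2.1584e-05
mysql_exporter_hr_last_scrape_error 0
"""

# Metric name, optional {labels}, then the sample value.
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels_dict, value) for each sample line."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore anything this simplified parser cannot read
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


samples = parse_metrics(SAMPLE)
for name, labels, value in samples:
    print(name, labels, value)
```

Each tuple corresponds to one time series sample that Prometheus would store after a scrape.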
Query Analytics (QAN)
Examining queries in depth
Query Analytics Dashboard
Query Analytics Overview
- Query Abstract
○ Query pattern with placeholders
- ID
○ Unique fingerprint, used for query group by
- Load
○ Grand Total Time - percentage of time that MySQL server spent executing the query
- Count
○ QPS, total count during window, % of total
- Latency
○ Min, Med, Avg, P95, Max
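To make the columns above concrete, here is a rough sketch of how queries could be grouped by abstract and summarized. It is an illustration only: the `fingerprint` function is a deliberately simplified stand-in, not the algorithm QAN uses, the abstract itself serves as the grouping key in place of the ID, and load is approximated as each group's share of total query time in the sample. The event data is made up.

```python
import math
import re
import statistics


def fingerprint(sql):
    """Very rough query abstract: replace literals with '?' placeholders."""
    sql = sql.strip().lower()
    sql = re.sub(r"'[^']*'", "?", sql)          # quoted strings
    sql = re.sub(r"\b\d+(\.\d+)?\b", "?", sql)  # standalone numbers
    sql = re.sub(r"\s+", " ", sql)              # collapse whitespace
    return sql


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted list."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def summarize(events, window_seconds):
    """Group (sql, query_time_seconds) events by fingerprint and compute stats."""
    groups = {}
    for sql, query_time in events:
        groups.setdefault(fingerprint(sql), []).append(query_time)
    grand_total = sum(t for times in groups.values() for t in times)
    report = {}
    for abstract, times in groups.items():
        times.sort()
        report[abstract] = {
            "count": len(times),
            "qps": len(times) / window_seconds,
            "load_pct": 100 * sum(times) / grand_total,
            "min": times[0],
            "med": statistics.median(times),
            "avg": statistics.mean(times),
            "p95": percentile(times, 95),
            "max": times[-1],
        }
    return report


# Made-up events: (query text, query time in seconds).
events = [
    ("SELECT * FROM users WHERE id = 1", 0.002),
    ("SELECT * FROM users WHERE id = 42", 0.004),
    ("UPDATE users SET name = 'bob' WHERE id = 7", 0.030),
]
report = summarize(events, window_seconds=60)
for abstract, stats in report.items():
    print(abstract, stats)
```

The two SELECT statements collapse into one abstract because they differ only in the literal value.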
PERFORMANCE_SCHEMA
Slow Log - Percona Server Enhanced
EXPLAIN - Table and JSON
CREATE TABLE, TABLE STATUS, and INDEXES
Server Summary Information
- Collects and displays per Server:
○ pt-summary
○ pt-mysql-summary
- _PMM System Summary
- Summary can be downloaded from the UI
Metrics Monitor
Eye candy
Grafana in a Nutshell
- Open Source data visualisation tool
- Popular datasources
○ Prometheus
○ CloudWatch
○ Graphite
○ Elasticsearch
- Templated Variables
○ Define your graph metrics, and let the hosts get filled in automatically
○ GREAT for large, dynamic environments where hosts are considered ephemeral
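The idea behind templated variables can be sketched in a few lines: the graph query is written once with a placeholder, and the selected hosts are filled in afterwards. This is a simplified illustration, not Grafana's actual interpolation code. The `expand_template` helper, the `$host` variable name, and the panel query are assumptions made for the example.

```python
import re


def expand_template(expr, variables):
    """Replace $name placeholders with a regex alternation of the chosen values."""
    def substitute(match):
        values = variables[match.group(1)]
        return "|".join(re.escape(v) for v in values)
    return re.sub(r"\$(\w+)", substitute, expr)


# Hypothetical panel query; the =~ matcher lets several hosts match at once.
panel_query = 'node_load1{instance=~"$host"}'

hosts = ["ps57", "ps57r"]
expanded = expand_template(panel_query, {"host": hosts})
print(expanded)
```

When a host appears or disappears, only the list of values changes; the panel definition stays the same.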
Prometheus revisited
- Timeseries database - metric name + key/value pairs
○ mysql_global_variables_innodb_buffer_pool_instances{instance="ps57",job="mysql"} = 8
○ mysql_slave_status_slave_io_running{instance="ps57r",job="mysql",master_host="10.91.136.32",master_uuid="9809315d-4d97-11e6-b85e-0007cb03dc86"} = 1
- Flexible query language - PromQL
- Collection of metrics based on HTTP pull
- Targets identified via service discovery or static configuration files
○ We're using Consul in PMM for service discovery
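As a small illustration of the "metric name + key/value pairs" model, the sketch below builds a PromQL selector from a name and labels, then the URL for an instant query against the Prometheus HTTP API (`/api/v1/query`). The `selector` and `query_url` helpers and the server address are hypothetical; the path under which Prometheus is reachable on a given PMM Server may differ.

```python
from urllib.parse import urlencode


def selector(metric, **labels):
    """Build a PromQL instant-vector selector from a metric name and labels."""
    if not labels:
        return metric
    pairs = ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    return metric + "{" + pairs + "}"


def query_url(base_url, promql):
    """URL for the Prometheus HTTP API instant-query endpoint."""
    params = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


promql = selector(
    "mysql_global_variables_innodb_buffer_pool_instances",
    instance="ps57",
    job="mysql",
)
print(promql)

# Hypothetical server address; adjust to wherever Prometheus is reachable.
url = query_url("http://prometheus.example:9090", promql)
print(url)
```

No request is sent here; the URL could be passed to curl or any HTTP client.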
How can I...
- Compare servers to each other
○ Cross Server graphs
- Show behaviour now() compared to past period
○ Trends Overview dashboard
- At a glance MySQL + in-depth
○ MySQL Overview, InnoDB, InnoDB Advanced
- Table statistics*
○ Largest tables by rows and size, total DB size, tables by rows read and changed, auto_increment usage (about to hit the limit?)
- User statistics*
○ Top users by connection count, network usage, rows read/changed
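The "about to hit the limit?" question for auto_increment usage comes down to simple arithmetic against the maximum value of the column type. The sketch below shows that calculation with the standard MySQL integer limits. The `auto_increment_usage` helper and the sample value are made up for illustration and are not how the PMM dashboard is implemented.

```python
# Maximum values for MySQL integer column types: (signed, unsigned).
INT_LIMITS = {
    "tinyint": (127, 255),
    "smallint": (32767, 65535),
    "mediumint": (8388607, 16777215),
    "int": (2147483647, 4294967295),
    "bigint": (9223372036854775807, 18446744073709551615),
}


def auto_increment_usage(next_value, column_type, unsigned=False):
    """Percentage of the column's range already consumed by AUTO_INCREMENT."""
    signed_max, unsigned_max = INT_LIMITS[column_type]
    limit = unsigned_max if unsigned else signed_max
    return 100 * next_value / limit


# Made-up AUTO_INCREMENT value for a signed INT primary key.
usage = auto_increment_usage(1932735283, "int")
print(f"{usage:.1f}% of signed INT range used")
```

The same value on an unsigned INT column would consume roughly half as much of the range, which is why the column definition matters.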
Annotations
- Visualize Application Events in PMM
○ pmm-admin annotate "Application deployment v1.3"
Alerting
- Alerting
○ Cannot use Templated Variables
○ Instead, replace with string constants for instance name
Almost the end
Parting thoughts
Advice
- PMM Metrics retention is 30 days
○ We are looking at options to present a longer history
- mysql:metrics are polled at 1s, 5s, and 60s resolutions, and linux:metrics
are polled every 1s
○ On high-latency links you might need to tune scrape_interval up
- Don't skimp on resources
○ Prometheus in particular needs a lot of CPU cores and fast disks, in order to sequence scrape data before writing chunks to disk
- Consider disabling some mysqld_exporter features to minimise
performance impact
○ --disable-tablestats, --disable-processlist
- Keep queries in the database (security)
○ --disable-queryexamples
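A quick back-of-the-envelope calculation shows why resolution and retention drive resource needs. The sketch below counts how many samples a single time series accumulates over the 30-day retention at each of the resolutions mentioned above. It is plain arithmetic and says nothing about how Prometheus stores or compresses the data.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_series(resolution_seconds, retention_days=30):
    """Number of samples one time series accumulates over the retention window."""
    return retention_days * SECONDS_PER_DAY // resolution_seconds


for resolution in (1, 5, 60):
    count = samples_per_series(resolution)
    print(f"{resolution:>2}s resolution: {count:,} samples per series")
```

Multiply by the number of series per host and the number of hosts to get a feel for the total volume.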
PMM Roadmap
- Prometheus 2.x - faster, more instances per PMM Server
- PostgreSQL Support
- MySQL -> ClickHouse for QAN datastore
○ faster, aggregation across all servers, new filtering and sorting options
- Long term metrics storage (past 30 days)
- One-click ticket submission*
- Standardised data collection for tickets*
- Any feedback of what you'd like to see in PMM?
* for Percona Subscribers (Customers) only
Thank You Sponsors!!
Questions?
- Michael Coburn michael.coburn@percona.com
- Percona is looking for MongoDB and MySQL rockstars! Be sure to stop
by Percona’s booth.
- Are there any areas or benchmarks you would like Percona to cover in
blog posts? Any features or tools you think we should focus on to meet the community's needs?