BASEL BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENEVA HAMBURG COPENHAGEN LAUSANNE MUNICH STUTTGART VIENNA ZURICH
Query Optimizer MySQL vs. PostgreSQL Percona Live, Santa Clara - - PowerPoint PPT Presentation
Query Optimizer MySQL vs. PostgreSQL Percona Live, Santa Clara - - PowerPoint PPT Presentation
Query Optimizer MySQL vs. PostgreSQL Percona Live, Santa Clara (USA), 24 April 2018 Christian Antognini @ChrisAntognini antognini.ch/blog BASEL BERN BRUGG DSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENEVA HAMBURG COPENHAGEN
@ChrisAntognini
Query Optimizer – MySQL vs. PostgreSQL 2 2018-04-24
Senior principal consultant, trainer and partner at Trivadis christian.antognini@trivadis.com http://antognini.ch Focus: get the most out of database engines Logical and physical database design Query optimizer Application performance management Author of Troubleshooting Oracle Performance (Apress, 2008/14) OakTable Network, Oracle ACE Director
Agenda
Query Optimizer – MySQL vs. PostgreSQL 3 2018-04-24
- 1. Introduction
- 2. Configuration
- 3. Statistics about Data
- 4. Data Dictionary Metadata
- 5. Single-Table Access Paths
- 6. Joins and Sub-queries
- 7. Conclusion
- 8. References
Query Optimizer – MySQL vs. PostgreSQL 4 2018-04-24
Introduction
Compared Products
Query Optimizer – MySQL vs. PostgreSQL 5 2018-04-24
MySQL Community Server 8.0.11 Release date: 19 April 2018 Only the InnoDB engine is covered PostgreSQL 10.3 Release date: 1 March 2018
Terminology
Query Optimizer – MySQL vs. PostgreSQL 6 2018-04-24
In PostgreSQL the query optimizer is called planner
Disclaimer
Query Optimizer – MySQL vs. PostgreSQL 7 2018-04-24
No performance tests were performed No comparison between the execution plans generated by the two query
- ptimizers were performed
To compare and to evaluate the two query optimizers only the availability of key features and the ability of the query optimizer to correctly recognize common data patterns was considered
Aim of the Query Optimizer
Query Optimizer – MySQL vs. PostgreSQL 8 2018-04-24
SQL is a declarative language SQL doesn’t specify how to access data, only which data to process The strategy used to access data is delegated to the query optimizer The aim of the query optimizer is to find out the optimal way to execute a given SQL statement A B
Inputs Considered to Produce an Execution Plan
Query Optimizer – MySQL vs. PostgreSQL 9 2018-04-24
Query Optimizer SQL statement Execution plan Configuration Metadata Object statistics Others?
How Is Data in Tables Stored?
Query Optimizer – MySQL vs. PostgreSQL 10 2018-04-24
InnoDB uses a B-tree index Heap table
Query Optimizer – MySQL vs. PostgreSQL 11 2018-04-24
Controlling the Query Optimizer
Configuration
Query Optimizer – MySQL vs. PostgreSQL 12 2018-04-24
Three system variables control the behavior of the query optimizer Limit the number of evaluated plans (2) Control specific features (1 parameter for 20 features) The default (system) values can be
- verwritten at session level and at
the statement level Two dozen of parameters control the behavior of the query optimizer Limit the number of evaluated plans (2) Control specific features (15) Control the genetic optimizer (7) The default (system) values can be
- verwritten at session level
Configuration – Cost Model
Query Optimizer – MySQL vs. PostgreSQL 13 2018-04-24
A cost model database contains cost estimate information for a number of
- perations (8)
The default values can be changed at the system level only A number of parameters provide cost estimate information for a number of
- perations (11)
The default (system) values can be
- verwritten at session level
Hints
Query Optimizer – MySQL vs. PostgreSQL 14 2018-04-24
SELECT statement modifiers (4) Impact statement syntax Index hints (3) Impact statement syntax Cause error when index missing Optimizer hints (21) Similar to Oracle Database hints Global, query block and object-level Cause warning when syntax is wrong (Available in EDB Advanced Server)
hints_modifiers.sql hints_index.sql hints_optimizer.sql
Query Optimizer – MySQL vs. PostgreSQL 15 2018-04-24
Statistics about Data
Gathering Statistics
Query Optimizer – MySQL vs. PostgreSQL 16 2018-04-24
The ANALYZE statement gathers and, by default, stores statistics in the data dictionary By default, an asynchronous automatic statistics recalculation takes place Persistent (default) as well as non- persistent statistics exist The ANALYZE statement gathers and stores statistics in the data dictionary By default, the autovacuum daemon recalculate statistics of modified tables
Table Statistics
Query Optimizer – MySQL vs. PostgreSQL 17 2018-04-24
Clustered index size (pages) Number of rows Table size (pages) Number of rows Number of pages marked all-visible
statistics.sql
Column Statistics
Query Optimizer – MySQL vs. PostgreSQL 18 2018-04-24
Data distribution (optional) Including fraction of entries that are null Fraction of values that are null Average column width (bytes) Number of distinct values Statistical correlation between physical and logical row ordering Data distribution (optional) Most common values and their frequency (optional)
statistics.sql
Cross-Column Statistics
Query Optimizer – MySQL vs. PostgreSQL 19 2018-04-24
Functional dependencies (optional) Number of distinct values (optional)
statistics.sql
Index Statistics
Query Optimizer – MySQL vs. PostgreSQL 20 2018-04-24
Index size (pages) Number of leaf pages Number of distinct keys Several values are stored E.g. for index “a,b,c” “a”, “a,b”, “a,b,c”, “a,b,c,PK” Index size (pages) Number of indexed rows
statistics.sql
Query Optimizer – MySQL vs. PostgreSQL 21 2018-04-24
Data Dictionary Metadata
Constraints – Primary Key and Unique Key
Query Optimizer – MySQL vs. PostgreSQL 22 2018-04-24
Because of the clustered index, PK has precedence over other indexes Predicates based on UK take precedence over other indexes Equality predicates based on PK/UK are probed No particular precedence is given to predicates based on PK/UK Statistical correlation between physical and logical row ordering determines which index is used
constraints_pk_uk.sql
Constraints – Foreign Key
Query Optimizer – MySQL vs. PostgreSQL 23 2018-04-24
No evidence about the usage of FK to avoid unnecessary loss-less joins has been observed No evidence about the usage of FK to avoid unnecessary loss-less joins has been observed
constraints_fk.sql
Constraints – NOT NULL
Query Optimizer – MySQL vs. PostgreSQL 24 2018-04-24
NOT NULL constraints are used to verify the validity of predicates By default the usage of NOT NULL constraints to verify the validity of predicates is enabled for specific cases only constraint_exclusion = partition Statistics are used instead
constraints_not_null.sql
Constraints – CHECK
Query Optimizer – MySQL vs. PostgreSQL 25 2018-04-24
No evidence about the usage of CHECK constraints to verify the validity of predicates has been
- bserved
Statistics are used instead By default the usage of CHECK constraints to verify the validity of predicates is enabled for specific cases only constraint_exclusion = partition Statistics are used instead
constraints_check.sql
Query Optimizer – MySQL vs. PostgreSQL 26 2018-04-24
Single-Table Access Paths
Available Index Types
Query Optimizer – MySQL vs. PostgreSQL 27 2018-04-24
Supported index types B-tree (default) R-tree (for spatial indexes) For string columns, indexes can be created on the leading part of column values B-tree indexes store NULL values Support for invisible indexes Supported index types B-tree (default) Hash, GiST, SP-GiST, GIN, BRIN Indexes can be created on expressions as well as on a subset
- f the rows
B-tree indexes store NULL values
indexes_expression.sql indexes_partial.sql indexes_invisible.sql indexes_prefix.sql indexes_nulls.sql
Optimization of ORDER BY, MIN and MAX
Query Optimizer – MySQL vs. PostgreSQL 28 2018-04-24
B-tree indexes can be used to
- ptimize ORDER BY, MIN and MAX
Index scans can be performed in both directions Keys are stored according to the specified order No NULLS FIRST/LAST support B-tree indexes can be used to
- ptimize ORDER BY, MIN and MAX
Index scans can be performed in both directions Keys are stored according to the specified order NULLS FIRST/LAST supported
indexes_order_by.sql indexes_min_max.sql
Merging Indexes
Query Optimizer – MySQL vs. PostgreSQL 29 2018-04-24
Two or more B-tree indexes can be merged at runtime to evaluate multiple predicates combined with AND or OR When appropriate, B-tree indexes are dynamically converted to bitmaps in memory One utilization of this feature is to merge indexes to evaluate multiple predicates combined with AND or OR
indexes_merge.sql
(Declarative) Partitioning
Query Optimizer – MySQL vs. PostgreSQL 30 2018-04-24
Available methods: Multi-column range and list Single-column hash Sub-partitioning by hash Only local indexes (PK/UK must contain partition key) FK not supported Partition pruning Available methods: Multi-column range Single-column list Sub-partitioning Only local indexes (no PK/UK) FK not supported Partition exclusion (pruning)
partitioning_range.sql partitioning_list.sql partitioning_hash.sql
Query Optimizer – MySQL vs. PostgreSQL 31 2018-04-24
Joins and Sub-queries
Available Kind of Joins
Query Optimizer – MySQL vs. PostgreSQL 32 2018-04-24
Available join methods Nested loops join (Hash join available in MariaDB) Bushy plans are considered only when no other possibility exists Full outer joins are not supported Available join methods Nested loops join Hash join Merge join Bushy plans are considered Full outer joins are supported and
- ptimized with hash/merge joins
joins_methods.sql joins_syntax.sql joins_bushy.sql
Sub-queries in WHERE Clause
Query Optimizer – MySQL vs. PostgreSQL 33 2018-04-24
Simple sub-queries that are not correctly optimized were observed For optimal performance a rewrite might be necessary Problematic cases observed Correlated NOT IN Correlated (NOT) EXISTS Simple sub-queries that are not correctly optimized were observed For optimal performance a rewrite might be necessary Problematic case observed Correlated (NOT) IN Uncorrelated NOT IN
joins_subqueries.sql
Query Optimizer – MySQL vs. PostgreSQL 34 2018-04-24
Conclusion
Summary
Query Optimizer – MySQL vs. PostgreSQL 35 2018-04-24
Good configuration capabilities Hints available Fairly good object statistics Metadata only partially used Good indexing capabilities, average partition capabilities Limited join capabilities Good configuration capabilities Hints missing Very good object statistics Metadata only partially used Good indexing capabilities, average partition capabilities Good join capabilities
Query Optimizer – MySQL vs. PostgreSQL 36 2018-04-24
Core Messages
The query optimizer of PostgreSQL is more advanced than the one of MySQL In general, the query optimizer of MySQL can only do a good job with transactional loads; the one of PostgreSQL is also suitable for analytical loads
Rate My Session
Query Optimizer – MySQL vs. PostgreSQL 37 2018-04-24
Query Optimizer – MySQL vs. PostgreSQL 38 2018-04-24
References
References (1)
Query Optimizer – MySQL vs. PostgreSQL 39 2018-04-24
MySQL 8.0 Reference Manual The Unofficial MySQL 8.0 Optimizer Guide MySQL Internals MySQL Server Blog PostgreSQL 10.1 Documentation Planner source code “readme” PostgreSQL Wiki
References (2)
Query Optimizer – MySQL vs. PostgreSQL 40 2018-04-24
The verification scripts I wrote are available on GitHub How Well a Query Optimizer Handles Subqueries?
Q&A
2018-04-24 Query Optimizer – MySQL vs. PostgreSQL 41