Advanced Database Management Systems
Query Processing: Query Evaluation Alvaro A A Fernandes
School of Computer Science, University of Manchester
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 1 / 111
Advanced Database Management Systems Query Processing: Query - - PowerPoint PPT Presentation
Advanced Database Management Systems Query Processing: Query Evaluation Alvaro A A Fernandes School of Computer Science, University of Manchester AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 1 / 111 Outline Algorithmic
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 1 / 111
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 2 / 111
Algorithmic Strategies
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 3 / 111
Query Evaluation Strategies
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 4 / 111
Query Evaluation Strategies
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 5 / 111
Query Evaluation Strategies
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 6 / 111
Evaluating Relational-Algebraic Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 7 / 111
Evaluating Relational-Algebraic Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 8 / 111
Evaluating Relational-Algebraic Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 9 / 111
Evaluating Relational-Algebraic Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 10 / 111
Evaluating Relational-Algebraic Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 11 / 111
Evaluating Relational-Algebraic Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 12 / 111
Evaluating Relational-Algebraic Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 13 / 111
Evaluating Relational-Algebraic Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 14 / 111
Evaluating Relational-Algebraic Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 15 / 111
Evaluating Relational-Algebraic Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 16 / 111
Evaluating Relational-Algebraic Operators
◮ reading pages rather than tuples (i.e., we have r ∈ RPage and
◮ making use of buffers more wisely, e.g., if we can hold the smaller
◮ if we cannot hold R in its entirety, we can still use blocks into which we
◮ if there is an index on the join attributes for either R or S, we can
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 17 / 111
Evaluating Relational-Algebraic Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 18 / 111
Evaluating Relational-Algebraic Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 19 / 111
Evaluating Relational-Algebraic Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 20 / 111
Evaluating Relational-Algebraic Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 21 / 111
Evaluating Relational-Algebraic Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 22 / 111
Evaluating Relational-Algebraic Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 23 / 111
Evaluating Relational-Algebraic Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 24 / 111
Evaluating Relational-Algebraic Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 25 / 111
Evaluating Relational-Algebraic Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 26 / 111
Evaluating Relational-Algebraic Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 27 / 111
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 28 / 111
Metadata in the System Catalogue
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 29 / 111
Metadata in the System Catalogue
◮ The table name, the file name in which the table is stored and the
◮ The name and type of each attribute ◮ The name of every index on the table ◮ The integrity constraints on the table
◮ The name and organization (e.g., B+ tree) of the index ◮ The search-key attributes
◮ The name of the view ◮ The definition of (i.e., the query used to compute) the view AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 30 / 111
Metadata in the System Catalogue
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 31 / 111
Metadata in the System Catalogue
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 32 / 111
Cost Estimation
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 33 / 111
Cost Estimation
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 34 / 111
Cost Estimation
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 35 / 111
Cost Estimation
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 36 / 111
Cost Estimation
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 37 / 111
Cost Estimation
◮ If B(R) < M then I/O cost = B(R) + B(S). ◮ In general, the smaller of the operand relations is used in the inner loop
◮ If only one buffer is available to each relation, then I/O cost =
◮ The outer relation, S, is read only once, but the inner relation, R, is
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 38 / 111
Cost Estimation
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 39 / 111
Cost Estimation
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 40 / 111
Cost Estimation
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 41 / 111
Cost Estimation
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 42 / 111
Cost Estimation
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 43 / 111
Cost Estimation
◮ S = σθ1∧θ2(R): Given the splitting laws, this can be treated as a
◮ The effect is to obtain an overall selectivity factor by multiplying the
◮ S = σθ1∨θ2(R): One possibility is to assume that no tuple satisfies
◮ When multiplied with T(R), this could yield a cardinality estimate
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 44 / 111
Cost Estimation
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 45 / 111
Cost Estimation
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 46 / 111
Cost Estimation
◮ If Y is an attribute appearing in several relations, then
◮ As a consequence, if V (R, Y ) ≤ V (S, Y ) then every
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 47 / 111
Cost Estimation
1 V (S,Y ) of joining with a given
T(S) V (S,Y )
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 48 / 111
Cost Estimation
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 49 / 111
Cost Estimation
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 50 / 111
Cost Estimation
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 51 / 111
Cost Estimation
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 52 / 111
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 53 / 111
Cost-Based Plan Selection
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 54 / 111
Cost-Based Plan Selection
◮ M is normally a configuration parameter. ◮ B(R) is easily computed from the information on where/how a relation
◮ T(R) is either stored explicitly or can be estimated (as exemplified
◮ V (R, A) (like T(R)) can be computed in a single scan through a
◮ To avoid scanning the complete relation, V (R, A) may be estimated by
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 55 / 111
Generating and Ranking Logical QEPs
◮ Structuring the process by which transformations are applied so that it
◮ For example, that a logical QEP with likely smaller intermediate results
◮ Appealing to and applying transformations that are consistent with the
◮ For example, that, by pushing selections and projections to lie as close
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 56 / 111
Generating and Ranking Logical QEPs
Table T(R) V(R,A) Usable 5 V(Usable,pltype) = 3 V(Usable,flid) = 4 Certified 4 V(Certified,planetype) = 2 Flights 4 V(Flights,fltno) = 4 V(Flights,to) = 4
(Usable ⊲ ⊳pltype=planetype Certified) ⊲ ⊳flid=fltno (σto=JFK (Flights))
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 57 / 111
Generating and Ranking Logical QEPs
◮ The logical optimizer generates alternative tree shapes by applying
◮ The physical optimizer then assigns physical operators for their logical
◮ The logical optimizer works in a purely heuristic manner (e.g., in
◮ The physical optimizer then considers alternative join orderings and
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 58 / 111
Selecting Join Orders
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 59 / 111
Selecting Join Orders
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 60 / 111
Selecting Join Orders
Join R ⊲ ⊳ S R ⊲ ⊳ T R ⊲ ⊳ U S ⊲ ⊳ T S ⊲ ⊳ U T ⊲ ⊳ U Size 5,000 20,000 10,000 2,000 40,000 1,000
Join (T ⊲ ⊳ U) ⊲ ⊳ R (T ⊲ ⊳ U) ⊲ ⊳ S Size 10,000 2,000
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 61 / 111
Choosing Physical Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 62 / 111
Choosing Physical Operators
◮ the anticipated sizes of operand collections ◮ the amount of available memory ◮ the availability of indexes ◮ whether or not the data is sorted
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 63 / 111
Choosing Physical Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 64 / 111
Choosing Physical Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 65 / 111
Choosing Physical Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 66 / 111
Choosing Physical Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 67 / 111
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 68 / 111
Historical Background
◮ Too much hope was placed on solving the problem with specialist
◮ Key software mechanisms were not yet widely available (e.g.,
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 69 / 111
Historical Background
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 70 / 111
Historical Background
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 71 / 111
Historical Background
◮ Queries are a composition of a small collection of semantically
◮ The relational algebra is closed: each operator consumes one or more
◮ Many operations are non-blocking (i.e., can produce a result based only
◮ Two forms of parallelism are made possible: pipelined and
◮ These are also referred, in the database literature, as inter-operator
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 72 / 111
Historical Background
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 73 / 111
Historical Background
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 74 / 111
Historical Background
◮ provide more total power at a lower price, ◮ have a modular architecture that allows for incremental growth, and ◮ employ commodity components. AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 75 / 111
Historical Background
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 76 / 111
Historical Background
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 77 / 111
Parallelism Goals and Metrics
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 78 / 111
Parallelism Goals and Metrics
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 79 / 111
Parallelism Goals and Metrics
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 80 / 111
Parallelism Goals and Metrics
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 81 / 111
Parallelism Goals and Metrics
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 82 / 111
Parallelism Goals and Metrics
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 83 / 111
Parallelism Goals and Metrics
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 84 / 111
Parallel Architectures
◮ shared-memory designs ◮ shared-disk designs ◮ shared-nothing designs AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 85 / 111
Parallel Architectures
◮ The bandwidth of the interconnect
◮ Severe shared-resource interference
◮ Cache hit rates must be high, which is
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 86 / 111
Parallel Architectures
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 87 / 111
Parallel Architectures
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 88 / 111
Parallel Architectures
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 89 / 111
Parallel Architectures
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 90 / 111
Parallel Architectures
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 91 / 111
Parallel Architectures
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 92 / 111
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 93 / 111
Data Partitioning Approaches
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 94 / 111
Data Partitioning Approaches
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 95 / 111
Data Partitioning Approaches
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 96 / 111
Data Partitioning Approaches
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 97 / 111
Data Partitioning Approaches
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 98 / 111
Data Partitioning Approaches
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 99 / 111
Parallelizing Relational Operators
◮ a shared-nothing architecture, and ◮ operating system support for at least reliable datagrams and
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 100 / 111
Parallelizing Relational Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 101 / 111
Parallelizing Relational Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 102 / 111
Parallelizing Relational Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 103 / 111
Parallelizing Relational Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 104 / 111
Parallelizing Relational Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 105 / 111
Parallelizing Relational Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 106 / 111
Parallelizing Relational Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 107 / 111
Parallelizing Relational Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 108 / 111
Parallelizing Relational Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 109 / 111
Parallelizing Relational Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 110 / 111
Parallelizing Relational Operators
AAAF (School of CS, Manchester) Advanced DBMSs 2012-2013 111 / 111