chapter 8
play

Chapter 8 Evaluation of Relational Operators Implementing the - PowerPoint PPT Presentation

Evaluation of Relational Operators Torsten Grust Chapter 8 Evaluation of Relational Operators Implementing the Relational Algebra Relational Query Engines Operator Selection Architecture and Implementation of Database Systems Selection (


  1. Evaluation of Relational Operators Torsten Grust Chapter 8 Evaluation of Relational Operators Implementing the Relational Algebra Relational Query Engines Operator Selection Architecture and Implementation of Database Systems Selection ( σ ) Summer 2016 Selectivity Conjunctive Predicates Disjunctive Predicates Projection ( π ) Join ( � ) Nested Loops Join Block Nested Loops Join Index Nested Loops Join Sort-Merge Join Hash Join Operator Pipelining Volcano Iterator Model Torsten Grust Wilhelm-Schickard-Institut für Informatik Universität Tübingen 1

  2. Evaluation of Relational Query Engines Relational Operators • In many ways, a DBMS’s query engine compares to virtual Torsten Grust machines ( e.g. , the Java VM): Relational Query Engine Virtual Machine (VM) Operators of the relational Primitive VM instructions Relational Query algebra Engines Operator Selection Operates over streams of Acts on object representa- Selection ( σ ) rows tions Selectivity Operator network Sequential program (with Conjunctive Predicates Disjunctive Predicates (tree/DAG) branches, loops) Projection ( π ) Several equivalent variants Compact instruction set Join ( � ) of an operator Nested Loops Join Block Nested Loops Join Index Nested Loops Join Sort-Merge Join Hash Join ✛ Equivalent operator variants Operator Pipelining Volcano Iterator Model Instead of a single � operator, a typical DBMS query engine features equivalent variants � ′ , � ′′ , . . . . What would equivalent mean in the context of the relational model? 2

  3. Evaluation of Operator Variants Relational Operators Torsten Grust • Specific operator variants may be tailored to exploit physical properties of its input or the current system state: 1 The presence or absence of indexes on the input file(s), Relational Query 2 the sortedness of the input file(s), Engines Operator Selection 3 the size of the input file(s), Selection ( σ ) 4 the available space in the buffer pool , Selectivity Conjunctive Predicates 5 the buffer replacement policy , Disjunctive Predicates 6 . . . Projection ( π ) Join ( � ) Nested Loops Join Physical operators Block Nested Loops Join Index Nested Loops Join The variants ( � ′ , � ′′ ) are thus referred to physical operators . Sort-Merge Join Hash Join They implement the logical operators of the relational algebra. Operator Pipelining Volcano Iterator Model • The query optimizer is in charge to perform optimal (or, reasonable) operator selection (much like the instruction selection phase in a programming language compiler). 3

  4. � � � � � � � � � � � Evaluation of Operator Selection Relational Operators Torsten Grust Initial, logical operator network (“plan”) R � π � � Relational Query Engines � � sort S Operator Selection � Selection ( σ ) � σ Selectivity T Conjunctive Predicates Disjunctive Predicates Projection ( π ) Physical plan with (un)sortedness annotations ( u / s ) Join ( � ) Nested Loops Join Block Nested Loops Join Index Nested Loops Join R s � � ′ Sort-Merge Join u � π ′ Hash Join u u � Operator Pipelining u � sort ′ s � ′ S u Volcano Iterator Model � ′ u s � σ ′′ s T 4

  5. � � � � � � � � � � Evaluation of Plan Rewriting Relational Operators Torsten Grust Physical plan with (un)sortedness annotations ( u / s ) R s � � ′ u � π ′ u u � u � sort ′ s � ′ S u Relational Query � ′ u Engines s � σ ′′ Operator Selection T s Selection ( σ ) Selectivity Conjunctive Predicates • Rewrite the plan to exploit that the ⊕ ′′ variant of operator Disjunctive Predicates Projection ( π ) ⊕ can benefit from/preserve sortedness of its input(s): Join ( � ) Nested Loops Join Block Nested Loops Join Rewritten physical plan (preserve equivalence!) Index Nested Loops Join Sort-Merge Join Hash Join Operator Pipelining R s s � π ′′ Volcano Iterator Model � � ′′ s � s u � sort ′ � ′′ S s � ′′ s � σ ′′ s T s 5

  6. Evaluation of Selection ( σ )—No Index, Unsorted Data Relational Operators Torsten Grust • Selection ( σ p ) reads an input file R in of records and writes those records satisfying predicate p into the output file: Relational Query Engines Selection Operator Selection Selection ( σ ) 1 Function: σ ( p , R in , R out ) Selectivity Conjunctive Predicates 2 out ← createFile ( R out ) ; Disjunctive Predicates 3 in ← openScan ( R in ) ; Projection ( π ) Join ( � ) 4 while ( r ← nextRecord ( in )) � = � EOF � do Nested Loops Join if p ( r ) then Block Nested Loops Join 5 Index Nested Loops Join appendRecord ( out , r ) ; 6 Sort-Merge Join Hash Join 7 closeFile ( out ) ; Operator Pipelining Volcano Iterator Model 6

  7. Evaluation of Selection ( σ )—No Index, Unsorted Data Relational Operators Torsten Grust Remarks: Relational Query • Reading the special “record” � EOF � from a file via Engines Operator Selection nextRecord() indicates that all its record have been Selection ( σ ) retrieved (scanned) already. Selectivity Conjunctive Predicates Disjunctive Predicates • This simple procedure does not require r in to come with Projection ( π ) Join ( � ) any special physical properties (the procedure is exclusively Nested Loops Join Block Nested Loops Join defined in terms of heap files). Index Nested Loops Join Sort-Merge Join Hash Join • In particular, predicate p may be arbitrary . Operator Pipelining Volcano Iterator Model 7

  8. Evaluation of Selection ( σ )—No Index, Unsorted Data Relational Operators Torsten Grust • We can summarize the characteristics of this implementation of the selection operator as follows: Selection ( σ )—no index, unsorted data Relational Query Engines σ p ( R ) Operator Selection Selection ( σ ) input access 1 Selectivity file scan ( openScan ) of R Conjunctive Predicates prerequisites none ( p arbitrary, R may be a heap file) Disjunctive Predicates Projection ( π ) I/O cost N R + sel ( p ) · N R Join ( � ) ���� � �� � input cost output cost Nested Loops Join Block Nested Loops Join Index Nested Loops Join Sort-Merge Join Hash Join • N R denotes the number of pages in file R , | R | denotes Operator Pipelining Volcano Iterator Model the number of records � � (if p R records fit on one page, we have N R = | R | / p R ) 1 Also known as access path in the literature and text books. 8

  9. Evaluation of Aside: Selectivity Relational Operators Torsten Grust • sel ( p ) , the selectivity of predicate p , is the fraction of records satisfying predicate p : sel ( p ) = | σ p ( R ) | 0 1 � � | R | Relational Query Engines Operator Selection Selection ( σ ) ✛ Selectivity examples Selectivity Conjunctive Predicates Disjunctive Predicates What can you say about the following selectivities? Projection ( π ) 1 sel ( true ) Join ( � ) Nested Loops Join 2 sel ( false ) Block Nested Loops Join Index Nested Loops Join 3 sel ( A = 0 ) Sort-Merge Join Hash Join Operator Pipelining Volcano Iterator Model Estimated selectivities IBM DB2 reports (estimated) selecitvities in the operators details of, e.g. , its IXSCAN operator. 9

  10. Evaluation of Selection ( σ )—Matching Predicates with an Index Relational Operators • A selection on input file R can be sped up considerably if an Torsten Grust index has been defined and that index matches predicate p . • The matching process depends on p itself as well as on the index type. If there is no immediate match but p is compound , a sub-expression of p may still find a partial Relational Query match . Residual predicate evaluation work may then remain. Engines Operator Selection Selection ( σ ) ✛ When does a predicate match a sort key? Selectivity Conjunctive Predicates Assume R is tree-indexed on attribute A in ascending order. Disjunctive Predicates Projection ( π ) Which of the selections below can benefit from the index on R ? Join ( � ) 1 σ A = 42 ( R ) Nested Loops Join Block Nested Loops Join 2 σ A < 42 ( R ) Index Nested Loops Join Sort-Merge Join 3 σ A > 42 AND A < 100 ( R ) Hash Join Operator Pipelining 4 σ A > 42 OR A > 100 ( R ) Volcano Iterator Model 5 σ A > 42 AND A < 32 ( R ) 6 σ A > 42 AND B = 10 ( R ) 7 σ A > 42 OR B = 10 ( R ) 10

  11. Selection ( σ )—B + -tree Index Evaluation of Relational Operators • A B + -tree index on R whose key matches the selection Torsten Grust predicate p is clearly the superior method to evaluate σ p ( R ) : • Descend the B + -tree to retrieve the first index entry to satisfy p . If the index is clustered , access that Relational Query record on its page in R and continue to scan inside R . Engines Operator Selection • If the index is unclustered and sel ( p ) indicates a large Selection ( σ ) Selectivity number of qualifying records, it pays off to Conjunctive Predicates Disjunctive Predicates 1 read the matching index entries k ∗ = � k , rid � in the Projection ( π ) sequence set, Join ( � ) Nested Loops Join 2 sort those entries on their rid field, Block Nested Loops Join 3 and then access the pages of R in sorted rid order. Index Nested Loops Join Sort-Merge Join Note that lack of clustering is a minor issue if sel ( p ) is Hash Join close to 0. Operator Pipelining Volcano Iterator Model Accessing unclustered B + -trees IBM DB2 uses physical operator quadruple IXSCAN / SORT / RIDSCN / FETCH to implement the above strategy. 11

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend