Translating SQL to RA Database Systems: The Complete Book Ch 16,16.1 1
The Evaluation Pipeline How does this work? .sql (now) π 𝜏 ⋈ Parsed Query Employee Department Results What does this look like? How does this work? (last class) (later today?) Data 2
A Basic SQL Query (optional) keyword indicating that the answer should not contain duplicates SELECT [DISTINCT] target-list A list of attributes of relations in relation-list FROM relation-list A list of relation names (possibly with a range-variable after each name) WHERE condition Comparisons (‘=’, ‘<>’, ‘<‘, ‘>’, ‘<=’, ‘>=’) and other boolean predicates, combined using AND, OR, and NOT (a boolean formula) 3
SQL SQL is a language for querying relations • SELECT to access (query) data • Different features for different access patterns. • INSERT INTO , DELETE FROM to modify data • CREATE TABLE , DROP TABLE , • ALTER TABLE to modify relations 4
Relational Algebra Trees SELECT O.FirstName FROM Officers O, Ships S WHERE O.Ship = S.ID AND S.Name = ‘Enterprise’ π FirstName ( Officers ⋈ Ship=ID ( σ Name=‘Enterprise’ Ships )) 5
Relational Algebra Trees π FirstName ( Officers ⋈ Ship=ID ( σ Name=‘Enterprise’ Ships )) π FirstName Officers ⋈ Ship=ID σ Name=‘Enterprise’ Ships 6
Relational Algebra Trees Output π FirstName ⋈ Ship=ID σ Name=‘Enterprise’ Officers Inputs Ships π FirstName ( Officers ⋈ Ship=ID ( σ Name=‘Enterprise’ Ships )) 7
Syntax Trees in Java Statement Select CreateTable [Body] SelectBody Union PlainSelect [SelectList,FromList,Where] Member FromItem instance of Class Table SubSelect [Body] Interface 8
InstanceOf Statement statement = parser.Statement(); if (statement instanceof Select) { Algebra raTree = parseTree((Select)statement); evaluate(raTree); } else if (statement instanceof CreateTable) { loadTableSchema((CreateTable)statement); } 9
Syntax Trees in Java What would a class hierarchy look like for Relational Algebra? 10
Syntax Trees in Java Operator ⨉ 𝜏 ⊔ π [c] [c] [l,r] [l,r] 11
SQL to RA U lim nextselect SELECT [DISTINCT] distinct target FROM source order by WHERE cond1 target ( π ) GROUP BY … HAVING cond2 cond2 ( σ ) ORDER BY order agg LIMIT lim UNION nextselect cond1 ( σ ) source ( ⨉ , ⋈ ) 12
FROM Clause FROM R, S, T, … ⨉ … ⨉ T ⨉ R S What happens if I have a FROM-nested query? 13
FROM Clause FROM R, (SELECT …) S, T, … ⨉ … ⨉ T ⨉ R SELECT … Selects are just relations! 14
FROM Clause FROM R JOIN S ON cond 15
FROM Clause FROM R JOIN S ON cond σ cond ⋈ cond or R S ⨉ R S 16
FROM Clause FROM R NATURAL JOIN S 17
FROM Clause FROM R NATURAL JOIN S σ cond ⋈ cond or R S ⨉ R S cond = schema(R) ⋂ schema(S) You need to be able to compute the schema of a RA operator 18
SQL to RA U lim nextselect SELECT [DISTINCT] distinct target FROM source order by WHERE cond1 target ( π ) GROUP BY … HAVING cond2 cond2 ( σ ) ORDER BY order agg LIMIT lim UNION nextselect cond1 ( σ ) source ( ⨉ , ⋈ ) 19
SELECT (target) Clause SELECT * no π (or target = schema(input)) π SELECT A, B, … targets targets = A, B, … input SELECT R.*, S.* targets = schema(input) from R, S Schemas need both Table Alias & Attribute Name (see Column class) 20
SQL to RA U lim nextselect SELECT [DISTINCT] distinct target FROM source order by WHERE cond1 target ( π ) GROUP BY … HAVING cond2 cond2 ( σ ) ORDER BY order agg LIMIT lim UNION nextselect cond1 ( σ ) source ( ⨉ , ⋈ ) 21
Let’s Try It… CREATE TABLE R(a int, b int) CREATE TABLE S(b int, c int, d int) CREATE TABLE T(c int, e int, f string) SELECT R.*, T.f FROM R, T, ( SELECT b, AVG(d) AS DAvg, SUM(c) AS CSum FROM S WHERE d > 10 GROUP BY b HAVING DAvg < 20 ) SAgg WHERE R.b = SAgg.b AND T.c = SAgg.CSum 22
23
… but that’s stupid! That query will be sloooooooooow. 24
Translation is hard. Don’t make your life harder. Translate Dumb, Fix it in the Optimizer 25
Group Work Write pseudocode translating from a non-aggregate SELECT … FROM … WHERE … to a relational algebra expression 26
Evaluating RA Database Systems: The Complete Book Ch 15,15.1-15.3 27
Project Outline Parser & SQL Query Relational Algebra Translator .sql JSqlParser Optimizer ??? Statistics Here Lie Dragons? Query Evaluation Execution Plan Result Engine 28
The Evaluation Pipeline .sql π 𝜏 ⋈ Parsed Query Employee Department Results How does this work? (now) Data 29
Evaluation Strategies • Staged Evaluation : Start at leaves, Evaluate each operator as one step. • Pull Model : Tuple-at-a-time Iterator for each operator (also called Volcano Operators ) reads from source iterator(s). • Push Model : Thread-per operator reads from input buffer(s) and writes to output buffer. 30
Staged Evaluation Compute the Output π FirstName ⋈ Ship=ID σ Name=‘Enterprise’ Officers Ships 31
Staged Evaluation Compute the Output Dependency : Compute π π FirstName ⋈ Ship=ID σ Name=‘Enterprise’ Officers Ships 32
Staged Evaluation Compute the Output Dependency : Compute π Dependency : Compute ⋈ π FirstName ⋈ Ship=ID σ Name=‘Enterprise’ Officers Ships 33
Staged Evaluation Compute the Output Dependency : Compute π Dependency : Compute ⋈ π FirstName Dependency : Compute σ ⋈ Ship=ID σ Name=‘Enterprise’ Officers Ships 34
Staged Evaluation Compute the Output Dependency : Compute π Dependency : Compute ⋈ π FirstName Dependency : Compute σ Dependency : Load Ships ⋈ Ship=ID ID, Name [1701, Enterprise ] [DS9, Deep Space 9] σ Name=‘Enterprise’ Officers [74656, Voyager ] [75633, Defiant ] Ships 35
Staged Evaluation Compute the Output Dependency : Compute π Dependency : Compute ⋈ π FirstName Dependency : Compute σ ⋈ Ship=ID ID, Name ID, Name [1701, Enterprise ] [1701, Enterprise ] [DS9, Deep Space 9] σ Name=‘Enterprise’ Officers [74656, Voyager ] [75633, Defiant ] Ships 36
Staged Evaluation Compute the Output Dependency : Compute π Dependency : Compute ⋈ π FirstName ⋈ Ship=ID ID, Name [1701, Enterprise ] σ Name=‘Enterprise’ Officers Ships 37
Staged Evaluation Compute the Output Dependency : Compute π Dependency : Compute ⋈ π FirstName Dependency : Load Officers ⋈ Ship=ID ID, Name [1701, Enterprise ] σ Name=‘Enterprise’ Officers FirstName, LastName, Rank, Ship [James, Kirk, 4.0, 1701 ] [Jean Luc, Picard, 4.0, 1701D] [Benjamin, Sisko, 3.0, DS9 ] Ships [Kathryn, Janeway, 4.0, 74656] [Nerys, Kira, 2.5, 75633] [Spock, NULL, 2.5, 1701 ] [William, Riker, 2.5, 1701D] [Nerys, Kira, 2.5, DS9 ] [Chakotay, NULL, 3.0, 74656] 38
Staged Evaluation Compute the Output Dependency : Compute π Dependency : Compute ⋈ π FirstName ⋈ ⋈ Ship=ID ID, Name [1701, Enterprise ] FirstName, LastName, Rank, Ship, ID, Name σ Name=‘Enterprise’ Officers FirstName, LastName, Rank, Ship [James, Kirk, 4.0, 1701, 1701, Enterprise] [Spock, NULL, 2.5, 1701, 1701, Enterprise] [James, Kirk, 4.0, 1701 ] [Jean Luc, Picard, 4.0, 1701D] [Benjamin, Sisko, 3.0, DS9 ] Ships [Kathryn, Janeway, 4.0, 74656] [Nerys, Kira, 2.5, 75633] [Spock, NULL, 2.5, 1701 ] [William, Riker, 2.5, 1701D] [Nerys, Kira, 2.5, DS9 ] [Chakotay, NULL, 3.0, 74656] 39
Staged Evaluation Compute the Output Dependency : Compute π π FirstName ⋈ Ship=ID FirstName, LastName, Rank, Ship, ID, Name FirstName σ Name=‘Enterprise’ Officers [James, Kirk, 4.0, 1701, 1701, Enterprise] [James ] [Spock, NULL, 2.5, 1701, 1701, Enterprise] [Spock ] Ships 40
Staged Evaluation Compute the Output π FirstName FirstName ⋈ Ship=ID [James ] [Spock ] σ Name=‘Enterprise’ Officers Ships 41
Staged Evaluation Can we do better? 42
Staged Evaluation • Expensive : Lots of Bulk Copies • Cache Locality : Repeated Scans over Full Tables • Memory Use : Working Set is a Full Table (or more) How do we do better? 43
The Memory Hierarchy and You • We want to keep data as close to the CPU as possible • Faster memory == Smaller memory • Solution 1: Minimize the Working Set Size! • (the memory used at any one time) • Solution 2: Aggressively Batch & Reuse Data 44
Volcano Evaluation Compute one tuple π FirstName ⋈ Ship=ID σ Name=‘Enterprise’ Officers Ships 45
Volcano Evaluation Compute one tuple Dependency : Next tuple from π π FirstName ⋈ Ship=ID σ Name=‘Enterprise’ Officers Ships 46
Volcano Evaluation Compute one tuple Dependency : Next tuple from π Dependency : Next tuple from ⋈ π FirstName ⋈ Ship=ID σ Name=‘Enterprise’ Officers Ships 47
Recommend
More recommend