Translating SQL to RA Database Systems: The Complete Book Ch - - PowerPoint PPT Presentation

translating sql to ra
SMART_READER_LITE
LIVE PREVIEW

Translating SQL to RA Database Systems: The Complete Book Ch - - PowerPoint PPT Presentation

Translating SQL to RA Database Systems: The Complete Book Ch 16,16.1 1 The Evaluation Pipeline How does this work? .sql (now) Parsed Query Employee Department Results What does this look like? How does this work? (last


slide-1
SLIDE 1

Translating SQL to RA

Database Systems: The Complete Book Ch 16,16.1 1
slide-2
SLIDE 2 π 𝜏 ⋈ Employee Department

The Evaluation Pipeline

Parsed Query Data Results .sql How does this work? (now) What does this look like? (last class) How does this work? (later today?) 2
slide-3
SLIDE 3

A Basic SQL Query

SELECT [DISTINCT] target-list FROM relation-list WHERE condition

A list of relation names (possibly with a range-variable after each name) A list of attributes of relations in relation-list Comparisons (‘=’, ‘<>’, ‘<‘, ‘>’, ‘<=’, ‘>=’) and other boolean predicates, combined using AND, OR, and NOT (a boolean formula) (optional) keyword indicating that the answer should not contain duplicates 3
slide-4
SLIDE 4

SQL

4
  • SQL is a language for querying relations
  • SELECT to access (query) data
  • Different features for different access patterns.
  • INSERT INTO, DELETE FROM to modify data
  • CREATE TABLE, DROP TABLE,
ALTER TABLE to modify relations
slide-5
SLIDE 5

Relational Algebra Trees

SELECT O.FirstName FROM Officers O, Ships S WHERE O.Ship = S.ID AND S.Name = ‘Enterprise’

5 πFirstName(Officers⋈Ship=ID(σName=‘Enterprise’Ships))
slide-6
SLIDE 6

Relational Algebra Trees

6 πFirstName(Officers⋈Ship=ID(σName=‘Enterprise’Ships)) Officers⋈Ship=ID πFirstName σName=‘Enterprise’Ships
slide-7
SLIDE 7

Relational Algebra Trees

7

Inputs Output

πFirstName(Officers⋈Ship=ID(σName=‘Enterprise’Ships)) Officers ⋈Ship=ID πFirstName σName=‘Enterprise’ Ships
slide-8
SLIDE 8

Syntax Trees in Java

8 Statement Select CreateTable [Body] SelectBody Union PlainSelect [SelectList,FromList,Where] FromItem Member instance of Table SubSelect [Body] Class Interface
slide-9
SLIDE 9

InstanceOf

9 Statement statement = parser.Statement(); if(statement instanceof Select) { Algebra raTree = parseTree((Select)statement); evaluate(raTree); } else if(statement instanceof CreateTable) { loadTableSchema((CreateTable)statement); }
slide-10
SLIDE 10

Syntax Trees in Java

10 What would a class hierarchy look like for Relational Algebra?
slide-11
SLIDE 11

Syntax Trees in Java

11 Operator

π ⨉ 𝜏 ⊔

[c] [c] [l,r] [l,r]
slide-12
SLIDE 12

SQL to RA

12 SELECT [DISTINCT] target FROM source WHERE cond1 GROUP BY … HAVING cond2 ORDER BY order LIMIT lim UNION nextselect

source (⨉,⋈) cond1 (σ) agg cond2 (σ) target (π)

  • rder by

lim nextselect U distinct

slide-13
SLIDE 13

FROM Clause

13 FROM R, S, T, …

⨉ R S ⨉ T ⨉ …

What happens if I have a FROM-nested query?
slide-14
SLIDE 14

FROM Clause

14 FROM R, (SELECT …) S, T, …

⨉ R SELECT … ⨉ T ⨉ …

Selects are just relations!
slide-15
SLIDE 15

FROM Clause

15 FROM R JOIN S ON cond
slide-16
SLIDE 16

FROM Clause

16 FROM R JOIN S ON cond

⨉ σ R S

cond

⋈ R S

cond
  • r
slide-17
SLIDE 17

FROM Clause

17 FROM R NATURAL JOIN S
slide-18
SLIDE 18

FROM Clause

18 FROM R NATURAL JOIN S cond = schema(R) ⋂ schema(S)

⨉ σ R S

cond

⋈ R S

cond
  • r
You need to be able to compute the schema of a RA operator
slide-19
SLIDE 19

SQL to RA

19 SELECT [DISTINCT] target FROM source WHERE cond1 GROUP BY … HAVING cond2 ORDER BY order LIMIT lim UNION nextselect

source (⨉,⋈) agg target (π)

  • rder by

lim nextselect U distinct cond1 (σ) cond2 (σ)

slide-20
SLIDE 20

SELECT (target) Clause

20 SELECT A, B, …

π input

targets SELECT * SELECT R.*, S.*

no π (or target = schema(input))

targets = schema(input) from R, S Schemas need both Table Alias & Attribute Name (see Column class) targets = A, B, …
slide-21
SLIDE 21

SQL to RA

21 SELECT [DISTINCT] target FROM source WHERE cond1 GROUP BY … HAVING cond2 ORDER BY order LIMIT lim UNION nextselect

source (⨉,⋈) agg target (π)

  • rder by

lim nextselect U distinct cond1 (σ) cond2 (σ)

slide-22
SLIDE 22

Let’s Try It…

22 CREATE TABLE R(a int, b int) CREATE TABLE S(b int, c int, d int) CREATE TABLE T(c int, e int, f string) SELECT R.*, T.f FROM R, T, ( SELECT b, AVG(d) AS DAvg, SUM(c) AS CSum FROM S WHERE d > 10 GROUP BY b HAVING DAvg < 20 ) SAgg WHERE R.b = SAgg.b AND T.c = SAgg.CSum
slide-23
SLIDE 23 23
slide-24
SLIDE 24 24 … but that’s stupid! That query will be sloooooooooow.
slide-25
SLIDE 25 25 Translation is hard. Don’t make your life harder. Translate Dumb, Fix it in the Optimizer
slide-26
SLIDE 26

Group Work

26 Write pseudocode translating from a non-aggregate SELECT … FROM … WHERE … to a relational algebra expression
slide-27
SLIDE 27

Evaluating RA

Database Systems: The Complete Book Ch 15,15.1-15.3 27
slide-28
SLIDE 28

Project Outline

SQL Query Parser & Translator Relational Algebra Optimizer Execution Plan Evaluation Engine Query Result Statistics

???

Here Lie Dragons?

JSqlParser .sql 28
slide-29
SLIDE 29 π 𝜏 ⋈ Employee Department

The Evaluation Pipeline

Parsed Query Data Results .sql How does this work? (now) 29
slide-30
SLIDE 30

Evaluation Strategies

  • Staged Evaluation: Start at leaves, Evaluate each
  • perator as one step.
  • Pull Model: Tuple-at-a-time Iterator for each
  • perator (also called Volcano Operators) reads
from source iterator(s).
  • Push Model: Thread-per operator reads from input
buffer(s) and writes to output buffer. 30
slide-31
SLIDE 31

Staged Evaluation

31 Officers ⋈Ship=ID πFirstName σName=‘Enterprise’ Ships Compute the Output
slide-32
SLIDE 32

Staged Evaluation

32 Officers ⋈Ship=ID πFirstName σName=‘Enterprise’ Ships Dependency: Compute π Compute the Output
slide-33
SLIDE 33

Staged Evaluation

33 Officers ⋈Ship=ID πFirstName σName=‘Enterprise’ Ships Dependency: Compute ⋈ Dependency: Compute π Compute the Output
slide-34
SLIDE 34

Staged Evaluation

34 Officers ⋈Ship=ID πFirstName σName=‘Enterprise’ Ships Dependency: Compute σ Dependency: Compute ⋈ Dependency: Compute π Compute the Output
slide-35
SLIDE 35

Staged Evaluation

35 Officers ⋈Ship=ID πFirstName σName=‘Enterprise’ Ships Dependency: Compute σ Dependency: Load Ships Dependency: Compute ⋈ Dependency: Compute π Compute the Output ID, Name [1701, Enterprise ] [DS9, Deep Space 9] [74656, Voyager ] [75633, Defiant ]
slide-36
SLIDE 36

Staged Evaluation

36 Officers ⋈Ship=ID πFirstName σName=‘Enterprise’ Ships Dependency: Compute σ Dependency: Compute ⋈ Dependency: Compute π Compute the Output ID, Name [1701, Enterprise ] ID, Name [1701, Enterprise ] [DS9, Deep Space 9] [74656, Voyager ] [75633, Defiant ]
slide-37
SLIDE 37

Staged Evaluation

37 Officers ⋈Ship=ID πFirstName σName=‘Enterprise’ Ships Dependency: Compute ⋈ Dependency: Compute π Compute the Output ID, Name [1701, Enterprise ]
slide-38
SLIDE 38

Staged Evaluation

38 Officers ⋈Ship=ID πFirstName σName=‘Enterprise’ Ships Dependency: Load Officers Dependency: Compute ⋈ Dependency: Compute π Compute the Output ID, Name [1701, Enterprise ] FirstName, LastName, Rank, Ship [James, Kirk, 4.0, 1701 ] [Jean Luc, Picard, 4.0, 1701D] [Benjamin, Sisko, 3.0, DS9 ] [Kathryn, Janeway, 4.0, 74656] [Nerys, Kira, 2.5, 75633] [Spock, NULL, 2.5, 1701 ] [William, Riker, 2.5, 1701D] [Nerys, Kira, 2.5, DS9 ] [Chakotay, NULL, 3.0, 74656]
slide-39
SLIDE 39

Staged Evaluation

39 Officers ⋈Ship=ID πFirstName σName=‘Enterprise’ Ships Dependency: Compute ⋈ Dependency: Compute π Compute the Output ID, Name [1701, Enterprise ] FirstName, LastName, Rank, Ship [James, Kirk, 4.0, 1701 ] [Jean Luc, Picard, 4.0, 1701D] [Benjamin, Sisko, 3.0, DS9 ] [Kathryn, Janeway, 4.0, 74656] [Nerys, Kira, 2.5, 75633] [Spock, NULL, 2.5, 1701 ] [William, Riker, 2.5, 1701D] [Nerys, Kira, 2.5, DS9 ] [Chakotay, NULL, 3.0, 74656] FirstName, LastName, Rank, Ship, ID, Name [James, Kirk, 4.0, 1701, 1701, Enterprise] [Spock, NULL, 2.5, 1701, 1701, Enterprise]

slide-40
SLIDE 40

Staged Evaluation

40 Officers ⋈Ship=ID πFirstName σName=‘Enterprise’ Ships Dependency: Compute π Compute the Output FirstName, LastName, Rank, Ship, ID, Name [James, Kirk, 4.0, 1701, 1701, Enterprise] [Spock, NULL, 2.5, 1701, 1701, Enterprise] FirstName [James ] [Spock ]
slide-41
SLIDE 41

Staged Evaluation

41 Officers ⋈Ship=ID πFirstName σName=‘Enterprise’ Ships Compute the Output

FirstName [James ] [Spock ]

slide-42
SLIDE 42

Staged Evaluation

42 Can we do better?
slide-43
SLIDE 43

Staged Evaluation

  • Expensive: Lots of Bulk Copies
  • Cache Locality: Repeated Scans over Full Tables
  • Memory Use: Working Set is a Full Table (or more)
43 How do we do better?
slide-44
SLIDE 44

The Memory Hierarchy and You

  • We want to keep data as close to the CPU as possible
  • Faster memory == Smaller memory
  • Solution 1: Minimize the Working Set Size!
  • (the memory used at any one time)
  • Solution 2: Aggressively Batch & Reuse Data
44
slide-45
SLIDE 45

Volcano Evaluation

45 Officers ⋈Ship=ID πFirstName σName=‘Enterprise’ Ships Compute one tuple
slide-46
SLIDE 46

Volcano Evaluation

46 Officers ⋈Ship=ID πFirstName σName=‘Enterprise’ Ships Dependency: Next tuple from π Compute one tuple
slide-47
SLIDE 47

Volcano Evaluation

47 Officers ⋈Ship=ID πFirstName σName=‘Enterprise’ Ships Dependency: Next tuple from ⋈ Dependency: Next tuple from π Compute one tuple
slide-48
SLIDE 48

Volcano Evaluation

48 Officers ⋈Ship=ID πFirstName σName=‘Enterprise’ Ships Dependency: Next tuple from σ Dependency: Next tuple from ⋈ Dependency: Next tuple from π Compute one tuple
slide-49
SLIDE 49

Volcano Evaluation

49 Officers ⋈Ship=ID πFirstName σName=‘Enterprise’ Ships Dependency: Next tuple from σ Dependency: Tuple from Ships Dependency: Next tuple from ⋈ Dependency: Next tuple from π Compute one tuple ID, Name [1701, Enterprise ]
slide-50
SLIDE 50

Volcano Evaluation

50 Officers ⋈Ship=ID πFirstName σName=‘Enterprise’ Ships Dependency: Next tuple from σ Dependency: Next tuple from ⋈ Dependency: Next tuple from π Compute one tuple ID, Name [1701, Enterprise ]
slide-51
SLIDE 51

Volcano Evaluation

51 Officers ⋈Ship=ID πFirstName σName=‘Enterprise’ Ships Dependency: Next tuple from ⋈ Dependency: Next tuple from π Compute one tuple ID, Name [1701, Enterprise ]
slide-52
SLIDE 52

Volcano Evaluation

52 Officers ⋈Ship=ID πFirstName σName=‘Enterprise’ Ships Dependency: Tuple f. Officers Dependency: Next tuple from ⋈ Dependency: Next tuple from π Compute one tuple ID, Name [1701, Enterprise ] FirstName, LastName, Rank, Ship [James, Kirk, 4.0, 1701 ]
slide-53
SLIDE 53

Volcano Evaluation

53 Officers ⋈Ship=ID πFirstName σName=‘Enterprise’ Ships Dependency: Next tuple from ⋈ Dependency: Next tuple from π Compute one tuple ID, Name [1701, Enterprise ] FirstName, LastName, Rank, Ship [James, Kirk, 4.0, 1701 ] FirstName, LastName, Rank, Ship, ID, Name [James, Kirk, 4.0, 1701, 1701, Enterprise]

slide-54
SLIDE 54

Volcano Evaluation

54 Officers ⋈Ship=ID πFirstName σName=‘Enterprise’ Ships Dependency: Next tuple from π Compute one tuple FirstName, LastName, Rank, Ship, ID, Name [James, Kirk, 4.0, 1701, 1701, Enterprise] FirstName [James ]
slide-55
SLIDE 55

Volcano Evaluation

55 Officers ⋈Ship=ID πFirstName σName=‘Enterprise’ Ships Compute one tuple

FirstName [James ]

slide-56
SLIDE 56

Iterators

56 void open() { // call open() on child iterators // prepare the iterator } Tuple getNext() { // read, process, and return a tuple } void close() { // clean-up the iterator // call close() on child iterators }
slide-57
SLIDE 57

GetNext()

57

Relation

What is the Working Set Size?

Read One Line from File Split Line into Fields Parse Field Types Return Tuple

slide-58
SLIDE 58

GetNext()

58

Projection (π) Read One Tuple Compute Projected Attributes Return Tuple

What is the Working Set Size?
slide-59
SLIDE 59

GetNext()

59

Selection (σ) Read One Tuple Test Condition Return Tuple Reject Tuple

What is the Working Set Size?
slide-60
SLIDE 60

GetNext()

60

Union (U) Read One Tuple from S Return Tuple Read One Tuple from R R Empty?

What is the Working Set Size?
slide-61
SLIDE 61

GetNext()

61

Nested Loop Join/Cross (⨉) Read (and save) One Tuple from R Return Tuple Construct Joint Tuple: < S > o < R> Read One Tuple from S S Empty? Reset S (Close(),Open())

but… What is the Working Set Size?

Is there a saved tuple? Y N

slide-62
SLIDE 62

Implementing: Joins

Solution 1 (Nested-Loop) For Each (a in A) { For Each (b in B) { emit (a, b); }} A B

62
slide-63
SLIDE 63

Implementing: Joins

Solution 2 (Block-Nested-Loop)

63
slide-64
SLIDE 64

Implementing: Joins

Solution 2 (Block-Nested-Loop) 1) Partition into Blocks 2) NLJ on each pair of blocks

64