DATABASE SYSTEM IMPLEMENTATION // GT 4420/6422 // SPRING 2019

SLIDE 1

DATABASE SYSTEM IMPLEMENTATION

GT 4420/6422 // SPRING 2019 // @JOY_ARULRAJ LECTURE #13: QUERY COMPILATION

SLIDE 2

TODAY’S AGENDA

→ Background
→ Code Generation / Transpilation
→ JIT Compilation (LLVM)
→ Real-world Implementations

SLIDE 3

ANATOMY OF A DATABASE SYSTEM

→ Connection Manager + Admission Control
→ Query Parser
→ Query Optimizer
→ Query Executor
→ Lock Manager (Concurrency Control)
→ Access Methods (or Indexes)
→ Buffer Pool Manager
→ Log Manager
→ Memory Manager + Disk Manager
→ Networking Manager

[Diagram groups these components into: Process Manager, Query Processor, Transactional Storage Manager, Shared Utilities]

Source: Anatomy of a Database System

SLIDE 4

HEKATON REMARK

After switching to an in-memory DBMS, the only way to increase throughput is to reduce the number of instructions executed.

→ To go 10x faster, the DBMS must execute 90% fewer instructions…
→ To go 100x faster, the DBMS must execute 99% fewer instructions…

COMPILATION IN THE MICROSOFT SQL SERVER HEKATON ENGINE IEEE Data Engineering Bulletin 2011

SLIDE 5

OBSERVATION

One way to achieve such a reduction in instructions is through code specialization: generating code that is specific to a particular query in the DBMS.

→ Encode everything known about the data (e.g., types) and the query (e.g., WHERE clause)

Most code is written to be easy for humans to understand rather than for performance…

→ Interpretation vs. Compilation

SLIDE 6

EXAMPLE DATABASE

CREATE TABLE A (
  id INT PRIMARY KEY,
  val INT
);
CREATE TABLE B (
  id INT PRIMARY KEY,
  val INT
);
CREATE TABLE C (
  a_id INT REFERENCES A(id),
  b_id INT REFERENCES B(id),
  PRIMARY KEY (a_id, b_id)
);

SLIDE 7

QUERY PROCESSING

SELECT A.id, B.val
FROM A, B
WHERE A.id = B.id AND B.val > 100

[Query plan: π(A.id, B.val) over σ(val>100) over ⨝(A.id=B.id) over scans of A and B]
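The example schema from slide 6 and this join query can be run end to end with Python's built-in sqlite3 module; the inserted rows below are invented sample data, not from the slides.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE A (id INT PRIMARY KEY, val INT);
CREATE TABLE B (id INT PRIMARY KEY, val INT);
CREATE TABLE C (a_id INT REFERENCES A(id),
                b_id INT REFERENCES B(id),
                PRIMARY KEY (a_id, b_id));
""")
# toy data so the join query has something to return
db.executemany("INSERT INTO A VALUES (?, ?)", [(1, 10), (2, 20)])
db.executemany("INSERT INTO B VALUES (?, ?)", [(1, 50), (2, 150)])
rows = db.execute(
    "SELECT A.id, B.val FROM A, B WHERE A.id = B.id AND B.val > 100"
).fetchall()
print(rows)  # [(2, 150)]
```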

SLIDE 8

QUERY PROCESSING

Tuple-at-a-time
→ Each operator calls next on its child to get the next tuple to process.

Vector-at-a-time
→ Each operator calls next on its child to get the next chunk of data to process.

Operator-at-a-time
→ Each operator materializes its entire output for its parent operator.

SELECT A.id, B.val
FROM A, B
WHERE A.id = B.id AND B.val > 100

[Query plan: π(A.id, B.val) over σ(val>100) over ⨝(A.id=B.id) over scans of A and B]
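A minimal sketch (not from the slides) of the three processing models on a toy scan → select pipeline; plain Python lists stand in for tables, and generators stand in for operator next() calls.

```python
table = [(1, 50), (2, 150), (3, 200)]  # (id, val)

# Tuple-at-a-time (iterator/Volcano model): each next() yields one tuple.
def scan():
    for t in table:
        yield t

def select(child, pred):
    for t in child:
        if pred(t):
            yield t

# Vector-at-a-time: each next() yields a chunk (vector) of tuples.
def scan_vec(chunk=2):
    for i in range(0, len(table), chunk):
        yield table[i:i + chunk]

def select_vec(child, pred):
    for vec in child:
        out = [t for t in vec if pred(t)]
        if out:
            yield out

# Operator-at-a-time: each operator materializes its entire output.
def select_mat(rows, pred):
    return [t for t in rows if pred(t)]

tup = list(select(scan(), lambda t: t[1] > 100))
vec = [t for v in select_vec(scan_vec(), lambda t: t[1] > 100) for t in v]
mat = select_mat(list(scan()), lambda t: t[1] > 100)
print(tup == vec == mat)  # True: all three produce the same result set
```

All three models compute the same answer; they differ only in how many times control crosses operator boundaries per tuple.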


SLIDES 10-11

QUERY INTERPRETATION

SELECT * FROM A, C,
  (SELECT B.id, COUNT(*) FROM B
   WHERE B.val = ? + 1 GROUP BY B.id) AS B
WHERE A.val = 123
  AND A.id = C.a_id
  AND B.id = C.b_id

Query plan: ⨝(A.id=C.a_id) joins σ(A.val=123) over A with ⨝(B.id=C.b_id), which joins Γ(B.id, COUNT(*)) over σ(B.val=?+1) over B with the scan of C.

Each operator runs generic, interpreted logic:

⨝ (hash join):
for t1 in left.next():
    buildHashTable(t1)
for t2 in right.next():
    if probe(t2):
        emit(t1⨝t2)

σ (selection):
for t in child.next():
    if evalPred(t):
        emit(t)

Scan (A, B, C):
for t in A:
    emit(t)

Γ (aggregation):
for t in child.next():
    buildAggregateTable(t)
for t in aggregateTable:
    emit(t)

SLIDES 12-23

PREDICATE INTERPRETATION

SELECT * FROM A, C,
  (SELECT B.id, COUNT(*) FROM B
   WHERE B.val = ? + 1 GROUP BY B.id) AS B
WHERE A.val = 123
  AND A.id = C.a_id
  AND B.id = C.b_id

Execution Context:
→ Current Tuple: (123, 1000)
→ Query Parameters: (int:999)
→ Table Schema: B→(int:id, int:val)

Predicate expression tree for B.val = ? + 1:

=( TupleAttribute(B.val), +( Parameter(0), Constant(1) ) )

The interpreter walks this tree for every tuple: TupleAttribute(B.val) reads 1000 from the current tuple, Parameter(0) resolves to 999, Constant(1) yields 1, the addition produces 1000, and the comparison 1000 = 1000 evaluates to true.
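The tree walk the slides animate can be sketched as a tiny expression interpreter; the class names (Constant, Parameter, TupleAttribute, Add, Eq) are illustrative, not from any real DBMS.

```python
# Each node evaluates itself against an execution context holding the
# current tuple, the query parameters, and the table schema.

class Constant:
    def __init__(self, v): self.v = v
    def eval(self, ctx): return self.v

class Parameter:
    def __init__(self, i): self.i = i
    def eval(self, ctx): return ctx["params"][self.i]

class TupleAttribute:
    def __init__(self, name): self.name = name
    def eval(self, ctx):
        # look up the attribute's position in the schema, then read the tuple
        return ctx["tuple"][ctx["schema"].index(self.name)]

class Add:
    def __init__(self, l, r): self.l, self.r = l, r
    def eval(self, ctx): return self.l.eval(ctx) + self.r.eval(ctx)

class Eq:
    def __init__(self, l, r): self.l, self.r = l, r
    def eval(self, ctx): return self.l.eval(ctx) == self.r.eval(ctx)

# B.val = ? + 1, with current tuple (123, 1000) and parameter 999
pred = Eq(TupleAttribute("val"), Add(Parameter(0), Constant(1)))
ctx = {"tuple": (123, 1000), "params": (999,), "schema": ["id", "val"]}
print(pred.eval(ctx))  # True: 1000 == 999 + 1
```

Note how much work happens per tuple: virtual dispatch per node, a schema lookup per attribute read. This is exactly the overhead that code specialization removes.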

SLIDE 24

CODE SPECIALIZATION

Any CPU-intensive component of the DBMS can be natively compiled if it has a similar execution pattern on different inputs:

→ Access Methods
→ Stored Procedures
→ Operator Execution
→ Predicate Evaluation
→ Logging Operations

SLIDE 25

BENEFITS

Attribute types are known a priori.
→ Data access function calls can be converted to inline pointer casting.

Predicates are known a priori.
→ They can be evaluated using primitive data comparisons.

No function calls in loops.
→ Allows the compiler to efficiently distribute data to registers and increase cache reuse.
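A sketch of what "inline pointer casting" buys: when the schema is known a priori, an attribute read collapses to fixed-offset arithmetic over raw bytes. Python's struct module stands in for C pointer casts here; the layout below is invented for illustration.

```python
import struct

# Two 4-byte little-endian ints per tuple, (id, val), packed back to back.
data = struct.pack("<4i", 1, 1000, 2, 5)
TUPLE_SIZE, VAL_OFFSET = 8, 4  # known a priori from the schema

def read_val(buf, i):
    # a compiled query hard-codes tuple_size and the attribute offset,
    # so the generic "data access function call" becomes offset arithmetic
    return struct.unpack_from("<i", buf, i * TUPLE_SIZE + VAL_OFFSET)[0]

print(read_val(data, 0), read_val(data, 1))  # 1000 5
```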

SLIDE 26

ARCHITECTURE OVERVIEW

SQL Query → Parser → Abstract Syntax Tree → Binder (consults the System Catalog) → Annotated AST → Optimizer (uses Cost Estimates) → Physical Plan → Compiler → Native Code

SLIDE 27

CODE GENERATION

Approach #1: Transpilation
→ Write code that converts a relational query plan into C/C++ and then run it through a conventional compiler to generate native code.

Approach #2: JIT Compilation
→ Generate an intermediate representation (IR) of the query that can be quickly compiled into native code.

SLIDE 28

HIQUE – CODE GENERATION

For a given query plan, create a C/C++ program that implements that query’s execution.
→ Bake in all the predicates and type conversions.

Use an off-the-shelf compiler to convert the code into a shared object, link it to the DBMS process, and then invoke the exec function.

GENERATING CODE FOR HOLISTIC QUERY EVALUATION ICDE 2010

SLIDES 29-36

OPERATOR TEMPLATES

SELECT * FROM A WHERE A.val = ? + 1

Interpreted Plan:

for t in range(table.num_tuples):
    tuple = get_tuple(table, t)
    if eval(predicate, tuple, params):
        emit(tuple)

get_tuple must:
1. Get the schema from the catalog for the table.
2. Calculate the offset based on the tuple size.
3. Return a pointer to the tuple.

eval must:
1. Traverse the predicate tree and pull values up.
2. For a tuple value, calculate the offset of the target attribute.
3. Perform casting as needed for comparison operators.
4. Return true / false.

Templated Plan (constants baked in at code-generation time):

tuple_size = ###
predicate_offset = ###
parameter_value = ###
for t in range(table.num_tuples):
    tuple = table.data + t * tuple_size
    val = *(tuple + predicate_offset)
    if val == parameter_value + 1:
        emit(tuple)
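A miniature of the HIQUE-style transpilation step: generate query-specific source with the parameter baked in, compile it once, and invoke the resulting function. Python's exec/compile stand in for emitting C and invoking gcc; the generated function and its shape are illustrative.

```python
def codegen_scan_filter(param_value):
    # generate query-specific source: the predicate "val == ? + 1" has the
    # parameter baked in, so no predicate tree is walked at execution time
    src = (
        "def run(table, emit):\n"
        "    for tup in table:\n"
        f"        if tup[1] == {param_value} + 1:\n"
        "            emit(tup)\n"
    )
    ns = {}
    exec(compile(src, "<generated>", "exec"), ns)  # "off-the-shelf compiler"
    return ns["run"]

run = codegen_scan_filter(999)
out = []
run([(1, 1000), (2, 5), (3, 1000)], out.append)
print(out)  # [(1, 1000), (3, 1000)]
```

The generated function plays the role of the shared object's exec entry point: built once per query, then invoked like any other function.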

SLIDE 37

DBMS INTEGRATION

The generated query code can invoke any other function in the DBMS. This allows it to use all the same components as interpreted queries:

→ Concurrency Control
→ Logging / Checkpoints
→ Indexes

SLIDE 38

EVALUATION

Generic Iterators
→ Canonical model with generic predicate evaluation.

Optimized Iterators
→ Type-specific iterators with inline predicates.

Generic Hardcoded
→ Handwritten code with generic iterators/predicates.

Optimized Hardcoded
→ Direct tuple access with pointer arithmetic.

HIQUE
→ Query-specific specialized code.

SLIDES 39-40

QUERY COMPILATION EVALUATION

[Chart: execution time (ms, 0-250) broken down into L2-cache misses, memory stalls, and instruction execution for Generic Iterators, Optimized Iterators, Generic Hardcoded, Optimized Hardcoded, and HIQUE. Intel Core 2 Duo 6300 @ 1.86GHz; join query 10k ⨝ 10k → 10m.]

Source: Konstantinos Krikellas

SLIDE 41

QUERY COMPILATION COST

[Chart: compilation time (ms) for TPC-H queries Q1-Q3. Compile (-O0): 121 / 160 / 213 ms; Compile (-O2): 274 / 403 / 619 ms. Intel Core 2 Duo 6300 @ 1.86GHz.]

Source: Konstantinos Krikellas

SLIDE 42

OBSERVATION

Relational operators are a useful way to reason about a query but are not the most efficient way to execute it.

It takes a (relatively) long time to compile a C/C++ source file into executable code.

HIQUE does not allow for full pipelining…

SLIDES 43-44

PIPELINED OPERATORS

SELECT * FROM A, C,
  (SELECT B.id, COUNT(*) FROM B
   WHERE B.val = ? + 1 GROUP BY B.id) AS B
WHERE A.val = 123
  AND A.id = C.a_id
  AND B.id = C.b_id

Query plan: ⨝(A.id=C.a_id) joins σ(A.val=123) over A with ⨝(B.id=C.b_id), which joins Γ(B.id, COUNT(*)) over σ(B.val=?+1) over B with the scan of C.

Pipeline boundaries split this plan into four pipelines (#1-#4), each ending at a materialization point (a hash table build or an aggregation).

SLIDE 45

HYPER – JIT QUERY COMPILATION

Compile queries in-memory into native code using the LLVM toolkit.

Organizes query processing in a way that keeps a tuple in CPU registers for as long as possible.
→ Push-based vs. Pull-based
→ Data-Centric vs. Operator-Centric

EFFICIENTLY COMPILING EFFICIENT QUERY PLANS FOR MODERN HARDWARE VLDB 2011

SLIDE 46

LLVM

A collection of modular and reusable compiler and toolchain technologies.

The core component is a low-level programming language (IR) that is similar to assembly.

Not all of the DBMS components need to be written in LLVM IR.
→ LLVM code can make calls to C++ code.

SLIDE 47

LLVM

[Diagram: frontends (C, Fortran, Ada) emit a common Intermediate Representation (IR), which passes through a common optimizer and then to backends (x86, PowerPC, ARM).]

Source: The Architecture of Open Source Applications

SLIDES 48-49

PUSH-BASED EXECUTION

SELECT * FROM A, C,
  (SELECT B.id, COUNT(*) FROM B
   WHERE B.val = ? + 1 GROUP BY B.id) AS B
WHERE A.val = 123
  AND A.id = C.a_id
  AND B.id = C.b_id

Generated Query Plan (each loop nest is one of the four pipelines):

for t in A:
    if t.val == 123:
        Materialize t in HashTable ⨝(A.id=C.a_id)

for t in B:
    if t.val == <param> + 1:
        Aggregate t in HashTable Γ(B.id)

for t in Γ(B.id):
    Materialize t in HashTable ⨝(B.id=C.b_id)

for t3 in C:
    for t2 in ⨝(B.id=C.b_id):
        for t1 in ⨝(A.id=C.a_id):
            emit(t1⨝t2⨝t3)
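The generated plan above can be sketched as ordinary code: four tight loops, each pushing tuples into the next materialization point. The data layout (dicts as hash tables, dict rows as tuples) and sample data are invented for illustration.

```python
def run_query(A, B, C, param):
    ht_a, agg_b, ht_b, results = {}, {}, {}, []

    # Pipeline 1: scan A, filter A.val = 123, build join hash table on A.id
    for t in A:
        if t["val"] == 123:
            ht_a.setdefault(t["id"], []).append(t)

    # Pipeline 2: scan B, filter B.val = param + 1, aggregate COUNT(*) by B.id
    for t in B:
        if t["val"] == param + 1:
            agg_b[t["id"]] = agg_b.get(t["id"], 0) + 1

    # Pipeline 3: scan the aggregate table, build join hash table on B.id
    for bid, cnt in agg_b.items():
        ht_b[bid] = cnt

    # Pipeline 4: scan C, probe both hash tables, emit joined tuples
    for t in C:
        if t["b_id"] in ht_b:
            for a in ht_a.get(t["a_id"], []):
                results.append((a["id"], t["b_id"], ht_b[t["b_id"]]))
    return results

A = [{"id": 1, "val": 123}, {"id": 2, "val": 7}]
B = [{"id": 5, "val": 1000}, {"id": 6, "val": 1000}, {"id": 7, "val": 2}]
C = [{"a_id": 1, "b_id": 5}, {"a_id": 2, "b_id": 5}]
print(run_query(A, B, C, 999))  # [(1, 5, 1)]
```

Within each pipeline a tuple flows straight from the scan through the filter into the materialization, with no operator-boundary crossings in between.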

SLIDE 50

QUERY COMPILATION EVALUATION

[Chart: execution time (ms, log scale from 1 to 100000) on TPC-H queries Q1-Q5 for HyPer (LLVM), HyPer (C++), VectorWise, MonetDB, and ???. Dual-socket Intel Xeon X5770 @ 2.93GHz.]

Source: Thomas Neumann

SLIDE 51

QUERY COMPILATION COST

[Chart: compilation time (ms) for three TPC-H queries. HIQUE (-O2): 274 / 403 / 619 ms; HyPer: 13 / 37 / 15 ms.]

Source: Konstantinos Krikellas

SLIDE 52

QUERY COMPILATION COST

LLVM's compilation time grows super-linearly with the query size:
→ # of joins
→ # of predicates
→ # of aggregations

Not a big issue for OLTP applications, but a major problem for OLAP workloads.

SLIDE 53

HYPER – ADAPTIVE EXECUTION

First generate the LLVM IR for the query, then execute that IR in an interpreter while compiling the query in the background. When the compiled query is ready, seamlessly replace the interpreted execution.

ADAPTIVE EXECUTION OF COMPILED QUERIES ICDE 2018

SLIDES 54-62

HYPER – ADAPTIVE EXECUTION

SQL Query → Optimizer (0.2 ms) → Query Plan → Code Generator (0.7 ms) → LLVM IR

From the LLVM IR, HyPer produces progressively faster executables:
→ Byte Code Compiler (0.4 ms) → Byte Code, executed in an interpreter
→ Unoptimized LLVM Compiler (6 ms) → x86 Code
→ LLVM Passes (25 ms) + Optimized LLVM Compiler (17 ms) → x86 Code
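The adaptive switch-over can be sketched in miniature: start interpreting a tiny opcode program immediately, "compile" it in a background thread, and swap the executor when the compiled version is ready. The opcodes and the Python-source "compiler" are toy stand-ins for LLVM IR and the LLVM backends.

```python
import threading

# toy "IR": f(x) = x + 1 as invented opcodes
PROGRAM = [("load",), ("add_const", 1), ("ret",)]

def interpret(program, x):
    acc = None
    for op, *arg in program:
        if op == "load":
            acc = x
        elif op == "add_const":
            acc += arg[0]
        elif op == "ret":
            return acc

def compile_program(program):
    # stand-in for the LLVM compiler: emit Python source, compile it once
    lines = ["def f(x):"]
    for op, *arg in program:
        if op == "load":
            lines.append("    acc = x")
        elif op == "add_const":
            lines.append(f"    acc += {arg[0]}")
        elif op == "ret":
            lines.append("    return acc")
    ns = {}
    exec("\n".join(lines), ns)
    return ns["f"]

executor = {"fn": lambda x: interpret(PROGRAM, x)}  # start interpreting now

def background_compile():
    executor["fn"] = compile_program(PROGRAM)  # seamless swap when ready

t = threading.Thread(target=background_compile)
t.start()
first = executor["fn"](41)   # may still be the interpreter
t.join()
second = executor["fn"](41)  # compiled version after the swap
print(first, second)  # 42 42
```

Callers never notice the swap: both paths compute the same result, only the cost per call changes.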

SLIDE 63

REAL-WORLD IMPLEMENTATIONS

→ IBM System R
→ Oracle
→ Microsoft Hekaton
→ Cloudera Impala
→ Actian Vector
→ MemSQL
→ VitesseDB
→ Apache Spark
→ Peloton

SLIDE 64

IBM SYSTEM R

A primitive form of code generation and query compilation was used by IBM in the 1970s.
→ Compiled SQL statements into assembly code by selecting code templates for each operator.

The technique was abandoned when IBM built DB2:
→ High cost of external function calls
→ Poor portability
→ Software engineering complications

A HISTORY AND EVALUATION OF SYSTEM R COMMUNICATIONS OF THE ACM 1981

SLIDE 65

ORACLE

Converts PL/SQL stored procedures into Pro*C code, which is then compiled into native C/C++ code.

They also put Oracle-specific operations directly in the SPARC chips as co-processors:
→ Memory Scans
→ Bit-pattern Dictionary Compression
→ Vectorized instructions designed for DBMSs
→ Security / encryption

SLIDE 66

MICROSOFT HEKATON

Can compile both procedures and SQL.
→ Non-Hekaton queries can access Hekaton tables through compiled inter-operators.

Generates C code from an imperative syntax tree, compiles it into a DLL, and links it at runtime.

Employs safety measures to prevent somebody from injecting malicious code into a query.

COMPILATION IN THE MICROSOFT SQL SERVER HEKATON ENGINE IEEE DATA ENGINEERING BULLETIN 2011

SLIDE 67

CLOUDERA IMPALA

LLVM JIT compilation for predicate evaluation and record parsing.
→ Not clear whether they also do operator compilation.

Optimized record parsing is important for Impala because it must handle multiple data formats stored on HDFS.

IMPALA: A MODERN, OPEN-SOURCE SQL ENGINE FOR HADOOP CIDR 2015

SLIDES 68-72

MEMSQL (PRE–2016)

Performs the same C/C++ code generation as HIQUE and then invokes gcc.

Converts all queries into a parameterized form and caches the compiled query plan.

Example: both of the following queries map to the same cached parameterized plan, SELECT * FROM A WHERE A.id = ?

SELECT * FROM A WHERE A.id = 123
SELECT * FROM A WHERE A.id = 456
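The parameterize-and-cache idea can be sketched as follows; the regex-based normalizer and the string "plan" are toy stand-ins (a real system parameterizes via the parser and caches actual compiled code).

```python
import re

plan_cache = {}

def normalize(sql):
    # toy normalizer: replace integer literals with '?'
    return re.sub(r"\b\d+\b", "?", sql)

def get_plan(sql):
    key = normalize(sql)
    if key not in plan_cache:
        plan_cache[key] = ("compiled", key)  # stand-in for codegen + gcc
    return plan_cache[key]

p1 = get_plan("SELECT * FROM A WHERE A.id = 123")
p2 = get_plan("SELECT * FROM A WHERE A.id = 456")
print(p1 is p2, len(plan_cache))  # True 1
```

Queries that differ only in literals hit the same cache entry, so the expensive compile step is paid once per query shape, not once per query.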

SLIDE 73

MEMSQL (2016–PRESENT)

A query plan is converted into an imperative plan expressed in a high-level imperative DSL:
→ MemSQL Programming Language (MPL)
→ Think of this as a C++ dialect.

The DSL then gets converted into a second language of opcodes:
→ MemSQL Bit Code (MBC)
→ Think of this as JVM byte code.

Finally, the DBMS compiles the opcodes into LLVM IR and then into native code.

Source: Drew Paroski

SLIDE 74

APACHE SPARK

Introduced in the new Tungsten engine in 2015.

The system converts a query's WHERE clause expression trees into ASTs, then compiles these ASTs into JVM bytecode, which is executed natively.

SPARK SQL: RELATIONAL DATA PROCESSING IN SPARK SIGMOD 2015

SLIDE 75

PELOTON

Full compilation of the entire query plan.

Relaxes HyPer's pipeline breakers to create mini-batches for operators that can be vectorized.

Uses software prefetching to hide memory stalls.

RELAXED OPERATOR FUSION FOR IN-MEMORY DATABASES: MAKING COMPILATION, VECTORIZATION, AND PREFETCHING WORK TOGETHER AT LAST VLDB 2017

SLIDE 76

PELOTON

[Chart: execution time (ms, log scale) on TPC-H queries Q1 / Q3 / Q13 / Q14 / Q19. Interpreted: 88147 / 26350 / 87473 / 9960 / 21500 ms; LLVM: 901 / 1396 / 2641 / 383 / 540 ms; LLVM + ROF: 892 / 846 / 1763 / 191 / 220 ms. Dual-socket Intel Xeon E5-2630v4 @ 2.20GHz, TPC-H 10 GB database.]

Source: Prashanth Menon

SLIDE 77

PARTING THOUGHTS

Query compilation makes a difference but is non-trivial to implement.

The 2016 version of MemSQL is the best query compilation implementation out there. Hekaton is very good too.

Any new DBMS that wants to compete has to implement query compilation.

SLIDE 78

NEXT CLASS

Query Optimization