SLIDE 1

15-721 DATABASE SYSTEMS
Lecture #10 – Query Execution & Scheduling
Andy Pavlo // Carnegie Mellon University // Spring 2016

SLIDE 2

CMU 15-721 (Spring 2016)

TODAY'S AGENDA

Process Models
Query Parallelization
Data Placement
Scheduling
10 Crack Commandments

SLIDE 3

MULTI-USER DATABASE APP STACK

[Figure: End Users interact with a client-side Front-End Application, which talks to the server-side Back-End Application via REST/SOAP; the Back-End Application talks to the DBMS via SQL and PL/SQL.]

SLIDE 7

QUERY EXECUTION

A query plan is comprised of operators.
An operator instance is an invocation of an operator on some segment of data.
A task is the execution of a sequence of one or more operator instances.
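These three concepts can be sketched as plain data structures. The names below are illustrative, not from any particular DBMS:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch of the execution vocabulary on this slide. */
typedef struct Operator {
    const char *name;                 /* e.g. "hash_join_build" */
    void (*run)(const void *segment); /* invoked on one segment of data */
} Operator;

/* An operator instance = one invocation of an operator on one segment. */
typedef struct OperatorInstance {
    const Operator *op;
    const void *segment;
} OperatorInstance;

/* A task = the execution of a sequence of operator instances. */
typedef struct Task {
    OperatorInstance *instances;
    size_t count;
} Task;

static int rows_processed = 0;

/* Dummy operator body so the sketch is runnable. */
static void count_rows(const void *segment) {
    (void)segment;
    rows_processed++;
}

static void run_task(const Task *t) {
    for (size_t i = 0; i < t->count; i++)
        t->instances[i].op->run(t->instances[i].segment);
}
```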

SLIDE 8

PROCESS MODEL

A DBMS's process model defines how the system is architected to support concurrent requests from a multi-user application.
A worker is the DBMS component that is responsible for executing tasks on behalf of the client and returning the results.

ARCHITECTURE OF A DATABASE SYSTEM
Foundations and Trends in Databases 2007

SLIDE 9

PROCESS MODELS

Approach #1: Process per DBMS Worker
Approach #2: Process Pool
Approach #3: Thread per DBMS Worker

SLIDE 10

PROCESS PER WORKER

Each worker is a separate OS process.

→ Relies on the OS scheduler.
→ Uses shared memory for global data structures.
→ Examples: IBM DB2, Postgres, Oracle

[Figure: Dispatcher → Worker processes]
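A minimal sketch of this model, assuming a Unix-like system: the dispatcher `fork()`s one OS process per request and lets the OS scheduler run the workers. The function names and the exit-code result channel are illustrative only:

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical per-request work; a real worker would execute query tasks. */
static int handle_request(int request_id) {
    return request_id * 2;  /* stand-in for "run the query, return a result" */
}

/* Dispatcher: spawn one OS process per worker; the OS schedules it. */
static int dispatch(int request_id) {
    pid_t pid = fork();
    if (pid == 0) {
        /* Child = worker process; exit code stands in for the result. */
        _exit(handle_request(request_id));
    }
    int status = 0;
    waitpid(pid, &status, 0);  /* for illustration; a real dispatcher would not block */
    return WEXITSTATUS(status);
}
```

In a real process-per-worker system the dispatcher hands the client connection to the child and moves on; results flow back over shared memory or the socket, not the exit code.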

SLIDE 15

PROCESS POOL

A worker uses any free process from a pool.

→ Still relies on the OS scheduler and shared memory.
→ Bad for CPU cache locality.
→ Examples: IBM DB2

[Figure: Dispatcher → Worker Pool]

SLIDE 17

THREAD PER WORKER

Single process with multiple worker threads.

→ DBMS has to manage its own scheduling.
→ May or may not use a dispatcher thread.
→ Examples: IBM DB2, MSSQL, MySQL, Oracle (newer versions)

[Figure: worker threads inside a single DBMS process]
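A minimal sketch of the thread-per-worker model using POSIX threads; everything here (the `Connection` struct, the fake query work) is illustrative, not any particular DBMS's code:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical connection state handed to each worker thread. */
typedef struct Connection {
    int query_id;
    int result;
} Connection;

/* Worker: runs in its own thread inside the single DBMS process. */
static void *worker_main(void *arg) {
    Connection *conn = arg;
    conn->result = conn->query_id + 100;  /* stand-in for query execution */
    return NULL;
}

/* One worker thread per connection; all threads share the process's
 * address space, so no explicit shared-memory segments are needed. */
static void serve(Connection *conns, int n) {
    pthread_t *tids = malloc(sizeof(pthread_t) * (size_t)n);
    for (int i = 0; i < n; i++)
        pthread_create(&tids[i], NULL, worker_main, &conns[i]);
    for (int i = 0; i < n; i++)
        pthread_join(tids[i], NULL);
    free(tids);
}
```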

SLIDE 18

PROCESS MODELS

Using a multi-threaded architecture has several advantages:

→ Less overhead per context switch.
→ Don't have to manage shared memory.

The thread per worker model does not mean that you have intra-query parallelism.
I am not aware of any new DBMS built in the last 7-8 years that doesn't use threads.

SLIDE 19

SCHEDULING

For each query plan, the DBMS has to decide where, when, and how to execute it.

→ How many tasks should it use?
→ How many CPU cores should it use?
→ What CPU core should the tasks execute on?
→ Where should a task store its output?

The DBMS always knows more than the OS.

SLIDE 20

INTER-QUERY PARALLELISM

Improve overall performance by allowing multiple queries to execute simultaneously.

→ Provide the illusion of isolation through a concurrency control scheme.

The difficulty of implementing a concurrency control scheme is not significantly affected by the DBMS's process model.

SLIDE 21

INTRA-QUERY PARALLELISM

Improve the performance of a single query by executing its operators in parallel.

Approach #1: Intra-Operator (Horizontal)
→ Operators are decomposed into independent instances that perform the same function on different subsets of data.

Approach #2: Inter-Operator (Vertical)
→ Operations are overlapped in order to pipeline data from one stage to the next without materialization.

SLIDE 22

INTRA-OPERATOR PARALLELISM

SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100

[Figure: the plan π(σ(A) ⨝ σ(B)) is parallelized step by step. Table A is split into fragments A1-A3 and table B into fragments B1-B2; tasks 1-5 run the σ filters and hash-table build over their own fragments, exchange operators route the partial results between stages, and four probe tasks then perform the join feeding the final π.]
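Intra-operator (horizontal) parallelism can be sketched with a parallel selection: the same σ runs as independent instances on disjoint fragments of the input, and an exchange step merges the partial results. This is an illustrative sketch of the idea, not the plan above:

```c
#include <assert.h>
#include <pthread.h>

#define NWORKERS 3

/* One fragment of table A, plus this worker's σ output. */
typedef struct Fragment {
    const int *values;
    int count;
    int matched;  /* rows with value < 99, per the example predicate */
} Fragment;

/* Each operator instance runs the same selection on its own fragment. */
static void *select_worker(void *arg) {
    Fragment *f = arg;
    f->matched = 0;
    for (int i = 0; i < f->count; i++)
        if (f->values[i] < 99)
            f->matched++;
    return NULL;
}

static int parallel_select(Fragment frags[NWORKERS]) {
    pthread_t tids[NWORKERS];
    for (int i = 0; i < NWORKERS; i++)
        pthread_create(&tids[i], NULL, select_worker, &frags[i]);
    int total = 0;
    for (int i = 0; i < NWORKERS; i++) {
        pthread_join(tids[i], NULL);
        total += frags[i].matched;  /* "exchange": merge partial results */
    }
    return total;
}
```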

SLIDE 36

INTER-OPERATOR PARALLELISM

SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100

Stage 1 (⨝):
  for r1 ∊ outer:
    for r2 ∊ inner:
      emit(r1 ⨝ r2)

Stage 2 (π):
  for r ∊ incoming:
    emit(π(r))
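The two stages above can be overlapped by pushing each join result directly into the projection as it is produced, instead of materializing the join output first. A minimal push-based sketch (names are illustrative, the σ filters are omitted, and a real engine would run the stages on separate threads):

```c
#include <assert.h>
#include <stddef.h>

typedef struct Row { int id; int value; } Row;

/* Stage 2 (π): consumes one joined row at a time. */
static int emitted;        /* rows that reached the projection */
static int projected[16];  /* π output: just B.value in this sketch */

static void project_emit(Row r) {
    projected[emitted++] = r.value;
}

/* Stage 1 (⨝): nested-loop join; emit() pipelines each match into
 * stage 2 with no intermediate materialization. */
static void join_stages(const Row *outer, size_t n_outer,
                        const Row *inner, size_t n_inner) {
    for (size_t i = 0; i < n_outer; i++)
        for (size_t j = 0; j < n_inner; j++)
            if (outer[i].id == inner[j].id) {
                Row joined = { outer[i].id, inner[j].value };
                project_emit(joined);  /* push to the next stage */
            }
}
```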

SLIDE 41

OBSERVATION

Coming up with the right number of workers to use for a query plan depends on the number of CPU cores, the size of the data, and the functionality of the operators.

SLIDE 42

WORKER ALLOCATION

Approach #1: One Worker per Core
→ Each core is assigned one thread that is pinned to that core in the OS.
→ See sched_setaffinity.

Approach #2: Multiple Workers per Core
→ Use a pool of workers per core (or per socket).
→ Allows CPU cores to be fully utilized in case one worker at a core blocks.
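Pinning the calling thread to a core with Linux's `sched_setaffinity` looks like this (pinning to core 0 purely for illustration):

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Pin the calling thread to the given CPU core; returns 0 on success. */
static int pin_to_core(int core) {
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(core, &mask);
    /* pid 0 = the calling thread */
    return sched_setaffinity(0, sizeof(mask), &mask);
}
```

A worker-per-core DBMS would call this once in each worker thread at startup, passing that worker's core id.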

SLIDE 43

TASK ASSIGNMENT

Approach #1: Push
→ A centralized dispatcher assigns tasks to workers and monitors their progress.
→ When the worker notifies the dispatcher that it is finished, it is given a new task.

Approach #2: Pull
→ Workers pull the next task from a queue, process it, and then return to get the next task.
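The pull approach can be sketched with a mutex-protected queue that workers drain; real systems often use lock-free queues instead, and all names here are illustrative:

```c
#include <assert.h>
#include <pthread.h>

#define NTASKS   8
#define NWORKERS 4

typedef struct TaskQueue {
    pthread_mutex_t lock;
    int next;   /* index of the next unclaimed task */
    int total;
} TaskQueue;

/* Pull model: a worker claims the next task, or -1 when the queue is empty. */
static int pull_task(TaskQueue *q) {
    pthread_mutex_lock(&q->lock);
    int t = (q->next < q->total) ? q->next++ : -1;
    pthread_mutex_unlock(&q->lock);
    return t;
}

static int results[NTASKS];

static void *worker(void *arg) {
    TaskQueue *q = arg;
    int t;
    while ((t = pull_task(q)) != -1)
        results[t] = t * t;  /* stand-in for executing the task */
    return NULL;
}

static void run_all(TaskQueue *q) {
    pthread_t tids[NWORKERS];
    for (int i = 0; i < NWORKERS; i++)
        pthread_create(&tids[i], NULL, worker, q);
    for (int i = 0; i < NWORKERS; i++)
        pthread_join(tids[i], NULL);
}
```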

SLIDE 44

OBSERVATION

Regardless of what worker allocation or task assignment policy the DBMS uses, it is important that workers operate on local data. The DBMS's scheduler has to be aware of its underlying hardware's memory layout.

→ Uniform vs. Non-Uniform Memory Access

SLIDE 45

UNIFORM MEMORY ACCESS

[Figure: four CPUs, each with its own cache, all sharing one memory over a common bus.]

SLIDE 46

NON-UNIFORM MEMORY ACCESS

[Figure: four CPUs, each with its own cache and local memory; accessing another socket's memory is slower than accessing local memory.]

SLIDE 47

DATA PLACEMENT

The DBMS can partition memory for a database and assign each partition to a CPU. By controlling and tracking the location of partitions, it can schedule operators to execute on workers at the closest CPU core.

See Linux's move_pages.
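The `move_pages` syscall can both migrate pages between NUMA nodes and, when passed a NULL node array, merely report which node currently backs each page. A query-only sketch via the raw syscall (Linux-specific; `node_of` is an illustrative helper name):

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Report the NUMA node currently backing one page of this process's
 * memory, using move_pages in query mode (nodes == NULL). */
static int node_of(void *addr) {
    void *pages[1] = { addr };
    int status[1] = { -1 };
    long rc = syscall(SYS_move_pages, 0 /* self */, 1UL, pages,
                      NULL /* query only */, status, 0);
    return rc == 0 ? status[0] : -1;
}
```

A NUMA-aware scheduler could use the same call with a non-NULL node array to move a partition's pages next to the worker that will scan them.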

SLIDE 48

MEMORY ALLOCATION

What happens when the DBMS calls malloc?
→ Assume that the allocator doesn't already have a chunk of memory that it can give out.

Actually, almost nothing:
→ The allocator will extend the process's data segment.
→ But this new virtual memory is not immediately backed by physical memory.
→ The OS only allocates physical memory when there is a page fault.
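This lazy behavior can be observed directly on Linux: a large allocation barely changes the process's resident set size until the pages are actually touched. A sketch that reads the resident page count from /proc/self/statm (Linux-specific; helper name is illustrative):

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Resident page count of the current process, from /proc/self/statm
 * (second field). */
static long resident_pages(void) {
    FILE *f = fopen("/proc/self/statm", "r");
    long size = 0, resident = 0;
    if (f) {
        fscanf(f, "%ld %ld", &size, &resident);
        fclose(f);
    }
    return resident;
}
```

Allocating, say, 64 MiB moves this number very little; memset-ting the buffer then faults every page in and the resident count jumps by roughly the buffer's size in pages.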

SLIDE 51

MEMORY ALLOCATION LOCATION

After a page fault, where does the OS allocate physical memory in a NUMA system?

Approach #1: Interleaving
→ Distribute allocated memory uniformly across CPUs.

Approach #2: First-Touch
→ At the CPU of the thread that accessed the memory location that caused the page fault.

SLIDE 52

DATA PLACEMENT

[Figure: bar chart of throughput (txn/sec, axis range 4000-12000) for four placement strategies: Spread, Group, Mix, and OS.]

Workload: TPC-C Payment using 4 Workers
Processor: NUMA with 4 sockets (6 cores each)

Source: Danica Porobic

SLIDE 54

PARTITIONING VS. PLACEMENT

A partitioning scheme is used to split the database based on some policy.
→ Round-robin
→ Attribute Ranges
→ Hashing
→ Partial/Full Replication

A placement scheme then tells the DBMS where to put those partitions.
→ Round-robin
→ Interleave across cores

SLIDE 55

OBSERVATION

We have the following so far:
→ Process Model
→ Worker Allocation Model
→ Task Assignment Model
→ Data Placement Policy
→ Solid appreciation for the CMU-DB fam.

But how do we decide how to create a set of tasks from a logical query plan?
→ This is relatively easy for OLTP queries.
→ Much harder for OLAP queries…

SLIDE 56

STATIC SCHEDULING

The DBMS decides how many threads to use to execute the query when it generates the plan. It does not change while the query executes.
→ The easiest approach is to just use the same # of tasks as the # of cores.

SLIDE 57

MORSEL-DRIVEN SCHEDULING

Dynamic scheduling of tasks that operate over horizontal partitions called "morsels" that are distributed across cores.
→ One worker per core
→ Pull-based task assignment
→ Round-robin data placement

Supports parallel, NUMA-aware operator implementations.

MORSEL-DRIVEN PARALLELISM: A NUMA-AWARE QUERY EVALUATION FRAMEWORK FOR THE MANY-CORE AGE
SIGMOD 2014

SLIDE 58

HYPER: ARCHITECTURE

No separate dispatcher thread. The threads perform cooperative scheduling for each query plan.
→ Each worker has a queue of tasks that will execute on morsels that are local to it.
→ It pulls the next task from a global work queue.

SLIDE 59

HYPER: DATA PARTITIONING

SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100

[Figure: the data table for A (columns id, a1, a2, a3) is split into horizontal morsels A1-A3, which are assigned to cores 1-3.]

SLIDE 62

HYPER: EXECUTION EXAMPLE

SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100

[Figure: cores 1-3 each hold local morsels and local data; each core's task queue feeds it tasks that operate on its NUMA-local morsels as the query executes.]

SLIDE 77

MORSEL-DRIVEN SCHEDULING

Because there is only one worker per core, they have to use work stealing because otherwise threads could sit idle waiting for stragglers.

Uses a lock-free hash table to maintain the global work queues.
→ We will discuss hash tables next class…
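Work stealing can be sketched with per-worker queues plus a steal path. This illustration uses a mutex per queue rather than the lock-free structures a real morsel-driven engine would use, and all names are illustrative:

```c
#include <assert.h>
#include <pthread.h>

#define NQUEUES 2
#define QCAP    16

/* Per-worker task queue; a real engine would make this lock-free
 * and keep each queue's morsels NUMA-local to its worker. */
typedef struct WorkQueue {
    pthread_mutex_t lock;
    int tasks[QCAP];
    int head, tail;
} WorkQueue;

static int try_pop(WorkQueue *q, int *out) {
    pthread_mutex_lock(&q->lock);
    int ok = q->head < q->tail;
    if (ok) *out = q->tasks[q->head++];
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Prefer the worker's own queue; when it runs dry, steal from the
 * others so no core sits idle waiting for stragglers. */
static int next_task(WorkQueue qs[NQUEUES], int self, int *out) {
    if (try_pop(&qs[self], out)) return 1;
    for (int i = 0; i < NQUEUES; i++)
        if (i != self && try_pop(&qs[i], out)) return 1;
    return 0;  /* everything is drained */
}
```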

SLIDE 78

HANA NUMA-AWARE SCHEDULER

Pull-based scheduling with multiple worker threads that are organized into groups (pools).
→ Each CPU can have multiple groups.
→ Each group has a soft and a hard priority queue.

Uses a separate "watchdog" thread to check whether groups are saturated and can reassign tasks dynamically.

SCALING UP CONCURRENT MAIN-MEMORY COLUMN-STORE SCANS: TOWARDS ADAPTIVE NUMA-AWARE DATA AND TASK PLACEMENT
VLDB 2015

SLIDE 79

HANA THREAD GROUPS

Each thread group has a soft and a hard priority task queue.
→ Threads are allowed to steal tasks from other groups' soft queues.

Four different pools of threads per group:
→ Working: Actively executing a task.
→ Inactive: Blocked inside the kernel due to a latch.
→ Free: Sleeps for a little, wakes up to see whether there is a new task to execute.
→ Parked: Like free, but doesn't wake up on its own.

SLIDE 80

HANA NUMA-AWARE SCHEDULER

Can dynamically adjust thread pinning based on whether a task is CPU or memory bound.

Found that work stealing was not as beneficial for systems with a larger number of sockets.

Using thread groups allows cores to execute other tasks instead of only queries.

SLIDE 81

PARTING THOUGHTS

A DBMS is a beautiful, strong-willed, independent piece of software. But it has to make sure that it uses its underlying hardware correctly.
→ Data location is an important aspect of this.
→ Tracking memory location in a single-node DBMS is the same as tracking shards in a distributed DBMS.

Don't let the OS ruin your life.

SLIDE 82

10 CRACK COMMANDMENTS

In 1997, Christopher Wallace wrote a prophetic list of rules to follow if you are hustling product out on the streets. Almost 20 years later, these rules are still apt for both trapping and databases.

10 CRACK COMMANDMENTS
BAD BOY RECORDS 1997

SLIDE 83

10 CRACK COMMANDMENTS

1. Never let people know how much money you have.
2. Never let people know your next move.
3. Never trust anybody.
4. Never sample your own supply.
5. Never sling where you live.
6. Never lend anybody credit.
7. Never mix your family with your business affairs.
8. Never keep a large amount of product on yourself.
9. Never talk to the police.
10. Never take a consignment if you do not have the clientele.

SLIDE 84

NEXT CLASS

Parallel Hash Joins!
Project #2 Checkpoint: Monday Feb 22