15-721 ADVANCED DATABASE SYSTEMS Lecture #18 Parallel Join - PowerPoint PPT Presentation

15-721 ADVANCED DATABASE SYSTEMS Lecture #18 – Parallel Join Algorithms (Hashing) Andy Pavlo / / Carnegie Mellon University / / Spring 2016 @Andy_Pavlo // Carnegie Mellon University // Spring 2017

2 TODAY’S AGENDA Background Parallel Hash Join Hash Functions Hash Table Implementations Evaluation CMU 15-721 (Spring 2017)

3 PARALLEL JOIN ALGORITHMS Perform a join between two relations on multiple threads simultaneously to speed up operation. Two main approaches: → Hash Join → Sort-Merge Join We won’t discuss nested-loop joins… CMU 15-721 (Spring 2017)

4 OBSERVATION Many OLTP DBMSs don’t implement hash join. But a index nested-loop join with a small number of target tuples is more or less equivalent to a hash join. CMU 15-721 (Spring 2017)

5 HASHING VS. SORTING 1970s – Sorting 1980s – Hashing 1990s – Equivalent 2000s – Hashing 2010s – ??? CMU 15-721 (Spring 2017)

6 PARALLEL JOIN ALGORITHMS SO SORT VS. S. HASH SH REV EVISI SITED ED: FAST ST JOIN I JO N IMPLEMENT NTATION O N ON N MO MODERN MUL MULTI-CO CORE CPU CPUS VLDB 2009 → Hashing is faster than Sort-Merge. → Sort-Merge will be faster with wider SIMD. MASSI SSIVEL ELY PARA RALLEL S SORT ORT- MAI AIN-MEMORY ORY HASH JOI OINS ON ON MERG RGE JOI OINS I IN MAIN MEMORY ORY MULTI-CO MUL CORE CPU CPUS: : TUNING T TO MULTI-CORE D MUL E DATABASE SY SE SYST STEM EMS THE U UNDERL RLYI YING H HARD RDWARE RE VLDB 2012 ICDE 2013 → Sort-Merge is already faster, → New optimizations and results even without SIMD. for Radix Hash Join. Source: Cagri Balkesen CMU 15-721 (Spring 2017)

7 JOIN ALGORITHM DESIGN GOALS Goal #1: Minimize Synchronization → Avoid taking latches during execution. Goal #2: Minimize CPU Cache Misses → Ensure that data is always local to worker thread. CMU 15-721 (Spring 2017)

8 IMPROVING CACHE BEHAVIOR Factors that affect cache misses in a DBMS: → Cache + TLB capacity. → Locality (temporal and spatial). Non-Random Access (Scan): → Clustering to a cache line. → Execute more operations per cache line. Random Access (Lookups): → Partition data to fit in cache + TLB. Source: Johannes Gehrke CMU 15-721 (Spring 2017)

9 PARALLEL HASH JOINS Hash join is the most important operator in a DBMS for OLAP workloads. It’s important that we speed it up by taking advantage of multiple cores. → We want to keep all of the cores busy, without becoming memory bound DESIGN AND EVALUATION OF MAIN MEMORY HASH JOIN ALGORITHMS FOR MULTI-CORE CPUS SIGMOD 2011 CMU 15-721 (Spring 2017)

10 CLOUDERA IMPALA % of Total CPU Time Spent in Query Operators Workload: TPC-H Benchmark HASH JOIN 25.0% SEQ SCAN 49.6% UNION 3.1% AGGREGATE 19.9% OTHER 2.4% CMU 15-721 (Spring 2017)

11 HASH JOIN (R ⨝ S) Phase #1: Partition ( optional ) → Divide the tuples of R and S into sets using a hash on the join key. Phase #2: Build → Scan relation R and create a hash table on join key. Phase #3: Probe → For each tuple in S , look up its join key in hash table for R . If a match is found, output combined tuple. CMU 15-721 (Spring 2017)

12 PARTITION PHASE Split the input relations into partitioned buffers by hashing the tuples’ join key(s). → The hash function used for this phase should be different than the one used in the build phase. → Ideally the cost of partitioning is less than the cost of cache misses during build phase. Contents of buffers depends on storage model: → NSM : Either the entire tuple or a subset of attributes. → DSM : Only the columns needed for the join + offset. CMU 15-721 (Spring 2017)

13 PARTITION PHASE Approach #1: Non-Blocking Partitioning → Only scan the input relation once. → Produce output incrementally. Approach #2: Blocking Partitioning (Radix) → Scan the input relation multiple times. → Only materialize results all at once. CMU 15-721 (Spring 2017)

14 NON-BLOCKING PARTITIONING Scan the input relation only once and generate the output on-the-fly. Approach #1: Shared Partitions → Single global set of partitions that all threads update. → Have to use a latch to synchronize threads. Approach #2: Private Partitions → Each thread has its own set of partitions. → Have to consolidate them after all threads finish. CMU 15-721 (Spring 2017)

15 SHARED PARTITIONS Data Table A B C CMU 15-721 (Spring 2017)

15 SHARED PARTITIONS Data Table hash P (key) A B C # p # p # p CMU 15-721 (Spring 2017)

15 SHARED PARTITIONS Data Table Partitions hash P (key) A B C P 1 # p P 2 # p ⋮ # p P n CMU 15-721 (Spring 2017)

16 PRIVATE PARTITIONS Data Table Partitions hash P (key) A B C # p # p # p CMU 15-721 (Spring 2017)

16 PRIVATE PARTITIONS Data Table Partitions Combined hash P (key) A B C P 1 # p P 2 # p ⋮ # p P n CMU 15-721 (Spring 2017)

17 RADIX PARTITIONING Scan the input relation multiple times to generate the partitions. Multi-step pass over the relation: → Step #1: Scan R and compute a histogram of the # of tuples per hash key for the radix at some offset. → Step #2: Use this histogram to determine output offsets by computing the prefix sum . → Step #3: Scan R again and partition them according to the hash key. CMU 15-721 (Spring 2017)

18 RADIX The radix is the value of an integer at a particular position (using its base). Input 89 12 23 08 41 64 CMU 15-721 (Spring 2017)

18 RADIX The radix is the value of an integer at a particular position (using its base). Input 89 12 23 08 41 64 Radix 9 2 3 8 1 4 CMU 15-721 (Spring 2017)

18 RADIX The radix is the value of an integer at a particular position (using its base). Input 89 12 23 08 41 64 Radix 8 1 2 0 4 6 CMU 15-721 (Spring 2017)

19 PREFIX SUM The prefix sum of a sequence of numbers ( x 0 , x 1 , …, x n ) is a second sequence of numbers ( y 0 , y1, …, y n ) that is a running total of the input sequence. Input 1 2 3 4 5 6 CMU 15-721 (Spring 2017)

19 PREFIX SUM The prefix sum of a sequence of numbers ( x 0 , x 1 , …, x n ) is a second sequence of numbers ( y 0 , y1, …, y n ) that is a running total of the input sequence. Input 1 2 3 4 5 6 Prefix Sum 1 CMU 15-721 (Spring 2017)

19 PREFIX SUM The prefix sum of a sequence of numbers ( x 0 , x 1 , …, x n ) is a second sequence of numbers ( y 0 , y1, …, y n ) that is a running total of the input sequence. Input 1 2 3 4 5 6 + Prefix Sum 1 3 CMU 15-721 (Spring 2017)

19 PREFIX SUM The prefix sum of a sequence of numbers ( x 0 , x 1 , …, x n ) is a second sequence of numbers ( y 0 , y1, …, y n ) that is a running total of the input sequence. Input 1 2 3 4 5 6 + + + + + Prefix Sum 1 3 6 10 15 21 CMU 15-721 (Spring 2017)

20 RADIX PARTITIONS Step #1: Inspect input, create histograms # 0 7 p # 1 8 p 0 hash P (key) # 1 9 p # 0 7 p # 0 3 p # 1 1 p 1 # 1 5 p # 1 0 p Source: Spyros Blanas CMU 15-721 (Spring 2017)

20 RADIX PARTITIONS Step #1: Inspect input, create histograms # 0 7 p # 1 8 p 0 hash P (key) # 1 9 p Partition 0: 2 # 0 7 Partition 1: 2 p # 0 3 p # 1 1 p 1 # 1 5 p Partition 0: 1 # 1 0 Partition 1: 3 p Source: Spyros Blanas CMU 15-721 (Spring 2017)

20 RADIX PARTITIONS Step #2: Compute output offsets Partition 0 , CPU 0 # 0 7 p # 1 8 p 0 Partition 0, CPU 1 hash P (key) # 1 9 p Partition 0: 2 Partition 1 , CPU 0 # 0 7 Partition 1: 2 p # 0 3 p Partition 1, CPU 1 # 1 1 p 1 # 1 5 p Partition 0: 1 # 1 0 Partition 1: 3 p Source: Spyros Blanas CMU 15-721 (Spring 2017)

20 RADIX PARTITIONS Step #3: Read input and partition Partition 0 , CPU 0 # 0 7 0 7 p # 1 8 p 0 Partition 0, CPU 1 hash P (key) # 1 9 0 3 p Partition 0: 2 Partition 1 , CPU 0 # 0 7 Partition 1: 2 p # 0 3 p Partition 1, CPU 1 # 1 1 p 1 # 1 5 p Partition 0: 1 # 1 0 Partition 1: 3 p Source: Spyros Blanas CMU 15-721 (Spring 2017)

20 RADIX PARTITIONS Step #3: Read input and partition Partition 0 , CPU 0 # 0 7 0 7 p # 1 8 0 7 p 0 Partition 0, CPU 1 hash P (key) # 1 9 0 3 p Partition 0: 2 Partition 1 , CPU 0 # 0 7 1 8 Partition 1: 2 p # 0 3 1 9 p Partition 1, CPU 1 # 1 1 1 1 p 1 # 1 5 1 5 p Partition 0: 1 # 1 0 1 0 Partition 1: 3 p Source: Spyros Blanas CMU 15-721 (Spring 2017)

20 RADIX PARTITIONS Partition 0 # 0 7 0 7 p # 1 8 0 7 p 0 hash P (key) # 1 9 0 3 p Partition 0: 2 Partition 1 # 0 7 1 8 Partition 1: 2 p # 0 3 1 9 p # 1 1 1 1 p 1 # 1 5 1 5 p Partition 0: 1 # 1 0 1 0 Partition 1: 3 p Source: Spyros Blanas CMU 15-721 (Spring 2017)

15-721 ADVANCED DATABASE SYSTEMS Lecture #18 Parallel Join - PowerPoint PPT Presentation

15-721 ADVANCED DATABASE SYSTEMS Lecture #18 Parallel Join Algorithms (Hashing) Andy Pavlo / / Carnegie Mellon University / / Spring 2016 @Andy_Pavlo // Carnegie Mellon University // Spring 2017 2 TODAYS AGENDA Background Parallel

ADVANCED DATABASE SYSTEMS Server-side Logic Execution @ Andy_Pavlo // 15- 721 // Spring 2019

Homework Assignment: 5 11-721: Grammars and Lexicons 11-721: Grammars and Lexicons Fall 2007

ADVANCED DATABASE SYSTEMS Networking @ Andy_Pavlo // 15- 721 // Spring 2019 CMU 15-721

ADVANCED DATABASE SYSTEMS History of Databases @ Andy_Pavlo // 15- 721 // Spring 2020 2

ADVANCED DATABASE SYSTEMS Vectorized Execution @ Andy_Pavlo // 15- 721 // Spring 2019 CMU

ADVANCED DATABASE SYSTEMS OLTP Indexes (Trie Data Structures) @ Andy_Pavlo // 15- 721 //

ADVANCED DATABASE SYSTEMS Self-Driving Database Management Systems @ Andy_Pavlo // 15- 721 //

ADVANCED DATABASE SYSTEMS Database Compression @ Andy_Pavlo // 15- 721 // Spring 2019 CMU

15-721 DATABASE SYSTEMS [Source] Lecture #08 Indexing (OLAP) Andy Pavlo / / Carnegie

ADVANCED DATABASE SYSTEMS Vectorization vs. Compilation @ Andy_Pavlo // 15- 721 // Spring

ADVANCED DATABASE SYSTEMS Storage Models & Data Layout @ Andy_Pavlo // 15- 721 // Spring

15-721 DATABASE SYSTEMS Lecture #09 Storage Models & Data Layout Andy Pavlo / /

15-721 DATABASE SYSTEMS [Source] Lecture #03 Concurrency Control Part I Andy Pavlo / /

Lect ure # 03 ADVANCED DATABASE SYSTEMS Query Compilation @ Andy_Pavlo // 15- 721 // Spring

15-721 DATABASE SYSTEMS [Source] Lecture #04 Concurrency Control Part II Andy Pavlo / /

ADVANCED DATABASE SYSTEMS Recovery Protocols @ Andy_Pavlo // 15- 721 // Spring 2019 CMU

Smoke-free Foster Care: Tales from the Field June 24, 2014 Tobacco Control Legal Consortium

Topic #3 CS162 Topic #3 1 Topic #3 Abstract Data Types Introduction to...Object Models

Simple Arrays Short introduction to processing data Recording and storing sound Analog

Verification of blockchains and smart contracts Formal Methods Update, 2018 BITS Pilani Goa

Responding to the Housing Challenges Posed by the Pandemic Presenters Call llie S Selt

Rediscovering Structural Change: Africa has the largest differences in productivity Differences

Private Sector Resources for Family Planning Pamela Riley Strengthening Health Outcomes through

Coarse-Grained Parallelism Variable Privatization, Loop Alignment, Loop Fusion, Loop

15-721 ADVANCED DATABASE SYSTEMS Lecture #18 Parallel Join - PowerPoint PPT Presentation

15-721 ADVANCED DATABASE SYSTEMS Lecture #18 Parallel Join Algorithms (Hashing) Andy Pavlo / / Carnegie Mellon University / / Spring 2016 @Andy_Pavlo // Carnegie Mellon University // Spring 2017 2 TODAYS AGENDA Background Parallel

ADVANCED DATABASE SYSTEMS Server-side Logic Execution @ Andy_Pavlo // 15- 721 // Spring 2019

Homework Assignment: 5 11-721: Grammars and Lexicons 11-721: Grammars and Lexicons Fall 2007

ADVANCED DATABASE SYSTEMS Networking @ Andy_Pavlo // 15- 721 // Spring 2019 CMU 15-721

ADVANCED DATABASE SYSTEMS History of Databases @ Andy_Pavlo // 15- 721 // Spring 2020 2

ADVANCED DATABASE SYSTEMS Vectorized Execution @ Andy_Pavlo // 15- 721 // Spring 2019 CMU

ADVANCED DATABASE SYSTEMS OLTP Indexes (Trie Data Structures) @ Andy_Pavlo // 15- 721 //

ADVANCED DATABASE SYSTEMS Self-Driving Database Management Systems @ Andy_Pavlo // 15- 721 //

ADVANCED DATABASE SYSTEMS Database Compression @ Andy_Pavlo // 15- 721 // Spring 2019 CMU

15-721 DATABASE SYSTEMS [Source] Lecture #08 Indexing (OLAP) Andy Pavlo / / Carnegie

ADVANCED DATABASE SYSTEMS Vectorization vs. Compilation @ Andy_Pavlo // 15- 721 // Spring

ADVANCED DATABASE SYSTEMS Storage Models &amp; Data Layout @ Andy_Pavlo // 15- 721 // Spring

15-721 DATABASE SYSTEMS Lecture #09 Storage Models &amp; Data Layout Andy Pavlo / /

15-721 DATABASE SYSTEMS [Source] Lecture #03 Concurrency Control Part I Andy Pavlo / /

Lect ure # 03 ADVANCED DATABASE SYSTEMS Query Compilation @ Andy_Pavlo // 15- 721 // Spring

15-721 DATABASE SYSTEMS [Source] Lecture #04 Concurrency Control Part II Andy Pavlo / /

ADVANCED DATABASE SYSTEMS Recovery Protocols @ Andy_Pavlo // 15- 721 // Spring 2019 CMU

Smoke-free Foster Care: Tales from the Field June 24, 2014 Tobacco Control Legal Consortium

Topic #3 CS162 Topic #3 1 Topic #3 Abstract Data Types Introduction to...Object Models

Simple Arrays Short introduction to processing data Recording and storing sound Analog

Verification of blockchains and smart contracts Formal Methods Update, 2018 BITS Pilani Goa

Responding to the Housing Challenges Posed by the Pandemic Presenters Call llie S Selt

Rediscovering Structural Change: Africa has the largest differences in productivity Differences

Private Sector Resources for Family Planning Pamela Riley Strengthening Health Outcomes through

Coarse-Grained Parallelism Variable Privatization, Loop Alignment, Loop Fusion, Loop

ADVANCED DATABASE SYSTEMS Storage Models & Data Layout @ Andy_Pavlo // 15- 721 // Spring

15-721 DATABASE SYSTEMS Lecture #09 Storage Models & Data Layout Andy Pavlo / /