database system implementation
play

DATABASE SYSTEM IMPLEMENTATION GT 4420/6422 // SPRING 2019 // - PowerPoint PPT Presentation

DATABASE SYSTEM IMPLEMENTATION GT 4420/6422 // SPRING 2019 // @JOY_ARULRAJ LECTURE #20: PARALLEL JOIN ALGORITHMS (HASHING) 2 ANATOMY OF A DATABASE SYSTEM Process Manager Connection Manager + Admission Control Query Parser Query Processor


  1. 38 RADIX PARTITIONING Scan the input relation multiple times to generate the partitions. Multi-step pass over the relation: → Step #1: Scan R and compute a histogram of the # of tuples per hash key for the radix at some offset. → Step #2: Use this histogram to determine output offsets by computing the prefix sum . → Step #3: Scan R again and partition them according to the hash key.

  2. 39 RADIX The radix is the value of an integer at a particular position (using its base). Input 89 12 23 08 41 64

  3. 40 RADIX The radix is the value of an integer at a particular position (using its base). Input 89 12 23 08 41 64

  4. 41 RADIX The radix is the value of an integer at a particular position (using its base). Input 89 12 23 08 41 64 Radix 9 2 3 8 1 4

  5. 42 RADIX The radix is the value of an integer at a particular position (using its base). Input 89 12 23 08 41 64 Radix

  6. 43 RADIX The radix is the value of an integer at a particular position (using its base). Input 89 12 23 08 41 64 Radix 8 1 2 0 4 6

  7. 44 PREFIX SUM The prefix sum of a sequence of numbers ( x 0 , x 1 , … , x n ) is a second sequence of numbers ( y 0 , y1, … , y n ) that is a running total of the input sequence. Input 1 2 3 4 5 6

  8. 45 PREFIX SUM The prefix sum of a sequence of numbers ( x 0 , x 1 , … , x n ) is a second sequence of numbers ( y 0 , y1, … , y n ) that is a running total of the input sequence. Input 1 2 3 4 5 6 Prefix Sum 1

  9. 46 PREFIX SUM The prefix sum of a sequence of numbers ( x 0 , x 1 , … , x n ) is a second sequence of numbers ( y 0 , y1, … , y n ) that is a running total of the input sequence. Input 1 2 3 4 5 6 + Prefix Sum 1

  10. 47 PREFIX SUM The prefix sum of a sequence of numbers ( x 0 , x 1 , … , x n ) is a second sequence of numbers ( y 0 , y1, … , y n ) that is a running total of the input sequence. Input 1 2 3 4 5 6 + Prefix Sum 1 3

  11. 48 PREFIX SUM The prefix sum of a sequence of numbers ( x 0 , x 1 , … , x n ) is a second sequence of numbers ( y 0 , y1, … , y n ) that is a running total of the input sequence. Input 1 2 3 4 5 6 + + + + + Prefix Sum 1 3 6 10 15 21

  12. 49 RADIX PARTITIONS # 0 7 p # 1 8 p 0 # 1 9 hash P (key) p # 0 7 p # 0 3 p # 1 1 p 1 # 1 5 p # 1 0 p Source: Spyros Blanas

  13. 50 RADIX PARTITIONS Step #1: Inspect input, create histograms # 0 7 p # 1 8 p 0 # 1 9 hash P (key) p # 0 7 p # 0 3 p # 1 1 p 1 # 1 5 p # 1 0 p Source: Spyros Blanas

  14. 51 RADIX PARTITIONS Step #1: Inspect input, create histograms # 0 7 p # 1 8 p 0 # 1 9 hash P (key) p # 0 7 p # 0 3 p # 1 1 p 1 # 1 5 p # 1 0 p Source: Spyros Blanas

  15. 52 RADIX PARTITIONS Step #1: Inspect input, create histograms # 0 7 p # 1 8 p 0 # 1 9 hash P (key) p # 0 7 p # 0 3 p # 1 1 p 1 # 1 5 p # 1 0 p Source: Spyros Blanas

  16. 53 RADIX PARTITIONS Step #1: Inspect input, create histograms # 0 7 p # 1 8 p 0 # 1 9 hash P (key) p # 0 7 p # 0 3 p # 1 1 p 1 # 1 5 p # 1 0 p Source: Spyros Blanas

  17. 54 RADIX PARTITIONS Step #1: Inspect input, create histograms # 0 7 p # 1 8 p 0 # 1 9 hash P (key) p # 0 7 p # 0 3 p # 1 1 p 1 # 1 5 p # 1 0 p Source: Spyros Blanas

  18. 55 RADIX PARTITIONS Step #1: Inspect input, create histograms # 0 7 p # 1 8 p 0 # 1 9 hash P (key) p Partition 0: 2 # 0 7 Partition 1: 2 p # 0 3 p # 1 1 p 1 # 1 5 p Partition 0: 1 # 1 0 Partition 1: 3 p Source: Spyros Blanas

  19. 56 RADIX PARTITIONS Step #2: Compute output offsets # 0 7 p # 1 8 p 0 # 1 9 hash P (key) p Partition 0: 2 # 0 7 Partition 1: 2 p # 0 3 p # 1 1 p 1 # 1 5 p Partition 0: 1 # 1 0 Partition 1: 3 p Source: Spyros Blanas

  20. 57 RADIX PARTITIONS Step #2: Compute output offsets Partition 0 , CPU 0 # 0 7 p # 1 8 p 0 Partition 0, CPU 1 # 1 9 hash P (key) p Partition 0: 2 Partition 1 , CPU 0 # 0 7 Partition 1: 2 p # 0 3 p Partition 1, CPU 1 # 1 1 p 1 # 1 5 p Partition 0: 1 # 1 0 Partition 1: 3 p Source: Spyros Blanas

  21. 58 RADIX PARTITIONS Step #3: Read input and partition Partition 0 , CPU 0 # 0 7 p # 1 8 p 0 Partition 0, CPU 1 # 1 9 hash P (key) p Partition 0: 2 Partition 1 , CPU 0 # 0 7 Partition 1: 2 p # 0 3 p Partition 1, CPU 1 # 1 1 p 1 # 1 5 p Partition 0: 1 # 1 0 Partition 1: 3 p Source: Spyros Blanas

  22. 59 RADIX PARTITIONS Step #3: Read input and partition Partition 0 , CPU 0 # 0 7 0 7 p # 1 8 p 0 Partition 0, CPU 1 # 1 9 0 3 hash P (key) p Partition 0: 2 Partition 1 , CPU 0 # 0 7 Partition 1: 2 p # 0 3 p Partition 1, CPU 1 # 1 1 p 1 # 1 5 p Partition 0: 1 # 1 0 Partition 1: 3 p Source: Spyros Blanas

  23. 60 RADIX PARTITIONS Step #3: Read input and partition Partition 0 , CPU 0 # 0 7 0 7 p # 1 8 0 7 p 0 Partition 0, CPU 1 # 1 9 0 3 hash P (key) p Partition 0: 2 Partition 1 , CPU 0 # 0 7 1 8 Partition 1: 2 p # 0 3 1 9 p Partition 1, CPU 1 # 1 1 1 1 p 1 # 1 5 1 5 p Partition 0: 1 # 1 0 1 0 Partition 1: 3 p Source: Spyros Blanas

  24. 61 RADIX PARTITIONS Partition 0 # 0 7 0 7 p # 1 8 0 7 p 0 # 1 9 0 3 hash P (key) p Partition 0: 2 Partition 1 # 0 7 1 8 Partition 1: 2 p # 0 3 1 9 p # 1 1 1 1 p 1 # 1 5 1 5 p Partition 0: 1 # 1 0 1 0 Partition 1: 3 p Source: Spyros Blanas

  25. 62 RADIX PARTITIONS Recursively repeat until target number of partitions have been created Partition 0 # 0 7 0 7 p # 1 8 0 7 p 0 # 1 9 0 3 hash P (key) p Partition 0: 2 Partition 1 # 0 7 1 8 Partition 1: 2 p # 0 3 1 9 p # 1 1 1 1 p 1 # 1 5 1 5 p Partition 0: 1 # 1 0 1 0 Partition 1: 3 p Source: Spyros Blanas

  26. 63 RADIX PARTITIONS Recursively repeat until target number of partitions have been created # 0 7 0 7 p # 1 8 0 7 p 0 0 # 1 9 0 3 hash P (key) p Partition 0: 2 # 0 7 1 8 Partition 1: 2 p # 0 3 1 9 p # 1 1 1 1 p 1 1 # 1 5 1 5 p Partition 0: 1 # 1 0 1 0 Partition 1: 3 p Source: Spyros Blanas

  27. 64 RADIX PARTITIONS Recursively repeat until target number of partitions have been created # 0 7 0 7 p # 1 8 0 7 p 0 0 # 1 9 0 3 hash P (key) p Partition 0: 2 # 0 7 1 8 Partition 1: 2 p # 0 3 1 9 p # 1 1 1 1 p 1 1 # 1 5 1 5 p Partition 0: 1 # 1 0 1 0 Partition 1: 3 p Source: Spyros Blanas

  28. 65 RADIX PARTITIONS Recursively repeat until target number of partitions have been created # 0 7 0 7 p # 1 8 0 7 p 0 0 # 1 9 0 3 hash P (key) p Partition 0: 2 # 0 7 1 8 Partition 1: 2 p # 0 3 1 9 p # 1 1 1 1 p 1 1 # 1 5 1 5 p Partition 0: 1 # 1 0 1 0 Partition 1: 3 p Source: Spyros Blanas

  29. 66 BUILD PHASE The threads are then to scan either the tuples (or partitions) of R . For each tuple, hash the join key attribute for that tuple and add it to the appropriate bucket in the hash table. → The buckets should only be a few cache lines in size.

  30. 67 HASH TABLE Design Decision #1: Hash Function → How to map a large key space into a smaller domain. → Trade-off between being fast vs. collision rate. Design Decision #2: Hashing Scheme → How to handle key collisions after hashing. → Trade-off between allocating a large hash table vs. additional instructions to find/insert keys.

  31. 68 HASH FUNCTIONS We don’t want to use a cryptographic hash function for our join algorithm. We want something that is fast and will have a low collision rate.

  32. 69 HASH FUNCTIONS MurmurHash (2008) → Designed to a fast, general purpose hash function. Google CityHash (2011) → Based on ideas from MurmurHash2 → Designed to be faster for short keys (<64 bytes). Google FarmHash (2014) → Newer version of CityHash with better collision rates. CLHash (2016) → Fast hashing function based on carry-less multiplication.

  33. 70 HASH FUNCTION BENCHMARKS Intel Core i7-8700K @ 3.70GHz std::hash MurmurHash3 CityHash FarmHash CLHash 18000 Throughput (MB/sec) 12000 6000 0 1 51 101 151 201 251 Key Size (bytes) Source: Fredrik Widlund

  34. 71 HASH FUNCTION BENCHMARKS Intel Core i7-8700K @ 3.70GHz std::hash MurmurHash3 CityHash FarmHash CLHash 18000 64 32 192 Throughput (MB/sec) 128 12000 6000 0 1 51 101 151 201 251 Key Size (bytes) Source: Fredrik Widlund

  35. 72 HASH FUNCTION BENCHMARKS Intel Core i7-8700K @ 3.70GHz std::hash MurmurHash3 CityHash FarmHash CLHash 192 36000 128 Throughput (MB/sec) 24000 64 32 12000 0 1 51 101 151 201 251 Key Size (bytes) Source: Fredrik Widlund

  36. 73 HASHING SCHEMES Approach #1: Chained Hashing Approach #2: Linear Hashing Approach #3: Robin Hood Hashing Approach #4: Cuckoo Hashing

  37. 74 CHAINED HASHING Maintain a linked list of “buckets” for each slot in the hash table. Resolve collisions by placing all elements with the same hash key into the same bucket. → To determine whether an element is present, hash to its bucket and scan for it. → Insertions and deletions are generalizations of lookups.

  38. 75 CHAINED HASHING hash(key) Ø ⋮ ⋮

  39. 76 LINEAR HASHING Single giant table of slots. Resolve collisions by linearly searching for the next free slot in the table. → To determine whether an element is present, hash to a location in the table and scan for it. → Have to store the key in the table to know when to stop scanning. → Insertions are generalizations of lookups.

  40. 77 LINEAR HASHING hash(key) A B C D E F

  41. 78 LINEAR HASHING hash(key) A B | A hash(A) C D E F

  42. 79 LINEAR HASHING hash(key) | B hash(B) A B | A hash(A) C D E F

  43. 80 LINEAR HASHING hash(key) | B hash(B) A B | A hash(A) C D E F

  44. 81 LINEAR HASHING hash(key) | B hash(B) A B | A hash(A) C | C hash(C) D E F

  45. 82 LINEAR HASHING hash(key) | B hash(B) A B | A hash(A) C | C hash(C) D | D E hash(D) F

  46. 83 LINEAR HASHING hash(key) | B hash(B) A B | A hash(A) C | C hash(C) D | D E hash(D) F

  47. 84 LINEAR HASHING hash(key) | B hash(B) A B | A hash(A) C | C hash(C) D | D E hash(D) F | E hash(E)

  48. 85 LINEAR HASHING hash(key) | B hash(B) A B | A hash(A) C | C hash(C) D | D E hash(D) F | E hash(E) | F hash(F)

  49. 86 OBSERVATION To reduce the # of wasteful comparisons during the join, it is important to avoid collisions of hashed keys. This requires a chained hash table with ~2x the number of slots as the # of elements in R .

  50. 87 ROBIN HOOD HASHING Variant of linear hashing that steals slots from "rich" keys and give them to "poor" keys. → Each key tracks the number of positions they are from where its optimal position in the table. → On insert, a key takes the slot of another key if the first key is farther away from its optimal position than the second key. RO ROBIN N HOOD HASHING NG Foundations of Computer Science 1985

  51. 88 ROBIN HOOD HASHING hash(key) A B C D E F

  52. 89 ROBIN HOOD HASHING hash(key) A B | A [0] hash(A) C D E F

  53. 90 ROBIN HOOD HASHING hash(key) A B | A [0] # of "Jumps" From First Position hash(A) C D E F

  54. 91 ROBIN HOOD HASHING hash(key) A B | A [0] hash(A) C D E F

  55. 92 ROBIN HOOD HASHING hash(key) | B [0] hash(B) A B | A [0] hash(A) C D E F

  56. 93 ROBIN HOOD HASHING hash(key) | B [0] hash(B) A B | A [0] hash(A) A[0] == C[0] C D E F

  57. 94 ROBIN HOOD HASHING hash(key) | B [0] hash(B) A B | A [0] hash(A) A[0] == C[0] C | C [1] hash(C) D E F

  58. 95 ROBIN HOOD HASHING hash(key) | B [0] hash(B) A B | A [0] hash(A) C | C [1] C[1] > D[0] hash(C) D E F

  59. 96 ROBIN HOOD HASHING hash(key) | B [0] hash(B) A B | A [0] hash(A) C | C [1] C[1] > D[0] hash(C) D | D [1] E hash(D) F

  60. 97 ROBIN HOOD HASHING hash(key) | B [0] hash(B) A B | A [0] hash(A) C | C [1] hash(C) D | D [1] E hash(D) F

  61. 98 ROBIN HOOD HASHING hash(key) | B [0] hash(B) A B | A [0] hash(A) A[0] == E[0] C | C [1] hash(C) D | D [1] E hash(D) F

  62. 99 ROBIN HOOD HASHING hash(key) | B [0] hash(B) A B | A [0] hash(A) A[0] == E[0] C | C [1] C[1] == E[1] hash(C) D | D [1] E hash(D) F

  63. 100 ROBIN HOOD HASHING hash(key) | B [0] hash(B) A B | A [0] hash(A) A[0] == E[0] C | C [1] C[1] == E[1] hash(C) D | D [1] D[1] < E[2] E hash(D) F

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend