
Hardware-Sensitive Scan Operator Variants for Compiled Selection Pipelines



  1. Hardware-Sensitive Scan Operator Variants for Compiled Selection Pipelines
     David Broneske, Andreas Meister, Gunter Saake
     Databases and Software Engineering (DBSE), University of Magdeburg

  4. Introduction: Query Compilation
     Query plan (top to bottom):
       γ sum(A*B)
         ⋈ lo_orderdate = d_datekey
           σ d_year=1993 (Dates)
           σ lo_discount …, lo_quantity (Lineorder)
     - Bandwidth-bound -> compute-bound
     - Possibility for code optimizations
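To make the idea of a compiled pipeline concrete, here is a minimal sketch of what the Lineorder side of the plan could look like once selection and aggregation are fused into one loop. The function name, the column names, and the bounds `lo`, `hi`, and `qty_max` are hypothetical: the slide elides the actual predicates, and the join with Dates is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fused pipeline: selection on two columns followed by sum(a * b).
// All names and bounds are illustrative assumptions, not taken from the slides.
std::int64_t fused_filter_sum(const std::vector<std::int32_t>& filter_col,
                              const std::vector<std::int32_t>& qty_col,
                              const std::vector<std::int32_t>& a,
                              const std::vector<std::int32_t>& b,
                              std::int32_t lo, std::int32_t hi,
                              std::int32_t qty_max) {
  const std::size_t n = filter_col.size();
  assert(qty_col.size() == n && a.size() == n && b.size() == n);
  std::int64_t sum = 0;
  // One pass over the data: no intermediate result is materialized
  // between the selection and the aggregation.
  for (std::size_t i = 0; i < n; ++i) {
    if (filter_col[i] >= lo && filter_col[i] <= hi && qty_col[i] < qty_max) {
      sum += static_cast<std::int64_t>(a[i]) * b[i];
    }
  }
  return sum;
}
```

Because each tuple is read once and processed completely, the loop body rather than memory traffic tends to become the limiting factor, which is where variant-specific code optimizations come in.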

  9. Motivating Examples
     Branching (List. 1: Branching scan for …)
         for (int i = 0; i < input_size; ++i) {
           if (col[i] < pred)
             agg += agg_col[i];
         }
     Predicated
         for (int i = 0; i < input_size; ++i) {
           agg += agg_col[i] * (col[i] < pred);
         }
     SIMD [ZR02]
         for (int i = 0; i < simd_size; ++i) {
           mask = SIMD_COMP(simd_col[i], pred);
           if (mask) {
             for (int j = 0; j < SIMD_LENGTH; ++j) {
               if ((mask >> j) & 1)
                 agg += agg_col[i * SIMD_LENGTH + j];
             }
           }
         }
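The two scalar listings can be turned into compilable functions with little effort. The following is a sketch under assumptions: the element types (`std::int32_t` columns, a 64-bit accumulator) and the function names are chosen here for illustration and are not given on the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Branching variant: the aggregate is only touched for qualifying rows,
// at the cost of one conditional branch per row.
std::int64_t branching_scan(const std::vector<std::int32_t>& col,
                            const std::vector<std::int32_t>& agg_col,
                            std::int32_t pred) {
  assert(col.size() == agg_col.size());
  std::int64_t agg = 0;
  for (std::size_t i = 0; i < col.size(); ++i) {
    if (col[i] < pred) agg += agg_col[i];
  }
  return agg;
}

// Predicated variant: the comparison result (0 or 1) is multiplied into the
// aggregate, so the loop body contains no data-dependent branch.
std::int64_t predicated_scan(const std::vector<std::int32_t>& col,
                             const std::vector<std::int32_t>& agg_col,
                             std::int32_t pred) {
  assert(col.size() == agg_col.size());
  std::int64_t agg = 0;
  for (std::size_t i = 0; i < col.size(); ++i) {
    agg += static_cast<std::int64_t>(agg_col[i]) * (col[i] < pred);
  }
  return agg;
}
```

Both functions return the same sum for the same input; they differ only in how the hardware executes them, which is what the measurements compare.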

  14. Motivating Examples
      [Figure: response time in ms over selectivity (0 to 1) for Branching Scan, SIMD Scan, Predicated Scan, and Predicated SIMD Scan. Panels: a) Single Predicate; b) Query Q1 (8 aggregates, 1 filter predicate); c) Query Q6 (1 aggregate, 3 filter predicates)]
      When to use which scan variant?

  15. Evaluation Setup
      Evaluation criteria:
      - Number of predicates
      - Number of aggregates inside the loop
      Workload & machine:
      - TPC-H Lineitem table, SF 10
      - Intel Xeon E5-2630 v3 with SSE4.2
      Variants:
      - Branching vs. predication
      - Scalar vs. SIMD

  19. Number of Predicates
      [Figure: time in ms over selectivity of P1 and # of predicates, one panel each for Branching Scan, SIMD Scan, Predicated Scan, and SIMD Predicated Scan]
      Results:
      - For one predicate, SIMD does not pay off
      - The more predicates, the better SIMD performs
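One way to see why additional predicates favor SIMD is that each extra predicate costs one vector comparison and one bitwise AND for several rows at once. The sketch below illustrates this for two predicates with SSE2 intrinsics, using the bitmask loop from the SIMD listing. It is an assumption-laden illustration, not the authors' code: the function name, the 4-lane 32-bit layout, and the scalar fallback for non-SSE2 builds are choices made here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Sums agg_col[i] over rows with col_a[i] < pred_a AND col_b[i] < pred_b.
std::int64_t simd_scan_two_predicates(const std::vector<std::int32_t>& col_a,
                                      const std::vector<std::int32_t>& col_b,
                                      const std::vector<std::int32_t>& agg_col,
                                      std::int32_t pred_a, std::int32_t pred_b) {
  const std::size_t n = col_a.size();
  assert(col_b.size() == n && agg_col.size() == n);
  std::int64_t agg = 0;
  std::size_t i = 0;
#if defined(__SSE2__)
  const __m128i va_pred = _mm_set1_epi32(pred_a);
  const __m128i vb_pred = _mm_set1_epi32(pred_b);
  for (; i + 4 <= n; i += 4) {
    const __m128i a = _mm_loadu_si128(reinterpret_cast<const __m128i*>(&col_a[i]));
    const __m128i b = _mm_loadu_si128(reinterpret_cast<const __m128i*>(&col_b[i]));
    // Each comparison yields all-ones lanes for qualifying rows;
    // the AND combines both predicates for four rows at once.
    const __m128i both = _mm_and_si128(_mm_cmplt_epi32(a, va_pred),
                                       _mm_cmplt_epi32(b, vb_pred));
    // Collapse the four lanes into a 4-bit mask, one bit per row.
    const int mask = _mm_movemask_ps(_mm_castsi128_ps(both));
    if (mask) {
      for (int j = 0; j < 4; ++j) {
        if ((mask >> j) & 1) agg += agg_col[i + j];
      }
    }
  }
#endif
  // Scalar tail, and full fallback when SSE2 is unavailable.
  for (; i < n; ++i) {
    if (col_a[i] < pred_a && col_b[i] < pred_b) agg += agg_col[i];
  }
  return agg;
}
```

With more predicates, the `if (mask)` test is also skipped more often, because fewer groups of four rows contain any qualifying row.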

  22. Work Inside the Loop
      [Figure: time in ms over selectivity of P1 and # of aggregates, one panel each for Branching Scan, SIMD Scan, Predicated Scan, and SIMD Predicated Scan]
      Results:
      - More aggregates, less impact of branch misprediction
      - The more aggregates, the better branching scans perform for low selectivity
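The effect of work inside the loop can be illustrated by extending the scalar variants to several aggregates. This is a sketch with assumed names and types, where each aggregate is a plain sum over its own column: the branching loop updates the aggregates only for qualifying rows, while the predicated loop performs every update for every row.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Column = std::vector<std::int32_t>;

// Branching: the inner aggregate loop runs only for qualifying rows,
// so its cost shrinks with the fraction of rows that pass the filter.
std::vector<std::int64_t> branching_multi_agg(const Column& col,
                                              const std::vector<Column>& agg_cols,
                                              std::int32_t pred) {
  std::vector<std::int64_t> aggs(agg_cols.size(), 0);
  for (std::size_t i = 0; i < col.size(); ++i) {
    if (col[i] < pred) {
      for (std::size_t k = 0; k < agg_cols.size(); ++k) {
        assert(agg_cols[k].size() == col.size());
        aggs[k] += agg_cols[k][i];
      }
    }
  }
  return aggs;
}

// Predicated: no data-dependent branch, but every aggregate is updated
// for every row, regardless of how many rows qualify.
std::vector<std::int64_t> predicated_multi_agg(const Column& col,
                                               const std::vector<Column>& agg_cols,
                                               std::int32_t pred) {
  std::vector<std::int64_t> aggs(agg_cols.size(), 0);
  for (std::size_t i = 0; i < col.size(); ++i) {
    const std::int64_t keep = col[i] < pred;  // 1 if the row qualifies, else 0
    for (std::size_t k = 0; k < agg_cols.size(); ++k) {
      assert(agg_cols[k].size() == col.size());
      aggs[k] += agg_cols[k][i] * keep;
    }
  }
  return aggs;
}
```

With many aggregates and few qualifying rows, the predicated loop does mostly wasted work, which matches the observation that branching scans gain ground as the number of aggregates grows.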

  23. Decision Trees
      Number of Predicates:
      - selectivity < 0.05
        - #predicates < 4: Branching Scan
        - #predicates >= 4: SIMD Branching
      - selectivity >= 0.05
        - #predicates < 2: Predicated Scan
        - #predicates >= 2: SIMD Predicated
      Number of Aggregates:
      - Splits: selectivity (< 0.1, >= 0.1), #aggregates (< 6, >= 6), selectivity (< 0.05, >= 0.05)
      - Leaves: SIMD Predicated, SIMD Branching, SIMD Branching, SIMD Predicated
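A decision tree of this kind maps directly onto a small selection function. The sketch below encodes only the "Number of Predicates" tree, and it assumes that the splits and leaves in the flattened slide text are read left to right; the enum and function names are hypothetical. The "Number of Aggregates" tree is not encoded because its layout cannot be read unambiguously from the slide text.

```cpp
#include <cassert>

// Names are illustrative; the slide labels the leaves "Branching Scan",
// "SIMD Branching", "Predicated Scan", and "SIMD Predicated".
enum class ScanVariant {
  BranchingScan,
  SimdBranching,
  PredicatedScan,
  SimdPredicated
};

// Assumed left-to-right reading of the "Number of Predicates" tree.
// Thresholds (0.05, 4, 2) are the ones shown on the slide.
ScanVariant choose_by_predicates(double selectivity, int num_predicates) {
  assert(selectivity >= 0.0 && selectivity <= 1.0 && num_predicates >= 1);
  if (selectivity < 0.05) {
    return num_predicates < 4 ? ScanVariant::BranchingScan
                              : ScanVariant::SimdBranching;
  }
  return num_predicates < 2 ? ScanVariant::PredicatedScan
                            : ScanVariant::SimdPredicated;
}
```

A query compiler could call such a function with an estimated selectivity when generating the scan loop. The thresholds come from one machine and workload, so they would need recalibration elsewhere.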

  24. Conclusion
      - Increasing number of aggregates slows down predicated variants
      - SIMD outperforms scalar variants for several predicates
      - Pipeline code for filter-&-aggregate pipelines [1]
      - Decision trees as a result of our evaluation in the paper
      Future Work
      - Hash table put / probe (joins, groupings)
      - Automatic calibration for query compilation
      [1] http://git.iti.cs.ovgu.de/dbronesk/BTW-Pipeline-Variants
