Toward GPU Accelerated Data Stream Processing


  1. Toward GPU Accelerated Data Stream Processing Marcus Pinnecke, David Broneske and Gunter Saake University of Magdeburg, Germany May 27, 2015

  2. Background and Motivation — Fundamentals, Windowing, GPU Acceleration in DBMS/SPS

  3. Data Stream Processing — Application Requirements
  Examples
  ■ System monitoring and fraud prevention — log files about load, network activity, storage
  ■ Social media — identify topics of interest online, such as the top-k hashtags on Twitter
  ■ …
  Requirements
  ■ Real-time response
  ■ Continuous processing and analysis
  ■ High-volume data, potentially infinite
  ■ High-velocity data (many changes)
  Toward GPU Accelerated Data Stream Processing 1

  4. Data Stream Processing — Processing Model and Windowing
  Infinite streams of data, but…
  ■ Limited main memory and
  ■ Only sequential access
  Solutions
  ■ Reduction of the data amount (e.g., sampling) or
  ■ Buffering (windowing)

  5. Data Stream Processing — Processing Model and Windowing
  [Figure: windowing turns an infinite stream of events into a stream of finite windows]
  Windows are either time-based or count-based. Time-based windows are:
  ■ More common in real applications
  ■ Variable in the number of events per window
  ■ Problematic due to limited GPU memory
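The two window types can be sketched as simple generators. The `Event` type and all names below are our own illustration, not from the paper; note how the time-based variant yields windows of varying length, which is exactly the property that clashes with fixed-size GPU buffers:

```python
from dataclasses import dataclass

@dataclass
class Event:
    ts: float   # event timestamp (seconds)
    value: int

def count_based_windows(events, k):
    """Group a stream into tumbling windows of exactly k events each."""
    window = []
    for e in events:
        window.append(e)
        if len(window) == k:
            yield window
            window = []

def time_based_windows(events, width):
    """Group events into tumbling windows spanning `width` time units.
    The number of events per window varies with the event rate."""
    window, end = [], None
    for e in events:
        if end is None:
            end = e.ts + width
        while e.ts >= end:          # emit (possibly empty) windows until e fits
            yield window
            window, end = [], end + width
        window.append(e)
    if window:
        yield window

events = [Event(0.1, 1), Event(0.4, 2), Event(0.5, 3), Event(1.2, 4), Event(2.7, 5)]
print([len(w) for w in count_based_windows(events, 2)])   # [2, 2]
print([len(w) for w in time_based_windows(events, 1.0)])  # [3, 1, 1]
```

A burst of events in one instant grows a time-based window arbitrarily, while the count-based window size stays constant.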

  6.–7. Data Stream Processing — Bottleneck: Example Join Algorithm
  ■ The number of join candidates depends on the number of events inside the window
  ■ Many events arrive in the same instant for time-based windows
  ■ Decrease of throughput
  [Figure: join operator ⨝ over two windows]
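A minimal nested-loop window join makes the problem concrete; the function and its names are our own sketch, not the paper's algorithm. The candidate count is |R| × |S|, so a burst of same-instant events in a time-based window inflates the work quadratically:

```python
def window_join(r_window, s_window, key=lambda x: x):
    """Naive nested-loop join of two windows.
    Returns the matches and the number of candidate pairs inspected."""
    candidates = 0
    matches = []
    for r in r_window:
        for s in s_window:
            candidates += 1          # every pair is a join candidate
            if key(r) == key(s):
                matches.append((r, s))
    return matches, candidates

matches, candidates = window_join([1, 2, 3], [2, 3, 4])
print(matches, candidates)   # [(2, 2), (3, 3)] 9
```

Doubling the events per window quadruples the candidates, which is why the join becomes the slowest component under load.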

  8. Data Stream Processing — Bottleneck: Back Pressure
  Data flow systems (e.g., stream processing systems) suffer from back pressure.
  Back pressure
  ■ An upwards-propagated decrease of throughput
  ■ Down to the level of the slowest component
  The result is a need for load shedding.

  9.–14. Data Stream Processing — Bottleneck
  [Figure: animation of an operator pipeline (σ … ⨝ … σ); the throughput of the whole pipeline drops to that of the slowest component]
  Toward GPU Accelerated Data Stream Processing 6

  15. Data Stream Processing — Bottleneck: Solutions
  ■ Parallelization of operators
  ■ Distributed computation
  Both provide more computation resources.
  [Figure: an operator replicated for parallel execution, and a pipeline distributed across Site 1 and Site 2]

  16. A GPU instead of a CPU — as done in DBMS?
  [Figure: the distributed pipeline from the previous slide, with a GPU replacing a CPU]

  17. Database Management Systems — GPUs in DBMS
  ■ Efficient co-processor
  ■ Might outperform CPUs for certain operations
  ■ Computations are highly parallel (SIMD)
  ■ Huge corpus of research results
  Some conclusions
  ■ Data transfer costs to and from the graphics card are critical
  ■ Operations should match the GPU architecture (e.g., branch-free)
  ■ Operations must be expensive enough to amortize the transfer costs
  ■ Column-oriented architectures save transfer costs
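The amortization argument can be made concrete with a back-of-envelope cost model. All numbers below (PCIe bandwidth, per-byte costs, fixed launch overhead) are illustrative assumptions of ours, not measurements from the paper:

```python
def offload_pays_off(n_bytes, cpu_ns_per_byte, gpu_ns_per_byte,
                     pcie_bandwidth_gbps=8.0, fixed_overhead_us=10.0):
    """Rough break-even check for GPU offloading (illustrative numbers only).
    Offloading pays off when CPU time exceeds GPU time plus the cost of
    shipping the data to the card and the result back."""
    # 1 GB/s transfers roughly 1 byte per nanosecond
    transfer_ns = n_bytes / pcie_bandwidth_gbps + fixed_overhead_us * 1000
    cpu_ns = n_bytes * cpu_ns_per_byte
    gpu_ns = n_bytes * gpu_ns_per_byte
    return cpu_ns > 2 * transfer_ns + gpu_ns   # transfer to AND from the card

# A cheap selection over 1 KB: the transfer dominates, stay on the CPU.
print(offload_pays_off(1_000, cpu_ns_per_byte=1.0, gpu_ns_per_byte=0.05))          # False
# An expensive operation over 100 MB: worth shipping to the GPU.
print(offload_pays_off(100_000_000, cpu_ns_per_byte=1.0, gpu_ns_per_byte=0.05))    # True
```

The same model also motivates the column-store conclusion: shipping only the columns an operator touches shrinks `n_bytes` and moves the break-even point.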

  18. GPU Acceleration for Data Stream Processing — Challenges
  ■ Limited memory on graphics cards vs. (time-based) windows that can be huge
  ■ The event representation (tuples) does not match the GPU architecture

  19. GPU-ready Stream Processing
  Our 1st contribution: handle the graphics card memory limitation for very large windows via bucketing.

  20. GPU-ready Stream Processing — Bucketing
  We suggest portioning each stream of variable-length windows of tuples into a stream of "buckets".
  Bucket: a fixed-size window portion with a column-oriented event representation.
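A minimal sketch of the idea, assuming events are plain tuples (the function name and representation are ours): a window longer than the bucket size k is split into portions, and each portion is flipped into column-oriented form.

```python
def bucketize(window, k):
    """Split one variable-length, row-oriented window into fixed-size,
    column-oriented buckets of at most k events (the last may be smaller).
    Each event is a tuple; a bucket is one list per attribute."""
    for i in range(0, len(window), k):
        chunk = window[i:i + k]
        yield [list(col) for col in zip(*chunk)]   # flip rows -> columns

window = [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e')]
for bucket in bucketize(window, 3):
    print(bucket)
# [[1, 2, 3], ['a', 'b', 'c']]
# [[4, 5], ['d', 'e']]
```

However large the window, the GPU-side allocation per bucket is bounded by k, and each attribute arrives as a contiguous column.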

  21.–28. GPU-ready Stream Processing — Bucketing (2)
  [Figure: animation of two bucketing operators consuming the same stream of windows; one emits buckets of size 3, the other buckets of size 5; each bucket stores its events column-oriented and is handed to the downstream operator bucket-at-a-time]

  29. GPU-ready Stream Processing — Benefits through Bucketing
  ■ Each operator requests its own bucket size k
  ■ The bucket size is independent of the actual window length
  ■ Memory allocation on the graphics card has an upper bound
  ■ Bucketing flips the event representation, enabling processing of entire columns
  ■ If the window length exceeds the bucket size, the window is split into portions
  ■ A single bucketing operator can be subscribed to by many operators

  30. GPU-ready Stream Processing — Buckets versus Windows

                   Windowing                      Bucketing
  Purpose          Bounding the infinite stream   Portioning windows
  Consumes         Stream of events               Stream of windows
  Produces         Stream of windows              Stream of buckets
  #Events          Might be huge                  Has an upper bound
  Representation   Tuples                         Column-wise

  31.–33. GPU-ready Stream Processing — Achieving Bucketing
  [Figure: animation of the bucketing data structure: the stream schema (a b c) maps each attribute to its own ring buffer (Ring Buffer 1 … Ring Buffer n); slice subscribers 1–3 each hold a view with a length and an actual position into these buffers]
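One way to realize this structure, as a rough sketch (class and method names are ours, not from the slides): one ring buffer per attribute of the stream schema, with subscribers reading contiguous slices of a single column.

```python
class ColumnStore:
    """One fixed-capacity ring buffer per attribute of the stream schema.
    Incoming tuples are decomposed column-wise; subscribers read
    slices of a single column to assemble their buckets."""

    def __init__(self, schema, capacity):
        self.schema = schema
        self.capacity = capacity
        self.buffers = {attr: [None] * capacity for attr in schema}
        self.head = 0       # next write position
        self.length = 0     # number of valid events (<= capacity)

    def append(self, event):
        """Decompose one tuple into the per-attribute ring buffers."""
        for attr, value in zip(self.schema, event):
            self.buffers[attr][self.head] = value
        self.head = (self.head + 1) % self.capacity
        self.length = min(self.length + 1, self.capacity)

    def slice(self, attr, n):
        """Return the n most recent values of one attribute, oldest first."""
        n = min(n, self.length)
        start = (self.head - n) % self.capacity
        buf = self.buffers[attr]
        return [buf[(start + i) % self.capacity] for i in range(n)]

store = ColumnStore(schema=('a', 'b', 'c'), capacity=4)
for event in [(1, 'x', 0.1), (2, 'y', 0.2), (3, 'z', 0.3)]:
    store.append(event)
print(store.slice('a', 2))   # [2, 3]
```

Because each attribute lives in its own buffer, a slice subscriber can hand a single column to the GPU without first re-packing tuples, which matches the column-oriented bucket representation above.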
