IN SUPPORT OF WORKLOAD-AWARE STREAMING STATE MANAGEMENT

Vasiliki Kalavri (vkalavri@bu.edu), John Liagouris (liagos@bu.edu). HotStorage 2020, 14 July 2020.


  1. IN SUPPORT OF WORKLOAD-AWARE STREAMING STATE MANAGEMENT. Vasiliki Kalavri (vkalavri@bu.edu), John Liagouris (liagos@bu.edu). HotStorage 2020, 14 July 2020.

  2. STREAMING DATAFLOWS
     Nexmark Q4: “Rolling average of winning bids”
     [Figure: logical dataflow (auctions source, bids source, join, rolling average, sink) and physical dataflow across Worker 1 and Worker 2]
     Nexmark Streaming Benchmark Suite: https://beam.apache.org/documentation/sdks/java/testing/nexmark/
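
The Q4 pipeline named on this slide can be approximated in a few lines. This is a minimal single-process sketch, not the benchmark's reference implementation: the class name, the `on_auction`/`on_bid`/`on_close` callbacks, and the rule that an auction's winning bid is its highest bid at close time are all assumptions made for illustration.

```python
class RollingAverageOfWinningBids:
    """Toy join + rolling average; all names here are hypothetical."""

    def __init__(self):
        self.open_auctions = set()  # join state: auctions seen and not yet closed
        self.best_bid = {}          # join state: auction id -> highest bid so far
        self.total = 0.0            # average state: sum of winning bids
        self.count = 0              # average state: number of winning bids

    def on_auction(self, auction_id):
        self.open_auctions.add(auction_id)

    def on_bid(self, auction_id, price):
        # Bids for unknown or closed auctions are dropped in this sketch.
        if auction_id in self.open_auctions:
            current = self.best_bid.get(auction_id, price)
            self.best_bid[auction_id] = max(current, price)

    def on_close(self, auction_id):
        """Close an auction and return the updated average (None if no bids)."""
        self.open_auctions.discard(auction_id)
        winning = self.best_bid.pop(auction_id, None)
        if winning is None:
            return None
        self.total += winning
        self.count += 1
        return self.total / self.count
```

The join keeps one entry per open auction, while the average keeps only two numbers, which is the asymmetry the later slides build on.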

  3. LARGER-THAN-MEMORY STATE MANAGEMENT
     Large operator state is backed by key-value stores.
     [Figure: Worker 1 and Worker 2 each put/get <k,v> pairs to a key-value store]

  4. LARGER-THAN-MEMORY STATE MANAGEMENT
     Large operator state is backed by key-value stores: an LSM-based, write-optimized store with efficient range scans.
     [Figure: Worker 1 and Worker 2 each put/get <k,v> pairs to a key-value store]
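
To make "LSM-based, write-optimized, efficient range scans" concrete, here is a toy log-structured store. It is a sketch under simplifying assumptions (in-memory runs, no compaction, no deletes), and `TinyLSM` and its methods are hypothetical names, not the API of RocksDB or any other store.

```python
import bisect


class TinyLSM:
    """Toy LSM store: a mutable memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.runs = []  # sorted lists of (key, value); newest run is last
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # Writes only touch the memtable, which is what makes them cheap.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):  # newest run wins
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def scan(self, lo, hi):
        """Return sorted (key, value) pairs with lo <= key < hi."""
        merged = {}
        for run in self.runs:  # oldest first, so newer values overwrite
            i = bisect.bisect_left(run, (lo,))
            while i < len(run) and run[i][0] < hi:
                merged[run[i][0]] = run[i][1]
                i += 1
        for key, value in self.memtable.items():
            if lo <= key < hi:
                merged[key] = value
        return sorted(merged.items())
```

Because every run is sorted, a range scan is a binary search followed by a sequential walk per run; a point lookup, by contrast, may have to probe several runs.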

  5. STATE REQUIREMENTS VARY ACROSS OPERATORS
     Nexmark Q4: “Rolling average of winning bids”
     - Join: Write-heavy and can potentially accumulate large state
     - Average: Read-Modify-Write a single value
     Dataflow operators may have different state access patterns and memory requirements.
     [Figure: logical dataflow (auctions source, bids source, join, rolling average, sink)]

  6. CURRENT PRACTICE: MONOLITHIC STATE MANAGEMENT
     - One key-value store (RocksDB) per stateful operator instance
     - All key-value stores in the dataflow are globally configured
     [Figure: <k,v> stores on Worker 1 and Worker 2]

  7. FLAWS OF MONOLITHIC STATE MANAGEMENT
     - Oblivious store configuration
     - Unnecessary data marshaling
     - Unnecessary key-value store features
     [Figure: <k,v> stores on Worker 1 and Worker 2]

  8. UNNECESSARY KEY-VALUE STORE FEATURES
     - State partitioning
     - State scoping
     - Concurrent access to state
     - State checkpointing
     All these operations are handled by modern stream processors outside the state store. Stream processors guarantee single-thread access to state.

  9. WORKLOAD-AWARE STREAMING STATE MANAGEMENT
     Multiple state stores of different types and configurations according to the requirements of the stateful operators.
     Streaming operators are instantiated once and are long-running: their access patterns and state sizes are largely known in advance.
     [Figure: store:<u64,auction> and store:<u64,bid> accessed with put/get, store:u64 accessed with rmw_u64, on Worker 1 and Worker 2]
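
One way to picture per-operator store selection is a small planning table. The sketch below is hypothetical throughout: `StoreSpec`, `choose_backend`, and the store names are invented for illustration, and the rule "read-modify-write state goes to the in-place-update store, everything else to the LSM store" is an assumed rule of thumb, not a policy stated in the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StoreSpec:
    """Hypothetical description of one operator's state store."""
    key_type: Optional[str]  # None for a single-value store
    value_type: str
    access_pattern: str      # "put/get" or "rmw"


def choose_backend(spec):
    # Assumed rule of thumb, for illustration only.
    return "FASTER" if spec.access_pattern == "rmw" else "RocksDB"


# The three stores drawn on the slide for Nexmark Q4.
Q4_STORES = {
    "join/auctions": StoreSpec("u64", "auction", "put/get"),
    "join/bids": StoreSpec("u64", "bid", "put/get"),
    "average": StoreSpec(None, "u64", "rmw"),
}

PLAN = {name: choose_backend(spec) for name, spec in Q4_STORES.items()}
```

Since operators are long-running and their access patterns are known when the dataflow is built, a table like this could be filled in once at deployment time.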

  10. A FLEXIBLE TESTBED FOR STREAMING STATE MANAGEMENT
      - Implemented in Rust
      - Based on Timely Dataflow stream processor
      - Supports two key-value stores
        - RocksDB: LSM-based with efficient range scans
        - FASTER: Hybrid log with efficient lookups and in-place updates
      - Supports different window evaluation strategies
      Testbed: https://github.com/jliagouris/wassm
      Timely Dataflow: https://github.com/TimelyDataflow/timely-dataflow
      FASTER: https://github.com/microsoft/FASTER

  11. EXPERIMENTAL RESULTS

  12. EVALUATION GOALS
      1. Study the effect of the backend’s data layout on the evaluation of streaming windows
      2. Study the effect of workload-aware configuration on queries with multiple stateful operators

  13. EFFECT OF DATA LAYOUT ON WINDOW EVALUATION
      - Query 1: Count the number of records in a 30s window that slides every 1s
      - Input rate: 10K records/s
      - Single thread execution
      - Report end-to-end latency (ms) per record
      [Plot: COUNT-30s-1s; axis labels not recoverable]
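
A 30s window that slides every 1s means each record falls into 30 overlapping windows. The helper below is a sketch of that assignment; `windows_for` is a hypothetical name, windows are assumed to be half-open intervals `[start, start + size)` aligned to multiples of the slide, and window starts before time zero are not filtered out.

```python
def windows_for(timestamp_ms, size_ms=30_000, slide_ms=1_000):
    """Return the start times of all windows [start, start + size) that
    contain timestamp_ms, assuming starts are multiples of slide_ms."""
    last_start = timestamp_ms - (timestamp_ms % slide_ms)
    first_start = last_start - size_ms + slide_ms
    return list(range(first_start, last_start + 1, slide_ms))


# A record at t = 45.5s belongs to the windows starting at 16s, 17s, ..., 45s.
starts = windows_for(45_500)
```

If every record is applied to every window that contains it (an assumption; some systems share partial aggregates across windows instead), an input rate of 10K records/s translates into 10,000 × 30 = 300,000 window-state updates per second.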

  14. EFFECT OF DATA LAYOUT ON WINDOW EVALUATION
      Complementary CDF: each point (x, y) indicates that y% of the latency measurements are at least x ms. Lower is better.
      [Plot: COUNT-30s-1s, with p90, p99, p99.9, … marked; axis labels not recoverable]
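
The complementary CDF defined on this slide can be computed directly from raw latency samples. This is a small sketch with a hypothetical function name; it returns one point per distinct latency value.

```python
def ccdf(latencies_ms):
    """Return [(x, y)] where y is the percentage of samples that are >= x."""
    xs = sorted(latencies_ms)
    n = len(xs)
    points = []
    for i, x in enumerate(xs):
        if i > 0 and x == xs[i - 1]:
            continue  # keep only the first occurrence of each distinct value
        # In sorted order, exactly n - i samples are >= xs[i] at this index.
        points.append((x, 100.0 * (n - i) / n))
    return points


points = ccdf([1, 2, 2, 10])  # [(1, 100.0), (2, 75.0), (10, 25.0)]
```

Read this way, the p99 latency sits roughly where the curve crosses y = 1%, and p99.9 where it crosses y = 0.1%, which is why tail behavior is easy to compare on such a plot.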

  15. EFFECT OF DATA LAYOUT ON WINDOW EVALUATION
      RocksDB PUT/GET: On record, retrieve window contents, apply new record, and put the updated contents back to the store. Lower is better.
      [Plot: COUNT-30s-1s; axis labels not recoverable]
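
The PUT/GET strategy can be sketched against a stand-in store. Everything here is an assumption for illustration: `DictStore` is a plain dictionary that holds serialized bytes (it is not RocksDB), the window contents are reduced to a running count, and `pickle` stands in for whatever marshaling a real backend requires.

```python
import pickle


class DictStore:
    """Stand-in key-value store holding serialized bytes; counts accesses."""

    def __init__(self):
        self.data = {}
        self.gets = 0
        self.puts = 0

    def get(self, key):
        self.gets += 1
        return self.data.get(key)

    def put(self, key, value):
        self.puts += 1
        self.data[key] = value


def on_record_put_get(store, window_id, record):
    """Eager evaluation: read the window state, apply the record, write back."""
    raw = store.get(window_id)
    count = pickle.loads(raw) if raw is not None else 0
    store.put(window_id, pickle.dumps(count + 1))


def on_trigger(store, window_id):
    """The state is already up to date, so the trigger is a single read."""
    raw = store.get(window_id)
    return pickle.loads(raw) if raw is not None else 0
```

Every record costs one read, one deserialization, one serialization, and one write per window it belongs to, so the per-record path carries all of the work.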

  16. EFFECT OF DATA LAYOUT ON WINDOW EVALUATION
      RocksDB MERGE: On record, put record to the store using MERGE. The record is applied to the window contents lazily on trigger. Lower is better.
      [Plot: COUNT-30s-1s; axis labels not recoverable]
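
The lazy MERGE strategy can be imitated with a store that queues operands and folds them only when the value is read. This is a toy model of the idea, not the RocksDB merge-operator API: `MergeStore`, its constructor arguments, and the in-memory operand lists are all assumptions.

```python
class MergeStore:
    """Toy store that defers applying merge operands until a read."""

    def __init__(self, merge_fn, initial):
        self.base = {}      # fully merged values
        self.pending = {}   # key -> operands not yet applied
        self.merge_fn = merge_fn
        self.initial = initial

    def merge(self, key, operand):
        # Per-record path: append only, no read and no evaluation.
        self.pending.setdefault(key, []).append(operand)

    def get(self, key):
        # Trigger path: fold all queued operands into the stored value.
        operands = self.pending.pop(key, [])
        value = self.base.get(key, self.initial)
        for operand in operands:
            value = self.merge_fn(value, operand)
        if operands:
            self.base[key] = value
        return value


# COUNT as a merge function: each record increments the accumulator.
counts = MergeStore(lambda acc, record: acc + 1, 0)
```

Compared with the put/get approach, the per-record work shrinks to an append, and the cost of applying records moves to the moment the window fires.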

  17. EFFECT OF DATA LAYOUT ON WINDOW EVALUATION
      FASTER performs better due to in-place updates. Lower is better.
      [Plot: COUNT-30s-1s, annotated "100X in p99"; axis labels not recoverable]

  18. EFFECT OF DATA LAYOUT ON WINDOW EVALUATION
      - Query 2: Rank records in a 30s tumbling window
      - Input rate: 1K records/s
      - Single thread execution
      - Report end-to-end latency (ms) per record
      Lower is better.
      [Plot: RANK-30s-30s; axis labels not recoverable]
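
Ranking in a tumbling window needs the full window contents before any result can be produced. The sketch below assumes that "rank" means the 1-based position of each value in ascending order within its window; the slide does not define it, and both function names are hypothetical.

```python
from collections import defaultdict


def tumbling_window_start(timestamp_ms, size_ms=30_000):
    """Each timestamp belongs to exactly one tumbling window."""
    return timestamp_ms - timestamp_ms % size_ms


def rank_windows(records, size_ms=30_000):
    """records: iterable of (timestamp_ms, value).
    Returns {window_start: [(rank, value), ...]} with ranks starting at 1."""
    buffers = defaultdict(list)
    for timestamp_ms, value in records:
        buffers[tumbling_window_start(timestamp_ms, size_ms)].append(value)
    return {
        start: list(enumerate(sorted(values), start=1))
        for start, values in buffers.items()
    }
```

Unlike a count, the per-record step here can only buffer; the sorting has to wait until the window closes.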

  19. EFFECT OF DATA LAYOUT ON WINDOW EVALUATION
      RocksDB MERGE performs best due to lazy evaluation. Lower is better.
      [Plot: RANK-30s-30s, annotated "100X" and "1000X"; axis labels not recoverable]

  20. THERE IS NO CLEAR WINNER
      [Plots: COUNT-30s-1s, annotated "100X in p99", and RANK-30s-30s, annotated "100X" and "1000X"; axis labels not recoverable]

  21. MONOLITHIC VS WORKLOAD-AWARE STATE MANAGEMENT
      - Experiments with six Nexmark queries
      - Different stateful operators (joins, window aggregations, custom aggregations)
      - Simple workload-aware configuration of data types and available memory size
      Nexmark Streaming Benchmark Suite: https://beam.apache.org/documentation/sdks/java/testing/nexmark/

  22. MONOLITHIC VS WORKLOAD-AWARE STATE MANAGEMENT
      - State store used: FASTER
      - Input rate: 10K records/s
      - Single thread execution
      - Monolithic memory configuration: 8GB
      - Workload-aware memory configuration: 6GB (bids), 1.5GB (auctions), 512MB (average)
      - Report end-to-end latency (ms) per record
      [Plot: Q4 custom join and rolling aggregate; axis labels not recoverable]
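
A quick arithmetic check of the two memory configurations, assuming binary units (1GB = 1024MB) and reading the 8GB monolithic figure as the total budget; neither assumption is stated on the slide, and the names below are hypothetical.

```python
MB_PER_GB = 1024  # assumption: binary units

MONOLITHIC_MB = 8 * MB_PER_GB

WORKLOAD_AWARE_MB = {
    "bids": 6 * MB_PER_GB,           # 6GB
    "auctions": 3 * MB_PER_GB // 2,  # 1.5GB
    "average": 512,                  # 512MB
}


def total_mb(budget_mb):
    """Sum a per-store memory budget expressed in MB."""
    return sum(budget_mb.values())


def shares(budget_mb):
    """Fraction of the total budget assigned to each store."""
    total = total_mb(budget_mb)
    return {name: mb / total for name, mb in budget_mb.items()}
```

Under those assumptions the workload-aware split adds up to 6144 + 1536 + 512 = 8192MB, the same total as the monolithic setting, so the comparison changes how memory is divided rather than how much is available.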

