S-Store: Streaming Meets Transaction Processing
By Meehan et al.
CS590-BDS Thamir Qadah
Some slides contain material from the original authors’ slides. Project Website: http://sstore.cs.brown.edu/
What is S-Store?
○ A data processing system that combines stream processing and transaction processing.
○ Extends H-Store to support streaming semantics.
○ Traditional stream processing systems: no or limited support for transactional guarantees
○ Traditional OLTP systems: no support for data-driven processing
Data Ingestion for the Connected World John Meehan, Cansu Aslantas, Jiang Du, Nesime Tatbul, Stan Zdonik CIDR 2017, Jan 2017
provides the most benefit to the client.
[Figure: order-processing dataflow. Labels: FIX Message; Check and Debit Order Amount; Trading Venue Selection (Exchange A, Exchange B); Update Order; Buying Power; Customer Orders; OLTP Transactions. Callouts added across the builds: Isolation Needed, Ordering Needed.]
○ ACID guarantees for OLTP and streaming
○ Ordered execution guarantees
■ Executions follow the dataflow graph for streaming transactions
○ Exactly-once processing guarantees for streams
■ No loss or duplication
○ Public tables
○ Windows
○ Streams
○ OLTP transactions: can only access public tables
○ Streaming transactions: can access all kinds of state
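The access rule above can be written down as a small check. This is only an illustrative sketch: `StateKind`, `TxnKind`, and `can_access` are hypothetical names, not part of S-Store's API, and the rule is exactly the one stated on the slide.

```python
from enum import Enum


class StateKind(Enum):
    PUBLIC_TABLE = "public table"
    WINDOW = "window"
    STREAM = "stream"


class TxnKind(Enum):
    OLTP = "oltp"
    STREAMING = "streaming"


def can_access(txn: TxnKind, state: StateKind) -> bool:
    """Return True if a transaction of kind `txn` may touch state of kind `state`."""
    if txn is TxnKind.STREAMING:
        # Streaming transactions can access all kinds of state.
        return True
    # OLTP transactions can only access public tables.
    return state is StateKind.PUBLIC_TABLE
```

For example, `can_access(TxnKind.OLTP, StateKind.WINDOW)` is `False`, while any call with `TxnKind.STREAMING` is `True`.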
simultaneity and ordering
tuples.
○ External to a streaming transaction
○ Internal to a streaming transaction
○ Have a slide parameter (sliding window)
○ If slide == window size, the window is a tumbling window
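To make the slide parameter concrete, here is a minimal sketch of tuple-based windows over an in-memory list. The function name `windows` and the choice to count in tuples, rather than in batches or time, are assumptions made for illustration and are not taken from S-Store.

```python
from typing import Iterator, List, Sequence


def windows(tuples: Sequence[int], size: int, slide: int) -> Iterator[List[int]]:
    """Yield every full window of `size` tuples, advancing `slide` tuples each time."""
    if size <= 0 or slide <= 0:
        raise ValueError("size and slide must be positive")
    for start in range(0, len(tuples) - size + 1, slide):
        yield list(tuples[start:start + size])


stream = [1, 2, 3, 4, 5, 6]

# slide < size: consecutive windows overlap (sliding window)
sliding = list(windows(stream, size=4, slide=2))

# slide == size: consecutive windows do not overlap (tumbling window)
tumbling = list(windows(stream, size=3, slide=3))
```

Here `sliding` is `[[1, 2, 3, 4], [3, 4, 5, 6]]` and `tumbling` is `[[1, 2, 3], [4, 5, 6]]`.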
representing streaming transactions and edges represent the flow of data among nodes.
[Figure: dataflow graph definition, execution, and state.
Definition: T1(s1,w1), T2(s1), labeled Border Transaction and Interior Transaction; streams s1 (batches s1.b1, s1.b2, …), s2 (batches s2.b1, s2.b2, …), s3.
Execution (transaction executions): T1,1(s1.b1,w1), T1,2(s1.b2,w1), T2,1(s2.b1), T2,2(s2.b2).
State: Stream s1, Window w1, Stream s2, Table for s3.]
○ DAG order constraint
○ Stream order constraint
in the schedule.
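The two constraints can be illustrated with a small schedule checker. This sketch assumes the usual reading of the terms: the stream order constraint means a transaction processes its batches in batch order, and the DAG order constraint means that, for a given batch, an upstream transaction runs before its downstream one. The names `is_valid_schedule` and `upstream`, and batch numbering starting at 1, are hypothetical.

```python
from typing import Dict, List, Set, Tuple

TE = Tuple[str, int]  # transaction execution: (transaction name, batch number)


def is_valid_schedule(schedule: List[TE], upstream: Dict[str, Set[str]]) -> bool:
    """Check a serial schedule against both ordering constraints."""
    position = {te: i for i, te in enumerate(schedule)}
    missing = len(schedule)  # a predecessor absent from the schedule counts as a violation
    for (txn, batch), pos in position.items():
        # Stream order: the same transaction must have run the previous batch earlier.
        if batch > 1 and position.get((txn, batch - 1), missing) > pos:
            return False
        # DAG order: every upstream transaction must have run this batch earlier.
        for parent in upstream.get(txn, set()):
            if position.get((parent, batch), missing) > pos:
                return False
    return True


upstream = {"T2": {"T1"}}  # T1 feeds T2

serial = [("T1", 1), ("T2", 1), ("T1", 2), ("T2", 2)]
interleaved = [("T1", 1), ("T1", 2), ("T2", 1), ("T2", 2)]
breaks_dag_order = [("T2", 1), ("T1", 1)]
breaks_stream_order = [("T1", 2), ("T1", 1)]
```

Both `serial` and `interleaved` pass the check, which shows that the constraints permit more than one valid schedule; the last two schedules each violate one constraint.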
○ Uses a command log for committed transactions
○ Replays commands to restore state
○ Limitation: cannot guarantee the same results if non-determinism exists in the transaction logic
○ Performs command logging for border transactions only.
○ Assumes the ability to replay input data streams.
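The difference between logging every committed transaction and logging border transactions only can be sketched with a toy two-step dataflow. Everything here is hypothetical and simplified: the `Engine` class, the doubling and summing logic, and the in-memory list standing in for a durable command log are assumptions for illustration, and the transactions are deterministic, so replay reproduces the same state.

```python
from typing import List, Tuple

LogEntry = Tuple[str, List[int]]  # (transaction name, input batch)


class Engine:
    """Toy dataflow: a border transaction feeds an interior transaction."""

    def __init__(self, log_interior: bool) -> None:
        self.log_interior = log_interior  # True: log every transaction; False: border only
        self.log: List[LogEntry] = []
        self.batches_seen = 0             # state written by the border transaction
        self.total = 0                    # state written by the interior transaction

    def border(self, batch: List[int], record: bool = True) -> List[int]:
        if record:
            self.log.append(("border", batch))
        self.batches_seen += 1
        return [2 * x for x in batch]     # output batch for the interior transaction

    def interior(self, batch: List[int], record: bool = True) -> None:
        if record and self.log_interior:
            self.log.append(("interior", batch))
        self.total += sum(batch)

    def ingest(self, batch: List[int]) -> None:
        self.interior(self.border(batch))


def recover(log: List[LogEntry], log_interior: bool) -> Engine:
    """Rebuild state by replaying logged commands into a fresh engine."""
    fresh = Engine(log_interior)
    for name, batch in log:
        if name == "border":
            out = fresh.border(batch, record=False)
            if not log_interior:
                # Border-only log: the replayed border transaction re-triggers the interior one.
                fresh.interior(out, record=False)
        else:
            # Full log: the interior transaction is replayed from its own entry.
            fresh.interior(batch, record=False)
    return fresh


full = Engine(log_interior=True)
border_only = Engine(log_interior=False)
for b in ([1, 2], [3]):
    full.ingest(b)
    border_only.ingest(b)

recovered_full = recover(full.log, log_interior=True)
recovered_border_only = recover(border_only.log, log_interior=False)
```

In this sketch the border-only log holds half as many entries as the full log, yet both recoveries arrive at the same `total` and `batches_seen`, because the interior work is regenerated from the replayed input.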
[Figure: execution walk-through for T1(s1), showing Stream 1 (columns TS, A1, A2) and Stream 2 (columns TS, A3, A4). Build steps:]
○ Batch s1.b1 is ready
○ T1,1(s1.b1) is scheduled
○ Batch s1.b2 is ready, T1,2 is scheduled, T1,1 produces output
○ T1,1 commits
○ T1,2 commits
Logging becomes a bottleneck
Strong recovery requires communication with recovery manager for each transaction redone from the log
○ OLTP+OLAP+Transactional Streaming
graphs?