Near Real-time Data Warehousing with Multi-stage Trickle & Flip
Jānis Zuters, University of Latvia, BIR 2011, October 8, 2011
This work has been supported by ESF project No. 2009/0216/1DP/1.1.1.2.0/09/APIA/VIAA/044
Near Real-time. Diagram: Data Source, CDC, ETL, Data Loading, Data Warehouse, OLAP.
Microbatch ETL. Diagram: Data Source, ETL, Data Warehouse, OLAP. Frequent data loading (e.g., hourly); data loading conflicts.
Diagram: Data Source, ETL, Staging Tables, copy, Copy of Staging Tables, flip (1h), Data Warehouse, OLAP.
Diagram: Data Source, ETL, Staging Tables, copy, Copy of Staging Tables, flip (1h), Data Warehouse (Real-time Partition, Static Data), OLAP.
Callout: copies of real-time data!
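The trickle & flip cycle on these slides can be sketched in a few lines of Python. This is a minimal in-memory model, not the author's implementation: the class `TrickleAndFlip`, its method names, and the use of Python lists in place of database tables are all assumptions made for illustration. ETL trickles rows into the staging tables, and each flip (hourly in the slides) copies the staging tables and swaps the copy in as the real-time partition that OLAP reads.

```python
class TrickleAndFlip:
    """Toy model of single-stage trickle & flip (all names are hypothetical)."""

    def __init__(self):
        self.staging = []         # staging tables, fed continuously by ETL
        self.live_partition = []  # real-time partition read by OLAP queries

    def trickle(self, row):
        # ETL appends to staging only; queries never read this table.
        self.staging.append(row)

    def flip(self):
        # Copy the staging tables, then swap the copy in as the
        # real-time partition. The swap is one reference assignment.
        snapshot = list(self.staging)
        self.live_partition = snapshot

    def query(self):
        # OLAP sees only what the last flip published.
        return list(self.live_partition)


tf = TrickleAndFlip()
tf.trickle({"id": 1})
tf.trickle({"id": 2})
before = tf.query()   # nothing published yet
tf.flip()
tf.trickle({"id": 3})
after = tf.query()    # rows 1 and 2 only; row 3 waits for the next flip
```

In this sketch every flip copies the whole staging table, so the amount of copied data grows over the period. That is plausibly the cost the "copies of real-time data!" callout points at.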
Diagram: Data Source, ETL, Staging Tables 0, Data Warehouse, OLAP.
Stage 1: Real-time Partition 1, add?, Staging Tables 1, add (5 min.).
Stage 2: Real-time Partition 2, add?, Staging Tables 2, add (1 hour).
Static Data.
ALGORITHM multiple_trickle_and_flip_refresh (R1, M, H`, H)
  H – real-time partition for the current hour
  M, H` – staging partitions
  R1 – refreshment rate (e.g., 5 minutes)
  M is being continuously fed from the source
BEGIN
  Do Every R1              % e.g., every 5 minutes
    Add M to H`
    Empty M
    If H is available      % not locked by querying
      Add H` to H
      Empty H`
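One refresh step of the algorithm above can be sketched in Python as follows. This is an illustrative in-memory model under stated assumptions: the class `MultiStageTrickleAndFlip`, the `feed` method, and the boolean `h_locked` flag (standing in for "H is locked by querying") are hypothetical, and lists stand in for partitions. The scheduler that calls `refresh` every R1 is left out on purpose.

```python
class MultiStageTrickleAndFlip:
    """Toy model of one stage of the refresh algorithm (hypothetical names)."""

    def __init__(self):
        self.m = []            # M: staging partition, continuously fed
        self.h_prime = []      # H`: intermediate staging partition
        self.h = []            # H: real-time partition for the current hour
        self.h_locked = False  # assumed flag: True while queries read H

    def feed(self, row):
        # The source continuously feeds M.
        self.m.append(row)

    def refresh(self):
        """One iteration of 'Do Every R1' (e.g., every 5 minutes)."""
        self.h_prime.extend(self.m)      # Add M to H`
        self.m.clear()                   # Empty M
        if not self.h_locked:            # If H is available
            self.h.extend(self.h_prime)  # Add H` to H
            self.h_prime.clear()         # Empty H`


dw = MultiStageTrickleAndFlip()

dw.feed("a")
dw.h_locked = True   # a query holds H, so the new row must wait in H`
dw.refresh()
state_locked = (list(dw.m), list(dw.h_prime), list(dw.h))

dw.feed("b")
dw.h_locked = False  # H is free again, so everything pending reaches H
dw.refresh()
state_free = (list(dw.m), list(dw.h_prime), list(dw.h))
```

While H is locked, rows accumulate in H` and are not lost. They move to H on the first refresh that finds H available, so only the newly arrived rows are added rather than a full copy.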
Diagram: Data Source, ETL, Staging Tables 0, Staging Tables 1, Staging Tables 2, Real-time Partition 1, Real-time Partition 2, Static Data, OLAP.
Jānis Zuters, University of Latvia