READY: Completeness is in the Eye of the Beholder
- B. Chandramouli, J. Gehrke, J. Goldstein, M. Hofmann,
- D. Kossmann, J. Levandoski, R. Marroquin, W. Xin
ETH Zurich, Facebook, Microsoft
Observations
Observation 1: We produce data in silos
DB DB Files Data Lake ERP CRM
PowerBI
SQL, concurrency, integrity
SQL snapshots, integrity
DB DB
Files Files
DB
ERP CRM PowerBI
SQL, concurrency, integrity
constraints of another app
○ queries on USA orders
○ report only when all USA
○ queries on Toys orders
○ report only when all Toys
Database states that meet USA Analyst’s constraints: 3, 4, 8.
Database states that meet Toys Analyst’s constraints: 1, 6.
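The idea that each analyst sees only the database states satisfying their own completeness constraint can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the system described in the talk: the `Order` fields, the `processed` flag, and the helper names `all_processed` and `visible_states` are hypothetical, and the example states are invented (they are not the states numbered on the slide). The slide text cuts off the analysts' exact conditions, so "all matching orders are processed" is only a stand-in.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Order:
    country: str
    category: str
    processed: bool  # hypothetical stand-in for the analysts' real condition


State = List[Order]
Constraint = Callable[[State], bool]


def all_processed(field: str, value: str) -> Constraint:
    """Build a constraint: every order whose `field` equals `value` is processed."""
    def check(state: State) -> bool:
        return all(o.processed for o in state if getattr(o, field) == value)
    return check


def visible_states(states: Dict[int, State], constraint: Constraint) -> List[int]:
    """Return the ids of the states that satisfy the consumer's constraint."""
    return [sid for sid, state in sorted(states.items()) if constraint(state)]


# Invented example states, keyed by state id.
states = {
    1: [Order("USA", "Toys", True), Order("USA", "Books", False)],
    2: [Order("USA", "Toys", False), Order("DE", "Books", True)],
    3: [Order("USA", "Books", True), Order("DE", "Toys", False)],
}

usa_analyst = all_processed("country", "USA")
toys_analyst = all_processed("category", "Toys")

print(visible_states(states, usa_analyst))   # [3]
print(visible_states(states, toys_analyst))  # [1]
```

The same set of states yields a different "complete" subset for each analyst, which is the point of the title: completeness depends on who is looking.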
Producer V1 V2 V3 V4 V5
all versions of data lake
Consumer1
all versions visible to C1 (versions meet C1’s constraints)
query (last)
query (next)
Consumer2
all versions visible to C2 (versions meet C2’s constraints)
○ Each transaction / query runs in a sandbox
○ Sandboxes define:
■ which snapshots of data lake are visible
■ which business objects are visible
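A minimal sketch of how a sandbox could filter the producer's versions V1 to V5 for one consumer, and how "query (last)" and "query (next)" might resolve to a version. Everything here is an assumption made for illustration: the `Sandbox` class, its method names, the snapshot layout with `usa_pending` / `toys_pending` counters, and the reading of "last" as the newest visible version and "next" as the first visible version after a given one. It is not the implementation described in the talk.

```python
from bisect import bisect_right
from typing import Callable, Dict, List, Optional

Snapshot = Dict[str, int]


class Sandbox:
    """Tracks which data lake versions satisfy one consumer's constraint."""

    def __init__(self, name: str, constraint: Callable[[Snapshot], bool]) -> None:
        self.name = name
        self.constraint = constraint
        self.visible: List[int] = []  # version numbers, in increasing order

    def observe(self, version: int, snapshot: Snapshot) -> None:
        """Record a newly produced version if it meets the constraint."""
        if self.constraint(snapshot):
            self.visible.append(version)

    def last(self) -> Optional[int]:
        """Assumed meaning of query (last): the newest visible version."""
        return self.visible[-1] if self.visible else None

    def next(self, after: int) -> Optional[int]:
        """Assumed meaning of query (next): first visible version after `after`."""
        i = bisect_right(self.visible, after)
        return self.visible[i] if i < len(self.visible) else None


# Invented producer history: counts of orders that are not yet complete.
versions = {
    1: {"usa_pending": 2, "toys_pending": 0},
    2: {"usa_pending": 0, "toys_pending": 1},
    3: {"usa_pending": 0, "toys_pending": 0},
    4: {"usa_pending": 1, "toys_pending": 0},
    5: {"usa_pending": 0, "toys_pending": 3},
}

c1 = Sandbox("Consumer1", lambda s: s["usa_pending"] == 0)
c2 = Sandbox("Consumer2", lambda s: s["toys_pending"] == 0)

for number in sorted(versions):
    for sandbox in (c1, c2):
        sandbox.observe(number, versions[number])

print(c1.visible, c1.last(), c1.next(3))  # [2, 3, 5] 5 5
print(c2.visible, c2.last(), c2.next(4))  # [1, 3, 4] 4 None
```

In this sketch a `None` from `next` means no later version has satisfied the constraint yet; a real system would presumably wait for the producer rather than return immediately.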
TPC-H Update Functions
No constraints
TPC-H Queries
Synthetic Integrity Constraints
DB Producer
DB Consumer1 Consumer2
Sandbox1 AND Sandbox2
DB Producer
DB Consumer1
Sandbox1
DB Consumer1
Sandbox1
SF1 SF10