Autonomous ETL With Materialized Views
Abhishek Somani, Adesh Rao
May 2018
Autonomous ETL With Materialized Views Abhishek Somani, Adesh Rao - - PowerPoint PPT Presentation
Autonomous ETL With Materialized Views Abhishek Somani, Adesh Rao May 2018 Agenda 1. Standard techniques for structuring data for SQL-on-Hadoop (Hive, Presto, Spark etc) 2. Difficulties in structuring data 3. A case for Materialized Views
Abhishek Somani, Adesh Rao
May 2018
2
3
4
5
ORC Parquet
6
7
Speedup of Unsorted vs Sorted ORC data on TPCDS scale 1000
8
9
10
11
Example:
CREATE MATERIALIZED VIEW mv AS SELECT seller_id, seller_name, num_item*cost AS value FROM sales;
Effect: Query rewrite
SELECT seller_id, num_item*cost AS value FROM sales; ~ SELECT seller_id, value FROM mv;
12
13
Query1: SELECT * from T1 where customer_id = 26988 and month
= “January”;
Rewritten: SELECT * from MV1 where customer_id = 26988 and
month = “January”;
Query2: SELECT * from T1 where seller_id = 121 and month =
“January”;
Rewritten: SELECT * from MV2 where seller_id = 121 and month =
“January”;
Original Table T1:
Materialized View MV1:
Materialized View MV2:
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
Abhishek Somani, Adesh Rao
May 2018