Motivation Why are Views Useful? Give an example query: Workloads - - PowerPoint PPT Presentation

motivation why are views useful
SMART_READER_LITE
LIVE PREVIEW

Motivation Why are Views Useful? Give an example query: Workloads - - PowerPoint PPT Presentation

Motivation Why are Views Useful? Give an example query: Workloads often have repeating patterns: SELECT l.partkey FROM lineitem l, orders o WHERE l.orderkey = o.orderkey AND o.orderdate > DATE(2015-03-31) ORDER BY l.shipdate DESC


slide-1
SLIDE 1
slide-2
SLIDE 2
slide-3
SLIDE 3
slide-4
SLIDE 4
slide-5
SLIDE 5
slide-6
SLIDE 6
slide-7
SLIDE 7
slide-8
SLIDE 8
slide-9
SLIDE 9
slide-10
SLIDE 10
slide-11
SLIDE 11
slide-12
SLIDE 12
slide-13
SLIDE 13
slide-14
SLIDE 14
slide-15
SLIDE 15
slide-16
SLIDE 16

SELECT l.partkey FROM lineitem l, orders o WHERE l.orderkey = o.orderkey AND o.orderdate > DATE(’2015-03-31’) ORDER BY l.shipdate DESC LIMIT 10; SELECT l.partkey, COUNT(*) FROM lineitem l, orders o WHERE l.orderkey = o.orderkey AND o.orderdate > DATE(’2015-03-31’) GROUP BY l.partkey; SELECT l.suppkey, COUNT(*) FROM lineitem l, orders o WHERE l.orderkey = o.orderkey AND o.orderdate > DATE(’2015-03-31’) GROUP BY l.suppkey;

Workloads often have repeating patterns:

Give an example query:

CREATE VIEW salesSinceLastMonth AS SELECT l.* FROM lineitem l, orders o WHERE l.orderkey = o.orderkey AND o.orderdate > DATE(’2015-03-31’) SELECT partkey FROM salesSinceLastMonth ORDER BY shipdate DESC LIMIT 10; SELECT suppkey, COUNT(*) FROM salesSinceLastMonth GROUP BY suppkey; SELECT partkey, COUNT(*) FROM salesSinceLastMonth GROUP BY partkey;

View Definition

Motivation — Why are Views Useful?

SELECT partkey FROM salesSinceLastMonth ORDER BY shipdate DESC LIMIT 10; SELECT partkey FROM ( SELECT l.* FROM lineitem l, orders o WHERE l.orderkey = o.orderkey AND o.orderdate > DATE(’2015-03-31’) ) AS salesSinceLastMonth

Views act as normal relations

Definition — What is a View / How are they used?

slide-17
SLIDE 17

AND o.orderdate > DATE(’2015-03-31’) ) AS salesSinceLastMonth ORDER BY shipdate DESC LIMIT 10;

Analogous to a function Complex query patterns can be given an shorthand Can freely change view logic “in the background” (Change ‘last month’)

Views contain and abstract concepts But not quite normal relations…

Easy… rows in salesSinceLastMonth go 1-1 with LINEITEM. Can find the row of line item that matches a given row of salesSinceLastMonth and update it.

UPDATE salesSinceLastMonth SET statusCode = ‘q’; WHERE orderkey = 22;

Harder… What happens if order #22 doesn’t exist? How does the insertion interact with sequences (e.g., Lineitem.lineno)

INSERT INTO salesSinceLastMonth (orderkey, partkey, suppkey, …) VALUES (22, 99, 42, …); CREATE TRIGGER salesSinceLastMonthInsert INSTEAD OF INSERT ON salesSinceLastMonth REFERENCING NEW ROW AS newRow FOR EACH ROW IF NOT EXISTS ( SELECT * FROM ORDERS WHERE ORDERS.orderkey = newRow.orderKey) ) THEN INSERT INTO ORDERS(orderkey) VALUES (orderkey) END IF; INSERT INTO LINEITEM VALUES newRow; END FOR;

InsteadOf triggers update rows

View Updates

Views exist because they’re queried frequently…

Precompute (materialize) the view’s contents (like an index)

Why not use them to make computations faster. Challenges:

View Materialization

slide-18
SLIDE 18

What happens when the data behind the view changes? What happens when the view definition changes? What happens when we write a query without realizing we have a view? Q(D) is the result of your query on the database

Let’s say you have a database D and a query Q

Q(D+ΔD) is the new result

Let’s say you make a change ΔD (e.g., Insert Tuple)

Analogy to Sum {34,29,10,15} + {12} (== 88+12)

If we have Q(D), can we get Q(D+ΔD) faster?

Projection Selection Union Cross-Product Aggregation

Specific query examples

Insert Delete Update

Interactions with...

Updates to Materialized Views

CREATE MATERIALIZED VIEW salesSinceLastMonth AS SELECT l.* FROM lineitem l, orders o WHERE l.orderkey = o.orderkey AND o.orderdate > DATE(’2015-03-31’) SELECT l.partkey FROM lineitem l, orders o WHERE l.orderkey = o.orderkey AND o.orderdate > DATE(’2015-03-31’) ORDER BY l.shipdate DESC LIMIT 10;

Can we use materialized views without knowing about them?

View: SELECT Lv FROM Rv WHERE Cv Query: SELECT Lq FROM Rq WHERE Cq

Simplify the query model:

View Selection

slide-19
SLIDE 19

Rv ⊆ Rq (All relations in the view are in the query join) Cq = Cv ⋀ C’ (The view condition is weaker than the query condition) Lq ∩ attrs(Rv) ⊆ Lv (The view doesn’t project away attributes needed for the output) attrs(C’) ∩ attrs(Rv) ⊆ Lv (The view doesn’t project away attributes needed for the condition)

When can we rewrite this query?

SELECT Lq FROM (Rq-Rv), view WHERE C’

The whole thing rewrites to:

Views for Transactions

slide-20
SLIDE 20

Incremental View Maintenance

Not covered by Database Systems: TCB

1

slide-21
SLIDE 21

Materialized Views

2

Q( )

When the base data changes, the view needs to be updated

slide-22
SLIDE 22

Materialized Views

3

Q( )

When the base data changes, the view needs to be updated

slide-23
SLIDE 23

View Maintenance

4

VIEW ← Q(D)

slide-24
SLIDE 24

View Maintenance

5

WHEN D ← D+ΔD DO:

Re-evaluating the query from scratch is expensive!

VIEW ← Q(D+ΔD)

slide-25
SLIDE 25

View Maintenance

6

VIEW ← VIEW+ΔQ(D,ΔD) WHEN D ← D+ΔD DO:

(ideally) Smaller & Faster Query (ideally) Fast “merge” operation.

slide-26
SLIDE 26

Intuition

7

D = {1, 2, 3, 4} ΔD = {5} Q(D) = SUM(D) Q(D+ΔD) ~ O(|D|+|ΔD|) VIEW + SUM(ΔD) ~ O(|ΔD|)

slide-27
SLIDE 27

Intuition

8

R = {1, 2, 3}, S ={5,6} ΔR = {4} Q(R,S) = COUNT(R x S) Q(R+ΔR,S) ~ O( (|R|+|ΔR|) * |S| ) VIEW + COUNT(|ΔR|*|S|) ~ O(|ΔR|*|S|)

slide-28
SLIDE 28

Intuition

9

+ ~ U * ~ x

Are these kinds of patterns common?

slide-29
SLIDE 29

Rings/Semirings

10

This kind of pattern occurs frequently. Semiring : < S, +, x, S0, S1 > Any set of ‘things’ S such that… Si + Sj = Sk Si x Sj = Sk Si x (Sj + Sk) = (Si x Sj) + (Sj x Sk) Si + S0 = Si Si x S1 = Si Closed Distributive

Additive & Multiplicative “zeroes”

Si x S0 = S0

slide-30
SLIDE 30

Rings/Semirings

11

Ring : < S, +, x, S0, S1, - > Any semiring where every element has an additive inverse… Si + (-Si) = S0

slide-31
SLIDE 31

12

THE TANGENT ENDS NOW

slide-32
SLIDE 32

Incremental View Maintenance

13

VIEW ← VIEW+ΔQ(D,ΔD) WHEN D ← D+ΔD DO:

What does ΔR represent? How to interpret R + ΔR? Basic Challenges of IVM How to compute ΔQ?

slide-33
SLIDE 33

What is ΔR?

14

What does it need to represent? Insertions Deletions Updates

(Delete Old Record & Insert Updated Record)

slide-34
SLIDE 34

What is ΔR?

15

A Set/Bag of Insertions A Set/Bag of Deletions

slide-35
SLIDE 35

What is +?

16

A Set/Bag of Insertions A Set/Bag of Deletions A Set/Bag

R ΔR + +

R ⋃ ΔRinserted

  • ΔRdeleted

But this breaks closure of ‘+’!

slide-36
SLIDE 36

Incremental View Maintenance

18

VIEW ← VIEW+ΔQ(D,ΔD)

Construct ΔQ(R,ΔR,S,ΔS,…) Given Q(R,S,…)

slide-37
SLIDE 37

Delta Queries

19

R σ R ΔR σ Original R Inserted Tuples of R

Does this work for deleted tuples?

slide-38
SLIDE 38

Delta Queries

20

R π R ΔR π

Does this work (completely) under set semantics?

slide-39
SLIDE 39

Delta Queries

21

R1 R1 ΔR1 R2 U R2 ΔR2

slide-40
SLIDE 40

Delta Queries

22

R R ΔR S x S

slide-41
SLIDE 41

Delta Queries

23

R : { 1, 2, 3 } S : { 5, 6} R x S = { <1,5>, <1, 6>, <2,5>, <2,6>, <3,5>, <3,6> } ΔRinserted = { 4 } ΔRdeleted = { 3,2 } (R+ΔR) x S = { <1,5>, <1, 6>, <4,5>, <4,6> } Δinserted(R x S) = ΔRinserted x S Δdeleted(R x S) = ΔRdeleted x S

What if R and S both change?

slide-42
SLIDE 42

Delta Queries

24

Computing a Delta Query

slide-43
SLIDE 43

Delta Queries

25

The original query The delta query

slide-44
SLIDE 44

26

How about an example…

slide-45
SLIDE 45

Delta Queries

27

CUSTOMER ORDERS LINEITEM

Let’s say you have an insertion into LINEITEM

slide-46
SLIDE 46

Delta Queries

28

CUSTOMER ORDERS LINEITEM

slide-47
SLIDE 47

Delta Queries

29

CUSTOMER ORDERS LINEITEM

= ø

slide-48
SLIDE 48

Delta Queries

30

CUSTOMER ORDERS LINEITEM

slide-49
SLIDE 49

Delta Queries

31

CUSTOMER ORDERS LINEITEM

slide-50
SLIDE 50

Delta Queries

32

SELECT * FROM CUSTOMER C, ORDERS O, DELTA_LINEITEM DL WHERE C.custkey = O.custkey AND DL.orderkey = O.orderkey AND C.mktsegment = … AND O.orderdate = … AND DL.shipdate = …

slide-51
SLIDE 51

Multisets

33

{ 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 4, 4, 5 } (not compact) { 1 → x3, 2 → x5, 3 → x2, 4 → x6, 5 → x1 } Multiset representation: Tuple → # of occurrences multiplicity

slide-52
SLIDE 52

Multiset Deltas

34

Insertions = Positive Multiplicity Deletions = Negative Multiplicity + = Bag/Multiset Union

slide-53
SLIDE 53

Multiset Deltas

35

What does Union do? { A→1, B→3 } ⋃ { B→2, C→4 } = { A→1, B→5, C→4 } { A→1 } ⋃ { A→-1 } = { A→0 } = { }

slide-54
SLIDE 54

Multiset Deltas

36

What does Union do? { A→1, B→3 } ⋃ { B→2, C→4 } = { A→1, B→5, C→4 } { A→1 } ⋃ { A→-1 } = { A→0 } = { } What does Cross Product do? { A→1, B→3 } x { C→4 } = { <A,C>→?, <B,C>→? }

slide-55
SLIDE 55

Multiset Deltas

37

What does Union do? { A→1, B→3 } ⋃ { B→2, C→4 } = { A→1, B→5, C→4 } { A→1 } ⋃ { A→-1 } = { A→0 } = { } What does Cross Product do? { A→1, B→3 } x { C→4 } = { <A,C>→4, <B,C>→? }

slide-56
SLIDE 56

Multiset Deltas

38

What does Union do? { A→1, B→3 } ⋃ { B→2, C→4 } = { A→1, B→5, C→4 } { A→1 } ⋃ { A→-1 } = { A→0 } = { } What does Cross Product do? { A→1, B→3 } x { C→4 } = { <A,C>→4, <B,C>→12 }

slide-57
SLIDE 57

Multiset Deltas

39

What does projection do? πAttr1{ <A,X>→1, <A,Y>→2, <B,Z>→5 } = { <A>→1, <A>→2, <B>→5 } = { <A>→3, <B>→5 }

This effect seems… familiar

slide-58
SLIDE 58

40

If you find this subject interesting… let’s chat. http://www.dbtoaster.org