Towards a Formal Model for View Maintenance in Data Warehouses


Towards a Formal Model for View Maintenance in Data Warehouses. D. Agrawal, A. El Abbadi, A. Mostéfaoui, M. Raynal and M. Roy. Univ. Santa Barbara, California; IRISA, Rennes, France.


SLIDE 1

Towards a Formal Model for View Maintenance in Data Warehouses

D. Agrawal, A. El Abbadi, A. Mostéfaoui, M. Raynal and M. Roy

Univ. Santa Barbara, California
IRISA, Rennes, France

Towards a Formal Model for View Maintenance in Data Warehouses – p.1/22

SLIDE 2

Summary

The Data Warehouse Problem
  • Definitions
  • Existing protocols
A Formal Definition of the Problem
  • Formal Definition of Data Objects
  • Abstract Definition of View Management
The Protocol
  • A Virtual Topology
  • A Pipelining Technique

SLIDE 3

The Data Warehouse Problem

A set of databases $x_1, x_2, \ldots, x_n$

How to efficiently query a database aggregate?

[Figure: a query addressed to the data sources x1, x2, x3, x4, x5]

SLIDE 4

The Data Warehouse Problem

A set of databases $x_1, x_2, \ldots, x_n$

How to efficiently query a database aggregate? By adding a Data Warehouse.

[Figure: the query goes to a Data Warehouse placed in front of the data sources x1, x2, x3, x4, x5]

SLIDE 5

Data Warehouse: Definition

The Data Warehouse maintains a DB summary: a Select-Project-Join (SPJ) expression over the data sources:

$F(x_1, \ldots, x_n) = x_1 \bowtie x_2 \bowtie \cdots \bowtie x_n$

Data Warehouse (DWH) problem

computation of a "simple" distributed function with changing Data Sources.

SLIDE 6

Extremal Solutions

The DWH maintains the total aggregation of all Data Sources.
  • costly in space
  • unnecessary network usage

SLIDE 7

Extremal Solutions

The DWH maintains the total aggregation of all Data Sources.
  • costly in space
  • unnecessary network usage

The DWH stores no datum, and forwards queries to Data Sources.
  • high latency
  • unnecessary network usage
  • the DWH is only a proxy

SLIDE 8

Proposed Solutions

  • The DWH maintains the SPJ expression
  • Periodically, it calculates the increment $\Delta F$
  • Major Problem: asynchrony of updates on Data Sources, hence Error Terms

SLIDE 9

Major Difficulties

Asynchrony and distribution of the model:

SLIDE 10

Major Difficulties

Asynchrony and distribution of the model:
  • Consistency issues
  • Performance issues
      • network usage
      • memory/disk usage on dwh.

SLIDE 11

Major Difficulties

Asynchrony and distribution of the model:
  • Consistency issues
  • Performance issues
      • network usage
      • memory/disk usage on dwh.

Complexity of proposed protocols:

SLIDE 12

Major Difficulties

Asynchrony and distribution of the model:
  • Consistency issues
  • Performance issues
      • network usage
      • memory/disk usage on dwh.

Complexity of proposed protocols:
  • unproved algorithms
  • need for a formal definition of the problem.

SLIDE 13

Formal Definitions (data)

Data Objects, denoted $x_i$
  • a data manager is associated with each $x_i$
  • $x_i$ can be updated and read using the query/update primitives

Timeline: the successive values of $x_i$ are denoted $x_i^0, x_i^1, x_i^2, \ldots$

SLIDE 14

Formal Definitions (operations)

Data Operations
  • add/remove tuples, denoted $+$: associative, commutative.
  • a join operation, denoted $\bowtie$ (written * in the figures): associative, commutative, distributive over $+$.
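The two operations can be made concrete with a small model. The sketch below is only an illustration, not the authors' formalization: it assumes a relation is a signed multiset of tuples (positive counts for inserted tuples, negative counts for removed ones), and the helper names `row`, `plus` and `join` are hypothetical. Under that assumption it checks that the join distributes over add/remove on a tiny example.

```python
from collections import Counter

# Assumption: a relation is a dict mapping a tuple of (attribute, value)
# pairs to a signed count. Negative counts stand for removed tuples.

def row(**attrs):
    """Build a hashable tuple from attribute/value pairs."""
    return tuple(sorted(attrs.items()))

def plus(r, s):
    """Add/remove tuples: sum the signed counts, drop tuples whose count is 0."""
    out = Counter()
    for rel in (r, s):
        for t, c in rel.items():
            out[t] += c
    return {t: c for t, c in out.items() if c != 0}

def join(r, s):
    """Natural join: merge tuples that agree on shared attributes, multiply counts."""
    out = Counter()
    for t1, c1 in r.items():
        d1 = dict(t1)
        for t2, c2 in s.items():
            d2 = dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                merged = {**d1, **d2}
                out[tuple(sorted(merged.items()))] += c1 * c2
    return {t: c for t, c in out.items() if c != 0}

x1 = {row(k=1, a="p"): 1, row(k=2, a="q"): 1}
x2 = {row(k=1, b="u"): 1}
# One update that inserts a tuple and removes another.
delta = {row(k=2, b="v"): 1, row(k=1, b="u"): -1}

lhs = join(x1, plus(x2, delta))
rhs = plus(join(x1, x2), join(x1, delta))
print(lhs == rhs)
```

Signed counts are what let a single operation cover both insertion and removal, which is why one symbol suffices for add/remove.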

SLIDE 15

Formal Definitions (dwh)

the Data Warehouse calculates $F$ such that

$F = x_1 \bowtie x_2 \bowtie \cdots \bowtie x_n$

  • consistency is mandatory at any time.
  • up-to-dateness is only eventual, for performance reasons.

SLIDE 16

Abstract Def. of View Management

Validity: any query on the dwh returns an $F = x_1^{k_1} \bowtie x_2^{k_2} \bowtie \cdots \bowtie x_n^{k_n}$.

SLIDE 17

Abstract Def. of View Management

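Validity: any query on the dwh returns an $F = x_1^{k_1} \bowtie x_2^{k_2} \bowtie \cdots \bowtie x_n^{k_n}$.

Order Consistency: for $F^a = x_1^{a_1} \bowtie \cdots \bowtie x_n^{a_n}$ and $F^b = x_1^{b_1} \bowtie \cdots \bowtie x_n^{b_n}$, if $F^a$ was issued before $F^b$, then $\forall i,\ a_i \le b_i$.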

SLIDE 18

Abstract Def. of View Management

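Validity: any query on the dwh returns an $F = x_1^{k_1} \bowtie x_2^{k_2} \bowtie \cdots \bowtie x_n^{k_n}$.

Order Consistency: for $F^a = x_1^{a_1} \bowtie \cdots \bowtie x_n^{a_n}$ and $F^b = x_1^{b_1} \bowtie \cdots \bowtie x_n^{b_n}$, if $F^a$ was issued before $F^b$, then $\forall i,\ a_i \le b_i$.

Up-to-Dateness: for any $x_i^k$, an infinite sequence of queries will return at least an $F = \cdots \bowtie x_i^{k'} \bowtie \cdots$ with $k' \ge k$.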
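The properties can be checked mechanically on a recorded trace. The sketch below is a hedged illustration: it assumes each query answer is tagged with a version vector holding, for every data source, the index of the value that the answer reflects. That tagging and the function names are assumptions made for the example, not part of the slides.

```python
from typing import Sequence, Tuple

Version = Tuple[int, ...]  # one version index per data source (assumed encoding)

def order_consistent(history: Sequence[Version]) -> bool:
    """True if every answer is component-wise >= the answer issued just before it."""
    return all(
        a <= b
        for prev, cur in zip(history, history[1:])
        for a, b in zip(prev, cur)
    )

def reflects(history: Sequence[Version], source: int, version: int) -> bool:
    """True if some answer in the trace already includes the given version of a source."""
    return any(v[source] >= version for v in history)

good = [(0, 0), (1, 0), (1, 2)]
bad = [(1, 0), (0, 1)]   # the second answer goes back in time on source 0

print(order_consistent(good), order_consistent(bad))
print(reflects(good, source=1, version=2))
```

Up-to-dateness speaks about infinite query sequences, so on a finite trace `reflects` can only witness that a version has been reached; it cannot refute the property.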

SLIDE 19

The Protocol: a single update

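Suppose that $F = x_1 \bowtie x_2 \bowtie x_3 \bowtie x_4$.

If $x_1$ is updated to $x_1 + \delta_1$, then the corresponding $\Delta F$ is:

$\Delta F = \delta_1 \bowtie x_2 \bowtie x_3 \bowtie x_4$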
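The form of the increment follows from distributivity of the join over add/remove. A short derivation, assuming the notation $+$ for add/remove, $\bowtie$ for the join, and $F = x_1 \bowtie x_2 \bowtie x_3 \bowtie x_4$ as in the four-source figure ($F'$ is a name introduced here for the view after the update):

```latex
\begin{aligned}
F' &= (x_1 + \delta_1) \bowtie x_2 \bowtie x_3 \bowtie x_4 \\
   &= (x_1 \bowtie x_2 \bowtie x_3 \bowtie x_4)
      + (\delta_1 \bowtie x_2 \bowtie x_3 \bowtie x_4) \\
   &= F + \Delta F,
   \qquad \Delta F = \delta_1 \bowtie x_2 \bowtie x_3 \bowtie x_4 .
\end{aligned}
```

The warehouse therefore never needs to recompute $F$ from scratch: adding $\Delta F$ to the stored view is enough, provided no other source changes while $\Delta F$ is being computed.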

SLIDE 20

The Protocol: a single update

Suppose that $F = x_1 \bowtie x_2 \bowtie x_3 \bowtie x_4$.

[Figure: data sources x1, x2, x3, x4 and the DWH]

SLIDE 21

The Protocol: a single update

Suppose that $F = x_1 \bowtie x_2 \bowtie x_3 \bowtie x_4$.

[Figure: data sources x1, x2, x3, x4 and the DWH; propagated term: δ1]

SLIDE 22

The Protocol: a single update

Suppose that $F = x_1 \bowtie x_2 \bowtie x_3 \bowtie x_4$.

[Figure: data sources x1, x2, x3, x4 and the DWH; propagated term: δ1*x2]

SLIDE 23

The Protocol: a single update

Suppose that $F = x_1 \bowtie x_2 \bowtie x_3 \bowtie x_4$.

[Figure: data sources x1, x2, x3, x4 and the DWH; propagated term: δ1*x2*x3]

SLIDE 24

The Protocol: a single update

Suppose that $F = x_1 \bowtie x_2 \bowtie x_3 \bowtie x_4$.

[Figure: data sources x1, x2, x3, x4 and the DWH; resulting term: ΔF = δ1*x2*x3*x4]

SLIDE 25

The Protocol: Concurrent Updates

Now, suppose that both $x_1$ and $x_2$ are updated, to $x_1 + \delta_1$ and $x_2 + \delta_2$.

$\Delta F = (\delta_1 \bowtie x_2 \bowtie x_3 \bowtie x_4) + (x_1 \bowtie \delta_2 \bowtie x_3 \bowtie x_4) + (\delta_1 \bowtie \delta_2 \bowtie x_3 \bowtie x_4)$

complexity increases with concurrency

two solutions:

  • 1. compute error terms
  • 2. order the updates
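Both solutions can be compared on numbers. The sketch below is an illustration under one stated assumption: integers with ordinary addition and multiplication stand in for relations with add/remove and join, since they obey the same associativity, commutativity and distributivity laws. The values are arbitrary.

```python
# Assumption: integers under + and * model relations under add/remove and join.
x1, x2, x3, x4 = 3, 5, 7, 11
d1, d2 = 2, -1            # concurrent updates on x1 and x2

old_F = x1 * x2 * x3 * x4
new_F = (x1 + d1) * (x2 + d2) * x3 * x4

# Each update evaluated against the OLD value of the other source.
naive = d1 * x2 * x3 * x4 + x1 * d2 * x3 * x4

# Solution 1: compensate with the error term involving both updates.
error_term = d1 * d2 * x3 * x4

# Solution 2: order the updates, so the second one sees the first one applied.
ordered = d1 * x2 * x3 * x4 + (x1 + d1) * d2 * x3 * x4

print(new_F - old_F, naive, error_term, ordered)
```

With k concurrent updates the first approach needs a term for every subset of updates, whereas ordering keeps one increment per update. That is the motivation for imposing an order.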

SLIDE 26

The Protocol: a Virtual Topology

  • the star topology (center: dwh, edges: nodes) is seen as a ring
  • a token perpetually moves on the ring
  • it generates a natural order on updates

[Figure: data sources x1, x2, x3, x4 arranged as a ring, with the DWH]

SLIDE 27

The Protocol: Pipelining Updates

  • The token generates a global time (# of steps) $sn$, tracked as $sn_{current}$ and $sn_{lastcommitted}$.
  • when an update has made a total rotation, it can be integrated into the data warehouse.
  • the token can contain up to $n$ updates in commitment phase.

SLIDE 28

The Protocol: Code (1)

when the token arrives to $x_i$ with sequence number $sn$:

  • 1. let $\Delta F = token[i]$;
  • 2. if ($\Delta F \neq \emptyset$) then $sn \leftarrow sn + 1$; send incr$(\Delta F, sn)$ to DWH endif;
  • 3. $token[i] \leftarrow \delta_i$;
  • 4. $\forall j \neq i$ do $token[j] \leftarrow (token[j] \bowtie (x_i - \delta_i))$ enddo;
  • 5. $\delta_i \leftarrow \emptyset$;
  • 6. send token sn$(token, sn)$ to next data source

SLIDE 29

The Protocol: Code (2)

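when update $\delta$ is received by $x_i$:

  • 1. $x_i \leftarrow x_i + \delta$;
  • 2. $\delta_i \leftarrow \delta_i + \delta$

when incr$(\Delta F, sn)$ is received by DWH:

  • 1. wait ($next\_sn = sn$);
  • 2. $F \leftarrow F + \Delta F$;
  • 3. $next\_sn \leftarrow next\_sn + 1$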
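The handlers on the two code slides can be exercised in a small simulation. This is a sketch under several assumptions rather than the authors' code: integers with + and * stand in for relations with add/remove and join; an empty token slot is `None`; a pending update equal to 0 counts as no update; `next_sn` starts at 1; and step 4 is read as joining each in-flight entry with the local value minus the not-yet-ordered local update. The names `Source`, `Warehouse`, `on_token` and `run` are hypothetical.

```python
class Source:
    def __init__(self, value):
        self.x = value      # current local value x_i
        self.delta = 0      # local updates not yet placed in the token

    def on_update(self, d):
        self.x += d         # apply the update locally
        self.delta += d     # remember it until the token arrives


class Warehouse:
    def __init__(self, f0):
        self.F = f0
        self.next_sn = 1    # assumed initial value
        self.waiting = {}   # increments that arrived early, keyed by sn

    def on_incr(self, dF, sn):
        self.waiting[sn] = dF
        # "wait (next_sn = sn)": apply increments strictly in sn order.
        while self.next_sn in self.waiting:
            self.F += self.waiting.pop(self.next_sn)
            self.next_sn += 1


def on_token(i, sources, token, sn, dwh):
    """Token handler at source i; returns the sequence number to forward."""
    src = sources[i]
    dF = token[i]                        # entry that completed a full rotation
    if dF is not None:
        sn += 1
        dwh.on_incr(dF, sn)
    token[i] = src.delta if src.delta != 0 else None
    for j in range(len(token)):
        if j != i and token[j] is not None:
            # In-flight updates are ordered before the new local one,
            # so they must see the local value without it.
            token[j] *= src.x - src.delta
    src.delta = 0
    return sn


def run(sources, dwh, rotations):
    """Circulate a fresh token around the ring for a number of rotations."""
    n = len(sources)
    token, sn = [None] * n, 0
    for step in range(rotations * n):
        sn = on_token(step % n, sources, token, sn, dwh)


sources = [Source(2), Source(3), Source(5)]
dwh = Warehouse(2 * 3 * 5)
sources[0].on_update(+1)    # two concurrent updates
sources[1].on_update(+2)
run(sources, dwh, rotations=2)
print(dwh.F == sources[0].x * sources[1].x * sources[2].x)
```

Each entry is committed exactly one rotation after it entered the token, so commit order equals insertion order and the warehouse only ever passes through products of genuine source states.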

SLIDE 30

The Protocol: Sketch for the Proof

Validity, Up-to-dateness and Order Consistency
  • use a total order: the number of steps performed by the token
  • induction on the content of the token

SLIDE 31

a Real Life Protocol

How to make a quiescent protocol?
  • when there is no update, the token is destroyed.
  • when an update occurs, the data source sends a request to the data warehouse
  • if the token was destroyed, it is recreated

SLIDE 32

a Real Life Protocol (2)

How to remove the ring assumption?
  • in a star network, each message goes to or comes from the dwh
  • the dwh incorporates updates and destroys/recreates the token when necessary

SLIDE 33

Extension: Multi Term

  • Meta-datawarehouse: aggregation of multiple data warehouses
  • a data object may appear in several views computed in the data warehouses

[Figure: data sources x1, x2, x3, x4, x5 feed DWH1 and DWH2, which feed a Meta-DWH computing x1x3x4 + x2x3x4x5]

  • synchronization problems, possible deadlocks.

SLIDE 34

Conclusion

  • a formal definition of a database problem
  • an abstract protocol
      • provable
      • can be adapted to fit real-life systems
      • efficient
