SLIDE 1

View Invalidation for Dynamic Content Caching in Multitiered Architectures

K. Selçuk Candan, Divyakant Agrawal, Wen-Syan Li, Oliver Po, Wang-Pin Hsiung

NEC USA, C&C Research Labs, CA, USA

SLIDE 2

12/3/2002 Presented by K. Selçuk Candan

Multi-tiered architectures

  • Clients do not access the database directly.
  • Instead, they use applications which invoke DBMSs,
  • or they access result caches:
      • proxy cache (A)
      • front-end cache (B)
      • edge cache (C)
      • user side cache (D)
      • middle-tier caches (E)

SLIDE 3

Problem

[Figure: only the label "Users" is recoverable.]

SLIDE 4

Result caches and consistency

Various view materialization and update management techniques have been proposed to deal with updates to the underlying data.

These techniques guarantee that cached results are always consistent with the underlying data.

SLIDE 5

Strong consistency requirements

[Figure labels: Data Warehouse, Data, Data]

SLIDE 7

Strong consistency requirements

[Figure labels: Data Warehouse, Data, Data, Queries]

SLIDE 8

Result caches and consistency

Various view materialization and update management techniques have been proposed to deal with updates to the underlying data.

These techniques guarantee that cached results are always consistent with the underlying data.

Other applications do not require that caches reflect the database exactly at all times.

SLIDE 9

Relaxed consistency requirements

[Figure labels: Data Warehouse, Data, Queries, Middle-tier Cache, Misses]

SLIDE 10

Invalidation vs. view maintenance

Result caches need all outdated results to be invalidated in a timely fashion.

SLIDE 11

Example

Page: http://www.autobuy.com/modelinfo?car=Toyota

The page generated from the following query is cached:

select maker, model, price from Car where maker = "Toyota";

SLIDE 12

Example (cont.)

If a new tuple (Toyota; Avalon; 25000) is inserted into Car, then we can either

  • recompute the new results of this query (preferably incrementally) and rerun the application to regenerate the page, or
  • purge the corresponding page from the cache; the request can still be served from the database!
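
The purge option can be sketched as follows. This is a minimal illustration using Python's sqlite3 module, not the authors' system: the `page_cache` dictionary, the helper names `render`, `get_page`, and `insert_car`, and the pre-existing Camry row are all assumptions made for the example.

```python
import sqlite3

# Hypothetical schema and starting data, for illustration only.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Car (maker TEXT, model TEXT, price INTEGER)")
db.execute("INSERT INTO Car VALUES ('Toyota', 'Camry', 20000)")

QUERY = "SELECT maker, model, price FROM Car WHERE maker = ?"
page_cache = {}  # URL -> rendered page (assumed cache layout)


def render(maker):
    """Stand-in for the application: run the query and build the page."""
    rows = db.execute(QUERY, (maker,)).fetchall()
    return "\n".join(f"{mk} {model} {price}" for mk, model, price in rows)


def get_page(url, maker):
    """Serve from the cache; on a miss, fall back to the database."""
    if url not in page_cache:
        page_cache[url] = render(maker)
    return page_cache[url]


def insert_car(maker, model, price):
    """Apply the update, then purge the page whose query it affects."""
    db.execute("INSERT INTO Car VALUES (?, ?, ?)", (maker, model, price))
    page_cache.pop(f"http://www.autobuy.com/modelinfo?car={maker}", None)


url = "http://www.autobuy.com/modelinfo?car=Toyota"
first = get_page(url, "Toyota")        # miss: page is generated and cached
insert_car("Toyota", "Avalon", 25000)  # the update purges the cached page
second = get_page(url, "Toyota")       # miss again: regenerated from the database
```

After the purge, the next request simply misses the cache and is answered from the database, so no stale page is ever served.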

SLIDE 13

Overinvalidation as a tool

Overinvalidation can be used if accurate invalidation is

  • too expensive, or
  • not feasible in a given time frame.

Underinvalidation is not acceptable!

Invalidation is inherently cheaper than view maintenance:

  • we do not need to compute all consequences of updates
  • to reduce the invalidation delay, we can overinvalidate
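
To make the trade-off concrete, here is a small sketch contrasting a precise check with a coarse, table-level one. The cache metadata layout, the extra Honda and Dealer entries, and both function names are hypothetical; the point is only that the coarse set is always a superset of the precise one, so it can over-invalidate but never under-invalidate.

```python
# Hypothetical cache metadata: URL -> (table read by the query, maker it selects).
cached = {
    "/modelinfo?car=Toyota": ("Car", "Toyota"),
    "/modelinfo?car=Honda": ("Car", "Honda"),
    "/dealers": ("Dealer", None),
}


def precise_invalidation(table, new_tuple):
    """Invalidate only pages whose selection predicate matches the new tuple."""
    return {url for url, (t, maker) in cached.items()
            if t == table and maker == new_tuple[0]}


def over_invalidation(table):
    """Cheaper check: invalidate every page that reads the updated table."""
    return {url for url, (t, _) in cached.items() if t == table}


update = ("Toyota", "Avalon", 25000)
precise = precise_invalidation("Car", update)
coarse = over_invalidation("Car")
```

Here the coarse check also drops the Honda page, which costs an extra cache miss but never serves stale content.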

SLIDE 14

Query and update streams

[Figure labels: updates up1, up2, up3; queries q1, q2, q3, q4, q5; invalidations inv1, inv3, inv2]

SLIDE 15

Example

  • Query:

select * from Car, Mileage where Car.maker = "Toyota" and Car.model = Mileage.model;

  • New tuples:

("Mitsubishi", "Galant", 23000) (no additional information required)

("Toyota", "Avalon", 25000) (additional information required)

  • For the second tuple, we need to check whether Car.model = Mileage.model can be satisfied using the data in the database (polling query).
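
The distinction between the two tuples can be sketched as a local test on the inserted tuple. The function name `needs_polling` is an assumption; the predicate itself comes from the query on this slide.

```python
def needs_polling(new_car):
    """Return False when the tuple alone rules out any effect on the query.

    The query selects Car.maker = "Toyota" and joins on
    Car.model = Mileage.model. The selection can be tested on the inserted
    tuple itself; the join condition cannot, because it depends on the
    contents of Mileage.
    """
    maker, _model, _price = new_car
    return maker == "Toyota"


new_tuples = [("Mitsubishi", "Galant", 23000), ("Toyota", "Avalon", 25000)]
decisions = {t[1]: needs_polling(t) for t in new_tuples}
```

Only tuples that pass this local test trigger a round trip to the database.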

SLIDE 16

Polling queries (cont.)

Polling query that has to be answered:

select * from Mileage where "Avalon" = Mileage.model;

If the result of the polling query is non-empty, then the newly inserted tuple affects the query.

Key point: we only need to check for existence; we do not need to evaluate the polling query completely.
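
An existence-only polling query might look like the sketch below, written with Python's sqlite3 module. The Mileage schema (the `epa` column) and its rows are assumptions, since the slides do not show them, and `affects_cached_join` is a hypothetical helper name.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical Mileage schema and rows; the slides do not give them.
db.execute("CREATE TABLE Mileage (model TEXT, epa INTEGER)")
db.executemany("INSERT INTO Mileage VALUES (?, ?)",
               [("Avalon", 28), ("Camry", 30)])


def affects_cached_join(model):
    """Existence-only polling query: one matching row is enough."""
    row = db.execute(
        "SELECT EXISTS (SELECT 1 FROM Mileage WHERE model = ?)", (model,)
    ).fetchone()
    return bool(row[0])


invalidate_for_avalon = affects_cached_join("Avalon")
invalidate_for_supra = affects_cached_join("Supra")
```

Using `EXISTS` rather than `select *` lets the database stop as soon as a single matching row is found.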

SLIDE 17

Δ: the effect of updates on join views

SLIDE 18

Δ: the effect of updates on join views

  • no distinction between deleted or inserted tuples
  • no need to evaluate entire Δ
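
Assuming the symbol in the slide title denotes the change Δ to a cached join view, the standard expansion for a two-way join under a batch of insertions is worth writing out. The names $R$, $S$, $\Delta R$, $\Delta S$, and $V$ are notation chosen here, not taken from the slides; $R$ and $S$ denote the relations before the update.

```latex
\begin{align*}
V_{\text{new}} &= (R \cup \Delta R) \bowtie (S \cup \Delta S) \\
\Delta V &= (\Delta R \bowtie S) \,\cup\, (R \bowtie \Delta S) \,\cup\, (\Delta R \bowtie \Delta S)
\end{align*}
```

For invalidation it is enough to know whether $\Delta V \neq \emptyset$, which is why the full join never has to be materialized. The terms $\Delta R$ and $\Delta S$ come from the updates themselves, while $R$ and $S$ refer to database state.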

SLIDE 19

Challenges in calculating Δ

available from the update logs

SLIDE 20

Challenges in calculating Δ

[Formula annotations: "available from the update logs"; "not available!"]

  • synchronous: a single copy is maintained; the copy is locked during invalidation
  • snapshot-based: a copy of the database is maintained
  • asynchronous: a single copy is maintained; no locking is used

SLIDE 21

Snapshot-based approach (new and old versions are available)

SLIDE 22

Results

Snapshot-based approach

  • no over- or under-invalidation
  • replication overhead

SLIDE 23

Synchronous approach (only the new version is available)

  • the old version of the database is not available!

Result: overinvalidation

SLIDE 24

Results

Snapshot-based approach

  • no over- or under-invalidation
  • replication overhead

Synchronous approach

  • when there are more than two relations, unrecoverable overinvalidation is possible
  • locking overhead

SLIDE 25

Asynchronous approach (neither old nor new is available)

SLIDE 26

Results

Snapshot-based approach

  • no over- or under-invalidation
  • replication overhead

Synchronous approach

  • when there are more than two relations, unrecoverable overinvalidation is possible
  • locking overhead

Asynchronous approach

  • when there are more than two relations, unrecoverable underinvalidation is possible
  • no overhead

SLIDE 27

Efficiency: consolidated invalidation

[Figure: only the axis label "TIME" is recoverable.]

SLIDE 28

Consolidated invalidation

SLIDE 32

Consolidation versus individual invalidation

Individual invalidation: [cost formula not recoverable; its terms are the average top-1 retrieval cost and the number of queries]

Consolidated invalidation: [cost formula not recoverable; its term is the total size of Δ]
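
One plausible reading of this comparison, offered as an assumption rather than as the authors' algorithm, is that individual invalidation issues one top-1 (existence) polling query per cached query instance, whereas consolidated invalidation issues a single polling query over the whole batch of updates. The sketch below uses Python's sqlite3 module; the `DeltaCar` and `CachedQuery` tables, their rows beyond the two tuples from the earlier example, and the function names are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical tables: DeltaCar holds the batch of inserted tuples,
# CachedQuery registers one row per cached query instance.
db.execute("CREATE TABLE DeltaCar (maker TEXT, model TEXT, price INTEGER)")
db.execute("CREATE TABLE CachedQuery (url TEXT, maker TEXT)")
db.executemany("INSERT INTO DeltaCar VALUES (?, ?, ?)",
               [("Mitsubishi", "Galant", 23000), ("Toyota", "Avalon", 25000)])
db.executemany("INSERT INTO CachedQuery VALUES (?, ?)",
               [("/modelinfo?car=Toyota", "Toyota"),
                ("/modelinfo?car=Honda", "Honda"),
                ("/modelinfo?car=Mitsubishi", "Mitsubishi")])


def individual():
    """One existence (top-1) polling query per cached query instance."""
    stale = set()
    instances = db.execute("SELECT url, maker FROM CachedQuery").fetchall()
    for url, maker in instances:
        hit = db.execute(
            "SELECT EXISTS (SELECT 1 FROM DeltaCar WHERE maker = ?)", (maker,)
        ).fetchone()[0]
        if hit:
            stale.add(url)
    return stale


def consolidated():
    """A single polling query joining the whole delta with all instances."""
    rows = db.execute(
        "SELECT DISTINCT q.url FROM CachedQuery q "
        "JOIN DeltaCar d ON d.maker = q.maker"
    ).fetchall()
    return {url for (url,) in rows}


stale_individual = individual()
stale_consolidated = consolidated()
```

Both functions return the same set of stale pages; they differ only in how many polling queries reach the database.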

SLIDE 33

Polling query overhead

SLIDE 35

Overinvalidation vs. table sizes

SLIDE 36

Overinvalidation vs. update rate

SLIDE 37

Conclusions

  • Fast invalidation is key for caching in multi-tiered architectures
  • Hard consistency is not required by many applications
      • Overinvalidation is acceptable
      • Underinvalidation is not!
  • View invalidation is inherently cheaper than view maintenance
  • View invalidation is feasible!