

SLIDE 1

Using Latency-Recency Profiles for Data Delivery on the Web

Laura Bright and Louiqa Raschid, University of Maryland

SLIDE 2

August 22, 2002 VLDB 2002 2

Introduction

  • Caching improves data delivery on the web
  • Cached data may become stale
  • Keeping data fresh adds overhead
    – Latency
    – Bandwidth
    – Server load
  • Existing techniques do not consider client latency and recency preferences

SLIDE 3

Outline

  • Web Technologies
  • Existing Solutions
  • Latency-Recency Profiles
  • Experiments
  • Conclusions
SLIDE 4

Proxy Caches

  • Resides between clients and web servers
  • Objects have a Time-to-Live (TTL); expired objects are validated at the server
  • Validation adds overhead
  • No server cooperation required

[Diagram: clients → proxy cache → Internet → servers]

SLIDE 5

Application Servers

  • Offloads functionality of database-backed web servers
  • May perform caching to improve performance
  • Servers may propagate updates to the cache

[Diagram: clients → Internet → application server → database servers]

SLIDE 6

Web Portals

  • Provides information gathered from multiple data sources
  • Problem: updates to objects at the sources
    – Update propagation consumes bandwidth
    – Objects at the portal may be stale

[Diagram: clients → portal → Internet → remote servers]

SLIDE 7

Consistency Approaches

  • Time-to-Live (TTL)
  • Always-Use-Cache (AUC)
  • Server-Side Invalidation (SSI)
SLIDE 8

Time-to-Live (TTL)

  • Estimated lifetime of a cached object
  • When the TTL expires, the cache must validate the object at the server
  • No server cooperation
  • Proxy caches

[Diagram: client → cache → server]
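The TTL check described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper; the class and field names are invented for the example.

```python
import time

class CacheEntry:
    """A cached object with an estimated lifetime (illustrative names)."""
    def __init__(self, data, ttl_seconds, fetched_at=None):
        self.data = data
        self.ttl = ttl_seconds
        self.fetched_at = fetched_at if fetched_at is not None else time.time()

    def is_expired(self, now=None):
        """True once the TTL has elapsed; the cache must then
        validate the object at the origin server."""
        now = now if now is not None else time.time()
        return now - self.fetched_at > self.ttl

# An entry fetched at t=100 with a 5-second TTL is expired at t=110.
entry = CacheEntry(b"page", ttl_seconds=5, fetched_at=100.0)
```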

SLIDE 9

Always Use Cache (AUC)

  • Objects are served from the cache
  • Background prefetching keeps cached objects up to date
  • No server cooperation
  • Portals, web crawlers

[Diagram: client → cache → server]

SLIDE 10

Server-Side Invalidation (SSI)

  • Servers send updates to the cache
  • Guarantees freshness
  • Increases workload at the server
  • Application servers

[Diagram: client → cache → server]

SLIDE 11

Summary

  • Existing techniques do not consider client preferences
  • May add unnecessary overhead (latency, bandwidth, or server load)
    – TTL, SSI
  • May not meet client recency preferences
    – AUC
  • Our goal: consider client preferences and reduce overhead

SLIDE 12

Outline

  • Web Technologies
  • Existing Solutions
  • Latency-Recency Profiles
  • Experiments
  • Conclusions
SLIDE 13

Latency-Recency Profiles

  • Used in the download decision at the cache
  • Profile: a set of application-specific parameters that reflect client latency and recency preferences
  • Examples
    – A stock trader tolerates a latency of 5 seconds for the most recent stock quotes
    – A casual web browser wants low latency and tolerates data with a recency of two updates

SLIDE 14

Profile Parameters

  • Set by clients
  • Target Latency (T_L): acceptable latency of a request
  • Target Age (T_A): acceptable recency of data
  • Examples
    – Stock trader: T_A = 0 updates, T_L = 5 seconds
    – Casual browser: T_A = 2 updates, T_L = 2 seconds
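A profile is just a pair of client-set targets. A minimal sketch in Python follows; the type and field names are illustrative, not taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class Profile:
    """Client latency-recency preferences (illustrative names)."""
    target_latency: float  # T_L: acceptable latency, in seconds
    target_age: int        # T_A: acceptable recency, in updates

# The two example clients from the slide:
stock_trader = Profile(target_latency=5.0, target_age=0)
casual_browser = Profile(target_latency=2.0, target_age=2)
```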

SLIDE 15

Profile-Based Downloading

  • Parameters are appended to requests
  • A scoring function determines when to validate an object and when to use the cached copy
  • Scales to multiple clients
  • Minimal overhead at the cache
SLIDE 16

Scoring Function Properties

  • Tunability

– Clients control latency-recency tradeoff

  • Guarantees

– Upper bounds with respect to latency or recency

  • Ease of implementation
SLIDE 17

Example Scoring Function

  • T = target value (T_L or T_A)
  • x = actual or estimated value (Latency or Age)
  • K = constant that tunes the rate at which the score decreases

    Score(T, x, K) = 1                 if x ≤ T
                     K / (x - T + K)   otherwise
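The scoring function on this slide transcribes directly into Python; this is a sketch of the slide's formula, with the function name chosen for the example.

```python
def score(target, actual, k):
    """Score(T, x, K): 1.0 while the target is met, then decaying
    as K / (x - T + K) beyond it. K tunes how fast the score drops."""
    if actual <= target:
        return 1.0
    return k / (actual - target + k)

# Meeting the target yields the maximum score of 1.0;
# a larger K makes the penalty beyond the target gentler.
```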

SLIDE 18

Combined Weighted Score

  • Used by the cache
  • Age = estimated age of the object
  • Latency = estimated latency
  • w = relative importance of meeting the target latency
  • (1 - w) = importance of meeting the recency target

    CombinedScore = w * Score(T_L, Latency, K_L) + (1 - w) * Score(T_A, Age, K_A)
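The weighted combination can be sketched as follows (the `score` helper restates the slide's Score(T, x, K) formula; all names are chosen for the example).

```python
def score(target, actual, k):
    """Score(T, x, K): 1 if the target is met, else K / (x - T + K)."""
    return 1.0 if actual <= target else k / (actual - target + k)

def combined_score(t_l, t_a, latency, age, k_l, k_a, w):
    """Blend the latency and age scores: w weights the latency
    target, (1 - w) weights the recency target."""
    return w * score(t_l, latency, k_l) + (1 - w) * score(t_a, age, k_a)
```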

SLIDE 19

Profile-Based Downloading

  • DownloadScore: expected score of downloading a fresh object (age 0, so the age term is 1.0)

    DownloadScore = w * Score(T_L, Latency, K_L) + (1 - w) * 1.0

  • CacheScore: expected score of using the cached object (near-zero latency, so the latency term is 1.0)

    CacheScore = w * 1.0 + (1 - w) * Score(T_A, Age, K_A)

  • If DownloadScore > CacheScore, download a fresh object; otherwise use the cache
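The download decision at the cache can then be sketched like this (a minimal illustration of the comparison on this slide; names are invented for the example).

```python
def score(target, actual, k):
    """Score(T, x, K): 1 if the target is met, else K / (x - T + K)."""
    return 1.0 if actual <= target else k / (actual - target + k)

def choose(t_l, t_a, k_l, k_a, w, est_latency, est_age):
    """Download a fresh copy when its expected score beats the cached
    copy's. A fresh object has age 0 (age score 1.0); a cache hit has
    near-zero latency (latency score 1.0)."""
    download_score = w * score(t_l, est_latency, k_l) + (1 - w) * 1.0
    cache_score = w * 1.0 + (1 - w) * score(t_a, est_age, k_a)
    return "download" if download_score > cache_score else "cache"
```

A fast server and a stale cached copy push the decision toward downloading; a slow server and a fresh copy push it toward the cache.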

SLIDE 20

Tuning Profiles

– W = 0.5: no firm upper bound
– The K values control the slope of the score curve

SLIDE 21

Upper Bounds

W > 0.5 gives a firm latency upper bound:

  – Download if Latency < 3: W = 0.6, K_L = 2, K_A = 0.5
  – Download if Latency < 2: W = 0.6, K_L = 2, K_A = 2

SLIDE 22

Outline

  • Web Technologies
  • Existing Solutions
  • Latency-Recency Profiles
  • Experiments
  • Conclusions
SLIDE 23

Baseline Algorithms

  • Time-to-Live (TTL)
    – Estimated lifetime of an object
    – Can be estimated by a server, or as a function of the time the object was last modified
    – Provides the most recent data
    – When an object's TTL expires, a new object must be downloaded

SLIDE 24

Baseline Algorithms

  • Always-Use-Cache (AUC)
    – Minimizes latency
    – If an object is in the cache, always serve it without validation
    – Prefetch cached objects in round-robin order to improve recency
    – Prefetch rates of 60 objects/minute and 300 objects/minute

SLIDE 25

Baseline Algorithms

  • Server-Side Invalidation (SSI)
  • SSI-Msg
    – Server sends invalidation messages only
    – Cache must request the updated object
  • SSI-Obj
    – Server sends updated objects to the cache
    – Reduces latency but consumes bandwidth

SLIDE 26

Trace Data

  • Proxy cache trace data obtained from NLANR in January 2002
  • 3.7 million requests over 5 days
  • 1,365,545 distinct objects, average size 2.1 KB
  • Preprocessing performed on the trace
  • Age estimated using last-modified time
  • Latency is the average over previous requests
  • Profiles: T_L = 1 second, T_A = 1 update
  • Cache size range: 1% of world size to infinite

SLIDE 27

Synthetic Data

  • World of 100,000 objects
  • Zipf-like popularity distribution
  • Update intervals uniformly distributed from 10 minutes to 2 hours
  • Workload of 8 requests/second
  • Object sizes 2-12 KB
  • Infinite cache
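A synthetic world along these lines can be generated with a short script. This is a sketch only: the Zipf exponent, the random seed, and all names are assumptions not stated on the slide.

```python
import random

def zipf_popularity(n_objects, alpha=0.8):
    """Zipf-like popularity: P(rank i) proportional to 1 / i^alpha.
    The exponent alpha is an assumption; the slide only says 'Zipf-like'."""
    weights = [1.0 / (i + 1) ** alpha for i in range(n_objects)]
    total = sum(weights)
    return [w / total for w in weights]

def synthetic_world(n_objects=100_000, seed=42):
    """Objects with update intervals uniform in [10 min, 2 h]
    and sizes uniform in [2 KB, 12 KB], per the slide."""
    rng = random.Random(seed)
    return [
        {"update_interval_s": rng.uniform(600, 7200),
         "size_kb": rng.uniform(2, 12)}
        for _ in range(n_objects)
    ]
```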
SLIDE 28

Metrics

  • Validations: messages between the cache and servers
    – Useful validations: the object was modified
    – Useless validations: the object was not modified
  • Downloads: objects downloaded from servers
  • Stale hits: objects served from the cache that had been modified at the server

SLIDE 29

Comparison- Trace Data

                   TTL       AUC-60    AUC-300    Profile
  Val. messages    252367    378312    1891560    92943
  Useful vals.     24898     933       2810       22896
  Useless vals.    122074    279349    327776     67601
  Avg. est. age    -         18.4      11.1       0.87
  Stale hits       4282      31285     22897      7704

SLIDE 30

Comparison- Trace Data

[Bar charts comparing TTL, AUC-60, AUC-300, and Profile on useful validations, useless validations, and stale hits]

SLIDE 31

Comparison- Synthetic Data

[Bar chart comparing SSI-Msg, TTL, AUC-300, and Profile on validations, downloads, and stale hits]

SLIDE 32

Effect of Cache Size

– X-axis: cache size; Y-axis: average latency
– Profile lies between the extremes of TTL and AUC
– Profile exploits increased cache size better than TTL

SLIDE 33

Effect of Cache Size

– X-axis: cache size; Y-axis: number of stale objects
– AUC must prefetch many objects when the cache is large
– Profile scales to large cache sizes

SLIDE 34

Effect of Surges

  • Surge: client request workload exceeds the capacity of the server or network
  • Two groups of clients:
    – MostRecent: T_A = 0 updates, T_L = 1 second
    – LowLatency: T_A = 1 update, T_L = 0 seconds
  • 30-second surge period
  • Capacity Ratio = available resources / required resources

SLIDE 35

Effect of Surges

SLIDE 36

Related Work

  • Refreshing cached data
    – Cho and Garcia-Molina, 2000
  • WebViews
    – Labrinidis and Roussopoulos, 2000, 2001
  • Caching dynamic content
    – Candan et al., 2001
    – Luo and Naughton, 2001
  • Caching approximate values
    – Olston and Widom, 2001, 2002

SLIDE 37

Conclusions and Future Work

  • Latency-recency profiles can reduce overhead while meeting client preferences
  • Future work:
    – Implementation
    – Mobile environments
    – Effects of server cooperation