SLIDE 1

Evaluating a New Approach to Strong Web Cache Consistency with Snapshots of Collected Content∗

Mikhail Mikhailov and Craig E. Wills
Computer Science Department
Worcester Polytechnic Institute
{mikhail,cew}@cs.wpi.edu

Presented by Craig E. Wills at the 12th International World Wide Web Conference, Budapest, Hungary, May 23, 2003

∗Partially supported by the National Science Foundation Grant CCR-9988250.

SLIDE 2

Roadmap

  • Current Practice and Previous Work
  • Approach to Object Management
  • Evaluation Methodology
  • Results
  • Conclusions and Future Work

SLIDE 3

Current Practice—Cache Control

Servers can tell caches no-cache or Expires and can provide Last-Modified time

Given only Last-Modified, caches use a %-age of object age as a reasonable freshness estimate:

  • this approach is heuristic in nature (caches may use different %-ages)
  • this approach results in many unnecessary validations (15–37% of all requests)
  • clients still receive stale objects
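
The heuristic described above can be sketched as follows. This is a minimal illustration, not the behavior of any particular cache: the function names are hypothetical and the 10% fraction is only an example value, since caches may use different %-ages.

```python
from datetime import datetime, timedelta


def heuristic_freshness_lifetime(fetched_at, last_modified, fraction=0.10):
    """Freshness lifetime = fraction of the object's age at fetch time.

    `fraction` is an illustrative assumption (caches may use other %-ages).
    """
    age_at_fetch = fetched_at - last_modified
    if age_at_fetch < timedelta(0):
        # Guard against a Last-Modified time that lies in the future.
        age_at_fetch = timedelta(0)
    return age_at_fetch * fraction


def is_fresh(now, fetched_at, last_modified, fraction=0.10):
    """True if the cached copy may be served without validation."""
    lifetime = heuristic_freshness_lifetime(fetched_at, last_modified, fraction)
    return (now - fetched_at) < lifetime


# Example: an object last modified 10 days before it was fetched.
last_modified = datetime(2002, 6, 1, 0, 0)
fetched_at = datetime(2002, 6, 11, 0, 0)
served_without_check = is_fresh(
    fetched_at + timedelta(hours=12), fetched_at, last_modified
)
```

If the object changes on the server inside that window, the cache still serves the old copy; if it does not change, every validation after the window is unnecessary. Both effects are the ones listed above.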

SLIDE 4

Previous Work

  • Volumes (Krishnamurthy and Wills, WWW98; Cohen et al., SIGCOMM98; Krishnamurthy and Rexford, Addison Wesley01)
  • Piggyback (In)Validation (Krishnamurthy and Wills, WWW98, USITS97)
  • Server Invalidation, Leases, Volume Leases (Liu and Cao, ICDCS97, Yin et al., ICDCS98, USITS99, WWW01, TOIT02)
  • Data Update Propagation (Challenger and Iyengar, USITS97, SC98, INFOCOM99, INFOCOM00)

SLIDE 5

Our Goals

  • manage objects deterministically rather than heuristically
  • guarantee strong cache consistency
  • scale system to large number of clients (no per-client state)
  • eliminate unnecessary validations and byte transfers

SLIDE 6

News Portal Example

[Figure: container object index.html (CO) with embedded objects main.css (EO1), main.js (EO2), logo1.gif (EO3), top.photo.jpg (EO4), and adbanner.gif (EO5)]

  • CO - HTML page (container)
  • EO1 - CSS object
  • EO2 - JavaScript code
  • EO3 - site logo image
  • EO4 - top story photo
  • EO5 - ad banner image

SLIDE 7

Foundation for Our Approach

  • objects are not stand-alone; they have relationships with other objects
  • objects change at different rates, i.e., they have different change characteristics
  • a Web page is composed of many heterogeneous objects
  • caches can contact servers on each access to fetch frequently changing objects
  • servers can inform caches of updates to other objects on the same page
  • could expose internal structure of composite objects to clients, so clients too can operate on individual components

SLIDE 8

Object Change Characteristics

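[Figure: classification of objects by change characteristics. Questions asked: "Changes predictably? (Can be managed deterministically?)" (yes / no) and "Changes how often?" (never / rarely / frequently / on each access). Classes shown: Static, Periodic, Born-on-Access (BoA). Legend: Non-Deterministic (ND), Relatively Static (RSt), Relatively Dynamic (RDyn). Objects are marked Cacheable or Uncacheable.]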

SLIDE 9

Object Relationships

[Figure: relationships among index.html, logo1.gif, main.js, top.photo.jpeg, adbanner.gif, and main.css; node labels in the figure: BoA, RSt, RSt, RSt, St, St]

SLIDE 10

Combining Object Change Characteristics and Relationships

  • Use the retrieval of a Born-on-Access object to manage Non-Deterministic objects
  • If need be, force validation of one Non-Deterministic object to manage other Non-Deterministic objects
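
On the cache side, this idea might look like the sketch below. The slides do not specify a wire format, so everything here is an assumption made for illustration: the version numbers, the dictionary layout, and the function name `apply_manager_response` are all hypothetical.

```python
def apply_manager_response(cache, piggybacked_versions):
    """Drop cached objects that the manager's response reports as changed.

    cache: dict mapping URL -> version held by the cache (hypothetical layout).
    piggybacked_versions: dict mapping URL -> current server version, assumed
        to arrive with the response for the Born-on-Access (or force-validated)
        object that manages the volume.
    Returns the list of URLs that were invalidated.
    """
    stale = [
        url
        for url, current in piggybacked_versions.items()
        if url in cache and cache[url] != current
    ]
    for url in stale:
        del cache[url]  # the next access fetches a fresh copy
    return stale


# The container page is fetched on every access anyway, so its response
# can carry the current versions of the other objects on the page.
cache = {"top.photo.jpg": 3, "adbanner.gif": 7, "main.css": 1}
invalidated = apply_manager_response(
    cache, {"top.photo.jpg": 4, "adbanner.gif": 7}
)
```

Objects the response does not mention are left untouched, and unchanged objects need no separate validation request.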

SLIDE 11

MONARCH

Management of Objects in a Network using Assembly, Relationships, and Change cHaracteristics

  • servers classify objects based on object change characteristics
  • servers group related objects into volumes
  • servers determine which combination describes each volume
  • servers designate one object in each volume to be the manager
  • servers assign all objects Content Control Commands (CCCs)
  • servers and caches use CCCs to manage objects deterministically
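
The manager-designation step could be sketched as below. This is a simplified illustration, not MONARCH's actual implementation: the `WebObject` type, the class labels, and the rule in `choose_manager` (prefer a Born-on-Access object, otherwise fall back to a Non-Deterministic one) are assumptions drawn from the preceding slide, and the format of the CCCs is not modeled at all.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WebObject:
    url: str
    change_class: str  # simplified labels: "St", "BoA", or "ND"


def choose_manager(volume: List[WebObject]) -> Optional[WebObject]:
    """Pick the object whose retrieval or validation manages the volume.

    Assumed rule: a Born-on-Access object is fetched on every access, so it
    is preferred; otherwise one Non-Deterministic object is chosen and its
    validation is forced. A volume with neither needs no manager here.
    """
    for wanted in ("BoA", "ND"):
        for obj in volume:
            if obj.change_class == wanted:
                return obj
    return None


news_page = [
    WebObject("main.css", "St"),
    WebObject("top.photo.jpg", "ND"),
    WebObject("index.html", "BoA"),
]
manager = choose_manager(news_page)
```

Because the manager is chosen per volume rather than per client, this step adds no per-client state on the server.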

SLIDE 12

Evaluation Methodology

  • selected recognizable Web sites: amazon.com, boston.com, cisco.com, cnn.com, espn.com, ora.com, photo.net, slashdot.org, usenix.org, wpi.edu, yahoo.com
  • collected snapshots of content (home page, static, and transient links) from each site every 15 minutes 9am–9pm for 14 days (June/July 2002); fetched each object for at least 1 hour
  • simulated 9 scenarios for each site:
      content: home, home+static, home+static+transient
      requests: every 15min, 9am, 9am+noon+4pm+8pm
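
The 9 scenarios are the cross product of the three content sets and the three request patterns listed above. A small sketch of that enumeration, with variable names chosen here only for illustration:

```python
from itertools import product

# Labels copied from the slide; the variable names are hypothetical.
CONTENT_SETS = ["home", "home+static", "home+static+transient"]
REQUEST_PATTERNS = ["every 15min", "9am", "9am+noon+4pm+8pm"]

# Every content set is simulated under every request pattern.
scenarios = list(product(CONTENT_SETS, REQUEST_PATTERNS))

for content, requests in scenarios:
    print(f"content={content:<22} requests={requests}")
```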

SLIDE 13

Policies Studied

  • MONARCH (M)
  • No Cache (NC) and Optimal (Opt)
  • Never Validate (NV) and Always Validate (AV)
  • Heuristic (H5, H10) and Current Practice (CP)

  • Object and Volume Leases (OVL)

SLIDE 14

Effectiveness of the CP Policy

Requests and KB served by Server

Site     Opt             CP              NC
         Req     KB      Req     KB      Req     KB
cisco    1.9     2.9     3.5     19.6    19.2    55.4
cnn*     6.3     56.1    16.4    77.6    31.4    190.8
espn*    4.3     75.3    19.4    85.4    38.7    159.5

(* - stale content served in at least one scenario)

  • transfers 50–60% fewer bytes than NC
  • transfers more bytes than Opt
  • issues more requests than Opt
  • serves stale content
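
To compare policies on a common scale, each figure can be expressed as a percentage of the No Cache (NC) baseline. The sketch below does that for the three sites shown, assuming each policy in the table holds a (requests, KB) pair in that order; the dictionary layout and the function name are illustrative only.

```python
# (requests, KB) per policy, read from the table for the three sites shown.
SERVED = {
    "cisco": {"Opt": (1.9, 2.9), "CP": (3.5, 19.6), "NC": (19.2, 55.4)},
    "cnn": {"Opt": (6.3, 56.1), "CP": (16.4, 77.6), "NC": (31.4, 190.8)},
    "espn": {"Opt": (4.3, 75.3), "CP": (19.4, 85.4), "NC": (38.7, 159.5)},
}


def relative_to_nc(site, policy):
    """Return (% requests, % bytes) for `policy` relative to No Cache."""
    requests, kilobytes = SERVED[site][policy]
    nc_requests, nc_kilobytes = SERVED[site]["NC"]
    return (
        round(100 * requests / nc_requests, 1),
        round(100 * kilobytes / nc_kilobytes, 1),
    )


for site in SERVED:
    for policy in ("Opt", "CP"):
        pct_requests, pct_bytes = relative_to_nc(site, policy)
        print(f"{site:<6} {policy:<3} requests={pct_requests}% bytes={pct_bytes}%")
```

The byte savings of a policy relative to NC is then 100 minus the bytes percentage.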

SLIDE 15

Comparison of Policies

[Figure: % Bytes Fetched and % Requests to Server (both axes 0–100) for CNN; NC baseline: 31.4 requests, 190.8 KB; %-ages are relative to NC. Points plotted: Opt, M, CP, OVL, H5, *H10, AV, *NV]

  • all policies, even AV, offer substantial (at least 50–60%) byte savings
  • H5 and H10 outperform CP but serve stale content in at least one scenario
  • M, OVL, and Opt have similar performance

SLIDE 16

Server Overhead

Site     MONARCH                 OVL
         Volumes    Revisions    Object Leases
         (total)    (avg)        (avg)    (max)
cisco    4          121          67       70
cnn      84         198          580      941
espn     63         167          525      806

  • number of object leases maintained by OVL depends on the number of clients and request arrivals
  • overhead of MONARCH is independent of arrival rate or number of clients
  • invalidation traffic is negligible

SLIDE 17

Response Time

Based on data obtained by Krishnamurthy et al., IMW02

[Figure: Response Time for Modem Client (sec., axis 5–20) for two groups of Web sites, boston/cnn/espn and cisco/photonet/slashdot, each shown for policies M, CP, AV, and NC]

  • Opt, M, and OVL map to the same bucket
  • H5 and H10 map to Opt or CP
  • can improve upon CP for larger, more dynamic pages

SLIDE 18

Conclusions

  • MONARCH manages objects deterministically and provides strong cache consistency
  • Server state maintained by MONARCH is independent of request rate or number of clients
  • MONARCH outperforms heuristic policies in terms of requests and bytes
  • Used snapshots of content actively collected from real Web sites to evaluate cache consistency policies

SLIDE 19

Future Work

  • Content assembly: selective, personalization, URL-rewriting
  • Dynamic change characteristics
  • Coupling MONARCH with existing templating mechanisms
  • Applying our ideas to non-HTML content: WML, MPEG-4, on-line games
  • Further refine and expand the content collection methodology
