


  1. Evaluating a New Approach to Strong Web Cache Consistency with Snapshots of Collected Content ∗
     Mikhail Mikhailov and Craig E. Wills
     Computer Science Department, Worcester Polytechnic Institute
     {mikhail,cew}@cs.wpi.edu
     Presented by Craig E. Wills at the 12th International World Wide Web Conference, Budapest, Hungary, May 23, 2003
     ∗ Partially supported by the National Science Foundation Grant CCR-9988250.

  2. Roadmap
     • Current Practice and Previous Work
     • Approach to Object Management
     • Evaluation Methodology
     • Results
     • Conclusions and Future Work

  3. Current Practice: Cache Control
     Servers can tell caches no-cache or Expires, and can provide a Last-Modified time.
     Given only Last-Modified, caches use a percentage of the object's age as a reasonable freshness estimate:
     • this approach is heuristic in nature (caches may use different percentages)
     • this approach results in many unnecessary validations (15–37% of all requests)
     • clients still receive stale objects
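The percentage-of-age heuristic described on this slide can be sketched as follows. This is a minimal illustration rather than any particular cache's implementation: the function names and the default `factor` of 10% are assumptions chosen for the example.

```python
from datetime import datetime, timedelta


def heuristic_freshness_lifetime(last_modified, fetched_at, factor=0.10):
    """Estimate how long a cached copy may be served without validation.

    The lifetime is a fixed fraction of the object's age at fetch time,
    i.e. the time elapsed since it was last modified. `factor` is an
    assumed value; different caches may use different percentages.
    """
    age_at_fetch = fetched_at - last_modified
    if age_at_fetch < timedelta(0):
        # Guard against clock skew: treat a future Last-Modified as age zero.
        age_at_fetch = timedelta(0)
    return age_at_fetch * factor


def is_fresh(last_modified, fetched_at, now, factor=0.10):
    """Return True if the cached copy is still within its estimated lifetime."""
    time_in_cache = now - fetched_at
    return time_in_cache <= heuristic_freshness_lifetime(
        last_modified, fetched_at, factor
    )


# An object last modified 10 days before it was fetched is treated as
# fresh for about 1 day at a 10% factor, whether or not it really changes.
last_modified = datetime(2003, 5, 1)
fetched_at = datetime(2003, 5, 11)
still_fresh = is_fresh(last_modified, fetched_at, fetched_at + timedelta(hours=12))
expired = not is_fresh(last_modified, fetched_at, fetched_at + timedelta(days=2))
```

Because the estimate depends only on past age, an object that changes inside the window is served stale, and one that does not change after the window triggers an unnecessary validation.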

  4. Previous Work
     • Volumes (Krishnamurthy and Wills, WWW98; Cohen et al., SIGCOMM98; Krishnamurthy and Rexford, Addison Wesley01)
     • Piggyback (In)Validation (Krishnamurthy and Wills, WWW98, USITS97)
     • Server Invalidation, Leases, Volume Leases (Liu and Cao, ICDCS97; Yin et al., ICDCS98, USITS99, WWW01, TOIT02)
     • Data Update Propagation (Challenger and Iyengar, USITS97, SC98, INFOCOM99, INFOCOM00)

  5. Our Goals
     • manage objects deterministically rather than heuristically
     • guarantee strong cache consistency
     • scale system to large number of clients (no per-client state)
     • eliminate unnecessary validations and byte transfers

  6. News Portal Example
     • CO (index.html) - HTML page (container)
     • EO1 (main.css) - CSS object
     • EO2 (main.js) - JavaScript code
     • EO3 (logo1.gif) - site logo image
     • EO4 (top.photo.jpg) - top story photo
     • EO5 (adbanner.gif) - ad banner image

  7. Foundation for Our Approach
     • objects are not stand-alone; they have relationships with other objects
     • objects change at different rates, i.e. they have different change characteristics
     • a Web page is composed of many heterogeneous objects
     • caches can contact servers on each access to fetch frequently changing objects
     • servers can inform caches of updates to other objects on the same page
     • could expose internal structure of composite objects to clients, so clients too can operate on individual components

  8. Object Change Characteristics
     [Figure: object classes plotted by "Changes how often?" (on each access, frequently, rarely, never) against "Changes predictably? (Can be managed deterministically?)" (yes, no). Classes shown: Born-on-Access (BoA), Periodic, Relatively Dynamic (RDyn), Relatively Static (RSt), Static. Legend: Cacheable, Uncacheable, Deterministic, Non-Deterministic (ND).]

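  9. Object Relationships
     [Figure: relationship graph over index.html, main.css, main.js, logo1.gif, top.photo.jpeg, and adbanner.gif, with each object labeled by its change characteristic: one BoA, two St, and three RSt.]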

  10. Combining Object Change Characteristics and Relationships
      • Use the retrieval of a Born-on-Access object to manage Non-Deterministic objects
      • If need be, force validation of one Non-Deterministic object to manage other Non-Deterministic objects

  11. MONARCH
      Management of Objects in a Network using Assembly, Relationships, and Change cHaracteristics
      • servers classify objects based on object change characteristics
      • servers group related objects into volumes
      • servers determine which combination describes each volume
      • servers designate one object in each volume to be the manager
      • servers assign all objects Content Control Commands (CCCs)
      • servers and caches use CCCs to manage objects deterministically
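The manager-selection idea from the two slides above (prefer a Born-on-Access object, otherwise force validation of one Non-Deterministic object) might be sketched like this. It is an illustration under stated assumptions, not the MONARCH implementation: the `WebObject` type and both function names are hypothetical, treating RSt and RDyn as the Non-Deterministic classes is an assumption, and the class assigned to each news-portal file is made up for the example because the slides do not preserve that mapping. The actual Content Control Commands are not shown here, since the slides do not specify them.

```python
from dataclasses import dataclass

# Assumption: the Relatively Static and Relatively Dynamic classes are the
# Non-Deterministic (ND) ones.
NON_DETERMINISTIC = {"RSt", "RDyn"}


@dataclass
class WebObject:
    name: str
    change_class: str  # "St", "Periodic", "BoA", "RSt", or "RDyn"


def choose_manager(volume):
    """Pick the object whose retrieval carries updates for the volume.

    Prefer a Born-on-Access object, because caches fetch it on every
    access anyway. Otherwise fall back to one Non-Deterministic object
    that caches would be forced to validate. Return None if the volume
    contains neither.
    """
    for obj in volume:
        if obj.change_class == "BoA":
            return obj
    for obj in volume:
        if obj.change_class in NON_DETERMINISTIC:
            return obj
    return None


def managed_objects(volume, manager):
    """Non-Deterministic objects whose updates ride on the manager's response."""
    return [
        obj for obj in volume
        if obj.change_class in NON_DETERMINISTIC and obj is not manager
    ]


# Hypothetical labelling of the news-portal objects, for illustration only.
portal = [
    WebObject("index.html", "BoA"),
    WebObject("main.css", "St"),
    WebObject("main.js", "St"),
    WebObject("logo1.gif", "RSt"),
    WebObject("top.photo.jpg", "RSt"),
    WebObject("adbanner.gif", "RSt"),
]
manager = choose_manager(portal)
managed = managed_objects(portal, manager)
```

In this sketch the server keeps state per volume rather than per client, which is the property the goals slide asks for.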

  12. Evaluation Methodology
      • selected recognizable Web sites: amazon.com, boston.com, cisco.com, cnn.com, espn.com, ora.com, photo.net, slashdot.org, usenix.org, wpi.edu, yahoo.com
      • collected snapshots of content (home page, static, and transient links) from each site every 15 minutes, 9am–9pm, for 14 days (June/July 2002); fetched each object for at least 1 hour
      • simulated 9 scenarios for each site:
        content: home, home+static, home+static+transient
        requests: every 15min, 9am, 9am+noon+4pm+8pm
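The nine scenarios per site are the cross product of the three content sets and the three request patterns listed above. The sketch below enumerates them; the function names are hypothetical, and including both the 9am and 9pm endpoints in the 15-minute pattern is an assumption, since the slide does not say whether the last request falls at 9pm.

```python
from datetime import time
from itertools import product

CONTENT_SETS = ["home", "home+static", "home+static+transient"]
REQUEST_PATTERNS = ["every 15min", "9am", "9am+noon+4pm+8pm"]


def scenarios():
    """Return every (content set, request pattern) pair simulated per site."""
    return list(product(CONTENT_SETS, REQUEST_PATTERNS))


def request_times(pattern):
    """Return the times of day at which a simulated client issues requests."""
    if pattern == "every 15min":
        # Assumption: both the 9am and 9pm endpoints are included.
        quarter_hours = [time(h, m) for h in range(9, 21) for m in (0, 15, 30, 45)]
        return quarter_hours + [time(21, 0)]
    if pattern == "9am":
        return [time(9, 0)]
    if pattern == "9am+noon+4pm+8pm":
        return [time(9, 0), time(12, 0), time(16, 0), time(20, 0)]
    raise ValueError(f"unknown request pattern: {pattern}")


all_scenarios = scenarios()
```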

  13. Policies Studied
      • MONARCH (M)
      • No Cache (NC) and Optimal (Opt)
      • Never Validate (NV) and Always Validate (AV)
      • Heuristic (H5, H10) and Current Practice (CP)
      • Object and Volume Leases (OVL)

  14. Effectiveness of the CP Policy
      Requests and KB served by server:

      Site     Opt             CP              NC
               Req     KB      Req     KB      Req     KB
      cisco    1.9     2.9     3.5     19.6    19.2    55.4
      cnn*     6.3     56.1    16.4    77.6    31.4    190.8
      espn*    4.3     75.3    19.4    85.4    38.7    159.5
      (* - stale content served in at least one scenario)

      • transfers 50–60% fewer bytes than NC
      • transfers more bytes than Opt
      • issues more requests than Opt
      • serves stale content
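The comparison against NC can be reproduced arithmetically from the figures on this slide. The sketch below assumes each site's row lists (requests, KB) pairs for Opt, CP, and NC in that order; the dictionary layout and the function name are illustrative only.

```python
# (requests, KB) served by the server, per site and policy, as read from
# the slide under the column-order assumption stated above.
SERVED = {
    "cisco": {"Opt": (1.9, 2.9), "CP": (3.5, 19.6), "NC": (19.2, 55.4)},
    "cnn": {"Opt": (6.3, 56.1), "CP": (16.4, 77.6), "NC": (31.4, 190.8)},
    "espn": {"Opt": (4.3, 75.3), "CP": (19.4, 85.4), "NC": (38.7, 159.5)},
}


def relative_to_nc(site, policy):
    """Return (% requests, % bytes) for a policy relative to No Cache."""
    requests, kilobytes = SERVED[site][policy]
    nc_requests, nc_kilobytes = SERVED[site]["NC"]
    return (100.0 * requests / nc_requests, 100.0 * kilobytes / nc_kilobytes)


def byte_savings(site, policy):
    """Return the percentage of bytes saved compared with No Cache."""
    return 100.0 - relative_to_nc(site, policy)[1]


cnn_cp_requests_pct, cnn_cp_bytes_pct = relative_to_nc("cnn", "CP")
cnn_cp_savings = byte_savings("cnn", "CP")
```

For cnn, CP fetches roughly 41% of the bytes that NC does, a saving of about 59%, while Opt would fetch roughly 29%.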

  15. Comparison of Policies
      [Figure: scatter plot for CNN (NC baseline: 31.4 requests, 190.8 KB) of % Bytes Fetched against % Requests to Server, with percentages relative to NC. Policies plotted: Opt, M, CP, OVL, H5, *H10, AV, *NV.]
      • all policies, even AV, offer substantial (at least 50–60%) byte savings
      • H5 and H10 outperform CP but serve stale content in at least one scenario
      • M, OVL, and Opt have similar performance

  16. Server Overhead

      Site     MONARCH                 OVL
               Volumes    Revisions    Object Leases
               (total)    (avg)        (avg)    (max)
      cisco    4          121          67       70
      cnn      84         198          580      941
      espn     63         167          525      806

      • number of object leases maintained by OVL depends on the number of clients and request arrivals
      • overhead of MONARCH is independent of arrival rate or number of clients
      • invalidation traffic is negligible

  17. Response Time
      Based on data obtained by Krishnamurthy et al., IMW02
      [Figure: bar chart of response time for a modem client (sec.) under M, CP, AV, and NC, for two groups of Web sites: boston/cnn/espn and cisco/photonet/slashdot.]
      • Opt, M, and OVL map to the same bucket
      • H5 and H10 map to Opt or CP
      • can improve upon CP for larger, more dynamic pages

  18. Conclusions
      • MONARCH manages objects deterministically and provides strong cache consistency
      • Server state maintained by MONARCH is independent of request rate or number of clients
      • MONARCH outperforms heuristic policies in terms of requests and bytes
      • Used snapshots of content actively collected from real Web sites to evaluate cache consistency policies

  19. Future Work
      • Content assembly: selective, personalization, URL-rewriting
      • Dynamic change characteristics
      • Coupling MONARCH with existing templating mechanisms
      • Applying our ideas to non-HTML content: WML, MPEG-4, on-line games
      • Further refine and expand the content collection methodology
