Engineering web cache consistency

Online: 01 August 2002

Abstract

Server-driven consistency protocols can reduce read latency and improve data freshness for a given network and server overhead, compared to the traditional consistency protocols that rely on client polling. Server-driven consistency protocols appear particularly attractive for large-scale dynamic Web workloads because dynamically generated data can change rapidly and unpredictably. However, there have been few reports on engineering server-driven consistency for such workloads. This article reports our experience in engineering server-driven consistency for a sporting and event Web site hosted by IBM, one of the most popular sites on the Internet for the duration of the event. We also examine an e-commerce site for a national retail store. Our study focuses on scalability and cachability of dynamic content. To assess scalability, we measure both the amount of state that a server needs to maintain to ensure consistency and the bursts of load in sending out invalidation messages when a popular object is modified. We find that server-driven protocols can cap the size of the server's state to a given amount without significant performance costs, and can smooth the bursts of load with minimal impact on the consistency guarantees. To improve performance, we systematically investigate several design issues for which prior research has suggested widely different solutions, including whether servers should send invalidations to idle clients. Finally, we quantify the performance impact of caching dynamic data with server-driven consistency protocols and the benefits of server-driven consistency protocols for large-scale dynamic Web services. We find that (i) caching dynamically generated data can increase cache hit rates by up to 10%, compared to the systems that do not cache dynamically generated data; and (ii) server-driven consistency protocols can increase cache hit rates by a factor of 1.5-3 for large-scale dynamic Web services, compared to client polling protocols. 
We have implemented a prototype of a server-driven consistency protocol based on our findings by augmenting the popular Squid cache.
