Abstract
Server-driven consistency protocols can reduce read latency and improve data freshness for a given network and server overhead, compared to the traditional consistency protocols that rely on client polling. Server-driven consistency protocols appear particularly attractive for large-scale dynamic Web workloads because dynamically generated data can change rapidly and unpredictably. However, there have been few reports on engineering server-driven consistency for such workloads. This article reports our experience in engineering server-driven consistency for a sporting and event Web site hosted by IBM, one of the most popular sites on the Internet for the duration of the event. We also examine an e-commerce site for a national retail store. Our study focuses on scalability and cachability of dynamic content. To assess scalability, we measure both the amount of state that a server needs to maintain to ensure consistency and the bursts of load in sending out invalidation messages when a popular object is modified. We find that server-driven protocols can cap the size of the server's state to a given amount without significant performance costs, and can smooth the bursts of load with minimal impact on the consistency guarantees. To improve performance, we systematically investigate several design issues for which prior research has suggested widely different solutions, including whether servers should send invalidations to idle clients. Finally, we quantify the performance impact of caching dynamic data with server-driven consistency protocols and the benefits of server-driven consistency protocols for large-scale dynamic Web services. We find that (i) caching dynamically generated data can increase cache hit rates by up to 10%, compared to the systems that do not cache dynamically generated data; and (ii) server-driven consistency protocols can increase cache hit rates by a factor of 1.5-3 for large-scale dynamic Web services, compared to client polling protocols. 
We have implemented a prototype of a server-driven consistency protocol based on our findings by augmenting the popular Squid cache.
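The two scalability concerns named above, bounded server state and bursts of invalidation traffic, can be illustrated with a small sketch. This is not the protocol or the Squid prototype described in this article. It is a toy model under stated assumptions: the class `InvalidationServer`, its methods, and the parameters `max_entries` and `sends_per_tick` are hypothetical names. It also assumes that the server caps its state by evicting the oldest tracked entry and invalidating it, and that it smooths bursts by sending a fixed number of queued invalidations per time step.

```python
from collections import OrderedDict, deque


class InvalidationServer:
    """Toy model of server-driven consistency (hypothetical, not the article's design)."""

    def __init__(self, max_entries, sends_per_tick):
        self.max_entries = max_entries        # assumed cap on tracked (object, client) pairs
        self.sends_per_tick = sends_per_tick  # assumed pacing limit per time step
        self.callbacks = OrderedDict()        # (obj, client) keys, oldest first
        self.pending = deque()                # invalidations queued but not yet sent

    def register_read(self, obj, client):
        """Record that `client` caches `obj`; evict the oldest entries if over the cap."""
        key = (obj, client)
        self.callbacks.pop(key, None)         # re-reading moves the entry to the newest slot
        self.callbacks[key] = None
        while len(self.callbacks) > self.max_entries:
            evicted, _ = self.callbacks.popitem(last=False)
            # Forgetting a client is only safe if it stops trusting its copy,
            # so the evicted entry is invalidated rather than silently dropped.
            self.pending.append(evicted)

    def write(self, obj):
        """`obj` was modified: queue an invalidation for every client caching it."""
        stale = [key for key in self.callbacks if key[0] == obj]
        for key in stale:
            del self.callbacks[key]
            self.pending.append(key)

    def tick(self):
        """Send at most `sends_per_tick` queued invalidations and return them."""
        sent = []
        while self.pending and len(sent) < self.sends_per_tick:
            sent.append(self.pending.popleft())
        return sent


if __name__ == "__main__":
    server = InvalidationServer(max_entries=3, sends_per_tick=2)
    for client in ["a", "b", "c", "d"]:
        server.register_read("scores", client)
    server.write("scores")
    while True:
        batch = server.tick()
        if not batch:
            break
        print(batch)
```

In this sketch a write to a popular object does not trigger one large burst. The invalidations drain over several calls to `tick`, which delays some clients' notifications in exchange for a bounded send rate.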