DOI: 10.1145/1736020.1736034
ASPLOS conference proceedings, research article

Request behavior variations

Published: 13 March 2010

ABSTRACT

A large number of user requests execute (often concurrently) within a server system. A single request may exhibit fluctuating hardware characteristics (such as instruction completion rate and on-chip resource usage) over the course of its execution, due to inherent variations in application execution semantics as well as dynamic resource competition on resource-sharing processors like multicores. Understanding such behavior variations can assist fine-grained request modeling and adaptive resource management.

This paper presents operating system management to track request behavior variations online. In addition to metric sample collection during periodic interrupts, we exploit the frequent system calls in server applications to perform low-cost in-kernel sampling. We utilize identified behavior variations to support or enhance request modeling in request classification, anomaly analysis, and online request signature construction. A foundation of our request modeling is the ability to quantify the difference between two requests' time series behaviors. We evaluate several differencing measures and enhance the classic dynamic time warping technique with additional penalties for asynchronous warp steps. Finally, motivated by fluctuating request resource usage and the resulting contention, we implement contention-easing CPU scheduling on multicore platforms and demonstrate its effectiveness in improving the worst-case request performance.
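To make the differencing idea concrete, the following is a minimal sketch of dynamic time warping between two requests' metric time series, with an extra cost charged on each asynchronous (non-diagonal) warp step. The function name `dtw_distance`, the constant per-step `async_penalty`, and the absolute-difference point cost are assumptions chosen for illustration; the abstract does not state the paper's exact penalty formulation.

```python
def dtw_distance(a, b, async_penalty=0.0):
    """DTW distance between two numeric series.

    async_penalty is a hypothetical constant added whenever the warp path
    advances in only one series (an asynchronous step).
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both series must be non-empty")
    inf = float("inf")
    # cost[i][j]: minimum cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            point_cost = abs(a[i - 1] - b[j - 1])
            best_prev = min(
                cost[i - 1][j - 1],              # synchronous step, no penalty
                cost[i - 1][j] + async_penalty,  # advance only in a
                cost[i][j - 1] + async_penalty,  # advance only in b
            )
            cost[i][j] = point_cost + best_prev
    return cost[n][m]


if __name__ == "__main__":
    # Two hypothetical per-sample metric traces; the second repeats one sample.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3], async_penalty=0.5))
```

With `async_penalty=0.0` this reduces to classic dynamic time warping, which can stretch one series freely to match the other; a positive penalty makes series that only match after heavy stretching count as more different.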

Experiments in this paper are based on five server applications: the Apache web server, TPC-C, TPC-H, the RUBiS online auction benchmark, and a user-content-driven online teaching application called WeBWorK.

