ABSTRACT
A large number of user requests execute (often concurrently) within a server system. A single request may exhibit fluctuating hardware characteristics (such as instruction completion rate and on-chip resource usage) over the course of its execution, due to inherent variations in application execution semantics as well as dynamic resource competition on resource-sharing processors like multicores. Understanding such behavior variations can assist fine-grained request modeling and adaptive resource management.
This paper presents operating system management to track request behavior variations online. In addition to metric sample collection during periodic interrupts, we exploit the frequent system calls in server applications to perform low-cost in-kernel sampling. We utilize identified behavior variations to support or enhance request modeling in request classification, anomaly analysis, and online request signature construction. A foundation of our request modeling is the ability to quantify the difference between two requests' time series behaviors. We evaluate several differencing measures and enhance the classic dynamic time warping technique with additional penalties for asynchronous warp steps. Finally, motivated by fluctuating request resource usage and the resulting contention, we implement contention-easing CPU scheduling on multicore platforms and demonstrate its effectiveness in improving the worst-case request performance.
Experiments in this paper are based on five server applications: the Apache web server, TPC-C, TPC-H, the RUBiS online auction benchmark, and a user-content-driven online teaching application called WeBWorK.
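To illustrate the differencing measure mentioned in the abstract, here is a minimal sketch of dynamic time warping with an extra penalty on asynchronous warp steps. The abstract does not give the exact penalty formulation, so several details are assumptions rather than the authors' method: the function name `dtw_distance`, the `async_penalty` parameter, a constant additive penalty per asynchronous step, and absolute difference as the per-sample distance.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float],
                 async_penalty: float = 0.0) -> float:
    """Dynamic time warping distance between two metric time series.

    Assumption: a warp step that advances only one of the two series
    (an "asynchronous" step) pays a constant additive `async_penalty`;
    a step that advances both series pays none.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both series must be non-empty")

    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed per-sample distance
            best_prev = min(
                cost[i - 1][j - 1],              # synchronous step
                cost[i - 1][j] + async_penalty,  # only series a advances
                cost[i][j - 1] + async_penalty,  # only series b advances
            )
            cost[i][j] = local + best_prev
    return cost[n][m]
```

With `async_penalty` set to zero this reduces to classic dynamic time warping. A larger value makes stretched or compressed alignments more expensive, so two requests whose metric series differ mainly in timing are scored as less similar.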