skip to main content
article

Online performance auditing: using hot optimizations without getting burned

Published:11 June 2006Publication History
Skip Abstract Section

Abstract

As hardware complexity increases and virtualization is added at more layers of the execution stack, predicting the performance impact of optimizations becomes increasingly difficult. Production compilers and virtual machines invest substantial development effort in performance tuning to achieve good performance for a range of benchmarks. Although optimizations typically perform well on average, they often have unpredictable impact on running time, sometimes degrading performance significantly. Today's VMs perform sophisticated feedback-directed optimizations, but these techniques do not address performance degradations, and they actually make the situation worse by making the system more unpredictable.This paper presents an online framework for evaluating the effectiveness of optimizations, enabling an online system to automatically identify and correct performance anomalies that occur at runtime. This work opens the door for a fundamental shift in the way optimizations are developed and tuned for online systems, and may allow the body of work in offline empirical optimization search to be applied automatically at runtime. We present our implementation and evaluation of this system in a product Java VM.

References

  1. A.-R. Adl-Tabatabai, J. Bharadwaj, D.-Y. Chen, A. Ghuloum, V. Menon, B. Murphy, M. Serrano, and T. Shpeisman. The Star-JIT compiler: A dynamic compiler for managed runtime environments. Intel Technology Journal, 7(1):19--31, Feb. 2003.Google ScholarGoogle Scholar
  2. A.-R. Adl-Tabatabai, R. L. Hudson, M. J. Serrano, and S. Subramoney. Prefetch injection based on hardware monitoring and object metadata. ACM SIGPLAN Notices, 39(6):267--276, June 2004. In Conference on Programming Language Design and Implementation (PLDI). Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. F. Agakov, E. Bonilla, J. Cavazos, B. Franke, G. Fursin, M. O'Boyle, J. Thomson, M. Toussaint, and C. Williams. Using machine learning to focus iterative optimization. In The International Symposium on Code Generation and Optimization, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. A. W. Appel. Simple generational garbage collection and fast allocation. Software-Practice and Experience, 19(2):171--183, Feb. 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. M. Arnold, S. Fink, D. Grove, M. Hind, and P. F. Sweeney. Adaptive optimization in the Jalapeño JVM. ACM SIGPLAN Notices, 35(10):47--65, Oct. 2000. In Conference on Object-Oriented Programming, Systems, Languages and Applications (OOPSLA). Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. M. Arnold, S. J. Fink, D. Grove, M. Hind, and P. F. Sweeney. A survey of adaptive optimization in virtual machines. Proceedings of the IEEE, 93(2), 2005. Special issue on Program Generation, Optimization, and Adaptation.Google ScholarGoogle ScholarCross RefCross Ref
  7. M. Arnold and B. G. Ryder. A framework for reducing the cost of instrumented code. ACMSIGPLAN Notices, 36(5):168--179, May 2001. In Conference on Programming Language Design and Implementation (PLDI). Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. D. F. Bacon, R. Konuru, C. Murthy, and M. Serrano. Thin locks: Featherweight synchronization for Java. ACM SIGPLAN Notices, 33(5):258--268, May 1998. In Conference on Programming Language Design and Implementation (PLDI). Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. J. Bilmes, K. Asanovic, C.-W. Chin, and J. Demmel. Optimizing matrix multiply using PHiPAC: A portable, high-performance, ANSI C coding methodology. In 1997 International Conference on Supercomputing, pages 340--347, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. J. Cavazos and J. E. B. Moss. Inducing heuristics to decide whether to schedule. ACM SIGPLAN Notices, 39(6):183--194, June 2004. In Conference on Programming Language Design and Implementation (PLDI). Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. C. Chen, J. Chame, and M. Hall. Combining models and guided empirical search to optimize for multiple levels of the memory hierarchy. In The International Symposium on Code Generation and Optimization, Mar. 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. H. Chen, J. Lu, W.-C. Hsu, and P.-C. Yew. Continuous adaptive objectcode re-optimization framework. In Ninth Asia-Pacific Computer Systems Architecture, Sept. 2004.Google ScholarGoogle ScholarCross RefCross Ref
  13. B. Childers, J. Davidson, and M. L. Soffa. Continuous compilation: A new approach to aggressive and adaptive code transformation. In International Symposium on Parallel and Distributed Processing Symposium, Apr. 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. T. M. Chilimbi and M. Hirzel. Dynamic hot data stream prefetching for general-purpose programs. ACM SIGPLAN Notices, 37(5):199--209, May 2002. In Conference on Programming Language Design and Implementation (PLDI). Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. http://www-plan.cs.colorado.edu/henkel/projects/colorado bench.Google ScholarGoogle Scholar
  16. K. Cooper, P. Schielke, and D. Subramanian. Optimizing for reduced code space using genetic algorithms. In ACM Conference on Languages, Compilers, and Tools for Embedded Systems, May 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. K. D. Cooper, D. Subramanian, and L. Torczon. Adaptive optimizing compilers for the 21st century. Journal of Supercomputing, May 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. http://pag.csail.mit.edu/daikon.Google ScholarGoogle Scholar
  19. J. Dean and C. Chambers. Towards better inlining decisions using inlining trials. In LISP and Functional Programming, pages 273--282, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. L. P. Deutsch and A. M. Schiffman. Efficient implementation of the Smalltalk-80 system. In 11th Symposium on Principles of Programming Languages (POPL), pages 297--302, Jan. 1984. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. P. C. Diniz and M. C. Rinard. Dynamic feedback: An effective technique for adaptive computing. ACM SIGPLAN Notices, 32(5):71--84, May 1997. In Conference on Programming Language Design and Implementation (PLDI). Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. M. Frigo. A fast Fourier transform compiler. ACM SIGPLAN Notices, 34(5):169--180, May 1999. In Conference on Programming Language Design and Implementation (PLDI). Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. M. Frigo and S. G. Johnson. FFTW: An adaptive software architecture for the FFT. In 1998 IEEE International Conference on Acoustics Speech and Signal Processing, volume 3, pages 1381--1384. IEEE, 1998.Google ScholarGoogle ScholarCross RefCross Ref
  24. G. Fursin, A. Cohen, M. O'Boyle, and O. Temam. A practical method for quickly evaluating program optimizations. In Proceedings of the 1st International Conference on High Performance Embedded Architectures & Compilers (HiPEAC 2005), number 3793 in LNCS, pages 29--46. Springer Verlag, November 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. N. Grcevski, A. Kilstra, K. Stoodley, M. Stoodley, and V. Sundaresan. Java just-in-time compiler and virtual machine improvements for server and middleware applications. In 3rd Virtual Machine Research and Technology Symposium (VM), May 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. G. J. Hansen. Adaptive Systems for the Dynamic Run-time Optimization of Programs. PhD thesis, Carnegie-Mellon University, 1974. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. M. Hirzel, A. Diwan, and M. Hind. Pointer analysis in the pressence of dynamic class loading. In 18th European Conference on Object-Oriented Programming (ECOOP), volume 3086 of LNCS, pages 96--122, June 2004.Google ScholarGoogle Scholar
  28. U. Hölzle, C. Chambers, and D. Ungar. Optimizing dynamicallytyped object-oriented languages with polymorphic inline caches. In 5th European Conference on Object-Oriented Programming (ECOOP), volume 512 of LNCS, pages 21--38, July 1991. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. U. Hölzle and D. Ungar. Reconciling responsiveness with performance in pure object-oriented languages. Transactions on Programming Languages and Systems (TOPLAS), 18(4):355--400, July 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. X. Huang, S. M. Blackburn, K. S. McKinley, J. E. B. Moss, Z. Wang, and P. Cheng. The garbage collection advantage: Improving program locality. ACM SIGPLAN Notices, 39(10):69--80, Oct. 2004. In Conference on Object-Oriented Programming, Systems, Languages and Applications (OOPSLA). Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. E.-J. Im and K. Yelick. Optimizing sparse matrix-vector multiplication for register reuse in SPARSITY. In International Conference on Computational Science, May 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. E.-J. Im, K. Yelick, and R. Vuduc. SPARSITY: Optimization framework for sparse matrix kernels. International Journal of High Performance Computing Applications, 18(1), Jan. 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. K. Ishizaki, M. Takeuchi, K. Kawachiya, T. Suganuma, O. Gohda, T. Inagaki, A. Koseki, K. Ogata, M. Kawahito, T. Yasue, T. Ogasawara, T. Onodera, H. Komatsu, and T. Nakatani. Effectiveness of crossplatform optimizations for a Java just-in-time compiler. ACMSIGPLAN Notices, 38(11):187--204, Nov. 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. T. Kistler and M. Franz. Automated data-member layout of heap objects to improve memory-hierarchy performance. Transactions on Programming Languages and Systems (TOPLAS), 22(3):49--505, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. T. P. Kistler and M. Franz. Continuous program optimization: Design and evaluation. IEEE Transactions on Computers, 50(6):549--566, June 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. P. Kulkarni, D. Whalley, G. Tyson, and J. Davidson. Exhaustive optimization phase order space exploration. In The International Symposium on Code Generation and Optimization, Mar. 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. P. A. Kulkarni, S. R. Hines, D. B. Whalley, J. D. Hiser, J. W. Davidson, and D. L. Jones. Fast and efficient searches for effective optimization phase sequences. ACM Transactions on Architecture and Code Optimization, 2(2):165--198, June 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. X. Li, M. J. Garzarán, and D. Padua. Optimizing sorting with genetic algorithms. In The International Symposium on Code Generation and Optimization, pages 99--110, Mar. 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. D. Maier, P. Ramarao, M. Stoodley, and V. Sundaresan. Experiences with multithreading and dynamic class loading in a Java just-in-time compiler. In The International Symposium on Code Generation and Optimization, Mar. 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. P. Nagpurkar, M. Hind, C. Krintz, P. F. Sweeney, and V. Rajan. Online phase detection algorithms. In The International Symposium on Code Generation and Optimization, Mar. 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. N. Nethercote, D. Burger, and K. S. McKinley. Self-evaluating compilation applied to loop unrolling. Technical Report TR-06-12, The University of Texas at Austin, Feb. 2006.Google ScholarGoogle Scholar
  42. M. Paleczny, C. Vick, and C. Click. The Java Hotspot server compiler. In Java Virtual Machine Research and Technology Symposium (JVM), pages 1--12, Apr. 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. M. Püschel, J. M. F. Moura, J. Johnson, D. Padua, M. Veloso, B. W. Singer, J. Xiong, F. Franchetti, A. Gačić, Y. Voronenko, K. Chen, R. W. Johnson, and N. Rizzolo. SPIRAL: Code generation for DSP transforms. Proceedings of the IEEE, 93(2), 2005. Special issue on Program Generation, Optimization, and Adaptation.Google ScholarGoogle ScholarCross RefCross Ref
  44. R. M. Rabbah and K. V. Palem. Data remapping for design space optimization of embedded memory systems. ACM Transactions on Embedded Computing Systems, 2(2):1--32, May 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. R. Saavedra and D. Park. Improving the effectiveness of software prefetching with adaptive execution. In Conference on Parallel Architectures and Compilation Techniques, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. http://www.sable.mcgill.ca/software/#soot.Google ScholarGoogle Scholar
  47. Standard Performance Evaluation Corporation. SPECjbb2000 Java Business Benchmark. http://www.spec.org/jbb2000.Google ScholarGoogle Scholar
  48. Standard Performance Evaluation Corporation. SPECjvm98 Benchmarks. http://www.spec.org/jvm98.Google ScholarGoogle Scholar
  49. M. Stephenson and S. Amarasinghe. Predicting unroll factors using supervised classification. In The International Symposium on Code Generation and Optimization, pages 123--134, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. T. Suganuma, T. Yasue, M. Kawahito, H. Komatsu, and T. Nakatani. A dynamic optimization framework for a Java just-in-time compiler. ACM SIGPLAN Notices, 36(11):180--195, Nov. 2001. In Conference on Object-Oriented Programming, Systems, Languages and Applications (OOPSLA). Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. The DaCapo Project. DaCapo Benchmark Suite, version beta051009. http://www-ali.cs.umass.edu/DaCapo/gcbm.html.Google ScholarGoogle Scholar
  52. N. Thomas, G. Tanase, O. Tkachyshyn, J. Perdue, N. M. Amato, and L. Rauchwerger. A framework for adaptive algorithm selection in STAPL. In Symposium on Principles and Practice of Parallel Programming (PPoPP), pages 277--288, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. S. Triantafyllis, M. Vachharajani, and D. August. Compiler optimization-space exploration. Journal of Instruction-Level Parallelism, 7:1--25, Jan. 2005.Google ScholarGoogle Scholar
  54. M. J. Voss and R. Eigemann. High-level adaptive program optimization with ADAPT. ACM SIGPLAN Notices, 36(7):93--102, July 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. R. C. Whaley, A. Petitet, and J. J. Dongarra. Automated empirical optimizations of software and the ATLAS project. Parallel Computing, 27(1-2):3--35, 2001.Google ScholarGoogle ScholarDigital LibraryDigital Library
  56. http://xml.apache.org/xerces2-j/index.html.Google ScholarGoogle Scholar
  57. K. Yotov, X. Li, G. Ren, M. Cibulskis, G. DeJong, M. Garzaran, D. Padua, K. Pingali, P. Stodghill, and P.Wu. A comparison of empirical and model-driven optimization. In ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Online performance auditing: using hot optimizations without getting burned

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      • Published in

        cover image ACM SIGPLAN Notices
        ACM SIGPLAN Notices  Volume 41, Issue 6
        Proceedings of the 2006 PLDI Conference
        June 2006
        426 pages
        ISSN:0362-1340
        EISSN:1558-1160
        DOI:10.1145/1133255
        Issue’s Table of Contents
        • cover image ACM Conferences
          PLDI '06: Proceedings of the 27th ACM SIGPLAN Conference on Programming Language Design and Implementation
          June 2006
          438 pages
          ISBN:1595933204
          DOI:10.1145/1133981

        Copyright © 2006 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 11 June 2006

        Check for updates

        Qualifiers

        • article

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader
      About Cookies On This Site

      We use cookies to ensure that we give you the best experience on our website.

      Learn more

      Got it!