
Improving the performance of trace-based systems by false loop filtering

Published: 05 March 2011

Abstract

Trace-based compilation is a promising technique for language compilers and binary translators. It offers the potential to expand the compilation scopes that have traditionally been limited by method boundaries.

Detecting repeating cyclic execution paths and capturing the detected repetitions into traces is a key requirement for trace selection algorithms to achieve good optimization and performance with small amounts of code. One important class of repetition detection is cyclic-path-based repetition detection, where a cyclic execution path (a path that starts and ends at the same instruction address) is detected as a repeating cyclic execution path.
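As a rough illustration of cyclic-path-based repetition detection, the sketch below scans a stream of executed instruction addresses and reports the first path that starts and ends at the same address. The function name `find_first_cycle` and the string addresses are hypothetical, chosen for illustration; this is not the detection code of any particular trace-based system.

```python
from typing import Hashable, List, Optional, Sequence


def find_first_cycle(addresses: Sequence[Hashable]) -> Optional[List[Hashable]]:
    """Return the first cyclic path found in an execution stream.

    A cyclic path is a sub-sequence that starts and ends at the same
    instruction address. Returns None if no address repeats.
    """
    last_seen = {}  # address -> index of its most recent occurrence
    for i, addr in enumerate(addresses):
        if addr in last_seen:
            # Path from the earlier occurrence up to and including this one.
            return list(addresses[last_seen[addr]:i + 1])
        last_seen[addr] = i
    return None


# Hypothetical address stream for a loop whose body is A, B, C.
loop_stream = ["A", "B", "C", "A", "B", "C"]
cycle = find_first_cycle(loop_stream)
```

Note that this detector looks only at addresses, so it cannot tell whether the cycle it finds will actually repeat.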

However, we found many cyclic paths that are not repeating cyclic execution paths; we call these false loops. A common class of false loops occurs when a method is invoked from multiple call-sites: a cycle is formed between two invocations of the method from different call-sites, but it does not represent a loop or recursion. False loops can result in shorter traces and smaller compilation scopes, and can degrade performance.

We propose false loop filtering, an approach to reject false loops in the repetition detection step of trace selection, and a technique called false loop filtering by call-stack-comparison, which rejects a cyclic path as a false loop if the call stacks at the beginning and the end of the cycle are different.
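The idea of call-stack comparison can be sketched as follows. Each executed instruction is modeled as a pair of an address and the call stack at that point, and a cyclic path is accepted only if the stacks at its two ends match. The event encoding, the names `select_cycle` and `filter_false_loops`, and the use of plain tuple equality to compare stacks are all assumptions made for this illustration; the abstract does not specify how the compiler represents or compares call stacks.

```python
from typing import List, Optional, Tuple

# (instruction address, call stack as a tuple of call-site identifiers)
Event = Tuple[str, Tuple[str, ...]]


def select_cycle(events: List[Event], filter_false_loops: bool) -> Optional[List[str]]:
    """Return the first accepted cyclic path as a list of addresses, or None."""
    last_seen = {}  # address -> index of its most recent occurrence
    for i, (addr, stack) in enumerate(events):
        if addr in last_seen:
            start = last_seen[addr]
            start_stack = events[start][1]
            # Assumption: stacks are compared by simple equality.
            if not filter_false_loops or start_stack == stack:
                return [a for a, _ in events[start:i + 1]]
            # Stacks differ: treat as a false loop and keep scanning.
        last_seen[addr] = i
    return None


# Hypothetical stream: method f is called from two different call-sites in main.
two_call_sites = [
    ("main.0", ()),
    ("f.entry", ("main:call1",)),
    ("f.ret", ("main:call1",)),
    ("main.1", ()),
    ("f.entry", ("main:call2",)),
    ("f.ret", ("main:call2",)),
    ("main.2", ()),
]

# Hypothetical stream: a genuine loop at the top level.
real_loop = [("loop.head", ()), ("loop.body", ()), ("loop.head", ())]

unfiltered = select_cycle(two_call_sites, filter_false_loops=False)
filtered = select_cycle(two_call_sites, filter_false_loops=True)
accepted = select_cycle(real_loop, filter_false_loops=True)
```

Without filtering, the two entries into `f` form a cycle even though nothing repeats; with filtering, that cycle is rejected while the genuine loop is still accepted. Under the equality rule assumed here, a recursive call would also be rejected because the stack grows between the two occurrences, so a real implementation would need a more careful comparison than this sketch.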

We applied false loop filtering to our trace-based Java™ JIT compiler, which is based on IBM's J9 JVM. False loop filtering achieved average improvements of 16% and 10% on the DaCapo benchmarks when applied to two baseline trace selection algorithms, respectively, with up to 37% improvement for individual benchmarks. With false loop filtering, our trace-based JIT achieves performance comparable to that of the method-based J9 JVM/JIT at the corresponding optimization level.



Published in

ACM SIGPLAN Notices, Volume 46, Issue 3 (ASPLOS '11), March 2011, 407 pages. ISSN: 0362-1340. EISSN: 1558-1160. DOI: 10.1145/1961296

ASPLOS XVI: Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems, March 2011, 432 pages. ISBN: 9781450302661. DOI: 10.1145/1950365

Copyright © 2011 ACM

Publisher

Association for Computing Machinery, New York, NY, United States
