A new idiom recognition framework for exploiting hardware-assist instructions

Published: 20 October 2006

Abstract

Modern processors support hardware-assist instructions (such as the TRT and TROT instructions on IBM zSeries) to accelerate certain functions such as delimiter search and character conversion. Such special instructions have often been used in high-performance libraries, but optimizing compilers have not exploited them well except in some limited cases. We propose a new idiom recognition technique derived from a topological embedding algorithm [4] to detect idiom patterns in the input program more aggressively than previous approaches. Our approach can detect a pattern even if the code segment does not exactly match the idiom. For example, we can detect a code segment that includes additional code within the idiom pattern. We implemented our new idiom recognition approach in the Java Just-In-Time (JIT) compiler that is part of the J9 Java Virtual Machine, and we supported several important idioms for special hardware-assist instructions on the IBM zSeries and on some models of the IBM pSeries. To demonstrate the effectiveness of our technique, we performed two experiments. The first measures how many more patterns we can detect compared to the previous approach. The second measures how much performance improvement we can achieve over the previous approach. For the first experiment, we used the Java Compatibility Kit (JCK) API tests. For the second, we used the IBM XML parser, SPECjvm98, and SPECjbb2000. In summary, relative to a baseline implementation using exact pattern matching, our algorithm converted 75% more loops in the JCK tests. We also observed average performance improvements on a z990 of 64% for the XML parser, 1% for SPECjvm98, and 2% for SPECjbb2000. Finally, we observed that the JIT compilation time increases by only 0.32% to 0.44%.
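
To make the kind of loop concrete, here is a minimal Java sketch of a delimiter-search loop and a variant that carries additional code inside the loop body. It is an illustration only and is not taken from the paper: the class name, method names, and the 256-entry table layout are all assumptions, chosen to roughly mirror a table-driven scan in the style of TRT (stop at the first byte whose table entry is nonzero). An exact pattern matcher would be expected to accept only the first loop; the abstract's claim is that the proposed technique can also recognize loops like the second.

```java
public final class DelimiterSearchSketch {
    private DelimiterSearchSketch() {}

    // Hypothetical helper: build a 256-entry table with a nonzero entry
    // for every delimiter character (assumes single-byte characters).
    static byte[] delimiterTable(String delimiters) {
        byte[] table = new byte[256];
        for (int k = 0; k < delimiters.length(); k++) {
            table[delimiters.charAt(k) & 0xFF] = 1;
        }
        return table;
    }

    // "Exact" form of the idiom: advance until a byte maps to a nonzero
    // table entry. Returns text.length when no delimiter is found.
    static int findDelimiter(byte[] text, byte[] table, int start) {
        int i = start;
        while (i < text.length && table[text[i] & 0xFF] == 0) {
            i++;
        }
        return i;
    }

    // Same search with additional code in the loop body (a running count).
    // Returns { index of delimiter, number of bytes skipped }.
    static int[] findDelimiterCounting(byte[] text, byte[] table, int start) {
        int i = start;
        int scanned = 0;
        while (i < text.length && table[text[i] & 0xFF] == 0) {
            scanned++; // extra statement that is not part of the bare idiom
            i++;
        }
        return new int[] { i, scanned };
    }
}
```

Both methods compute the same stopping index; the second only adds bookkeeping that a compiler could in principle recompute after the scan, which is what makes such a loop a plausible candidate for replacement by a single hardware-assisted search.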

References

  1. W. Blume and R. Eigenmann. An Overview of Symbolic Analysis Techniques Needed for the Effective Parallelization of the Perfect Benchmarks. Proceedings of the 1994 International Conference on Parallel Processing, pp. 233--238, 1994.
  2. N. Clark, J. Blome, M. Chu, S. Mahlke, S. Biles, and K. Flautner. An architecture framework for transparent instruction set customization in embedded processors. In Proc. of the 32nd Annual International Symposium on Computer Architecture, pp. 272--283, June 2005.
  3. P. Cousot and R. Cousot. Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In ACM Symposium on Principles of Programming Languages (POPL '77), pp. 238--252, 1977.
  4. J.J. Fu. Directed Graph Pattern Matching and Topological Embedding, Journal of Algorithms, 22(2):372--391, February 1997.
  5. N. Grcevski, A. Kielstra, K. Stoodley, M.G. Stoodley, V. Sundaresan. Java Just-in-Time Compiler and Virtual Machine Improvements for Server and Middleware Applications. Virtual Machine Research and Tech Symposium 2004, pp. 151--162.
  6. IBM Corp., IBM Mainframe, http://www-03.ibm.com/servers/eserver/zseries/
  7. IBM Corp., IBM PowerPC Architecture, http://www-03.ibm.com/chips/power/powerpc/
  8. IBM Corp., IBM System p5 servers, http://www-03.ibm.com/systems/p/
  9. IBM Corp., PowerPC Microprocessor Family: Vector/SIMD Multimedia Extension Technology Programming Environments Manual, http://www-3.ibm.com/chips/techlib/techlib.nsf/techdocs/FBFA164F824370F987256D6A006F424D
  10. IBM Corp. z/Architecture Principles of Operation, SA22-7832-04, September 2005.
  11. T. Inagaki, T. Onodera, H. Komatsu, and T. Nakatani. "Stride Prefetching by Dynamically Inspecting Objects," ACM Programming Language Design and Implementation (PLDI 2003), pp. 269--277, 2003.
  12. Java Compatibility Kit, https://jck.dev.java.net/
  13. M. Kawahito, H. Komatsu, and T. Nakatani. Partial redundancy elimination for access expressions by speculative code motion, Software: Practice and Experience, Vol. 34, No. 11, pp. 1065--1090, 2004.
  14. J. Knoop, O. Rüthing, and B. Steffen. Partial dead code elimination. In Proceedings of the ACM SIGPLAN '94 Conference on Programming Language Design and Implementation, pp. 147--158, Orlando, Florida, June 1994.
  15. J. Knoop, O. Rüthing, and B. Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, Vol. 17, No. 5, pp. 777--802, 1995.
  16. S. Larsen and S. Amarasinghe. Exploiting superword level parallelism with multimedia instruction sets. In Proceedings of the Conference on Programming Language Design and Implementation, pp. 145--156, 2000.
  17. M. Leuschel. A framework for the integration of partial evaluation and abstract interpretation of logic programs. ACM Transactions on Programming Languages and Systems, Vol. 26, No. 3, pp. 413--463, 2004.
  18. T. Lindholm, F. Yellin. The Java Virtual Machine Specification, Addison-Wesley Publishing Co., Reading, MA (1996).
  19. R. Metzger. Automated Recognition of Parallel Algorithms in Scientific Applications. In IJCAI-95 Workshop Program Working Notes, 1995.
  20. Motorola Corp., AltiVec Technology Programming Interface Manual, 1999, http://www.freescale.com/files/32bit/doc/ref_manual/ALTIVECPIM.pdf
  21. S.S. Muchnick. Advanced compiler design and implementation, Morgan Kaufmann Publishers, Inc., 1997.
  22. S. Pinter, R. Pinter. Program Optimization and Parallelization Using Idioms, ACM Transactions on Programming Languages and Systems, vol. 14, 1994, pp. 305--327.
  23. B. Pottenger, R. Eigenmann. Idiom recognition in the Polaris parallelizing compiler, Proceedings of the 9th international conference on Supercomputing, pp. 444--448, 1995.
  24. H. Sato. Array Form Representation of Idiom Recognition System for Numerical Programs, Proceedings of the 2001 conference on APL, pp. 87--98, 2001.
  25. J. Shin, M.W. Hall, J. Chame. Superword-level parallelism in the presence of control flow, Symposium on Code Generation and Optimization, pp. 165--175, 2005.
  26. T.J. Slegel, E. Pfeffer, and J.A. Magee. The IBM eServer z990 microprocessor, IBM Journal of Research and Development, Vol. 48, No. 3/4, 2004.
  27. T. Suganuma, T. Ogasawara, M. Takeuchi, T. Yasue, M. Kawahito, K. Ishizaki, H. Komatsu, T. Nakatani. Overview of the IBM Java Just-in-Time Compiler, IBM Systems Journal, Vol. 39, No. 1, 2000.
  28. T. Suganuma, T. Yasue, M. Kawahito, H. Komatsu, and T. Nakatani. "A Dynamic Optimization Framework for a Java Just-In-Time Compiler", In Proceedings of the International Conference on Object-Oriented Programming, Systems, Languages, and Applications, October 2001.
  29. T. Suganuma, T. Ogasawara, K. Kawachiya, M. Takeuchi, K. Ishizaki, A. Koseki, T. Inagaki, T. Yasue, M. Kawahito, T. Onodera, H. Komatsu, and T. Nakatani. Evolution of a Java just-in-time compiler for IA-32 platforms, IBM Journal of Research and Development, vol. 48, Num. 5/6, 2004.
  30. G.R. Uh and D.B. Whalley. Effectively exploiting indirect jumps, Software - Practice and Experience, vol. 29, Num. 12, pp. 1061--1101, 1999.

    • Published in

      ACM SIGOPS Operating Systems Review, Volume 40, Issue 5
      Proceedings of the 2006 ASPLOS Conference
      December 2006
      425 pages
      ISSN: 0163-5980
      DOI: 10.1145/1168917
      • ASPLOS XII: Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
        October 2006
        440 pages
        ISBN: 1595934510
        DOI: 10.1145/1168857

      Copyright © 2006 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States
