Abstract
Modern processors support hardware-assist instructions (such as the TRT and TROT instructions on IBM zSeries) to accelerate certain functions such as delimiter search and character conversion. Such special instructions have often been used in high-performance libraries, but optimizing compilers have not exploited them well except in some limited cases. We propose a new idiom recognition technique derived from a topological embedding algorithm [4] to detect idiom patterns in the input program more aggressively than previous approaches. Our approach can detect a pattern even if the code segment does not exactly match the idiom. For example, we can detect a code segment that includes additional code within the idiom pattern. We implemented our new idiom recognition approach in the Java Just-In-Time (JIT) compiler that is part of the J9 Java Virtual Machine, and we supported several important idioms for special hardware-assist instructions on the IBM zSeries and on some models of the IBM pSeries. To demonstrate the effectiveness of our technique, we performed two experiments. The first measures how many more patterns we can detect than the previous approach. The second measures how much performance improvement we can achieve over the previous approach. For the first experiment, we used the Java Compatibility Kit (JCK) API tests. For the second, we used the IBM XML parser, SPECjvm98, and SPECjbb2000. In summary, relative to a baseline implementation using exact pattern matching, our algorithm converted 75% more loops in the JCK tests. We also observed significant performance improvements: 64% for the XML parser, 1% for SPECjvm98, and 2% for SPECjbb2000 on average on a z990. Finally, we observed that JIT compilation time increased by only 0.32% to 0.44%.
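To make the notion of an idiom concrete, the following Java sketch shows the kind of delimiter-search loop that a table-driven scan instruction such as TRT could in principle replace, along with a variant that carries extra code inside the loop body. It is an illustration only: the class name, method names, and the `boolean[]` lookup table are hypothetical, and nothing here is taken from the paper's implementation or its list of supported idioms.

```java
import java.nio.charset.StandardCharsets;

public class DelimiterSearchIdiom {

    // Canonical form (hypothetical): advance until a byte flagged in a
    // 256-entry table is found. An exact pattern matcher could plausibly
    // recognize this shape.
    static int findDelimiter(byte[] text, int start, boolean[] isDelimiter) {
        int i = start;
        while (i < text.length && !isDelimiter[text[i] & 0xFF]) {
            i++;
        }
        return i; // index of the first delimiter, or text.length if none
    }

    // Same search with additional code (a counter) inside the loop body.
    // The loop no longer matches the canonical form exactly, which is the
    // situation the abstract says the proposed technique can still detect.
    static int[] findDelimiterCounting(byte[] text, int start, boolean[] isDelimiter) {
        int i = start;
        int scanned = 0;
        while (i < text.length && !isDelimiter[text[i] & 0xFF]) {
            scanned++; // extra bookkeeping unrelated to the search itself
            i++;
        }
        return new int[] { i, scanned };
    }

    public static void main(String[] args) {
        boolean[] isDelimiter = new boolean[256];
        isDelimiter['<'] = true;
        isDelimiter[' '] = true;

        byte[] text = "name<tag".getBytes(StandardCharsets.US_ASCII);
        System.out.println(findDelimiter(text, 0, isDelimiter));
        int[] result = findDelimiterCounting(text, 0, isDelimiter);
        System.out.println(result[0] + " " + result[1]);
    }
}
```

Both methods return the same delimiter position; the second differs only by the counter update, so a recognizer that tolerates additional code within the pattern could treat the two loops alike, provided the extra computation is preserved.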
References
- W. Blume and R. Eigenmann. An Overview of Symbolic Analysis Techniques Needed for the Effective Parallelization of the Perfect Benchmarks. Proceedings of the 1994 International Conference on Parallel Processing, pp. 233--238, 1994.
- N. Clark, J. Blome, M. Chu, S. Mahlke, S. Biles, and K. Flautner. An architecture framework for transparent instruction set customization in embedded processors. In Proc. of the 32nd Annual International Symposium on Computer Architecture, pp. 272--283, June 2005.
- P. Cousot and R. Cousot. Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In ACM Symposium on Principles of Programming Languages (POPL '77), pp. 238--252, 1977.
- J.J. Fu. Directed Graph Pattern Matching and Topological Embedding, Journal of Algorithms, 22(2):372--391, February 1997.
- N. Grcevski, A. Kielstra, K. Stoodley, M.G. Stoodley, V. Sundaresan. Java Just-in-Time Compiler and Virtual Machine Improvements for Server and Middleware Applications. Virtual Machine Research and Tech Symposium 2004, pp. 151--162.
- IBM Corp., IBM Mainframe, http://www-03.ibm.com/servers/eserver/zseries/
- IBM Corp., IBM PowerPC Architecture, http://www-03.ibm.com/chips/power/powerpc/
- IBM Corp., IBM System p5 servers, http://www-03.ibm.com/systems/p/
- IBM Corp., PowerPC Microprocessor Family: Vector/SIMD Multimedia Extension Technology Programming Environments Manual, http://www-3.ibm.com/chips/techlib/techlib.nsf/techdocs/FBFA164F824370F987256D6A006F424D
- IBM Corp. z/Architecture Principles of Operation, SA22-7832-04, September 2005.
- T. Inagaki, T. Onodera, H. Komatsu, and T. Nakatani. "Stride Prefetching by Dynamically Inspecting Objects," ACM Programming Language Design and Implementation (PLDI 2003), pp. 269--277, 2003.
- Java Compatibility Kit, https://jck.dev.java.net/
- M. Kawahito, H. Komatsu, and T. Nakatani. Partial redundancy elimination for access expressions by speculative code motion, Software: Practice and Experience, Vol. 34, No. 11, pp. 1065--1090, 2004.
- J. Knoop, O. Rüthing, and B. Steffen. Partial dead code elimination. In Proceedings of the ACM SIGPLAN '94 Conference on Programming Language Design and Implementation, pp. 147--158, Orlando, Florida, June 1994.
- J. Knoop, O. Rüthing, and B. Steffen. Optimal code motion: Theory and practice. ACM Transactions on Programming Languages and Systems, Vol. 17, No. 5, pp. 777--802, 1995.
- S. Larsen and S. Amarasinghe. Exploiting superword level parallelism with multimedia instruction sets. In Proceedings of the Conference on Programming Language Design and Implementation, pp. 145--156, 2000.
- M. Leuschel. A framework for the integration of partial evaluation and abstract interpretation of logic programs. ACM Transactions on Programming Languages and Systems, Vol. 26, No. 3, pp. 413--463, 2004.
- T. Lindholm, F. Yellin. The Java Virtual Machine Specification, Addison-Wesley Publishing Co., Reading, MA (1996).
- R. Metzger. Automated Recognition of Parallel Algorithms in Scientific Applications. In IJCAI-95 Workshop Program Working Notes, 1995.
- Motorola Corp., AltiVec Technology Programming Interface Manual, 1999, http://www.freescale.com/files/32bit/doc/ref_manual/ALTIVECPIM.pdf
- S.S. Muchnick. Advanced compiler design and implementation, Morgan Kaufmann Publishers, Inc., 1997.
- S. Pinter, R. Pinter. Program Optimization and Parallelization Using Idioms, ACM Transactions on Programming Languages and Systems, vol. 14, 1994, pp. 305--327.
- B. Pottenger, R. Eigenmann. Idiom recognition in the Polaris parallelizing compiler, Proceedings of the 9th international conference on Supercomputing, pp. 444--448, 1995.
- H. Sato. Array Form Representation of Idiom Recognition System for Numerical Programs, Proceedings of the 2001 conference on APL, pp. 87--98, 2001.
- J. Shin, M.W. Hall, J. Chame. Superword-level parallelism in the presence of control flow, Symposium on Code Generation and Optimization, pp. 165--175, 2005.
- T.J. Slegel, E. Pfeffer, and J.A. Magee. The IBM eServer z990 microprocessor, IBM Journal of Research and Development, Vol. 48, No. 3/4, 2004.
- T. Suganuma, T. Ogasawara, M. Takeuchi, T. Yasue, M. Kawahito, K. Ishizaki, H. Komatsu, T. Nakatani. Overview of the IBM Java Just-in-Time Compiler, IBM Systems Journal, Vol. 39, No. 1, 2000.
- T. Suganuma, T. Yasue, M. Kawahito, H. Komatsu, and T. Nakatani. "A Dynamic Optimization Framework for a Java Just-In-Time Compiler", In Proceedings of the International Conference on Object-Oriented Programming, Systems, Languages, and Applications, October 2001.
- T. Suganuma, T. Ogasawara, K. Kawachiya, M. Takeuchi, K. Ishizaki, A. Koseki, T. Inagaki, T. Yasue, M. Kawahito, T. Onodera, H. Komatsu, and T. Nakatani. Evolution of a Java just-in-time compiler for IA-32 platforms, IBM Journal of Research and Development, vol. 48, Num. 5/6, 2004.
- G.R. Uh and D.B. Whalley. Effectively exploiting indirect jumps, Software - Practice and Experience, vol. 29, Num. 12, pp. 1061--1101, 1999.
Recommendations
A new idiom recognition framework for exploiting hardware-assist instructions
ASPLOS XII: Proceedings of the 12th international conference on Architectural support for programming languages and operating systems