skip to main content
research-article

Improving the performance of object-oriented languages with dynamic predication of indirect jumps

Published:01 March 2008Publication History
Skip Abstract Section

Abstract

Indirect jump instructions are used to implement increasingly-common programming constructs such as virtual function calls, switch-case statements, jump tables, and interface calls. The performance impact of indirect jumps is likely to increase because indirect jumps with multiple targets are difficult to predict even with specialized hardware.

This paper proposes a new way of handling hard-to-predict indirect jumps: dynamically predicating them. The compiler (static or dynamic) identifies indirect jumps that are suitable for predication along with their control-flow merge (CFM) points. The hardware predicates theinstructions between different targets of the jump and its CFM point if the jump turns out to be hard-to-predict at run time. If the jump would actually have been mispredicted, its dynamic predication eliminates a pipeline flush, thereby improving performance.

Our evaluations show that Dynamic Indirect jump Predication (DIP) improves the performance of a set of object-oriented applications including the Java DaCapo benchmark suite by 37.8% compared to a commonly-used branch target buffer based predictor, while also reducing energy consumption by 24.8%. We compare DIP to three previously proposed indirect jump predictors and find that it provides the best performance and energy-efficiency.

Skip Supplemental Material Section

Supplemental Material

Video

References

  1. J.R. Allen, K. Kennedy, C. Porterfield, and J. Warren. Conversion of control dependence to data dependence. In POPL-10, 1983. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. B. Alpern, A. Cocchi, S. Fink, D. Grove, and D. Lieber. Efficient implementation of Java interfaces: Invokeinterface considered harmless. In OOPSLA, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. S. Bhansali, W.-K. Chen, S.D. Jong, A. Edwards, and M. Drinic. Framework for instruction-level tracing and analysis of programs. In VEE, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. N.L. Binkert, R.G. Dreslinski, L.R. Hsu, K.T. Lim, A.G. Saidi, and S.K. Reinhardt. The M5 simulator: Modeling networked systems. IEEE Micro, 26(4):52--60, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. S.M. Blackburn et al. The DaCapo benchmarks: Java benchmarking development and analysis. In OOPSLA'06, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. D. Brooks, V. Tiwari, and M. Martonosi. Wattch: a framework for architectural-level power analysis and optimizations. In ISCA-27, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. B. Calder, D. Grunwald, and B. Zorn. Quantifying behavioral differences between C and C++ programs. Journal of Programming Languages, 2(4):323--351, 1995.Google ScholarGoogle Scholar
  8. L. Cardelli and P. Wegner. On understanding types, data abstraction, and polymorphism. ACM Computing Surveys, 17(4):471--523, Dec. 1985. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. P. Chang, E. Hao, and Y.N. Patt. Target prediction for indirect jumps. In ISCA, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. R. Cytron et al. Efficiently computing static single assignment form and the control dependence graph. ACM TOPLAS, 13(4):451--490, Oct. 1991. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. L.P. Deutsch and A.M. Schiffman. Efficient implementation of the Smalltalk-80 system. In POPL, 1984. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. K. Driesen and U. Holzle. Accurate indirect branch prediction. In ISCA-25, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. K. Driesen and U. Holzle. Multi-stage cascaded prediction. In Euro-Par, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. M.A. Ertl and D. Gregg. Optimizing indirect branch prediction accuracy in virtual machine interpreters. In PLDI, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. M. Farrens, T. Heil, J.E. Smith, and G. Tyson. Restricted dual path execution. Technical Report CSE-97-18, University of California at Davis, Nov. 1997.Google ScholarGoogle Scholar
  16. S. Gochman, R. Ronen, I. Anati, A. Berkovits, T. Kurts, A. Naveh, A. Saeed, Z. Sperber, and R.C. Valentine. The Intel PentiumM processor: Microarchitecture and performance. Intel Technology Journal, 7(2), May 2003.Google ScholarGoogle Scholar
  17. T. Heil and J.E. Smith. Selective dual path execution. Technical report, University of Wisconsin-Madison, Nov. 1996.Google ScholarGoogle Scholar
  18. G. Hinton, D. Sager, M. Upton, D. Boggs, D. Carmean, A. Kyker, and P. Roussel. The microarchitecture of the Pentium 4 processor. Intel Technology Journal, Feb. 2001.Google ScholarGoogle Scholar
  19. U. Holzle, C. Chambers, and D. Ungar. Optimizing dynamically-typed object-oriented languages with polymorphic inline caches. In ECOOP, 1991. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. U. Holzle and D. Ungar. Optimizing dynamically-dispatched calls with run-time type feedback. In PLDI, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Intel Corp. ICC 9.1 for Linux. http://www.intel.com/cd/software/products/asmo-na/eng/compilers/284264.htm.Google ScholarGoogle Scholar
  22. Intel Corp. Intel Core2 Duo Desktop Processor E6600. http://processorfinder.intel.com/details.aspx?sSpec=SL9ZL.Google ScholarGoogle Scholar
  23. Intel Corp. Intel VTune Performance Analyzers. http://www.intel.com/vtune/.Google ScholarGoogle Scholar
  24. K. Ishizaki, M. Kawahito, T. Yasue, H. Komatsu, and T. Nakatani. A study of devirtualization techniques for a java just-in-time compiler. In OOPSLA-15, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. E. Jacobsen, E. Rotenberg, and J.E. Smith. Assigning confidence to conditional branch predictions. In MICRO-29, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. D. Jimenez and C. Lin. Dynamic branch prediction with perceptrons. In HPCA, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. J.A. Joao, O. Mutlu, H. Kim, and Y.N. Patt. Dynamic predication of indirect jumps. IEEE Computer Architecture Letters, May 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. J. Kalamatianos and D.R. Kaeli. Predicting indirect branches via data compression. In MICRO-31. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. R.E. Kessler. The Alpha 21264 microprocessor. IEEE Micro, 19(2):24--36, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. H. Kim, J.A. Joao, O. Mutlu, C.J. Lee, Y.N. Patt, and R.S. Cohn. VPC Prediction: Reducing the cost of indirect branches via hardware-based dynamic devirtualization. In ISCA-34, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. H. Kim, J.A. Joao, O. Mutlu, and Y.N. Patt. Diverge-merge processor (DMP): Dynamic predicated execution of complex control-flow graphs based on frequently executed paths. In MICRO-39, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. H. Kim, J.A. Joao, O. Mutlu, and Y.N. Patt. Diverge-merge processor: Generalized and energy-efficient dynamic predication. IEEE Micro, 27(1):94--104, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. H. Kim, J.A. Joao, O. Mutlu, and Y.N. Patt. Profile-assisted compiler support for dynamic predication in diverge-merge processors. In CGO-5, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. A. Klauser, T. Austin, D. Grunwald, and B. Calder. Dynamic hammock predication for non-predicated instruction set architectures. In PACT, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. A. Klauser, A. Paithankar, and D. Grunwald. Selective eager execution on the polypath architecture. In ISCA-25, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. B.R. Rau, D.W.L. Yen, W. Yen, and R.A. Towle. The Cydra 5 departmental supercomputer. IEEE Computer, 22:12--35, Jan. 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. E.M. Riseman and C.C. Foster. The inhibition of potential parallelism by conditional jumps. IEEE Transactions on Computers, C-21(12):1405--1411, 1972. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. A. Roth, A. Moshovos, and G.S. Sohi. Improving virtual function call target prediction via dependence-based pre-computation. In ICS-13, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. A. Seznec and P. Michaud. A case for (partially) TAgged GEometric history length branch prediction. JILP, Feb. 2006.Google ScholarGoogle Scholar
  40. E.H. Sussenguth. Instruction Control Sequence. U.S. Patent 3559183, Jan. 26, 1971.Google ScholarGoogle Scholar
  41. D. Tarditi, July 2007. Personal communication.Google ScholarGoogle Scholar
  42. J. Tendler, S. Dodson, S. Fields, H. Le, and B. Sinharoy. POWER4 system microarchitecture. IBM Technical White Paper, Oct. 2001.Google ScholarGoogle Scholar
  43. P.H. Wang, H. Wang, R.M. Kling, K. Ramakrishnan, and J.P. Shen. Register renaming and scheduling for dynamic execution of predicated code. In HPCA-7, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. M. Wolczko. Benchmarking Java with the Richards benchmark. http://research.sun.com/people/mario/java_benchmarking/richards/richards.html.Google ScholarGoogle Scholar

Index Terms

  1. Improving the performance of object-oriented languages with dynamic predication of indirect jumps

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in

        Full Access

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader
        About Cookies On This Site

        We use cookies to ensure that we give you the best experience on our website.

        Learn more

        Got it!