Hierarchical Execution to Speed Up Pipeline Interlock in Mainframe Computers

Published: 01 May 1996
Abstract

This paper introduces a methodology, called hierarchical execution, that reduces stalls caused by pipeline interlocks such as data and control dependencies. Because a large body of mainframe software exists only as object code, it is important to improve performance without recompiling that code for optimization. Our methodology adds a simple pre-ALU that operates asynchronously and generates results with shorter latency than the main ALU, which reduces overhead especially for address generation interlocks and branch instructions. This method was implemented in Hitachi's mainframe processors, the M-680 and M-880. In the M-680, the pre-ALU, together with the instruction decoder, processes instructions in superpipelined fashion, which further improves performance. The aggregate improvement in CPU time from hierarchical execution, across the evaluated benchmarks, is 10% on average, with only a 1.6% increase in hardware. We can therefore roughly say that the hierarchical execution method improved cost performance by 8%.
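The closing cost-performance figure can be checked with a line of arithmetic. The sketch below is a minimal illustration, not the authors' calculation: it assumes the 10% figure is a performance ratio of 1.10 and that cost scales directly with the amount of hardware (ratio 1.016). The function name `cost_performance_gain` is hypothetical.

```python
def cost_performance_gain(perf_gain: float, hw_increase: float) -> float:
    """Relative gain in performance per unit of hardware.

    Assumption (not stated in the abstract): performance scales by
    (1 + perf_gain) and cost scales by (1 + hw_increase).
    """
    return (1.0 + perf_gain) / (1.0 + hw_increase) - 1.0


# Figures quoted in the abstract: 10% improvement, 1.6% more hardware.
gain = cost_performance_gain(0.10, 0.016)
print(f"cost-performance gain: {gain:.1%}")
```

Under these assumptions the ratio is 1.10 / 1.016, or roughly 1.08, which is consistent with the abstract's "roughly 8%". If the 10% is instead read as a reduction in CPU time, the speedup would be 1 / 0.9 and the result would be somewhat higher.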
