Abstract
Combining is a local compiler optimization technique that can enhance the performance of global compaction techniques for VLIW machines. Given two adjacent operations of a certain class that are flow (read-after-write) dependent and therefore cannot be placed in the same micro-instruction, combining transforms them so that the modified operations have no dependence. The transformed operations can then be executed in the same micro-instruction, reducing the total execution time of the program. In this paper, combining a pair of flow-dependent operations into a wide instruction word is proposed as an important compilation technique for VLIW architectures. Combining is particularly effective with software pipelining and loop unrolling, since these techniques make it more likely that combinable operations end up adjacent. We implemented combining in our parallelizing compiler for the wide instruction word architecture now being built at the IBM T. J. Watson Research Center. Compared with compaction techniques that do not use combining, it yields a ten percent speedup on the Stanford integer benchmarks and other sequential-natured C programs. For a class of inner loops, combining can remove the inter-iteration dependences completely and can improve performance in proportion to the factor by which the loop is unrolled.
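The abstract does not say which class of operations qualifies. As an illustrative assumption only, the sketch below takes a pair of add-immediate operations (`dest = src + imm`) and folds the two constants so that the second operation reads the first operation's source register instead of its result. The names `AddImm`, `combine`, `run_sequential`, and `run_parallel` are hypothetical, and `run_parallel` assumes that every operation in a wide instruction reads register values before any of them is written.

```python
from typing import NamedTuple, Optional


class AddImm(NamedTuple):
    """Hypothetical operation model: dest = src + imm."""
    dest: str
    src: str
    imm: int


def combine(first: AddImm, second: AddImm) -> Optional[AddImm]:
    """Rewrite `second` so it no longer reads first.dest; None if not applicable."""
    if second.src != first.dest:
        return None  # no flow dependence between the two operations
    if second.dest == first.dest:
        return None  # both would write the same register in one instruction
    # second.dest = (first.src + first.imm) + second.imm, with the constants folded
    return AddImm(second.dest, first.src, first.imm + second.imm)


def run_sequential(ops, regs):
    """Execute operations one after another."""
    regs = dict(regs)
    for op in ops:
        regs[op.dest] = regs[op.src] + op.imm
    return regs


def run_parallel(ops, regs):
    """Execute operations as one wide instruction: read old values, then write."""
    new_regs = dict(regs)
    for op in ops:
        new_regs[op.dest] = regs[op.src] + op.imm
    return new_regs


first = AddImm("r1", "r2", 4)    # r1 = r2 + 4
second = AddImm("r3", "r1", 8)   # r3 = r1 + 8, flow dependent on `first`
combined = combine(first, second)  # r3 = r2 + 12, independent of `first`

before = {"r1": 0, "r2": 10, "r3": 0}
sequential_result = run_sequential([first, second], before)
parallel_result = run_parallel([first, combined], before)
```

In this sketch the original pair needs two steps, while `first` and `combined` can share one wide instruction and leave the registers in the same final state.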
“Combining” as a compilation technique for VLIW architectures
MICRO 22: Proceedings of the 22nd Annual Workshop on Microprogramming and Microarchitecture