
Enhanced modulo scheduling for loops with conditional branches