Abstract
Since its introduction by Joseph A. Fisher in 1979, trace scheduling has influenced much of the work on compile-time instruction-level parallelism (ILP) transformations. Initially developed for use in microcode compaction, it quickly became the main technique for machine-level compile-time parallelism exploitation. Although it has been used since the 1980s in many state-of-the-art compilers (e.g., Intel, Fujitsu, HP), a rigorous theory of trace scheduling is still lacking in the existing literature. This is reflected in the ad hoc way compensation code is inserted after a trace compaction, in the absence of any attempt to measure the size of that compensation code, and so on.
The aim of this article is to create a mathematical theory of the foundation of trace scheduling. We give a clear algorithm showing how to insert compensation code after a trace is replaced with its schedule, and then prove that the resulting program is indeed equivalent to the original program. We derive an upper bound on the size of that compensation code, and show that this bound can actually be attained. We also give a very simple proof that the trace scheduling algorithm always terminates.
References
- Banerjee, U. 1997. Dependence Analysis. Kluwer Academic Publishers, Norwell, MA.
- Ellis, J. R. 1985. Bulldog: A compiler for VLIW architecture. Ph.D. thesis, Tech. rep. YALEU/DCS/RR364, Department of Computer Science, Yale University.
- Fisher, J. A. 1979. The optimization of horizontal microcode within and beyond basic blocks: An application of processor scheduling with resources. Tech. rep. COO-3077-161, Courant Mathematics and Computing Laboratory, New York University.
- Fisher, J. A. 1981. Trace scheduling: A technique for global microcode compaction. IEEE Trans. Comput. C-30, 7, 478--490.
- Hwu, W. W., Mahlke, S. A., Chen, W. Y., Chang, P. P., Warter, N. J., Bringmann, R. A., Ouellette, R. G., Hank, R. E., Kiyohara, T., Haab, G. E., Holm, J. G., and Lavery, D. M. 1993. The superblock: An effective technique for VLIW and superscalar compilation. J. Supercomput. 7, 1--2, 229--248.
- Landskov, D., Davidson, S., Shriver, B., and Mallett, P. W. 1980. Local microcode compaction techniques. ACM Comput. Surv. 12, 3, 261--294.
- Lowney, P. G., Freudenberger, S. M., Karzes, T. J., Lichtenstein, W. D., Nix, R. P., O'Donnell, J. S., and Ruttenberg, J. C. 1993. The multiflow trace scheduling compiler. J. Supercomput. 7, 1--2, 51--142.
- Nicolau, A. 1985. Parallelism, memory anti-aliasing and correctness for trace-scheduling compilers. Tech. rep. YALE/DCS/RR-374, Department of Computer Science, Yale University.
Index Terms
Mathematical foundation of trace scheduling