Abstract
Increasing the performance of application-specific processors by exploiting application-resident parallelism is often prohibited by cost, especially in low-volume production. The flexibility of horizontal-microcoded machines allows these costs to be reduced, but that flexibility often reduces efficiency. VLIW (very long instruction word) is a new and promising concept for the design of low-cost, high-performance parallel computer systems. We suggest that the VLIW concept can also serve as a basis for the cost-effective design of application-specific processors that must exploit application-resident parallelism.
The SCARCE (SCalable ARChitecture Experiment) framework, an approach for cost-effective design of application-specific processors, provides features that allow the design of retargetable VLIW architectures. However, a retargetable VLIW architecture is only effective if there is a retargetable VLIW compiler. Since the compiler is an essential part of a VLIW architecture, tradeoffs must be made between the variety of VLIW architectures supported and the complexity of the compiler. We suggest that limiting the flexibility of the retargetable VLIW architecture does not necessarily reduce the application space.
This paper discusses the issues related to the design of a retargetable VLIW processor architecture and compiler within the SCARCE framework.
Cost-effective design of application specific VLIW processors using the SCARCE framework
MICRO 22: Proceedings of the 22nd annual workshop on Microprogramming and microarchitecture