Abstract
High-performance, low-power VLIW DSP processors are increasingly deployed on embedded devices to process video and multimedia applications. To reduce power and cost in VLIW DSP designs, distributed register files and multi-bank register architectures are being adopted to reduce the number of read/write ports in register files. This presents new challenges for devising compiler optimization schemes for such architectures. In this paper, we address compiler optimization issues for the PAC architecture, a 5-way issue DSP processor with distributed register files. We present an integrated flow that addresses several phases of compiler optimization that interact with distributed and multi-bank register files: instruction scheduling, software pipelining, and data flow optimizations. Our experiments on a novel 32-bit embedded VLIW DSP (known as the PAC DSP core) show that incorporating the proposed schemes in the compiler yields state-of-the-art performance for embedded VLIW DSP processors with distributed register files.
Enabling compiler flow for embedded VLIW DSP processors with distributed register files
LCTES '07: Proceedings of the 2007 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems