Abstract
Current FPGA soft processor systems use dedicated hardware modules or accelerators to speed up data-parallel applications. This work explores an alternative approach of using a soft vector processor as a general-purpose accelerator. The approach has the benefits of a purely software-oriented development model, a fixed ISA allowing parallel software and hardware development, a single accelerator that can accelerate multiple applications, and scalable performance from the same source code. With no hardware design experience needed, a software programmer can make area-versus-performance trade-offs by scaling the number of functional units and register file bandwidth with a single parameter. A soft vector processor can be further customized by a number of secondary parameters to add or remove features for a specific application to optimize resource utilization. This article introduces VIPERS, a soft vector processor architecture that maps efficiently into an FPGA and provides a scalable amount of performance for a reasonable amount of area. Compared to a Nios II/s processor, instances of VIPERS with 32 processing lanes achieve up to 44× speedup using up to 26× the area.
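The claim that one source program scales across processor instances with different lane counts can be illustrated with a toy model. The sketch below is not VIPERS code and does not use its ISA: the function name `vector_add`, the maximum vector length of 64, and the idealized cost of one element per lane per cycle are all assumptions made only for illustration.

```python
import math

def vector_add(a, b, lanes, mvl=64):
    """Element-wise add, strip-mined into chunks of at most `mvl` elements.

    Returns (result, cycles). The cycle count uses an idealized model
    (an assumption, not a measurement): each lane handles one element
    per cycle, so a chunk of n elements takes ceil(n / lanes) cycles.
    """
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    result = []
    cycles = 0
    for start in range(0, len(a), mvl):
        chunk_a = a[start:start + mvl]
        chunk_b = b[start:start + mvl]
        result.extend(x + y for x, y in zip(chunk_a, chunk_b))
        cycles += math.ceil(len(chunk_a) / lanes)
    return result, cycles

# The same "program" runs unchanged; only the lane count differs.
data_a = list(range(256))
data_b = [2 * x for x in data_a]
baseline, cycles_1 = vector_add(data_a, data_b, lanes=1)

for lanes in (4, 8, 16, 32):
    out, cycles = vector_add(data_a, data_b, lanes=lanes)
    assert out == baseline  # results are identical at every lane count
    print(f"lanes={lanes:2d} cycles={cycles:3d} ideal speedup={cycles_1 / cycles:.1f}x")
```

In this model the lane count is the only knob that changes, which mirrors the single-parameter area-versus-performance trade-off described above. Real speedups depend on memory bandwidth, control overhead, and the scalar baseline, none of which the sketch models.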
Vector Processing as a Soft Processor Accelerator