research-article

Vector Processing as a Soft Processor Accelerator

Published: 01 June 2009

Abstract

Current FPGA soft processor systems use dedicated hardware modules or accelerators to speed up data-parallel applications. This work explores an alternative approach of using a soft vector processor as a general-purpose accelerator. The approach has the benefits of a purely software-oriented development model, a fixed ISA allowing parallel software and hardware development, a single accelerator that can accelerate multiple applications, and scalable performance from the same source code. With no hardware design experience needed, a software programmer can make area-versus-performance trade-offs by scaling the number of functional units and register file bandwidth with a single parameter. A soft vector processor can be further customized by a number of secondary parameters to add or remove features for a specific application to optimize resource utilization. This article introduces VIPERS, a soft vector processor architecture that maps efficiently into an FPGA and provides a scalable amount of performance for a reasonable amount of area. Compared to a Nios II/s processor, instances of VIPERS with 32 processing lanes achieve up to 44× speedup using up to 26× the area.
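The claim that the same source code scales with a single lane-count parameter can be illustrated with a toy model. The sketch below is not VIPERS code and does not use its ISA. The function name `strip_mined_add`, the `elems_per_lane` parameter, and the cost model (one vector instruction over `vl` elements takes `ceil(vl / lanes)` cycles, with no memory stalls) are all assumptions made purely for illustration.

```python
from math import ceil


def strip_mined_add(a, b, lanes, elems_per_lane=4):
    """Element-wise add written once, independent of the lane count.

    Hypothetical model: the maximum vector length (MVL) is assumed to be
    lanes * elems_per_lane, and each vector instruction over vl elements
    is assumed to occupy ceil(vl / lanes) cycles. Neither figure comes
    from the article.
    """
    mvl = lanes * elems_per_lane
    out, cycles = [], 0
    # Strip-mining: process the arrays in chunks of at most MVL elements.
    for start in range(0, len(a), mvl):
        vl = min(mvl, len(a) - start)  # the last strip may be shorter
        out.extend(x + y for x, y in zip(a[start:start + vl],
                                         b[start:start + vl]))
        cycles += ceil(vl / lanes)
    return out, cycles


if __name__ == "__main__":
    a = list(range(1024))
    b = [2 * x for x in a]
    for lanes in (1, 4, 32):
        _, cycles = strip_mined_add(a, b, lanes)
        print(f"lanes={lanes:2d} idealized cycles={cycles}")
```

Only the `lanes` argument changes between runs; the loop body stays the same. A real soft vector processor would also be limited by memory bandwidth and control overhead, which this model ignores.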
