TBES: Template-Based Exploration and Synthesis of Heterogeneous Multiprocessor Architectures on FPGA

Published: 13 January 2016

Abstract

This article describes TBES, an end-to-end software environment for synthesizing multitask applications on FPGAs. The implementation follows a template-based approach for creating heterogeneous multiprocessor architectures. Heterogeneity stems from the use of general-purpose processors along with custom accelerators. Experimental results demonstrate substantial speedup for several classes of applications.

Furthermore, this work reduces development costs and saves development time for the software architect, the domain expert, and the optimization expert. It provides a framework that brings together various existing tools and optimization algorithms. The advantages are manifold: modularity and flexibility, easy customization for best-fit algorithm selection, durability and evolution over time, and legacy preservation, including domain experts' know-how.

In addition to the use of architecture templates for the overall system, a second contribution lies in using high-level synthesis for promoting exploration of hardware IPs. The domain expert, who best knows which tasks are good candidates for hardware implementation, selects parts of the initial application to be potentially synthesized as dedicated accelerators. As a consequence, the HLS general problem turns into a constrained and more tractable issue, and automation capabilities eliminate the need for tedious and error-prone manual processes during domain space exploration.

The automation only takes place once the application has been broken down into concurrent tasks by the designer, who can then drive the synthesis process with a set of parameters provided by TBES to balance tradeoffs between optimization efforts and quality of results.

The approach is demonstrated step by step, up to FPGA implementation and execution, with an MJPEG benchmark and a complex Viola-Jones face detection application. We show that TBES achieves speedups of up to 10 times, reduces development time, and widens design space exploration.
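
To make the exploration idea concrete, the following is a minimal sketch, not the TBES implementation. It assumes the designer has already split the application into tasks and marked some as hardware candidates, then enumerates software/hardware mappings under an area budget. The task names, timings, area figures, and the `explore` function are all hypothetical, and summing task times is a deliberate simplification that ignores the concurrency a real multiprocessor platform would exploit.

```python
from itertools import product

# Hypothetical per-task costs: (software time in ms, hardware time in ms, area in LUTs).
# Illustrative numbers only; none are taken from the article.
# A hardware time of None means the task is not a hardware candidate.
TASKS = {
    "fetch": (4.0, None, 0),
    "idct": (20.0, 3.0, 1500),
    "color": (10.0, 2.0, 900),
    "write": (5.0, None, 0),
}


def explore(tasks, area_budget):
    """Try every software/hardware choice for the candidate tasks and return
    (total_time, area, hw_set) for the fastest mapping that fits the budget."""
    candidates = [name for name, (_, hw, _) in tasks.items() if hw is not None]
    best = None
    for choice in product([False, True], repeat=len(candidates)):
        hw_set = {n for n, use_hw in zip(candidates, choice) if use_hw}
        area = sum(tasks[n][2] for n in hw_set)
        if area > area_budget:
            continue  # this mapping does not fit on the device
        # Simplified cost model: tasks run one after another.
        total = sum(tasks[n][1] if n in hw_set else tasks[n][0] for n in tasks)
        if best is None or total < best[0]:
            best = (total, area, hw_set)
    return best
```

Calling `explore(TASKS, 2000)` selects only the `idct` accelerator, since both accelerators together exceed the budget; raising the budget to 3000 admits both. Exhaustive enumeration is only practical because the expert-selected candidate set is small, which mirrors how restricting the candidates makes the problem more tractable.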

