Exploiting FPGA-Aware Merging of Custom Instructions for Runtime Reconfiguration

Published: 03 September 2014

Abstract

Runtime reconfiguration is a promising way to reduce hardware cost in embedded systems without compromising performance. We present a framework that aims to increase the performance benefits of reconfigurable processors that support full or partial runtime reconfiguration. The proposed framework achieves this by: (1) providing a means for choosing suitable custom instruction selection heuristics, (2) leveraging FPGA-aware merging of custom instructions to maximize the reconfigurable logic block utilization in each configuration, and (3) incorporating a hierarchical loop partitioning strategy to reduce runtime reconfiguration overhead. We show that the performance gain can be improved by employing suitable custom instruction selection heuristics that, in turn, depend on the reconfigurable resource constraints and the merging factor (the extent to which the selected custom instructions can be merged). The hierarchical loop partitioning strategy leads to an average performance gain of over 31% and 46% for full and partial runtime reconfiguration, respectively. Exploiting FPGA-aware merging of custom instructions further increases the performance gain to over 52% and 70% for full and partial runtime reconfiguration, respectively.

  • Published in

    ACM Transactions on Reconfigurable Technology and Systems, Volume 7, Issue 3
    Special Issue on 11th International Conference on Field-Programmable Technology (FPT'12) and Special Issue on the 7th International Workshop on Reconfigurable Communication-Centric Systems-on-Chip (ReCoSoC'12)
    August 2014
    199 pages
    ISSN: 1936-7406
    EISSN: 1936-7414
    DOI: 10.1145/2664590

    Copyright © 2014 ACM

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    • Published: 3 September 2014
    • Revised: 1 November 2013
    • Accepted: 1 November 2013
    • Received: 1 January 2013

    Qualifiers

    • research-article
    • Research
    • Refereed
