
Reconfiguration and Communication-Aware Task Scheduling for High-Performance Reconfigurable Computing

Published: 01 November 2010

Abstract

High-performance reconfigurable computing accelerates significant portions of an application using reconfigurable hardware. When the hardware tasks of an application cannot all fit in an FPGA at the same time, the task graph must be partitioned and scheduled into multiple FPGA configurations in a way that minimizes the total execution time. This article proposes the Reduced Data Movement Scheduling (RDMS) algorithm, which aims to improve the overall performance of hardware tasks by taking into account reconfiguration time, data dependencies between tasks, intertask communication, and task resource utilization. The proposed algorithm is based on dynamic programming. A mathematical analysis shows that, in the worst case, the execution time of the resulting schedule exceeds that of the optimal solution by a factor of at most about 1.6. Simulations on randomly generated task graphs indicate that the RDMS algorithm reduces interconfiguration communication time by 11% and 44%, respectively, compared with two other approaches that consider only data dependency and only hardware resource utilization. The practicality and efficiency of the proposed algorithm relative to these approaches are demonstrated by simulating a task graph from a real-life application, N-body simulation, under bandwidth and FPGA constraints taken from existing high-performance reconfigurable computers. Experiments on the SRC-6 are carried out to validate the approach.
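
The abstract does not spell out the RDMS algorithm, so the following Python sketch is not RDMS itself. It only illustrates the general idea described above: a task graph is split into successive FPGA configurations, and a knapsack-style dynamic program chooses which ready tasks to pack into each one. The names `pack_configuration` and `partition`, the integer area units, and the rule that a task may only follow its predecessors in a later configuration are all assumptions made for this illustration. Reconfiguration time and communication volume, which RDMS does consider, are deliberately left out.

```python
from typing import Dict, List, Set, Tuple


def pack_configuration(items: List[Tuple[str, int]], capacity: int) -> List[str]:
    """0/1 knapsack by dynamic programming (hypothetical helper).

    items are (task name, area); the value of a task is its area, so the
    result is a subset of tasks that fills the FPGA as fully as possible.
    """
    # best[c] = (area used, chosen task names) with at most c area units
    best: List[Tuple[int, List[str]]] = [(0, [])] * (capacity + 1)
    for name, area in items:
        # iterate capacities downwards so each task is used at most once
        for c in range(capacity, area - 1, -1):
            candidate = best[c - area][0] + area
            if candidate > best[c][0]:
                best[c] = (candidate, best[c - area][1] + [name])
    return best[capacity][1]


def partition(areas: Dict[str, int],
              deps: List[Tuple[str, str]],
              capacity: int) -> List[List[str]]:
    """Split a task DAG into an ordered list of FPGA configurations.

    Assumption of this sketch: a task becomes ready only after all of its
    predecessors were placed in an earlier configuration.
    """
    preds: Dict[str, Set[str]] = {task: set() for task in areas}
    for src, dst in deps:
        preds[dst].add(src)

    done: Set[str] = set()
    configurations: List[List[str]] = []
    while len(done) < len(areas):
        ready = [t for t in areas if t not in done and preds[t] <= done]
        chosen = pack_configuration([(t, areas[t]) for t in ready], capacity)
        if not chosen:
            raise ValueError("no schedulable task: oversized task or cyclic graph")
        configurations.append(chosen)
        done.update(chosen)
    return configurations


if __name__ == "__main__":
    # Made-up example data: four tasks, two dependencies, capacity of 7 units.
    example_areas = {"A": 4, "B": 3, "C": 3, "D": 5}
    example_deps = [("A", "C"), ("B", "D")]
    print(partition(example_areas, example_deps, 7))
    # prints [['A', 'B'], ['D'], ['C']]
```

In the example, A and B together fill the device exactly, so they form the first configuration; C and D then compete for the second one, and the larger task D is chosen because it leaves less area unused. A scheduler in the spirit of the abstract would additionally weigh the data exchanged between configurations and the cost of each reconfiguration when making these choices.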



        • Published in

          ACM Transactions on Reconfigurable Technology and Systems, Volume 3, Issue 4
          November 2010
          240 pages
          ISSN: 1936-7406
          EISSN: 1936-7414
          DOI: 10.1145/1862648

          Copyright © 2010 ACM

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 1 November 2010
          • Accepted: 1 August 2009
          • Revised: 1 July 2009
          • Received: 1 March 2009


          Qualifiers

          • research-article
          • Research
          • Refereed
