
Optimization of Shared High-Performance Reconfigurable Computing Resources

Published: 01 July 2012

Abstract

In the field of high-performance computing, systems harboring reconfigurable devices, such as field-programmable gate arrays (FPGAs), are gaining more widespread interest. Such systems range from supercomputers with tightly coupled reconfigurable hardware to clusters with reconfigurable devices at each node. The use of these architectures for scientific computing provides an alternative for computationally demanding problems and has advantages in metrics, such as operating cost/performance and power/performance. However, performance optimization of these systems can be challenging even with knowledge of the system’s characteristics. Our analytic performance model includes parameters representing the reconfigurable hardware, application load imbalance across the nodes, background user load, basic message-passing communication, and processor heterogeneity. In this article, we provide an overview of the analytical model and demonstrate its application for optimization and scheduling of high-performance reconfigurable computing (HPRC) resources. We examine cost functions for minimum runtime and other optimization problems commonly found in shared computing resources. Finally, we discuss additional scheduling issues and other potential applications of the model.

