Abstract
In the field of high-performance computing, systems harboring reconfigurable devices, such as field-programmable gate arrays (FPGAs), are gaining widespread interest. Such systems range from supercomputers with tightly coupled reconfigurable hardware to clusters with reconfigurable devices at each node. The use of these architectures for scientific computing provides an alternative for computationally demanding problems and has advantages in metrics such as operating cost/performance and power/performance. However, performance optimization of these systems can be challenging even with knowledge of the system’s characteristics. Our analytical performance model includes parameters representing the reconfigurable hardware, application load imbalance across the nodes, background user load, basic message-passing communication, and processor heterogeneity. In this article, we provide an overview of the analytical model and demonstrate its application for optimization and scheduling of high-performance reconfigurable computing (HPRC) resources. We examine cost functions for minimum runtime and other optimization problems commonly found in shared computing resources. Finally, we discuss additional scheduling issues and other potential applications of the model.
Optimization of Shared High-Performance Reconfigurable Computing Resources