Quipu: A Statistical Model for Predicting Hardware Resources

Published: 01 May 2013

Abstract

Heterogeneous architectures are increasingly used to meet the growing demand for computing performance and low-power systems. Executing computation-intensive functions on specialized hardware makes it possible to achieve substantial speedups and power savings. However, for organizations with a large legacy code base and engineers whose expertise lies in software, it is far from obvious how to exploit these new architectures. As a result, comprehensive tool support is needed both to bridge the knowledge gap of many engineers and to retarget legacy code. In this article, we present the Quipu modeling approach, which consists of a set of tools and a modeling methodology for generating hardware estimation models. The information these models provide helps developers focus their efforts, partition their application, and select the right heterogeneous components. We present Quipu's capability to generate domain-specific models that are up to several times more accurate within their particular domain (error: 4.6%) than domain-agnostic models (error: 23%). Finally, we show how Quipu can generate models for a new toolchain and platform within a few days.

