Abstract
Heterogeneous architectures are increasingly used to meet the growing demand for computing performance and low-power systems. Executing computation-intensive functions on specialized hardware can yield substantial speedups and power savings. However, given a large legacy code base and engineers whose expertise lies in software rather than hardware, it is far from obvious how to make use of these new architectures. As a result, comprehensive tool support is needed, both to bridge the knowledge gap of many engineers and to retarget legacy code. In this article, we present the Quipu modeling approach, a set of tools and a modeling methodology for generating hardware estimation models that provide valuable information to developers. This information helps them focus their efforts, partition their applications, and select the right heterogeneous components. We demonstrate Quipu’s capability to generate domain-specific models that are up to several times more accurate within their particular domain (error: 4.6%) than domain-agnostic models (error: 23%). Finally, we show how Quipu can generate models for a new toolchain and platform within a few days.
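As a rough illustration of what a statistical hardware estimation model can look like, the sketch below fits a one-variable least-squares line that predicts a hardware resource count from a software metric and reports the mean absolute percentage error. It is not Quipu's actual model: the choice of metric (operator count), the slice counts, and all function names are hypothetical and exist only for this example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new metric value."""
    slope, intercept = model
    return slope * x + intercept


def mean_abs_pct_error(actual, predicted):
    """Mean absolute percentage error, in percent."""
    total = sum(abs(p - a) / a for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # Made-up training data: operator count per kernel and the FPGA slices
    # a synthesis run is assumed to have reported for it.
    operator_counts = [10, 20, 30, 40]
    slices_used = [130, 210, 330, 410]

    model = fit_line(operator_counts, slices_used)
    fitted = [predict(model, x) for x in operator_counts]

    print("slope, intercept:", model)
    print("error on training data (%):", mean_abs_pct_error(slices_used, fitted))
    print("predicted slices for 50 operators:", predict(model, 50))
```

A model of this kind gives a developer a resource estimate from source-level metrics alone, without running synthesis for every candidate kernel. The error figures quoted in the abstract come from the article's own evaluation, not from this toy data.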
Index Terms
Quipu: A Statistical Model for Predicting Hardware Resources