Statistical Performance Modeling in Functional Instruction Set Simulators

Published: 01 June 2012

Abstract

Despite recent progress in improving the speed of instruction-accurate simulators, cycle-accurate simulation is still prohibitively slow for all but the most basic programs. In this article, we present a statistical machine learning approach to performance estimation in fast, instruction-accurate simulators and evaluate our methodology comprehensively against three popular embedded RISC processors and about 300 embedded applications. We show that our methodology provides accurate performance estimates with an average error of less than 3.9% while, on average, operating ≈ 14.5 times faster than cycle-accurate simulation.
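
The abstract does not say which statistical model or input features are used. As a rough illustration of the general idea only, the sketch below assumes a plain linear least-squares model that maps per-class instruction counts, of the kind a functional simulator can report, to cycle counts. The instruction classes, the training numbers, and the helper names (`solve`, `fit`, `predict`, `mean_abs_pct_error`) are all hypothetical and are not taken from the article.

```python
# Illustrative sketch only. The linear model, the three instruction classes,
# and all numbers below are assumptions, not the article's method or data.

def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(counts, cycles):
    """Least-squares weights from the normal equations (X^T X) w = X^T y."""
    k = len(counts[0])
    xtx = [[sum(row[i] * row[j] for row in counts) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(counts, cycles)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights, row):
    """Estimated cycle count for one program's instruction counts."""
    return sum(w * x for w, x in zip(weights, row))


def mean_abs_pct_error(predicted, actual):
    """Average relative error, in percent, between estimates and reference cycle counts."""
    return 100.0 * sum(abs(p - a) / a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical training programs: [ALU ops, memory ops, taken branches].
train_counts = [
    [1000, 200, 50],
    [500, 400, 100],
    [2000, 100, 300],
    [800, 800, 20],
]
# Synthetic reference cycle counts, standing in for cycle-accurate results.
train_cycles = [1800, 2100, 3500, 3280]

weights = fit(train_counts, train_cycles)
estimate = predict(weights, [1200, 300, 60])
print("estimated cycles:", round(estimate))
```

In a setting like the one the abstract describes, the reference cycle counts would come from cycle-accurate simulation of training programs, and the fitted model would then be evaluated on counters gathered during much faster instruction-accurate runs.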
