
Code transformation and instruction set extension

Published: 24 July 2009

Abstract

The demand for flexible embedded solutions and short time-to-market has led to the development of extensible processors that can be customized through user-defined instruction set extensions (ISEs). ISEs are usually identified from plain C sources. In this article, we propose a combined exploration of code transformations and ISE identification, and we measure the resulting performance on two benchmark suites. Our results demonstrate that combining code transformations with ISEs can yield average performance improvements of 49%. The combination outperforms ISEs applied in isolation and, in extreme cases, yields a speed-up of 2.85.
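As an illustration of why a source-level transformation may matter for ISE identification, consider the following minimal C sketch. It is not taken from the article: the dot-product kernel, the function names `dot_original` and `dot_unrolled`, and the unrolling factor of four are all assumptions chosen for illustration. The sketch assumes an identification pass that searches for candidate instructions within a single basic block, as is common in the ISE literature; under that assumption, unrolling places several multiplies and additions in one loop body, so a larger dataflow subgraph becomes visible to the pass.

```c
/* Hypothetical kernel (not from the article): one multiply and one add
 * per iteration, so the loop body exposes only a very small dataflow
 * graph to a basic-block-level ISE identification pass. */
int dot_original(const int *a, const int *b, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Same kernel after an illustrative unroll-by-4 transformation.
 * The main loop body now contains four multiplies feeding a chain of
 * additions in a single basic block, which is the kind of larger
 * subgraph an ISE identification pass could consider as a candidate
 * custom instruction. The result is unchanged. */
int dot_unrolled(const int *a, const int *b, int n)
{
    int sum = 0;
    int i = 0;

    /* Unrolled main loop: processes four elements per iteration. */
    for (; i + 3 < n; i += 4) {
        sum += a[i]     * b[i]
             + a[i + 1] * b[i + 1]
             + a[i + 2] * b[i + 2]
             + a[i + 3] * b[i + 3];
    }

    /* Remainder loop: handles the last n % 4 elements. */
    for (; i < n; i++)
        sum += a[i] * b[i];

    return sum;
}
```

Both functions compute the same value for any input whose products and sums do not overflow `int`; only the shape of the loop body differs. Whether a given transformation actually helps depends on the target processor and the identification algorithm, which is the trade-off the combined exploration described above is meant to search.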

