Abstract
The demand for flexible embedded solutions and short time-to-market has led to the development of extensible processors that allow for customization through user-defined instruction set extensions (ISEs). These are usually identified from plain C sources. In this article, we propose a combined exploration of code transformations and ISE identification. We measure the performance of this combination on two benchmark suites. Our results demonstrate that combining code transformations with ISEs can yield an average performance improvement of 49%. This outperforms ISEs applied in isolation and, in extreme cases, yields a speed-up of 2.85.
Index Terms
Code transformation and instruction set extension