Abstract
The overhead in code size, power consumption, and execution time caused by precompiled libraries and separate compilation is often unacceptable in the embedded world, where real-time constraints, battery lifetime, and production costs are of critical importance. In this paper, we present our link-time optimizer for the ARM architecture. We discuss how we deal with the peculiarities of the ARM architecture related to its visible program counter and how the introduced overhead can largely be eliminated. Our link-time optimizer is evaluated with four tool chains: two proprietary ones from ARM and two open ones based on GNU GCC. When used with the proprietary tool chains from ARM Ltd., our link-time optimizer achieved average code size reductions of 16.0 and 18.5%, while the programs became 12.8 and 12.3% faster and 10.7 and 10.1% more energy efficient. Finally, we show how the incorporation of link-time optimization in tool chains may influence library interface design.
Index Terms
Link-time compaction and optimization of ARM executables