Abstract
For cost-sensitive or memory-constrained embedded systems, code size is at least as important as performance. Consequently, compact code generation has become a major focus of attention within the compiler community. In this paper we develop a pragmatic yet effective code size reduction technique that exploits the structural similarity of functions. It avoids code duplication by merging similar functions and inserting targeted control flow to resolve small differences between them. We have implemented our purely software-based, platform-independent technique in the LLVM compiler framework and evaluated it on the SPEC CPU2006 benchmarks across three target platforms: Intel x86, ARM-based Qualcomm Krait(TM), and Qualcomm Hexagon(TM) DSP. We demonstrate that code size for SPEC CPU2006 can be reduced by more than 550 KB on x86. This corresponds to an overall code size reduction of 4%, and up to 11.5% for individual programs. The overhead introduced by the additional control flow is compensated for by the better I-cache performance of the compacted programs. We also show that both the identification of suitable candidates and the subsequent merging of functions can be implemented efficiently.
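The idea of merging similar functions and resolving small differences with inserted control flow can be illustrated at source level. The following is a minimal sketch, not the paper's implementation: the actual technique operates inside LLVM, and every name here (`sum_of_squares`, `sum_of_doubles`, `sum_merged`, the `use_square` selector, and the thunks) is hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* Two hypothetical functions with identical structure; they differ
 * only in the value each element is multiplied by. */
int sum_of_squares(const int *v, size_t n) {
    int acc = 0;
    for (size_t i = 0; i < n; ++i)
        acc += v[i] * v[i];
    return acc;
}

int sum_of_doubles(const int *v, size_t n) {
    int acc = 0;
    for (size_t i = 0; i < n; ++i)
        acc += v[i] * 2;
    return acc;
}

/* Assumed shape of a merged function: the shared loop exists once, and
 * an extra selector argument (an assumption of this sketch) drives the
 * inserted control flow that resolves the single difference. */
int sum_merged(const int *v, size_t n, int use_square) {
    int acc = 0;
    for (size_t i = 0; i < n; ++i) {
        int factor = use_square ? v[i] : 2; /* the only differing part */
        acc += v[i] * factor;
    }
    return acc;
}

/* Thin wrappers that could stand in for the originals after merging,
 * so existing call sites keep their signatures. */
int sum_of_squares_thunk(const int *v, size_t n) { return sum_merged(v, n, 1); }
int sum_of_doubles_thunk(const int *v, size_t n) { return sum_merged(v, n, 0); }
```

In this sketch the loop body is emitted once instead of twice, at the price of one extra selection per iteration. That is the kind of added control-flow overhead the abstract refers to.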