Abstract
Developing a code optimizer is challenging, especially for new, idiosyncratic ISAs. Superoptimization can, in principle, discover machine-specific optimizations automatically by searching the space of all instruction sequences. If we can increase the size of the code fragments a superoptimizer can optimize, we will be able to discover more optimizations. We develop LENS, a search algorithm that increases the size of code a superoptimizer can synthesize by rapidly pruning away invalid candidate programs. Pruning is achieved by selectively refining the abstraction under which candidates are considered equivalent, only in the promising part of the candidate space. LENS also uses a bidirectional search strategy to prune the candidate space from both the forward and backward directions. These pruning strategies allow LENS to solve twice as many benchmarks as existing enumerative search algorithms and to run about 11 times faster.
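The pruning idea can be illustrated with a toy sketch. The code below is not LENS: it searches forward only, uses a hypothetical single-register 8-bit instruction set, and refines the abstraction globally by restarting with one more test input instead of refining selectively. It only shows how candidates that agree on the current test inputs can be merged into one equivalence class, and how a counterexample from verification makes that abstraction finer.

```python
MASK = 0xFF  # toy 8-bit machine with a single register

# Hypothetical instruction set for illustration; not taken from the paper.
OPS = {
    "inc": lambda x: (x + 1) & MASK,
    "dec": lambda x: (x - 1) & MASK,
    "not": lambda x: ~x & MASK,
    "neg": lambda x: -x & MASK,
    "shl": lambda x: (x << 1) & MASK,
    "shr": lambda x: x >> 1,
}

def run(prog, x):
    """Execute a sequence of instruction names on input x."""
    for name in prog:
        x = OPS[name](x)
    return x

def search(spec, tests, max_len):
    """Breadth-first enumeration. Two candidates are treated as equivalent
    when they produce the same outputs on `tests`; only one is kept."""
    goal = tuple(spec(t) for t in tests)
    start = tuple(tests)
    if start == goal:
        return ()
    frontier = {start: ()}   # outputs on tests -> one representative program
    seen = {start}
    for _ in range(max_len):
        next_frontier = {}
        for state, prog in frontier.items():
            for name, op in OPS.items():
                new_state = tuple(op(v) for v in state)
                if new_state in seen:
                    continue  # pruned: an equivalent candidate already exists
                seen.add(new_state)
                next_frontier[new_state] = prog + (name,)
                if new_state == goal:
                    return prog + (name,)
        frontier = next_frontier
    return None

def superoptimize(spec, max_len=4):
    """Search, verify exhaustively, and refine the test set on failure."""
    tests = [0]
    while True:
        prog = search(spec, tests, max_len)
        if prog is None:
            return None
        # Exhaustive check stands in for a solver-based equivalence check.
        cex = next((x for x in range(MASK + 1) if run(prog, x) != spec(x)), None)
        if cex is None:
            return prog
        tests.append(cex)  # refinement: a finer notion of equivalence

if __name__ == "__main__":
    print(superoptimize(lambda x: (x + 2) & MASK))
```

With few test inputs the abstraction is coarse and prunes aggressively, at the cost of occasionally returning a candidate that fails verification; each counterexample splits the equivalence classes further.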
Additionally, we increase the effective size of the superoptimized fragments by relaxing the correctness condition using contexts (surrounding code). Finally, we combine LENS with complementary search techniques into a cooperative superoptimizer, which exploits stochastic search to make random jumps in a large candidate space and symbolic (SAT-solver-based) search to synthesize arbitrary constants. While existing superoptimizers consistently solve 9 to 16 out of 32 benchmarks, the cooperative superoptimizer solves 29. On code fragments from WiBench and MiBench, it can synthesize code that is up to 82% faster than the code generated by gcc -O3.
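One way to read the relaxed correctness condition is sketched below. This is an assumption about what "using contexts" could mean, not the paper's formulation: the preceding code is modeled as a precondition on the fragment's input, and the following code as a mask of the output bits it actually reads. The names `equivalent_in_context`, `precondition`, and `live_mask` are hypothetical.

```python
MASK = 0xFF  # toy 8-bit values

def equivalent_in_context(spec, cand, precondition=lambda x: True, live_mask=MASK):
    """True if cand matches spec on every input the context allows,
    comparing only the output bits that the following code reads."""
    return all(
        (spec(x) ^ cand(x)) & live_mask == 0
        for x in range(MASK + 1)
        if precondition(x)
    )

# Original fragment: clear the lowest bit with a shift right, then left.
spec = lambda x: ((x >> 1) << 1) & MASK
# Candidate replacement: the empty program.
cand = lambda x: x

def is_even(x):
    return x % 2 == 0

if __name__ == "__main__":
    print(equivalent_in_context(spec, cand))                        # no context
    print(equivalent_in_context(spec, cand, precondition=is_even))  # prefix guarantees even input
    print(equivalent_in_context(spec, cand, live_mask=0xFE))        # suffix ignores bit 0
```

Without context the empty program is rejected, but it is accepted once the surrounding code either guarantees an even input or never reads the lowest bit. This is how a context can admit shorter replacements than strict equivalence would.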
Scaling up Superoptimization
ASPLOS '16: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems