Abstract
Machine-code synthesis is the problem of searching for an instruction sequence that implements a semantic specification, given as a formula in quantifier-free bit-vector logic (QFBV). Instruction sets like Intel's IA-32 have around 43,000 unique instruction schemas. This huge instruction pool, combined with the exponential cost inherent in enumerative synthesis, results in an enormous search space for a machine-code synthesizer: even for relatively small specifications, the synthesizer might take several hours or days to find an implementation. In this paper, we present several improvements to the algorithms used in McSynth, a state-of-the-art machine-code synthesizer. In addition to a novel pruning heuristic, our improvements incorporate a number of ideas known from the literature, which we adapt in novel ways for the purpose of speeding up machine-code synthesis. Our experiments on Intel's IA-32 instruction set show that our improvements enable synthesis of code for 12 of the 14 formulas on which McSynth times out, a speedup of at least 1981X, and speed up synthesis by 3X for the remaining formulas.
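To make the search-space problem concrete, the following is a minimal sketch of naive enumerative synthesis. It is not McSynth's algorithm. The 8-bit machine, the seven-entry instruction pool `POOL`, and the helper names `run`, `synthesize`, and `search_space` are all hypothetical and chosen for illustration, and candidate sequences are checked by exhaustive testing over all inputs rather than by a QFBV solver.

```python
from itertools import product

MASK = 0xFF  # toy 8-bit machine (assumption); IA-32 registers are 32 bits wide

# Hypothetical instruction pool: each entry maps the state (a, b) to a new state.
POOL = {
    "add a,b": lambda a, b: ((a + b) & MASK, b),
    "sub a,b": lambda a, b: ((a - b) & MASK, b),
    "xor a,b": lambda a, b: (a ^ b, b),
    "not a":   lambda a, b: (~a & MASK, b),
    "neg a":   lambda a, b: (-a & MASK, b),
    "inc a":   lambda a, b: ((a + 1) & MASK, b),
    "mov b,a": lambda a, b: (a, a),
}

def run(seq, a, b):
    """Execute an instruction sequence on the state (a, b)."""
    for name in seq:
        a, b = POOL[name](a, b)
    return a, b

def synthesize(spec, max_len=3):
    """Enumerate sequences by increasing length; return the first that
    agrees with spec on every 8-bit input, plus the candidates tried."""
    inputs = list(product(range(MASK + 1), repeat=2))
    tried = 0
    for length in range(1, max_len + 1):
        for seq in product(POOL, repeat=length):
            tried += 1
            if all(run(seq, a, b) == spec(a, b) for a, b in inputs):
                return list(seq), tried
    return None, tried

def search_space(pool_size, max_len):
    """Number of candidate sequences of length 1..max_len."""
    return sum(pool_size ** k for k in range(1, max_len + 1))

# Specification: a' = b - a (mod 256), b unchanged.
spec = lambda a, b: ((b - a) & MASK, b)
seq, tried = synthesize(spec)
print(seq, "found after", tried, "candidates")
print("candidates up to length 3 with 43,000 schemas:", search_space(43_000, 3))
```

With seven instructions the search finishes almost immediately. The last line shows how the candidate count grows with the pool size reported in the abstract, which is why pruning matters for a real instruction set.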
Speeding up machine-code synthesis
OOPSLA 2016: Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications