research-article

Automated synthesis of symbolic instruction encodings from I/O samples

Published: 11 June 2012

Abstract

Symbolic execution is a key component of precise binary program analysis tools. We discuss how to automatically bootstrap the construction of a symbolic execution engine for a processor instruction set such as x86, x64 or ARM. We show how to automatically synthesize symbolic representations of individual processor instructions from input/output examples and express them as bit-vector constraints. We present and compare various synthesis algorithms and instruction sampling strategies. We introduce a new synthesis algorithm based on smart sampling, which we show is one to two orders of magnitude faster than previous synthesis algorithms in our context. With this new algorithm, we can automatically synthesize bit-vector circuits for over 500 x86 instructions (8/16/32 bits, outputs, EFLAGS) using only 6 synthesis templates, in less than two hours, using the Z3 SMT solver on a regular machine. During this work, we also discovered several inconsistencies across x86 processors, errors in the Intel x86 specification, and several bugs in previous manually written x86 instruction handlers.
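To make the idea of synthesizing an instruction's semantics from input/output examples concrete, here is a minimal Python sketch. It is not the algorithm from the paper: it assumes a tiny hand-written library of candidate 8-bit operations in place of the paper's synthesis templates, uses exhaustive search in place of an SMT solver, and replaces execution on real hardware with a hypothetical `oracle` function. All names (`CANDIDATES`, `oracle`, `consistent`, `synthesize_with_refinement`) are illustrative assumptions.

```python
MASK = 0xFF  # model 8-bit registers

# Hypothetical candidate library standing in for synthesis templates.
CANDIDATES = {
    "add": lambda a, b: (a + b) & MASK,
    "sub": lambda a, b: (a - b) & MASK,
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "xor": lambda a, b: a ^ b,
}


def oracle(a, b):
    """Stand-in for executing the real instruction (assumed here: 8-bit SUB)."""
    return (a - b) & MASK


def consistent(oracle, samples):
    """Names of all candidates that agree with the oracle on every sample."""
    io = [(a, b, oracle(a, b)) for a, b in samples]
    return [name for name, f in CANDIDATES.items()
            if all(f(a, b) == out for a, b, out in io)]


def distinguishing_input(f, g):
    """First 8-bit input pair on which f and g differ, or None if equivalent."""
    for a in range(256):
        for b in range(256):
            if f(a, b) != g(a, b):
                return (a, b)
    return None


def synthesize_with_refinement(oracle, samples):
    """Add distinguishing samples until at most one candidate survives."""
    samples = list(samples)
    while True:
        survivors = consistent(oracle, samples)
        if len(survivors) <= 1:
            return survivors, samples
        f, g = CANDIDATES[survivors[0]], CANDIDATES[survivors[1]]
        extra = distinguishing_input(f, g)
        if extra is None:  # the two candidates are semantically equal
            return survivors, samples
        samples.append(extra)  # rules out at least one of f and g


if __name__ == "__main__":
    print(consistent(oracle, [(0, 0)]))  # a poor sample rules nothing out
    print(synthesize_with_refinement(oracle, [(0, 0)]))
```

The sample `(0, 0)` alone cannot separate any of the five candidates, which illustrates why the choice of sampling strategy matters: one well-chosen additional input is enough to isolate `sub` in this toy setting.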



Published in

ACM SIGPLAN Notices, Volume 47, Issue 6
PLDI '12
June 2012
534 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2345156

PLDI '12: Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation
June 2012
572 pages
ISBN: 9781450312059
DOI: 10.1145/2254064

Copyright © 2012 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

