Stratified synthesis: automatically learning the x86-64 instruction set

Published: 02 June 2016

Abstract

The x86-64 ISA sits at the bottom of the software stack of most desktop and server software. Because of its importance, many software analysis and verification tools depend, either explicitly or implicitly, on correct modeling of the semantics of x86-64 instructions. However, formal semantics for the x86-64 ISA are difficult to obtain and often written manually through great effort. We describe an automatically synthesized formal semantics of the input/output behavior for a large fraction of the x86-64 Haswell ISA’s many thousands of instruction variants. The key to our results is stratified synthesis, where we use a set of instructions whose semantics are known to synthesize the semantics of additional instructions whose semantics are unknown. As the set of formally described instructions increases, the synthesis vocabulary expands, making it possible to synthesize the semantics of increasingly complex instructions. Using this technique we automatically synthesized formal semantics for 1,795 instruction variants of the x86-64 Haswell ISA. We evaluate the learned semantics against manually written semantics (where available) and find that they are formally equivalent with the exception of 50 instructions, where the manually written semantics contain an error. We further find the learned formulas to be largely as precise as manually written ones and of similar size.
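The stratified idea described above can be illustrated with a minimal sketch on a toy 8-bit machine. Everything in it is an assumption made for illustration and is not taken from the paper: the instruction names (`inc`, `not`, `neg`, `add2`, `dec`, `sub2`), the Python oracles that stand in for running an instruction on real hardware, the brute-force enumeration that stands in for the actual program search, and the exhaustive comparison over all 256 inputs that stands in for a formal equivalence check.

```python
from itertools import product

MASK = 0xFF  # toy 8-bit machine (an assumption of this sketch)

# Base set: instructions whose semantics are assumed known up front.
BASE = {
    "inc": lambda x: (x + 1) & MASK,
    "not": lambda x: ~x & MASK,
}

# Hypothetical black-box oracles, standing in for executing each
# not-yet-understood instruction on real hardware.
ORACLES = {
    "sub2": lambda x: (x - 2) & MASK,
    "dec": lambda x: (x - 1) & MASK,
    "neg": lambda x: -x & MASK,
    "add2": lambda x: (x + 2) & MASK,
}


def run(program, known, x):
    """Execute a sequence of known instruction names on input x."""
    for name in program:
        x = known[name](x)
    return x


def synthesize(oracle, known, max_len=2):
    """Return the first program over `known` (shortest first) that agrees
    with the oracle on every 8-bit input, or None if none exists."""
    for length in range(1, max_len + 1):
        for program in product(known, repeat=length):
            if all(run(program, known, x) == oracle(x)
                   for x in range(MASK + 1)):
                return program
    return None


def stratified_synthesis(base, oracles, max_len=2):
    """Learn targets in rounds (strata). Instructions learned in one round
    join the vocabulary available to the next round."""
    known = dict(base)
    pending = dict(oracles)
    learned = {}  # name -> (stratum, program)
    stratum = 0
    while pending:
        stratum += 1
        # Search with the vocabulary as it stood at the start of the round.
        found = {}
        for name, oracle in pending.items():
            program = synthesize(oracle, known, max_len)
            if program is not None:
                found[name] = program
        if not found:
            break  # no progress: the remaining targets stay unlearned
        for name, program in found.items():
            learned[name] = (stratum, program)
            # The learned program becomes the semantics of the instruction.
            known[name] = lambda x, p=program: run(p, known, x)
            del pending[name]
    return learned, sorted(pending)


if __name__ == "__main__":
    learned, unlearned = stratified_synthesis(BASE, ORACLES)
    for name, (level, program) in sorted(learned.items(), key=lambda kv: kv[1][0]):
        print(f"stratum {level}: {name} = {' ; '.join(program)}")
    print("unlearned:", unlearned)
```

With programs limited to two instructions, `dec` cannot be expressed using only `inc` and `not`, but it becomes expressible once `neg` has been learned, and `sub2` in turn needs `dec`. That dependency chain is the point of the sketch: each stratum expands the vocabulary available to the next.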

Published in

ACM SIGPLAN Notices, Volume 51, Issue 6
PLDI '16
June 2016
726 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2980983
Editor: Andy Gill

PLDI '16: Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation
June 2016
726 pages
ISBN: 9781450342612
DOI: 10.1145/2908080
General Chair: Chandra Krintz
Program Chair: Emery Berger

Copyright © 2016 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
