From MinX to MinC: semantics-driven decompilation of recursive datatypes

Published: 11 January 2016
Abstract

Reconstructing the meaning of a program from its binary executable is known as reverse engineering; it has a wide range of applications, including software security, piracy detection, and legacy systems. Since reversing is ultimately a search for meaning, there is much interest in inferring a type (a meaning) for the elements of a binary in a consistent way. Unfortunately, existing approaches do not guarantee any semantic relevance for their reconstructed types. This paper presents a new, semantically founded approach that provides strong guarantees for the reconstructed types. Key to our approach is the derivation of a witness program in a high-level language alongside the reconstructed types. This witness has the same semantics as the binary, is type correct by construction, and induces a (justifiable) type assignment on the binary. Moreover, the approach effectively yields a type-directed decompiler. We formalise and implement the approach for reversing MinX, an abstraction of x86, to MinC, a type-safe dialect of C with recursive datatypes. Our evaluation compiles a range of textbook C algorithms to MinX and then recovers the original structures.
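
To make "recursive datatypes" concrete, the following is a minimal sketch in ordinary C (not MinC, whose syntax is not shown on this page) of the kind of textbook structure the abstract refers to: a singly linked list whose cell type mentions itself. The names `struct list`, `list_cons`, `list_sum`, and `list_free` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* A recursive datatype: each cell holds a value and a pointer
   to another cell of the same type (NULL ends the list). */
struct list {
    int value;
    struct list *next;
};

/* Hypothetical helper: allocate a cell and put it in front of `next`. */
struct list *list_cons(int value, struct list *next) {
    struct list *cell = malloc(sizeof *cell);
    assert(cell != NULL);  /* keep the sketch short: abort on allocation failure */
    cell->value = value;
    cell->next = next;
    return cell;
}

/* Walk the list and add up the values. */
int list_sum(const struct list *xs) {
    int total = 0;
    for (; xs != NULL; xs = xs->next)
        total += xs->value;
    return total;
}

/* Release every cell of the list. */
void list_free(struct list *xs) {
    while (xs != NULL) {
        struct list *next = xs->next;
        free(xs);
        xs = next;
    }
}
```

Once code like this is compiled, the field names and the `struct` declaration are gone, and the traversal is typically a loop of loads at fixed offsets from a pointer. Recovering a type for such a binary then involves, roughly, noticing that one of the loaded words must have the same type as the pointer it was loaded through, which is what makes the reconstructed type recursive.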



Published in

ACM SIGPLAN Notices, Volume 51, Issue 1 (POPL '16), January 2016, 815 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2914770
Editor: Andy Gill

POPL '16: Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, January 2016, 815 pages
ISBN: 9781450335492
DOI: 10.1145/2837614

Copyright © 2016 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
