research-article

Abstract Symbolic Automata: Mixed syntactic/semantic similarity analysis of executables

Published: 14 January 2015

Abstract

We introduce a model for mixed syntactic/semantic approximation of programs based on symbolic finite automata (SFA). The edges of an SFA are labeled by predicates whose semantics specifies the denotations allowed by the edge. We introduce the notion of abstract symbolic finite automaton (ASFA), where approximation is obtained by abstract interpretation of symbolic finite automata, acting at both the syntactic (predicate) and the semantic (denotation) level. We investigate in detail how the syntactic and semantic abstractions of SFA relate to each other and how they contribute to determining the recognized language. We then introduce a family of transformations for simplifying ASFA. We apply this model to prove properties of commonly used tools for similarity analysis of binary executables. Following the structure of their control flow graphs, disassembled binary executables are represented as (concrete) SFA, where states are program points and predicates represent the (possibly infinite) I/O semantics of each basic block in constraint form. Known tools for binary code analysis are viewed as specific choices of symbolic and semantic abstractions in our framework, making symbolic finite automata and their abstract interpretations a unifying model for comparing and reasoning about the soundness and completeness of analyses of low-level code.
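The idea of predicate-labeled edges and of approximating them can be illustrated with a toy sketch. It is not the paper's formalization: the class `SFA`, the helper `abstract_edges`, and the weakening `forget_sign` are hypothetical names chosen for illustration, symbols are plain integers rather than basic-block semantics, and only the semantic (denotation) side of the abstraction is shown. Each predicate is weakened so that it also accepts the negation of any symbol it already accepts, which can only enlarge the recognized language.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, Tuple

# A predicate denotes the (possibly infinite) set of symbols an edge allows.
Predicate = Callable[[int], bool]
Edge = Tuple[str, Predicate, str]


@dataclass(frozen=True)
class SFA:
    """Toy symbolic finite automaton over integer symbols (illustrative only)."""
    initial: str
    finals: FrozenSet[str]
    edges: Tuple[Edge, ...]

    def accepts(self, word: Iterable[int]) -> bool:
        # Standard nondeterministic run: track every state reachable so far.
        current = {self.initial}
        for symbol in word:
            current = {dst for (src, pred, dst) in self.edges
                       if src in current and pred(symbol)}
        return bool(current & self.finals)


def forget_sign(pred: Predicate) -> Predicate:
    """Assumed weakening: close the denotation of pred under negation."""
    return lambda x: pred(x) or pred(-x)


def abstract_edges(sfa: SFA, weaken: Callable[[Predicate], Predicate]) -> SFA:
    """Apply the same weakening to the predicate on every edge."""
    return SFA(sfa.initial, sfa.finals,
               tuple((src, weaken(pred), dst) for (src, pred, dst) in sfa.edges))


# Concrete automaton: q0 --[x > 0]--> q1 --[x == 0]--> q2 (final).
concrete = SFA(
    initial="q0",
    finals=frozenset({"q2"}),
    edges=(("q0", lambda x: x > 0, "q1"),
           ("q1", lambda x: x == 0, "q2")),
)
approximated = abstract_edges(concrete, forget_sign)
```

In this sketch the word `[5, 0]` is accepted by both automata, `[-5, 0]` is accepted only by `approximated`, and `[5, 1]` is rejected by both, so the weakened automaton over-approximates the concrete language without accepting everything.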

Supplemental Material

p329-sidebyside.mpg

Published in

ACM SIGPLAN Notices, Volume 50, Issue 1
POPL '15
January 2015
682 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2775051
Editor: Andy Gill

POPL '15: Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages
January 2015
716 pages
ISBN: 9781450333009
DOI: 10.1145/2676726

Copyright © 2015 ACM

Publisher

Association for Computing Machinery
New York, NY, United States
