Abstract
We introduce a model for mixed syntactic/semantic approximation of programs based on symbolic finite automata (SFA). The edges of an SFA are labeled by predicates whose semantics specifies the denotations allowed by each edge. We introduce the notion of abstract symbolic finite automaton (ASFA), in which approximation is obtained by abstract interpretation of symbolic finite automata, acting at both the syntactic (predicate) and the semantic (denotation) level. We investigate in detail how the syntactic and semantic abstractions of SFA relate to each other and how each contributes to determining the recognized language. We then introduce a family of transformations for simplifying ASFA. We apply this model to prove properties of commonly used tools for similarity analysis of binary executables. Following the structure of their control flow graphs, disassembled binary executables are represented as (concrete) SFA, where states are program points and predicates represent the (possibly infinite) I/O semantics of each basic block in constraint form. Known tools for binary code analysis are viewed as specific choices of symbolic and semantic abstractions in our framework, which makes symbolic finite automata and their abstract interpretations a unifying model for comparing analyses of low-level code and for reasoning about their soundness and completeness.
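The idea of predicate-labeled edges and their semantic abstraction can be illustrated with a small sketch. This is not the paper's formal definition of SFA or ASFA. It assumes, purely for illustration, that the alphabet is the integers, that every predicate is a non-empty closed interval, and that the abstraction maps an interval to the coarsest sign class containing it. The names `SFA`, `sign_abstraction`, and `abstract_sfa` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Callable, FrozenSet, Sequence, Tuple

# Illustrative assumption: a predicate is a non-empty closed integer
# interval [lo, hi]; its denotation is the set of integers it contains.
Interval = Tuple[float, float]
Edge = Tuple[int, Interval, int]  # (source state, predicate, target state)


@dataclass(frozen=True)
class SFA:
    initial: int
    finals: FrozenSet[int]
    edges: Tuple[Edge, ...]

    def accepts(self, word: Sequence[int]) -> bool:
        """Nondeterministic run: follow every edge whose predicate holds."""
        current = {self.initial}
        for symbol in word:
            current = {
                dst
                for (src, (lo, hi), dst) in self.edges
                if src in current and lo <= symbol <= hi
            }
        return bool(current & self.finals)


def sign_abstraction(predicate: Interval) -> Interval:
    """Over-approximate an integer interval by the sign class containing it."""
    lo, hi = predicate
    if lo > 0:
        return (1, math.inf)        # strictly positive integers
    if hi < 0:
        return (-math.inf, -1)      # strictly negative integers
    if lo == 0 and hi == 0:
        return (0, 0)               # exactly zero
    return (-math.inf, math.inf)    # mixed signs: no information kept


def abstract_sfa(sfa: SFA, alpha: Callable[[Interval], Interval]) -> SFA:
    """Apply the abstraction to every edge label, keeping the graph shape."""
    return SFA(
        sfa.initial,
        sfa.finals,
        tuple((src, alpha(pred), dst) for (src, pred, dst) in sfa.edges),
    )


# Concrete automaton: a digit in 1..9 followed by a 0.
concrete = SFA(
    initial=0,
    finals=frozenset({2}),
    edges=((0, (1, 9), 1), (1, (0, 0), 2)),
)
abstract = abstract_sfa(concrete, sign_abstraction)
```

Under these assumptions, every word accepted by `concrete` is also accepted by `abstract`, because each abstract interval contains the concrete one it replaces. The converse fails: `[42, 0]` is accepted only by the abstract automaton, which is the loss of precision an over-approximating abstraction introduces.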
Index Terms
Abstract Symbolic Automata: Mixed syntactic/semantic similarity analysis of executables