Abstract
When analyzing programs, large libraries pose significant challenges to static points-to analysis. A popular solution is to have a human analyst provide points-to specifications that summarize relevant behaviors of library code, which can substantially improve precision and handle missing code such as native code. We propose Atlas, a tool that automatically infers points-to specifications. Atlas synthesizes unit tests that exercise the library code, and then infers points-to specifications based on observations from these executions. Atlas automatically infers specifications for the Java standard library, and produces better results for a client static information flow analysis on a benchmark of 46 Android apps compared to using existing handwritten specifications.
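The idea of inferring points-to facts from observed executions can be illustrated with a minimal sketch. This is not Atlas's algorithm. The `Box` class is a hypothetical stand-in for an opaque library class, and `observe_aliases` and `candidate_spec` are names invented for this illustration. The sketch runs one small unit test, checks which returned objects are identical to objects passed in earlier, and turns each observed alias into a candidate "may alias" fact.

```python
class Box:
    """Hypothetical stand-in for an opaque library class (assumption)."""

    def __init__(self):
        self._value = None

    def set(self, value):
        self._value = value

    def get(self):
        return self._value


def observe_aliases():
    """Run one unit test against Box and record object-identity observations."""
    box = Box()
    arg = object()  # fresh object, so an identity match is unambiguous
    box.set(arg)
    ret = box.get()
    return {
        "get_returns_set_argument": ret is arg,
        "get_returns_receiver": ret is box,
    }


def candidate_spec(observations):
    """Turn observed aliases into candidate 'may alias' facts.

    One execution can only suggest a fact; it cannot rule others out.
    """
    facts = []
    if observations["get_returns_set_argument"]:
        facts.append("return of get() may alias argument of set()")
    if observations["get_returns_receiver"]:
        facts.append("return of get() may alias receiver")
    return facts


if __name__ == "__main__":
    for fact in candidate_spec(observe_aliases()):
        print(fact)
```

A candidate fact of this kind plays the role of a summary that a static analysis could use instead of analyzing the library body. How Atlas chooses which tests to synthesize and how it generalizes from them is described in the paper itself, not in this sketch.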
Active learning of points-to specifications. PLDI 2018: Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation.