skip to main content
research-article
Open Access
Artifacts Available
Artifacts Evaluated & Functional

Precise and scalable points-to analysis via data-driven context tunneling

Published:24 October 2018Publication History
Skip Abstract Section

Abstract

We present context tunneling, a new approach for making k-limited context-sensitive points-to analysis precise and scalable. As context-sensitivity holds the key to the development of precise and scalable points-to analysis, a variety of techniques for context-sensitivity have been proposed. However, existing approaches such as k-call-site-sensitivity or k-object-sensitivity have a significant weakness that they unconditionally update the context of a method at every call-site, allowing important context elements to be overwritten by more recent, but not necessarily more important, context elements. In this paper, we show that this is a key limiting factor of existing context-sensitive analyses, and demonstrate that remarkable increase in both precision and scalability can be gained by maintaining important context elements only. Our approach, called context tunneling, updates contexts selectively and decides when to propagate the same context without modification.

We attain context tunneling via a data-driven approach. The effectiveness of context tunneling is very sensitive to the choice of important context elements. Even worse, precision is not monotonically increasing with respect to the ordering of the choices. As a result, manually coming up with a good heuristic rule for context tunneling is extremely challenging and likely fails to maximize its potential. We address this challenge by developing a specialized data-driven algorithm, which is able to automatically search for high-quality heuristics over the non-monotonic space of context tunneling.

We implemented our approach in the Doop framework and applied it to four major flavors of context-sensitivity: call-site-sensitivity, object-sensitivity, type-sensitivity, and hybrid context-sensitivity. In all cases, 1-context-sensitive analysis with context tunneling far outperformed deeper context-sensitivity with k=2 in both precision and scalability.

Skip Supplemental Material Section

Supplemental Material

a140-jeon.webm

References

  1. Steven Arzt, Siegfried Rasthofer, Christian Fritz, Eric Bodden, Alexandre Bartel, Jacques Klein, Yves Le Traon, Damien Octeau, and Patrick McDaniel. 2014. FlowDroid: Precise Context, Flow, Field, Object-sensitive and Lifecycle-aware Taint Analysis for Android Apps. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’14). ACM, New York, NY, USA, 259–269. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Dzintars Avots, Michael Dalton, V. Benjamin Livshits, and Monica S. Lam. 2005. Improving Software Security with a C Pointer Analysis. In Proceedings of the 27th International Conference on Software Engineering (ICSE ’05). ACM, New York, NY, USA, 332–341. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Sam Blackshear, Bor-Yuh Evan Chang, and Manu Sridharan. 2015. Selective Control-flow Abstraction via Jumping. In Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2015). ACM, New York, NY, USA, 163–182. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Martin Bravenboer and Yannis Smaragdakis. 2009. Strictly Declarative Specification of Sophisticated Points-to Analyses. In Proceedings of the 24th ACM SIGPLAN Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA ’09). ACM, New York, NY, USA, 243–262. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Sooyoung Cha, Sehun Jeong, and Hakjoo Oh. 2016. Learning a Strategy for Choosing Widening Thresholds from a Large Codebase. Springer International Publishing, Cham, 25–41.Google ScholarGoogle Scholar
  6. Kwonsoo Chae, Hakjoo Oh, Kihong Heo, and Hongseok Yang. 2017. Automatically Generating Features for Learning Program Analysis Heuristics. Proceedings of the ACM on Programming Languages 1, OOPSLA (2017). Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Ramkrishna Chatterjee, Barbara G. Ryder, and William A. Landi. 1999. Relevant Context Inference. In Proceedings of the 26th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’99). ACM, New York, NY, USA, 133–146. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Stephen J. Fink, Eran Yahav, Nurit Dor, G. Ramalingam, and Emmanuel Geay. 2008. Effective Typestate Verification in the Presence of Aliasing. ACM Trans. Softw. Eng. Methodol. 17, 2, Article 9 (May 2008), 34 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Qing Gao, Yingfei Xiong, Yaqing Mi, Lu Zhang, Weikun Yang, Zhaoping Zhou, Bing Xie, and Hong Mei. 2015. Safe Memoryleak Fixing for C Programs. In Proceedings of the 37th International Conference on Software Engineering - Volume 1 (ICSE ’15). IEEE Press, Piscataway, NJ, USA, 459–470. http://dl.acm.org/citation.cfm?id=2818754.2818812 Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Behnaz Hassanshahi, Raghavendra Kagalavadi Ramesh, Padmanabhan Krishnan, Bernhard Scholz, and Yi Lu. 2017. An Efficient Tunable Selective Points-to Analysis for Large Codebases. In Proceedings of the 6th ACM SIGPLAN International Workshop on State Of the Art in Program Analysis (SOAP 2017). ACM, New York, NY, USA, 13–18. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Kihong Heo, Hakjoo Oh, and Hongseok Yang. 2016. Learning a Variable-Clustering Strategy for Octagon from Labeled Data Generated by a Static Analysis. Springer Berlin Heidelberg, Berlin, Heidelberg, 237–256.Google ScholarGoogle Scholar
  12. Kihong Heo, Hakjoo Oh, and Kwangkeun Yi. 2017. Machine-Learning-Guided Selectively Unsound Static Analysis. In Proceedings of the 39th International Conference on Software Engineering. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Michael Hind. 2001. Pointer Analysis: Haven’t We Solved This Problem Yet?. In Proceedings of the 2001 ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE ’01). ACM, New York, NY, USA, 54–61. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Sehun Jeong, Minseok Jeon, Sungdeok Cha, and Hakjoo Oh. 2017. Data-Driven Context-Sensitivity for Points-to Analysis. Proceedings of the ACM on Programming Languages 1, OOPSLA (2017). Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Bageshri Karkare and Uday P. Khedker. 2007. An Improved Bound for Call Strings Based Interprocedural Analysis of Bit Vector Frameworks. ACM Trans. Program. Lang. Syst. 29, 6, Article 38 (Oct. 2007). Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Vineeth Kashyap, Kyle Dewey, Ethan A. Kuefner, John Wagner, Kevin Gibbons, John Sarracino, Ben Wiedermann, and Ben Hardekopf. 2014. JSAI: A Static Analysis Platform for JavaScript. In Proceedings of the 22Nd ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE 2014). ACM, New York, NY, USA, 121–132. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. George Kastrinis and Yannis Smaragdakis. 2013. Hybrid Context-sensitivity for Points-to Analysis. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’13). ACM, New York, NY, USA, 423–434. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Uday P. Khedker and Bageshri Karkare. 2008. Efficiency, Precision, Simplicity, and Generality in Interprocedural Data Flow Analysis: Resurrecting the Classical Call Strings Method. In Proceedings of the Joint European Conferences on Theory and Practice of Software 17th International Conference on Compiler Construction (CC’08/ETAPS’08). Springer-Verlag, Berlin, Heidelberg, 213–228. http://dl.acm.org/citation.cfm?id=1788374.1788394 Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Junhee Lee, Seongjoon Hong, and Hakjoo Oh. 2018. MemFix: Static Analysis-Based Repair of Memory Deallocation Errors for C. In The 26th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Ondřej Lhoták and Laurie Hendren. 2006. Context-Sensitive Points-to Analysis: Is It Worth It?. In Proceedings of the 15th International Conference on Compiler Construction (CC’06). Springer-Verlag, Berlin, Heidelberg, 47–64. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Ondřej Lhoták and Laurie Hendren. 2008. Evaluating the Benefits of Context-sensitive Points-to Analysis Using a BDD-based Implementation. ACM Trans. Softw. Eng. Methodol. 18, 1, Article 3 (Oct. 2008), 53 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Donglin Liang and Mary Jean Harrold. 1999. Efficient Points-to Analysis for Whole-program Analysis. In Proceedings of the 7th European Software Engineering Conference Held Jointly with the 7th ACM SIGSOFT International Symposium on Foundations of Software Engineering (ESEC/FSE-7). Springer-Verlag, London, UK, UK, 199–215. http://dl.acm.org/citation. cfm?id=318773.318943 Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Donglin Liang, Maikel Pennings, and Mary Jean Harrold. 2005. Evaluating the Impact of Context-sensitivity on Andersen’s Algorithm for Java Programs. In Proceedings of the 6th ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE ’05). ACM, New York, NY, USA, 6–12. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Percy Liang and Mayur Naik. 2011. Scaling Abstraction Refinement via Pruning. In Proceedings of the 32Nd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’11). ACM, New York, NY, USA, 590–601. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Percy Liang, Omer Tripp, and Mayur Naik. 2011. Learning Minimal Abstractions. In Proceedings of the 38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’11). ACM, New York, NY, USA, 31–42. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. V. Benjamin Livshits and Monica S. Lam. 2003. Tracking Pointers with Path and Context Sensitivity for Bug Detection in C Programs. In Proceedings of the 9th European Software Engineering Conference Held Jointly with 11th ACM SIGSOFT International Symposium on Foundations of Software Engineering (ESEC/FSE-11). ACM, New York, NY, USA, 317–326. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Matthew Might, Yannis Smaragdakis, and David Van Horn. 2010. Resolving and Exploiting the k-CFA Paradox: Illuminating Functional vs. Object-oriented Program Analysis. In Proceedings of the 31st ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’10). ACM, New York, NY, USA, 305–315. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Ana Milanova, Atanas Rountev, and Barbara G. Ryder. 2002. Parameterized Object Sensitivity for Points-to and Side-effect Analyses for Java. In Proceedings of the 2002 ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA ’02). ACM, New York, NY, USA, 1–11. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Ana Milanova, Atanas Rountev, and Barbara G. Ryder. 2005. Parameterized Object Sensitivity for Points-to Analysis for Java. ACM Trans. Softw. Eng. Methodol. 14, 1 (Jan. 2005), 1–41. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Mayur Naik, Alex Aiken, and John Whaley. 2006. Effective Static Race Detection for Java. In Proceedings of the 27th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’06). ACM, New York, NY, USA, 308–319. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Hakjoo Oh, Wonchan Lee, Kihong Heo, Hongseok Yang, and Kwangkeun Yi. 2014. Selective Context-sensitivity Guided by Impact Pre-analysis. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’14). ACM, New York, NY, USA, 475–484. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Hakjoo Oh, Hongseok Yang, and Kwangkeun Yi. 2015. Learning a Strategy for Adapting a Program Analysis via Bayesian Optimisation. In Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2015). ACM, New York, NY, USA, 572–588. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Rohan Padhye and Uday P. Khedker. 2013. Interprocedural data flow analysis in Soot using value contexts. In Proceedings of the 2nd ACM SIGPLAN International Workshop on State Of the Art in Java Program analysis, SOAP 2013, Seattle, WA, USA, June 20, 2013. 31–36. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. Changhee Park and Sukyoung Ryu. 2015. Scalable and Precise Static Analysis of JavaScript Applications via LoopSensitivity. In 29th European Conference on Object-Oriented Programming (ECOOP 2015) (Leibniz International Proceedings in Informatics (LIPIcs)), John Tang Boyland (Ed.), Vol. 37. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany, 735–756.Google ScholarGoogle Scholar
  35. Micha Sharir and Amir Pnueli. 1981. Two approaches to interprocedural data flow analysis. Prentice-Hall, Englewood Cliffs, NJ, Chapter 7, 189–234.Google ScholarGoogle Scholar
  36. O. Shivers. 1988. Control Flow Analysis in Scheme. In Proceedings of the ACM SIGPLAN 1988 Conference on Programming Language Design and Implementation (PLDI ’88). ACM, New York, NY, USA, 164–174. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Gagandeep Singh, Markus Püschel, and Martin Vechev. 2018. Fast Numerical Program Analysis with Reinforcement Learning. In Computer Aided Verification, Hana Chockler and Georg Weissenbacher (Eds.). Springer International Publishing, Cham, 211–229.Google ScholarGoogle Scholar
  38. Yannis Smaragdakis and George Balatsouras. 2015. Pointer Analysis. Found. Trends Program. Lang. 2, 1 (April 2015), 1–69. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Yannis Smaragdakis, Martin Bravenboer, and Ondrej Lhoták. 2011. Pick Your Contexts Well: Understanding Object-sensitivity. In Proceedings of the 38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’11). ACM, New York, NY, USA, 17–30. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Yannis Smaragdakis, George Kastrinis, and George Balatsouras. 2014. Introspective Analysis: Context-sensitivity, Across the Board. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’14). ACM, New York, NY, USA, 485–495. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. Y. Sui, D. Ye, and J. Xue. 2014. Detecting Memory Leaks Statically with Full-Sparse Value-Flow Analysis. IEEE Transactions on Software Engineering 40, 2 (Feb 2014), 107–122. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. Tian Tan, Yue Li, and Jingling Xue. 2016. Making k-Object-Sensitive Pointer Analysis More Precise with Still k-Limiting. In Static Analysis, Xavier Rival (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg, 489–510.Google ScholarGoogle Scholar
  43. Tian Tan, Yue Li, and Jingling Xue. 2017. Efficient and Precise Points-to Analysis: Modeling the Heap by Merging Equivalent Automata. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2017). ACM, New York, NY, USA, 278–291. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. Rei Thiessen and Ondřej Lhoták. 2017. Context Transformations for Pointer Analysis. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2017). ACM, New York, NY, USA, 263–277. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. Omer Tripp, Marco Pistoia, Stephen J. Fink, Manu Sridharan, and Omri Weisman. 2009. TAJ: Effective Taint Analysis of Web Applications. In Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’09). ACM, New York, NY, USA, 87–97. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. Shiyi Wei and Barbara G. Ryder. 2015. Adaptive Context-sensitive Analysis for JavaScript. In 29th European Conference on Object-Oriented Programming, ECOOP 2015, July 5-10, 2015, Prague, Czech Republic. 712–734.Google ScholarGoogle Scholar
  47. John Whaley and Monica S. Lam. 2004. Cloning-based Context-sensitive Pointer Alias Analysis Using Binary Decision Diagrams. In Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation (PLDI ’04). ACM, New York, NY, USA, 131–144. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. Robert P. Wilson and Monica S. Lam. 1995. Efficient Context-sensitive Pointer Analysis for C Programs. In Proceedings of the ACM SIGPLAN 1995 Conference on Programming Language Design and Implementation (PLDI ’95). ACM, New York, NY, USA, 1–12. Google ScholarGoogle ScholarDigital LibraryDigital Library
  49. Guoqing Xu and Atanas Rountev. 2008. Merging Equivalent Contexts for Scalable Heap-cloning-based Context-sensitive Points-to Analysis. In Proceedings of the 2008 International Symposium on Software Testing and Analysis (ISSTA ’08). ACM, New York, NY, USA, 225–236. Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. Hua Yan, Yulei Sui, Shiping Chen, and Jingling Xue. 2017. Machine-Learning-Guided Typestate Analysis for Static UseAfter-Free Detection. In Proceedings of the 33rd Annual Computer Security Applications Conference (ACSAC 2017). ACM, New York, NY, USA, 42–54. Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. Xin Zhang, Ravi Mangal, Radu Grigore, Mayur Naik, and Hongseok Yang. 2014. On Abstraction Refinement for Program Analyses in Datalog. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’14). ACM, New York, NY, USA, 239–248. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Precise and scalable points-to analysis via data-driven context tunneling

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader
      About Cookies On This Site

      We use cookies to ensure that we give you the best experience on our website.

      Learn more

      Got it!