Abstract
Inclusion-based points-to analysis provides a good trade-off between precision of results and speed of analysis, and it has been incorporated into several production compilers, including gcc. There is an extensive literature on speeding up this algorithm using heuristics such as detecting and collapsing cycles of pointer-equivalent variables. This paper describes a complementary approach based on exploiting parallelism. Our implementation exploits two key insights. First, we show that inclusion-based points-to analysis can be formulated entirely in terms of graphs and graph rewrite rules. This exposes the amorphous data-parallelism in the algorithm and makes it easier to develop a parallel implementation. Second, we show that this graph-theoretic formulation reveals certain key properties of the algorithm that can be exploited to obtain an efficient parallel implementation. Our parallel implementation achieves a scaling of up to 3x on an 8-core machine for a suite of ten large C programs. For all but the smallest benchmarks, the parallel analysis outperforms a state-of-the-art, highly optimized serial implementation of the same algorithm. To the best of our knowledge, this is the first parallel implementation of a points-to analysis.
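For readers unfamiliar with the serial baseline, the inclusion-based (Andersen-style) analysis the abstract refers to is a fixpoint computation over a constraint graph: copy constraints become edges, and load/store constraints add new edges as points-to sets grow. The following is a minimal illustrative sketch, not the paper's graph-rewriting formulation; the function name, the tuple encoding of the four constraint kinds, and the worklist policy are all our assumptions:

```python
from collections import defaultdict

def andersen(addr_of, copy, load, store):
    """Andersen-style inclusion-based points-to analysis (worklist sketch).

    Constraint encoding (hypothetical, for illustration only):
      addr_of: (p, x)  for  p = &x   ->  x in pts(p)
      copy:    (p, q)  for  p = q    ->  pts(p) >= pts(q)
      load:    (p, q)  for  p = *q   ->  pts(p) >= pts(t) for each t in pts(q)
      store:   (p, q)  for  *p = q   ->  pts(t) >= pts(q) for each t in pts(p)
    Returns a dict mapping each variable to its points-to set.
    """
    pts = defaultdict(set)      # node -> points-to set
    edges = defaultdict(set)    # inclusion (copy) edges: src -> set of dsts
    for p, x in addr_of:
        pts[p].add(x)
    for p, q in copy:
        edges[q].add(p)         # points-to facts flow from q into p
    work = list(pts)
    while work:
        v = work.pop()
        # Load/store constraints induce new copy edges as pts(v) grows.
        for p, q in load:       # p = *q, relevant when v is q
            if q == v:
                for t in pts[v]:
                    if p not in edges[t]:
                        edges[t].add(p)
                        work.append(t)   # re-propagate from t along new edge
        for p, q in store:      # *p = q, relevant when v is p
            if p == v:
                for t in pts[v]:
                    if t not in edges[q]:
                        edges[q].add(t)
                        work.append(q)
        # Propagate v's points-to set along its outgoing copy edges.
        for w in list(edges[v]):
            if not pts[v] <= pts[w]:
                pts[w] |= pts[v]
                work.append(w)
    return dict(pts)
```

Because points-to sets and edge sets only grow, the loop terminates at a fixpoint. The parallelism the paper exploits comes from the observation that distinct worklist items whose graph neighborhoods do not overlap can be processed concurrently.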
Index Terms
Parallel inclusion-based points-to analysis