research-article
Open Access

TreeFuser: a framework for analyzing and fusing general recursive tree traversals

Published: 12 October 2017

Abstract

Series of traversals of tree structures arise in numerous contexts: abstract syntax tree traversals in compiler passes, rendering traversals of the DOM in web browsers, and kd-tree traversals in computational simulation codes. In each of these settings, a tree is traversed multiple times to compute various values and modify various portions of the tree. While it is relatively easy to write these traversals as separate small updates to the tree, for efficiency reasons traversals are often fused to reduce the number of times that each portion of the tree is traversed: by performing multiple operations on the tree simultaneously, each node of the tree can be visited fewer times, increasing opportunities for optimization and decreasing cache pressure and other overheads. This fusion process is often done manually, requiring careful understanding of how the traversals of the tree interact. This paper presents an automatic approach to traversal fusion: tree traversals can be written independently, and then our framework analyzes the dependences between the traversals to determine how they can be fused to reduce the number of visits to each node in the tree. A critical aspect of our framework is that it exploits two opportunities to increase the amount of fusion: (i) it automatically integrates code motion, and (ii) it supports partial fusion, where portions of one traversal can be fused with another, allowing for a reduction in node visits without requiring that two traversals be fully fused. We implement our framework in Clang, and show across several case studies that we can successfully fuse complex tree traversals, reducing the overall number of traversals and substantially improving locality and performance.
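To make the idea of traversal fusion concrete, the following is a minimal hand-written sketch in C++. It is not TreeFuser's input language or its generated output; the `Node` type, the `visits` counter, and the functions `scaleValues`, `setDepth`, `scaleAndSetDepth`, and `buildTree` are hypothetical names chosen for illustration. It shows two independently written traversals and a manually fused version that performs both updates while visiting each node once.

```cpp
#include <cassert>
#include <memory>

// Hypothetical binary tree node, used only for this illustration.
struct Node {
    int value = 0;
    int depth = 0;
    std::unique_ptr<Node> left, right;
};

// Counts how many times any traversal touches a node.
static long visits = 0;

// Traversal 1: multiply every node's value by a factor.
void scaleValues(Node* n, int factor) {
    if (!n) return;
    ++visits;
    n->value *= factor;
    scaleValues(n->left.get(), factor);
    scaleValues(n->right.get(), factor);
}

// Traversal 2: record each node's depth (root has depth d).
void setDepth(Node* n, int d) {
    if (!n) return;
    ++visits;
    n->depth = d;
    setDepth(n->left.get(), d + 1);
    setDepth(n->right.get(), d + 1);
}

// Fused traversal: both updates happen in a single visit per node.
// This is legal here because the two traversals write disjoint fields
// (value vs. depth) and neither reads what the other writes.
void scaleAndSetDepth(Node* n, int factor, int d) {
    if (!n) return;
    ++visits;
    n->value *= factor;
    n->depth = d;
    scaleAndSetDepth(n->left.get(), factor, d + 1);
    scaleAndSetDepth(n->right.get(), factor, d + 1);
}

// Builds a perfect binary tree with the given number of levels;
// each node's value is initialised to its remaining level count.
std::unique_ptr<Node> buildTree(int levels) {
    if (levels == 0) return nullptr;
    auto n = std::make_unique<Node>();
    n->value = levels;
    n->left = buildTree(levels - 1);
    n->right = buildTree(levels - 1);
    return n;
}
```

On a tree with 7 nodes, running `scaleValues` followed by `setDepth` performs 14 node visits, while `scaleAndSetDepth` produces the same field values with 7. In this sketch the fusion is trivially safe because the traversals touch disjoint fields; deciding when fusion is safe in the presence of dependences is the part the abstract attributes to the framework's analysis.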


Published in

Proceedings of the ACM on Programming Languages, Volume 1, Issue OOPSLA
October 2017
1786 pages
EISSN: 2475-1421
DOI: 10.1145/3152284

Copyright © 2017 ACM

Publisher

Association for Computing Machinery, New York, NY, United States
