research-article
Open Access

GLORE: generalized loop redundancy elimination upon LER-notation

Published: 12 October 2017

Abstract

This paper presents GLORE, a novel approach to detecting and removing large-scoped redundant computations in nested loops. GLORE works on LER-notation, a new representation of computations in both regular and irregular loops. Combined with a set of novel algorithms, LER-notation lets GLORE systematically consider computation reordering at both the expression level and the loop level in a unified manner. GLORE applies to a much broader range of loops than prior methods do, and it frequently lowers the computational complexity of nested loops that elude prior optimization techniques, producing significantly larger speedups.
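
As a rough illustration of the kind of loop-level redundancy the abstract describes, the sketch below contrasts a nested loop with a hand-optimized equivalent. It is not GLORE's algorithm and does not use LER-notation; the function names `naive` and `optimized` and the computation itself are hypothetical, chosen only to show how reordering a computation across loop levels can lower its complexity.

```python
def naive(a, b):
    """Compute r[i] = sum over j of a[i] * b[j] with a nested loop.

    Cost is O(len(a) * len(b)): the inner loop effectively re-derives
    the sum of b for every value of i.
    """
    r = []
    for i in range(len(a)):
        acc = 0
        for j in range(len(b)):
            acc += a[i] * b[j]
        r.append(acc)
    return r


def optimized(a, b):
    """Same result after factoring a[i] out of the inner sum.

    Since sum_j a[i] * b[j] == a[i] * sum_j b[j], the sum over b is
    loop-invariant with respect to i and can be computed once.
    Cost drops to O(len(a) + len(b)).
    """
    s = 0
    for j in range(len(b)):
        s += b[j]
    return [a[i] * s for i in range(len(a))]
```

Both functions return the same list for integer inputs; with floating-point inputs the results may differ slightly because the reordering changes the rounding sequence.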
