research-article
Open Access

Semiring optimizations: dynamic elision of expressions with identity and absorbing elements

Published: 13 November 2020

Abstract

This paper describes a compiler optimization that eliminates dynamic occurrences of expressions of the form a = a ⊕ b ⊗ c. The operation ⊕ must admit an identity element z, such that a ⊕ z = a. Also, z must be the absorbing element of ⊗, such that b ⊗ z = z ⊗ c = z. Semirings where ⊕ is the additive operator and ⊗ is the multiplicative operator meet this contract. This pattern is common in high-performance benchmarks; its canonical representative is the multiply-add operation a = a + b × c. However, several other expressions involving arithmetic and logic operations satisfy the required algebra. We show that the runtime elimination of such assignments can be implemented in a performance-safe way via online profiling. The elimination of dynamic redundancies involving identity and absorbing elements in 35 programs of the LLVM test suite that present semiring patterns brings an average speedup of 1.19x (total optimized time over total unoptimized time) on top of clang -O3. When projected onto the entire test suite (259 programs), the optimization leads to a speedup of 1.025x. Once added to clang, semiring optimizations bring its performance close to that of TACO, a specialized tensor compiler.
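
The algebra above can be illustrated with a small hand-written sketch in C. It is not the paper's implementation: the function names madd_plain and madd_guarded are hypothetical, and integers are used so that 0 is exactly the identity of + and the absorbing element of × (floating-point values such as NaN, infinities, and signed zeros would need extra care). When x[i] is 0, the product is 0 and the addition leaves acc[i] unchanged, so the load of y[i], the multiplication, the addition, and the store can all be skipped.

```c
#include <stddef.h>

/* Baseline kernel: acc[i] = acc[i] + x[i] * y[i] for every i. */
void madd_plain(int *acc, const int *x, const int *y, size_t n) {
    for (size_t i = 0; i < n; i++)
        acc[i] = acc[i] + x[i] * y[i];
}

/* Guarded kernel (illustrative sketch, hypothetical name).
 * If x[i] == 0, then x[i] * y[i] == 0 (0 absorbs under *), and
 * acc[i] + 0 == acc[i] (0 is the identity of +), so the whole
 * assignment is redundant and is skipped.
 * Returns how many assignments were elided. */
size_t madd_guarded(int *acc, const int *x, const int *y, size_t n) {
    size_t elided = 0;
    for (size_t i = 0; i < n; i++) {
        if (x[i] == 0) {
            elided++;
            continue; /* no load of y[i], no multiply, no add, no store */
        }
        acc[i] = acc[i] + x[i] * y[i];
    }
    return elided;
}
```

Both kernels leave acc in the same state; the guarded one only pays for a comparison on each iteration. The same guard would apply to a logic instance of the pattern such as a = a | (b & c), where 0 is the identity of | and the absorbing element of &.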

Supplemental Material

Auxiliary Presentation Video

This paper describes a compiler optimization that eliminates dynamic occurrences of expressions of the form a = a + b * c. The operation + must admit an identity element z, such that a + z = a. Also, z must be the absorbing element of *, such that b * z = z * c = z. Semirings where + is the additive operator and * is the multiplicative operator meet this contract. This pattern is common in high-performance benchmarks; its canonical representative is the multiply-add operation a = a + b * c. However, several other expressions involving arithmetic and logic operations satisfy the required algebra. We show that the runtime elimination of such assignments can be implemented in a performance-safe way via online profiling. The elimination of dynamic redundancies involving identity and absorbing elements in 35 programs of the LLVM test suite that present semiring patterns brings an average speedup of 1.19x over clang -O3. When projected onto the entire test suite, the optimization leads to a speedup of 1.025x.
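
One way to picture the "performance-safe via online profiling" idea is the C sketch below. It is an assumption made for illustration, not the mechanism described in the paper: the names should_guard and madd_adaptive, the sample size of 64, and the one-in-four threshold are all hypothetical. The sketch samples a prefix of the input at run time and only selects the guarded loop when zeros look frequent enough to pay for the extra branch.

```c
#include <stddef.h>

enum { SAMPLE_SIZE = 64 }; /* hypothetical: how many elements to sample */

/* Hypothetical profiling step: inspect a prefix of x and report
 * whether at least one quarter of the sampled values are zero. */
int should_guard(const int *x, size_t n) {
    size_t limit = n < SAMPLE_SIZE ? n : SAMPLE_SIZE;
    size_t zeros = 0;
    for (size_t i = 0; i < limit; i++)
        zeros += (x[i] == 0);
    return limit > 0 && zeros * 4 >= limit;
}

/* Computes acc[i] = acc[i] + x[i] * y[i], choosing at run time
 * between a guarded loop (skips work when x[i] == 0) and the
 * plain loop (no extra branch when zeros are rare). */
void madd_adaptive(int *acc, const int *x, const int *y, size_t n) {
    if (should_guard(x, n)) {
        for (size_t i = 0; i < n; i++)
            if (x[i] != 0)
                acc[i] = acc[i] + x[i] * y[i];
    } else {
        for (size_t i = 0; i < n; i++)
            acc[i] = acc[i] + x[i] * y[i];
    }
}
```

Either branch produces the same result; the profile only decides which loop is likely to be cheaper. A real implementation would need to choose the sample and threshold empirically.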

        • Published in

          Proceedings of the ACM on Programming Languages  Volume 4, Issue OOPSLA
          November 2020
          3108 pages
          EISSN:2475-1421
          DOI:10.1145/3436718
          Copyright © 2020 Owner/Author

          Publisher

          Association for Computing Machinery

          New York, NY, United States
