tutorial

Improving Data Access Efficiency by Using Context-Aware Loads and Stores

Published: 04 June 2015

Abstract

Memory operations have a significant impact on both performance and energy usage, even when an access hits in the level-one data cache (L1 DC). Load instructions in particular affect performance, as they frequently cause stalls: the register to be loaded is often referenced before the data is available in the pipeline. L1 DC accesses also affect energy usage, as they typically require significantly more energy than a register file access. Despite this impact on performance and energy usage, L1 DC accesses on most processors are performed in a general fashion, without regard to the context in which the load or store operation occurs. We describe a set of techniques in which the compiler enhances load and store instructions so that they can be executed with fewer stalls and/or access the L1 DC in a more energy-efficient manner. We show that these techniques can simultaneously achieve a 6% gain in performance and a 43% reduction in L1 DC energy usage.
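
The load-to-use stalls described above can be illustrated with a toy pipeline model (a hypothetical sketch for illustration only, not the paper's simulator or its compiler techniques). In this model a load's result becomes available a fixed number of cycles after it issues, and an instruction that reads a register before its load completes stalls the pipeline; scheduling an independent load between a load and its use hides the latency.

```python
LOAD_LATENCY = 2  # cycles from a load's issue until its result can be consumed (assumed)

def count_stalls(program):
    """program: list of (op, dest, srcs) tuples; returns total stall cycles."""
    ready = {}             # register -> cycle at which its value is available
    cycle, stalls = 0, 0
    for op, dest, srcs in program:
        # An instruction cannot issue until all of its source operands are ready.
        start = max([cycle] + [ready.get(s, 0) for s in srcs])
        stalls += start - cycle            # cycles spent waiting on operands
        cycle = start + 1                  # the instruction occupies one issue slot
        # A load's destination is ready later than an ALU result.
        ready[dest] = cycle + (LOAD_LATENCY - 1 if op == "load" else 0)
    return stalls

# Each load's value is consumed by the very next instruction.
naive = [("load", "r1", ["r0"]), ("add", "r2", ["r1", "r1"]),
         ("load", "r3", ["r0"]), ("add", "r4", ["r3", "r3"])]

# Same work, with the second load hoisted ahead of the first use.
scheduled = [("load", "r1", ["r0"]), ("load", "r3", ["r0"]),
             ("add", "r2", ["r1", "r1"]), ("add", "r4", ["r3", "r3"])]

print(count_stalls(naive))      # → 2 (one stall per load-to-use pair)
print(count_stalls(scheduled))  # → 0 (latency hidden by the independent load)
```

Real pipelines are far more complex, but the model captures why context matters: the same load costs different amounts depending on how soon its result is needed.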


Published in

ACM SIGPLAN Notices, Volume 50, Issue 5 (LCTES '15), May 2015, 141 pages
ISSN: 0362-1340, EISSN: 1558-1160
DOI: 10.1145/2808704
Editor: Andy Gill

Also in: LCTES '15: Proceedings of the 16th ACM SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems 2015 CD-ROM, June 2015, 149 pages
ISBN: 9781450332576
DOI: 10.1145/2670529

Copyright © 2015 ACM

Publisher: Association for Computing Machinery, New York, NY, United States


      Qualifiers

      • tutorial
      • Research
      • Refereed limited
