research-article

Compiler driven data layout optimization for regular/irregular array access patterns

Published: 12 June 2008
Abstract

Embedded multimedia applications exhibit both regular and irregular memory access patterns. Irregular patterns in particular are not amenable to static analysis for extraction of access patterns, which prevents efficient use of a Scratch Pad Memory (SPM) hierarchy for performance and energy improvements. To resolve this, we present a compiler strategy to optimize data layout in regular/irregular multimedia applications running on embedded multiprocessor environments. The goal is to maximize the number of accesses to the SPM over the entire system, which reduces the system's energy consumption. This is achieved by optimizing the placement of application-wide reused data so that it resides in the SPMs of processing elements. Specifically, our scheme is based on profiling that generates a memory access footprint. The memory access footprint is used to identify fine-grained data elements that can profitably be placed in the SPMs to maximize performance and energy gains. We present a heuristic approach that efficiently exploits the SPMs using the memory access footprint. Our experimental results show that our approach is able to reduce energy consumption by 30% and improve performance by 18% over cache-based memory subsystems for various multimedia applications.
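
The abstract does not spell out the heuristic itself, so the following is only a minimal sketch of the general idea: build a per-element access footprint from a profiled trace, then fill a fixed-size SPM with the most frequently accessed elements. The function names, the trace format, the 4-byte element size, and the simple greedy selection are all assumptions made for illustration; they are not the authors' algorithm.

```python
from collections import Counter


def build_footprint(trace):
    """Count accesses per (array name, index) pair in a profiled trace."""
    return Counter(trace)


def select_for_spm(footprint, spm_capacity, elem_size=4):
    """Greedily pick the most frequently accessed elements that fit.

    Hypothetical stand-in for the paper's heuristic: every element is
    assumed to occupy elem_size bytes, and selection stops when the
    next element would exceed spm_capacity bytes.
    """
    chosen = []
    used = 0
    # most_common() yields elements in descending order of access count.
    for elem, _count in footprint.most_common():
        if used + elem_size > spm_capacity:
            break
        chosen.append(elem)
        used += elem_size
    return chosen


def spm_hit_ratio(footprint, chosen):
    """Fraction of profiled accesses that would be served from the SPM."""
    total = sum(footprint.values())
    hits = sum(footprint[elem] for elem in chosen)
    return hits / total if total else 0.0


# Toy trace with an irregular (table-lookup style) access pattern.
trace = [("lut", 7), ("lut", 7), ("lut", 3), ("img", 0),
         ("lut", 7), ("lut", 3), ("img", 1)]
footprint = build_footprint(trace)
chosen = select_for_spm(footprint, spm_capacity=8)
```

With an 8-byte SPM in this toy setting, the two hottest elements of `lut` are selected and `spm_hit_ratio(footprint, chosen)` reports the share of accesses they cover. A real placement scheme would also have to account for transfer cost, sharing across processing elements, and address translation for the relocated elements.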

    Reviews

    Michael Zastre

    Memory access patterns on multiprocessor systems-on-chip (MPSoC) are complex enough that they often defeat hardwired caching strategies. Instead of hardwired caches, these systems use scratch pad memory (SPM), which acts as a software-controlled cache on each individual processor. Decisions on data placement in SPMs are more straightforward when an application's access patterns are regular than when they are irregular (regular versus irregular applications). This paper presents solutions to two problems in the context of irregular applications: how to use profile (runtime) statistics to identify the parts of data arrays to copy into SPMs, such that performance and energy consumption improve, and how to maximize utilization of the SPM resources. The paper claims, on standard benchmarks, a 30 percent reduction in energy consumption and an 18 percent gain in performance. Sections 1 and 2 outline the problem and previous work. Sections 3 and 4 describe the memory model, including a statement of the strategy used to decide data layout for a given program and program profile. Section 4 is the largest in the paper, and develops the ideas via a series of 11 definitions that can be used as ingredients in a data layout algorithm. Section 5 addresses the impact of placing such data from irregular applications into SPMs: address translation is not trivial. Section 6 describes some of the experimental results obtained via simulation, using tools such as SimpleScalar. Section 7 concludes the paper. The results are impressive, yet the authors clearly identify one shortcoming: although their approach uses profiles, it does so only as input to a compilation step, so the data-placement scheme cannot exploit runtime behavior to change placement decisions while the program executes.

    Online Computing Reviews Service

    • Published in

      ACM SIGPLAN Notices  Volume 43, Issue 7
      LCTES '08
      July 2008
      167 pages
      ISSN:0362-1340
      EISSN:1558-1160
      DOI:10.1145/1379023
      • ACM Conferences
        LCTES '08: Proceedings of the 2008 ACM SIGPLAN-SIGBED conference on Languages, compilers, and tools for embedded systems
        June 2008
        180 pages
        ISBN:9781605581040
        DOI:10.1145/1375657

      Copyright © 2008 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 12 June 2008

      Qualifiers

      • research-article
