research-article

Dynamic scratchpad memory management for code in portable systems with an MMU

Published: 29 January 2008

Abstract

In this work, we present a dynamic memory allocation technique for a novel, horizontally partitioned memory subsystem targeting contemporary embedded processors with a memory management unit (MMU). We propose to replace the on-chip instruction cache with a scratchpad memory (SPM) and a small minicache. Serializing the address translation with the actual memory access enables the memory system to access either only the SPM or only the minicache. Independent of the SPM size and based solely on profiling information, a postpass optimizer classifies the code of an application binary into a pageable and a cacheable code region. The latter is placed at a fixed location in the external memory and cached by the minicache. The former, the pageable code region, is copied on demand to the SPM before execution. Both the pageable code region and the SPM are logically divided into pages the size of an MMU memory page. Using the MMU's page-fault exception mechanism, a runtime scratchpad memory manager (SPMM) tracks page accesses and copies frequently executed code pages to the SPM before they are executed. To minimize the number of page transfers from the external memory to the SPM, good code placement techniques become more important as the MMU page size increases. We discuss code-grouping techniques and provide an analysis of the effect of the MMU's page size on execution time, energy consumption, and external memory accesses. We show that by using the data cache as a victim buffer for the SPM, significant energy savings are possible. We evaluate our SPM allocation strategy with fifteen applications, including H.264, MP3, MPEG-4, and PGP. The proposed memory system requires 8% less die area compared to a fully cached configuration. On average, we achieve a 31% improvement in runtime performance and a 35% reduction in energy consumption with an MMU page size of 256 bytes.
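
The on-demand paging idea described in the abstract can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not the SPMM from the article: the class name `ScratchpadManager`, the helper `count_transfers`, the load-on-first-fault behavior, and the FIFO replacement policy are all hypothetical choices made for illustration. The abstract does not specify how the real manager selects or evicts pages.

```python
from collections import OrderedDict


class ScratchpadManager:
    """Toy model of a runtime scratchpad memory manager (hypothetical)."""

    def __init__(self, spm_size, page_size):
        if spm_size < page_size:
            raise ValueError("SPM must hold at least one page")
        self.page_size = page_size
        self.num_frames = spm_size // page_size
        # Maps a code page number to its SPM frame, kept in load order.
        self.resident = OrderedDict()
        self.page_faults = 0

    def fetch(self, address):
        """Simulate a fetch from the pageable region; return the SPM frame."""
        page = address // self.page_size
        if page in self.resident:
            # Page is already mapped to the SPM: no fault is raised.
            return self.resident[page]
        # Page fault: the page must be copied from external memory.
        self.page_faults += 1
        if len(self.resident) < self.num_frames:
            frame = len(self.resident)  # a free frame is still available
        else:
            # Assumed policy: evict the page that was loaded first (FIFO).
            _, frame = self.resident.popitem(last=False)
        self.resident[page] = frame
        return frame


def count_transfers(trace, spm_size, page_size):
    """Count page transfers to the SPM for a trace of fetch addresses."""
    spmm = ScratchpadManager(spm_size, page_size)
    for address in trace:
        spmm.fetch(address)
    return spmm.page_faults
```

For the address trace `[0, 4, 256, 260, 512, 0]` with 256-byte pages, `count_transfers` returns 4 for a 512-byte SPM and 3 for a 1024-byte SPM. The difference comes from the final fetch, which faults again in the smaller SPM because page 0 was evicted. This is the kind of transfer count that code placement aims to reduce.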

