Placement of Linked Dynamic Data Structures over Heterogeneous Memories in Embedded Systems

Published: 17 February 2015

Abstract

Software applications use dynamic memory (allocated and deallocated in the system's heap) to handle dynamism in their working conditions. Embedded systems tend to include complex memory organizations, but most techniques for dynamic memory management do not deal with the placement of data objects in physical memory modules. Additionally, the performance of hardware-controlled cache memories may be severely hindered when they are used with linked data structures. We therefore present a methodology to map dynamic data onto the multilevel memory subsystem of embedded systems, taking advantage of any available memories (e.g., on-chip SRAMs) and avoiding interference with the cache memories. The resulting data placement uses an exclusive memory model and is compatible with existing techniques for managing static data. Our methodology helps the designer achieve, in an automated way, the reductions in energy consumption and execution time that an expert could obtain, while keeping control over the process through multiple configuration knobs.
