research-article

Power-aware dynamic memory management on many-core platforms utilizing DVFS

Published: 06 December 2013

Abstract

Multicore platforms are already prevalent in modern embedded systems, and future embedded platforms will raise the processor core count further, forming many-core platforms. At the same time, applications are becoming more complex and dynamic and try to use the available platform resources efficiently. Efficient memory utilization is a key challenge for application developers, especially since memory is a scarce resource and often becomes the system's bottleneck. To cope with this dynamism and achieve better memory footprint utilization (low memory fragmentation), application developers resort to dynamic memory (heap) management techniques, allocating and deallocating data at runtime. Overall power consumption is another key challenge. To address it, designers employ Dynamic Voltage and Frequency Scaling (DVFS) mechanisms, which adapt to the application's computational demands at runtime. In this article, we propose combining dynamic memory management techniques with DVFS. This is done by integrating, within the memory manager, runtime monitoring mechanisms that steer the DVFS mechanisms to adjust clock frequency and supply voltage based on heap performance. The proposed approach has been evaluated on a distributed shared-memory many-core platform composed of multiple LEON3 processors interconnected by a Network-on-Chip infrastructure that supports DVFS. Experimental results show that using the proposed method for monitoring and applying DVFS mechanisms reduced the power consumption of dynamic memory management by approximately 37%. In addition, we present the trade-offs of the proposed approach. Last, by combining the developed method with heap fragmentation-aware dynamic memory managers, we achieve low heap fragmentation together with low power consumption.
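The idea of a memory manager whose runtime monitor steers DVFS can be illustrated with a minimal sketch. The abstract does not state which heap metric is monitored or how operating points are selected, so everything below is an assumption made for illustration: the request-rate metric, the thresholds, the window length, the `MonitoredHeap` class, and the operating points in `LEVELS` are all hypothetical and are not taken from the article.

```python
from dataclasses import dataclass

# Hypothetical (frequency in MHz, voltage in V) operating points.
# These values are illustrative only and do not come from the article.
LEVELS = [(50, 0.8), (100, 1.0), (200, 1.2)]


@dataclass
class MonitoredHeap:
    """Toy allocator wrapper that counts heap requests per monitoring window
    and moves one DVFS level up or down based on the observed request rate."""
    window: int = 100        # ticks per monitoring window (assumed)
    high_rate: float = 0.5   # requests per tick above which we speed up (assumed)
    low_rate: float = 0.1    # requests per tick below which we slow down (assumed)
    level: int = 1           # index into LEVELS
    _requests: int = 0
    _ticks: int = 0

    def malloc(self, size: int) -> bytearray:
        self._requests += 1
        return bytearray(size)  # stand-in for a real heap allocation

    def free(self, block: bytearray) -> None:
        self._requests += 1     # a deallocation also counts as heap activity

    def tick(self, n: int = 1) -> None:
        """Advance time; at the end of a window, re-evaluate the DVFS level."""
        self._ticks += n
        if self._ticks >= self.window:
            self._adjust()

    def _adjust(self) -> None:
        rate = self._requests / self._ticks
        if rate > self.high_rate and self.level < len(LEVELS) - 1:
            self.level += 1     # heap is busy: raise frequency and voltage
        elif rate < self.low_rate and self.level > 0:
            self.level -= 1     # heap is idle: lower frequency and voltage
        self._requests = 0
        self._ticks = 0

    @property
    def operating_point(self) -> tuple:
        return LEVELS[self.level]


heap = MonitoredHeap(window=10)
for _ in range(10):
    heap.malloc(8)
heap.tick(10)                   # busy window: the level steps up
busy_point = heap.operating_point
heap.tick(10)                   # idle window: the level steps back down
idle_point = heap.operating_point
```

In this sketch the monitor lives inside the allocator, so the scaling decision needs no extra instrumentation in the application. A real implementation would replace the request-rate metric with whatever heap performance measure the platform exposes and would write the chosen level to the hardware's DVFS controller.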
