Abstract
The memory subsystem is increasingly the target of intensive energy minimization efforts in embedded and System-on-Chip development. While the main focus is typically on reducing energy consumption, other optimization aspects such as peak power constraints and time budgets are becoming increasingly relevant. This article makes the following contributions. First, different Static Random-Access Memory (SRAM) power modes and their characteristics are presented, based on industrial-grade information. Using this information, a comprehensive optimization model whose primary objective is energy minimization is defined. It is based on memory access statistics that represent the embedded software of interest, which allows for application-tailored improvements. Further, it considers different power states of the memory subsystem and enables the definition of peak power and time corridor constraints. The presented two-stage implementation of this optimization model allows the handling of large design spaces. Clearly defined interfaces facilitate the exchange of individual workflow parts in a plug-and-play fashion and further enable a neat integration of our optimization method with existing hardware/software (HW/SW) codesign synthesis flows. A general evaluation for different technology nodes shows that the optimization potential of memory low-power modes increases with advancing miniaturization but also depends on the data footprint of the embedded software. Experimental results for a set of benchmark applications confirm these findings and show energy savings of up to 90%, and over 60% on average, compared to a monolithic memory layout without low-power modes.
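To make the kind of model described in the abstract more concrete, the following is a minimal toy sketch in Python. It is not the article's formulation or its two-stage implementation. The two power modes, all numeric values, the windowed access statistics, and the names `Mode`, `evaluate`, and `optimize` are assumptions chosen purely for illustration. The sketch picks one idle power mode per memory bank by exhaustive search, minimizing energy subject to a peak power cap and a time budget.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Mode:
    name: str
    leak_power: float   # leakage power while in this mode (arbitrary units)
    wake_energy: float  # energy needed to return to the active mode
    wake_time: float    # time needed to return to the active mode


# Hypothetical mode parameters, not taken from the article.
ACTIVE = Mode("active", 1.0, 0.0, 0.0)
SLEEP = Mode("sleep", 0.2, 0.5, 1.0)


def evaluate(accesses, idle_modes, window=10.0, access_energy=0.1):
    """Return (energy, peak_power, time) for one mode assignment.

    accesses[b][w] is the number of accesses to bank b in time window w.
    idle_modes[b] is the mode bank b enters in windows without accesses.
    Simplifications: wake-ups are serialized and extend the total time,
    and the power of a window is its energy divided by the window length.
    """
    n_windows = len(accesses[0])
    total_energy = 0.0
    total_time = n_windows * window
    peak_power = 0.0
    asleep = [False] * len(accesses)  # every bank starts in active mode
    for w in range(n_windows):
        window_energy = 0.0
        for b, counts in enumerate(accesses):
            if counts[w] > 0:
                if asleep[b]:  # pay the wake-up penalty before accessing
                    window_energy += idle_modes[b].wake_energy
                    total_time += idle_modes[b].wake_time
                    asleep[b] = False
                window_energy += ACTIVE.leak_power * window
                window_energy += counts[w] * access_energy
            else:
                window_energy += idle_modes[b].leak_power * window
                asleep[b] = idle_modes[b] is not ACTIVE
        total_energy += window_energy
        peak_power = max(peak_power, window_energy / window)
    return total_energy, peak_power, total_time


def optimize(accesses, peak_cap, time_budget, modes=(ACTIVE, SLEEP)):
    """Exhaustively search idle-mode assignments; return (energy, modes) or None."""
    best = None
    for choice in product(modes, repeat=len(accesses)):
        energy, peak, time = evaluate(accesses, choice)
        if peak <= peak_cap and time <= time_budget:
            if best is None or energy < best[0]:
                best = (energy, choice)
    return best


if __name__ == "__main__":
    stats = [[5, 0, 0, 0], [1, 1, 1, 1]]  # made-up access statistics
    print(optimize(stats, peak_cap=3.0, time_budget=45.0))
```

Exhaustive search grows exponentially with the number of banks, so it is only usable for tiny examples like this one. It stands in here for whatever solver a real flow would use.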
Index Terms
Power-mode-aware Memory Subsystem Optimization for Low-power System-on-Chip Design