Abstract
As technology scales down, energy consumption is becoming a major problem for traditional SRAM-based cache hierarchies. The emerging Spin-Torque Transfer RAM (STT-RAM) is a promising replacement for large on-chip caches because of its ultra-low leakage power and high storage density. However, write operations on STT-RAM consume considerably more energy and take longer than writes on SRAM. Hybrid caches consisting of both SRAM and STT-RAM have recently been proposed to achieve both performance and energy efficiency. Most management strategies for hybrid caches employ migration-based techniques that dynamically move write-intensive data from STT-RAM to SRAM, and these migrations incur extra overhead. In this paper, we propose a compiler-assisted approach, preferred caching, that significantly reduces the migration overhead by giving migration-intensive memory blocks preference for the SRAM part of the hybrid cache. Furthermore, we propose a data assignment technique to improve the efficiency of preferred caching. Reducing the migration overhead in turn improves the performance and energy efficiency of the STT-RAM based hybrid cache. Experimental results show that, on average, the proposed techniques reduce the number of migrations by 21.3%, the total latency by 8.0%, and the total dynamic energy by 10.8%.
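The idea of preferred caching can be illustrated with a toy model. The sketch below is not the paper's algorithm or experimental setup. It assumes a deliberately simplified policy in which a block is filled into STT-RAM by default, migrates to SRAM after a fixed number of writes, and is filled directly into SRAM if it is marked as preferred. The function name `count_migrations`, the `write_threshold` parameter, and the trace format are all hypothetical, and cache capacity and eviction are ignored.

```python
from collections import defaultdict

SRAM, STT = "sram", "stt"


def count_migrations(trace, preferred=frozenset(), write_threshold=2):
    """Replay (op, block) accesses on a toy hybrid cache and count migrations.

    Assumed toy policy (not taken from the paper): a block is filled into
    STT-RAM unless it is in `preferred`, and an STT-RAM block migrates to
    SRAM once it has received `write_threshold` writes while in STT-RAM.
    Capacity limits and evictions are ignored.
    """
    location = {}                     # block -> SRAM or STT
    writes_in_stt = defaultdict(int)  # writes seen while the block sits in STT-RAM
    migrations = 0
    for op, block in trace:
        if block not in location:
            # Preferred blocks go straight to SRAM on their first access.
            location[block] = SRAM if block in preferred else STT
        if op == "w" and location[block] == STT:
            writes_in_stt[block] += 1
            if writes_in_stt[block] >= write_threshold:
                location[block] = SRAM
                migrations += 1
    return migrations


# Hypothetical access trace: "w" is a write, "r" is a read.
trace = [("w", "A"), ("r", "B"), ("w", "A"), ("w", "A"),
         ("r", "B"), ("w", "C"), ("w", "C")]

baseline = count_migrations(trace)
with_preference = count_migrations(trace, preferred={"A"})
print(baseline, with_preference)
```

In this trace, marking block A as preferred avoids its migration, while block C still migrates once. In the paper the set of preferred blocks is chosen with compiler assistance; here it is simply passed in by hand.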
Compiler-assisted preferred caching for embedded systems with STT-RAM based hybrid cache
LCTES '12: Proceedings of the 13th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, Tools and Theory for Embedded Systems