A Real-Time Multichannel Memory Controller and Optimal Mapping of Memory Clients to Memory Channels

Published: 17 February 2015

Abstract

Ever-increasing demands for main memory bandwidth and a better memory speed/power tradeoff have led to the introduction of memories with multiple memory channels, such as Wide IO DRAM. Efficient utilization of a multichannel memory as a shared resource in multiprocessor real-time systems depends on mapping the memory clients to the memory channels according to their requirements on latency, bandwidth, communication, and memory capacity. However, there is currently no real-time memory controller for multichannel memories, and there is no methodology to optimally configure multichannel memories in real-time systems. As a first step in this direction, we present two main contributions in this article: (1) a configurable real-time multichannel memory controller architecture with a novel method for logical-to-physical address translation and (2) two design-time methods to map memory clients to the memory channels, one an optimal algorithm based on an integer programming formulation of the mapping problem, and the other a fast heuristic algorithm. We demonstrate the real-time guarantees on bandwidth and latency provided by our multichannel memory controller architecture by experimental evaluation. Furthermore, we compare the mapping problem formulation, run in a solver, and the heuristic algorithm against two existing mapping algorithms in terms of computation time and mapping success ratio. We show that, for realistically sized problems, the solver finds an optimal solution in 2 hours, whereas the heuristic finds a solution in less than 1 second with less than 7% mapping failure. Finally, we demonstrate configuring a Wide IO DRAM in a high-definition (HD) video and graphics processing system to emphasize the practical applicability and effectiveness of this work.
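
To make the mapping problem concrete, the following is a minimal sketch of a generic first-fit-decreasing heuristic. It is not the heuristic proposed in the article, whose details the abstract does not give. It assumes that each client is mapped to exactly one channel and that only bandwidth and capacity requirements matter; latency and communication requirements are ignored. The names `Channel` and `first_fit_decreasing`, the client names, and all numbers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    bandwidth: float  # remaining bandwidth (illustrative unit: MB/s)
    capacity: float   # remaining capacity (illustrative unit: MB)
    clients: list = field(default_factory=list)


def first_fit_decreasing(clients, channels):
    """Map each client to one channel; return None on a mapping failure.

    clients maps a client name to its (bandwidth, capacity) requirement.
    The channels are updated in place, also when the mapping fails.
    """
    # Place the most bandwidth-hungry clients first.
    order = sorted(clients, key=lambda name: clients[name][0], reverse=True)
    mapping = {}
    for name in order:
        bw, cap = clients[name]
        for index, channel in enumerate(channels):
            # Take the first channel with enough bandwidth and capacity left.
            if channel.bandwidth >= bw and channel.capacity >= cap:
                channel.bandwidth -= bw
                channel.capacity -= cap
                channel.clients.append(name)
                mapping[name] = index
                break
        else:
            return None  # no channel can host this client
    return mapping


# Hypothetical example: two identical channels and four clients.
channels = [Channel(800.0, 128.0) for _ in range(2)]
clients = {
    "video": (600.0, 64.0),
    "gpu": (500.0, 64.0),
    "audio": (100.0, 16.0),
    "cpu": (250.0, 32.0),
}
mapping = first_fit_decreasing(clients, channels)
print(mapping)
```

A greedy pass like this is fast but can fail on instances that an exact integer programming formulation would still solve, which is the kind of tradeoff between computation time and mapping success ratio that the abstract reports.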
