Abstract
Ever-increasing demands on main memory bandwidth and on the memory speed/power tradeoff have led to the introduction of memories with multiple memory channels, such as Wide IO DRAM. Efficient utilization of a multichannel memory as a shared resource in multiprocessor real-time systems depends on mapping the memory clients to the memory channels according to their requirements on latency, bandwidth, communication, and memory capacity. However, there is currently no real-time memory controller for multichannel memories, and there is no methodology to optimally configure multichannel memories in real-time systems. As a first step in this direction, we present two main contributions in this article: (1) a configurable real-time multichannel memory controller architecture with a novel method for logical-to-physical address translation and (2) two design-time methods to map memory clients to the memory channels: an optimal algorithm based on an integer programming formulation of the mapping problem, and a fast heuristic algorithm. We demonstrate the real-time guarantees on bandwidth and latency provided by our multichannel memory controller architecture by experimental evaluation. Furthermore, we compare the mapping problem formulation, run in a solver, and the heuristic algorithm against two existing mapping algorithms in terms of computation time and mapping success ratio. We show that, for realistically sized problems, an optimal solution can be found in 2 hours using the solver, while the heuristic runs in less than 1 second with less than 7% mapping failure. Finally, we demonstrate configuring a Wide IO DRAM in a high-definition (HD) video and graphics processing system to emphasize the practical applicability and effectiveness of this work.
Index Terms
A Real-Time Multichannel Memory Controller and Optimal Mapping of Memory Clients to Memory Channels