Abstract
We explore techniques to reverse-engineer the properties of embedded DRAM memory controllers (MCs), including page policies, address mapping, and command arbitration. Knowing these properties has several benefits: it allows tightening the worst-case bounds of embedded systems and enables platform-aware optimizations at the operating-system, source-code, and compiler levels. We develop a latency-based analysis and use it to devise algorithms and C programs that extract MC properties. We show the effectiveness of the proposed approach by reverse-engineering the MC details of the Xilinx XUPV5-LX110T platform. Furthermore, to cover a broader range of policies, we use a simulation framework and document our findings.
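To make the idea of latency-based analysis concrete, the following is a minimal sketch and not the paper's algorithms or C programs. It assumes a toy simulated controller (`ToyController`, hypothetical) with an open-page policy, one row buffer per bank, and two illustrative latency values (`HIT`, `MISS`). The probe accesses a base address, then a second address that differs in chosen bits, then the base address again; whether the third access is fast or slow reveals whether the second access disturbed the base address's open row. Python is used here only for brevity.

```python
# Toy illustration of latency-based address-mapping inference.
# All names, latencies, and the controller model are assumptions for this sketch.

HIT, MISS = 10, 30            # hypothetical latencies in cycles
THRESHOLD = (HIT + MISS) / 2  # anything above this is treated as a row miss


class ToyController:
    """Simulated MC: open-page policy, one row buffer per bank."""

    def __init__(self, bank_bits, row_bits):
        self.bank_bits = bank_bits  # address bit positions selecting the bank
        self.row_bits = row_bits    # address bit positions selecting the row
        self.open_rows = {}         # bank id -> currently open row id

    @staticmethod
    def _field(addr, bits):
        return tuple((addr >> b) & 1 for b in bits)

    def access(self, addr):
        """Return the latency of one access and update the open row."""
        bank = self._field(addr, self.bank_bits)
        row = self._field(addr, self.row_bits)
        latency = HIT if self.open_rows.get(bank) == row else MISS
        self.open_rows[bank] = row
        return latency


def third_access_latency(mc, first, second):
    """Access first, second, first; return the latency of the last access."""
    mc.access(first)
    mc.access(second)
    return mc.access(first)


def infer_mapping(mc, addr_width, base=0):
    """Classify each address bit as a column, bank, or row bit."""
    # Step 1: flipping a row bit evicts the base row, so the re-access is slow.
    row = [b for b in range(addr_width)
           if third_access_latency(mc, base, base ^ (1 << b)) > THRESHOLD]
    if not row:
        raise ValueError("no row bit found; cannot separate bank from column")
    r = row[0]
    bank, column = [], []
    # Step 2: also flip a known row bit. A bank bit moves the probe to another
    # bank (base row stays open, fast); a column bit does not (slow).
    for b in range(addr_width):
        if b in row:
            continue
        probe = base ^ (1 << b) ^ (1 << r)
        slow = third_access_latency(mc, base, probe) > THRESHOLD
        (column if slow else bank).append(b)
    return {"column": column, "bank": bank, "row": row}


if __name__ == "__main__":
    mc = ToyController(bank_bits=[3, 4], row_bits=[5, 6, 7])
    print(infer_mapping(mc, addr_width=8))
```

On real hardware the latencies would be measured and noisy, caches would have to be bypassed, and the controller could use a different page policy or a hashed mapping, none of which this sketch models.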
Exposing Implementation Details of Embedded DRAM Memory Controllers through Latency-based Analysis