research-article

Modular Switched Multiported SRAM-Based Memories

Published: 14 July 2016
Abstract

Multiported RAMs are essential for high-performance parallel computation systems. VLIW and vector processors, CGRAs, DSPs, CMPs, and other processing systems often rely on multiported memories for parallel access. Although memories with a large number of read and write ports are important, their high implementation cost means that they are used sparingly. As a result, FPGA vendors provide only dual-ported block RAMs (BRAMs) to handle the majority of usage patterns. Furthermore, recent attempts to create FPGA-based multiported memories suffer from low storage utilization. Whereas most approaches provide simple unidirectional ports, each fixed as either read or write, others propose true bidirectional ports where each port dynamically switches between read and write. True RAM ports are useful for systems with transceivers and provide high RAM flexibility; however, this flexibility incurs high BRAM consumption. In this article, a novel, modular, BRAM-based switched multiported RAM architecture is proposed. In addition to unidirectional ports with fixed read/write, this switched architecture allows a group of write ports to switch with another group of read ports dynamically, hence altering the number of active ports. The proposed switched-ports architecture is less flexible than a true multiported RAM, where each port is switched individually. Nevertheless, switched memories can dramatically reduce BRAM consumption compared to true ports for systems with alternating port requirements. Previous live-value-table (LVT) and XOR approaches are merged and optimized into a generalized and modular structure that we call an invalidation-based live-value-table (I-LVT).
Like a regular LVT, the I-LVT determines the correct bank to read from, but it differs in how updates to the table are made; the LVT approach requires multiple write ports, often leading to an area-intensive register-based implementation, whereas the XOR approach suffers from excessive storage overhead since wider memories are required to accommodate the XOR-ed data. Two specific I-LVT implementations are proposed and evaluated: binary and thermometer coding. The I-LVT approach is especially suitable for deep memories because the table is implemented only in SRAM cells. The I-LVT method gives higher performance while occupying fewer BRAMs than earlier approaches: for several configurations, BRAM usage is reduced by more than 44% and clock speed is improved by more than 76%. The I-LVT can be used with fixed-port, true-port, or the proposed switched-port architectures. Formal proofs for the suggested methods, resource consumption analysis, usage guidelines, and an analytic comparison to other methods are provided. A fully parameterized Verilog implementation is released as an open source library. The library has been extensively tested using Altera’s EDA tools.
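
The two earlier approaches that the abstract contrasts can be illustrated with a small behavioral model. The sketch below is not the I-LVT and is not taken from the released Verilog library; it is a minimal Python approximation, assuming one bank per write port and sequentially applied writes, and the class names `LvtMemory` and `XorMemory` are hypothetical. In the LVT model a table records which bank holds the live value for each address, so the table itself must accept an update from every write port. In the XOR model no table is needed, but every bank stores XOR-encoded data and a read combines all banks.

```python
from functools import reduce
from operator import xor


class LvtMemory:
    """Hypothetical behavioral model of an LVT-style multi-write memory.

    Each write port owns one bank; a live-value table (LVT) remembers,
    per address, which bank was written most recently.
    """

    def __init__(self, depth, n_write_ports):
        self.banks = [[0] * depth for _ in range(n_write_ports)]
        self.lvt = [0] * depth  # bank ID holding the live value per address

    def write(self, port, addr, data):
        self.banks[port][addr] = data  # a port writes only to its own bank
        self.lvt[addr] = port          # every write port must update the table

    def read(self, addr):
        # The table selects the bank that holds the latest value.
        return self.banks[self.lvt[addr]][addr]


class XorMemory:
    """Hypothetical behavioral model of an XOR-style multi-write memory.

    No table is kept; a write stores the data XOR-ed with the other banks'
    contents, so XOR-ing all banks on a read recovers the latest value.
    """

    def __init__(self, depth, n_write_ports):
        self.banks = [[0] * depth for _ in range(n_write_ports)]

    def write(self, port, addr, data):
        others = reduce(
            xor, (b[addr] for i, b in enumerate(self.banks) if i != port), 0
        )
        self.banks[port][addr] = data ^ others

    def read(self, addr):
        return reduce(xor, (b[addr] for b in self.banks), 0)
```

Writing the same address through different ports and then reading it back returns the most recent value in both models. The model applies writes one at a time, so it does not capture same-cycle timing, read-port replication, or BRAM mapping, which are the concerns the article's architecture addresses.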

Published in

ACM Transactions on Reconfigurable Technology and Systems, Volume 9, Issue 3
Special Issue on Reconfigurable Components with Source Code
September 2016
128 pages
ISSN: 1936-7406
EISSN: 1936-7414
DOI: 10.1145/2940351
Editor: Steve Wilton

Copyright © 2016 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 14 July 2016
• Accepted: 1 November 2015
• Revised: 1 September 2015
• Received: 1 February 2015

Qualifiers

• research-article
• Research
• Refereed
