Abstract
The rapid growth of data in recent years requires datacenter infrastructure to store and process data with extremely high throughput and low latency. Persistent memory (PM) and RDMA technologies bring new opportunities toward this goal: both are capable of delivering more than 10 GB/s of bandwidth and sub-microsecond latency. However, our past experience and recent studies show that it is non-trivial to build an efficient distributed storage system with such new hardware. In this article, we design and implement TH-DPMS (TsingHua Distributed Persistent Memory System), which is based on persistent memory and RDMA and unifies the memory, file system, and key-value interfaces in a single system. TH-DPMS is built on a unified distributed persistent memory abstraction, pDSM. pDSM acts as a generic layer that connects the PMs of different storage nodes via a high-speed RDMA network and organizes them into a global shared address space. It provides the fundamental functionalities, including global address management, space management, fault tolerance, and crash consistency guarantees. Applications access pDSM through a group of flexible and easy-to-use APIs, using either raw read/write interfaces or transactional ones with ACID guarantees. Based on pDSM, we implement a distributed file system, pDFS, and a key-value store, pDKVS. Together, they provide TH-DPMS with high-performance, low-latency, and fault-tolerant data storage. We evaluate TH-DPMS with both micro-benchmarks and real-world memory-intensive workloads. Experimental results show that TH-DPMS is capable of delivering an aggregated bandwidth of 120 GB/s with 6 nodes. When processing memory-intensive workloads such as YCSB and Graph500, TH-DPMS improves performance by one order of magnitude compared to existing systems and maintains consistently high efficiency when the workload size grows to multiple terabytes.
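The idea of a global shared address space with both raw and transactional access can be illustrated with a small, in-process sketch. It is not the TH-DPMS or pDSM API: the abstract does not specify an address layout or a transaction mechanism, so the 16/48-bit split, the undo log, and every name below (`make_gaddr`, `ToySharedSpace`, `ToyTransaction`) are assumptions made for illustration. Each "node" is a plain `bytearray` standing in for persistent memory reached over RDMA, and only the atomicity part of ACID is modelled.

```python
from typing import List, Tuple

# Assumed layout (not from the article): upper 16 bits pick the node,
# lower 48 bits are the byte offset inside that node's memory.
NODE_BITS = 16
OFFSET_BITS = 48
OFFSET_MASK = (1 << OFFSET_BITS) - 1


def make_gaddr(node_id: int, offset: int) -> int:
    """Pack (node_id, offset) into a single 64-bit global address."""
    if not 0 <= node_id < (1 << NODE_BITS):
        raise ValueError("node_id out of range")
    if not 0 <= offset <= OFFSET_MASK:
        raise ValueError("offset out of range")
    return (node_id << OFFSET_BITS) | offset


def split_gaddr(gaddr: int) -> Tuple[int, int]:
    """Recover (node_id, offset) from a global address."""
    return gaddr >> OFFSET_BITS, gaddr & OFFSET_MASK


class ToySharedSpace:
    """Hypothetical stand-in: one bytearray per node instead of PM over RDMA."""

    def __init__(self, num_nodes: int, bytes_per_node: int) -> None:
        self.nodes: List[bytearray] = [
            bytearray(bytes_per_node) for _ in range(num_nodes)
        ]

    def read(self, gaddr: int, length: int) -> bytes:
        """Raw read of `length` bytes starting at a global address."""
        node_id, offset = split_gaddr(gaddr)
        region = self.nodes[node_id]
        if offset + length > len(region):
            raise IndexError("read past end of node memory")
        return bytes(region[offset:offset + length])

    def write(self, gaddr: int, data: bytes) -> None:
        """Raw write with no atomicity guarantee across calls."""
        node_id, offset = split_gaddr(gaddr)
        region = self.nodes[node_id]
        if offset + len(data) > len(region):
            raise IndexError("write past end of node memory")
        region[offset:offset + len(data)] = data

    def transaction(self) -> "ToyTransaction":
        return ToyTransaction(self)


class ToyTransaction:
    """Undo-log transaction: either every write survives or none does."""

    def __init__(self, space: ToySharedSpace) -> None:
        self.space = space
        self.undo: List[Tuple[int, bytes]] = []  # (gaddr, previous bytes)

    def write(self, gaddr: int, data: bytes) -> None:
        # Save the old contents before overwriting so they can be restored.
        self.undo.append((gaddr, self.space.read(gaddr, len(data))))
        self.space.write(gaddr, data)

    def __enter__(self) -> "ToyTransaction":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        if exc_type is not None:
            # Roll back in reverse order so overlapping writes restore correctly.
            for gaddr, old in reversed(self.undo):
                self.space.write(gaddr, old)
        self.undo.clear()
        return False  # do not swallow the exception


if __name__ == "__main__":
    space = ToySharedSpace(num_nodes=6, bytes_per_node=4096)
    addr = make_gaddr(node_id=2, offset=128)
    space.write(addr, b"hello")
    try:
        with space.transaction() as tx:
            tx.write(addr, b"HELLO")
            raise RuntimeError("simulated failure before commit")
    except RuntimeError:
        pass
    print(space.read(addr, 5))
```

In this sketch the caller never names a node directly; the node is derived from the address, which is the property a shared address space gives applications. A real system would additionally need persistence ordering, replication, and concurrency control, none of which is modelled here.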
TH-DPMS: Design and Implementation of an RDMA-enabled Distributed Persistent Memory Storage System