Abstract
During the past decade, network and storage devices have undergone rapid performance improvements, delivering ultra-low latency and several Gbps of bandwidth. Nevertheless, current network and storage stacks fail to deliver this hardware performance to applications, often because stalled CPU performance erodes I/O efficiency. While many efforts address this issue in either the network stack or the storage stack alone, achieving high performance for networked-storage applications requires a holistic approach that considers both.
In this article, we present FlashNet, a software I/O stack that unifies high-performance network properties with flash storage access and management. FlashNet builds on RDMA principles and abstractions to provide a direct, asynchronous, end-to-end data path between a client and remote flash storage. The key insight behind FlashNet is to co-design the stack’s components (an RDMA controller, a flash controller, and a file system) to enable cross-stack optimizations and maximize I/O efficiency. In micro-benchmarks, FlashNet improves 4kB network I/O operations per second (IOPS) by 38.6% to 1.22M, decreases access latency by 43.5% to 50.4μs, and prolongs the flash lifetime by 1.6-5.9× for writes. We illustrate the capabilities of FlashNet by building a key-value (KV) store and by porting a distributed data store that uses RDMA onto it. Using FlashNet’s RDMA API improves the performance of the KV store by 2×, and the ported data store requires only minimal changes to access remote flash devices.
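The micro-benchmark figures above give only the improved values and the relative changes. As a rough sanity check, the short sketch below back-computes the baselines they imply. It assumes that each percentage is measured relative to the baseline system, which the abstract does not state explicitly; the function names are illustrative and not part of FlashNet.

```python
def baseline_before_gain(improved_value, percent_gain):
    """Baseline implied by 'improved by percent_gain% to improved_value'."""
    return improved_value / (1 + percent_gain / 100)


def baseline_before_reduction(reduced_value, percent_drop):
    """Baseline implied by 'decreased by percent_drop% to reduced_value'."""
    return reduced_value / (1 - percent_drop / 100)


# Figures quoted in the abstract; the relative-to-baseline reading is an assumption.
baseline_iops = baseline_before_gain(1.22e6, 38.6)
baseline_latency_us = baseline_before_reduction(50.4, 43.5)

print(f"implied baseline IOPS:    {baseline_iops:,.0f}")
print(f"implied baseline latency: {baseline_latency_us:.1f} us")
```

Under that assumption, the baseline would be roughly 0.88M IOPS and about 89μs of access latency for 4kB operations. These are derived values, not numbers reported in the abstract.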