ZNSwap: un-Block your Swap

Published: 06 March 2023

Abstract

We introduce ZNSwap, a novel swap subsystem optimized for the recent Zoned Namespace (ZNS) SSDs. ZNSwap leverages ZNS’s explicit control over data management on the drive and introduces a space-efficient host-side Garbage Collector (GC) for swap storage co-designed with the OS swap logic. ZNSwap enables cross-layer optimizations, such as direct access to the in-kernel swap usage statistics by the GC to enable fine-grained swap storage management, and correct accounting of the GC bandwidth usage in the OS resource isolation mechanisms to improve performance isolation in multi-tenant environments. We evaluate ZNSwap using standard Linux swap benchmarks and two production key-value stores. ZNSwap shows significant performance benefits over the Linux swap on traditional SSDs, such as stable throughput for different memory access patterns, and 10× lower 99th percentile latency and 5× higher throughput for the memcached key-value store under realistic usage scenarios.


Published in

ACM Transactions on Storage, Volume 19, Issue 2
May 2023
269 pages
ISSN: 1553-3077
EISSN: 1553-3093
DOI: 10.1145/3585541

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 6 March 2023
• Online AM: 1 February 2023
• Revised: 19 January 2023
• Accepted: 19 January 2023
• Received: 14 December 2022
