Content Look-Aside Buffer for Redundancy-Free Virtual Disk I/O and Caching

Published: 8 April 2017

Abstract

Storage consolidation in a virtualized environment introduces numerous duplicate blocks across virtual disks and puts considerable pressure on disk I/O and caching. In this paper, we present a content look-aside buffer (CLB) approach that provides redundancy-free virtual disk I/O and caching at the same time. CLB attaches persistent fingerprints to virtual disk blocks, which enables detection of I/O redundancy before disk access. At run time, CLB uses content pages already present in the guest disk caches to service redundant reads through page sharing, thus eliminating both redundant I/O requests and redundant disk cache copies. For write requests, CLB uses a group invalidating writeback protocol to update fingerprints, which supports crash consistency while minimizing disk write overhead. By implementing and evaluating a CLB prototype on the KVM hypervisor, we demonstrate that CLB delivers considerably improved I/O performance with realistic workloads. Our CLB prototype improves the throughput of sequential and random reads on duplicate data by 4.1x and 26.2x, respectively. For typical read-intensive workloads, such as booting a VM and launching an application, CLB's I/O deduplication eliminates 94.9%--98.5% of read requests and its cache deduplication saves 50%--100% of cache memory in each VM. Compared with QEMU's raw virtual disk format, CLB improves the per-disk VM density by 8x--16x. For mixed read-write workloads, the cost of on-line fingerprint updating offsets the read benefit; nevertheless, CLB substantially improves overall performance.
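The read path described in the abstract can be illustrated with a minimal sketch. It is a toy model, not the authors' KVM implementation: the class names `VirtualDisk` and `ContentLookAsideBuffer`, the use of SHA-1 as the fingerprint function, and the in-memory dictionary standing in for shared guest cache pages are all assumptions made for illustration. The write path and the group invalidating writeback protocol are not modeled.

```python
import hashlib
from typing import Dict, List, Tuple


def fingerprint(block: bytes) -> bytes:
    """Content fingerprint of a block (SHA-1 is an assumption of this sketch)."""
    return hashlib.sha1(block).digest()


class VirtualDisk:
    """Hypothetical virtual disk that keeps a fingerprint next to every block."""

    def __init__(self, blocks: List[bytes]) -> None:
        self.blocks = list(blocks)
        # Stands in for the persistent per-block fingerprints.
        self.fingerprints = [fingerprint(b) for b in self.blocks]
        self.disk_reads = 0  # number of reads that really hit the disk

    def read_block(self, lba: int) -> bytes:
        self.disk_reads += 1
        return self.blocks[lba]


class ContentLookAsideBuffer:
    """Hypothetical buffer mapping a fingerprint to one shared cached page."""

    def __init__(self) -> None:
        self.pages: Dict[bytes, bytes] = {}
        self.hits = 0

    def read(self, disk: VirtualDisk, lba: int) -> bytes:
        # The fingerprint is consulted before any data is read from disk.
        fp = disk.fingerprints[lba]
        page = self.pages.get(fp)
        if page is not None:
            # Redundant read: return the already cached page (models page sharing).
            self.hits += 1
            return page
        # First time this content is seen: read it once and remember it.
        page = disk.read_block(lba)
        self.pages[fp] = page
        return page


def demo() -> Tuple[int, int, int]:
    """Two VMs share two base-image blocks and each has one private block."""
    base = [b"kernel" * 10, b"libc" * 10]
    vm_a = VirtualDisk(base + [b"data-a"])
    vm_b = VirtualDisk(base + [b"data-b"])
    clb = ContentLookAsideBuffer()
    for disk in (vm_a, vm_b):
        for lba in range(3):
            clb.read(disk, lba)
    return vm_a.disk_reads, vm_b.disk_reads, clb.hits


if __name__ == "__main__":
    print(*demo())  # prints: 3 1 2
```

In this toy run the second VM issues only one real disk read, because its two base-image blocks are served from pages the first VM already brought in; the same dictionary entry also means only one copy of each shared block is kept in memory.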

Published in

ACM SIGPLAN Notices, Volume 52, Issue 7 (VEE '17), July 2017, 256 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/3140607

VEE '17: Proceedings of the 13th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, April 2017, 261 pages
ISBN: 9781450349482
DOI: 10.1145/3050748

Copyright © 2017 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Qualifiers

  • tutorial
  • Research
  • Refereed limited
