Abstract
Storage consolidation in a virtualized environment introduces numerous duplications in virtual disks and imposes considerable pressure on disk I/O and caching. In this paper, we present a content look-aside buffer (CLB) approach that simultaneously provides redundancy-free virtual disk I/O and caching. CLB attaches persistent fingerprints to virtual disk blocks, which enables detection of I/O redundancy before disk access. At run time, CLB exploits content pages already present in the guest disk caches to service redundant reads through page sharing, thus eliminating both redundant I/O requests and redundant disk cache copies. For write requests, CLB updates fingerprints with a group invalidating writeback protocol, which supports crash consistency while minimizing disk write overhead. By implementing and evaluating a CLB prototype on the KVM hypervisor, we demonstrate that CLB delivers considerably improved I/O performance with realistic workloads. Our CLB prototype improves the throughput of sequential and random reads on duplicate data by 4.1x and 26.2x, respectively. For typical read-intensive workloads, such as booting a VM and launching an application, CLB's I/O deduplication and cache deduplication eliminate 94.9%--98.5% of read requests and save 50%--100% of cache memory in each VM, respectively. Compared with QEMU's raw virtual disk format, CLB improves the per-disk VM density by 8x--16x. For mixed read-write workloads, the cost of on-line fingerprint updating partly offsets the read benefit; nevertheless, CLB substantially improves overall performance.
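The read path described in the abstract can be illustrated with a toy model. The following is a minimal sketch, not the CLB prototype: it assumes a 4 KiB block size, SHA-1 as the fingerprint function, and an in-memory dictionary standing in for the virtual disk. The class `ContentLookAsideBuffer` and all of its fields are hypothetical names introduced for illustration. Write handling and the group invalidating writeback protocol are omitted because the abstract does not specify them.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block/page size, not taken from the paper


def fingerprint(data: bytes) -> bytes:
    """Content fingerprint of one block (SHA-1 is an assumption here)."""
    return hashlib.sha1(data).digest()


class ContentLookAsideBuffer:
    """Toy model: fingerprint -> cached page, consulted before any disk read."""

    def __init__(self, disk_blocks, block_fingerprints):
        self.disk = disk_blocks        # block number -> bytes (stand-in for the disk)
        self.fps = block_fingerprints  # block number -> persistent fingerprint
        self.pages = {}                # fingerprint -> shared cached page
        self.disk_reads = 0            # counts reads that actually reach the disk

    def read(self, block_no: int) -> bytes:
        fp = self.fps[block_no]        # fingerprint is known before disk access
        page = self.pages.get(fp)
        if page is None:               # content not cached yet: real disk I/O
            self.disk_reads += 1
            page = self.disk[block_no]
            self.pages[fp] = page
        return page                    # on a hit, the same page object is shared


# Blocks 0 and 2 hold identical content; block 1 differs.
disk = {0: b"A" * BLOCK_SIZE, 1: b"B" * BLOCK_SIZE, 2: b"A" * BLOCK_SIZE}
fps = {n: fingerprint(d) for n, d in disk.items()}
clb = ContentLookAsideBuffer(disk, fps)

first = clb.read(0)   # miss: goes to the disk
second = clb.read(2)  # duplicate content: served from the cached page
print(clb.disk_reads, first is second)
```

In this sketch the second read costs neither a disk request nor a second cache copy, which mirrors the two kinds of redundancy the abstract says CLB eliminates.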
Content Look-Aside Buffer for Redundancy-Free Virtual Disk I/O and Caching
VEE '17: Proceedings of the 13th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments