
Transparent Online Storage Compression at the Block-Level

Published: 01 May 2012

Abstract

In this work, we examine how transparent block-level compression in the I/O path can improve both the space efficiency and the performance of online storage. We present ZBD, a block-layer driver that transparently compresses and decompresses data as it flows between the file system and storage devices. Our system provides support for variable-size blocks, metadata caching, and persistence, as well as block allocation and cleanup. ZBD aims to maintain high performance by mitigating compression and decompression overheads, which can otherwise have a significant impact on performance; it does so by leveraging modern multicore CPUs through explicit work scheduling. We present two case studies for compression. First, we examine how our approach can be used to increase the capacity of SSD-based caches, thus increasing their cost-effectiveness. Then, we examine how ZBD can improve the efficiency of online disk-based storage systems.

We evaluate our approach in the Linux kernel on a commodity server with multicore CPUs, using PostMark, SPECsfs2008, TPC-C, and TPC-H. Preliminary results show that transparent online block-level compression is a viable option for improving effective storage capacity. It can improve I/O performance by up to 80% by reducing I/O traffic and seek distance, and it has a negative impact on performance, of up to 34%, only when single-thread I/O latency is critical. In particular, for SSD-based caching, our results indicate that, in line with current technology trends, compressed caching trades off CPU utilization for performance and enhances SSD efficiency as a storage cache by up to 99%.
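
To make the idea of variable-size blocks and the metadata they require more concrete, the following is a minimal user-space sketch; it is not the ZBD driver and does not reflect its actual data structures. It assumes a 4 KB logical block, uses Python's `zlib` as a stand-in compressor, and models the device as an append-only in-memory log. The names `CompressedBlockStore`, `extent_map`, and `live_bytes` are hypothetical and introduced only for illustration.

```python
import zlib

BLOCK_SIZE = 4096  # logical block size in bytes (an assumed value)


class CompressedBlockStore:
    """Toy in-memory model of a compressing block layer (hypothetical, not ZBD)."""

    def __init__(self):
        self.log = bytearray()  # stands in for the physical device
        # Metadata: logical block number -> (offset, length, is_compressed)
        self.extent_map = {}

    def write_block(self, lbn, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("expected exactly one logical block")
        packed = zlib.compress(data)
        # Store the block raw when compression does not shrink it.
        compressed = len(packed) < BLOCK_SIZE
        payload = packed if compressed else bytes(data)
        # Append-only placement: overwriting a block leaves its old
        # extent behind as garbage that a cleanup pass would reclaim.
        self.extent_map[lbn] = (len(self.log), len(payload), compressed)
        self.log += payload

    def read_block(self, lbn):
        offset, length, compressed = self.extent_map[lbn]
        payload = bytes(self.log[offset:offset + length])
        return zlib.decompress(payload) if compressed else payload

    def live_bytes(self):
        """Physical bytes still referenced by the mapping."""
        return sum(length for _, length, _ in self.extent_map.values())


store = CompressedBlockStore()
store.write_block(0, b"A" * BLOCK_SIZE)
assert store.read_block(0) == b"A" * BLOCK_SIZE
```

Because compressed extents vary in size, the layer needs a logical-to-physical mapping, and overwrites leave stale extents behind. This is one way to see why the abstract lists metadata handling, block allocation, and cleanup alongside compression itself.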



        • Published in

          ACM Transactions on Storage  Volume 8, Issue 2
          May 2012
          89 pages
          ISSN:1553-3077
          EISSN:1553-3093
          DOI:10.1145/2180905

          Copyright © 2012 ACM

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 1 May 2012
          • Accepted: 1 October 2011
          • Revised: 1 July 2011
          • Received: 1 March 2011

          Qualifiers

          • research-article
          • Research
          • Refereed
