research-article

Emulating Goliath storage systems with David

Published: 02 February 2012
Abstract

Benchmarking file and storage systems on large file-system images is important, but difficult and often infeasible. Running benchmarks on such large disk setups is a frequent source of frustration for file-system evaluators; the scale alone acts as a strong deterrent against using larger, albeit realistic, benchmarks. To address this problem, we develop David: a system that makes it practical to run large benchmarks using the modest storage or memory capacities readily available on most computers. David creates a “compressed” version of the original file-system image by omitting all file data and laying out metadata more efficiently; an online storage model determines the runtime of the benchmark workload on the original uncompressed image. David works under any file system, as demonstrated in this article with ext3 and btrfs. We find that David reduces storage requirements by orders of magnitude; David is able to emulate a 1-TB target workload using only an 80-GB available disk, while still modeling the actual runtime accurately. David can also emulate newer or faster devices; for example, we show how David can effectively emulate a multidisk RAID using a limited amount of memory.
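
The two ideas in the abstract (store only metadata in a compact layout, and charge every request to a model of the original device) can be illustrated with a toy sketch. This is not David's implementation: the class names, the up-front set of metadata block numbers, and the seek and transfer costs are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

BLOCK_SIZE = 4096  # assumed block size, illustration only


@dataclass
class SimpleDiskModel:
    """Toy timing model: a fixed positioning cost for non-sequential
    accesses plus a per-block transfer cost. Both costs are made up."""
    seek_ms: float = 8.0
    transfer_ms_per_block: float = 0.05
    last_block: int = -1
    modeled_ms: float = 0.0

    def access(self, block: int) -> float:
        cost = self.transfer_ms_per_block
        if block != self.last_block + 1:  # not sequential: pay a seek
            cost += self.seek_ms
        self.last_block = block
        self.modeled_ms += cost
        return cost


@dataclass
class CompressedStore:
    """Keeps metadata blocks in densely packed slots and drops file data.
    Which blocks count as metadata is supplied by the caller here
    (hypothetical simplification)."""
    metadata_blocks: Set[int]
    model: SimpleDiskModel = field(default_factory=SimpleDiskModel)
    remap: Dict[int, int] = field(default_factory=dict)     # target block -> slot
    stored: Dict[int, bytes] = field(default_factory=dict)  # slot -> payload

    def write(self, block: int, payload: bytes) -> None:
        # Timing is always charged against the target block address.
        self.model.access(block)
        if block in self.metadata_blocks:
            slot = self.remap.setdefault(block, len(self.remap))
            self.stored[slot] = payload
        # Data-block payloads are discarded: no space is consumed.

    def read(self, block: int) -> bytes:
        self.model.access(block)
        if block in self.remap:
            return self.stored[self.remap[block]]
        return bytes(BLOCK_SIZE)  # synthetic contents for omitted data


store = CompressedStore(metadata_blocks={0, 1})
store.write(0, b"inode")                      # metadata: kept, slot 0
store.write(1_000_000, b"x" * BLOCK_SIZE)     # file data: dropped
print(len(store.stored), round(store.model.modeled_ms, 2))
```

In this sketch the space used depends only on how many metadata blocks were written, while the modeled time depends on the addresses the workload touched on the (much larger) target device.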


            • Published in

              ACM Transactions on Storage, Volume 7, Issue 4
              January 2012
              65 pages
              ISSN: 1553-3077
              EISSN: 1553-3093
              DOI: 10.1145/2078861

              Copyright © 2012 ACM

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 2 February 2012
              • Accepted: 1 August 2011
              • Received: 1 July 2011

              Qualifiers

              • research-article
              • Research
              • Refereed
