Abstract
Benchmarking file and storage systems on large file-system images is important, but difficult and often infeasible. Running benchmarks on such large disk setups is a frequent source of frustration for file-system evaluators; the scale alone acts as a strong deterrent against using larger, albeit realistic, benchmarks. To address this problem, we develop David: a system that makes it practical to run large benchmarks using the modest storage or memory capacity readily available on most computers. David creates a “compressed” version of the original file-system image by omitting all file data and laying out metadata more efficiently; an online storage model determines the runtime of the benchmark workload on the original uncompressed image. David works under any file system, as demonstrated in this article with ext3 and btrfs. We find that David reduces storage requirements by orders of magnitude; David is able to emulate a 1-TB target workload using only an 80-GB available disk, while still modeling the actual runtime accurately. David can also emulate newer or faster devices; for example, we show how David can effectively emulate a multidisk RAID using a limited amount of memory.