Abstract
We introduce external synchrony, a new model for local file I/O that provides the reliability and simplicity of synchronous I/O, yet also closely approximates the performance of asynchronous I/O. An external observer cannot distinguish the output of a computer with an externally synchronous file system from the output of a computer with a synchronous file system. No application modification is required to use an externally synchronous file system. In fact, application developers can program to the simpler synchronous I/O abstraction and still receive excellent performance. We have implemented an externally synchronous file system for Linux, called xsyncfs. Xsyncfs provides the same durability and ordering guarantees as a synchronously mounted ext3 file system. Yet even for I/O-intensive benchmarks, xsyncfs performance is within 7% of ext3 mounted asynchronously. Compared to ext3 mounted synchronously, xsyncfs is up to two orders of magnitude faster.
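The observer-centric guarantee can be illustrated with a toy model. The sketch below is not xsyncfs code. It assumes a simplified rule in which any output produced while writes are still uncommitted is held back until those writes are durable. The class and method names (`ExternallySynchronousModel`, `write`, `emit`, `commit`) are hypothetical and chosen only for illustration.

```python
from collections import deque


class ExternallySynchronousModel:
    """Toy model (hypothetical, not xsyncfs): writes return immediately,
    but externally visible output is withheld until the writes that
    precede it have been committed."""

    def __init__(self):
        self.pending_writes = []        # modifications not yet durable
        self.durable = []               # modifications committed to "disk"
        self.buffered_output = deque()  # output waiting for a commit
        self.visible_output = []        # what an external observer sees

    def write(self, data):
        # Return without waiting for the disk, as asynchronous I/O would.
        self.pending_writes.append(data)

    def emit(self, message):
        # Simplifying assumption: any output produced while writes are
        # pending is treated as depending on them, so it is buffered.
        if self.pending_writes:
            self.buffered_output.append(message)
        else:
            self.visible_output.append(message)

    def commit(self):
        # Make pending writes durable, then release buffered output in order.
        self.durable.extend(self.pending_writes)
        self.pending_writes.clear()
        while self.buffered_output:
            self.visible_output.append(self.buffered_output.popleft())


model = ExternallySynchronousModel()
model.write("record 1")          # returns immediately
model.emit("saved record 1")     # held back: "record 1" is not durable yet
before_commit = list(model.visible_output)
model.commit()                   # write becomes durable, output is released
after_commit = list(model.visible_output)
```

In this model the observer never sees "saved record 1" before "record 1" is durable, which is the behavior a synchronous file system would also show, while the application itself never blocks on the write.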
Recommendations
Rethink the sync
OSDI '06: Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation - Volume 7
A multiple-file write scheme for improving write performance of small files in Fast File System
Fast File System (FFS) stores files to disk in separate disk writes, each of which incurs a disk positioning (seek + rotation) limiting the write performance for small files. We propose a new scheme called co-writing to accelerate small file writes in ...





