Abstract
Versioning file systems provide the ability to recover from a variety of failures, including file corruption, virus and worm infestations, and user mistakes. However, using versions to recover from data-corrupting events requires a human to determine precisely which files and versions to restore. We can create more meaningful versions, and enhance their value, by capturing the causal connections among files. These connections facilitate selection and recovery of precisely the right versions after data-corrupting events.
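To make the idea of causal connections concrete, the following is a minimal sketch, not the system described in this paper. It assumes causal relationships are available as a list of (source version, derived version) edges, and it walks those edges to find every file version that depends on a corrupted one. The function name `affected_versions`, the edge representation, and the file names are all hypothetical.

```python
from collections import defaultdict, deque


def affected_versions(edges, corrupted):
    """Return the corrupted version plus every version derived from it.

    edges: iterable of (source, derived) pairs, where each element is a
           (filename, version_number) tuple. This layout is an assumption
           made for illustration only.
    """
    children = defaultdict(list)
    for src, dst in edges:
        children[src].append(dst)

    seen = {corrupted}
    queue = deque([corrupted])
    while queue:  # breadth-first walk over causal descendants
        node = queue.popleft()
        for nxt in children[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical history: a.txt v1 fed b.txt v1, which fed c.txt v2.
# c.txt v1 was derived from an unrelated file and is therefore unaffected.
edges = [
    (("a.txt", 1), ("b.txt", 1)),
    (("b.txt", 1), ("c.txt", 2)),
    (("d.txt", 1), ("c.txt", 1)),
]
tainted = affected_versions(edges, ("a.txt", 1))
```

In this toy history, a corruption in `a.txt` version 1 marks `b.txt` version 1 and `c.txt` version 2 as candidates for restoration, while `c.txt` version 1 remains a safe version to fall back to.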
We determine when to create new versions of files automatically using the causal relationships among files. The literature on versioning file systems usually examines two extremes of possible version-creation algorithms: open-to-close versioning and versioning on every write. We evaluate causal versions of these two algorithms and introduce two additional causality-based algorithms: Cycle-Avoidance and Graph-Finesse.
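The two extremes named above can be contrasted with a small sketch. It assumes that open-to-close versioning creates one version per open/close session in which the file was written, and that versioning on every write creates one version per write call. The trace format, the policy names, and the function `count_versions` are hypothetical and are not taken from the paper.

```python
def count_versions(trace, policy):
    """Count the versions each file would accumulate under a policy.

    trace:  list of (op, filename) pairs with op in {"open", "write", "close"}.
    policy: "open_close" or "every_write" (illustrative names).
    """
    if policy not in ("open_close", "every_write"):
        raise ValueError(f"unknown policy: {policy}")

    versions = {}
    dirty = set()  # files written since their last close
    for op, name in trace:
        if policy == "every_write":
            if op == "write":
                versions[name] = versions.get(name, 0) + 1
        else:  # open_close
            if op == "write":
                dirty.add(name)
            elif op == "close" and name in dirty:
                dirty.discard(name)
                versions[name] = versions.get(name, 0) + 1
    return versions


# One file, two sessions: three writes in the first, one in the second.
trace = [
    ("open", "f"), ("write", "f"), ("write", "f"), ("write", "f"), ("close", "f"),
    ("open", "f"), ("write", "f"), ("close", "f"),
]
per_session = count_versions(trace, "open_close")
per_write = count_versions(trace, "every_write")
```

For this trace the open-to-close policy yields two versions of `f` and the every-write policy yields four, which hints at why the version count, and with it the storage cost, grows quickly at the every-write extreme.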
We show that capturing and maintaining causal relationships imposes less than 7% overhead on a versioning system, providing benefit at low cost. We then show that Cycle-Avoidance provides more meaningful versions of files created during concurrent program execution, with overhead comparable to open-to-close versioning. Graph-Finesse provides even greater control, frequently at comparable overhead, but sometimes at unacceptable overhead. Versioning on every write is an interesting extreme case, but is far too costly to be useful in practice.
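The abstract does not describe how Cycle-Avoidance or Graph-Finesse work, so the sketch below implements neither. It is only a generic illustration, under assumed data structures, of why concurrent programs can produce cyclic dependencies and how creating a new version keeps a causal graph acyclic. The class `CausalGraph`, its methods, and the process and file names are hypothetical.

```python
class CausalGraph:
    """Toy dependency graph over (name, version) nodes (illustrative only)."""

    def __init__(self):
        self.parents = {}  # node -> set of nodes it depends on
        self.current = {}  # object name -> current version number

    def node(self, name):
        """Return the current (name, version) node, creating it if needed."""
        version = self.current.setdefault(name, 1)
        n = (name, version)
        self.parents.setdefault(n, set())
        return n

    def _depends_on(self, start, target):
        """True if `start` is `target` or transitively depends on it."""
        stack, seen = [start], set()
        while stack:
            n = stack.pop()
            if n == target:
                return True
            if n in seen:
                continue
            seen.add(n)
            stack.extend(self.parents.get(n, ()))
        return False

    def add_dependency(self, dst_name, src_name):
        """Record that dst depends on src, versioning dst to avoid a cycle."""
        src = self.node(src_name)
        dst = self.node(dst_name)
        if self._depends_on(src, dst):
            # src already depends on dst, so a direct edge would close a loop.
            # Start a new version of dst that descends from the old one.
            old = dst
            self.current[dst_name] += 1
            dst = self.node(dst_name)
            self.parents[dst].add(old)
        self.parents[dst].add(src)
        return dst


# Two hypothetical concurrent processes, P and Q, exchanging data via files.
g = CausalGraph()
g.add_dependency("P", "A")           # P reads A
g.add_dependency("B", "P")           # P writes B
g.add_dependency("Q", "B")           # Q reads B
result = g.add_dependency("A", "Q")  # Q writes A: would close the loop
```

In this scenario the final write would make `A` depend on itself through `P`, `B`, and `Q`. The sketch instead creates version 2 of `A`, which depends on both version 1 and `Q`, so the graph stays acyclic.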