Abstract
The layered design of the Linux operating system hides the liveness of file system data from the underlying block layers. Without this liveness information, the storage system cannot discard blocks that the file system has deleted, which often results in poor utilization, security problems, inefficient caching, and migration overheads. In this paper, we define a generic "purge" operation that a file system can use to pass liveness information to the block layer with minimal changes to the layer interfaces, allowing the storage system to discard deleted data. We present three approaches for implementing such a purge operation: direct call, zero blocks, and flagged writes, which differ in architectural complexity and potential performance overhead. We evaluate the feasibility of these techniques through a reference implementation of a dynamically resizable copy-on-write (COW) data store in User Mode Linux (UML). Performance results obtained from this reference implementation show that all three techniques can achieve significant storage savings with a reasonable execution time overhead. Our results also indicate that, while the direct call approach has the best performance, the zero block approach offers the best compromise between performance overhead and semantic and architectural simplicity. Overall, our results demonstrate that passing liveness information across the interface between the file system and the block layer with minimal changes is not only feasible but practical.
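The three purge approaches named in the abstract can be illustrated with a toy in-memory block store. This is only a sketch under stated assumptions, not the paper's UML implementation. The class `SparseStore`, its methods, and the `purge_flag` parameter are hypothetical names. The sketch assumes that "zero blocks" means an all-zero write is interpreted as a purge, and that "flagged writes" means a write request carries an extra marker; the abstract does not spell out either mechanism.

```python
BLOCK_SIZE = 4096
ZERO_BLOCK = bytes(BLOCK_SIZE)


class SparseStore:
    """Toy block store (hypothetical) that holds space only for live blocks."""

    def __init__(self):
        self.blocks = {}  # block number -> block contents

    def purge(self, block_no):
        # Direct call: the file system tells the store explicitly
        # that this block is dead, so its space can be released.
        self.blocks.pop(block_no, None)

    def write(self, block_no, data, purge_flag=False):
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        # Flagged write (assumed): the flag, not the payload, marks the block dead.
        # Zero block (assumed): an all-zero payload is treated as a purge.
        if purge_flag or data == ZERO_BLOCK:
            self.purge(block_no)
        else:
            self.blocks[block_no] = data

    def read(self, block_no):
        # Purged or never-written blocks read back as zeros.
        return self.blocks.get(block_no, ZERO_BLOCK)

    def allocated_bytes(self):
        return len(self.blocks) * BLOCK_SIZE


store = SparseStore()
for n in range(3):
    store.write(n, bytes([n + 1]) * BLOCK_SIZE)  # three live blocks

store.purge(0)                                          # direct call
store.write(1, ZERO_BLOCK)                              # zero block
store.write(2, b"\xff" * BLOCK_SIZE, purge_flag=True)   # flagged write
print(store.allocated_bytes())
```

In this sketch the zero block variant needs no new entry point or request field, which is one way to read the abstract's remark about its semantic and architectural simplicity; the cost is that every write payload must be compared against a zero block.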
Index Terms
Practical techniques for purging deleted data using liveness information