Abstract
Prefetching is an important technique for improving effective hard disk performance. A prefetcher seeks to predict accurately which data will be requested and to load it before the corresponding requests arrive. Current disk prefetch policies in major operating systems track access patterns at the level of the file abstraction. While this is useful for exploiting application-level access patterns, file-level prefetching cannot realize the full performance improvement achievable by prefetching, for two reasons. First, certain prefetch opportunities can be detected only by knowing the data layout on disk, such as the contiguous layout of file metadata or of data from multiple files. Second, nonsequential access to disk data (which requires disk head movement) is much slower than sequential access, so the performance penalty for mis-prefetching a randomly located block is correspondingly greater than that for a sequential block.
To overcome the inherent limitations of prefetching at the logical file level, we propose to perform prefetching directly at the level of disk layout, in a portable way. Our technique, called DiskSeen, is intended to supplement, and to work synergistically with, any existing file-level prefetch policy. DiskSeen tracks the locations and access times of disk blocks and, based on analysis of their temporal and spatial relationships, seeks to improve the sequentiality of disk accesses and overall prefetching performance. It also implements a mechanism that minimizes mis-prefetching on a per-application basis to mitigate the corresponding performance penalty.
Our implementation of the DiskSeen scheme in the Linux 2.6 kernel shows that it can significantly improve the effectiveness of prefetching, reducing execution times by 20% to 60% for microbenchmarks and real applications such as grep, CVS, and TPC-H. Even for workloads specifically designed to expose its weaknesses, DiskSeen incurs only a minor performance loss.
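The idea of tracking block locations and access times can be illustrated with a small sketch. This is not the DiskSeen algorithm or its kernel data structures; it is a toy model under stated assumptions: blocks are identified by logical block number (LBN), a logical counter stands in for access time, and the names `BlockHistory`, `window`, and `max_gap` are hypothetical. A block becomes a prefetch candidate when it lies just after the current block on disk and, in the recorded history, was accessed shortly after it.

```python
from collections import defaultdict


class BlockHistory:
    """Toy access history keyed by logical block number (LBN).

    Illustrative only: names and thresholds are assumptions,
    not taken from the DiskSeen implementation.
    """

    def __init__(self, window=4, max_gap=8):
        self.window = window      # how many following blocks to inspect
        self.max_gap = max_gap    # largest timestamp distance counted as "soon after"
        self.clock = 0            # logical access counter used as a timestamp
        self.stamps = defaultdict(list)  # LBN -> timestamps of past accesses

    def record(self, lbn):
        """Record one access to `lbn` and return its timestamp."""
        self.clock += 1
        self.stamps[lbn].append(self.clock)
        return self.clock

    def predict(self, lbn):
        """Return the run of following LBNs that history says were read soon after `lbn`."""
        past = self.stamps.get(lbn, [])
        candidates = []
        for nxt in range(lbn + 1, lbn + 1 + self.window):
            followed = any(
                0 < t_next - t_cur <= self.max_gap
                for t_cur in past
                for t_next in self.stamps.get(nxt, [])
            )
            if not followed:
                break  # stop at the first block with no supporting history
            candidates.append(nxt)
        return candidates


# Usage: one earlier run touched blocks 100..103 in order, then jumped to 500.
history = BlockHistory()
for block in [100, 101, 102, 103, 500]:
    history.record(block)

print(history.predict(100))  # [101, 102, 103]
print(history.predict(500))  # []
```

Stopping at the first unsupported block keeps the prediction to a contiguous run, which reflects the abstract's point that a wrong guess costs more when it lands on a randomly located block than on a sequential one.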
A Prefetching Scheme Exploiting both Data Layout and Access History on Disk