Abstract
Many activities intended to enhance the performance, reliability, and availability of storage systems are scheduled with low priority and served during idle times. Under such conditions, idleness becomes a valuable “resource” that must be managed efficiently. A common approach in system design is to be non-work-conserving by “idle waiting”, that is, to delay the scheduling of background jobs so that they do not slow down upcoming foreground tasks.
In this article, we complement “idle waiting” with “estimation” of the amount of background work to serve in each idle interval, in order to manage the trade-off between foreground and background performance effectively. As a result, the storage system is better utilized without compromising foreground performance. Our analysis shows that if idle times have low variability, idle waiting is not necessary; it becomes necessary only when idle times are highly variable, in which case it minimizes the impact of background activity on foreground performance. We further show that if idle intervals are bursty, the length of upcoming idle intervals can be predicted accurately, and this information can be used to serve more background jobs without affecting foreground performance.
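To make the trade-off concrete, the following is a minimal sketch of a policy that combines an idle wait with a cap on the background work served per idle interval. It is an illustration under stated assumptions, not the model analyzed in the article: the function names, the fixed background service time, and the assumption that a background job cannot be preempted once started are all hypothetical choices made for this example.

```python
def serve_idle_interval(idle_len, idle_wait, max_bg_jobs, bg_time):
    """Simulate one idle interval under a hypothetical policy.

    idle_len    -- actual length of the idle interval (unknown to the scheduler)
    idle_wait   -- time the scheduler waits before starting background work
    max_bg_jobs -- estimated amount of background work allowed in this interval
    bg_time     -- service time of one background job (assumed constant and
                   non-preemptive for this sketch)

    Returns (background jobs completed, delay imposed on the next foreground job).
    """
    t = idle_wait
    done = 0
    # Start another background job only while the system is still idle
    # and the per-interval cap has not been reached.
    while done < max_bg_jobs and t < idle_len:
        t += bg_time
        done += 1
    # A job that started before the interval ended runs to completion,
    # so the arriving foreground job waits for the overrun.
    delay = max(0.0, t - idle_len) if done else 0.0
    return done, delay


def evaluate(idle_intervals, idle_wait, max_bg_jobs, bg_time):
    """Total background jobs served and total foreground delay over a trace."""
    total_done = 0
    total_delay = 0.0
    for idle_len in idle_intervals:
        done, delay = serve_idle_interval(idle_len, idle_wait, max_bg_jobs, bg_time)
        total_done += done
        total_delay += delay
    return total_done, total_delay
```

In this sketch, a longer `idle_wait` skips short idle intervals entirely, which protects foreground jobs at the cost of background throughput, while `max_bg_jobs` bounds how far background work can extend into a long interval. Running `evaluate` over a trace with different settings shows how the two knobs interact.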
Index Terms
Efficient management of idleness in storage systems