Predictive data grouping: Defining the bounds of energy and latency reduction through predictive data grouping and replication

Published: 28 May 2008

Abstract

We demonstrate that predictive grouping is an effective mechanism for reducing disk arm movement, thereby simultaneously reducing energy consumption and data access latency. We further demonstrate that predictive grouping has dramatic untapped potential to improve access performance and limit energy consumption. Data retrieval latencies are considered a major bottleneck, and with growing volumes of data and increased storage needs they are only growing in significance. Data storage infrastructure is therefore a growing consumer of energy at data-center scales, while the individual disk is already a significant concern for mobile computing (accounting for almost a third of a mobile system's energy demands). While improving the responsiveness of storage subsystems, and hence reducing data retrieval latencies, is often considered to be at odds with efforts to reduce disk energy consumption, we demonstrate that predictive data grouping has the potential to work towards both goals simultaneously. Predictive data grouping is also more widely applicable than prior approaches to reducing either latency or energy usage. For latency, grouping can be performed opportunistically, thereby avoiding the serious performance penalties that can be incurred with prior applications of access prediction (such as predictive prefetching of data). For energy, we show how predictive grouping can reduce energy use even for an individual disk that is never idle.

Predictive data grouping with effective replication results in a reduction of the overall mechanical movement required to retrieve data. We have built upon our detailed measurements of disk power consumption, and have estimated both the energy expended by a hard disk for its mechanical components, and that needed to move the disk arm. We have further compared, via simulation, three models of predictive grouping of on-disk data, including an optimal arrangement of data that is guaranteed to minimize disk arm movement. These experiments have allowed us to measure the limits of performance improvement achievable with optimal data grouping and replication strategies on a single device, and have further allowed us to demonstrate the potential of such schemes to reduce energy consumption of mechanical components by up to 70%.
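
As a rough illustration of why grouping reduces arm movement, the following minimal sketch compares the total seek distance of an access trace under two layouts. It is not the authors' simulator: the function names, the one-dimensional position model, and the greedy "place each block next to its most frequent successor" heuristic are all assumptions made for illustration, and replication is not modeled.

```python
from collections import Counter, defaultdict


def total_seek_distance(trace, layout):
    """Sum of absolute position changes between consecutive accesses.

    `layout` maps a block ID to a hypothetical linear disk position.
    """
    return sum(abs(layout[b] - layout[a]) for a, b in zip(trace, trace[1:]))


def successor_grouped_layout(trace):
    """Greedy layout (illustrative heuristic, not the paper's method).

    Starting from each not-yet-placed block in first-seen order, place it
    at the next free position, then move on to its most frequently
    observed successor until reaching a block that is already placed.
    """
    successors = defaultdict(Counter)
    for a, b in zip(trace, trace[1:]):
        successors[a][b] += 1

    layout = {}
    for start in dict.fromkeys(trace):  # unique blocks, first-seen order
        block = start
        while block not in layout:
            layout[block] = len(layout)
            if not successors[block]:
                break  # block was never followed by another access
            block = successors[block].most_common(1)[0][0]
    return layout


# Hypothetical trace: three blocks with distant IDs, accessed in a cycle.
trace = [10, 500, 90] * 3
naive_layout = {block: block for block in set(trace)}  # position = block ID
grouped_layout = successor_grouped_layout(trace)

naive_cost = total_seek_distance(trace, naive_layout)
grouped_cost = total_seek_distance(trace, grouped_layout)
```

In this toy trace the grouped layout places the three blocks at adjacent positions, so `grouped_cost` is far smaller than `naive_cost`. Real disks have nonlinear seek-time and energy curves, so the linear distance used here is only a proxy.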

    • Published in

      ACM Transactions on Storage  Volume 4, Issue 1
      May 2008
      90 pages
      ISSN:1553-3077
      EISSN:1553-3093
      DOI:10.1145/1353452

      Copyright © 2008 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 28 May 2008
      • Revised: 1 February 2008
      • Accepted: 1 February 2008
      • Received: 1 March 2007

      Qualifiers

      • research-article
      • Research
      • Refereed
