research-article

Workload-based generation of administrator hints for optimizing database storage utilization

Published: 25 February 2008

Abstract

Database storage management at data centers is a manual, time-consuming, and error-prone task. It involves regularly moving database objects across storage nodes in an attempt to balance I/O bandwidth utilization across disk drives. Achieving such balance is critical for avoiding I/O bottlenecks and thereby maximizing the utilization of the storage system. However, performing this task manually not only increases administrative costs but also carries a greater risk of untimely and erroneous operations. We address these concerns with STORM, an automated approach that combines low-overhead information gathering of database access and storage usage patterns with efficient analysis to generate accurate and timely hints for the administrator regarding data movement operations. STORM's primary objective is to minimize the volume of data movement required during the reconfiguration operation (and thus the potential downtime or reduction in performance), subject to the secondary constraints of space and balanced I/O bandwidth utilization across the storage devices. We analyze and evaluate STORM theoretically, using a simulation framework, and experimentally. We show that the dynamic data layout reconfiguration problem is NP-hard, and we present a heuristic that provides an approximate solution in $O(N \log(N/M) + (N/M)^2)$ time, where $M$ is the number of storage devices and $N$ is the total number of database objects residing on those devices. A simulation study shows that the heuristic converges to an acceptable solution that balances storage utilization to within 7% of the ideal solution. Finally, an experimental study demonstrates that the STORM approach can improve the overall performance of the TPC-C benchmark by as much as 22% by reconfiguring an initial random, but evenly distributed, placement of database objects.
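
To make the problem statement concrete, the following is a minimal sketch of a generic greedy rebalancer. It is not the STORM heuristic, whose details the abstract does not give, and it makes no claim to the stated running time. It only illustrates the objective described above: suggest moves that even out I/O load across devices while respecting space limits and preferring objects that shift the most load per unit of data moved. The names `DbObject`, `rebalance_hints`, `tolerance`, and `max_moves`, as well as the sample numbers, are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DbObject:
    name: str
    size: int        # storage footprint, any consistent unit, assumed > 0
    io_load: float   # observed I/O bandwidth demand


def rebalance_hints(placement, capacity, tolerance=0.1, max_moves=100):
    """Greedy illustration only (not the STORM heuristic).

    placement: dict mapping device name -> list of DbObject
    capacity:  dict mapping device name -> usable space
    Returns (hints, new_placement); each hint is (object, source, target).
    """
    placement = {dev: list(objs) for dev, objs in placement.items()}
    hints = []
    for _ in range(max_moves):
        load = {d: sum(o.io_load for o in objs) for d, objs in placement.items()}
        used = {d: sum(o.size for o in objs) for d, objs in placement.items()}
        mean = sum(load.values()) / len(load)
        src = max(load, key=load.get)   # most heavily loaded device
        dst = min(load, key=load.get)   # least heavily loaded device
        # Stop once the hottest device is within the tolerance of the mean.
        if mean == 0 or load[src] - mean <= tolerance * mean:
            break
        gap = load[src] - load[dst]
        # A move must fit on the target and must not simply swap the hot spot.
        candidates = [
            o for o in placement[src]
            if 0 < o.io_load < gap and used[dst] + o.size <= capacity[dst]
        ]
        if not candidates:
            break
        # Prefer the object that shifts the most load per unit of data moved.
        obj = max(candidates, key=lambda o: o.io_load / o.size)
        placement[src].remove(obj)
        placement[dst].append(obj)
        hints.append((obj.name, src, dst))
    return hints, placement


# Hypothetical sample data: one hot device and one nearly idle device.
placement = {
    "disk0": [DbObject("A", 10, 50.0), DbObject("B", 20, 30.0), DbObject("C", 5, 20.0)],
    "disk1": [DbObject("D", 10, 10.0)],
}
capacity = {"disk0": 100, "disk1": 100}
hints, final = rebalance_hints(placement, capacity)
print(hints)
```

In this sample, the hints are advisory only: the function returns a list of suggested moves and a projected layout, and leaves the decision to act on them to the administrator, in the spirit of the hint-based approach the abstract describes.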


Published in

ACM Transactions on Storage, Volume 3, Issue 4
February 2008
156 pages
ISSN: 1553-3077
EISSN: 1553-3093
DOI: 10.1145/1326542

Copyright © 2008 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

  • Published: 25 February 2008
  • Accepted: 1 December 2007
  • Revised: 1 October 2007
  • Received: 1 July 2007
