research-article

Kinesis: A new approach to replica placement in distributed storage systems

Published: 09 February 2009

Abstract

Kinesis is a novel data placement model for distributed storage systems. It exemplifies three design principles: structure (division of servers into a few failure-isolated segments), freedom of choice (freedom to allocate the best servers to store and retrieve data based on current resource availability), and scattered distribution (independent, pseudo-random spread of replicas in the system). These design principles enable storage systems to achieve balanced utilization of storage and network resources in the presence of incremental system expansions, failures of single and shared components, and skewed distributions of data size and popularity. In turn, this ability leads to significantly reduced resource provisioning costs, good user-perceived response times, and fast, parallelized recovery from independent and correlated failures.

This article validates Kinesis through theoretical analysis, simulations, and experiments on a prototype implementation. Evaluations driven by real-world traces show that Kinesis can significantly outperform the widely used Chain replica-placement strategy in terms of resource requirements, end-to-end delay, and failure recovery.
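
The three design principles named in the abstract can be illustrated with a small sketch. This is not the article's implementation: the function names, the salted SHA-256 hash, and the use of stored bytes as the measure of "current resource availability" are all assumptions made for illustration. Servers are grouped into segments (structure), each segment independently nominates one server per object through its own hash (scattered distribution), and the replicas go to the least-loaded nominees (freedom of choice).

```python
import hashlib


def candidate_servers(key, segments):
    """Return one pseudo-randomly chosen server per segment for this key."""
    candidates = []
    for seg_index, servers in enumerate(segments):
        # Salt the hash with the segment index so that the choice made in
        # one segment is independent of the choices made in the others.
        digest = hashlib.sha256(f"{seg_index}:{key}".encode()).digest()
        slot = int.from_bytes(digest[:8], "big") % len(servers)
        candidates.append(servers[slot])
    return candidates


def place_replicas(key, size, segments, used, r):
    """Place r replicas on the r least-utilized candidate servers.

    `used` maps server name -> bytes stored (hypothetical load metric).
    Because each candidate comes from a different segment, the chosen
    replicas never share a segment.
    """
    candidates = candidate_servers(key, segments)
    if r > len(candidates):
        raise ValueError("need at least r segments to place r replicas")
    chosen = sorted(candidates, key=lambda server: used[server])[:r]
    for server in chosen:
        used[server] += size
    return chosen


# Hypothetical cluster: 5 segments of 4 servers each, 3 replicas per object.
segments = [[f"seg{i}-srv{j}" for j in range(4)] for i in range(5)]
used = {server: 0 for seg in segments for server in seg}
for n in range(1000):
    place_replicas(f"object-{n}", 1, segments, used, r=3)
```

Reads can reuse `candidate_servers` to locate an object, since the candidate set depends only on the key and the segment layout. Having more segments than replicas is what leaves room to choose among candidates at write time.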

        • Published in

          ACM Transactions on Storage  Volume 4, Issue 4
          January 2009
          116 pages
          ISSN:1553-3077
          EISSN:1553-3093
          DOI:10.1145/1480439

          Copyright © 2009 ACM

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 9 February 2009
          • Revised: 1 May 2008
          • Accepted: 1 May 2008
          • Received: 1 February 2008

          Qualifiers

          • research-article
          • Research
          • Refereed
