research-article

Analysis of a Stochastic Model of Replication in Large Distributed Storage Systems: A Mean-Field Approach

Published: 13 June 2017
Abstract

Distributed storage systems such as the Hadoop File System or the Google File System (GFS) ensure data availability and durability using replication. Persistence is achieved by replicating the same data block on several nodes and ensuring that a minimum number of copies is available on the system at any time. Whenever the contents of a node are lost, for instance due to a hard disk crash, the system regenerates the data blocks stored before the failure by transferring them from the remaining replicas. This paper analyzes the efficiency of the replication mechanism that determines on which servers the copies of a given file are placed. The variability of the loads of the nodes of the network is investigated for several policies. Three replication mechanisms are tested in simulations in the context of a real implementation of such a system: Random, Least Loaded and Power of Choice.

The simulations show that some of these policies may lead to quite unbalanced situations: if β is the average number of copies per node, then, at equilibrium, the load of the nodes may exhibit a high variability. It is shown in this paper that a simple variant of a power of choice type algorithm has a striking effect on the loads of the nodes: at equilibrium, the distribution of the load of a node has a bounded support, and most nodes have a load less than 2β, which is an interesting property for the design of the storage space of these systems. Stochastic models are introduced and investigated to explain this phenomenon.
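The three placement policies named in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' model or simulator: it assumes that a failed node loses all its copies at once, that it comes back empty, and that each lost copy is immediately re-placed on another node chosen by the policy. The function names, the parameter `d`, and the default values are hypothetical and chosen only for illustration.

```python
import random

def pick_random(loads, exclude, rng):
    """Random policy: uniform choice among all nodes except the failed one."""
    candidates = [i for i in range(len(loads)) if i != exclude]
    return rng.choice(candidates)

def pick_least_loaded(loads, exclude, rng):
    """Least Loaded policy: a node with the smallest load, ties broken at random."""
    candidates = [i for i in range(len(loads)) if i != exclude]
    smallest = min(loads[i] for i in candidates)
    return rng.choice([i for i in candidates if loads[i] == smallest])

def pick_power_of_choice(loads, exclude, rng, d=2):
    """Power of Choice policy: sample d distinct nodes, keep the least loaded one."""
    candidates = [i for i in range(len(loads)) if i != exclude]
    sample = rng.sample(candidates, d)
    return min(sample, key=lambda i: loads[i])

def simulate(policy, n_nodes=100, beta=10, n_failures=500, seed=0):
    """Toy failure/recovery loop (an assumption, not the paper's model).

    Every node starts with beta copies. At each step one node, chosen
    uniformly at random, fails and loses its copies; each lost copy is
    re-created on a node selected by `policy`. Returns the final loads.
    """
    rng = random.Random(seed)
    loads = [beta] * n_nodes
    for _ in range(n_failures):
        failed = rng.randrange(n_nodes)
        lost = loads[failed]
        loads[failed] = 0  # the node restarts empty
        for _ in range(lost):
            loads[policy(loads, failed, rng)] += 1
    return loads

if __name__ == "__main__":
    beta = 10
    for policy in (pick_random, pick_least_loaded, pick_power_of_choice):
        loads = simulate(policy, beta=beta)
        above = sum(1 for x in loads if x > 2 * beta)
        print(f"{policy.__name__}: max load {max(loads)}, nodes above 2*beta: {above}")
```

The total number of copies is conserved, so the average load stays equal to β and only the spread of the loads differs between policies. Because of the simplifications above, this toy model should not be expected to reproduce the paper's quantitative results.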

Published in

Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS), Volume 1, Issue 1
June 2017
712 pages
EISSN: 2476-1249
DOI: 10.1145/3107080

Copyright © 2017 ACM

Publisher

Association for Computing Machinery, New York, NY, United States
