A Lightweight Data Location Service for Nondeterministic Exascale Storage Systems

Published: 07 August 2014
Abstract

In this article, we present LWDLS, a lightweight data location service designed for Exascale storage systems (storage systems on the order of 10^18 bytes) and geo-distributed storage systems (large storage systems spread across physically separate locations). LWDLS provides a search-based data location solution and enables free data placement, movement, and replication. LWDLS introduces probe and prune protocols that reduce topology mismatch, and a heuristic flooding search algorithm (HFS) that achieves higher search efficiency than pure flooding search while offering comparable search speed and coverage. LWDLS is lightweight and scalable: it incurs low overhead, achieves high search efficiency, keeps no global state, and avoids periodic messages. LWDLS is fully distributed and can be used in nondeterministic storage systems, as well as in deterministic storage systems to handle cases where search is needed. Extensive simulations modeling large-scale High Performance Computing (HPC) storage environments provide representative performance outcomes. Performance is evaluated with metrics including search scope, search efficiency, and average neighbor distance. Results show that LWDLS is able to locate data efficiently, with a low cost of state maintenance, in arbitrary network environments. Through these simulations, we demonstrate the effectiveness of the protocols and search algorithm of LWDLS.
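
To make the baseline concrete, the following is a minimal sketch of pure flooding search, the technique HFS is compared against in the abstract. It is not the LWDLS implementation and does not reproduce HFS, whose heuristic is not described here. The function name `flood_search`, the TTL parameter, the adjacency-list graph, and the message-counting convention are all assumptions made for illustration.

```python
from collections import deque


def flood_search(graph, source, target, ttl):
    """Pure flooding (illustrative assumption, not the paper's code).

    Each node forwards the query to every neighbor except the one it
    received the query from, until the hop budget `ttl` is exhausted.
    Returns (found, nodes_reached, messages_sent).
    """
    visited = {source}
    queue = deque([(source, None, ttl)])
    messages = 0
    found = source == target
    while queue:
        node, sender, remaining = queue.popleft()
        if remaining == 0:
            continue  # hop budget spent: this node does not forward
        for neighbor in graph[node]:
            if neighbor == sender:
                continue  # never send the query straight back
            messages += 1  # every forwarded copy costs one message
            if neighbor in visited:
                continue  # duplicate copy: the receiver drops it
            visited.add(neighbor)
            if neighbor == target:
                found = True
            queue.append((neighbor, node, remaining - 1))
    return found, len(visited), messages


# Hypothetical five-node overlay, given as adjacency lists.
example_graph = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2, 4], 4: [3]}

found, reached, sent = flood_search(example_graph, source=0, target=4, ttl=3)
print(found, reached, sent)  # True 5 8
```

In this toy overlay the query reaches four new nodes but costs eight messages, so half of the traffic is redundant. Redundant copies of this kind are what a heuristic variant of flooding would aim to reduce while keeping similar coverage.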

    • Published in

      ACM Transactions on Storage, Volume 10, Issue 3
      July 2014
      113 pages
      ISSN:1553-3077
      EISSN:1553-3093
      DOI:10.1145/2661087
      • Editor: Darrell Long

      Copyright © 2014 Public Domain

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 7 August 2014
      • Accepted: 1 February 2014
      • Revised: 1 September 2013
      • Received: 1 June 2013

      Qualifiers

      • research-article
      • Research
      • Refereed
