Abstract
We describe PNUTS, a massively parallel and geographically distributed database system for Yahoo!'s web applications. PNUTS provides data storage organized as hashed or ordered tables, low latency for large numbers of concurrent requests including updates and queries, and novel per-record consistency guarantees. It is a hosted, centrally managed, and geographically distributed service, and utilizes automated load-balancing and failover to reduce operational complexity. The first version of the system is currently serving in production. We describe the motivation for PNUTS and the design and implementation of its table storage and replication layers, and then present experimental results.
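The distinction between hashed and ordered table organization can be made concrete with a minimal sketch. This is not PNUTS code: the function names, the use of MD5, and the split-point representation are all assumptions chosen for illustration. It shows only the general idea that a hashed table scatters adjacent keys across partitions, while an ordered table keeps them together so range scans touch few partitions.

```python
import bisect
import hashlib
from typing import Sequence


def hashed_partition(key: str, num_partitions: int) -> int:
    """Hypothetical hashed organization: the key's hash selects the partition."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def ordered_partition(key: str, split_points: Sequence[str]) -> int:
    """Hypothetical ordered organization: sorted split points divide the key
    space into contiguous ranges; a key equal to a split point falls into
    the upper range."""
    return bisect.bisect_right(split_points, key)


def partitions_for_range(lo: str, hi: str, split_points: Sequence[str]) -> range:
    """Partitions an ordered table must visit to scan keys from lo to hi inclusive."""
    return range(ordered_partition(lo, split_points),
                 ordered_partition(hi, split_points) + 1)


if __name__ == "__main__":
    splits = ["g", "n", "t"]  # assumed split points giving four partitions
    for k in ["apple", "grape", "zebra"]:
        print(k, hashed_partition(k, 4), ordered_partition(k, splits))
    print(list(partitions_for_range("a", "f", splits)))
```

Under these assumptions, a scan over a narrow key range in the ordered layout visits a single partition, whereas the hashed layout would generally have to consult every partition for the same scan.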
PNUTS: Yahoo!'s hosted data serving platform




