Abstract
Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands on Bigtable, both in terms of data size (from URLs to web pages to satellite imagery) and latency requirements (from backend bulk processing to real-time data serving). Despite these varied demands, Bigtable has successfully provided a flexible, high-performance solution for all of these Google products. In this article, we describe the simple data model provided by Bigtable, which gives clients dynamic control over data layout and format, and we describe the design and implementation of Bigtable.
References
- Abadi, D. J., Madden, S. R., and Ferreira, M. C. 2006. Integrating compression and execution in column-oriented database systems. Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, New York. Google Scholar
Digital Library
- Ailamaki, A., DeWitt, D. J., Hill, M. D., and Skounakis, M. 2001. Weaving relations for cache performance. The VLDB J. 169--180. Google Scholar
Digital Library
- Banga, G., Druschel, P., and Mogul, J. C. 1999. Resource containers: A new facility for resource management in server systems. In Proceedings of the 3rd Symposium on Operating Systems Design and Implementation. 45--58. Google Scholar
Digital Library
- Baru, C. K., Fecteau, G., Goyal, A., Hsiao, H., Jhingran, A., Padmanabhan, S., Copeland, G. P., and Wilson, W. G. 1995. DB2 parallel edition. IBM Syst. J. 34, 2, 292--322. Google Scholar
Digital Library
- Bavier, A., Bowman, M., Chun, B., Culler, D., Karlin, S., Peterson, L., Roscoe, T., Spalink, T., and Wawrzoniak, M. 2004. Operating system support for planetary-scale network services. In Proceedings of the 1st Symposium on Networked Systems Design and Implementation. 253--266. Google Scholar
Digital Library
- Bentley, J. L. and McIlroy, M. D. 1999. Data compression using long common strings. In Data Compression Conference. 287--295. Google Scholar
Digital Library
- Bloom, B. H. 1970. Space/time trade-offs in hash coding with allowable errors. Commun. ACM 13, 7, 422--426. Google Scholar
Digital Library
- Burrows, M. 2006. The Chubby lock service for loosely-coupled distributed systems. In Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation. 335--350. Google Scholar
Digital Library
- Chandra, T., Griesemer, R., and Redstone, J. 2007. Paxos made live --- An engineering perspective. In Proceedings of PODC. Google Scholar
Digital Library
- Chang, F., Dean, J., Ghemawat, S., Hsieh, W. C., Wallach, D. A., Burrows, M., Chandra, T., Fikes, A., and Gruber, R. E. 2006. Bigtable: A distributed storage system for structured data. In Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation. 205--218. Google Scholar
Digital Library
- Comer, D. 1979. Ubiquitous B-tree. Computing Surveys 11, 2 (June), 121--137. Google Scholar
Digital Library
- Copeland, G. P., Alexander, W., Boughter, E. E., and Keller, T. W. 1988. Data placement in Bubba. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, New York, 99--108. Google Scholar
Digital Library
- Dean, J. and Ghemawat, S. 2004. MapReduce: Simplified data processing on large clusters. In Proceedings of the 6th USENIX Symposium on Operating Systems Design and Implementation. 137--150. Google Scholar
Digital Library
- DeWitt, D., Katz, R., Olken, F., Shapiro, L., Stonebraker, M., and Wood, D. 1984. Implementation techniques for main memory database systems. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, New York, 1--8. Google Scholar
Digital Library
- DeWitt, D. J. and Gray, J. 1992. Parallel database systems: The future of high performance database systems. Commun. ACM 35, 6 (June), 85--98. Google Scholar
Digital Library
- French, C. D. 1995. One size fits all database architectures do not work for DSS. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, New York, 449--450. Google Scholar
Digital Library
- Gawlick, D. and Kinkade, D. 1985. Varieties of concurrency control in IMS/VS fast path. Datab. Eng. Bull. 8, 2, 3--10.Google Scholar
- Ghemawat, S., Gobioff, H., and Leung, S.-T. 2003. The Google file system. In Proceedings of the 19th ACM Symposium on Operating Systems Principles. ACM, New York, 29--43. Google Scholar
Digital Library
- Gray, J. 1978. Notes on database operating systems. In Operating Systems --- An Advanced Course. Lecture Notes in Computer Science, vol. 60. Springer-Verlag, ACM, New York. Google Scholar
Digital Library
- Greer, R. 1999. Daytona and the fourth-generation language Cymbal. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, New York, 525--526. Google Scholar
Digital Library
- Hagmann, R. 1987. Reimplementing the Cedar file system using logging and group commit. In Proceedings of the 11th Symposium on Operating Systems Principles. 155--162. Google Scholar
Digital Library
- Hartman, J. H. and Ousterhout, J. K. 1993. The Zebra striped network file system. In Proceedings of the 14th Symposium on Operating Systems Principles. ACM, New York, 29--43. Google Scholar
Digital Library
- kx.com. kx.com/products/database.php. Product page.Google Scholar
- Lamport, L. 1998. The part-time parliament. ACM Trans. Comput. Syst. 16, 2, 133--169. Google Scholar
Digital Library
- MacCormick, J., Murphy, N., Najork, M., Thekkath, C. A., and Zhou, L. 2004. Boxwood: Abstractions as the foundation for storage infrastructure. In Proceedings of the 6th USENIX Symposium on Operating Systems Design and Implementation. 105--120. Google Scholar
Digital Library
- McCarthy, J. 1960. Recursive functions of symbolic expressions and their computation by machine. Commun. ACM 3, 4 (Apr.), 184--195. Google Scholar
Digital Library
- O'Neil, P., Cheng, E., Gawlick, D., and O'Neil, E. 1996. The log-structured merge-tree (LSM-tree). Acta Inf. 33, 4, 351--385. Google Scholar
Digital Library
- oracle.com. www.oracle.com/technology/products/database/clustering/index.html. Product page.Google Scholar
- Pike, R., Dorward, S., Griesemer, R., and Quinlan, S. 2005. Interpreting the data: Parallel analysis with Sawzall. Scientific Programming Journal 13, 4, 227--298. Google Scholar
Digital Library
- Ratnasamy, S., Francis, P., Handley, M., Karp, R., and Shenker, S. 2001. A scalable content-addressable network. In Proceedings of SIGCOMM. ACM, New York, 161--172. Google Scholar
Digital Library
- Rowstron, A. and Druschel, P. 2001. Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems. In Proceedings of Middleware 2001. 329--350. Google Scholar
Digital Library
- sensage.com. sensage.com/products-sensage.htm. Product page.Google Scholar
- Stoica, I., Morris, R., Karger, D., Kaashoek, M. F., and Balakrishnan, H. 2001. Chord: A scalable peer-to-peer lookup service for Internet applications. In Proceedings of SIGCOMM. ACM, New York, 149--160. Google Scholar
Digital Library
- Stonebraker, M. 1986. The case for shared nothing. Datab. Eng. Bull. 9, 1 (Mar.), 4--9.Google Scholar
- Stonebraker, M., Abadi, D. J., Batkin, A., Chen, X., Cherniack, M., Ferreira, M., Lau, E., Lin, A., Madden, S., O'Neil, E., O'Neil, P., Rasin, A., Tran, N., and Zdonik, S. 2005. C-Store: A column-oriented DBMS. In Proceedings of the 10th International Conference on Very Large Data Bases. ACM, New York, 553--564. Google Scholar
Digital Library
- Stonebraker, M., Aoki, P. M., Devine, R., Litwin, W., and Olson, M. A. 1994. Mariposa: A new architecture for distributed data. In Proceedings of the 10th International Conference on Data Engineering. IEEE Computer Society Press, Los Alamitos, CA, 54--65. Google Scholar
Digital Library
- sybase.com. www.sybase.com/products/databaseservers/sybaseiq. Product page.Google Scholar
- Zhao, B. Y., Kubiatowicz, J., and Joseph, A. D. 2001. Tapestry: An infrastructure for fault-tolerant wide-area location and routing. Tech. Rep. UCB/CSD-01-1141, CS Division, University of California, Berkeley. Apr. Google Scholar
Digital Library
- Zukowski, M., Boncz, P. A., Nes, N., and Heman, S. 2005. MonetDB/X100 --- A DBMS in the CPU cache. IEEE Data Eng. Bull. 28, 2, 17--22.Google Scholar
Index Terms
Bigtable: A Distributed Storage System for Structured Data





Comments