ABSTRACT
Piccolo is a new data-centric programming model for writing parallel in-memory applications in data centers. Unlike existing data-flow models, Piccolo allows computation running on different machines to share distributed, mutable state via a key-value table interface. This interface enables efficient application implementations: applications can specify locality policies to exploit the locality of shared-state access, and Piccolo's runtime automatically resolves write-write conflicts using user-defined accumulation functions.
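The accumulation idea can be illustrated with a minimal single-process sketch. It is not Piccolo's actual API: the class `AccumulatingTable` and its methods `update` and `get` are hypothetical names chosen for illustration, and a plain dictionary stands in for the distributed table. It only shows how a user-supplied function can merge concurrent writes to the same key instead of letting one overwrite the other.

```python
from typing import Callable, Dict, Generic, Hashable, TypeVar

V = TypeVar("V")


class AccumulatingTable(Generic[V]):
    """Hypothetical key-value table that merges writes to the same key.

    Assumption: a local dict stands in for state that a real system
    would partition across machines.
    """

    def __init__(self, accumulate: Callable[[V, V], V]) -> None:
        # User-defined accumulation function, e.g. sum, min, or max.
        self._accumulate = accumulate
        self._data: Dict[Hashable, V] = {}

    def update(self, key: Hashable, value: V) -> None:
        # A second write to an existing key is merged, not overwritten.
        if key in self._data:
            self._data[key] = self._accumulate(self._data[key], value)
        else:
            self._data[key] = value

    def get(self, key: Hashable) -> V:
        return self._data[key]


# Two simulated workers contribute partial values for the same key.
ranks: AccumulatingTable[float] = AccumulatingTable(lambda old, new: old + new)
ranks.update("page_a", 0.25)  # contribution from worker 1
ranks.update("page_a", 0.25)  # contribution from worker 2
print(ranks.get("page_a"))
```

With a commutative and associative function such as addition, the merged value does not depend on the order in which the updates arrive, which is what makes this style of conflict resolution attractive for a sketch like this one.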
Using Piccolo, we have implemented applications for several problem domains, including the PageRank algorithm, k-means clustering, and a distributed crawler. Experiments using 100 Amazon EC2 instances and a 12-machine cluster show Piccolo to be faster than existing data-flow models for many problems, while providing similar fault-tolerance guarantees and a convenient programming interface.