ABSTRACT
There has been much research devoted to improving the performance of data analytics frameworks, but comparatively little effort has been spent systematically identifying the performance bottlenecks of these systems. In this paper, we develop blocked time analysis, a methodology for quantifying performance bottlenecks in distributed computation frameworks, and use it to analyze the Spark framework's performance on two SQL benchmarks and a production workload. Contrary to our expectations, we find that (i) CPU (and not I/O) is often the bottleneck, (ii) improving network performance can improve job completion time by a median of at most 2%, and (iii) the causes of most stragglers can be identified.
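The sketch below illustrates the intuition behind blocked time analysis under simplifying assumptions; it is not the paper's actual instrumentation. The idea is to record, for each task, how long it spends blocked on a given resource (here, network or disk I/O), then recompute an idealized job completion time with that blocked time removed, which bounds how much optimizing that resource could help. The `Task` fields, the wave-based `completion_time` model, and the function names are hypothetical; the paper itself replays the framework's real task schedule rather than using this crude greedy model.

```python
# Hypothetical sketch of blocked time analysis (not the authors' tooling).
# Given per-task timing records, estimate the best-case reduction in job
# completion time if tasks never blocked on a chosen resource.

from dataclasses import dataclass
from typing import List

@dataclass
class Task:
    duration: float          # total task runtime (s)
    network_blocked: float   # time blocked waiting on network I/O (s)
    disk_blocked: float      # time blocked waiting on disk I/O (s)

def completion_time(durations: List[float], slots: int) -> float:
    """Crude wave model: greedily assign tasks (longest first) to the
    least-loaded of `slots` parallel slots and return the makespan."""
    loads = [0.0] * slots
    for d in sorted(durations, reverse=True):
        i = loads.index(min(loads))
        loads[i] += d
    return max(loads)

def blocked_time_improvement(tasks: List[Task], resource: str, slots: int) -> float:
    """Upper bound on the fractional reduction in completion time if no
    task ever blocked on `resource` ('network' or 'disk')."""
    original = completion_time([t.duration for t in tasks], slots)
    attr = {"network": "network_blocked", "disk": "disk_blocked"}[resource]
    optimized = completion_time(
        [t.duration - getattr(t, attr) for t in tasks], slots)
    return 1.0 - optimized / original

if __name__ == "__main__":
    # Toy example: 8 identical tasks on 4 slots, each briefly blocked on the network.
    tasks = [Task(duration=10.0, network_blocked=0.5, disk_blocked=1.0)
             for _ in range(8)]
    print(f"Max improvement from a faster network: "
          f"{blocked_time_improvement(tasks, 'network', slots=4):.1%}")
```

Running the toy example prints a maximum improvement of 5.0%: even if every task's network wait vanished, the longest schedule on each slot only shrinks from 20s to 19s, which mirrors the paper's finding that eliminating network blocking yields only modest end-to-end gains.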