Abstract
This paper addresses the problem of designing scaling strategies for elastic data stream processing. Elasticity allows applications to rapidly change their configuration on the fly (e.g., the amount of resources used) in response to dynamic workload fluctuations. We tackle this problem with Model Predictive Control, a control-theoretic method that finds the optimal application configuration over a limited prediction horizon by solving an online optimization problem. Our control strategies address latency constraints, using Queueing Theory models, and energy consumption, by changing the number of cores in use and the CPU frequency through the Dynamic Voltage and Frequency Scaling (DVFS) support available in modern multicore CPUs. The proactive capabilities, together with latency- and energy-awareness, are the novel features of our approach. To validate our methodology, we carry out a thorough set of experiments on a high-frequency trading application. The results demonstrate the high degree of flexibility and configurability of our approach and show the effectiveness of our elastic scaling strategies compared with existing state-of-the-art techniques used in similar scenarios.
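To make the control loop described above concrete, the following is a minimal sketch of one Model Predictive Control step, not the authors' implementation. It assumes a Kingman-style G/G/1 approximation as the queueing model, a toy power model proportional to cores times frequency cubed, and an exhaustive search over a short horizon. Every constant and function name (`CORES`, `FREQS_GHZ`, `WORK_PER_ITEM`, `LATENCY_BOUND`, `waiting_time`, `power`, `step_cost`, `mpc_step`) is hypothetical and chosen only for illustration.

```python
from itertools import product

# All constants below are illustrative assumptions, not values from the paper.
CORES = [1, 2, 4, 8]            # candidate numbers of cores
FREQS_GHZ = [1.2, 1.8, 2.4]     # candidate CPU frequencies (DVFS levels)
WORK_PER_ITEM = 2.0e6           # assumed CPU cycles needed per input item
LATENCY_BOUND = 0.005           # assumed bound on mean queueing delay (seconds)


def waiting_time(arrival_rate, cores, freq_ghz, ca2=1.0, cs2=1.0):
    """Kingman-style estimate of the mean queueing delay in seconds.

    The operator is treated as one server whose service rate scales
    linearly with cores and frequency (an idealized assumption).
    """
    service_rate = cores * freq_ghz * 1e9 / WORK_PER_ITEM  # items per second
    rho = arrival_rate / service_rate                      # utilization
    if rho >= 1.0:
        return float("inf")  # overloaded: the queue grows without bound
    return (rho / (1.0 - rho)) * ((ca2 + cs2) / 2.0) / service_rate


def power(cores, freq_ghz):
    """Toy dynamic-power model: proportional to cores * f^3 (assumption)."""
    return cores * freq_ghz ** 3


def step_cost(rate, config, prev_config, switch_penalty=0.5):
    """Cost of running one control step in `config` under arrival `rate`."""
    cores, freq = config
    if waiting_time(rate, cores, freq) > LATENCY_BOUND:
        return float("inf")  # latency constraint violated
    switch = switch_penalty if config != prev_config else 0.0
    return power(cores, freq) + switch


def mpc_step(predicted_rates, current_config):
    """Evaluate every configuration trajectory over the horizon and
    return only the first move of the cheapest one (receding horizon)."""
    configs = list(product(CORES, FREQS_GHZ))
    best_cost, best_first = float("inf"), current_config
    for trajectory in product(configs, repeat=len(predicted_rates)):
        total, prev = 0.0, current_config
        for rate, config in zip(predicted_rates, trajectory):
            total += step_cost(rate, config, prev)
            prev = config
        if total < best_cost:
            best_cost, best_first = total, trajectory[0]
    return best_first
```

Calling `mpc_step([500.0, 500.0], (1, 1.8))` with these toy numbers selects a configuration that meets the assumed latency bound at the lowest modeled power. In a running system the controller would apply that first move, observe the new arrival rate, refresh the forecast, and solve again at the next step. The exhaustive search is exponential in the horizon length, so it is only workable for the very short horizons used here.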
Recommendations
Keep calm and react with foresight: strategies for low-latency and energy-efficient elastic data stream processing
PPoPP '16: Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming