Abstract
The rising complexity of distributed server applications in Internet data centers has made the tasks of modeling and analyzing their behavior increasingly difficult. This article presents Modellus, a novel system for automated modeling of complex web-based data center applications using methods from queuing theory, data mining, and machine learning. Modellus uses queuing theory and statistical methods to automatically derive models to predict the resource usage of an application and the workload it triggers; these models can be composed to capture multiple dependencies between interacting applications.
Model accuracy is maintained by fast, distributed testing, automated relearning of models when they change, and methods to bound prediction errors in composite models. We have implemented a prototype of Modellus, deployed it on a data center testbed, and evaluated its efficacy for modeling and analysis of several distributed multitier web applications. Our results show that this feature-based modeling technique can make predictions across several data center tiers and maintain predictive accuracy (typically 95% or better) in the face of significant shifts in workload composition. We also demonstrate practical applications of the Modellus system to prediction and provisioning of real-world data center applications.
Index Terms
Modellus: Automated modeling of complex internet data center applications