research-article

Modellus: Automated modeling of complex internet data center applications

Published: 04 June 2012

Abstract

The rising complexity of distributed server applications in Internet data centers has made the tasks of modeling and analyzing their behavior increasingly difficult. This article presents Modellus, a novel system for automated modeling of complex web-based data center applications using methods from queuing theory, data mining, and machine learning. Modellus uses queuing theory and statistical methods to automatically derive models to predict the resource usage of an application and the workload it triggers; these models can be composed to capture multiple dependencies between interacting applications.
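The composition idea in the paragraph above can be illustrated with a minimal sketch. It assumes linear models whose coefficients have already been fitted by some statistical method; the function names, feature names, and all numbers are hypothetical and are not taken from Modellus itself. A workload-to-workload model maps request rates at one tier to the request rates they trigger at the next tier, and a workload-to-utilization model maps request rates at a tier to its resource usage. Chaining the two predicts downstream resource usage from upstream workload.

```python
from typing import Dict

Rates = Dict[str, float]  # feature name -> requests per second


def predict_utilization(coefs: Rates, intercept: float, rates: Rates) -> float:
    """Workload-to-utilization model: linear in the per-feature request rates."""
    return intercept + sum(c * rates.get(f, 0.0) for f, c in coefs.items())


def predict_downstream(weights: Dict[str, Rates], rates: Rates) -> Rates:
    """Workload-to-workload model: rate of each downstream feature
    triggered by the upstream request rates."""
    return {
        out: sum(w * rates.get(f, 0.0) for f, w in per_input.items())
        for out, per_input in weights.items()
    }


# Hypothetical, illustrative values (not measurements from the article).
web_rates = {"browse": 40.0, "checkout": 5.0}      # requests/s at the web tier
web_to_db = {                                      # queries issued per web request
    "select": {"browse": 2.0, "checkout": 3.0},
    "update": {"checkout": 1.0},
}
db_cpu_coefs = {"select": 0.002, "update": 0.01}   # CPU fraction per query/s
db_cpu_intercept = 0.05                            # idle CPU fraction

# Compose: web workload -> database workload -> database CPU utilization.
db_rates = predict_downstream(web_to_db, web_rates)
db_cpu = predict_utilization(db_cpu_coefs, db_cpu_intercept, db_rates)
print(db_rates, round(db_cpu, 3))
```

Because each stage carries its own prediction error, a composed prediction of this kind can accumulate error across tiers, which is why bounding errors in composite models matters.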

Model accuracy is maintained by fast, distributed testing, automated relearning of models when they change, and methods to bound prediction errors in composite models. We have implemented a prototype of Modellus, deployed it on a data center testbed, and evaluated its efficacy for modeling and analysis of several distributed multitier web applications. Our results show that this feature-based modeling technique is able to make predictions across several data center tiers, and maintain predictive accuracy (typically 95% or better) in the face of significant shifts in workload composition; we also demonstrate practical applications of the Modellus system to prediction and provisioning of real-world data center applications.

    • Published in

      ACM Transactions on the Web  Volume 6, Issue 2
      May 2012
      137 pages
      ISSN:1559-1131
      EISSN:1559-114X
      DOI:10.1145/2180861

      Copyright © 2012 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 4 June 2012
      • Accepted: 1 October 2011
      • Revised: 1 March 2011
      • Received: 1 July 2010
      Qualifiers

      • research-article
      • Research
      • Refereed
