Automated anomaly detection and performance modeling of enterprise applications

Published: 27 November 2009

Abstract

Automated tools for understanding application behavior and its changes during the application lifecycle are essential for many performance analysis and debugging tasks. Application performance issues have an immediate impact on customer experience and satisfaction. A sudden slowdown of an enterprise-wide application can affect a large population of customers, lead to delayed projects, and ultimately result in financial loss for the company. The significantly shortened time between new software releases further exacerbates the problem of thoroughly evaluating the performance of an updated application. Our thesis is that online performance modeling should be a part of routine application monitoring. Early, informative warnings on significant changes in application performance should help service providers identify and prevent performance problems, and their negative impact on the service, in a timely manner. We propose a novel framework for automated anomaly detection and application change analysis. It is based on the integration of two complementary techniques: (i) a regression-based transaction model that reflects a resource consumption model of the application, and (ii) an application performance signature that provides a compact model of the runtime behavior of the application. The proposed integrated framework offers a simple and powerful solution for anomaly detection and analysis of essential performance changes in application behavior. An additional benefit of the proposed approach is its simplicity: it is not intrusive and is based on monitoring data that is typically available in enterprise production environments. The introduced solution further enables the automation of capacity planning and resource provisioning tasks for multitier applications in rapidly evolving IT environments.
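To make the regression-based transaction model concrete, the sketch below (an illustrative reconstruction, not the authors' implementation) fits per-transaction-type CPU costs from aggregate monitoring data via least squares, then flags monitoring windows whose observed CPU consumption deviates from the model's prediction by more than a tolerance. The function names, the 20% tolerance, and the synthetic numbers are all assumptions chosen for the example.

```python
import numpy as np

def fit_transaction_costs(tx_counts, cpu_busy):
    """Estimate per-transaction CPU cost via least squares.

    tx_counts: (windows, tx_types) matrix of transaction counts per window.
    cpu_busy:  (windows,) observed CPU busy time per window.
    """
    costs, *_ = np.linalg.lstsq(tx_counts, cpu_busy, rcond=None)
    return costs

def is_anomalous(tx_counts, cpu_busy, costs, tol=0.2):
    """Flag windows whose observed CPU deviates from the model by > tol."""
    predicted = tx_counts @ costs
    return np.abs(cpu_busy - predicted) > tol * np.maximum(predicted, 1e-9)

# Synthetic baseline: two transaction types costing 2 ms and 5 ms of CPU.
train_counts = np.array([[100., 50.], [120., 80.], [90., 60.], [150., 40.]])
train_cpu = train_counts @ np.array([2.0, 5.0])
costs = fit_transaction_costs(train_counts, train_cpu)

# A new window with the same transaction mix but 50% more CPU consumed
# (e.g., after a problematic software update) is flagged as anomalous.
new_counts = np.array([[110., 70.]])
flags = is_anomalous(new_counts, np.array([855.0]), costs)
```

In the framework described above, such a flag would trigger further analysis of the application performance signature to distinguish a genuine anomaly from an expected change in application behavior.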


