Abstract
Automated tools for understanding application behavior and its changes during the application lifecycle are essential for many performance analysis and debugging tasks. Application performance issues have an immediate impact on customer experience and satisfaction. A sudden slowdown of an enterprise-wide application can affect a large population of customers, delay projects, and ultimately result in financial loss for the company. Significantly shortened time between new software releases further exacerbates the problem of thoroughly evaluating the performance of an updated application. Our thesis is that online performance modeling should be a part of routine application monitoring. Early, informative warnings on significant changes in application performance should help service providers identify and prevent performance problems, and their negative impact on the service, in a timely manner. We propose a novel framework for automated anomaly detection and application change analysis. It is based on the integration of two complementary techniques: (i) a regression-based transaction model that reflects a resource consumption model of the application, and (ii) an application performance signature that provides a compact model of the runtime behavior of the application. The proposed integrated framework provides a simple and powerful solution for anomaly detection and analysis of essential performance changes in application behavior. An additional benefit of the proposed approach is its simplicity: it is not intrusive and is based on monitoring data that is typically available in enterprise production environments. The introduced solution further enables the automation of capacity planning and resource provisioning tasks for multitier applications in rapidly evolving IT environments.
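The abstract does not spell out the form of the regression-based transaction model, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes that the CPU busy time in each monitoring window is approximately a linear combination of per-type transaction counts, fits the per-transaction costs by ordinary least squares, and flags windows whose observed busy time deviates from the prediction by more than a relative threshold. All names (`fit_costs`, `flag_anomalies`), the 20% threshold, and the sample data are hypothetical.

```python
# Illustrative sketch only: the linear model, names, threshold, and data
# below are assumptions, not details taken from the paper.

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_costs(counts, busy):
    """Least-squares fit of busy[w] ~ sum_i counts[w][i] * cost[i]
    using the normal equations (A^T A) cost = A^T busy."""
    k = len(counts[0])
    ata = [[sum(row[i] * row[j] for row in counts) for j in range(k)]
           for i in range(k)]
    atb = [sum(row[i] * y for row, y in zip(counts, busy)) for i in range(k)]
    return solve_linear(ata, atb)

def flag_anomalies(counts, busy, cost, threshold=0.2):
    """Return indices of windows whose relative prediction error
    exceeds the threshold (busy times are assumed to be positive)."""
    flagged = []
    for w, (row, y) in enumerate(zip(counts, busy)):
        predicted = sum(n * c for n, c in zip(row, cost))
        if abs(y - predicted) / y > threshold:
            flagged.append(w)
    return flagged

# Training windows: counts of two transaction types and CPU busy seconds.
train_counts = [[100, 20], [50, 60], [80, 40], [30, 90]]
train_busy = [3.0, 4.0, 3.6, 5.1]
cost = fit_costs(train_counts, train_busy)

# New windows: the second one uses far more CPU than the model predicts.
new_counts = [[60, 50], [70, 30]]
new_busy = [3.7, 5.0]
flagged = flag_anomalies(new_counts, new_busy, cost)
```

In this sketch, a flagged window only indicates that the fitted costs no longer explain the observed resource use. Distinguishing a transient anomaly from a genuine application change would need additional evidence, such as the performance signature mentioned in the abstract.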
Index Terms
Automated anomaly detection and performance modeling of enterprise applications