
Improved error reporting for software that uses black-box components

Published: 10 June 2007

Abstract

An error occurs when software cannot complete a requested action as a result of some problem with its input, configuration, or environment. A high-quality error report allows a user to understand and correct the problem. Unfortunately, the quality of error reports has been decreasing as software becomes more complex and layered. End users take the cryptic error messages that programs give them and struggle to fix their problems using search engines and support websites. Developers cannot improve their error messages when they receive an ambiguous or otherwise insufficient error indicator from a black-box software component.

We introduce Clarify, a system that improves error reporting by classifying application behavior. Clarify uses minimally invasive monitoring to generate a behavior profile, which is a summary of the program's execution history. A machine learning classifier uses the behavior profile to classify the application's behavior, thereby enabling a more precise error report than the output of the application itself.
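As a rough illustration of the idea, the sketch below treats a behavior profile as a table of function-call counts and classifies a new run by its closest per-label average. The profile format, the function names in the traces, the error labels, and the nearest-centroid rule are all assumptions made for this example; the abstract does not specify which features or which classifier Clarify uses.

```python
from collections import Counter


def behavior_profile(call_trace):
    """Summarize one execution as function-call counts (assumed profile format)."""
    return Counter(call_trace)


def train_centroids(labeled_traces):
    """Average the profiles seen for each label into one centroid per behavior."""
    sums, runs = {}, Counter()
    for trace, label in labeled_traces:
        sums.setdefault(label, Counter()).update(behavior_profile(trace))
        runs[label] += 1
    return {
        label: {func: count / runs[label] for func, count in total.items()}
        for label, total in sums.items()
    }


def classify(call_trace, centroids):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    profile = behavior_profile(call_trace)

    def distance(centroid):
        funcs = set(profile) | set(centroid)
        return sum((profile.get(f, 0) - centroid.get(f, 0)) ** 2 for f in funcs)

    return min(centroids, key=lambda label: distance(centroids[label]))


# Hypothetical runs that all end in the same ambiguous error message.
training = [
    (["parse_args", "open_input", "report_error"], "missing_file"),
    (["parse_args", "open_input", "open_input", "report_error"], "missing_file"),
    (["parse_args", "open_input", "check_acl", "report_error"], "bad_permissions"),
    (["parse_args", "check_acl", "check_acl", "report_error"], "bad_permissions"),
]
centroids = train_centroids(training)
new_run = ["parse_args", "open_input", "check_acl", "check_acl", "report_error"]
print(classify(new_run, centroids))
```

The point of the sketch is only that two runs can print the same message yet leave different execution summaries, which a classifier can tell apart.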

We evaluate a prototype Clarify system on ambiguous error messages generated by large, modern applications like gcc, LaTeX, and the Linux kernel. For a performance cost of less than 1% on user applications and 4.7% on the Linux kernel, the prototype correctly disambiguates at least 85% of application behaviors that result in ambiguous error reports. This accuracy does not degrade significantly with more behaviors: a Clarify classifier for 81 LaTeX error messages is at most 2.5% less accurate than a classifier for 27 LaTeX error messages. Finally, we show that without any human effort to build a classifier, Clarify can provide nearest-neighbor software support, where users who experience a problem are told about 5 other users who might have had the same problem. On average 2.3 of the 5 users that Clarify identifies have experienced the same problem.
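Nearest-neighbor support of this kind needs no labels, only a way to compare profiles. The sketch below is a minimal version under stated assumptions: profiles are call-count tables, distance is the sum of absolute count differences, and the user IDs and function names are hypothetical. The abstract does not say which distance measure Clarify uses.

```python
import heapq
from collections import Counter


def profile_distance(a, b):
    """Sum of absolute differences in call counts (an assumed distance measure)."""
    funcs = set(a) | set(b)
    return sum(abs(a.get(f, 0) - b.get(f, 0)) for f in funcs)


def nearest_users(query, stored_profiles, k=5):
    """Return the k user IDs whose stored profiles are closest to the query."""
    return heapq.nsmallest(
        k,
        stored_profiles,
        key=lambda user: profile_distance(query, stored_profiles[user]),
    )


# Hypothetical profiles previously collected from other users.
stored_profiles = {
    "user_a": Counter(open_input=3, report_error=1),
    "user_b": Counter(open_input=3, report_error=2),
    "user_c": Counter(check_acl=5),
}
query = Counter(open_input=3, report_error=1)
print(nearest_users(query, stored_profiles, k=2))
```

With `k=5`, this corresponds to telling a user about 5 others who might have had the same problem.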


Published in

ACM SIGPLAN Notices, Volume 42, Issue 6
Proceedings of the 2007 PLDI conference
June 2007
491 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/1273442

PLDI '07: Proceedings of the 28th ACM SIGPLAN Conference on Programming Language Design and Implementation
June 2007
508 pages
ISBN: 9781595936332
DOI: 10.1145/1250734

Copyright © 2007 ACM

Publisher

Association for Computing Machinery, New York, NY, United States
