research-article

A User-Friendly Log Viewer for Storage Systems

Published: 12 May 2016

Abstract

System log files contain messages emitted by several modules within a system and carry valuable information about the system state, such as device status and error conditions, and about the tasks running within the system, such as program names, execution paths (including function names and parameters), and task completion status. For customers with remote support, the system collects and transmits these logs to a central enterprise repository, where they are monitored for alerts, problem forecasting, and troubleshooting.

Very large log files are difficult for support engineers to interpret. A large volume of log messages may not pose a problem for an expert, but an inexperienced person can be overwhelmed by it. It is therefore often desirable to present log messages in a comprehensible manner, so that a person can view the important messages first and then go into the details if required.

In this article, we present a user-friendly log viewer that first hides the unimportant or inconsequential messages in a log file. A user can then click a particular hidden view to see the details of the hidden messages. Messages with low utility are considered inconsequential because their removal does not affect the end user's purposes, such as problem forecasting or troubleshooting. We relate the utility of a message to the probability of its appearance in the given context. We present machine-learning-based techniques that compute the usefulness of individual messages in a log file. We demonstrate how identifying and discarding inconsequential messages shrinks the log size to acceptable limits. We have tested this on real-world logs and observed that eliminating such low-value data can reduce log file size significantly (30% to 55%) with minimal error rates (7% to 20%). When limited user feedback is available, we show modifications to the technique that learn the user's intent and further reduce the error.
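The abstract does not spell out the model used to score messages, so the following is only a minimal illustrative sketch and not the authors' method. It assumes that a message's utility can be approximated by its surprisal, -log2 P(message | previous message), under a first-order Markov model of message types, so that highly predictable messages score low and are hidden. The function names (`train_bigram_model`, `utility`, `collapse_view`, `hidden_fraction`), the add-alpha smoothing, the threshold value, and the sample message types are all hypothetical.

```python
import math
from collections import Counter

START = "<START>"  # hypothetical marker for the beginning of a log


def train_bigram_model(training_logs):
    """Count how often each message type follows each preceding message type."""
    pair_counts = Counter()
    context_counts = Counter()
    for log in training_logs:
        prev = START
        for msg in log:
            pair_counts[(prev, msg)] += 1
            context_counts[prev] += 1
            prev = msg
    return pair_counts, context_counts


def utility(prev, msg, model, vocab_size, alpha=1.0):
    """ASSUMED utility measure: surprisal in bits, with add-alpha smoothing."""
    pair_counts, context_counts = model
    p = (pair_counts[(prev, msg)] + alpha) / (
        context_counts[prev] + alpha * vocab_size
    )
    return -math.log2(p)


def collapse_view(log, model, vocab_size, threshold):
    """Build a view where visible messages are strings and each run of
    hidden (low-utility) messages is a list that a viewer could expand."""
    view = []
    prev = START
    for msg in log:
        if utility(prev, msg, model, vocab_size) < threshold:
            if view and isinstance(view[-1], list):
                view[-1].append(msg)   # extend the current hidden run
            else:
                view.append([msg])     # start a new hidden run
        else:
            view.append(msg)
        prev = msg
    return view


def hidden_fraction(view):
    """Fraction of messages hidden by the view (the log size reduction)."""
    hidden = sum(len(item) for item in view if isinstance(item, list))
    visible = sum(1 for item in view if not isinstance(item, list))
    total = hidden + visible
    return hidden / total if total else 0.0


# Toy usage: a routine sequence seen ten times, then a log with an anomaly.
model = train_bigram_model([["mount", "scan", "ok"]] * 10)
view = collapse_view(["mount", "scan", "disk_error", "ok"], model,
                     vocab_size=4, threshold=1.0)
print(view)                   # [['mount', 'scan'], 'disk_error', 'ok']
print(hidden_fraction(view))  # 0.5
```

In this toy run the routine messages are folded into one expandable group, while the unexpected `disk_error` and the message that follows it stay visible. An error rate like the one reported in the abstract would additionally require labels or user feedback marking which hidden messages were in fact important, which this sketch does not model.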


Published in

ACM Transactions on Storage, Volume 12, Issue 3
June 2016
237 pages
ISSN: 1553-3077
EISSN: 1553-3093
DOI: 10.1145/2932205

Copyright © 2016 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 12 May 2016
• Accepted: 1 November 2015
• Revised: 1 July 2015
• Received: 1 January 2015

Qualifiers

• research-article
• Research
• Refereed
