research-article
Open Access

Long-Span Program Behavior Modeling and Attack Detection

Published: 20 September 2017

Abstract

The evolution of program anomaly detection methods reflects the intertwined development of program attacks and defenses. Emerging categories of program attacks, e.g., non-control data attacks and data-oriented programming, can conform to normal trace patterns when viewed locally. This article points out the deficiency of existing program anomaly detection models against these new attacks and presents long-span behavior anomaly detection (LAD), a model based on mildly context-sensitive grammar verification. The key feature of LAD is its reasoning about correlations among arbitrary events that occur in long program traces. It extends existing correlation analysis between events at a stack snapshot, e.g., paired call and ret, to correlation analysis among events that occurred at any earlier point in the execution. The proposed method leverages specialized machine learning techniques to probe the boundaries of normal program behavior in a vast, high-dimensional detection space. Its two-stage modeling/detection design analyzes event correlation at both binary and quantitative levels. Our prototype successfully detects all reproduced real-world attacks against sshd, libpcre, and sendmail. The detection procedure incurs 0.1 ms to 1.3 ms of overhead to profile and analyze a single behavior instance that consists of tens of thousands of function call or system call events.
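
As a rough illustration of what analyzing event correlation "at both binary and quantitative levels" could look like, the following toy sketch checks a trace in two stages: first whether its events co-occur in combinations seen during training, then whether each event's count stays within the range seen during training. This is an assumption-laden simplification, not the LAD algorithm: the class name ToyLongSpanModel, the pairwise co-occurrence test, and the min/max count ranges are all hypothetical stand-ins for the machine learning techniques the abstract mentions.

```python
from collections import Counter
from itertools import combinations


def event_pairs(trace):
    """Binary view: unordered pairs of distinct events that co-occur in one trace."""
    return set(combinations(sorted(set(trace)), 2))


class ToyLongSpanModel:
    """Hypothetical two-stage check for illustration; not the LAD implementation."""

    def __init__(self):
        self.known_events = set()
        self.known_pairs = set()
        self.count_range = {}  # event -> (min count, max count) over normal traces

    def fit(self, normal_traces):
        counts = [Counter(trace) for trace in normal_traces]
        for trace in normal_traces:
            self.known_events |= set(trace)
            self.known_pairs |= event_pairs(trace)
        for event in self.known_events:
            # A Counter returns 0 for events that a trace does not contain.
            per_trace = [c[event] for c in counts]
            self.count_range[event] = (min(per_trace), max(per_trace))
        return self

    def detect(self, trace):
        # Stage 1 (binary): every event and every co-occurring pair must be known.
        if not set(trace) <= self.known_events:
            return "co-occurrence anomaly"
        if not event_pairs(trace) <= self.known_pairs:
            return "co-occurrence anomaly"
        # Stage 2 (quantitative): every event count must lie in its trained range.
        counts = Counter(trace)
        for event, (low, high) in self.count_range.items():
            if not low <= counts[event] <= high:
                return "frequency anomaly"
        return "normal"


normal_traces = [
    ["open", "read", "read", "close"],
    ["open", "read", "close"],
    ["open", "write", "close"],
]
model = ToyLongSpanModel().fit(normal_traces)
mixed = model.detect(["open", "read", "write", "close"])
flood = model.detect(["open"] + ["read"] * 50 + ["close"])
usual = model.detect(["open", "read", "close"])
```

In this toy setup, a trace that mixes read and write is flagged at the binary stage because that pair never co-occurred in training, while a trace with an unusually large number of read events passes the binary stage and is flagged only at the quantitative stage. Every individual event in both traces is locally normal, which is the kind of case the abstract argues a long-span view is needed for.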
