A semantics-based approach to malware detection

Published: 04 September 2008

Abstract

Malware detection is a crucial aspect of software security. Current malware detectors work by checking for signatures, which attempt to capture the syntactic characteristics of the machine-level byte sequence of the malware. This reliance on a syntactic approach makes current detectors vulnerable to code obfuscations, increasingly used by malware writers, that alter the syntactic properties of the malware byte sequence without significantly affecting its execution behavior.
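The weakness of purely syntactic signatures can be illustrated with a small sketch. The byte signature, the `signature_match` and `insert_nop` helpers, and the sample programs below are hypothetical and chosen only for illustration; they are not taken from any real detector. The example shows that inserting a single `nop` at an instruction boundary leaves the behavior unchanged but defeats an exact byte-sequence match.

```python
# Hypothetical signature: x86 bytes for `mov eax, 1` followed by `int 0x80`.
SIGNATURE = bytes([0xB8, 0x01, 0x00, 0x00, 0x00, 0xCD, 0x80])
NOP = 0x90  # x86 one-byte no-op


def signature_match(program: bytes, signature: bytes = SIGNATURE) -> bool:
    """Syntactic detection: report a hit if the exact byte sequence occurs."""
    return signature in program


def insert_nop(code: bytes, offset: int) -> bytes:
    """Obfuscate by inserting a NOP at the given offset.

    The caller is assumed to pass an instruction boundary, so the
    execution behavior of the code is unchanged.
    """
    return code[:offset] + bytes([NOP]) + code[offset:]


# `xor ebx, ebx` followed by the signature bytes.
original = bytes([0x31, 0xDB]) + SIGNATURE

# Insert a NOP between `mov eax, 1` (5 bytes, starting at offset 2)
# and `int 0x80`.
obfuscated = insert_nop(original, 2 + 5)

print(signature_match(original))    # the unmodified sample is flagged
print(signature_match(obfuscated))  # the obfuscated variant is missed
```

The obfuscated variant executes the same instructions apart from the no-op, yet the byte-level signature no longer occurs in it.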

This paper takes the position that the key to malware identification lies in its semantics. It proposes a semantics-based framework for reasoning about malware detectors and proving properties such as soundness and completeness of these detectors. Our approach uses a trace semantics to characterize the behavior of malware as well as that of the program being checked for infection, and uses abstract interpretation to “hide” irrelevant aspects of these behaviors. As a concrete application of our approach, we show that (1) standard signature-matching detection schemes are generally sound but not complete, (2) the semantics-aware malware detector proposed by Christodorescu et al. is complete with respect to a number of common obfuscations used by malware writers, and (3) the malware detection scheme proposed by Kinder et al. and based on standard model-checking techniques is sound in general and complete on some, but not all, obfuscations handled by the semantics-aware malware detector.
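The idea of comparing abstracted traces rather than syntax can be sketched loosely as follows. This is a toy model under stated assumptions, not the paper's formal framework: the three-instruction language, the `run`, `abstract`, `contains`, and `is_infected` functions, and the choice of "observed" variables are all hypothetical. Here a trace is the sequence of environments after each instruction, and the abstraction keeps only the observed variables and collapses repeated states, which hides no-ops and writes to unrelated variables.

```python
from typing import Dict, Iterable, List, Tuple

# Toy instructions (hypothetical): ("set", var, n), ("add", var, n), ("nop",)
Instr = Tuple
View = Tuple[Tuple[str, int], ...]


def run(program: List[Instr]) -> List[Dict[str, int]]:
    """Trace semantics: the environment after each executed instruction."""
    env: Dict[str, int] = {}
    trace: List[Dict[str, int]] = []
    for instr in program:
        op = instr[0]
        if op == "set":
            env[instr[1]] = instr[2]
        elif op == "add":
            env[instr[1]] = env.get(instr[1], 0) + instr[2]
        elif op != "nop":
            raise ValueError(f"unknown instruction: {instr!r}")
        trace.append(dict(env))
    return trace


def abstract(trace: List[Dict[str, int]], observed: Iterable[str]) -> List[View]:
    """Keep only the observed variables and collapse consecutive repeats."""
    names = sorted(observed)
    result: List[View] = []
    for env in trace:
        view = tuple((v, env[v]) for v in names if v in env)
        if not result or result[-1] != view:
            result.append(view)
    return result


def contains(haystack: list, needle: list) -> bool:
    """True if needle occurs as a contiguous sublist of haystack."""
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))


def is_infected(program: List[Instr], malware: List[Instr],
                observed: Iterable[str]) -> bool:
    """Simplified test: the abstract malware trace occurs in the program's."""
    names = list(observed)
    return contains(abstract(run(program), names), abstract(run(malware), names))


malware = [("set", "x", 1), ("add", "x", 2)]
# Same behavior on x, obfuscated with a no-op and writes to an unrelated y.
variant = [("set", "y", 7), ("set", "x", 1), ("nop",), ("set", "y", 9), ("add", "x", 2)]
benign = [("set", "x", 5)]

print(contains(variant, malware))            # syntactic comparison
print(is_infected(variant, malware, {"x"}))  # comparison of abstract traces
print(is_infected(benign, malware, {"x"}))
```

In this sketch the instruction list of `malware` does not occur in `variant`, so a syntactic comparison misses it, while the abstract traces on `x` still match. Requiring a contiguous match of abstract states is a simplification made for brevity.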

References

  1. Adleman, L. M. 1988. An abstract theory of computer viruses. In Proceedings of Advances in Cryptology (CRYPTO'88). Lecture Notes in Computer Science, vol. 403. Springer, Berlin, Germany.
  2. Barak, B., Goldreich, O., Impagliazzo, R., Rudich, S., Sahai, A., Vadhan, S., and Yang, K. 2001. On the (im)possibility of obfuscating programs. In Proceedings of the Advances in Cryptology (CRYPTO'01). Lecture Notes in Computer Science, vol. 2139. Springer, 1--18.
  3. Bergeron, J., Debbabi, M., Desharnais, J., Erhioui, M. M., Lavoie, Y., and Tawbi, N. 2001. Static detection of malicious code in executable programs. Symposium on Requirements Engineering for Information Security. http://www.sreis.org/old/2001/index.html.
  4. Briesemeister, L., Porras, P. A., and Tiwari, A. 2005. Model checking of worm quarantine and counter-quarantine under a group defense. Tech. rep. SRI-CSL-05-03, Computer Science Laboratory. SRI International.
  5. Chess, D. and White, S. 2000. An undetectable computer virus. In Proceedings of the Virus Bulletin Conference (VB2000). Virus Bulletin, Orlando, FL.
  6. Chow, S., Gu, Y., Johnson, H., and Zakharov, V. 2001. An approach to the obfuscation of control-flow of sequential computer programs. In Proceedings of the 4th International Information Security Conference (ISC'01), G. Davida and Y. Frankel, Eds. Lecture Notes in Computer Science, vol. 2200. Springer, 144--155.
  7. Christodorescu, M. and Jha, S. 2003. Static analysis of executables to detect malicious patterns. In Proceedings of the 12th USENIX Security Symposium (Security'03). USENIX Association, Berkeley, CA, 169--186.
  8. Christodorescu, M., Jha, S., and Kruegel, C. 2007. Mining specifications of malicious behavior. In Proceedings of the 6th Joint Meeting European Software Engineering Conference and the ACM SIGSOFT International Symposium on Foundations of Software Engineering (ESEC/FSE'07).
  9. Christodorescu, M., Jha, S., Seshia, S. A., Song, D., and Bryant, R. E. 2005. Semantics-aware malware detection. In Proceedings of the 2005 IEEE Symposium on Security and Privacy (S&P'05). IEEE Computer Society, Los Alamitos, CA, 32--46.
  10. Christodorescu, M., Kinder, J., Jha, S., Katzenbeisser, S., and Veith, H. 2005. Malware normalization. Tech. rep. 1539, University of Wisconsin, Madison, WI.
  11. Clarke, E. M., Jr., Grumberg, O., and Peled, D. A. 2001. Model Checking. MIT Press, Cambridge, MA.
  12. Cohen, F. 1985. Computer viruses. Ph.D. thesis, University of Southern California.
  13. Cohen, F. 1989. Computational aspects of computer viruses. Comput. Secur. 8, 4, 325.
  14. Cohen, F. B. 1987. Computer viruses: Theory and experiments. Comput. Secur. 6, 22--35.
  15. Collberg, C., Thomborson, C., and Low, D. 1997. A taxonomy of obfuscating transformations. Tech. rep. 148, Department of Computer Sciences, University of Auckland.
  16. Collberg, C., Thomborson, C., and Low, D. 1998. Manufacturing cheap, resilient, and stealthy opaque constructs. In Proceedings of the 25th ACM SIGPLAN--SIGACT Symposium on Principles of Programming Languages (POPL'98). ACM Press, 184--196.
  17. Cousot, P. and Cousot, R. 1977. Abstract interpretation: A unified lattice model for static analysis of programs by construction of approximation of fixed points. In Proceedings of the 4th ACM SIGPLAN--SIGACT Symposium on Principles of Programming Languages (POPL'77). ACM Press, 238--252.
  18. Cousot, P. and Cousot, R. 1979. Systematic design of program analysis frameworks. In Proceedings of the 6th ACM SIGPLAN--SIGACT Symposium on Principles of Programming Languages (POPL'79). ACM Press, 269--282.
  19. Cousot, P. and Cousot, R. 1992. Abstract interpretation frameworks. J. Logic Comput. 2, 4 (Aug.), 511--547.
  20. Cousot, P. and Cousot, R. 2002. Systematic design of program transformation frameworks by abstract interpretation. In Proceedings of the 29th ACM SIGPLAN--SIGACT Symposium on Principles of Programming Languages (POPL'02). ACM Press, 178--190.
  21. Dalla Preda, M., Christodorescu, M., Jha, S., and Debray, S. 2007. A semantics-based approach to malware detection. In Proceedings of the 32nd ACM Symp. on Principles of Programming Languages (POPL'07). ACM Press, 377--388.
  22. Dalla Preda, M. and Giacobazzi, R. 2005. Control code obfuscation by abstract interpretation. In Proceedings of the 3rd IEEE International Conference on Software Engineering and Formal Methods (SEFM'05). IEEE Computer Society, Los Alamitos, CA, USA, 301--310.
  23. Dalla Preda, M. and Giacobazzi, R. 2005. Semantics-based code obfuscation by abstract interpretation. In Proceedings of the 32nd International Colloquium on Automata, Languages and Programming (ICALP'05). Lecture Notes in Computer Science, vol. 3580. Springer, 1325--1336.
  24. Detristan, T., Ulenspiegel, T., Malcom, Y., and von Underduk, M. S. 2003. Polymorphic shellcode engine using spectrum analysis. Phrack 11, 61. http://www.phrack.org.
  25. Goldwasser, S. and Kalai, Y. T. 2005. On the impossibility of obfuscation with auxiliary input. In Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS'05). IEEE Computer Society, 553--562.
  26. Gupta, A. and Sekar, R. 2003. An approach for detecting self-propagating email using anomaly detection. In Proceedings of the 6th International Symposium on Recent Advances in Intrusion Detection (RAID'03), G. Vigna, E. Jonsson, and C. Kruegel, Eds. Lecture Notes in Computer Science, vol. 2820. Springer, 55--72.
  27. Intel Corporation. 2001. IA-32 Intel Architecture Software Developer's Manual. Intel Corporation.
  28. Jordan, M. 2002. Dealing with metamorphism. Virus Bull. 10, 4--6.
  29. Kinder, J., Katzenbeisser, S., Schallhart, C., and Veith, H. 2005. Detecting malicious code by model checking. In Proceedings of the 2nd International Conference on Intrusion and Malware Detection and Vulnerability Assessment (DIMVA'05), K. Julisch and C. Krügel, Eds. Lecture Notes in Computer Science, vol. 3548. Springer, 174--187.
  30. Kolter, J. Z. and Maloof, M. A. 2004. Learning to detect malicious executables in the wild. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'04). ACM Press, 470--478.
  31. Lakhotia, A. and Mohammed, M. 2004. Imposing Order on Program Statements to Assist Anti-Virus Scanners. In Proceedings of the 11th Working Conference on Reverse Engineering (WCRE'04). IEEE Computer Society, 161--170.
  32. Lakhotia, A. and Singh, P. K. 2000. Challenges in getting “formal” with viruses. In Virus Bull.
  33. Lee, W., Nimbalkar, R. A., Yee, K. K., Patil, S. B., Desai, P. H., Tran, T. T., and Stolfo, S. J. 2000. A data mining and CIDF based approach for detecting novel and distributed intrusions. In Proceedings of the 3rd International Workshop on Recent Advances in Intrusion Detection (RAID 2000). Lecture Notes in Computer Science, vol. 1907. Springer, 49--65.
  34. Lee, W. and Stolfo, S. 1998. Data mining approaches for intrusion detection. In Proceedings of the 7th USENIX Security Symposium. USENIX Association, 79--93.
  35. Lee, W., Stolfo, S., and Mok, K. W. 1999. A data mining framework for building intrusion detection models. In Proceedings of the IEEE Symposium on Security and Privacy (S&P'99). IEEE Computer Society, Los Alamitos, CA, USA, 120--132.
  36. Li, W.-J., Wang, K., Stolfo, S. J., and Herzog, B. 2005. Fileprints: Identifying file types by n-gram analysis. In Proceedings of the 6th Annual IEEE Systems, Man, and Cybernetics (SMC) Workshop on Information Assurance (IAW'05). IEEE Computer Society, 64--71.
  37. Linn, C. and Debray, S. 2003. Obfuscation of executable code to improve resistance to static disassembly. In Proceedings of the 10th ACM Conference on Computer and Communications Security (CCS'03). ACM Press, 290--299.
  38. Lo, R. W., Levitt, K. N., and Olsson, R. A. 1995. Mcf: A malicious code filter. Comput. Secur. 14, 541--566.
  39. McHugh, J. 2001. Intrusion and intrusion detection. Int. J. Inform. Secur. 1, 1, 14--35.
  40. Morley, P. 2001. Processing virus collections. In Proceedings of the Virus Bulletin Conference (VB2001). Virus Bulletin, 129--134.
  41. Nachenberg, C. 1997. Computer virus-antivirus coevolution. Comm. ACM 40, 1, 46--51.
  42. Rajaat. 1999. Polymorphism. 29A Mag. 1, 3, 1--2.
  43. Singh, P. and Lakhotia, A. 2003. Static verification of worm and virus behaviour in binary executables using model checking. In Proceedings of the 4th IEEE Information Assurance Workshop. IEEE Computer Society, Los Alamitos, CA, USA.
  44. Symantec Corporation. 2006. Symantec Internet Security Threat Report: Trends for January 06--June 06. Vol. X. Symantec Corporation, Cupertino, CA.
  45. Ször, P. 2005. The Art of Computer Virus Research and Defense. Addison-Wesley Professional, Boston, MA.
  46. Ször, P. and Ferrie, P. 2001. Hunting for metamorphic. In Proceedings of the Virus Bulletin Conference (VB2001). Virus Bulletin, 123--144.
  47. Walenstein, A., Mathur, R., Chouchane, M. R., and Lakhotia, A. 2006. Normalizing Metamorphic Malware Using Term Rewriting. In Proceedings of the 6th International Workshop on Source Code Analysis and Manipulation (SCAM'06). IEEE Computer Society Press, 75--84.
  48. Wee, H. 2005. On obfuscating point functions. In Proceedings of the 37th Annual ACM Symposium on Theory of Computing (STOC'05). ACM Press, 523--532.
  49. zombie. 2001a. Automated reverse engineering: Mistfall engine. Published online at http://www.madchat.org//vxdevl/papers/vxers/Z0mbie/autorev.txt (last accessed on Sep. 29, 2006).
  50. zombie. 2001b. Real Permutating [sic] Engine. Published online at http://vx.netlux.org/vx.php?id=er05.

Published in

ACM Transactions on Programming Languages and Systems, Volume 30, Issue 5
August 2008
193 pages
ISSN: 0164-0925
EISSN: 1558-4593
DOI: 10.1145/1387673

Copyright © 2008 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 4 September 2008
• Accepted: 1 October 2007
• Received: 1 July 2007

Qualifiers

• research-article
• Research
• Refereed
