
Exploiting Content Spatial Distribution to Improve Detection of Intrusions

Published: 20 January 2018

Abstract

We present PCkAD, a novel semisupervised anomaly-based IDS (Intrusion Detection System) technique that detects application-level content-based attacks. Its distinguishing feature is that it learns legitimate payloads by splitting packets into chunks and determining the within-packet distribution of n-grams. This strategy is resistant to evasion techniques such as blending. We prove that finding the right legitimate content is NP-hard in the presence of chunks. Moreover, the strategy improves the false-positive rate for a given detection rate with respect to the case where the spatial information is not considered. A comparison with well-known IDSs that use n-grams shows that PCkAD achieves state-of-the-art performance.
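The idea of learning where n-grams occur within a packet can be illustrated with a minimal Python sketch. This is not the PCkAD algorithm itself, whose details the abstract does not give. The function names (`chunk_ngrams`, `train`, `anomaly_score`), the fixed chunk length, and the "fraction of unseen n-grams" score are all assumptions made for illustration only.

```python
from collections import defaultdict


def chunk_ngrams(payload: bytes, chunk_len: int, n: int):
    """Yield (chunk_index, ngram) pairs for every n-gram inside each chunk."""
    for start in range(0, len(payload), chunk_len):
        chunk = payload[start:start + chunk_len]
        idx = start // chunk_len
        for i in range(len(chunk) - n + 1):
            yield idx, chunk[i:i + n]


def train(payloads, chunk_len=16, n=3):
    """Record, per chunk position, the n-grams seen in legitimate payloads."""
    model = defaultdict(set)
    for payload in payloads:
        for idx, gram in chunk_ngrams(payload, chunk_len, n):
            model[idx].add(gram)
    return model


def anomaly_score(model, payload, chunk_len=16, n=3):
    """Fraction of n-grams never seen at their chunk position in training."""
    pairs = list(chunk_ngrams(payload, chunk_len, n))
    if not pairs:
        return 0.0
    unseen = sum(1 for idx, gram in pairs if gram not in model.get(idx, ()))
    return unseen / len(pairs)


# Hypothetical toy data: one legitimate payload, and a payload that reuses
# exactly the same bytes but places them in different chunks.
legit = b"GET /index.html HTTP/1.1"
shuffled = b"HTTP/1.1" + b"ex.html " + b"GET /ind"

model = train([legit], chunk_len=8, n=3)
score_legit = anomaly_score(model, legit, chunk_len=8, n=3)
score_shuffled = anomaly_score(model, shuffled, chunk_len=8, n=3)
```

In this toy setting the rearranged payload contains only n-grams that also occur in the legitimate one, so a model that ignores position would not flag it. The position-aware score is nonzero because two of its three chunks hold n-grams at positions where training never saw them.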


Published in

ACM Transactions on Internet Technology, Volume 18, Issue 2
Special Issue on Internetware and Devops and Regular Papers
May 2018
294 pages
ISSN: 1533-5399
EISSN: 1557-6051
DOI: 10.1145/3182619
Editor: Munindar P. Singh

Copyright © 2018 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 20 January 2018
• Accepted: 1 September 2017
• Revised: 1 July 2017
• Received: 1 October 2016

Qualifiers

• research-article
• Research
• Refereed
