Abstract
In this article, we propose Segugio, a novel defense system that efficiently tracks the occurrence of new malware-control domain names in very large ISP networks. Segugio passively monitors DNS traffic to build a machine-domain bipartite graph that represents which machines query which domains. After labeling the nodes in this query behavior graph that are known to be either benign or malware-related, we propose a novel approach to accurately detect previously unknown malware-control domains.
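The graph construction and labeling step described above can be illustrated with a minimal Python sketch. It is not Segugio's implementation: the function names, the `(machine_id, domain)` log format, the rule that a machine counts as "infected" once it queries a known malware-control domain, and the `infected_fraction` score are all assumptions made for illustration only.

```python
from collections import defaultdict


def build_query_graph(dns_log):
    """Build a machine-domain bipartite graph from (machine_id, domain) pairs.

    The input format is hypothetical; a real deployment would parse
    passive DNS logs instead.
    """
    machine_to_domains = defaultdict(set)
    domain_to_machines = defaultdict(set)
    for machine, domain in dns_log:
        domain = domain.lower().rstrip(".")  # normalize case and trailing dot
        machine_to_domains[machine].add(domain)
        domain_to_machines[domain].add(machine)
    return machine_to_domains, domain_to_machines


def label_nodes(machine_to_domains, domain_to_machines,
                malware_domains, benign_domains):
    """Label domain nodes from known lists, then label machine nodes.

    Assumption: a machine is "infected" if it queries at least one
    known malware-control domain, and "unknown" otherwise.
    """
    domain_labels = {}
    for domain in domain_to_machines:
        if domain in malware_domains:
            domain_labels[domain] = "malware"
        elif domain in benign_domains:
            domain_labels[domain] = "benign"
        else:
            domain_labels[domain] = "unknown"

    machine_labels = {}
    for machine, domains in machine_to_domains.items():
        infected = any(domain_labels[d] == "malware" for d in domains)
        machine_labels[machine] = "infected" if infected else "unknown"
    return domain_labels, machine_labels


def infected_fraction(domain, domain_to_machines, machine_labels):
    """Illustrative score: share of a domain's querying machines labeled infected."""
    machines = domain_to_machines.get(domain, set())
    if not machines:
        return 0.0
    hits = sum(1 for m in machines if machine_labels[m] == "infected")
    return hits / len(machines)
```

An unlabeled domain that is queried mostly by machines already known to contact malware-control domains would receive a high `infected_fraction`, which is the kind of "who is querying what" evidence the graph makes available.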
We implemented a proof-of-concept version of Segugio and deployed it in large ISP networks that serve millions of users. Our experimental results show that Segugio can track the occurrence of new malware-control domains with up to 94% true positives (TPs) at less than 0.1% false positives (FPs). In addition, we show that (1) Segugio can also detect control domains related to new, previously unseen malware families, with 85% TPs at 0.1% FPs; (2) Segugio’s detection models learned on traffic from a given ISP network can be deployed into a different ISP network and still achieve very high detection accuracy; (3) new malware-control domains can be detected days or even weeks before they appear in a large commercial domain-name blacklist; (4) Segugio can be used to detect previously unknown malware-infected machines in ISP networks; and (5) Segugio clearly outperforms domain-reputation systems based on Belief Propagation.
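To make the "TPs at a given FP rate" figures concrete, the following Python sketch shows one common way to read a true-positive rate off a set of detector scores under a false-positive budget. The function name, the score convention (higher means more suspicious), and the thresholding rule are assumptions for illustration; the abstract does not specify how the reported operating points were computed.

```python
def tp_rate_at_fp_budget(scores, labels, max_fp_rate):
    """Return (tp_rate, fp_rate) at the loosest threshold within the FP budget.

    scores: detector outputs, higher = more suspicious (assumed convention).
    labels: True for malware-control domains, False for benign domains.
    """
    negatives = sorted((s for s, y in zip(scores, labels) if not y), reverse=True)
    positives = [s for s, y in zip(scores, labels) if y]
    if not negatives or not positives:
        raise ValueError("need at least one positive and one negative example")

    # Number of benign domains we may flag without exceeding the budget.
    allowed_fp = int(max_fp_rate * len(negatives))

    # Flag a domain only if its score is strictly above the threshold, so at
    # most `allowed_fp` benign domains are flagged even when scores tie.
    if allowed_fp < len(negatives):
        threshold = negatives[allowed_fp]
    else:
        threshold = float("-inf")

    tp = sum(1 for s in positives if s > threshold)
    fp = sum(1 for s in negatives if s > threshold)
    return tp / len(positives), fp / len(negatives)
```

With a budget of 0.001 (0.1% FPs) and 100,000 benign test domains, for example, this rule allows at most 100 benign domains to be flagged.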
Efficient and Accurate Behavior-Based Tracking of Malware-Control Domains in Large ISP Networks