ABSTRACT
Twitter is a popular public source for threat hunting. Many security vendors and security professionals use Twitter in practice to collect Indicators of Compromise (IOCs). However, little is known about IOCs on Twitter: important characteristics such as earliness, uniqueness, and accuracy have never been investigated, and it is not obvious how to extract IOCs from Twitter with high accuracy. In this paper, we present Twiti, a system that automatically extracts various forms of malware IOCs from Twitter. Twiti uses natural language processing and machine learning techniques to identify tweets that contain malware IOC information and extracts IOCs from them. Based on the collected IOCs, we conduct the first empirical assessment and thorough analysis of malware IOCs on Twitter. Through extensive evaluation, we demonstrate that Twiti extracts malware IOCs accurately and that the extracted IOCs are unique and early. By analyzing the IOCs collected by Twiti from various aspects, we find that Twitter captures ongoing malware threats, such as Emotet variants and malware distribution sites, better than other public threat intelligence (TI) feeds. We also find that only a tiny fraction of IOCs on Twitter come from commercial vendor accounts and that individual Twitter users are the main contributors of early-detected or exclusive IOCs, which indicates that Twitter can provide many valuable IOCs that are not covered in the commercial domain.
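To make the extraction step concrete, the following is a minimal sketch of pulling candidate IOCs out of a single tweet. It is not Twiti's implementation: it assumes a simple regex-based approach, covers only the pattern-matching part (not the NLP and machine learning classification of tweets), and the function names `refang` and `extract_iocs`, the defanging rules, and the set of IOC types are all illustrative assumptions.

```python
import re

# Defanging conventions often seen in security tweets.
# This rule list is an assumption for illustration and is not exhaustive.
_REFANG_RULES = [
    (re.compile(r"hxxp", re.IGNORECASE), "http"),
    (re.compile(r"\[\.\]|\(\.\)|\[dot\]", re.IGNORECASE), "."),
    (re.compile(r"\[:\]"), ":"),
]

# One octet of an IPv4 address (0-255).
_OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"

# Candidate IOC patterns (hypothetical selection of types).
_PATTERNS = {
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
    "md5": re.compile(r"\b[a-fA-F0-9]{32}\b"),
    "ipv4": re.compile(r"\b(?:" + _OCTET + r"\.){3}" + _OCTET + r"\b"),
    "url": re.compile(r"\bhttps?://[^\s\"'<>]+", re.IGNORECASE),
}


def refang(text):
    """Undo common defanging so that the patterns above can match."""
    for pattern, replacement in _REFANG_RULES:
        text = pattern.sub(replacement, text)
    return text


def extract_iocs(tweet):
    """Return a dict mapping each IOC type to its sorted, de-duplicated matches."""
    clean = refang(tweet)
    return {kind: sorted(set(p.findall(clean))) for kind, p in _PATTERNS.items()}
```

A call such as `extract_iocs("#Emotet hxxp://evil[.]example[.]com/a.exe C2 192[.]0[.]2[.]10")` returns a dictionary keyed by IOC type. Pattern matching alone yields only candidates; a system like the one described would still need to decide whether the tweet actually reports malware and whether each match is malicious.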
#Twiti: Social Listening for Threat Intelligence