Abstract
We propose to identify compromised mobile devices from a network administrator’s point of view. Intuitively, careless users (and thus their devices) who download apps from untrustworthy markets are often lured into installing malicious apps through in-app advertisements or phishing. We therefore hypothesize that devices sharing similar apps have a similar likelihood of being compromised, which creates an association between a compromised device and its apps. We propose to leverage such associations to identify unknown compromised devices using the guilt-by-association principle. Admittedly, such associations can be relatively weak, because it is hard, if not impossible, for an app to automatically download and install other apps without explicit user initiation. We describe how to magnify these associations by carefully choosing parameters when applying graph-based inference. We empirically evaluate the effectiveness of our approach on real datasets provided by a major mobile service provider. Specifically, we show that our approach achieves nearly 98% AUC (area under the ROC curve) and, by expanding the limited knowledge of known devices, detects up to 6 to 7 times as many new compromised devices as the ground truth covers. We show that the newly detected devices indeed exhibit undesirable behavior, namely leaking private information and accessing risky IPs and domains. We further conduct an in-depth analysis of the effectiveness of graph inference to understand the unique structure of the associations between mobile devices and their apps and its impact on graph inference, and based on this analysis we propose how to choose key parameters.
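The guilt-by-association idea can be illustrated with a small sketch. It is not the authors' inference algorithm, which the abstract does not specify; it is a simple score-propagation scheme over a device-app bipartite graph, written under stated assumptions. The function name `propagate`, the `damping` and `rounds` parameters, and the toy device and app identifiers are all hypothetical.

```python
from collections import defaultdict


def propagate(device_apps, seeds, rounds=10, damping=0.5):
    """Spread suspicion scores over a device-app bipartite graph.

    device_apps maps a device id to the set of apps seen on it.
    seeds is the set of devices already known to be compromised.
    Illustrative only: a plain averaging scheme, not the paper's method.
    """
    # Invert the mapping so each app knows which devices use it.
    app_devices = defaultdict(set)
    for device, apps in device_apps.items():
        for app in apps:
            app_devices[app].add(device)

    # Known compromised devices start at 1.0, all others at 0.0.
    device_score = {d: (1.0 if d in seeds else 0.0) for d in device_apps}

    for _ in range(rounds):
        # An app's score is the mean score of the devices that use it.
        app_score = {
            app: sum(device_score[d] for d in devices) / len(devices)
            for app, devices in app_devices.items()
        }
        new_score = {}
        for device, apps in device_apps.items():
            if device in seeds:
                new_score[device] = 1.0  # ground-truth labels stay clamped
            elif apps:
                mean_app = sum(app_score[a] for a in apps) / len(apps)
                new_score[device] = damping * mean_app
            else:
                new_score[device] = 0.0
        device_score = new_score
    return device_score


# Toy data (hypothetical): d1 is the only known compromised device.
toy_graph = {
    "d1": {"a1", "a2"},
    "d2": {"a1", "a2"},  # shares both apps with d1
    "d3": {"a2", "a3"},  # shares one app with d1
    "d4": {"a4"},        # shares nothing with d1
}
scores = propagate(toy_graph, seeds={"d1"})
for device in sorted(scores, key=scores.get, reverse=True):
    print(device, round(scores[device], 3))
```

In this toy graph, a device that shares more apps with the known compromised device ends up with a higher score, and a device with no shared apps stays at zero. The `damping` factor stands in for the kind of tunable parameter the abstract says must be chosen carefully when associations are weak.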
DeviceWatch: A Data-Driven Network Analysis Approach to Identifying Compromised Mobile Devices with Graph-Inference