Abstract
Traffic classification is essential in network management, for operations ranging from capacity planning, performance monitoring, volumetry, and resource provisioning to anomaly detection and security. It has recently become increasingly challenging with the widespread adoption of encryption on the Internet, e.g., as the de facto standard in the HTTP/2 and QUIC protocols. In the current state of encrypted traffic classification using Deep Learning (DL), we identify fundamental issues in the way the problem is typically approached. For instance, although complex DL models with millions of parameters are being used, these models implement a relatively simple logic based on certain header fields of the TLS handshake, which limits their robustness to future versions of encrypted protocols. Furthermore, encrypted traffic is often treated like any other raw input for DL, while crucial domain-specific considerations are commonly ignored. In this paper, we design a novel feature engineering approach that generalizes well across encrypted web protocols, and we develop a neural network architecture based on stacked Long Short-Term Memory (LSTM) layers and Convolutional Neural Networks (CNN) that works very well with our feature design. We evaluate our approach on a real-world traffic dataset from a major ISP and Mobile Network Operator. We achieve an accuracy of 95% in service classification with less raw traffic and a smaller number of parameters, outperforming a state-of-the-art method with nearly 50% fewer false classifications. We show that our DL model generalizes to different classification objectives and encrypted web protocols. We also evaluate our approach on a public QUIC dataset with finer, application-level granularity in labeling, achieving an overall accuracy of 99%.
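The "nearly 50% fewer false classifications" figure is a relative reduction in error rate, not a difference in accuracy points. The short Python sketch below shows the arithmetic. The function name and the 90% baseline accuracy are illustrative assumptions only; the abstract does not report the baseline's accuracy.

```python
def relative_error_reduction(baseline_accuracy: float, new_accuracy: float) -> float:
    """Return the fraction by which misclassifications drop.

    Both arguments are accuracies in [0, 1]; the error rate is 1 - accuracy.
    """
    baseline_error = 1.0 - baseline_accuracy
    new_error = 1.0 - new_accuracy
    if baseline_error <= 0.0:
        raise ValueError("baseline has no errors to reduce")
    return (baseline_error - new_error) / baseline_error


# ASSUMPTION: 0.90 is a hypothetical baseline accuracy, not a value from the abstract.
# 0.95 is the service-classification accuracy stated in the abstract.
reduction = relative_error_reduction(0.90, 0.95)
print(f"{reduction:.0%} fewer false classifications")
```

Under that assumed baseline, a gain of five accuracy points halves the error rate, which is why a seemingly small accuracy difference can correspond to a large relative drop in false classifications.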
A Look Behind the Curtain: Traffic Classification in an Increasingly Encrypted Web
SIGMETRICS '21: Abstract Proceedings of the 2021 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems