Abstract
Privacy on the Internet has become a priority, and considerable effort has been devoted to limiting the leakage of personal information. Domain names, visible both in the TLS Client Hello and in DNS traffic, are among the last pieces of information still exposed to an observer in the network. The Encrypted Client Hello (ECH) extension for TLS and the DNS over HTTPS (DoH) and DNS over QUIC (DoQ) protocols aim to further increase network confidentiality by encrypting the domain names of the servers that users visit.
In this article, we examine whether an attacker who passively observes users' traffic can still recover the domain names of the websites they visit even when those names are encrypted. Relying on large-scale network traces, we show that simple features and off-the-shelf machine learning models suffice to achieve surprisingly high precision and recall in recovering encrypted domain names. We consider three attack scenarios: recovering the per-flow name, rebuilding the set of websites visited by a user, and identifying which users visit a given target website. We then evaluate the efficacy of padding-based mitigation and find that all three attacks remain effective despite the resources spent on padding. We conclude that current proposals for domain name encryption may produce a false sense of privacy, and that more robust techniques are needed to protect end users.
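To make the padding trade-off concrete, the following is a minimal sketch of block padding in general, not the padding scheme evaluated in the article. It assumes the block sizes recommended by RFC 8467 for DNS messages (128 octets for queries, 468 octets for responses); the function names, domain names, and message lengths are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: block padding of message lengths.
# Block sizes follow the RFC 8467 recommendation; all sample data is made up.

QUERY_BLOCK = 128      # octets, recommended block size for DNS queries
RESPONSE_BLOCK = 468   # octets, recommended block size for DNS responses


def padded_length(length: int, block: int) -> int:
    """Round a message length up to the next multiple of the block size."""
    if length <= 0:
        raise ValueError("length must be positive")
    return -(-length // block) * block  # ceiling division, then scale


def padding_overhead(lengths, block):
    """Return the fraction of transmitted octets that are padding."""
    padded = [padded_length(n, block) for n in lengths]
    return 1 - sum(lengths) / sum(padded)


# Hypothetical response sizes (octets) for three hypothetical names.
observed = {"a.example": 90, "b.example": 410, "c.example": 700}

# What a passive observer would see after padding.
buckets = {name: padded_length(n, RESPONSE_BLOCK) for name, n in observed.items()}
overhead = padding_overhead(observed.values(), RESPONSE_BLOCK)
```

In this toy input, the first two names fall into the same size bucket while the third remains distinguishable, and a noticeable share of the transmitted octets is padding. This only illustrates why padding costs bandwidth without necessarily hiding every name; it says nothing about the features or classifiers used in the article.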
Attacking DoH and ECH: Does Server Name Encryption Protect Users’ Privacy?