research-article
Open Access

IP Geolocation through Reverse DNS

Published: 15 October 2021

Abstract

IP geolocation databases are widely used in online services to map end-user IP addresses to their geographical location. However, they use proprietary geolocation methods, and in some cases they have poor accuracy. We propose a systematic approach to use reverse DNS hostnames for geolocating IP addresses, with a focus on end-user IP addresses as opposed to router IPs. Our method is designed to be combined with other geolocation data sources. We cast the task as a machine learning problem where, for a given hostname, we first generate a list of potential location candidates, and then we classify each hostname and candidate pair using a binary classifier to determine which location candidates are plausible. Finally, we rank the remaining candidates by confidence (class probability) and break ties by population count. We evaluate our approach against three state-of-the-art academic baselines and two state-of-the-art commercial IP geolocation databases. We show that our work significantly outperforms the academic baselines and is complementary to and competitive with commercial databases. To aid reproducibility, we open-source our entire approach and make it available to the academic community.
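
The three stages named in the abstract (candidate generation, binary classification of each hostname and candidate pair, and ranking by class probability with ties broken by population) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `GAZETTEER` contents, the population figures, the `plausibility` heuristic standing in for the trained classifier, and the 0.5 threshold are all assumptions made up for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    name: str
    region: str
    population: int  # made-up illustrative figure, not real data


# Hypothetical toy gazetteer mapping a hostname token to possible locations.
GAZETTEER = {
    "springfield": [
        Location("Springfield", "Region A", 100_000),
        Location("Springfield", "Region B", 50_000),
    ],
    "salem": [Location("Salem", "Region A", 30_000)],
}


def tokenize(hostname):
    """Split a hostname into lowercase alphabetic tokens."""
    return [t for t in re.split(r"[^a-z]+", hostname.lower()) if t]


def generate_candidates(hostname):
    """Step 1: list every (token, location) pair the gazetteer suggests."""
    return [
        (tok, loc)
        for tok in tokenize(hostname)
        for loc in GAZETTEER.get(tok, [])
    ]


def plausibility(hostname, token, location):
    """Stand-in for the binary classifier: returns P(candidate is correct).

    Assumption: a fixed heuristic replaces a trained model here.
    """
    return 0.9 if token == location.name.lower() else 0.4


def geolocate(hostname, threshold=0.5):
    """Steps 2 and 3: keep plausible candidates, then rank them."""
    scored = [
        (loc, plausibility(hostname, tok, loc))
        for tok, loc in generate_candidates(hostname)
    ]
    plausible = [(loc, p) for loc, p in scored if p >= threshold]
    # Highest confidence first; ties broken by larger population.
    return sorted(plausible, key=lambda lp: (-lp[1], -lp[0].population))


if __name__ == "__main__":
    for loc, p in geolocate("host-12.springfield.example.net"):
        print(loc.name, loc.region, p)
```

In this toy run both Springfield entries receive the same score from the stand-in heuristic, so the population count alone decides their order, which is the tie-breaking rule the abstract describes.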

