
Host-Based P2P Flow Identification and Use in Real-Time

Published: 01 May 2011

Abstract

Data identification and classification is a key task for any Internet Service Provider (ISP) or network administrator. As port fluctuation and encryption become more common in P2P applications wishing to avoid identification, new strategies must be developed to detect and classify their flows. This article introduces a method of separating P2P and standard web traffic that can be applied as part of an offline data analysis process, based on the activity of the hosts on the network. Heuristics are analyzed and a classification system proposed that focuses on classifying those “long” flows that transfer most of the bytes across a network. The accuracy of the system is then tested using real network traffic from a core Internet router showing misclassification rates as low as 0.54% of flows in some cases. We expand on this proposed strategy to investigate its relevance to real-time, early classification problems. New proposals are made and the results of real-time experiments are compared to those obtained in the offline analysis. It is shown that classification accuracies in the real-time strategy are similar to those achieved in offline analysis with a large portion of the total web and P2P flows correctly identified.
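As a rough illustration of two quantities the abstract mentions, the "long" flows that carry most of the bytes and the per-flow misclassification rate, the following minimal sketch may help. It is not the article's method: the `Flow` record, the `byte_share` cutoff used to decide which flows count as "long", and the label names are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Flow:
    """Hypothetical flow record; the field names are assumptions for this sketch."""
    src: str
    dst: str
    n_bytes: int
    true_label: str       # ground truth, e.g. "p2p" or "web"
    predicted_label: str  # label assigned by some classifier


def long_flows(flows: List[Flow], byte_share: float = 0.9) -> List[Flow]:
    """Return the largest flows that together carry at least `byte_share`
    of all bytes. The cutoff is an assumption, not a value from the article."""
    total = sum(f.n_bytes for f in flows)
    selected: List[Flow] = []
    running = 0
    for f in sorted(flows, key=lambda f: f.n_bytes, reverse=True):
        if running >= byte_share * total:
            break
        selected.append(f)
        running += f.n_bytes
    return selected


def misclassification_rate(flows: List[Flow]) -> float:
    """Fraction of flows whose predicted label differs from the true label."""
    if not flows:
        return 0.0
    wrong = sum(1 for f in flows if f.true_label != f.predicted_label)
    return wrong / len(flows)


if __name__ == "__main__":
    sample = [
        Flow("10.0.0.1", "10.0.0.2", 9000, "p2p", "p2p"),
        Flow("10.0.0.1", "10.0.0.3", 500, "web", "p2p"),
        Flow("10.0.0.4", "10.0.0.5", 300, "web", "web"),
        Flow("10.0.0.4", "10.0.0.6", 200, "web", "web"),
    ]
    heavy = long_flows(sample, byte_share=0.8)
    print(len(heavy), misclassification_rate(sample), misclassification_rate(heavy))
```

Measuring the error rate separately over all flows and over the long flows only shows why the choice of population matters: a classifier can mislabel many short flows and still be correct for nearly all of the bytes.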

    • Published in

      ACM Transactions on the Web  Volume 5, Issue 2
      May 2011
      190 pages
      ISSN:1559-1131
      EISSN:1559-114X
      DOI:10.1145/1961659
      Copyright © 2011 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 1 May 2011
      • Revised: 1 July 2010
      • Accepted: 1 July 2010
      • Received: 1 May 2009

      Qualifiers

      • research-article
      • Research
      • Refereed
