
Persistent Spread Measurement for Big Network Data Based on Register Intersection

Published: 13 June 2017

Abstract

Persistent spread measurement counts the distinct elements that persist in each network flow across predefined time periods. It has many practical applications, including detecting long-term stealthy network activities hidden in the background of normal user activity, such as stealthy DDoS attacks, stealthy network scans, or faked network trends, none of which can be detected by traditional flow cardinality measurement. With big network data, one challenge is to measure the persistent spreads of a massive number of flows without incurring too much memory overhead, because such measurement may be performed at line speed by network processors with fast but small on-chip memory. We propose a highly compact Virtual Intersection HyperLogLog (VI-HLL) architecture for this purpose. It achieves far better memory efficiency than the best prior work, V-Bitmap, while drastically extending the measurement range. Theoretical analysis and extensive experiments demonstrate that VI-HLL provides good measurement accuracy even in a very tight memory space of less than 1 bit per flow.
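
To make the definition of persistent spread concrete, the following minimal sketch computes it exactly by intersecting the per-period element sets of each flow. It is an illustrative baseline only, not the VI-HLL architecture: it stores every element, which is precisely the memory cost that a compact estimator is meant to avoid. The function name `persistent_spread`, the `(flow, element)` record format, and the sample data are assumptions made for this illustration.

```python
from collections import defaultdict


def persistent_spread(periods):
    """Exact persistent spread per flow (hypothetical helper, not VI-HLL).

    periods: a list with one entry per measurement period; each entry is an
             iterable of (flow, element) pairs observed in that period.
    Returns a dict mapping each flow seen in the first period to the number
    of distinct elements that appear for that flow in every period.
    Flows absent from the first period have persistent spread 0 and are omitted.
    """
    persistent = None
    for records in periods:
        # Collect the distinct elements of each flow in this period.
        seen = defaultdict(set)
        for flow, element in records:
            seen[flow].add(element)
        if persistent is None:
            persistent = dict(seen)
        else:
            # Keep only the elements that also appear in this period.
            persistent = {
                flow: elems & seen.get(flow, set())
                for flow, elems in persistent.items()
            }
    return {flow: len(elems) for flow, elems in (persistent or {}).items()}


# Sample data (assumed): flows are destinations, elements are sources.
periods = [
    [("dst1", "a"), ("dst1", "b"), ("dst1", "c"), ("dst2", "x")],
    [("dst1", "a"), ("dst1", "b"), ("dst1", "d"), ("dst2", "y")],
    [("dst1", "a"), ("dst1", "b"), ("dst2", "x")],
]
result = persistent_spread(periods)
print(result)
```

In this sample, `dst1` is contacted by `a` and `b` in all three periods, so its persistent spread is 2, while `dst2` has no element common to all periods, so its persistent spread is 0 even though its per-period cardinality is always 1.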

