Abstract
Modern network telemetry systems collect and analyze massive amounts of raw data in a space-efficient manner. Such systems require advanced capabilities, such as drill-down queries that allow iterative refinement of the search space. We present a first integrated solution that (i) supports multiple measurement tasks within the same data structure, (ii) lets queries specify the time frame of interest, and (iii) is sketch-based and thus space efficient. Namely, our approach allows the user to define both the measurement task (e.g., heavy hitters, entropy estimation, or count distinct) and the time frame of relevance (e.g., 5PM-6PM) at query time. Our approach provides accuracy guarantees and is the only space-efficient solution that offers such capabilities. Finally, we demonstrate how our system can be used to accurately pinpoint the start of a realistic DDoS attack.
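To make the notion of an interval query concrete, the following is a minimal sketch of the query semantics only, not the data structure proposed in the paper. It assumes a simple baseline: a Count-Min sketch whose cumulative state is snapshotted at the end of every epoch, so that the frequency of a key in an interval can be estimated from the difference of two snapshots. The class names `CountMin` and `IntervalSketch`, the epoch granularity, and the parameters `depth` and `width` are all hypothetical choices made for illustration.

```python
import hashlib


class CountMin:
    """A small Count-Min sketch with deterministic hashing (illustrative only)."""

    def __init__(self, depth=4, width=256):
        self.depth = depth
        self.width = width
        self.rows = [[0] * width for _ in range(depth)]

    def index(self, row, key):
        # Derive one hash per row by salting the key with the row number.
        digest = hashlib.blake2b(f"{row}:{key}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, key, count=1):
        for r in range(self.depth):
            self.rows[r][self.index(r, key)] += count

    def snapshot(self):
        return [row[:] for row in self.rows]


class IntervalSketch:
    """Hypothetical wrapper: keeps one cumulative snapshot per closed epoch."""

    def __init__(self, depth=4, width=256):
        self.sketch = CountMin(depth, width)
        self.snapshots = [self.sketch.snapshot()]  # snapshot 0 = empty stream

    def add(self, key):
        self.sketch.add(key)

    def close_epoch(self):
        self.snapshots.append(self.sketch.snapshot())

    def estimate(self, key, start_epoch, end_epoch):
        """Estimate the count of key in epochs [start_epoch, end_epoch).

        Count-Min is linear, so the difference of two cumulative snapshots
        is itself a Count-Min sketch of the interval; the minimum over rows
        never underestimates the true count.
        """
        before = self.snapshots[start_epoch]
        after = self.snapshots[end_epoch]
        return min(
            after[r][i] - before[r][i]
            for r in range(self.sketch.depth)
            for i in [self.sketch.index(r, key)]
        )


# Three epochs of toy traffic, keyed by a hypothetical source identifier.
mon = IntervalSketch()
for epoch_traffic in (["a", "b"], ["a", "a", "c"], ["b"]):
    for src in epoch_traffic:
        mon.add(src)
    mon.close_epoch()

# The time frame is chosen at query time: here, only the middle epoch.
middle = mon.estimate("a", 1, 2)
whole = mon.estimate("a", 0, 3)
```

This baseline stores a full snapshot per epoch, so its memory grows linearly with the number of epochs; it is meant only to show what "specifying the time frame at query time" looks like, whereas the abstract claims a space-efficient solution that also covers other tasks such as entropy and count distinct.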
I Know What You Did Last Summer: Network Monitoring using Interval Queries