ABSTRACT
Capturing characteristics of large data streams has received considerable attention. Constraints on space and time restrict data stream processing to a single pass (or a small number of passes). Processing data streams over sliding windows makes the problem more difficult and challenging. In this paper, we address the problem of maintaining an ε-approximate variance of data streams over sliding windows. To our knowledge, the best existing algorithm requires O((1/ε²) log N) space, though the lower bound for this problem is Ω((1/ε) log N). We propose the first ε-approximation algorithm for this problem that is optimal in both space and worst-case time. Our algorithm requires O((1/ε) log N) space. Furthermore, its running time is O(1) in the worst case.
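To make the problem statement concrete, the following is a minimal sketch of an exact baseline, not the algorithm proposed in the paper. It stores all of the last N values, so it needs O(N) space, which is the cost that approximation algorithms aim to avoid. Two things here are assumptions rather than details taken from the abstract: variance is taken as the sum of squared deviations from the window mean (not divided by the window size), and "ε-approximate" is read as a relative error of at most ε. The names `ExactWindowVariance` and `is_eps_approximation` are hypothetical.

```python
from collections import deque


class ExactWindowVariance:
    """Exact baseline: keeps all of the last N values, so it uses O(N) space."""

    def __init__(self, window_size: int) -> None:
        if window_size <= 0:
            raise ValueError("window_size must be positive")
        # A deque with maxlen discards the oldest value once it is full.
        self.window = deque(maxlen=window_size)

    def insert(self, value: float) -> None:
        self.window.append(value)

    def variance(self) -> float:
        # Assumed definition: sum of squared deviations from the window mean.
        n = len(self.window)
        if n == 0:
            return 0.0
        mean = sum(self.window) / n
        return sum((x - mean) ** 2 for x in self.window)


def is_eps_approximation(estimate: float, exact: float, eps: float) -> bool:
    """Assumed relative-error test: |estimate - exact| <= eps * exact."""
    return abs(estimate - exact) <= eps * exact


if __name__ == "__main__":
    w = ExactWindowVariance(window_size=3)
    for x in [1, 2, 3, 4]:
        w.insert(x)
    # The window now holds [2, 3, 4], with mean 3.
    print(w.variance())  # prints 2.0
```

An ε-approximation algorithm would answer the same `variance()` query within the stated error while keeping only a compact summary of the window instead of every value.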
Index Terms
Variance estimation over sliding windows