DOI: 10.1145/1265530.1265562
Article

Variance estimation over sliding windows

Published: 11 June 2007

ABSTRACT

Capturing characteristics of large data streams has received considerable attention. Constraints on space and time restrict data stream processing to a single pass (or a small number of passes). Processing data streams over sliding windows makes the problem more difficult and challenging. In this paper, we address the problem of maintaining an ε-approximate variance of data streams over sliding windows. To our knowledge, the best existing algorithm requires O(1/ε² log N) space, though the lower bound for this problem is Ω(1/ε log N). We propose the first ε-approximation algorithm for this problem that is optimal in both space and worst-case time. Our algorithm requires O(1/ε log N) space, and its running time is O(1) in the worst case.
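
To make the problem setting concrete, the following is a minimal sketch of an exact baseline, not the algorithm proposed in the paper. It keeps all of the most recent N values, so it uses O(N) space, which is exactly the cost that approximate sliding-window algorithms aim to avoid. The class name `ExactWindowVariance`, its methods, and the choice of population variance (dividing by the number of values in the window) are assumptions made for illustration.

```python
from collections import deque


class ExactWindowVariance:
    """Exact variance of the last `window` values (illustrative baseline, O(N) space)."""

    def __init__(self, window: int) -> None:
        if window <= 0:
            raise ValueError("window must be positive")
        self.window = window
        self.values = deque()  # stores every value currently in the window
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the current mean

    def add(self, x: float) -> None:
        """Insert a new value and evict the oldest one if the window overflows."""
        self.values.append(x)
        n = len(self.values)
        delta = x - self.mean
        self.mean += delta / n
        self.m2 += delta * (x - self.mean)
        if n > self.window:
            self._evict()

    def _evict(self) -> None:
        """Remove the oldest value by reversing the incremental update."""
        old = self.values.popleft()
        n = len(self.values)  # count after removal, at least 1
        delta = old - self.mean
        self.mean -= delta / n
        self.m2 -= delta * (old - self.mean)
        self.m2 = max(self.m2, 0.0)  # guard against floating-point drift

    def variance(self) -> float:
        """Population variance of the values currently in the window."""
        n = len(self.values)
        return self.m2 / n if n else 0.0


if __name__ == "__main__":
    w = ExactWindowVariance(window=3)
    for x in [1, 2, 3, 4, 5]:
        w.add(x)
    print(w.variance())  # variance of the last three values: 3, 4, 5
```

Each update here takes O(1) time, but the memory grows linearly with the window size N. The space bounds quoted in the abstract concern summaries that are far smaller than the window itself, at the price of an approximate answer.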


Published in

PODS '07: Proceedings of the twenty-sixth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
June 2007
328 pages
ISBN: 9781595936851
DOI: 10.1145/1265530

Copyright © 2007 ACM

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 476 of 1,835 submissions, 26%
