The P² algorithm for dynamic calculation of quantiles and histograms without storing observations

Published: 01 October 1985

    Abstract

    A heuristic algorithm is proposed for dynamic calculation of the median and other quantiles. The estimates are produced dynamically as the observations are generated. The observations are not stored; therefore, the algorithm has a very small and fixed storage requirement regardless of the number of observations. This makes it ideal for implementing in a quantile chip that can be used in industrial controllers and recorders. The algorithm is further extended to histogram plotting. The accuracy of the algorithm is analyzed.
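
The abstract's core idea, estimating a quantile from a stream while keeping only a fixed handful of numbers, can be illustrated with a minimal sketch. This page does not give the algorithm's steps, so the code below follows the five-marker form in which P² is commonly described (marker heights, marker positions, and a parabolic adjustment with a linear fallback). The class name `P2Quantile`, its methods `add` and `value`, and the fallback used for fewer than five observations are hypothetical choices made for illustration, not the authors' code.

```python
class P2Quantile:
    """Streaming estimate of the p-quantile with five markers (illustrative sketch)."""

    def __init__(self, p):
        if not 0.0 < p < 1.0:
            raise ValueError("p must lie strictly between 0 and 1")
        self.p = p
        self.q = []                      # marker heights (filled by the first 5 observations)
        self.n = [1, 2, 3, 4, 5]         # actual marker positions, 1-based
        self.desired = [1.0, 1 + 2 * p, 1 + 4 * p, 3 + 2 * p, 5.0]  # desired positions
        self.step = [0.0, p / 2, p, (1 + p) / 2, 1.0]  # growth of desired positions per observation

    def add(self, x):
        q, n = self.q, self.n
        if len(q) < 5:                   # initialization: keep the first five values sorted
            q.append(x)
            q.sort()
            return
        # Find the cell k with q[k] <= x < q[k+1], extending the extremes if needed.
        if x < q[0]:
            q[0] = x
            k = 0
        elif x >= q[4]:
            q[4] = x
            k = 3
        else:
            k = 0
            while x >= q[k + 1]:
                k += 1
        for i in range(k + 1, 5):        # markers above the cell shift right by one
            n[i] += 1
        for i in range(5):
            self.desired[i] += self.step[i]
        # Move each interior marker by one position if it is off by at least one.
        for i in (1, 2, 3):
            d = self.desired[i] - n[i]
            if (d >= 1 and n[i + 1] - n[i] > 1) or (d <= -1 and n[i - 1] - n[i] < -1):
                s = 1 if d > 0 else -1
                cand = self._parabolic(i, s)
                if not q[i - 1] < cand < q[i + 1]:
                    cand = self._linear(i, s)   # fallback keeps heights ordered
                q[i] = cand
                n[i] += s

    def _parabolic(self, i, s):
        q, n = self.q, self.n
        return q[i] + s / (n[i + 1] - n[i - 1]) * (
            (n[i] - n[i - 1] + s) * (q[i + 1] - q[i]) / (n[i + 1] - n[i])
            + (n[i + 1] - n[i] - s) * (q[i] - q[i - 1]) / (n[i] - n[i - 1])
        )

    def _linear(self, i, s):
        q, n = self.q, self.n
        return q[i] + s * (q[i + s] - q[i]) / (n[i + s] - n[i])

    def value(self):
        if not self.q:
            raise ValueError("no observations yet")
        if len(self.q) < 5:              # assumption: crude fallback before initialization completes
            return self.q[int(self.p * (len(self.q) - 1))]
        return self.q[2]                 # the middle marker tracks the p-quantile
```

In this sketch, calling `add` once per observation and then `value()` returns the current estimate; the outer markers `q[0]` and `q[4]` hold the running minimum and maximum exactly, while the three interior markers are approximations.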

    Reviews

    Leonard Francis Zettel

    "A heuristic algorithm is proposed for dynamic calculation of the median and other quantiles. . . . [T]he algorithm has a very small and fixed storage requirement regardless of the number of observations. . . . The algorithm is further extended to histogram plotting. . . ." (From the Authors' Abstract)

    The algorithm looks interesting and is presented in sufficient detail so that one could implement it. It requires the storage of ten numbers for the calculation of a quantile; thus, it would probably save storage on large samples of observations, since conventional algorithms require the equivalent of access to the data sorted from lowest to highest.

    The analysis that accompanies the algorithm is best described as a lick, a promise, and a bit of arm waving. There is no formal analysis of execution behavior. Nothing is proven about the statistical behavior of the estimators that the algorithm produces. Results of Monte Carlo runs of 50 samples are given for various cases. Similar work presented in the Journal of the American Statistical Association would typically have sample sizes of 10,000.

    Published In

    Communications of the ACM, Volume 28, Issue 10
    Oct. 1985
    64 pages
    ISSN: 0001-0782
    EISSN: 1557-7317
    DOI: 10.1145/4372
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Cited By

    • (2024) Estimating cutoff values for diagnostic tests to achieve target specificity using extreme value theory. BMC Medical Research Methodology 24:1. DOI: 10.1186/s12874-023-02139-5. Online publication date: 8-Feb-2024
    • (2024) Inferring Video Streaming Quality of Experience at Scale using Incremental Statistics from CDN Logs. Proceedings of the 3rd Mile-High Video Conference (34-40). DOI: 10.1145/3638036.3640803. Online publication date: 11-Feb-2024
    • (2024) Methods of Descriptive Statistics in Telemetry Tasks. 2024 Systems of Signals Generating and Processing in the Field of on Board Communications (1-5). DOI: 10.1109/IEEECONF60226.2024.10496798. Online publication date: 12-Mar-2024
    • (2024) hermiter: R package for sequential nonparametric estimation. Computational Statistics 39:3 (1127-1163). DOI: 10.1007/s00180-023-01382-0. Online publication date: 1-May-2024
    • (2023) EasyQuantile: Efficient Quantile Tracking in the Data Plane. Proceedings of the 7th Asia-Pacific Workshop on Networking (123-129). DOI: 10.1145/3600061.3600084. Online publication date: 29-Jun-2023
    • (2023) Rebalancing gradient to improve self-supervised co-training of depth, odometry and optical flow predictions. 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (1267-1276). DOI: 10.1109/WACV56688.2023.00132. Online publication date: Jan-2023
    • (2023) Online Feature Screening for Data Streams With Concept Drift. IEEE Transactions on Knowledge and Data Engineering 35:11 (11693-11707). DOI: 10.1109/TKDE.2022.3232752. Online publication date: 1-Nov-2023
    • (2023) A Method for Threshold Setting and False Alarm Probability Evaluation for Radar Detectors. Signal Processing 207:C. DOI: 10.1016/j.sigpro.2023.108930. Online publication date: 1-Jun-2023
    • (2022) Image-based Visualization of Large Volumetric Data Using Moments. IEEE Transactions on Visualization and Computer Graphics 28:6 (2314-2325). DOI: 10.1109/TVCG.2022.3165346. Online publication date: 1-Jun-2022
    • (2022) r-adaptive algorithms for supersonic flows with high-order Flux Reconstruction methods. Computer Physics Communications 276 (108373). DOI: 10.1016/j.cpc.2022.108373. Online publication date: Jul-2022
