Research article · DOI: 10.1145/3447548.3467278

Probabilistic Gradient Boosting Machines for Large-Scale Probabilistic Regression

Published: 14 August 2021

Abstract

Gradient Boosting Machines (GBM) are hugely popular for solving tabular data problems. However, practitioners are interested not only in point predictions, but also in probabilistic predictions that quantify the uncertainty of those predictions. Creating such probabilistic predictions is difficult with existing GBM-based solutions: they either require training multiple models or become too computationally expensive to be useful in large-scale settings. We propose Probabilistic Gradient Boosting Machines (PGBM), a method to create probabilistic predictions with a single ensemble of decision trees in a computationally efficient manner. PGBM approximates the leaf weights in a decision tree as random variables, and approximates the mean and variance of each sample in a dataset via stochastic tree ensemble update equations. These learned moments allow us to sample from a specified distribution after training. We empirically demonstrate the advantages of PGBM compared to existing state-of-the-art methods: (i) PGBM enables probabilistic estimates without compromising on point performance in a single model, (ii) PGBM learns probabilistic estimates via a single model only (and without requiring multi-parameter boosting), and thereby offers a speedup of up to several orders of magnitude over existing state-of-the-art methods on large datasets, and (iii) PGBM achieves accurate probabilistic estimates in tasks with complex differentiable loss functions, such as hierarchical time series problems, where we observed up to 10% improvement in point forecasting performance and up to 300% improvement in probabilistic forecasting performance.
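To make the "stochastic tree ensemble update equations" concrete, here is a sketch in our own notation (the symbols below are illustrative, not necessarily the paper's). Write $\hat{y}_i^{(k)}$ for the ensemble prediction of sample $i$ after $k$ trees, $\eta$ for the learning rate, and treat the leaf weight that tree $k$ assigns to sample $i$ as a random variable with estimated mean $\mu_{ik}$ and variance $\sigma^2_{ik}$. Assuming independent tree contributions (the paper's full derivation also accounts for dependence between trees), the moments propagate as

$$
\mathbb{E}\big[\hat{y}_i^{(k)}\big] = \mathbb{E}\big[\hat{y}_i^{(k-1)}\big] + \eta\,\mu_{ik},
\qquad
\mathbb{V}\big[\hat{y}_i^{(k)}\big] = \mathbb{V}\big[\hat{y}_i^{(k-1)}\big] + \eta^2\,\sigma^2_{ik},
$$

so a single pass over the ensemble yields both a point prediction and a variance, and any two-parameter distribution (e.g., a Normal) can be fit to these moments and sampled after training.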

Supplementary Material

MP4 File (KDD21-rst1763.mp4)
Creating probabilistic predictions is difficult with existing Gradient Boosting Machine (GBM) solutions: they either require training multiple models or become too computationally expensive to be useful in large-scale settings. We propose Probabilistic Gradient Boosting Machines (PGBM), a method to create probabilistic predictions with a single ensemble of decision trees in a computationally efficient manner. We empirically demonstrate the advantages of PGBM compared to existing state-of-the-art methods: (i) PGBM enables probabilistic estimates without compromising on point performance, (ii) PGBM learns probabilistic estimates via a single model only, thereby offering a speedup of up to several orders of magnitude over existing state-of-the-art methods on large datasets, and (iii) PGBM achieves accurate probabilistic estimates in tasks with complex differentiable loss functions, such as hierarchical time series problems, where we observed up to 300% improvement in probabilistic forecasting performance.
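As a companion to the update equations above, the following is a minimal NumPy sketch of the single-model idea, not the authors' implementation: every name (fit_stump, predict_stump), the depth-1 trees, the choice of leaf-variance estimator, and the toy data are illustrative assumptions. PGBM itself derives the leaf-weight moments from per-leaf gradient and Hessian statistics and supports arbitrary differentiable losses.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_stump(X, g):
    """Fit a depth-1 tree (stump) to pseudo-residuals g, recording per leaf
    a mean AND a variance so the leaf weight can be treated as a random
    variable. The variance estimator here (standard error of the leaf mean)
    is a crude stand-in for the paper's gradient/Hessian-based moments."""
    best = None
    for j in range(X.shape[1]):
        for t in np.quantile(X[:, j], [0.25, 0.5, 0.75]):
            left = X[:, j] <= t
            if left.sum() < 2 or (~left).sum() < 2:
                continue
            # Split score: within-leaf sum of squared errors.
            sse = g[left].var() * left.sum() + g[~left].var() * (~left).sum()
            if best is None or sse < best[0]:
                best = (sse, j, t)
    _, j, t = best
    left = X[:, j] <= t
    leaves = {True: (g[left].mean(), g[left].var() / left.sum()),
              False: (g[~left].mean(), g[~left].var() / (~left).sum())}
    return j, t, leaves

def predict_stump(stump, X):
    j, t, leaves = stump
    go_left = X[:, j] <= t
    mu = np.where(go_left, leaves[True][0], leaves[False][0])
    var = np.where(go_left, leaves[True][1], leaves[False][1])
    return mu, var

# Toy data (illustrative): y = x0 + Gaussian noise.
X = rng.uniform(-2, 2, size=(1000, 2))
y = X[:, 0] + rng.normal(0.0, 0.3, size=1000)

eta, n_trees = 0.3, 100
mu = np.zeros_like(y)    # running ensemble mean per sample
var = np.zeros_like(y)   # running ensemble variance per sample
for _ in range(n_trees):
    stump = fit_stump(X, y - mu)   # residuals = negative gradient (squared loss)
    m, v = predict_stump(stump, X)
    mu += eta * m                  # means add linearly
    var += eta**2 * v              # variances add with eta^2 (independence assumed)

# After training: plug the learned moments into a chosen distribution
# (a Normal here) and sample to obtain a probabilistic prediction.
draws = rng.normal(mu[:5, None], np.sqrt(var[:5, None]), size=(5, 1000))
print(np.quantile(draws, [0.1, 0.5, 0.9], axis=1).T)
```

Running this prints 10/50/90% quantiles for a few samples. The intervals are illustrative rather than calibrated: the stand-in leaf variance captures only uncertainty in the leaf weights, whereas PGBM's moments, derived from gradient and Hessian statistics (with, in the paper, a treatment of dependence between trees), are what produce useful predictive distributions in practice. The design point from the abstract nevertheless survives the simplification: only a mean and a variance per sample are stored, so the output distribution can be chosen, or swapped for a heavier-tailed one, after training without refitting.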



Published In

KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining
August 2021, 4259 pages
ISBN: 9781450383325
DOI: 10.1145/3447548
Publisher: Association for Computing Machinery, New York, NY, United States
Overall acceptance rate: 1,133 of 8,635 submissions, 13%
