Abstract
We consider the problem of forecasting fine-grained company financials, such as daily revenue, from two input types: noisy proxy signals à la alternative data (e.g. credit card transactions) and sparse ground-truth observations (e.g. quarterly earnings reports). We utilize a classical linear systems model to capture the evolution of both the hidden or latent state (e.g. daily revenue) and the proxy signal (e.g. credit card transactions). The linear system model is particularly well suited here because the ground-truth data is extremely sparse (four quarterly reports per year). In classical system identification, where the central theme is to learn the parameters of such linear systems, unbiased and consistent estimation of parameters is not feasible: the likelihood is non-convex and, worse, the global optimum for maximum likelihood estimation is often non-unique. As the main contribution of this work, we provide a simple, consistent estimator of all parameters for the linear system model of interest; in addition, the estimation is unbiased for some of the parameters. In effect, the additional sparse observations of the aggregate hidden state (e.g. quarterly reports) enable system identification in our setup that is not feasible in general. For estimating and forecasting the hidden state (actual earnings) using the noisy observations (daily credit card transactions), we utilize the learned linear model along with a natural adaptation of classical Kalman filtering (or Belief Propagation). This leads to optimal inference with respect to mean-squared error. Analytically, we argue that even though the underlying linear system may be "unstable," "uncontrollable," or "undetectable" in the classical setting, our setup and inference algorithm allow for estimation of the hidden state with bounded error. Further, the estimation error of the algorithm monotonically decreases as the frequency of the sparse observations increases. This seemingly intuitive insight contradicts the word on the Street.
Finally, we utilize our framework to estimate quarterly earnings of 34 public companies using credit card transaction data. Our data-driven method convincingly outperforms the Wall Street consensus (analyst) estimates even though our method uses only credit card data as input, while the Wall Street consensus is based on various data sources including experts' input.
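To make the inference step concrete, here is a minimal sketch of one way a Kalman filter can combine a daily proxy with sparse quarterly totals. It is not the paper's algorithm or estimator: the scalar model (x_{t+1} = a x_t + w_t, y_t = c x_t + v_t), the parameters `a`, `c`, `q`, `r` being known rather than learned, the trick of augmenting the state with a running quarterly sum, and all function names are assumptions made for illustration.

```python
import numpy as np

def _update(m, P, H, obs, noise_var):
    """Standard Kalman measurement update for a single scalar observation."""
    S = H @ P @ H.T + noise_var          # innovation variance, shape (1, 1)
    K = P @ H.T / S                      # Kalman gain, shape (2, 1)
    m = m + (K * (obs - H @ m)).ravel()
    P = (np.eye(2) - K @ H) @ P
    return m, P

def kalman_with_aggregates(y, z, quarter_len, a, c, q, r,
                           x0_mean=0.0, x0_var=1.0):
    """Filter daily proxy y, folding in quarterly totals z at each quarter end.

    Assumed state: [x_t, s_t], where x_t is the hidden daily value and
    s_t is the running sum of x over the current quarter.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    Q = q * np.ones((2, 2))              # the same shock w_t enters x_t and s_t
    H_y = np.array([[c, 0.0]])           # daily proxy observes x_t
    H_z = np.array([[0.0, 1.0]])         # quarterly report observes s_t
    m = np.array([x0_mean, x0_mean])     # on day 0, s_0 = x_0
    P = x0_var * np.ones((2, 2))
    est = np.zeros((n, 2))
    for t in range(n):
        if t > 0:
            # The running sum restarts on the first day of each quarter.
            keep = 0.0 if t % quarter_len == 0 else 1.0
            F = np.array([[a, 0.0], [a, keep]])
            m = F @ m
            P = F @ P @ F.T + Q
        m, P = _update(m, P, H_y, y[t], r)
        if (t + 1) % quarter_len == 0:
            # Quarter end: the total is reported (treated as nearly noise-free).
            m, P = _update(m, P, H_z, z[t // quarter_len], 1e-9)
        est[t] = m
    return est

# Synthetic data under the assumed model (values chosen arbitrarily).
rng = np.random.default_rng(0)
a, c, q, r, L = 0.95, 0.5, 1.0, 4.0, 90
n = 4 * L
x = np.zeros(n)
x[0] = rng.normal()
for t in range(1, n):
    x[t] = a * x[t - 1] + rng.normal(scale=np.sqrt(q))
y = c * x + rng.normal(scale=np.sqrt(r), size=n)
z = x.reshape(4, L).sum(axis=1)          # quarterly ground-truth totals

est = kalman_with_aggregates(y, z, L, a, c, q, r)
print("filtered daily MSE:", np.mean((est[:, 0] - x) ** 2))
```

Carrying the running sum in the state lets the quarterly report enter as an ordinary linear measurement, so no special-case update is needed. Column 0 of `est` holds the filtered daily values and column 1 the running quarterly sum, which matches the reported total at each quarter end.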
Index Terms
Forecasting with Alternative Data
SIGMETRICS '20: Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems





