research-article
Open Access

Forecasting with Alternative Data

Published: 17 December 2019

Abstract

We consider the problem of forecasting fine-grained company financials, such as daily revenue, from two input types: noisy proxy signals à la alternative data (e.g. credit card transactions) and sparse ground-truth observations (e.g. quarterly earnings reports). We utilize a classical linear systems model to capture the evolution of both the hidden or latent state (e.g. daily revenue) and the proxy signal (e.g. credit card transactions). The linear system model is particularly well suited here because the ground-truth data is extremely sparse (4 quarterly reports per year). In classical system identification, where the central theme is to learn parameters for such linear systems, unbiased and consistent estimation of parameters is not feasible: the likelihood is non-convex and, worse, the global optimum for maximum likelihood estimation is often non-unique. As the main contribution of this work, we provide a simple, consistent estimator of all parameters for the linear system model of interest; in addition, the estimator is unbiased for some of the parameters. In effect, the additional sparse observations of the aggregate hidden state (e.g. quarterly reports) enable system identification in our setup that is not feasible in general. For estimating and forecasting the hidden state (actual earnings) using the noisy observations (daily credit card transactions), we utilize the learned linear model along with a natural adaptation of classical Kalman filtering (or Belief Propagation). This leads to optimal inference with respect to mean-squared error. Analytically, we argue that even though the underlying linear system may be "unstable," "uncontrollable," or "undetectable" in the classical setting, our setup and inference algorithm allow for estimation of the hidden state with bounded error. Further, the estimation error of the algorithm monotonically decreases as the frequency of the sparse observations increases. This seemingly intuitive insight contradicts the word on the Street.
Finally, we utilize our framework to estimate quarterly earnings of 34 public companies using credit card transaction data. Our data-driven method convincingly outperforms the Wall Street consensus (analyst) estimates even though our method uses only credit card data as input, while the Wall Street consensus is based on various data sources including experts' input.
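
To make the setup concrete, the following is a minimal sketch of one way to combine daily proxy observations with sparse aggregate observations in a Kalman filter. It is not necessarily the algorithm used in the paper: the scalar model, the state augmentation with a running within-quarter sum, and every name and parameter (`filter_daily_revenue`, `a`, `c`, `q`, `r`, `quarter_len`) are assumptions introduced for illustration, and the model parameters are taken as already known rather than estimated.

```python
import numpy as np

def kalman_update(mean, cov, H, R, obs):
    """Standard Kalman measurement update for a single linear observation."""
    H = np.atleast_2d(H)
    S = H @ cov @ H.T + R                      # innovation covariance
    K = cov @ H.T @ np.linalg.inv(S)           # Kalman gain
    mean = mean + K @ (np.atleast_1d(obs) - H @ mean)
    cov = (np.eye(len(mean)) - K @ H) @ cov
    return mean, cov

def filter_daily_revenue(proxy, quarter_totals, quarter_len, a, c, q, r,
                         x_init_mean=0.0, x_init_var=1.0):
    """Filter a hypothetical scalar model (all symbols are assumptions):
         x_t = a * x_{t-1} + w_t,  w_t ~ N(0, q)   latent daily revenue
         y_t = c * x_t + v_t,      v_t ~ N(0, r)   noisy daily proxy
       The state is augmented with s_t, the running sum of x within the
       current quarter, so a quarterly report is a near-exact observation
       of s_t on the last day of the quarter."""
    mean = np.array([x_init_mean, 0.0])        # prior on [x_{-1}, s_{-1}]
    cov = np.diag([x_init_var, 0.0])
    Q = q * np.ones((2, 2))                    # w_t enters both x_t and s_t
    daily_est = np.empty(len(proxy))
    total_est = []
    for t in range(len(proxy)):
        # Reset the running sum on the first day of each quarter.
        keep = 0.0 if t % quarter_len == 0 else 1.0
        F = np.array([[a, 0.0], [a, keep]])
        mean, cov = F @ mean, F @ cov @ F.T + Q
        # Daily proxy observation of x_t.
        mean, cov = kalman_update(mean, cov, [c, 0.0], np.array([[r]]), proxy[t])
        # Sparse aggregate observation of s_t at quarter end, if reported.
        k = t // quarter_len
        if (t + 1) % quarter_len == 0 and k < len(quarter_totals):
            mean, cov = kalman_update(mean, cov, [0.0, 1.0],
                                      np.array([[1e-9]]), quarter_totals[k])
            total_est.append(mean[1])
        daily_est[t] = mean[0]
    return daily_est, np.array(total_est)
```

Calling `filter_daily_revenue(proxy, quarter_totals, 90, a, c, q, r)` returns the filtered daily estimates together with the running-sum estimate after each quarterly update. Augmenting the state keeps every update a standard linear-Gaussian one, which is why the ordinary Kalman recursion can be reused unchanged in this sketch.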

