Abstract
We consider the question of identifying which products are purchased, and at what prices, in a given transaction by observing only the total amount spent in the transaction and nothing more. The ability to solve such an inverse problem can yield refined information about consumer spending from anonymized credit card transaction data alone. Considered in isolation, a single transaction total does not reveal the products purchased or their prices. However, given a large number of transactions, there is hope. As the main contribution of this work, we provide a robust estimation algorithm that decomposes transaction totals into the underlying individual products purchased by utilizing a large corpus of transactions. Our method recovers a vector of product prices $p \in \mathbb{R}^{N}_{>0}$ of unknown dimension $N$ (the number of products), as well as a matrix $A \in \mathbb{Z}^{M \times N}_{\geq 0}$, from only $M$ observations (transaction totals) $y \in \mathbb{R}^{M}_{>0}$ such that $y = Ap + \eta$, with $\eta \in \mathbb{R}^{M}$ representing noise (taxes, discounts, etc.). We formally establish that our algorithm identifies $N$ and $A$ precisely and $p$ approximately, as long as each product is purchased individually at least once, i.e. $M \geq N$ and $A$ has rank $N$. The algorithm runs in polynomial time with respect to the problem parameters, so the method is both computationally efficient and statistically robust for solving such inverse problems. We apply the algorithm to a large corpus of anonymized consumer credit card transactions from the period 2016-2019, with data obtained from a commercial data vendor. The transactions are associated with spending at Apple, Chipotle, Netflix, and Spotify.
From transaction data alone, our algorithm identifies (i) key price points (without access to the listed prices), (ii) products purchased within a transaction, (iii) product launches, and (iv) evidence of a new 'secret' product from Netflix, rumored to be in limited release.
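To make the model $y = Ap + \eta$ concrete, the following is a minimal sketch of the much easier sub-problem in which the prices are already known: it enumerates every non-negative integer count vector (one row of $A$, read here as per-product purchase counts) whose implied total matches an observed total within a tolerance. This brute-force search is not the paper's algorithm, which must also recover $N$ and $p$ and runs in polynomial time. The function name `decompose`, the parameters `max_count` and `tol_cents`, and the price lists are all hypothetical and chosen only for illustration.

```python
from itertools import product


def decompose(total_cents, prices_cents, max_count=3, tol_cents=0):
    """Enumerate count vectors a with 0 <= a_i <= max_count such that
    |a . p - total| <= tol. Amounts are integer cents to avoid float error.

    Illustrative brute force only (exponential in the number of products);
    it assumes the prices are known, which the paper does not assume.
    """
    matches = []
    for counts in product(range(max_count + 1), repeat=len(prices_cents)):
        implied = sum(c * p for c, p in zip(counts, prices_cents))
        if abs(implied - total_cents) <= tol_cents:
            matches.append(counts)
    return matches


# Hypothetical price lists in cents (not taken from the paper's data).
unique = decompose(975, [750, 225, 195])   # only 750 + 225 fits
ambiguous = decompose(600, [200, 300])     # 3 x 200 and 2 x 300 both fit
print(unique)
print(ambiguous)
```

The second call shows why a single total can be ambiguous even when prices are known: two different baskets produce the same amount. Raising `tol_cents` plays the role of the noise term $\eta$ and generally admits more candidate baskets.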
I Know What You Bought At Chipotle for $9.81 by Solving A Linear Inverse Problem
SIGMETRICS '21: Abstract Proceedings of the 2021 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems