Real-Time Principal Component Analysis

Published: 12 June 2020

Abstract

We propose a variant of Principal Component Analysis (PCA) suited to real-time applications. In the real-time version of the PCA problem, we maintain a window over the most recent data and project every incoming row of data into a lower-dimensional subspace, which is the output of the model. The goal is to reduce the error incurred when reconstructing the input from this output while retaining the major components of previous distributions of the data. We use the reconstruction error as the termination criterion for updating the eigenspace as new data arrives. We then propose two variants of this algorithm that are progressively more time efficient. To verify whether the proposed model can capture the essence of the changing distribution of large datasets in real time, we implemented the algorithms and evaluated their performance on carefully designed simulations in which the distributions of the data sources change over time in a controllable manner. Furthermore, we demonstrate that the proposed algorithms can capture the changing distributions of real-life datasets by running simulations on datasets from a variety of real-time applications, e.g., localization, activity recognition, and customer expenditure. Results show that straightforward modifications that convert PCA to use a sliding window over the data do not work, because it is difficult to determine the optimal window size. Instead, we propose algorithmic enhancements that rely on spectral analysis to improve dimensionality reduction. Results show that our methods can successfully capture the changing distribution of data in a real-time scenario, thus enabling real-time PCA.
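
The loop described in the abstract (keep a window of recent rows, project each incoming row, and update the eigenspace when the reconstruction error grows) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm or either of its faster variants: the class name `WindowedPCA`, the fixed `window_size`, the squared-error `error_threshold`, and the full refit by eigendecomposition are all hypothetical choices made here. It is closer to the straightforward sliding-window baseline than to the spectral enhancements the paper proposes.

```python
from collections import deque

import numpy as np


class WindowedPCA:
    """Sliding-window PCA that refits only when reconstruction error is high.

    Hypothetical sketch: names, threshold rule, and refit strategy are
    assumptions for illustration, not taken from the paper.
    """

    def __init__(self, n_components, window_size, error_threshold):
        self.k = n_components
        self.window = deque(maxlen=window_size)  # most recent rows only
        self.threshold = error_threshold         # squared-error trigger (assumed)
        self.mean = None
        self.components = None                   # shape (k, d), rows are axes
        self.refits = 0

    def _refit(self):
        # Recompute the mean and the top-k eigenvectors from the window.
        X = np.asarray(self.window)
        self.mean = X.mean(axis=0)
        Xc = X - self.mean
        # eigh returns eigenvalues in ascending order, so reverse the columns.
        _, vecs = np.linalg.eigh(Xc.T @ Xc)
        self.components = vecs[:, ::-1][:, : self.k].T
        self.refits += 1

    def _project(self, x):
        return (x - self.mean) @ self.components.T

    def _error(self, x, z):
        # Squared distance between the row and its reconstruction.
        x_hat = z @ self.components + self.mean
        return float(np.sum((x - x_hat) ** 2))

    def update(self, row):
        """Add one row to the window and return its k-dimensional projection."""
        x = np.asarray(row, dtype=float)
        self.window.append(x)
        if self.components is None:
            self._refit()
        z = self._project(x)
        if self._error(x, z) > self.threshold:
            # The current eigenspace no longer explains the data: refit.
            self._refit()
            z = self._project(x)
        return z
```

Calling `update(row)` once per incoming row returns that row's low-dimensional projection. The window size and threshold must be tuned by hand in this sketch, which is the difficulty the abstract attributes to plain sliding-window approaches.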

Published in

ACM/IMS Transactions on Data Science, Volume 1, Issue 2
May 2020
169 pages
ISSN: 2691-1922
DOI: 10.1145/3403596

Copyright © 2020 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 12 June 2020
• Online AM: 7 May 2020
• Accepted: 1 November 2019
• Revised: 1 October 2019
• Received: 1 March 2019
