Abstract
Non-negative matrix factorization (NMF) is the problem of determining two non-negative low-rank factors W and H, for a given input matrix A, such that A ≈ WH. NMF is a useful tool for many applications in different domains, such as topic modeling in text mining, background separation in video analysis, and community detection in social networks. Despite its popularity in the data mining community, there is a lack of efficient distributed algorithms for solving the problem on big data sets.
We propose a high-performance distributed-memory parallel algorithm that computes the factorization by iteratively solving alternating non-negative least squares (NLS) subproblems for W and H. It keeps the data and factor matrices in memory (distributed across processors), uses MPI for interprocessor communication, and, in the dense case, provably minimizes communication costs (under mild assumptions). Unlike previous implementations, our algorithm is also flexible: (1) it performs well for both dense and sparse matrices, and (2) it allows the user to choose any one of several algorithms for computing the updates to the low-rank factors W and H within the alternating iterations. We demonstrate the scalability of our algorithm and compare it with baseline implementations, showing significant performance improvements.
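The alternating structure described above can be illustrated with a small serial sketch. It is not the distributed algorithm from the paper: there is no MPI and no data distribution, and the multiplicative update rule of Lee and Seung is swapped in as a simple stand-in for an exact NLS solver. All function names (`matmul`, `transpose`, `frob_error`, `update_H`, `nmf`) and the test matrix are hypothetical and chosen only for illustration.

```python
import random


def matmul(X, Y):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]


def frob_error(A, W, H):
    """Frobenius norm of the residual A - WH."""
    WH = matmul(W, H)
    return sum((a - b) ** 2 for ra, rb in zip(A, WH) for a, b in zip(ra, rb)) ** 0.5


def update_H(A, W, H, eps=1e-12):
    """One multiplicative update of H with W held fixed (assumed update rule)."""
    Wt = transpose(W)
    num = matmul(Wt, A)             # k x n
    den = matmul(matmul(Wt, W), H)  # k x n
    return [[h * n / (d + eps) for h, n, d in zip(rh, rn, rd)]
            for rh, rn, rd in zip(H, num, den)]


def nmf(A, k, iters=200, seed=0):
    """Alternate between updating H (W fixed) and W (H fixed)."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    # Strictly positive random starting factors.
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(iters):
        H = update_H(A, W, H)
        # The W update is the same subproblem on the transposed system A^T ≈ H^T W^T.
        W = transpose(update_H(transpose(A), transpose(H), transpose(W)))
    return W, H


# Small non-negative matrix of rank 2 (third column = 2 * first + second).
A = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 1.0],
     [1.0, 1.0, 3.0],
     [2.0, 0.0, 4.0]]
W, H = nmf(A, k=2)
print("residual norm:", frob_error(A, W, H))
```

Because the W update reuses `update_H` on the transposed problem, replacing that one function with a different non-negative solver changes the update rule without touching the outer loop, which mirrors the flexibility the abstract describes.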
A high-performance parallel algorithm for nonnegative matrix factorization
PPoPP '16: Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming