ABSTRACT
Nonnegative matrix factorization (NMF) was popularized as a tool for data mining by Lee and Seung in 1999. NMF attempts to approximate a matrix with nonnegative entries by a product of two low-rank matrices, also with nonnegative entries. We propose an algorithm called rank-one downdate (R1D) for computing an NMF that is partly motivated by the singular value decomposition. This algorithm computes the dominant singular values and vectors of adaptively determined sub-matrices of a matrix. On each iteration, R1D extracts a rank-one submatrix from the original matrix according to an objective function. We establish a theoretical result that maximizing this objective function corresponds to correctly classifying articles in a nearly separable corpus. We also provide computational experiments showing the success of this method in identifying features in realistic datasets. The method is also much faster than other NMF routines.
References
- Asgarian, N., & Greiner, R. (2006). Using rank-1 biclusters to classify microarray data. Department of Computing Science, University of Alberta, Edmonton, AB, Canada.Google Scholar
- Bergmann, S., Ihmels, J., & Barkai, N. (2003). Iterative signature algorithm for the analysis of largescale gene expression data. Physical Review E, 67, 031902.Google Scholar
Cross Ref
- Biggs, M., Ghodsi, A., & Vavasis, S. (2008). Nonnegative matrix factorization via rank-one downdate. Available online at http://www.arxiv.org/abs/0805.0120. Google Scholar
Digital Library
- Boutsidis, C., & Gallopoulos, E. (2007). SVD based initialization: A head start for nonnegative matrix factorization. In press. Google Scholar
Digital Library
- Cohen, J., & Rothblum, U. (1993). Nonnegative ranks, decompositions and factorizations of nonnegative matrices. Linear Algebra and its Applications, 190, 149--168.Google Scholar
- Deerwester, S., Dumais, S., Furnas, G., Landauer, T., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41, 391--407.Google Scholar
Cross Ref
- Gillis, N. (2006). Approximation et sous-approximation de matrices par factorisation positive: algorithmes, complexité et applications. Master's thesis, Université Catholique de Louvain, Louvain-la-Neuve, Belgium. In French.Google Scholar
- Golub, G. H., & Van Loan, C. F. (1996). Matrix computations, 3rd edition. Baltimore: Johns Hopkins University Press. Google Scholar
Digital Library
- Gregory, D. A., & Pullman, N. J. (1983). Semiring rank: Boolean rank and nonnegative matrix rank. J. Combin. Inform. System Sci, 3, 223--233.Google Scholar
- Hofmann, T. (1999). Probabilistic latent semantic analysis. UAI '99: Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, Stockholm, Sweden, July 30-August 1, 1999 (pp. 289--296). Morgan Kaufmann. Google Scholar
Digital Library
- Hoyer, P. (2000). nmfpack - matlab code for nmf. http://http://www.hiit.fi/node/70.Google Scholar
- Hoyer, P. (2004). Non-negative matrix factorization with sparseness constraints. Journal of Machine Learning Research, 5, 1457--1469. Google Scholar
Digital Library
- Kim, H., & Park, H. (2007). Sparse non-negative matrix factorizations via alternating non-negativity-constrained least squares for microarray data analysis. Bioinformatics (to appear). Google Scholar
Digital Library
- Lee, D., & Seung, H. (1999). Learning the parts of objects by non-negative matrix factorization. Nature, 401, 788--791.Google Scholar
- Paatero, P., & Tapper, U. (1994). Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values. Environmetrics, 5, 111--126.Google Scholar
- Papadimitriou, C., Raghavan, P., Tamaki, H., & Vempala, S. (2000). Latent semantic indexing: A probabilistic analysis. J. Comput. Syst. Sci., 61, 217--235. Google Scholar
Digital Library
- Peeters, R. (2003). The maximum edge biclique problem is NP-complete. Discrete Applied Mathematics, 131, 651--654. Google Scholar
Digital Library
- Stewart, G. W. (1993). On the early history of the singular value decomposition. SIAM Review, 35, 551--566. Google Scholar
Digital Library
- TDT Study (1997). Topic detection and tracking pilot study. http://projects.ldc.upenn.edu/TDT/.Google Scholar
- Vavasis, S. (2007). On the complexity of nonnegative matrix factorization. arxiv.org, 0708.4149.Google Scholar
Index Terms
Nonnegative matrix factorization via rank-one downdate




Comments