ABSTRACT
We study the problem of finding the dominant eigenvector of the sample covariance matrix under additional constraints on the vector: a cardinality constraint limits the number of non-zero elements, and a non-negativity constraint forces all elements to share the same sign. This problem is known as sparse and non-negative principal component analysis (PCA), and it has many applications, including dimensionality reduction and feature selection. Based on expectation-maximization for probabilistic PCA, we present an algorithm that handles any combination of these constraints. Its complexity is at most quadratic in the number of dimensions of the data. On large data sets from biology and computer vision, we demonstrate significant improvements in performance and computational efficiency compared to other constrained PCA algorithms. Finally, we show the usefulness of non-negative sparse PCA for unsupervised feature selection in a gene clustering task.
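To make the setting concrete, the following is a minimal sketch of an EM-style iteration for the first principal component, with the two constraints imposed by simple projection after each update. It is not the algorithm of the paper: the function name `constrained_em_pca`, its parameters, and the clip-and-truncate projection step are assumptions chosen for illustration, and the paper's actual constrained M-step may differ.

```python
import numpy as np

def constrained_em_pca(X, k=None, nonneg=False, n_iter=200, seed=0):
    """Illustrative sketch (hypothetical helper, not the paper's method).

    X is an (n_samples, n_features) array that is assumed to be centered.
    k is the maximum number of non-zero elements (k >= 1), or None.
    Returns a unit-norm loading vector w.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Start from a non-negative vector when non-negativity is requested.
    w = rng.random(d) if nonneg else rng.standard_normal(d)
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        # E-step: latent coordinate of each sample (w has unit norm).
        y = X @ w
        # M-step: least-squares update, proportional to X^T y / (y^T y);
        # the scale is dropped because w is renormalized below.
        w_new = X.T @ y
        if nonneg:
            # Assumed projection: clip negative elements to zero.
            w_new = np.maximum(w_new, 0.0)
        if k is not None and k < d:
            # Assumed projection: keep only the k largest magnitudes.
            small = np.argsort(np.abs(w_new))[:-k]
            w_new[small] = 0.0
        norm = np.linalg.norm(w_new)
        if norm == 0.0:
            break  # degenerate update; keep the previous iterate
        w = w_new / norm
    return w
```

Without constraints, the loop reduces to a power-iteration-like scheme that converges to the dominant eigenvector of the sample covariance matrix. Each iteration costs two matrix-vector products with the data matrix, so this sketch never forms the covariance matrix explicitly.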
Index Terms
Expectation-maximization for sparse and non-negative PCA


