Abstract
Scaling up kernel support vector machine (SVM) training has been an important topic in recent years. Despite its theoretical elegance, training a kernel SVM is impractical on datasets with millions of samples. The divide-and-conquer (DC) strategy is a natural framework for handling very large problems, and the divide-and-conquer solver for kernel SVM (DC-SVM) can train a kernel SVM on millions of samples within a limited time budget. However, the DC-SVM approach has two drawbacks. First, it uses an unsupervised clustering method to partition the whole problem, which is prone to producing singular subsets. Second, it is hard to balance the computation load between sub-problems. To address these issues, this article proposes a load-balancing partition method for kernel SVM. The method first clusters the samples from one class, then assigns data samples to the cluster centers by a distance measure and constructs the sub-problems; in this way, it can control the computation load and avoid singular sub-problems. Experimental results show that the proposed method has better load-balancing performance than DC-SVM, which suggests that it is suitable for distributed and embedded systems.
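The partition step described in the abstract can be illustrated with a rough sketch. The abstract does not give the actual algorithm, so everything below is an assumption made for illustration: plain Lloyd's k-means on the class labeled `+1`, squared Euclidean distance as the distance measure, a greedy per-class capacity cap of `ceil(n / k)` to balance the load, and a reading of "singular" as a sub-problem that contains only one class. The function names `kmeans`, `balanced_assign`, and `load_balanced_partition` are hypothetical and are not taken from the paper.

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means (an assumed choice); returns the k centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def balanced_assign(X, centers):
    """Greedily give each sample a nearby center, capping each center at ceil(n / k)."""
    n, k = len(X), len(centers)
    cap = -(-n // k)  # ceiling division
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    assign = np.full(n, -1)
    load = np.zeros(k, dtype=int)
    # Visit (sample, center) pairs from closest to farthest.
    for flat in np.argsort(d, axis=None):
        i, j = divmod(int(flat), k)
        if assign[i] == -1 and load[j] < cap:
            assign[i] = j
            load[j] += 1
    return assign

def load_balanced_partition(X, y, k, seed=0):
    """Cluster the +1 class, then spread each class evenly over the k centers."""
    centers = kmeans(X[y == 1], k, seed=seed)
    part = np.empty(len(X), dtype=int)
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        part[idx] = balanced_assign(X[idx], centers)
    return part

# Usage on synthetic data (hypothetical sizes, for illustration only).
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(150, 2))
y_demo = np.where(np.arange(150) < 60, 1, -1)
part_demo = load_balanced_partition(X_demo, y_demo, k=3)
print(np.bincount(part_demo, minlength=3))
```

Because the cap is applied per class, every sub-problem in this sketch receives samples from both classes and the sub-problem sizes differ by at most a few samples. Each sub-problem could then be handed to any kernel SVM solver; the greedy assignment costs O(nk log(nk)) and is meant only to make the idea concrete, not to be efficient.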
Index Terms
A Load-Balancing Divide-and-Conquer SVM Solver