
A Load-Balancing Divide-and-Conquer SVM Solver

Published: 09 May 2017

Abstract

Scaling up kernel support vector machine (SVM) training has been an important topic in recent years. Despite its theoretical elegance, kernel SVM training becomes impractical on datasets with millions of samples. The divide-and-conquer (DC) strategy is a natural framework for handling very large problems, and the divide-and-conquer solver for kernel SVM (DC-SVM) can train a kernel SVM on millions of samples within a limited time budget. However, the DC-SVM approach has two drawbacks. First, it uses an unsupervised clustering method to partition the whole problem, which is prone to producing singular subsets. Second, it is hard to balance the computation load between sub-problems. To address these issues, this article proposes a load-balancing partition method for kernel SVM. The method first clusters the samples from one class and then assigns data samples to the cluster centers according to a distance measure to construct the sub-problems; in this way, it can control the computation load and avoid singular problems. Experimental results show that the proposed method has better load-balancing performance than DC-SVM, which suggests that it is suitable for distributed and embedded systems.
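
The partition step described in the abstract can be illustrated with a rough sketch. The abstract does not give the clustering algorithm, the distance measure, or the balancing rule, so everything specific below is an assumption made for illustration: plain k-means on one class, squared Euclidean distance, and a per-class capacity cap on each center. "Singular" is read here as a sub-problem that contains only one class. The function names `kmeans`, `capped_assign`, and `balanced_partition` are hypothetical and are not taken from the article.

```python
import numpy as np


def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns the k cluster centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers


def capped_assign(X, centers, cap):
    """Give each sample the nearest center that still has room (assumed rule)."""
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Samples closest to some center choose first.
    order = np.argsort(d.min(axis=1))
    counts = np.zeros(len(centers), dtype=int)
    labels = np.empty(len(X), dtype=int)
    for i in order:
        for j in np.argsort(d[i]):  # centers from nearest to farthest
            if counts[j] < cap:
                labels[i] = j
                counts[j] += 1
                break
    return labels


def balanced_partition(X, y, k, anchor_class=1, seed=0):
    """Cluster one class, then spread every class over the k centers."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Step 1: cluster only the samples of the anchor class.
    centers = kmeans(X[y == anchor_class], k, seed=seed)
    labels = np.empty(len(X), dtype=int)
    # Step 2: assign each class to the centers under a capacity cap.
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        cap = -(-len(idx) // k)  # ceil(n_c / k), so all samples fit
        labels[idx] = capped_assign(X[idx], centers, cap)
    return labels


rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
y = np.array([1] * 100 + [-1] * 100)
parts = balanced_partition(X, y, k=4)
```

In this sketch each sub-problem receives at most ceil(n_c / k) samples of a class with n_c samples; when n_c is a multiple of k, as in the usage lines above, every sub-problem receives exactly n_c / k samples of that class. Each sub-problem could then be handed to any kernel SVM solver independently.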

