DOI: 10.1145/3097983.3098043
Research Article · Open Access

Google Vizier: A Service for Black-Box Optimization

Published: 13 August 2017

ABSTRACT

Any sufficiently complex system acts as a black box when it becomes easier to experiment with than to understand. Hence, black-box optimization has become increasingly important as systems have become more complex. In this paper we describe Google Vizier, a Google-internal service for performing black-box optimization that has become the de facto parameter tuning engine at Google. Google Vizier is used to optimize many of our machine learning models and other systems, and also provides core capabilities to Google's Cloud Machine Learning HyperTune subsystem. We discuss our requirements, infrastructure design, underlying algorithms, and advanced features such as transfer learning and automated early stopping that the service provides.
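As a rough illustration of the suggest-evaluate-report loop that such a parameter-tuning service wraps around an expensive black box, the following minimal Python sketch substitutes random search for Vizier's actual algorithms; the names used here (suggest_trial, objective, search_space) are hypothetical stand-ins, not the service's real API.

    import random

    # Illustrative objective: in practice this is an expensive, opaque system,
    # e.g. a model-training run scored on validation accuracy.
    def objective(params):
        return -(params["learning_rate"] - 0.1) ** 2 - (params["layers"] - 3) ** 2

    # Toy suggestion strategy (random search) standing in for the service's
    # optimization backend; Vizier's actual algorithms are not reproduced here.
    def suggest_trial(search_space):
        return {
            "learning_rate": random.uniform(*search_space["learning_rate"]),
            "layers": random.randint(*search_space["layers"]),
        }

    search_space = {"learning_rate": (1e-4, 1.0), "layers": (1, 8)}
    best_params, best_value = None, float("-inf")

    for _ in range(50):                       # fixed trial budget
        params = suggest_trial(search_space)  # service proposes a trial
        value = objective(params)             # client evaluates the black box
        if value > best_value:                # client reports the result back
            best_params, best_value = params, value

    print("best trial:", best_params, "objective:", best_value)

In the service setting described in the paper, the suggestion step would be handled by the Vizier backend and the evaluation step by the client's own job, with results reported back to the service after each trial.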


Supplemental Material

golovin_google_vizier.mp4



Published in

              KDD '17: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
              August 2017
              2240 pages
ISBN: 9781450348874
DOI: 10.1145/3097983

              Copyright © 2017 Owner/Author

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 13 August 2017



              Acceptance Rates

KDD '17 paper acceptance rate: 64 of 748 submissions, 9%. Overall acceptance rate: 1,133 of 8,635 submissions, 13%.
