Research Article · Open Access

Neural architecture search using property guided synthesis

Published: 31 October 2022

Abstract

Neural architecture search (NAS) has become an increasingly important tool within the deep learning community in recent years, yielding many practical advancements in the design of deep neural network architectures. However, most existing approaches operate within highly structured design spaces, and hence (1) explore only a small fraction of the full search space of neural architectures while also (2) requiring significant manual effort from domain experts. In this work, we develop techniques that enable efficient NAS in a significantly larger design space. In particular, we propose to perform NAS in an abstract search space of program properties. Our key insights are as follows: (1) an abstract search space can be significantly smaller than the original search space, and (2) architectures with similar program properties should also have similar performance; thus, we can search more efficiently in the abstract search space. To enable this approach, we also introduce a novel efficient synthesis procedure, which performs the role of concretizing a set of promising program properties into a satisfying neural architecture. We implement our approach, αNAS, within an evolutionary framework, where the mutations are guided by the program properties. Starting with a ResNet-34 model, αNAS produces a model with slightly improved accuracy on CIFAR-10 but 96% fewer parameters. On ImageNet, αNAS is able to improve over Vision Transformer (30% fewer FLOPS and parameters), ResNet-50 (23% fewer FLOPS, 14% fewer parameters), and EfficientNet (7% fewer FLOPS and parameters) without any degradation in accuracy.
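The loop described above can be made concrete with a short sketch: an evolutionary search in which each mutation first perturbs the abstract properties of a parent architecture, and a synthesis step then concretizes the mutated properties back into a network. Everything below is a hypothetical reconstruction from the abstract alone, not the authors' implementation; `abstract_properties`, `mutate_properties`, `synthesize`, and the toy fitness function are assumed stand-ins.

```python
import random

# Hypothetical sketch of property-guided evolutionary NAS, reconstructed
# from the abstract; all names and the chosen property set are assumptions.

def abstract_properties(arch):
    """Abstract a concrete architecture (here: a list of layer dicts)
    into a small set of program properties."""
    return {"depth": len(arch),
            "params": sum(layer["params"] for layer in arch)}

def mutate_properties(props):
    """Mutate in the abstract space: e.g., ask for a smaller model with
    the same depth. Mutations act on properties, not on raw layers."""
    mutated = dict(props)
    mutated["params"] = max(1, int(props["params"] * random.uniform(0.5, 1.0)))
    return mutated

def synthesize(target, parent):
    """Concretize target properties into a satisfying architecture.
    The paper's synthesis procedure does real search here; this stub
    merely rescales the parent's layers until the property holds."""
    child = [dict(layer) for layer in parent]
    scale = target["params"] / max(1, sum(l["params"] for l in child))
    for layer in child:
        layer["params"] = max(1, int(layer["params"] * scale))
    return child

def evolve(seed_arch, evaluate, generations=10, population_size=8):
    """Regularized-evolution-style loop with property-guided mutations."""
    population = [seed_arch]
    for _ in range(generations):
        parent = max(population, key=evaluate)            # pick a fit parent
        target = mutate_properties(abstract_properties(parent))
        population.append(synthesize(target, parent))     # concretize
        population = sorted(population, key=evaluate)[-population_size:]
    return max(population, key=evaluate)

# Toy usage: start from a 34-layer stand-in and reward smaller models.
seed = [{"params": 30_000} for _ in range(34)]
best = evolve(seed, evaluate=lambda arch: -sum(l["params"] for l in arch))
```

The point of the abstraction is that `synthesize` carries the hard work: architectures sharing the same properties are expected to perform similarly, so searching over the small property space and then concretizing is cheaper than mutating raw architectures directly. The stub above only rescales the parent; the paper's procedure performs an actual synthesis search.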

