Abstract
Neural architecture search (NAS) has become an increasingly important tool within the deep learning community in recent years, yielding many practical advancements in the design of deep neural network architectures. However, most existing approaches operate within highly structured design spaces, and hence (1) explore only a small fraction of the full search space of neural architectures while also (2) requiring significant manual effort from domain experts. In this work, we develop techniques that enable efficient NAS in a significantly larger design space. In particular, we propose to perform NAS in an abstract search space of program properties. Our key insights are as follows: (1) an abstract search space can be significantly smaller than the original search space, and (2) architectures with similar program properties should also have similar performance; thus, we can search more efficiently in the abstract search space. To enable this approach, we also introduce a novel efficient synthesis procedure, which performs the role of concretizing a set of promising program properties into a satisfying neural architecture. We implement our approach, αNAS, within an evolutionary framework, where the mutations are guided by the program properties. Starting with a ResNet-34 model, αNAS produces a model with slightly improved accuracy on CIFAR-10 but 96% fewer parameters. On ImageNet, αNAS is able to improve over Vision Transformer (30% fewer FLOPS and parameters), ResNet-50 (23% fewer FLOPS, 14% fewer parameters), and EfficientNet (7% fewer FLOPS and parameters) without any degradation in accuracy.
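The abstract above describes a loop: abstract a candidate architecture into coarse program properties, mutate those properties, then synthesize a concrete architecture realizing them. The following is a toy sketch of that idea, not the paper's implementation: "architectures" are lists of layer widths, `fitness` is a parameter-count stand-in for trained accuracy, and `synthesize` uses simple rejection sampling; all names here are illustrative.

```python
import random

def properties(prog):
    """Abstract a concrete program (a list of layer widths) to coarse properties."""
    return {"depth": len(prog), "max_width": max(prog)}

def mutate_properties(props, rng):
    """Perturb the abstraction rather than the concrete program."""
    new = dict(props)
    key = rng.choice(["depth", "max_width"])
    new[key] = max(1, new[key] + rng.choice([-1, 1]))
    return new

def synthesize(target, rng, tries=200):
    """Concretize: sample programs until one realizes the target properties."""
    for _ in range(tries):
        cand = [rng.randint(1, target["max_width"])
                for _ in range(target["depth"])]
        if properties(cand) == target:
            return cand
    return None  # synthesis failed within the sampling budget

def fitness(prog):
    """Stand-in objective: prefer fewer parameters (smaller total width)."""
    return -sum(prog)

def evolve(seed, generations=50, rng=None):
    """Evolutionary search where mutations act on the abstract property space."""
    rng = rng or random.Random(0)
    best = seed
    for _ in range(generations):
        child = synthesize(mutate_properties(properties(best), rng), rng)
        if child is not None and fitness(child) > fitness(best):
            best = child
    return best
```

The point of the abstraction is that one property vector stands for many concrete programs, so the search space the mutator sees is far smaller than the space of architectures; a real synthesizer would solve for a satisfying subgraph rather than sample blindly.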
Supplemental Material
Companion appendix for the main paper, available for download.
References
- Andrew Adams, Karima Ma, Luke Anderson, Riyadh Baghdadi, Tzu-Mao Li, Michaël Gharbi, Benoit Steiner, Steven Johnson, Kayvon Fatahalian, Frédo Durand, and Jonathan Ragan-Kelley. 2019. Learning to Optimize Halide with Tree Search and Random Programs. ACM Trans. Graph., 38, 4 (2019), Article 121, July, 12 pages. issn:0730-0301 https://doi.org/10.1145/3306346.3322967
- Ron Banner, Itay Hubara, Elad Hoffer, and Daniel Soudry. 2018. Scalable Methods for 8-Bit Training of Neural Networks. In Proceedings of the 32nd International Conference on Neural Information Processing Systems (NIPS’18). Curran Associates Inc., Red Hook, NY, USA. 5151–5159.
- Gabriel Bender, Hanxiao Liu, Bo Chen, Grace Chu, Shuyang Cheng, Pieter-Jan Kindermans, and Quoc V Le. 2020. Can Weight Sharing Outperform Random Architecture Search? An Investigation With TuNAS. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 14311–14320.
- James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake VanderPlas, Skye Wanderman-Milne, and Qiao Zhang. 2018. JAX: composable transformations of Python+NumPy programs. http://github.com/google/jax
- Han Cai, Tianyao Chen, Weinan Zhang, Yong Yu, and Jun Wang. 2018. Efficient Architecture Search by Network Transformation. In Proceedings of the AAAI Conference on Artificial Intelligence. 32.
- Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Meghan Cowan, Haichen Shen, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, and Arvind Krishnamurthy. 2018. TVM: An Automated End-to-End Optimizing Compiler for Deep Learning. In Proceedings of the 13th USENIX Conference on Operating Systems Design and Implementation (OSDI’18). USENIX Association, USA. 579–594. isbn:9781931971478
- Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. 2021. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In International Conference on Learning Representations. https://openreview.net/forum?id=YicbFdNTTy
- Jonathan Frankle and Michael Carbin. 2019. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks. In International Conference on Learning Representations. https://openreview.net/forum?id=rJl-b3RcF7
- Amir Gholami, Sehoon Kim, Zhen Dong, Zhewei Yao, Michael W. Mahoney, and Kurt Keutzer. 2021. A Survey of Quantization Methods for Efficient Neural Network Inference. CoRR, abs/2103.13630 (2021). arXiv:2103.13630
- Babak Hassibi and David Stork. 1992. Second order derivatives for network pruning: Optimal Brain Surgeon. In Advances in Neural Information Processing Systems, S. Hanson, J. Cowan, and C. Giles (Eds.). 5, Morgan-Kaufmann. https://proceedings.neurips.cc/paper/1992/file/303ed4c69846ab36c2904d3ba8573050-Paper.pdf
- Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 770–778.
- Yihui He, Ji Lin, Zhijian Liu, Hanrui Wang, Li-Jia Li, and Song Han. 2018. AMC: AutoML for Model Compression and Acceleration on Mobile Devices. In Proceedings of the European Conference on Computer Vision (ECCV). 784–800.
- Yanping Huang, Youlong Cheng, Ankur Bapna, Orhan Firat, Dehao Chen, Mia Chen, HyoukJoong Lee, Jiquan Ngiam, Quoc V Le, and Yonghui Wu. 2019. GPipe: Efficient Training of Giant Neural Networks Using Pipeline Parallelism. Advances in Neural Information Processing Systems, 32 (2019).
- Zhihao Jia, Oded Padon, James Thomas, Todd Warszawski, Matei Zaharia, and Alex Aiken. 2019. TASO: Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions. In Proceedings of the 27th ACM Symposium on Operating Systems Principles (SOSP ’19). Association for Computing Machinery, New York, NY, USA. 47–62. https://doi.org/10.1145/3341301.3359630
- Zhihao Jia, James Thomas, Todd Warszawski, Mingyu Gao, Matei Zaharia, and Alex Aiken. 2019. Optimizing DNN Computation with Relaxed Graph Substitutions. In Proceedings of the 2nd SysML Conference (SysML ’19).
- Zhihao Jia, Matei Zaharia, and Alex Aiken. 2019. Beyond Data and Model Parallelism for Deep Neural Networks. Proceedings of Machine Learning and Systems, 1 (2019), 1–13.
- Fredrik Kjolstad, Shoaib Kamil, Stephen Chou, David Lugato, and Saman Amarasinghe. 2017. The Tensor Algebra Compiler. Proc. ACM Program. Lang., 1, OOPSLA (2017), Article 77, Oct., 29 pages. issn:2475-1421 https://doi.org/10.1145/3133901
- Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. ImageNet Classification with Deep Convolutional Neural Networks. In Advances in Neural Information Processing Systems, F. Pereira, C.J. Burges, L. Bottou, and K.Q. Weinberger (Eds.). 25, Curran Associates, Inc. https://proceedings.neurips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
- Yann LeCun, John Denker, and Sara Solla. 1989. Optimal Brain Damage. In Advances in Neural Information Processing Systems, D. Touretzky (Ed.). 2, Morgan-Kaufmann. https://proceedings.neurips.cc/paper/1989/file/6c9882bbac1c7093bd25041881277658-Paper.pdf
- Liam Li and Ameet Talwalkar. 2020. Random Search and Reproducibility for Neural Architecture Search. In Uncertainty in Artificial Intelligence. 367–377.
- Muyang Li, Ji Lin, Yaoyao Ding, Zhijian Liu, Jun-Yan Zhu, and Song Han. 2020. GAN Compression: Efficient Architectures for Interactive Conditional GANs. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 5284–5294.
- Chenxi Liu, Barret Zoph, Maxim Neumann, Jonathon Shlens, Wei Hua, Li-Jia Li, Li Fei-Fei, Alan Yuille, Jonathan Huang, and Kevin Murphy. 2018. Progressive Neural Architecture Search. In Computer Vision – ECCV, Vittorio Ferrari, Martial Hebert, Cristian Sminchisescu, and Yair Weiss (Eds.). Springer International Publishing.
- Hanxiao Liu, Karen Simonyan, Oriol Vinyals, Chrisantha Fernando, and Koray Kavukcuoglu. 2018. Hierarchical Representations for Efficient Architecture Search. In International Conference on Learning Representations (ICLR’18). https://openreview.net/forum?id=BJQRKzbA-
- Jian-Hao Luo, Jianxin Wu, and Weiyao Lin. 2017. ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression. In 2017 IEEE International Conference on Computer Vision (ICCV). 5068–5076. https://doi.org/10.1109/ICCV.2017.541
- Deepak Narayanan, Aaron Harlap, Amar Phanishayee, Vivek Seshadri, Nikhil R. Devanur, Gregory R. Ganger, Phillip B. Gibbons, and Matei Zaharia. 2019. PipeDream: Generalized Pipeline Parallelism for DNN Training. In Proceedings of the 27th ACM Symposium on Operating Systems Principles (SOSP ’19). Association for Computing Machinery, New York, NY, USA. 1–15. isbn:9781450368735 https://doi.org/10.1145/3341301.3359646
- Phitchaya Mangpo Phothilimthana, Amit Sabne, Nikhil Sarda, Karthik Srinivasa Murthy, Yanqi Zhou, Christof Angermueller, Mike Burrows, Sudip Roy, Ketan Mandke, Rezsa Farahani, Yu Emma Wang, Berkin Ilbeyi, Blake Hechtman, Bjarke Roune, Shen Wang, Yuanzhong Xu, and Samuel J. Kaufman. 2021. A Flexible Approach to Autotuning Multi-Pass Machine Learning Compilers. In 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT). 1–16. https://doi.org/10.1109/PACT52795.2021.00008
- Phitchaya Mangpo Phothilimthana, Aditya Thakur, Rastislav Bodik, and Dinakar Dhurjati. 2016. Scaling up Superoptimization. In Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’16). Association for Computing Machinery, New York, NY, USA. 297–310. isbn:9781450340915 https://doi.org/10.1145/2872362.2872387
- Oleksandr Polozov and Sumit Gulwani. 2015. FlashMeta: A Framework for Inductive Program Synthesis. In Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA’15). Association for Computing Machinery, New York, NY, USA. 107–126. isbn:9781450336895 https://doi.org/10.1145/2814270.2814310
- Esteban Real, Alok Aggarwal, Yanping Huang, and Quoc V Le. 2019. Regularized Evolution for Image Classifier Architecture Search. In Proceedings of the AAAI Conference on Artificial Intelligence. 33, 4780–4789.
- Esteban Real, Chen Liang, David R. So, and Quoc V. Le. 2020. AutoML-Zero: Evolving Machine Learning Algorithms From Scratch. In International Conference on Machine Learning (ICML'20). arXiv:2003.03384
- David So, Wojciech Mańke, Hanxiao Liu, Zihang Dai, Noam Shazeer, and Quoc V Le. 2021. Searching for Efficient Transformers for Language Modeling. In Advances in Neural Information Processing Systems.
- Mingxing Tan and Quoc V. Le. 2019. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. In Proceedings of the International Conference on Machine Learning (ICML'19). arXiv:1905.11946.
- Mingxing Tan and Quoc V. Le. 2021. EfficientNetV2: Smaller Models and Faster Training. In Proceedings of the International Conference on Machine Learning (ICML'21). arXiv:2104.00298.
- Jack Turner, Elliot J. Crowley, and Michael F. P. O’Boyle. 2021. Neural Architecture Search as Program Transformation Exploration. In Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’21). Association for Computing Machinery, New York, NY, USA. 915–927. isbn:9781450383172 https://doi.org/10.1145/3445814.3446753
- Nicolas Vasilache, Oleksandr Zinenko, Theodoros Theodoridis, Priya Goyal, Zachary DeVito, William S. Moses, Sven Verdoolaege, Andrew Adams, and Albert Cohen. 2018. Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions. arXiv preprint arXiv:1802.04730.
- Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is All you Need. In Advances in Neural Information Processing Systems, I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.). 30, Curran Associates, Inc. https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
- Chenglong Wang, Alvin Cheung, and Rastislav Bodik. 2017. Synthesizing Highly Expressive SQL Queries from Input-Output Examples. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’17). Association for Computing Machinery, New York, NY, USA. 452–466. isbn:9781450349888 https://doi.org/10.1145/3062341.3062365
- Naigang Wang, Jungwook Choi, Daniel Brand, Chia-Yu Chen, and Kailash Gopalakrishnan. 2018. Training Deep Neural Networks with 8-Bit Floating Point Numbers. In Proceedings of the 32nd International Conference on Neural Information Processing Systems (NIPS’18). Curran Associates Inc., Red Hook, NY, USA. 7686–7695.
- Xinyu Wang, Isil Dillig, and Rishabh Singh. 2017. Program Synthesis Using Abstraction Refinement. In Symposium on Principles of Programming Languages (POPL’17). Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3158151
- Tao Wei, Changhu Wang, Yong Rui, and Chang Wen Chen. 2016. Network Morphism. In Proceedings of the 33rd International Conference on International Conference on Machine Learning - Volume 48 (ICML’16). 564–572.
- Yichen Yang, Phitchaya Mangpo Phothilimthana, Yisu Remy Wang, Max Willsey, Sudip Roy, and Jacques Pienaar. 2021. Equality Saturation for Tensor Graph Superoptimization. In Proceedings of Machine Learning and Systems (MLSys). arXiv:2101.01332
- Kaicheng Yu, Christian Sciuto, Martin Jaggi, Claudiu Musat, and Mathieu Salzmann. 2019. Evaluating The Search Phase of Neural Architecture Search. In International Conference on Learning Representations.
- Yanqi Zhou, Sudip Roy, Amirali Abdolrashidi, Daniel Wong, Peter Ma, Qiumin Xu, Hanxiao Liu, Phitchaya Mangpo Phothilimthana, Shen Wang, Anna Goldie, Azalia Mirhoseini, and James Laudon. 2020. Transferable Graph Optimizers for ML Compilers. In Advances in Neural Information Processing Systems (NeurIPS'20).
- Barret Zoph and Quoc Le. 2017. Neural Architecture Search with Reinforcement Learning. In International Conference on Learning Representations. https://openreview.net/forum?id=r1Ue8Hcxg
- Barret Zoph, Vijay Vasudevan, Jonathon Shlens, and Quoc Le. 2018. Learning Transferable Architectures for Scalable Image Recognition. In Conference on Computer Vision and Pattern Recognition (CVPR’18). 8697–8710. https://doi.org/10.1109/CVPR.2018.00907