Aggressive Energy Reduction for Video Inference with Software-only Strategies

Published: 07 October 2019

Abstract

In recent years, several works have proposed custom hardware and software-based techniques to accelerate Convolutional Neural Networks (CNNs). Most of these works save computation by reducing numerical precision or by changing how frames are processed. To reach a more aggressive energy reduction, in this paper we propose software-only modifications to the CNN inference process.

Our approach exploits the inherent temporal locality of video by replacing entire frame computations with a movement-prediction algorithm. Furthermore, when a frame must be processed, we avoid energy-demanding floating-point operations and, at the same time, reduce memory accesses by employing look-up tables in place of the original convolutions.
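To make the first idea concrete, below is a minimal sketch of motion-based frame skipping in Python with NumPy. It assumes grayscale frames as 2D arrays and a hypothetical run_cnn_inference() detector; the exhaustive block-matching search, block size, and fixed re-inference period are illustrative choices under these assumptions, not the paper's exact algorithm.

import numpy as np

BLOCK = 16   # block size for motion estimation (illustrative)
SEARCH = 8   # search radius in pixels (illustrative)

def block_motion(prev, curr, y, x):
    # Estimate where the BLOCK x BLOCK patch at (y, x) in `prev` moved to
    # in `curr`, using an exhaustive sum-of-absolute-differences (SAD) search.
    ref = prev[y:y + BLOCK, x:x + BLOCK].astype(np.int32)
    best_sad, best = None, (0, 0)
    for dy in range(-SEARCH, SEARCH + 1):
        for dx in range(-SEARCH, SEARCH + 1):
            yy, xx = y + dy, x + dx
            if yy < 0 or xx < 0 or yy + BLOCK > curr.shape[0] or xx + BLOCK > curr.shape[1]:
                continue
            cand = curr[yy:yy + BLOCK, xx:xx + BLOCK].astype(np.int32)
            sad = int(np.abs(ref - cand).sum())
            if best_sad is None or sad < best_sad:
                best_sad, best = sad, (dy, dx)
    return best

def infer_video(frames, run_cnn_inference, period=10):
    # Run the CNN only every `period` frames; in between, shift the last
    # detections by the estimated motion instead of recomputing them.
    detections, prev = None, None
    for i, frame in enumerate(frames):
        if detections is None or i % period == 0:
            detections = run_cnn_inference(frame)  # full, expensive inference
        else:
            h, w = frame.shape
            # Motion of the central block as a cheap global-motion proxy.
            dy, dx = block_motion(prev, frame, (h - BLOCK) // 2, (w - BLOCK) // 2)
            detections = [(x1 + dx, y1 + dy, x2 + dx, y2 + dy, cls)
                          for (x1, y1, x2, y2, cls) in detections]
        prev = frame
        yield detections

In practice the block size, search radius, and re-inference policy would be tuned per application (a static security camera tolerates longer skip intervals than a moving vehicle); the paper's own movement-prediction algorithm may differ in these details.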

Using the proposed approach, one can reach significant energy savings: more than 25× for security-camera applications and 12× for moving-vehicle applications, with only small software modifications.
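The second technique mentioned above, look-up tables in place of the original convolutions, admits an equally small sketch. It assumes activations and weights have been quantized to small codebooks so that every product can be precomputed once; the codebook sizes and levels below are illustrative, not the paper's exact scheme.

import numpy as np

def build_product_lut(act_levels, weight_levels):
    # Precompute every activation-level x weight-level product once,
    # so inference never performs a floating-point multiply again.
    return np.outer(act_levels, weight_levels)  # shape (A, W)

def lut_conv2d(x_idx, w_idx, lut):
    # 2D convolution where x_idx and w_idx hold codebook indices and every
    # multiplication is replaced by an indexed read from `lut`.
    H, W = x_idx.shape
    K = w_idx.shape[0]
    out = np.zeros((H - K + 1, W - K + 1), dtype=lut.dtype)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = x_idx[i:i + K, j:j + K]
            out[i, j] = lut[patch, w_idx].sum()  # lookups and adds only
    return out

# Usage: 4-level activation and weight codebooks, 3x3 kernel.
act_levels = np.array([0.0, 0.5, 1.0, 1.5], dtype=np.float32)
weight_levels = np.array([-1.0, -0.25, 0.25, 1.0], dtype=np.float32)
lut = build_product_lut(act_levels, weight_levels)
x_idx = np.random.randint(0, 4, size=(8, 8))
w_idx = np.random.randint(0, 4, size=(3, 3))
y = lut_conv2d(x_idx, w_idx, lut)

With small codebooks the product table fits comfortably in cache, which is one way a lookup-based scheme can reduce memory traffic compared with fetching full-precision weights.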
