
Block Walsh–Hadamard Transform-based Binary Layers in Deep Neural Networks

Published: 18 October 2022

Abstract

Convolution has been the core operation of modern deep neural networks. It is well known that convolutions can be implemented in the Fourier transform domain. In this article, we propose to use the binary block Walsh–Hadamard transform (WHT) instead of the Fourier transform. We use WHT-based binary layers to replace some of the regular convolution layers in deep neural networks, utilizing both one-dimensional (1D) and 2D binary WHTs. In both the 1D and 2D layers, we compute the binary WHT of the input feature map and denoise the WHT-domain coefficients using a nonlinearity obtained by combining soft-thresholding with the tanh function. After denoising, we compute the inverse WHT. We use 1D-WHT layers to replace the 1 × 1 convolutional layers, while 2D-WHT layers can replace the 3 × 3 convolution layers and Squeeze-and-Excite layers. 2D-WHT layers with trainable weights can also be inserted before the Global Average Pooling layers to assist the dense layers. In this way, we can reduce the number of trainable parameters significantly with only a slight decrease in accuracy. In this article, we implement the WHT layers in MobileNet-V2, MobileNet-V3-Large, and ResNet to reduce the number of parameters significantly with negligible accuracy loss. Moreover, according to our speed test, the 2D-FWHT layer runs about 24 times as fast as the regular 3 × 3 convolution with 19.51% less RAM usage in an NVIDIA Jetson Nano experiment.
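The core operation described above (forward WHT, denoising of the transform-domain coefficients, inverse WHT) can be sketched as a minimal NumPy illustration. This is not the authors' implementation: the exact form of the tanh-based smooth-thresholding nonlinearity (here taken as tanh(x)·max(|x|−T, 0) with a threshold T, which would be trainable in the actual layer) and the function names are assumptions based on the description in the abstract.

```python
import numpy as np

def fwht(x):
    """Fast Walsh-Hadamard transform (natural/Hadamard order).
    The input length must be a power of two. O(n log n) butterfly."""
    x = np.array(x, dtype=float)  # work on a copy
    n = x.size
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b  # butterfly step
        h *= 2
    return x

def smooth_threshold(x, T):
    """Assumed combination of tanh with soft-thresholding:
    small coefficients (|x| <= T) are zeroed, larger ones are
    passed through a tanh-shaped nonlinearity."""
    return np.tanh(x) * np.maximum(np.abs(x) - T, 0.0)

def wht_layer_1d(feature, T=0.1):
    """Sketch of a 1D-WHT layer: forward WHT, denoise, inverse WHT.
    The inverse WHT equals the forward WHT scaled by 1/n."""
    n = feature.size
    coeffs = fwht(feature)
    denoised = smooth_threshold(coeffs, T)
    return fwht(denoised) / n
```

Because the transform is binary (entries ±1), the forward and inverse passes need only additions and subtractions, which is the source of the speed and memory savings reported above.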


Published in

ACM Transactions on Embedded Computing Systems, Volume 21, Issue 6
November 2022, 498 pages
ISSN: 1539-9087
EISSN: 1558-3465
DOI: 10.1145/3561948
Editor: Tulika Mitra


Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 18 October 2022
• Online AM: 26 January 2022
• Accepted: 30 December 2021
• Revised: 25 November 2021
• Received: 15 July 2021

      Qualifiers

      • research-article
      • Refereed
