Kernel Attention Network for Single Image Super-Resolution

Published: 05 July 2020

Abstract

Attention mechanisms are increasingly used in convolutional neural networks (CNNs), and representative ones, such as channel attention (CA) and spatial attention (SA), have been widely applied to single image super-resolution (SISR). However, existing architectures apply these mechanisms to SISR directly, with little consideration of the nature of the task, which limits their representational power. In this article, we propose a novel kernel attention module (KAM) for SISR, which enables the network to adjust its receptive field size to the scale of the input by dynamically selecting the appropriate kernel. Building on this module, we stack multiple KAMs with group and residual connections to form a novel architecture for SISR, which learns more discriminative representations by filtering information under different receptive fields. As a result, our network is more sensitive to multi-scale features, so a single network can handle the multi-scale SR task with predefined upscaling modules. We also investigate and illustrate in detail other attention mechanisms used in super-resolution. Extensive benchmark evaluation shows that, thanks to the kernel attention mechanism, our method outperforms other state-of-the-art methods.
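The abstract does not give the internals of the KAM, so the following is only a rough illustration of what "dynamically selecting the appropriate kernel" could look like, under stated assumptions. It works on a 1-D signal, uses fixed mean filters in place of learned convolutions, and gates the branches with a softmax over a global average; the function names, the `kernel_sizes` default, and the `gate_weights` parameter are all hypothetical and are not taken from the article.

```python
import math


def box_filter(signal, kernel_size):
    """Zero-padded mean filter; a stand-in for one learned convolution branch."""
    half = kernel_size // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [sum(padded[i:i + kernel_size]) / kernel_size
            for i in range(len(signal))]


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kernel_attention(signal, kernel_sizes=(3, 5), gate_weights=(1.0, -1.0)):
    """Blend branches with different receptive fields using input-dependent weights.

    gate_weights is a hypothetical stand-in for learned gating parameters.
    """
    # One branch per (odd) kernel size, i.e. per receptive field.
    branches = [box_filter(signal, k) for k in kernel_sizes]
    # Fuse the branches, then squeeze to a single global descriptor.
    fused = [sum(values) for values in zip(*branches)]
    descriptor = sum(fused) / len(fused)
    # One score per branch; softmax turns the scores into selection weights.
    weights = softmax([w * descriptor for w in gate_weights])
    # Output is the attention-weighted sum of the branch responses.
    output = [sum(w * branch[i] for w, branch in zip(weights, branches))
              for i in range(len(signal))]
    return output, weights
```

Calling `kernel_attention([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])` returns the blended signal together with one weight per branch; the weights sum to one and change with the input, which is the sense in which the effective receptive field is selected dynamically in this sketch.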

    • Published in

      ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 16, Issue 3
      August 2020
      364 pages
      ISSN: 1551-6857
      EISSN: 1551-6865
      DOI: 10.1145/3409646

      Copyright © 2020 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 5 July 2020
      • Online AM: 13 May 2020
      • Accepted: 1 May 2020
      • Revised: 1 April 2020
      • Received: 1 May 2019
      Qualifiers

      • research-article
      • Research
      • Refereed
