research-article

A Decoupled Kernel Prediction Network Guided by Soft Mask for Single Image HDR Reconstruction

Published: 17 February 2023

Abstract

Recent works on single image high dynamic range (HDR) reconstruction fail to hallucinate plausible textures, resulting in missing information and artifacts in large-scale under/over-exposed regions. In this article, a decoupled kernel prediction network is proposed to infer an HDR image from a low dynamic range (LDR) image. Specifically, we first adopt a simple module to generate a preliminary result that precisely estimates the well-exposed HDR regions. Meanwhile, an encoder-decoder backbone network with a soft mask guidance module is presented to predict pixel-wise kernels, which are then convolved with the preliminary result to obtain the final HDR output. Instead of traditional kernels, our predicted kernels are decoupled along the spatial and channel dimensions. The advantages of our method are at least threefold. First, our model is guided by the soft mask so that it can focus on the most relevant information for under/over-exposed regions. Second, pixel-wise kernels can adaptively handle the different degradations in differently exposed regions. Third, decoupled kernels avoid information redundancy across channels and reduce the solution space of our model. Thus, our method is able to hallucinate fine details in the under/over-exposed regions and render visually pleasing results. Extensive experiments demonstrate that our model outperforms state-of-the-art methods.
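
The abstract does not give the exact kernel formulation, so the following is only an illustrative sketch of how pixel-wise kernels decoupled along the spatial and channel dimensions might be applied to a preliminary result. It assumes one k×k spatial kernel per pixel, shared by all channels, followed by one scalar weight per pixel and channel. The function name `apply_decoupled_kernels`, the array layouts, and the edge padding are all hypothetical choices made for this example, not the authors' implementation.

```python
import numpy as np

def apply_decoupled_kernels(prelim, spatial_k, channel_w):
    """Apply assumed decoupled pixel-wise kernels to a preliminary image.

    prelim:    (H, W, C) preliminary HDR estimate
    spatial_k: (H, W, k, k) one spatial kernel per pixel, shared across channels
    channel_w: (H, W, C) one weight per pixel and channel
    """
    H, W, C = prelim.shape
    k = spatial_k.shape[-1]
    pad = k // 2
    # Replicate border pixels so every output pixel has a full k x k window.
    padded = np.pad(prelim, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.zeros_like(prelim, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            # Shifted view of the image aligned with kernel tap (dy, dx).
            patch = padded[dy:dy + H, dx:dx + W, :]
            # Weight each pixel's neighbour by that pixel's own kernel tap.
            out += spatial_k[:, :, dy, dx, None] * patch
    # Channel step: rescale each channel with its per-pixel weight.
    return out * channel_w
```

Under these assumptions, each pixel needs k·k + C predicted values (12 for k = 3, C = 3) instead of the k·k·C values (27) that a separate spatial kernel per channel would require, which is one way to read the claim that decoupling reduces the solution space.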




    • Published in

      ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 19, Issue 2s
      April 2023
      545 pages
      ISSN:1551-6857
      EISSN:1551-6865
      DOI:10.1145/3572861
      • Editor: Abdulmotaleb El Saddik

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 17 February 2023
      • Online AM: 22 July 2022
      • Accepted: 19 July 2022
      • Revised: 6 June 2022
      • Received: 28 February 2022