Abstract
We study the high dynamic range (HDR) imaging problem in dual-lens systems. Existing methods usually treat HDR imaging as an image fusion problem and estimate the HDR result by fusing an aligned short-exposure image with a long-exposure image. However, this fusion pipeline depends heavily on image alignment, which is difficult to achieve perfectly. We propose to recast the dual-lens HDR imaging problem as a disentangled enhancement of the short-exposure image, consisting of exposure correction followed by denoising, both guided by the long-exposure image. In the guided exposure correction module, we combine the guidance image with a 3D color transformation in a guided 3D exposure CNN (GEC) that produces a rough HDR result from the short-exposure image. Then, in the guided denoising module, we use the cross-attention mechanism in a guided denoising transformer (GDT), which takes the long-exposure image directly as guidance to denoise the rough HDR result in a pyramid manner. Both modules bypass the difficult image alignment step. Experimental results show that our method outperforms state-of-the-art methods.
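The idea that cross-attention can use a guidance image without explicit alignment can be illustrated with a minimal sketch. This is not the GDT architecture itself: it assumes plain scaled dot-product attention with no learned projections, no pyramid, and toy two-dimensional tokens, and the names `cross_attention`, `rough_tokens`, and `guide_tokens` are hypothetical. Queries come from the rough HDR result, while keys and values come from the long-exposure guidance, so each query gathers content from whichever guidance position matches it best, wherever that position is.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention on lists of token vectors.

    queries: tokens from the rough HDR result (assumed noisy).
    keys, values: tokens from the long-exposure guidance image.
    Learned projections are omitted; this is an illustrative assumption.
    """
    dim = len(queries[0])
    out_dim = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every guidance token, at any position.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of guidance values.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(out_dim)
        ])
    return outputs


# Toy case: the guidance holds the same content at swapped positions,
# standing in for a misaligned second view.
rough_tokens = [[1.0, 0.0], [0.0, 1.0]]
guide_tokens = [[0.0, 5.0], [5.0, 0.0]]
result = cross_attention(rough_tokens, guide_tokens, guide_tokens)
```

In this toy case the first query draws mostly on the second guidance token and vice versa, so the swapped positions do not prevent each token from finding its matching content.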
Dual-Lens HDR using Guided 3D Exposure CNN and Guided Denoising Transformer