Abstract
Image enhancement has attracted significant research over the past years because of its great application potential in video conferencing scenarios. Nevertheless, most existing image enhancement approaches still struggle to find a good tradeoff that reduces the computational cost as much as possible while maintaining plausible result quality. Recently, curve-based mapping methods have been proposed and have shown great potential for real-time, high-quality enhancement of images at arbitrary resolutions. In this article, we take advantage of the curve-based mapping representation and focus on further improving enhancement quality and robustness while minimizing additional computational cost. Specifically, we (1) carefully reformulate the curve function to improve learning stability, and (2) aggregate different semantic attention into the curve regression process, which overcomes the major problem of curve-based methods, namely that they generate moderate results with low contrast. The semantic attention is jointly learned under supervision from the class activation mapping of pre-trained feature extractors, which reduces the manual annotation cost of semantic labels. Experiments show that our proposed method significantly improves on curve-based methods both qualitatively and quantitatively, achieves visually plausible results compared with other deep neural network-based enhancement methods, and maintains a very low computational cost, taking 18.7 ms for a 360p image on a single P40 GPU. Extensive experiments demonstrate that our method is also capable of video enhancement tasks.
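To make the idea of curve-based mapping with attention aggregation more concrete, the following is a minimal sketch under stated assumptions, not the formulation used in the article. It assumes a generic piecewise-linear tone curve defined by a few knots, and a set of per-pixel attention maps that blend the outputs of several such curves. The names `apply_curve` and `enhance`, the knot representation, and the normalized weighted blend are all hypothetical choices made for illustration; in the article the curves and attention are regressed by a network, whereas here they are given by hand.

```python
from typing import List, Sequence

Image = List[List[float]]  # H x W intensities in [0, 1]


def apply_curve(x: float, knots: Sequence[float]) -> float:
    """Map an intensity through a piecewise-linear curve (illustrative only).

    knots[i] is the curve output at input i / (len(knots) - 1).
    """
    n = len(knots) - 1
    x = min(max(x, 0.0), 1.0)   # clamp the input to [0, 1]
    pos = x * n                 # position measured in knot intervals
    i = min(int(pos), n - 1)    # index of the left knot
    t = pos - i                 # fractional offset inside the interval
    return (1.0 - t) * knots[i] + t * knots[i + 1]


def enhance(image: Image,
            curves: Sequence[Sequence[float]],
            attention: Sequence[Image]) -> Image:
    """Blend several curve outputs per pixel using attention weights.

    attention[k][r][c] is the (hypothetical) weight of curve k at pixel (r, c).
    Pixels whose weights sum to zero are left unchanged.
    """
    out: Image = []
    for r, row in enumerate(image):
        out_row: List[float] = []
        for c, x in enumerate(row):
            weights = [attention[k][r][c] for k in range(len(curves))]
            total = sum(weights)
            if total == 0.0:
                out_row.append(x)
                continue
            mapped = sum(w * apply_curve(x, curve)
                         for w, curve in zip(weights, curves))
            out_row.append(mapped / total)
        out.append(out_row)
    return out


if __name__ == "__main__":
    identity = [0.0, 0.5, 1.0]
    brighten = [0.0, 0.8, 1.0]
    image = [[0.25, 0.25]]
    # The first pixel follows the identity curve, the second the brightening curve.
    attention = [[[1.0, 0.0]], [[0.0, 1.0]]]
    print(enhance(image, [identity, brighten], attention))
```

Because each curve is described by only a handful of parameters, a sketch like this suggests why such methods can scale to arbitrary resolutions: the parameters could be estimated on a small input and then applied pixel by pixel to the full-size image.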
Index Terms
Real-time Image Enhancement with Attention Aggregation