Abstract
In this article, we address the problem of rain-streak removal in videos. Unlike single-image restoration, video restoration must maintain temporal consistency in addition to spatial enhancement. Several effective methods have been proposed that estimate de-noised videos with outstanding temporal consistency. However, these methods also increase the computational cost because of their larger model size. Our analysis suggests that incorporating separate modules for spatial and temporal enhancement may require more computational resources. This motivates us to propose a unified architecture that directly estimates the de-rained frame with maximal visual quality and minimal computational cost. To this end, we present a deep learning-based Frame-recurrent Multi-contextual Adversarial Network for rain-streak removal in videos. The proposed model is built upon a Conditional Generative Adversarial Network (CGAN)-based framework in which the generator directly estimates the de-rained frame from the previously estimated one with the help of its multi-contextual adversary. To optimize the proposed model, we incorporate the perceptual loss function in addition to the conventional Euclidean distance. Also, instead of the traditional entropy loss from the adversary, we propose to use the Euclidean distance between the features of the de-rained and clean frames, extracted from the discriminator model, as a cost function for video de-raining. Experimental observations across 11 test sets, against more than 10 state-of-the-art methods and using 14 image-quality metrics, demonstrate the efficacy of the proposed work, both visually and computationally.
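The training objective and the frame-recurrent inference described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the loss weights `w_perceptual` and `w_adversarial`, the blank initial estimate, and the assumption that the generator receives both the current rainy frame and the previous estimate are all hypothetical; the abstract only states that three Euclidean-distance terms are used (pixel, perceptual, and discriminator-feature) and that each frame is estimated from the previously estimated one.

```python
import numpy as np


def mse(a, b):
    """Mean squared Euclidean distance between two arrays of equal shape."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.mean((a - b) ** 2))


def generator_loss(derained, clean, perceptual_features, discriminator_features,
                   w_perceptual=1.0, w_adversarial=1.0):
    """Composite generator loss sketched from the abstract.

    perceptual_features and discriminator_features are callables that map a
    frame to a feature array. In practice they would be a pretrained network
    and the discriminator's intermediate layers; here they are placeholders.
    The weights are assumptions, not values from the paper.
    """
    # Conventional Euclidean distance in pixel space.
    pixel_term = mse(derained, clean)
    # Perceptual loss: distance between features of a fixed feature extractor.
    perceptual_term = mse(perceptual_features(derained),
                          perceptual_features(clean))
    # Adversarial term: distance between discriminator features of the
    # de-rained and clean frames, used in place of an entropy loss.
    adversarial_term = mse(discriminator_features(derained),
                           discriminator_features(clean))
    return (pixel_term
            + w_perceptual * perceptual_term
            + w_adversarial * adversarial_term)


def derain_video(rainy_frames, generator):
    """Frame-recurrent inference: each estimate is fed back for the next frame.

    generator is a callable (rainy_frame, previous_estimate) -> estimate.
    Starting from an all-zero previous estimate is an assumption.
    """
    previous = np.zeros_like(np.asarray(rainy_frames[0], dtype=np.float64))
    outputs = []
    for frame in rainy_frames:
        previous = generator(np.asarray(frame, dtype=np.float64), previous)
        outputs.append(previous)
    return outputs
```

With identical de-rained and clean frames every term vanishes, so the loss is zero. Any toy callable, such as a per-pixel channel mean, can stand in for the feature extractors when experimenting with the sketch.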
Supplemental Material
Supplemental movie, appendix, image, and software files for "High-quality Frame Recurrent Video De-raining with Multi-contextual Adversarial Network".